Hi Kurt,

  In my very limited experience, KWord does a pretty good job of
importing PDF. I have also used OpenOffice to write out (poor)
DocBook.
  You should be able to import your PDF file directly into KWord and
write out an (X)HTML file. Be warned that all your structural
formatting will be lost (no more titles, sections...).

  I used the following script to convert HTML to DocBook:

http://wiki.docbook.org/topic/Html2DocBook

  But in my case, the input HTML was already somewhat organized.
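
  To give an idea of what such a conversion involves when the HTML is
reasonably organized, here is a minimal Python sketch. It is not the
script from the wiki page above, just an illustration under my own
assumptions: the input is well-formed XHTML, <h1> is the document
title, each <h2> starts a new section, and <p> becomes <para>. The
function name xhtml_to_docbook is hypothetical, and inline markup
(bold, links...) is simply flattened to text.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Strip the XML namespace, if any, from a tag name."""
    return tag.split("}", 1)[-1]


def xhtml_to_docbook(xhtml_text):
    """Map <h1> to <title>, <h2> to <section><title>, and <p> to <para>.

    Assumption: the input is well-formed XHTML with a flat heading
    structure; anything else is ignored.
    """
    root = ET.fromstring(xhtml_text)
    article = ET.Element("article")
    current = article  # element that receives the next paragraph
    for el in root.iter():  # walks the tree in document order
        name = local(el.tag)
        text = "".join(el.itertext()).strip()  # flattens inline markup
        if name == "h1":
            ET.SubElement(article, "title").text = text
        elif name == "h2":
            current = ET.SubElement(article, "section")
            ET.SubElement(current, "title").text = text
        elif name == "p" and text:
            ET.SubElement(current, "para").text = text
    return ET.tostring(article, encoding="unicode")
```

  If the HTML exported from the PDF has no real headings, a mapping
like this has nothing to work with, so the structure would have to be
added back by hand first.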

Good luck

On Fri, Jun 11, 2010 at 11:31 PM, Kurt A Richardson
<[email protected]> wrote:
> Hi list
>
> I am new to DocBook, and XML-based publishing in general.   I run a small
> publishing company (30 titles), that specializes in complexity theory and I
> have been looking for ways to not only improve my little doc flow
> methodology, but also make our content available to our readers in a variety
> of new modes and formats.  I have been drawn to DocBook and the possibility
> of using XSLT as a means to realize these goals.  I have little trouble
> figuring out how to prepare new content and am hoping to produce our next
> two titles purely from DocBook XML.  However, I also have about 6000 pages
> of PDFs (not all having the same format) that I'd like to 'down convert' to
> DocBook XML.  I am making SLOW progress and wondered if anyone here had any
> bright ideas about how to approach this task... e.g., is PDF to html the
> best first step?  Or does anyone know of any affordable services being
> provided to do the down conversion for me.
>
> Many thanks in advance for any guidance you can provide.
>
> I'm really rather excited about the possibilities that arise once I move our
> publishing from Adobe CS to XML-based!
>
> Kind regards, Kurt
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [email protected]
> For additional commands, e-mail: [email protected]
>
>



-- 
Mathieu
