I've been using pdftohtml (get the latest version from
poppler.freedesktop.org) with the '-complex -xml' options to generate an
XML file, which I then process with a Perl program to make an ePub.
Depending on the PDF, it does a pretty good job, and you may be able to
import the XML directly.
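
For anyone who wants to try the same route without Perl, here is a rough
Python sketch of the idea: run pdftohtml in XML mode, then pull the text
out of the result. It is only an illustration, not my actual script. The
function names (pdf_to_xml, extract_lines) are made up for this example,
it passes only the -xml and -i flags, and it assumes the output consists of
<page number="..."> elements containing <text top="..." left="...">
elements, which is what I would expect from pdftohtml but is worth checking
against your own output.

```python
import subprocess
import xml.etree.ElementTree as ET
from pathlib import Path


def pdf_to_xml(pdf_path, out_base):
    """Run pdftohtml in XML mode and return the path of the XML file.

    Assumption: pdftohtml writes its result to <out_base>.xml.
    -xml selects XML output; -i skips image extraction.
    """
    subprocess.run(["pdftohtml", "-xml", "-i", pdf_path, out_base], check=True)
    return Path(out_base + ".xml")


def extract_lines(xml_data):
    """Return (page, top, left, text) tuples sorted into rough reading order.

    Assumption: the XML has <page number="N"> elements containing
    <text top="T" left="L"> elements with integer coordinates.
    """
    root = ET.fromstring(xml_data)
    lines = []
    for page in root.iter("page"):
        page_no = int(page.get("number", "0"))
        for node in page.iter("text"):
            # itertext() also collects text inside nested <b>/<i> markup.
            content = "".join(node.itertext()).strip()
            if content:
                top = int(node.get("top", "0"))
                left = int(node.get("left", "0"))
                lines.append((page_no, top, left, content))
    # Sort by page, then top-to-bottom, then left-to-right.
    lines.sort(key=lambda item: item[:3])
    return lines
```

Typical use would be something like
extract_lines(pdf_to_xml("spec.pdf", "spec").read_bytes()), where
"spec.pdf" is a placeholder name. Sorting by position is a crude stand-in
for real layout analysis, so multi-column pages will probably need more
work.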




On 18 July 2014 15:05, TimA <[email protected]> wrote:

> Hi Terry
>
>
> On 18/07/14 14:47, Terry Coles wrote:
>
>> Hi,
>>
>> Does anyone know how I can use tools available in Linux to convert a PDF file
>> to MS Word .doc or .docx format (or even to LibreOffice .odt)?
>>
>
> Closest I'm aware of is pdftotext (also pdf2text, pdf2txt etc). But of
> course you'll lose the formatting. There's also pdf2ps from which maybe you
> can use
>
> http://www.coolutils.com/PS-to-DOC
>
> or something similar
>
> Cheers
>
> Tim
>
>
>
>> I thought I could do it using LibreOffice, but it reads the PDF content as if
>> it is a series of graphical objects with text labels.  As a consequence, I can
>> only save it as .odg or export it to a graphical format.
>>
>> The problem is that we have a number of specifications in PDF format.  We need
>> to get them into an editable form (preferably Word) because they need
>> translating.
>>
>> At work I tried the real thing (Adobe Writer), but it seriously mangles the
>> format, even when it works.
>>
>> The originals seem to have been created using a number of different tools;
>> some were created in MS Word 2010, some with PDFCreator (presumably from a
>> Word source), some with Acrobat Distiller and some by conversion from
>> PostScript.  Adobe Writer was only able to save three out of five documents
>> and they were not very good.
>>
>>
>>
>
> --
> Next meeting:  Bournemouth, Tuesday, 2014-08-05 20:00
> Meets, Mailing list, IRC, LinkedIn, ...  http://dorset.lug.org.uk/
> New thread on mailing list:  mailto:[email protected]
> How to Report Bugs Effectively:  http://goo.gl/4Xue
>



-- 
best regards,
웃
Victor Churchill,
Bournemouth
