On Fri, 11 Aug 2006 15:39:41 +0200 Claudia Drechsle <[EMAIL PROTECTED]> wrote:
>
> >>> Can I convert a saved PDF to a Microsoft Word with the openoffice
> >>> program?
> >>>
> >>>
> >> No. This can only be done with one of the commercial Adobe
> >> programs. All others are prohibited from doing so.
> >
> > If you have access to Linux (or a Linux Live CD), KWord can do it
> > very well.
> >
>
> Yes, KWord can import PDF-files. But if there are frames, graphics,
> tables etc within the pdf, the result is not very usable.
> In my cases KWord even crashed in some cases when I tried to import a
> PDF-file.

The same here (KWord 1.5.1, KDE 3.5.3). And I couldn't get any
acceptable results saving to odt and opening with OOo.

> So I think, PDF that contain only text may be imported by KWord.
> Are other programs able to import also structured PDF's into a
> Text-Format?

Open source - I guess no. A possible (indirect) workaround may be
pdftohtml. I've tried version 0.36 (Debian package) on a 132-page
user's manual with tables and pictures (marked up with text boxes,
arrows, etc.), and here are the results:

Simple output: loses many images, ignores fonts and font sizes, but
preserves text flow and bold text (maybe italics, too - don't know),
and outputs a single file (+ separate index and outline files). Tables
and text boxes are imported as simple text, and drawing objects aren't
imported at all. If at least all the images were imported in the right
places, this way would be the best. Some formatting would still have to
be done, but that's not hard using styles.

---------------------------------

Complex html output (-c option) is very precise when opened in Firefox.
The precision is achieved at the cost of making everything but the text
a single background image, and placing text in <div>'s with absolute
positioning. Each pdf page is converted to a separate html document.
Text flow is lost (paragraphs are split, overly large gaps are
inserted) on subscripts and superscripts. A few paragraphs have wrong
fonts - strange, since only one font is used in the document.
The -c option is good if you don't need to change the layout of the
original graphics. It means you are stuck if you e.g. translate the
document and the translation doesn't fit into a table cell, or a
picture needs to be moved because a paragraph ends up with a different
number of lines.

--------------------------------

But better try it yourself - I've tested just one file, and the only
documentation I've read was the man page (not sure if there are other
docs).

Another thing is getting the html to odt. I couldn't get good results
straight away with Writer/Web. But these were the first html files I've
opened with OOo, so it shouldn't be a problem for anyone who knows how
to use it.

So - waiting for KWord to improve both in import and odf? :-)
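
For anyone who wants to compare the two pdftohtml modes on their own
file, here is a minimal Python sketch. It is only an illustration under
a few assumptions: pdftohtml is installed and on PATH, it takes the PDF
file followed by an output base name, and the function names and the
"simple"/"complex" folder layout are made up here. The only option
taken from the test above is -c.

```python
import shutil
import subprocess
from pathlib import Path


def build_pdftohtml_command(pdf_path, out_base, complex_output=False):
    """Return the argument list for one pdftohtml run.

    Assumption: pdftohtml is called as
    `pdftohtml [options] input.pdf output-base`.
    """
    cmd = ["pdftohtml"]
    if complex_output:
        # -c is the "complex output" mode described in the message.
        cmd.append("-c")
    cmd += [str(pdf_path), str(out_base)]
    return cmd


def convert_both_ways(pdf_path, out_dir):
    """Run the simple and the complex conversion into separate folders."""
    if shutil.which("pdftohtml") is None:
        raise FileNotFoundError("pdftohtml is not on PATH")
    pdf_path = Path(pdf_path)
    for name, complex_output in (("simple", False), ("complex", True)):
        # Hypothetical layout: one sub-folder per mode, so the many
        # per-page files of the complex mode stay separate.
        target = Path(out_dir) / name
        target.mkdir(parents=True, exist_ok=True)
        cmd = build_pdftohtml_command(
            pdf_path, target / pdf_path.stem, complex_output
        )
        subprocess.run(cmd, check=True)
```

Calling convert_both_ways("manual.pdf", "html_out") should leave the
two results side by side, so they can be opened in a browser and
compared before deciding which one to take into OOo.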
