https://bugs.documentfoundation.org/show_bug.cgi?id=152143
Hossein <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
          Component|Writer                      |Draw

--- Comment #2 from Hossein <[email protected]> ---
I don't think this is a duplicate of tdf#32249. The title of that one is:

Bug 32249 "When importing PDF with text in it , it will be better to have a
easy and fluent option to edit the imported Text".

So that issue is about being able to edit the imported text. Here I am asking
for a way to export the PDF as a text file. These are clearly different
requests, even if their implementations have parts in common.

> So you can already select and consolidate entire pages of imported draw shape
> textboxes (by glyph index lookup in a ToUnicode CMAP) into a single draw
> shape textbox--a sentence or paragraph of text. And then select that text,
> copy it and paste it as needed. Then correct as lexically necessary.

I disagree. This is not what this feature request is about. I have
specifically requested a way to export the whole PDF document as a text file,
both via the UI and the command line. The consolidation feature described
above might help internally when implementing such a feature, but it is not
what I have asked for.

> Also, because PDF provides no lexical sense to the runs in a document (it is
> a published presentation format)--the discrete imported draw shape text boxes
> *must be selected in sequence* for a manual merge. That would remain the case
> working with draw shape textboxes on the Writer canvas and is a limitation of
> the published rendering encoded into PDF.

I disagree again. Text boxes exist in LibreOffice, MS Office and elsewhere,
and their contents can still be exported to text files. I have not asked for
smart software that understands the meaning of the document. The goal is
simply to export the contents to a text file.

> Doing more efficient and high fidelity text extraction from PDF into ODF
> paragraphs is the end goal of bug 32249.
>
> Export of lexically correct word, sentence or paragraph to other document
> formats then becomes routine export filtering that is already present.

Even if this implementation path is accepted, this feature request would
depend on tdf#32249; it would not be a duplicate of it.

-- 
You are receiving this mail because:
You are the assignee for the bug.
