> what are the downsides/upsides of following the content stream order?

That depends on whether you know something about the PDF producers that you are getting content from.

If all the PDFs that you are trying to process come from modern, well-written products, then you are probably fine. However, poorly made PDF creators write their content streams in an order that does not match reading order, so extracting in content stream order will give you garbage.

Leonard

On 5/17/19, 3:14 AM, "poppler on behalf of Massimo Redaelli" <[email protected] on behalf of [email protected]> wrote:

On Thu, May 16, 2019, 8:08 PM Albert Astals Cid <[email protected]> wrote:
> > Are there reasons not to use it?
>
> The man page explains the reason not to use it.

Yes, I should have asked: what are the downsides/upsides of following the content stream order?

But I guess I'm mainly asking:

> Is the option going to be deprecated, or can we count on it being
> there for the foreseeable future?

M.

On Thu, May 16, 2019 at 6:08 PM Albert Astals Cid <[email protected]> wrote:
>
> On Thursday, 16 May 2019, at 17:00:27 CEST, Massimo Redaelli wrote:
> > Hey all.
> >
> > Question regarding pdftotext.
> >
> > The help says that `raw` is not recommended anymore, but for all PDFs
> > I tried it actually gives better results than the default mode, by
> > which I mean that paragraphs are not interrupted by extraneous text,
> > like headers or boxes.
> > (I do have to handle hyphenated words, but that looks easy.)
> >
> > Is the option going to be deprecated, or can we count on it being
> > there for the foreseeable future?
> > Are there reasons not to use it?
>
> The man page explains the reason not to use it.
>
> Cheers,
>   Albert
>
> >
> > Thanks!

--
M.
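
For anyone trying the workflow discussed in this thread, here is a rough Python sketch of it: run pdftotext in raw (content stream order) mode, then rejoin words that were hyphenated across line breaks. The function names `extract_raw` and `dehyphenate` are made up for illustration and are not part of poppler. The sketch assumes the `pdftotext` binary is on your PATH, and the dehyphenation rule is a naive assumption that will also merge genuine compounds split at a line end (for example "well-" / "known").

```python
import re
import subprocess


def extract_raw(pdf_path: str) -> str:
    """Return the text of pdf_path in content stream order.

    Hypothetical helper: calls `pdftotext -raw <file> -`, where the
    trailing "-" tells pdftotext to write the text to stdout.
    """
    result = subprocess.run(
        ["pdftotext", "-raw", pdf_path, "-"],
        check=True,
        capture_output=True,
        text=True,
        encoding="utf-8",
    )
    return result.stdout


def dehyphenate(text: str) -> str:
    """Join a word split as 'para-' + newline + 'graph' into 'paragraph'.

    Naive assumption: any hyphen directly between a word character and a
    line break is a soft hyphen. Real compounds broken at a line end are
    merged too, so check the output on your own documents.
    """
    return re.sub(r"(\w)-\n(\w)", r"\1\2", text)


if __name__ == "__main__":
    import sys

    # Usage: python raw_extract.py some.pdf
    print(dehyphenate(extract_raw(sys.argv[1])))
```

As Leonard notes above, whether the raw output is usable depends on the producer of each PDF, so it is worth comparing this against the default mode on a sample of your files before relying on it.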
_______________________________________________
poppler mailing list
[email protected]
https://lists.freedesktop.org/mailman/listinfo/poppler
