RE: Extract bold text from a PDF file

2019-03-19 Thread Hesham Gneady
Thanks for sharing your experience about this, Peter! I will have to use the “heavy” comparison for the font name then. I thought there might be another indication for this in my attached PDF file. Best regards, Hesham

Re: Extract bold text from a PDF file

2019-03-18 Thread Peter Murray-Rust
I have processed over 100,000 PDFs (mainly scientific publications) and I am reasonably certain there is no universal property that is "Bold" that can be algorithmically detected. "Bold" is an instruction for the authoring software to create something that stands out visually. This can be done by:

Re: Extract bold text from a PDF file

2019-03-18 Thread Gilad Denneboom
I don't see why there *must* be such an option. Bold fonts are not a subset of existing fonts, despite what it might look like when you use Word (which creates fake bold fonts on its own). They exist on their own, with their own names. True, they are usually a variant of another existing font, but

RE: Extract bold text from a PDF file

2019-03-18 Thread Hesham Gneady
I have hundreds of PDF files in use! There must be some property in my attached PDF file that causes the bold font, not just the font type used! I see properties like ForceBold(), but it’s set to false too. I mean, something like that? Best regards, Hesham

Re: Extract bold text from a PDF file

2019-03-18 Thread Gilad Denneboom
Instead of a partial match for the name, you could compile a list of all the names of the bold variants of your fonts, and then compare the font name to that list. On Mon, Mar 18, 2019 at 11:13 AM Hesham Gneady wrote: > Hello, > I am trying to extract the bold text for some PDF files, but
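The list comparison suggested in this reply could look roughly like the sketch below. It is plain Java with no PDFBox calls, so the font name is assumed to have been obtained already (for example inside the processTextPosition override). The class name BoldFontNames, its methods, and the three entries in the list are illustrative assumptions, not part of PDFBox or of anyone's code in this thread. The sketch also strips the subset tag (six uppercase letters and a plus sign) that embedded subset fonts carry in front of their name.

```java
import java.util.Locale;
import java.util.Set;

// Hypothetical helper: not part of PDFBox, names chosen for illustration only.
final class BoldFontNames {

    // Assumed list; replace it with the bold variants that actually occur in your PDFs.
    // Entries are stored in lower case so the lookup is case-insensitive.
    static final Set<String> KNOWN_BOLD = Set.of(
            "arial-boldmt",
            "timesnewromanps-boldmt",
            "helvetica-bold");

    // Removes a subset tag such as "ABCDEF+" from the start of a font name, if present.
    static String stripSubsetTag(String fontName) {
        return fontName.matches("[A-Z]{6}\\+.+") ? fontName.substring(7) : fontName;
    }

    // Exact match against the list, after removing the subset tag and lower-casing.
    static boolean isKnownBold(String fontName) {
        if (fontName == null) {
            return false;
        }
        String normalized = stripSubsetTag(fontName).toLowerCase(Locale.ROOT);
        return KNOWN_BOLD.contains(normalized);
    }

    public static void main(String[] args) {
        System.out.println(isKnownBold("ABCDEF+Arial-BoldMT"));
        System.out.println(isKnownBold("Arial-BoldItalicMT"));
    }
}
```

An exact match avoids the false positives of a "contains bold" test, at the cost of maintaining the list: a bold-italic face such as Arial-BoldItalicMT is only reported if it is added explicitly.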

Extract bold text from a PDF file

2019-03-18 Thread Hesham Gneady
Hello, I am trying to extract the bold text from some PDF files, but some fail, like this one: https://www.dropbox.com/s/gh2zwdh3sl3isck/Bold%20Font%20Sample.pdf?dl=0 I am overriding the processTextPosition() method to do this, and I have tried all these options, but none has worked for