Re: [EXTERNAL] Extracting font information from xml

Jay Chuk Wed, 16 Oct 2019 02:38:13 -0700

Thanks for the quick reply, Chris.
Is there a code snippet in Python that shows how to do it?


Regards,
Jay

On Tue, Oct 15, 2019 at 6:52 PM Chris Mattmann <[email protected]> wrote:

> Hi Jay, yes, I believe so. Tika Python is just a thin client to Tika
> Server and it
> provides this functionality. CC’ing dev@tika
>
> *From: *Jay Chuk <[email protected]>
> *Date: *Tuesday, October 15, 2019 at 3:47 PM
> *To: *"Mattmann, Chris A (US 1761)" <[email protected]>
> *Subject: *[EXTERNAL] Extracting font information from xml
>
>
>
> Hi Chris,
>
>
>
> Thanks for providing the Python package, Tika, for extracting text
> from PDFs.
>
>
>
> I'd like to confirm whether, when converting a PDF to XML, it is possible
> to get the font style of the text, e.g. the font type and whether the text
> is bold/solid.
>
> I need this information to identify section headers and titles in the
> documents.
>
>
>
> Please let me know if it is possible or if there is another way to go
> about this.
>
>
>
> Thank you
>
> Jay
>
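
For the snippet requested above, here is a minimal sketch rather than a confirmed answer. It assumes the tika-python package is installed and that Java is available so the client can start Tika Server. `parser.from_file(..., xmlContent=True)` asks for XHTML instead of plain text. Whether that XHTML carries font names or bold markers depends on the Tika Server version and PDF parser configuration, which this thread does not confirm, so the helper only lists each text-bearing element with its attributes for inspection. The names `extract_xhtml`, `list_styled_elements` and `SAMPLE` are hypothetical, and `SAMPLE` is made-up markup, not real Tika output.

```python
import xml.etree.ElementTree as ET

# Hypothetical XHTML, shaped like a per-page <div> with <p> children.
# It is illustrative only and not captured from a real Tika run.
SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<div class="page"><p>Introduction</p><p>Body text.</p></div>'
    '</body></html>'
)


def extract_xhtml(pdf_path):
    """Ask Tika Server (through tika-python) for XHTML instead of plain text."""
    # Imported lazily so the rest of the sketch runs without tika installed.
    from tika import parser

    parsed = parser.from_file(pdf_path, xmlContent=True)
    # "content" can be None if Tika could not extract anything.
    return parsed["content"]


def list_styled_elements(xhtml):
    """Return (tag, attributes, text) for every element that has its own text."""
    root = ET.fromstring(xhtml)
    rows = []
    for elem in root.iter():
        tag = elem.tag.split("}")[-1]  # drop the XML namespace prefix
        text = (elem.text or "").strip()
        if text:
            rows.append((tag, dict(elem.attrib), text))
    return rows


if __name__ == "__main__":
    # Replace SAMPLE with extract_xhtml("your.pdf") to inspect a real document.
    for tag, attrs, text in list_styled_elements(SAMPLE):
        print(tag, attrs, text)
```

If the attributes printed for a real PDF contain no font or weight information, the XHTML alone will not be enough to separate headers from body text, and a different route would be needed.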
