Tika doesn't currently offer a way to do this, but it _should_ be possible
to add.  There has been some interest in this over the years, just not
enough momentum to get it implemented.

@Tilman this should be doable, right?
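
In the meantime, one possible workaround is to go to PDFBox directly (Tika
uses it for PDFs): subclassing PDFTextStripper and overriding
writeString(String, List<TextPosition>) gives access to
TextPosition.getFontSizeInPt() and TextPosition.getFont().getName(). The
sketch below assumes you have already collected that information into
simple objects, and only illustrates the heading/paragraph heuristic. The
names HeadingHeuristic and TextRun, the 1.15 factor, and the "bold"
substring check are all made up for illustration; none of this is Tika or
PDFBox API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration; not part of Tika or PDFBox.
class HeadingHeuristic {

    // One run of text plus the font info you collected yourself,
    // e.g. from PDFBox TextPosition objects.
    static final class TextRun {
        final String text;
        final String fontName;
        final float fontSizePt;

        TextRun(String text, String fontName, float fontSizePt) {
            this.text = text;
            this.fontName = fontName;
            this.fontSizePt = fontSizePt;
        }
    }

    // Treat the font size that covers the most characters as the body size.
    // Real PDFs may need the sizes rounded before counting.
    static float bodyFontSize(List<TextRun> runs) {
        Map<Float, Integer> charsPerSize = new HashMap<>();
        for (TextRun run : runs) {
            charsPerSize.merge(run.fontSizePt, run.text.length(), Integer::sum);
        }
        float bestSize = 0f;
        int bestCount = -1;
        for (Map.Entry<Float, Integer> entry : charsPerSize.entrySet()) {
            if (entry.getValue() > bestCount) {
                bestSize = entry.getKey();
                bestCount = entry.getValue();
            }
        }
        return bestSize;
    }

    // Assumed rule: a run is a heading if it is clearly larger than the
    // body text, or if its font name looks bold and it is not smaller.
    static boolean isHeading(TextRun run, float bodySize) {
        boolean larger = run.fontSizePt > bodySize * 1.15f;
        boolean bold = run.fontName != null
                && run.fontName.toLowerCase().contains("bold");
        return larger || (bold && run.fontSizePt >= bodySize);
    }
}
```

This is only a rough starting point: bold words inside a paragraph would be
flagged as headings, and font names in PDFs are not always reliable, so the
thresholds would need tuning against your own documents.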

On Tue, Nov 17, 2020 at 12:42 PM Bogdan Kostic <[email protected]> wrote:

> Hello,
>
> I am using Tika to extract text out of PDF documents. I want to write a
> heuristic to differentiate between headings and paragraphs. For this, I
> need font style and size of the extracted text. Is there any way to get
> font style and size using Tika? I was not able to find an option to extract
> this information.
>
> Thank you in advance!
