Does Tika have the ability to extract title, header, or footer information by analyzing the content on the page (as opposed to metadata)? For example, to find a title it could look for boldfaced, centered content with blank lines above and/or below. Or, for headers, if it were analyzing a letter, it could look for to/from address information near the top of the first page.
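To make the kind of heuristic I mean concrete, here is a rough sketch in plain Python. It is not Tika code and does not use any Tika API; the function name `find_title_candidates` and the `page_width` and `tolerance` parameters are made up for illustration. It assumes the text has already been extracted with its leading whitespace preserved, and it only checks the "centered, with blank lines above and below" part, since boldface is not visible in plain text.

```python
def find_title_candidates(lines, page_width=80, tolerance=4):
    """Return (index, text) pairs for lines that look like a centered title.

    Hypothetical heuristic: a non-blank line counts as a candidate when it has
    a blank (or missing) line above and below it and its left and right
    margins differ by at most `tolerance` columns on a page of `page_width`.
    """
    candidates = []
    for i, raw in enumerate(lines):
        text = raw.strip()
        if not text:
            continue

        # Require a blank line (or the document edge) on both sides.
        blank_above = i == 0 or not lines[i - 1].strip()
        blank_below = i == len(lines) - 1 or not lines[i + 1].strip()
        if not (blank_above and blank_below):
            continue

        # Compare the left margin with the implied right margin.
        expanded = raw.expandtabs().rstrip()
        left = len(expanded) - len(expanded.lstrip())
        right = page_width - len(expanded)
        if left > 0 and abs(left - right) <= tolerance:
            candidates.append((i, text))
    return candidates
```

Something like `find_title_candidates(page_text.splitlines(), page_width=40)` would then return the line numbers and text of likely titles. A real tool would presumably also need font and layout information, which this sketch ignores.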
If Tika doesn't go this far, are there any tools you would recommend for this?
