On Thu, 9 Jan 2014, Hong-Thai Nguyen wrote:
> BTW, I may be wrong, but thumbnail handling in Alfresco seems quite complex, because Alfresco can call external thumbnail generation with PDFBox or PDFRender ...

It can do, yes, but there are also dedicated classes that pull out the thumbnails from the common office formats that embed them. That was the bit I had in mind when I referenced it.

> Could you point me to an example of returning embedded documents in Tika parsers?

To see the output side, your best bet is the -z option to Tika App, which extracts the embedded resources. For the parser side, look at something like AbstractPOIFSExtractor (especially the handleEmbedded methods), or look at PackageParser (almost all of the content from that parser is embedded resources).
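
As a rough illustration of the parser-side pattern, here is a minimal sketch that uses only the Java standard library, so it is not Tika code. It walks a ZIP container and hands each entry to a callback, which is roughly the shape a package-style parser follows. The EmbeddedHandler interface and every name in it are hypothetical; in Tika itself that role is played by an EmbeddedDocumentExtractor obtained from the ParseContext, so treat this as an analogy under those assumptions rather than the real API.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

public class EmbeddedWalkSketch {

    /** Hypothetical callback, standing in for Tika's embedded document extractor. */
    interface EmbeddedHandler {
        boolean shouldHandle(String name);
        void handle(String name, InputStream content) throws IOException;
    }

    /** Walks a ZIP container and reports every file entry as an embedded resource. */
    static void walk(InputStream container, EmbeddedHandler handler) throws IOException {
        try (ZipInputStream zip = new ZipInputStream(container)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                if (!entry.isDirectory() && handler.shouldHandle(entry.getName())) {
                    // The stream is positioned at this entry; the handler must not close it.
                    handler.handle(entry.getName(), zip);
                }
            }
        }
    }

    /** Builds a small in-memory ZIP so the sketch needs no files on disk. */
    static byte[] sampleZip() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write("hello".getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("thumbnail.png"));
            zip.write(new byte[] {1, 2, 3});
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }

    /** Returns "name:size" for each embedded entry found in the sample container. */
    static List<String> listEmbedded() {
        List<String> found = new ArrayList<>();
        try {
            walk(new ByteArrayInputStream(sampleZip()), new EmbeddedHandler() {
                public boolean shouldHandle(String name) {
                    return true; // accept every entry in this sketch
                }
                public void handle(String name, InputStream content) throws IOException {
                    int size = 0;
                    while (content.read() != -1) {
                        size++; // count the bytes of the current entry only
                    }
                    found.add(name + ":" + size);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return found;
    }

    public static void main(String[] args) {
        for (String line : listEmbedded()) {
            System.out.println(line);
        }
    }
}
```

The point of the callback split is that the container parser only has to find the embedded pieces; whether they get parsed recursively, saved to disk or skipped is decided by whoever supplies the handler.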

Nick
