This looks like an area for a  new feature in both Tika and POI.  I've only 
looked very briefly into the POI libraries, and I may have missed how to 
extract text from autoshapes.  I'll open an issue in both projects.

-----Original Message-----
From: Hiroshi Tatsumi [mailto:honekich...@comet.ocn.ne.jp] 
Sent: Sunday, July 21, 2013 10:16 AM
To: user@tika.apache.org
Subject: How to extract autoshape text in Excel 2007+

Hi,

I am using Tika 1.3 and Solr 4.3.1.
I'd like to extract autoshape text in Excel 2007+(.xlsx), but I can't.

I tried to extract from some MS office files.
The results are below.

Success (I can extract autoshape text.)
- Excel 2003(.xls)
- Word 2003(.doc)
- Word 2007+(.docx)

Failed (I cannot extract autoshape text.)
- Excel 2007+(.xlsx)

Is this a bug?
If you know, could you tell me how to extract autoshape text in Excel 2007+?

Thanks,
Hiro. 

Reply via email to