Hi,

On Sat, Mar 28, 2009 at 6:18 AM, David Weekly <da...@pbwiki.com> wrote:
> So this is part "bug report" (the columns of the first sheet should
> definitely be included!)
Agreed. Can you please file a Jira bug report for this? It looks similar to some of the zero- vs. one-based index issues we faced when upgrading to POI 3.5.

> and part query as to whether or not there is a plan
> w/Tika to extract more than sheet & cell data from documents.

Doing so would be very nice. You may want to file a Jira improvement request for that. And if you're familiar with Apache POI (or willing to learn it), patches would of course also be welcome. :-) Otherwise I don't know when one of us will encounter a similar need.

You may also want to contact the POI project to see if they've already implemented text extraction improvements that would cover these features. Last week at the ApacheCon I noticed that they've recently been improving the out-of-the-box text extraction features in POI.

BR,

Jukka Zitting