On Fri, 1 Aug 2014, Allison, Timothy B. wrote:
I found one regression in the handling of an xlsx file:
http://digitalcorpora.org/corp/nps/files/govdocs1/598/598948.xlsx
Tika 1.6 w/ POI 3.11 Beta 1 is not extracting the comments in this file,
whereas Tika 1.5 (and Tika 1.6 w/ POI 3.10-Final) did extract the
comments. This suggests that the issue is with POI, but I haven't had a
chance to dig in, and unfortunately, I don't think I will have a chance
until Monday.
Tika used to look up cell comments manually in the xlsx extractor, but
that logic has now been moved into the POI xlsx event handler. My hunch is
that something isn't quite right there, so that's probably the place to
look, and to write a unit test for!
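
For such a unit test it may help to have an independent check of what
comments the file actually contains, without going through Tika or POI.
Below is a rough sketch using only the JDK. It assumes the comments live
in parts named xl/commentsN.xml, which is the usual naming but is strictly
decided by the package relationships. The class and method names
(XlsxCommentProbe, readComments) are made up for illustration and are not
part of Tika or POI.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;

// Hypothetical helper, not part of Tika or POI.
final class XlsxCommentProbe {

    /** Returns the text of every comment element in xl/comments*.xml parts. */
    static List<String> readComments(InputStream xlsx) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        DocumentBuilder builder = factory.newDocumentBuilder();

        List<String> comments = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(xlsx)) {
            ZipEntry entry;
            while ((entry = zip.getNextEntry()) != null) {
                // Assumption: comment parts follow the usual naming scheme.
                if (!entry.getName().matches("xl/comments\\d*\\.xml")) {
                    continue;
                }
                // Copy the entry first, since the XML parser may close
                // the stream it is handed.
                ByteArrayOutputStream buffer = new ByteArrayOutputStream();
                byte[] chunk = new byte[8192];
                int read;
                while ((read = zip.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
                Document doc = builder.parse(
                        new ByteArrayInputStream(buffer.toByteArray()));
                // Match comment elements regardless of namespace.
                NodeList nodes = doc.getElementsByTagNameNS("*", "comment");
                for (int i = 0; i < nodes.getLength(); i++) {
                    comments.add(nodes.item(i).getTextContent().trim());
                }
            }
        }
        return comments;
    }
}
```

A test could then assert that each string returned for 598948.xlsx also
shows up in the text the extractor produces for that file.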
Nick