Hi Jérôme,
Jérôme Charron wrote:
>> The changes are not difficult, but I still
>> observe some other problems with this plugin.
> Ok, what kind of problems?
I picked some xls files at random from the internet, crawled them, and saw
some errors. I haven't been able to investigate the errors further, so I
can't give you a more specific description of the problem :-( If you're
interested, I can mail you the URLs of my test documents off-list.
Regards
Michael
050829 192634 fetching http://www.xxxxxx.xx/xxx/xls/RAL_RGB_Farbkarte.xls
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
    at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:224)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:160)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:163)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:210)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:191)
    at org.apache.nutch.parse.msexcel.ExcelTextExtractor.extractText(ExcelTextExtractor.java:34)
    at org.apache.nutch.parse.msexcel.MSExcelParser.getParse(MSExcelParser.java:73)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.handleFetch(Fetcher.java:254)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:148)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.poi.hssf.record.UnknownRecord.<init>(UnknownRecord.java:62)
    at org.apache.poi.hssf.record.SubRecord.createSubRecord(SubRecord.java:57)
    at org.apache.poi.hssf.record.ObjRecord.fillFields(ObjRecord.java:99)
    at org.apache.poi.hssf.record.Record.fillFields(Record.java:90)
    at org.apache.poi.hssf.record.Record.<init>(Record.java:55)
    at org.apache.poi.hssf.record.ObjRecord.<init>(ObjRecord.java:61)
    ... 13 more
050829 192650 fetching http://www.xxxxxxx.xx/xxxx/xls/TAG_Liste.xls
java.lang.reflect.InvocationTargetException
    at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
    at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:39)
    at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
    at java.lang.reflect.Constructor.newInstance(Constructor.java:494)
    at org.apache.poi.hssf.record.RecordFactory.createRecord(RecordFactory.java:224)
    at org.apache.poi.hssf.record.RecordFactory.createRecords(RecordFactory.java:160)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:163)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:210)
    at org.apache.poi.hssf.usermodel.HSSFWorkbook.<init>(HSSFWorkbook.java:191)
    at org.apache.nutch.parse.msexcel.ExcelTextExtractor.extractText(ExcelTextExtractor.java:34)
    at org.apache.nutch.parse.msexcel.MSExcelParser.getParse(MSExcelParser.java:73)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.handleFetch(Fetcher.java:254)
    at org.apache.nutch.fetcher.Fetcher$FetcherThread.run(Fetcher.java:148)
Caused by: java.lang.ArrayIndexOutOfBoundsException
    at java.lang.System.arraycopy(Native Method)
    at org.apache.poi.hssf.record.UnknownRecord.<init>(UnknownRecord.java:62)
    at org.apache.poi.hssf.record.SubRecord.createSubRecord(SubRecord.java:57)
    at org.apache.poi.hssf.record.ObjRecord.fillFields(ObjRecord.java:99)
    at org.apache.poi.hssf.record.Record.fillFields(Record.java:90)
    at org.apache.poi.hssf.record.Record.<init>(Record.java:55)
    at org.apache.poi.hssf.record.ObjRecord.<init>(ObjRecord.java:61)
    ... 13 more
--
Michael Nebel
http://www.nebel.de/
http://www.netluchs.de/