Hello there, I'm new to POI, and I want to use it to integrate Word files into Lucene.
I know that something suitable for my purpose should be at http://www.textmining.org/ but every time I try to open that address I just get this message:

    Hacked Fotolog ? ITALY owned .org ? HACKED BY ITALY

so I was not able to check out that package.

In any case, I tried to download the three latest .jar files from http://encore.torchbox.com/poi-cvs-build/ and compile a test class in order to try POI with Lucene. It fails to compile because the compiler cannot find the WordExtractor class:

    C:\Documents and Settings\Alessandro\Desktop\Copia di PDFBox-0.7.2\PDFBox-0.7.2>
    javac -classpath lucene-1.4.3.jar;lucene-demos-1.4.3.jar;PDFBox-0.7.2.jar;poi-3.0.jar;poi-scratchpad-3.0.jar src\org\pdfbox\searchengine\lucene\luceneWORDDocument.java

    src\org\pdfbox\searchengine\lucene\luceneWORDDocument.java:294: cannot resolve symbol
    symbol  : class WordExtractor
    location: class org.pdfbox.searchengine.lucene.LuceneWORDDocument
            WordExtractor extractor = new WordExtractor();
            ^
    src\org\pdfbox\searchengine\lucene\luceneWORDDocument.java:294: cannot resolve symbol
    symbol  : class WordExtractor
    location: class org.pdfbox.searchengine.lucene.LuceneWORDDocument
            WordExtractor extractor = new WordExtractor();
            ^
    2 errors

This is the code I used:

    HWPFDocument wdoc = new HWPFDocument(is);
    WordExtractor extractor = new WordExtractor();
    String contents = extractor.extractText(wdoc);
    StringReader reader = new StringReader( contents );
    document.add( Field.Text( "contents", reader ) );

I'd appreciate any comments or suggestions to solve my problem.

Thanks in advance
