I wrote the code below to extract the words from a Word document. It was
working well until I restarted Eclipse; afterwards I got the exception
below. I googled it and found reports that it is a library bug, so I
downloaded the latest version of POI, but I still get this error. Any idea
how to fix it? Thanks in advance.
/////////////////////////////////////////////////////////////////////////////////
java.io.IOException: block[ 0 ] already removed - does your POIFS have circular or duplicate block references?
    at org.apache.poi.poifs.storage.BlockListImpl.remove(BlockListImpl.java:89)
    at org.apache.poi.poifs.storage.SmallDocumentBlockList.remove(SmallDocumentBlockList.java:30)
    at org.apache.poi.poifs.storage.BlockAllocationTableReader.fetchBlocks(BlockAllocationTableReader.java:216)
    at org.apache.poi.poifs.storage.BlockListImpl.fetchBlocks(BlockListImpl.java:123)
    at org.apache.poi.poifs.storage.SmallDocumentBlockList.fetchBlocks(SmallDocumentBlockList.java:30)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.processProperties(POIFSFileSystem.java:538)
    at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:180)
    at org.apache.poi.hwpf.HWPFDocumentCore.verifyAndBuildPOIFS(HWPFDocumentCore.java:96)
    at org.apache.poi.hwpf.HWPFDocument.<init>(HWPFDocument.java:119)
    at test.PDFExtractor(test.java:44)
    at test.main(test.java:106)
0
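One check that might help tell a damaged file apart from a POI problem: a .doc file is an OLE2 container and should start with the 8-byte signature D0 CF 11 E0 A1 B1 1A E1. The sketch below only looks at those first bytes. It cannot detect broken block chains further inside the file, so a passing result does not prove the file is intact. The class and method names (Ole2HeaderCheck, looksLikeOle2, readHead) are made up for illustration and are not part of POI.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.Arrays;

public class Ole2HeaderCheck {
    // OLE2 compound document signature (first 8 bytes of a .doc/.xls/.ppt file)
    static final int[] OLE2_MAGIC = {0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1};

    // Returns true if the given bytes begin with the OLE2 signature.
    static boolean looksLikeOle2(byte[] head) {
        if (head == null || head.length < OLE2_MAGIC.length) {
            return false;
        }
        for (int i = 0; i < OLE2_MAGIC.length; i++) {
            if ((head[i] & 0xFF) != OLE2_MAGIC[i]) {
                return false;
            }
        }
        return true;
    }

    // Reads up to the first 8 bytes of the file; returns fewer if the file is shorter.
    static byte[] readHead(String path) throws IOException {
        byte[] head = new byte[OLE2_MAGIC.length];
        try (InputStream in = new FileInputStream(path)) {
            int off = 0;
            while (off < head.length) {
                int n = in.read(head, off, head.length - off);
                if (n < 0) {
                    break;
                }
                off += n;
            }
            return Arrays.copyOf(head, off);
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length == 0) {
            System.out.println("usage: java Ole2HeaderCheck <path-to-doc>");
            return;
        }
        System.out.println(looksLikeOle2(readHead(args[0]))
                ? "starts with the OLE2 signature"
                : "does NOT start with the OLE2 signature");
    }
}
```

Running it against C:\beta.doc would show whether the file still has the expected header, for example if something overwrote or truncated it between runs.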
//////////////////////////////////////////////////////////////////////////////////////////////////
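// Needs: java.io.File, java.io.FileInputStream,
//        org.apache.poi.hwpf.HWPFDocument,
//        org.apache.poi.hwpf.extractor.WordExtractor
// extractor, builder and j are declared elsewhere (not shown here).

// Characters that separate one token from the next
String delims = "[ {}.,?!%+\\-*/\\^_=)(&%...@\\[\\]|:;'\"<>\n\t]+";

File file = new File("C:\\beta.doc");
FileInputStream myInput = new FileInputStream(file);

// The exception in the stack trace is thrown from this constructor
HWPFDocument myDoc = new HWPFDocument(myInput);
extractor = new WordExtractor(myDoc);

String[] fileData = extractor.getParagraphText();
for (int i = 0; i < fileData.length; i++) {
    if (fileData[i] != null) {
        String[] ttokens = fileData[i].split(delims);
        for (int h = 0; h < ttokens.length; h++) {
            // One line per token: token, running index, token length
            builder.write(ttokens[h] + "\t" + j + "\t"
                    + ttokens[h].length() + "\n");
            j++;
        }
    }
}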
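
To check the tokenizing step on its own, without POI or the .doc file, something like the following sketch could be used. It reuses the same delimiter pattern on a plain string. The class name TokenizeSketch and the sample sentence are made up for illustration. Unlike the loop above, it also skips the empty token that split() returns when a paragraph starts with a delimiter, which is an assumption about the desired output.

```java
import java.util.ArrayList;
import java.util.List;

public class TokenizeSketch {
    // Same delimiter character class as in the snippet above
    static final String DELIMS = "[ {}.,?!%+\\-*/\\^_=)(&%...@\\[\\]|:;'\"<>\n\t]+";

    // Splits one paragraph into tokens, dropping empty strings.
    static List<String> tokenize(String paragraph) {
        List<String> tokens = new ArrayList<>();
        for (String t : paragraph.split(DELIMS)) {
            if (!t.isEmpty()) {
                tokens.add(t);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        int j = 0; // running token index, as in the original loop
        for (String t : tokenize("Hello, world! (POI test)")) {
            // token, index, token length, separated by tabs
            System.out.println(t + "\t" + j + "\t" + t.length());
            j++;
        }
    }
}
```

If this prints the expected tokens, the splitting logic is fine and the problem is limited to opening the document.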