I recently posted on poi-user about using POIFS to decode an Outlook 
.msg file.  I had made a lot of progress figuring out the format of the 
file, and when I finally got around to trying to modify it I ran into a 
problem: an IOException was thrown with the message "block[ 0 ] 
already removed".  Having tried to track this down, I believe it is a 
bug somewhere in POIFS.

Reproducing the problem is fairly simple.  Take an Outlook .msg file and 
"copy" it using:
java org.apache.poi.poifs.filesystem.POIFSFileSystem in.msg out.msg

Viewing the source message works fine:
java org.apache.poi.dev.POIFSViewer in.msg

But trying to view the copied file results in the error:
java org.apache.poi.dev.POIFSViewer out.msg


Since such basic functionality probably works well for all other POI 
users, I figured it was probably a side effect of the different document 
content (i.e. an Outlook file rather than an Excel spreadsheet).  So, I 
created a tester class to figure out where within the document it was 
failing.

Summary of my interpretation of the problem: POIFS does not appear to 
"write" an empty document stream correctly, but the failure occurs on 
the read of the incorrectly written document.

I created a tester class that reads in a POI file system, then copies 
it document by document, directory by directory.  After each document 
or directory is copied, I write the new file system out, then read it 
in again.  I can send in this code if you think it's helpful.

The output from my tester was the following (note: I'm using 
jakarta-poi-1.8.0-dev-20020813.jar):

Copying: Root Entry
Copying: Root Entry/__substg1.0_0E04001E
Copying: Root Entry/__substg1.0_0E03001E
Copying: Root Entry/__substg1.0_0E1D001E
Copying: Root Entry/__substg1.0_0E02001E
Copying: Root Entry/__substg1.0_0037001E
Copying: Root Entry/__substg1.0_003D001E
Copying: Root Entry/__properties_version1.0
Copying: Root Entry/__substg1.0_001A001E
Copying: Root Entry/__nameid_version1.0
Copying: Root Entry/__nameid_version1.0/__substg1.0_10090102
Copying: Root Entry/__nameid_version1.0/__substg1.0_100F0102
Copying: Root Entry/__nameid_version1.0/__substg1.0_00040102
Exception in thread "main" java.io.IOException: block[ 0 ] already removed
         at org.apache.poi.poifs.storage.BlockListImpl.remove(BlockListImpl.java:133)
         at org.apache.poi.poifs.storage.BlockAllocationTableReader.fetchBlocks(BlockAllocationTableReader.java:227)
         at org.apache.poi.poifs.storage.BlockListImpl.fetchBlocks(BlockListImpl.java:165)
         at org.apache.poi.poifs.filesystem.POIFSFileSystem.processProperties(POIFSFileSystem.java:435)
         at org.apache.poi.poifs.filesystem.POIFSFileSystem.processProperties(POIFSFileSystem.java:423)
         at org.apache.poi.poifs.filesystem.POIFSFileSystem.<init>(POIFSFileSystem.java:139)
         at Tester.test(Tester.java:71)
         at Tester.copy(Tester.java:62)
         at Tester.copy(Tester.java:52)
         at Tester.main(Tester.java:24)

Here's an excerpt from POIFSViewer on the input file:

     __substg1.0_00040102
       Property: "__substg1.0_00040102"
         Name          = "__substg1.0_00040102"
         Property Type = 2
         Node Color    = 1
         Time 1        = 0
         Time 2        = 0
       Document: "__substg1.0_00040102" size = 0
         <NO DATA>

This is the *only* document within the file which prints "<NO DATA>".  I 
then ran the viewer on various Word and Excel documents, and none of 
them displayed <NO DATA> for any documents.  If I change my tester to 
ignore documents that have no data (i.e. available() returns 0), then 
the copy succeeds and the resulting file is readable.  It just 
doesn't contain the empty document.

So, since the POIFSViewer (and some of my own utilities) can read 
this document without problems from the original file, I can only 
conclude that the process of writing the empty document is messing up 
the file such that a subsequent read fails.  I tried changing the copy 
code to use a POIFSWriterListener rather than just creating the 
document from an input stream, but I got the same problem.

Suspecting this was the cause, I created a simple test case that 
constructs a POIFSFileSystem containing a single, empty document. 
Instead of the error message above, I got the message "Cannot remove 
block[ 0 ]; out of range".  If I add a non-zero-length document to the 
POIFSFileSystem first, and then add the empty document, I get the same 
error as above ("block[ 0 ] already removed").  The test case (covering 
both situations) is attached.  It could easily be made into a JUnit test 
case and added to the POI test suite; I just don't have a POI 
development environment set up.
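
To make the two messages concrete, here is a toy model of how they could 
both come from the same mistake.  This is not POI's code: the names 
BlockListSketch and tryRemove are made up, and the model rests on an 
assumption I have not confirmed in the POIFS source, namely that the 
entry written for an empty document ends up pointing at block 0 and that 
each block may be claimed only once on read.

```java
import java.io.IOException;

// Hypothetical stand-in for a list of raw blocks (NOT the real BlockListImpl).
// A null slot marks a block that has already been handed out.
class BlockListSketch {
    private final byte[][] blocks;

    BlockListSketch(int count) {
        blocks = new byte[count][];
        for (int i = 0; i < count; i++) {
            blocks[i] = new byte[512];
        }
    }

    // Hand out a block exactly once; complain if it is missing or reused.
    byte[] remove(int index) throws IOException {
        if (index < 0 || index >= blocks.length) {
            throw new IOException("Cannot remove block[ " + index + " ]; out of range");
        }
        byte[] block = blocks[index];
        if (block == null) {
            throw new IOException("block[ " + index + " ] already removed");
        }
        blocks[index] = null;
        return block;
    }

    // Returns "ok" or the exception message, to keep the demo short.
    static String tryRemove(BlockListSketch list, int index) {
        try {
            list.remove(index);
            return "ok";
        } catch (IOException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Assumed case 1: only an empty document, so there is no block 0 at all.
        BlockListSketch none = new BlockListSketch(0);
        System.out.println(tryRemove(none, 0));

        // Assumed case 2: block 0 exists and is claimed by something else first,
        // then the empty document's entry asks for block 0 again.
        BlockListSketch one = new BlockListSketch(1);
        System.out.println(tryRemove(one, 0));
        System.out.println(tryRemove(one, 0));
    }
}
```

If the assumption holds, this would explain why the message changes 
depending on whether another document was written first.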

I am certainly no expert on the internals of POIFS, and so I'm not 
comfortable trying to track down this problem.  I figure someone with 
more knowledge of the code would have a much easier time tracking it down.

regards,
michael

import java.io.IOException;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;

import org.apache.poi.poifs.filesystem.POIFSFileSystem;
import org.apache.poi.poifs.filesystem.POIFSWriterEvent;
import org.apache.poi.poifs.filesystem.POIFSWriterListener;
import org.apache.poi.poifs.filesystem.DirectoryEntry;

public class BugOnEmptyDocumentWrite {

  public static void main(String[] args) {
    BugOnEmptyDocumentWrite driver = new BugOnEmptyDocumentWrite();

    System.out.println();
    System.out.println("As only file...");
    System.out.println();

    System.out.print("Trying using createDocument(String,InputStream): ");
    try {
      driver.testSingleEmptyDocument();
      System.out.println("Worked!");
    } catch (IOException exception) {
      System.out.println("failed! ");
      System.out.println(exception.toString());
    }
    System.out.println();

    System.out.print
      ("Trying using createDocument(String,int,POIFSWriterListener): ");
    try {
      driver.testSingleEmptyDocumentEvent();
      System.out.println("Worked!");
    } catch (IOException exception) {
      System.out.println("failed!");
      System.out.println(exception.toString());
    }
    System.out.println();

    System.out.println();
    System.out.println("After another file...");
    System.out.println();

    System.out.print("Trying using createDocument(String,InputStream): ");
    try {
      driver.testEmptyDocumentWithFriend();
      System.out.println("Worked!");
    } catch (IOException exception) {
      System.out.println("failed! ");
      System.out.println(exception.toString());
    }
    System.out.println();

    System.out.print
      ("Trying using createDocument(String,int,POIFSWriterListener): ");
    try {
      driver.testEmptyDocumentEventWithFriend();
      System.out.println("Worked!");
    } catch (IOException exception) {
      System.out.println("failed!");
      System.out.println(exception.toString());
    }
    System.out.println();
  }

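  // Write a file system holding only an empty document (InputStream
  // variant of createDocument), then read the result back in.
  public void testSingleEmptyDocument() throws IOException {
    POIFSFileSystem fs = new POIFSFileSystem();
    DirectoryEntry dir = fs.getRoot();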
    dir.createDocument("Foo", new ByteArrayInputStream(new byte[] { }));
    
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    fs.writeFilesystem(out);
    new POIFSFileSystem(new ByteArrayInputStream(out.toByteArray()));
  }

  // Same as above, but create the empty document through a
  // POIFSWriterListener that writes nothing.
  public void testSingleEmptyDocumentEvent() throws IOException {
    POIFSFileSystem fs = new POIFSFileSystem();
    DirectoryEntry dir = fs.getRoot();
    dir.createDocument("Foo", 0, new POIFSWriterListener() {
      public void processPOIFSWriterEvent(POIFSWriterEvent event) {
      }
    });
    
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    fs.writeFilesystem(out);
    new POIFSFileSystem(new ByteArrayInputStream(out.toByteArray()));
  }

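  // Add a one-byte document first, then the empty one (InputStream
  // variant), write the file system out and read it back in.
  public void testEmptyDocumentWithFriend() throws IOException {
    POIFSFileSystem fs = new POIFSFileSystem();
    DirectoryEntry dir = fs.getRoot();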
    dir.createDocument("Bar", new ByteArrayInputStream(new byte[] { 0 }));
    dir.createDocument("Foo", new ByteArrayInputStream(new byte[] { }));
    
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    fs.writeFilesystem(out);
    new POIFSFileSystem(new ByteArrayInputStream(out.toByteArray()));
  }

  // Same as above, but create both documents through
  // POIFSWriterListener callbacks.
  public void testEmptyDocumentEventWithFriend() throws IOException {
    POIFSFileSystem fs = new POIFSFileSystem();
    DirectoryEntry dir = fs.getRoot();
    dir.createDocument("Bar", 1, new POIFSWriterListener() {
      public void processPOIFSWriterEvent(POIFSWriterEvent event) {
        try {
          event.getStream().write(0);
        } catch (IOException exception) {
          throw new RuntimeException("exception on write: " + exception);
        }
      }
    });
    dir.createDocument("Foo", 0, new POIFSWriterListener() {
      public void processPOIFSWriterEvent(POIFSWriterEvent event) {
      }
    });
    
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    fs.writeFilesystem(out);
    new POIFSFileSystem(new ByteArrayInputStream(out.toByteArray()));
  }
}
