https://issues.apache.org/bugzilla/show_bug.cgi?id=52446
Antoni Mylka antoni.my...@gmail.com changed:
What|Removed |Added
Status|NEEDINFO|NEW
--- Comment #2 from Antoni Mylka antoni.my...@gmail.com 2012-01-11 10:40:17 UTC ---
I took a very close look in the debugger. POIFSViewer seems to work at a
higher level, where blocks are already combined into streams. I know nothing
about the POI format, yet from what I understand it goes like this:
NPropertyTable is constructed with an iterator over byte buffers. Each byte
buffer represents a single block. In this file the blocks are 512 bytes large.
The NPropertyTable constructor goes through this stack trace twice:
ByteArrayBackedDataSource.read(int, long) line: 48
NPOIFSFileSystem.getBlockAt(int) line: 420
NPOIFSStream$StreamBlockByteBufferIterator.next() line: 213
NPOIFSStream$StreamBlockByteBufferIterator.next() line: 1
NPropertyTable.buildProperties(Iterator<ByteBuffer>, POIFSBigBlockSize) line: 84
The first time, getBlockAt is called with 946. Block n starts at (n+1)*512
because the 512-byte header comes first, so I looked at offset
947*512=484864 within the file. It contains four UTF-16 strings: Root
Entry, Data, 1Table, WordDocument. AFAIU these are names of top-level
directory entries. This block is parsed correctly by
PropertyFactory.convertToProperties(data, properties);
Afterwards comes the second block, index 956. It also comes down to
ByteArrayBackedDataSource.read(int, long) line: 48. Unfortunately, the end of
that block (957*512 + 512 = 490496) exceeds the size of the file. The returned
byte buffer is only 510 bytes large, hence the BufferUnderflowException. I
don't know how many blocks there should be (there is a BAT, but I don't
understand it). What I do know is that this file has been truncated somewhere
in the process.
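
To make the arithmetic above easier to check, here is a minimal sketch in plain Java. It is not POI code: the class BlockMath and its methods are hypothetical names, and the file length is not something I measured directly but inferred from the 510-byte buffer (957*512 + 510).

```java
// Sketch only: mirrors the offset arithmetic described above, not the POI source.
final class BlockMath {

    /** File offset of a block; one header block precedes block 0. */
    static long blockOffset(int blockIndex, int blockSize) {
        return (blockIndex + 1L) * blockSize;
    }

    /** Number of bytes a read of this block can return for a file of the given length. */
    static long availableBytes(int blockIndex, int blockSize, long fileLength) {
        long remaining = fileLength - blockOffset(blockIndex, blockSize);
        return Math.max(0L, Math.min(blockSize, remaining));
    }

    public static void main(String[] args) {
        int blockSize = 512;
        long fileLength = 957L * 512 + 510; // assumption: inferred from the short buffer

        System.out.println(blockOffset(946, blockSize));                // 484864
        System.out.println(availableBytes(946, blockSize, fileLength)); // 512, full block
        System.out.println(availableBytes(956, blockSize, fileLength)); // 510, truncated
    }
}
```

A reader that compares availableBytes against the block size before parsing could report the truncation explicitly instead of failing with a BufferUnderflowException.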
When the second block is parsed with only 510 bytes,
PropertyFactory.convertToProperties begins with
int property_count = data.length / POIFSConstants.PROPERTY_SIZE;
In my case this evaluates to 3 (510 / 128, integer division). The last 126
bytes are not taken into account, hence no errors. The second block, when
viewed in XVI, shows the UTF-16 strings SummaryInformation,
DocumentSummaryInformation, and \u0001CompObj (the three correct
properties). The fourth, truncated property contains only zeros and a run of
FF bytes, with no name:
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 FF FF FF FF FF FF FF FF FF FF FF FF
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
00 00 00 00 00 00 00 00 00 00 00 00 00 00
Therefore no information is lost. I think that my workaround is actually
correct.
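
The reasoning above can be sketched as a small check in plain Java. This is not the workaround itself and not POI code: PropertyBlockMath and isBlankTail are hypothetical names, PROPERTY_SIZE is hard-coded to 128 on the assumption that it matches POIFSConstants.PROPERTY_SIZE, and treating a tail of only 00 and FF bytes as "blank" is my own heuristic.

```java
// Sketch only: reproduces the property-count arithmetic and a blank-tail heuristic.
final class PropertyBlockMath {

    // Assumed to equal POIFSConstants.PROPERTY_SIZE.
    static final int PROPERTY_SIZE = 128;

    /** Number of whole properties that fit in the buffer (integer division). */
    static int completeProperties(int dataLength) {
        return dataLength / PROPERTY_SIZE;
    }

    /** Bytes left over after the last whole property; these are silently ignored. */
    static int leftoverBytes(int dataLength) {
        return dataLength % PROPERTY_SIZE;
    }

    /** Heuristic: true if the ignored tail contains nothing but 00 and FF bytes. */
    static boolean isBlankTail(byte[] data) {
        int start = completeProperties(data.length) * PROPERTY_SIZE;
        for (int i = start; i < data.length; i++) {
            if (data[i] != 0 && data[i] != (byte) 0xFF) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(completeProperties(510)); // 3
        System.out.println(leftoverBytes(510));      // 126
    }
}
```

If isBlankTail returns true for the short block, dropping the tail discards no named property, which is the case in the dump above.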