https://issues.apache.org/bugzilla/show_bug.cgi?id=50779
--- Comment #5 from Yegor Kozlov 2011-03-07 09:18:34 EST ---

Interesting. So far we assumed that for primitive types (short, int, long, etc.) a CONTINUE record break always occurs at the type boundary. Your attachments clearly demonstrate that this is not always so: a CONTINUE break can fall in the middle of a primitive type.

I know how to fix it, but I'm hesitating over whether this behavior should be the default or only applied to this particular case.

Initialization of BIFF records sits on top of the RecordInputStream class, which reads primitive types greedily. To handle CONTINUE properly, it needs to read byte by byte and then make sense of the data read. Something like this:

    // Current version. Does not work if CONTINUE occurs between two bytes.
    public int readUShort() {
        checkRecordPosition(LittleEndian.SHORT_SIZE);
        _currentDataOffset += LittleEndian.SHORT_SIZE;
        return _dataInput.readUShort();
    }

    // Corrected. readByte() rolls over CONTINUE if necessary.
    public int readUShort() {
        // Mask each byte in case readByte() returns a signed value.
        int ch1 = readByte() & 0xFF;
        int ch2 = readByte() & 0xFF;
        return (ch2 << 8) + (ch1 << 0);
    }

Note that there is at least one case where readShort() must be greedy: for double-byte characters, a CONTINUE record break MUST occur at the double-byte character boundary.

Yegor
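
For illustration, the byte-by-byte idea can be tried outside POI with a small self-contained sketch. Everything here is an assumption made for the example: ChunkedReader, its fields, and readUByte() are hypothetical stand-ins rather than RecordInputStream itself, and each inner array simply models the payload of a record followed by its CONTINUE records.

```java
import java.util.NoSuchElementException;

/** Hypothetical stand-in for a record stream whose payload is split across chunks. */
class ChunkedReader {
    private final byte[][] chunks; // chunks[0] models the main record, the rest model CONTINUE payloads
    private int chunk = 0;         // index of the chunk currently being read
    private int offset = 0;        // read position inside the current chunk

    ChunkedReader(byte[][] chunks) {
        this.chunks = chunks;
    }

    /** Reads one unsigned byte, rolling over to the next chunk when the current one is exhausted. */
    int readUByte() {
        while (chunk < chunks.length && offset >= chunks[chunk].length) {
            chunk++;
            offset = 0;
        }
        if (chunk >= chunks.length) {
            throw new NoSuchElementException("no more data");
        }
        return chunks[chunk][offset++] & 0xFF;
    }

    /** Assembles a little-endian unsigned short from two single-byte reads. */
    int readUShort() {
        int lo = readUByte();
        int hi = readUByte();
        return (hi << 8) | lo;
    }

    public static void main(String[] args) {
        // The second short (0xCAFE) straddles the boundary between the two chunks.
        ChunkedReader r = new ChunkedReader(new byte[][] {
            {0x01, 0x00, (byte) 0xFE},
            {(byte) 0xCA}
        });
        System.out.println(r.readUShort());                      // 1
        System.out.println(Integer.toHexString(r.readUShort())); // cafe
    }
}
```

Because the short is built from two independent single-byte reads, the chunk boundary can fall between them without any special handling. A greedy two-byte read from the first chunk alone would run past its end.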
