Oops, there's a comment on isContinueNext saying "Should never be called before end of current record", so the code must be correct.
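To make that behavior concrete, here is a minimal stand-alone sketch of the check. It is not POI's code: the class RecordModel, its fields, and the helper tryReadChars are hypothetical stand-ins for RecordInputStream, and a plain IllegalStateException stands in for RecordFormatException. It only models the rule that the stream moves on to the next record when the current one is fully consumed, never when a value would straddle two records.

```java
// Hypothetical model of a record-bounded stream; NOT the real POI class.
class RecordModel {
    private int remaining;       // bytes left in the current record
    private int nextRecordSize;  // size of the following Continue record, 0 if none

    RecordModel(int remaining, int nextRecordSize) {
        this.remaining = remaining;
        this.nextRecordSize = nextRecordSize;
    }

    boolean isContinueNext() {
        return nextRecordSize > 0;
    }

    // Same decision logic as the checkRecordPosition quoted in this thread.
    void checkRecordPosition(int requiredByteCount) {
        if (remaining >= requiredByteCount) {
            return; // all OK
        }
        if (remaining == 0 && isContinueNext()) {
            // Stand-in for nextRecord(): switch to the Continue record.
            remaining = nextRecordSize;
            nextRecordSize = 0;
            return;
        }
        throw new IllegalStateException("Not enough data (" + remaining
                + ") to read requested (" + requiredByteCount + ") bytes");
    }

    void readUShort() {
        checkRecordPosition(2);
        remaining -= 2;
    }

    // Tries to read charCount double-byte characters; returns "ok" or the error text.
    static String tryReadChars(int recordBytes, int nextRecordBytes, int charCount) {
        RecordModel in = new RecordModel(recordBytes, nextRecordBytes);
        try {
            for (int i = 0; i < charCount; i++) {
                in.readUShort();
            }
            return "ok";
        } catch (IllegalStateException e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        // One byte left, two requested: fails even though a Continue record follows.
        System.out.println(tryReadChars(1, 8, 1));
        // Record ends exactly on a character boundary: the read continues.
        System.out.println(tryReadChars(2, 8, 2)); // prints "ok"
    }
}
```

The first call prints "Not enough data (1) to read requested (2) bytes", the same message as in the stack trace. So the check itself behaves as the comment intends, and an odd byte left at the end of a record points to bad lengths in the data rather than to a bug in the check.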
I'm not sure how it happens, but the exception is raised when POI tries
to read the phonetic text in the ExtRst of a UnicodeString. POI tries to
read 18 double-byte characters (36 bytes), but RecordInputStream has only
29 bytes left (14 characters + 1 byte). When I check the Excel file with
a binary editor, everything up to that point seems correct.

When I opened the Excel file and deleted sheets that do not contain the
problematic string, the problem disappeared. I tried deleting different
sheets, and the problem sometimes disappeared and sometimes did not. I
also tried re-entering the problematic string by cutting and pasting it
in Excel, and then the problem always disappeared.

My best guess is that Excel sometimes saves corrupted data. Because the
corruption occurs in phonetic text, which is not usually visible to
users, it is not very obvious.

So my workaround is to use a debugger to find the strings that cause the
issue, then re-enter those strings in Excel and save. After that, POI can
read the file without any errors.

apptaro

On Thu, Feb 10, 2011 at 3:21 PM, Taro App <[email protected]> wrote:
> Hi Nick, thank you for your advice.
>
> I debugged into SSTRecord, and found something weird in
> org.apache.poi.hssf.record.RecordInputStream.java where raises the
> exception:
>
>     private void checkRecordPosition(int requiredByteCount) {
>
>         int nAvailable = remaining();
>         if (nAvailable >= requiredByteCount) {
>             // all OK
>             return;
>         }
>         if (nAvailable == 0 && isContinueNext()) {
>             nextRecord();
>             return;
>         }
>         throw new RecordFormatException("Not enough data (" + nAvailable
>                 + ") to read requested (" + requiredByteCount +") bytes");
>     }
>
> The above code seems to raise an exception even if requiredByteCount
> is available if it reads next record. When two bytes are required, and
> only one byte is left, even if there is a next record, it raises an
> exception. I'm new to POI, so I'd appreciate if you could confirm if
> this is a bug.
>
> There are three bugs related to "checkRecordPosition", but none of
> them seems related to this, because in the three bugs, errors say "Not
> enough data (0)" instead of "Not enough data (1)"
> - Bug 49219 - Not enough data (0) to read requested (1) bytes error on
>   Excel read
> - Bug 49677 - About sheet.getDefaultColumnWidth() serious error!
> - Bug 47247 - Initialisation of record 0x850 left 3060 bytes remaining
>   still to be read.
>
> Please advise.
>
> apptaro
>
> On Wed, Feb 9, 2011 at 10:05 PM, Nick Burch <[email protected]> wrote:
>> On Wed, 9 Feb 2011, Taro App wrote:
>>>
>>> Caused by: org.apache.poi.hssf.record.RecordFormatException: Not
>>> enough data (1) to read requested (2) bytes
>>>   at org.apache.poi.hssf.record.RecordInputStream.checkRecordPosition(RecordInputStream.java:216)
>>>   at org.apache.poi.hssf.record.RecordInputStream.readUShort(RecordInputStream.java:267)
>>>   at org.apache.poi.util.StringUtil.readUnicodeLE(StringUtil.java:277)
>>>   at org.apache.poi.hssf.record.common.UnicodeString$ExtRst.<init>(UnicodeString.java:172)
>>>   at org.apache.poi.hssf.record.common.UnicodeString.<init>(UnicodeString.java:438)
>>>   at org.apache.poi.hssf.record.SSTDeserializer.manufactureStrings(SSTDeserializer.java:55)
>>>   at org.apache.poi.hssf.record.SSTRecord.<init>(SSTRecord.java:250)
>>
>> This is the key bit of the stack trace. You have a SST Record which contains
>> a string that claims to be a 1 character unicode string (2 bytes). However,
>> the record only had 1 byte left, which is never a valid size for a unicode
>> string (they must have even lengths)
>>
>> I've no idea if excel has written something invalid, or if POI is confused
>> about the string being unicode vs ascii.
>>
>>> The file is saved with Excel 2003. It's a confidential file, so I
>>> can't provide the file here.
>>
>> I'd suggest you use a debugger to step into the SSTRecord code, and see if
>> you can spot what's going wrong. Is this the only string in the record? If
>> not, do the strings before it make sense or is there garbage in them?
>>
>> Nick
>>
>> ---------------------------------------------------------------------
>> To unsubscribe, e-mail: [email protected]
>> For additional commands, e-mail: [email protected]
