https://issues.apache.org/bugzilla/show_bug.cgi?id=56893

            Bug ID: 56893
           Summary: Overflow in UnicodeString results in corrupted file
                    when setCellValue() is called with a string larger
                    than 32767
           Product: POI
           Version: 3.9-FINAL
          Hardware: PC
                OS: Linux
            Status: NEW
          Severity: normal
          Priority: P2
         Component: HSSF
          Assignee: [email protected]
          Reporter: [email protected]

Hi,

In the following snippet from org.apache.poi.hssf.record.common.UnicodeString
there is a cast from int to short. In my case I had a string with a length of
65571, which becomes 35 when cast to short.

public void setString(String string)
    {
        field_3_string = string;
        setCharCount((short)field_3_string.length());
        // scan for characters greater than 255 ... if any are
        // present, we have to use 16-bit encoding. Otherwise, we
        // can use 8-bit encoding
        boolean useUTF16 = false;
        int strlen = string.length();

        for ( int j = 0; j < strlen; j++ )
        {
            if ( string.charAt( j ) > 255 )
            {
                useUTF16 = true;
                break;
            }
        }
        if (useUTF16)
          //Set the uncompressed bit
          field_2_optionflags = highByte.setByte(field_2_optionflags);
        else field_2_optionflags = highByte.clearByte(field_2_optionflags);
    }
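
To show the overflow in isolation, here is a minimal sketch of what the
(short) cast does to a length above 32767. NarrowingCastDemo and toShort are
names made up for this illustration; they are not part of POI.

```java
class NarrowingCastDemo {
    // Mirrors the (short) cast applied to string.length() in setString.
    // Hypothetical helper, used only for this illustration.
    static short toShort(int length) {
        return (short) length; // narrowing keeps only the low 16 bits
    }

    public static void main(String[] args) {
        System.out.println(toShort(65571)); // 65571 - 65536 = 35
        System.out.println(toShort(32767)); // largest length that survives the cast
        System.out.println(toShort(32768)); // wraps to -32768
    }
}
```

So any length above 32767 is silently replaced by a different value, and for
65571 that value happens to be a small positive number.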

Because setCellValue(String value) in HSSFCell first creates an
HSSFRichTextString and then calls setCellValue(RichTextString value), the
check below passes on the truncated length even though the original string
is over the limit.

if(hvalue.length() > SpreadsheetVersion.EXCEL97.getMaxTextLength()){
  throw new IllegalArgumentException("The maximum length of cell contents (text) is 32,767 characters");
}

I'm not very familiar with the concepts here, so excuse me if I'm wrong, but it
seems the cast could be removed from the setCharCount call, since a
UnicodeString shouldn't need to enforce the limit set by the Excel standard.
That alone should be enough to avoid the overflow. Alternatively, checking
the length of the original string in setCellValue(String value) might be enough.
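
A rough sketch of the second option, assuming the check is done on the raw
String before any rich text object is built. CellTextLengthCheck,
MAX_TEXT_LENGTH and requireWithinLimit are hypothetical names for this
example, not existing POI code; the limit and message are taken from the
check quoted above.

```java
class CellTextLengthCheck {
    // Excel 97 cell text limit, as stated in the error message quoted above.
    static final int MAX_TEXT_LENGTH = 32767;

    // Hypothetical helper: validates the original String, whose length() is
    // an int, so no short cast can hide an oversized value.
    static String requireWithinLimit(String value) {
        if (value != null && value.length() > MAX_TEXT_LENGTH) {
            throw new IllegalArgumentException(
                "The maximum length of cell contents (text) is 32,767 characters");
        }
        return value;
    }

    public static void main(String[] args) {
        String tooLong = new String(new char[65571]); // same length as in this report
        try {
            requireWithinLimit(tooLong);
            System.out.println("accepted");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

With a check like this at the top of setCellValue(String value), the oversized
string would be rejected before its length is ever narrowed.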
