Gopal V created ORC-406:
---------------------------

             Summary: ORC: Char(n) and Varchar(n) writers truncate to n bytes & corrupt multi-byte data
                 Key: ORC-406
                 URL: https://issues.apache.org/jira/browse/ORC-406
             Project: ORC
          Issue Type: Bug
    Affects Versions: 1.5.2
            Reporter: Gopal V


https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/writer/CharTreeWriter.java#L41

{code}
    itemLength = schema.getMaxLength();
    padding = new byte[itemLength];
  }
{code}

https://github.com/apache/orc/blob/master/java/core/src/java/org/apache/orc/impl/writer/VarcharTreeWriter.java#L48

{code}
      if (vector.noNulls || !vector.isNull[0]) {
        int itemLength = Math.min(vec.length[0], maxLength);
{code}
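
To illustrate the symptom, here is a minimal standalone sketch (not the ORC writer code) that contrasts truncating by byte count, as in the Math.min(vec.length[0], maxLength) line above, with truncating at a UTF-8 character boundary. It assumes the n in varchar(n) is meant to count characters rather than bytes. The class and method names (Utf8Truncation, truncateBytes, truncatedLength) are hypothetical and exist only for this example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

final class Utf8Truncation {

  // Byte-count truncation: keeps at most maxLength bytes, so it can cut
  // through the middle of a multi-byte UTF-8 sequence.
  public static byte[] truncateBytes(byte[] utf8, int maxLength) {
    return Arrays.copyOf(utf8, Math.min(utf8.length, maxLength));
  }

  // Character-aware truncation: returns the number of bytes that hold at
  // most maxChars code points, starting at offset.
  public static int truncatedLength(byte[] utf8, int offset, int length, int maxChars) {
    int chars = 0;
    int end = offset + length;
    for (int i = offset; i < end; i++) {
      // Bytes of the form 10xxxxxx continue a sequence; any other byte
      // starts a new code point.
      if ((utf8[i] & 0xC0) != 0x80) {
        if (chars == maxChars) {
          return i - offset;
        }
        chars++;
      }
    }
    return length;
  }

  public static void main(String[] args) {
    // "h", e-acute (2 bytes in UTF-8), "l", "l", "o": 5 characters, 6 bytes.
    byte[] data = "h\u00e9llo".getBytes(StandardCharsets.UTF_8);
    int maxLength = 2; // as if the column were varchar(2)

    byte[] byByte = truncateBytes(data, maxLength);
    int safeLen = truncatedLength(data, 0, data.length, maxLength);
    byte[] byChar = Arrays.copyOf(data, safeLen);

    // The byte-count result ends in half of a two-byte sequence, so it
    // no longer decodes to the original prefix.
    System.out.println("byte-based: " + byByte.length + " bytes, round-trips: "
        + new String(byByte, StandardCharsets.UTF_8).equals("h\u00e9"));
    System.out.println("char-based: " + byChar.length + " bytes, round-trips: "
        + new String(byChar, StandardCharsets.UTF_8).equals("h\u00e9"));
  }
}
```

In this sketch the byte-based path keeps 2 bytes and splits the two-byte character, while the character-aware path keeps 3 bytes and preserves both characters. The same concern would apply to the char(n) padding buffer sized by schema.getMaxLength(), if that length is a character count.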





