[ https://issues.apache.org/jira/browse/ORC-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495149#comment-15495149 ]
ASF GitHub Bot commented on ORC-101:
------------------------------------

Github user prasanthj commented on a diff in the pull request:

    https://github.com/apache/orc/pull/60#discussion_r79097833

    --- Diff: java/core/src/java/org/apache/orc/impl/WriterImpl.java ---
    @@ -1901,7 +2014,11 @@ void writeBatch(ColumnVector vector, int offset,
                 HiveDecimal value = vec.vector[0].getHiveDecimal();
                 indexStatistics.updateDecimal(value);
                 if (createBloomFilter) {
    -              bloomFilter.addString(value.toString());
    +              String str = value.toString();
    +              if (bloomFilter != null) {
    +                bloomFilter.addString(str);
    --- End diff --

    Can you plz add a comment here for not using UTF-8 for decimals?


> Correct the use of the default charset in the bloomfilter
> ---------------------------------------------------------
>
>                 Key: ORC-101
>                 URL: https://issues.apache.org/jira/browse/ORC-101
>             Project: Orc
>          Issue Type: Improvement
>            Reporter: Owen O'Malley
>            Assignee: Owen O'Malley
>
> Currently ORC's bloom filter depends on the default character set, which
> isn't constant between computers.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
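
For context on the issue summary above, here is a minimal sketch of why hashing bytes from the default character set is fragile. It is not the ORC code: the class CharsetSketch and its helpers are hypothetical names made up for illustration, and ISO-8859-1 merely stands in for "some other machine's default". The same string yields different bytes under different charsets, so a filter built on one machine may not match on another. Strings made only of ASCII digits, a sign and a decimal point happen to encode identically in these two charsets.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration only; not part of ORC.
public class CharsetSketch {

  // Bytes as seen with a fixed, machine-independent charset.
  static byte[] utf8Bytes(String s) {
    return s.getBytes(StandardCharsets.UTF_8);
  }

  // Bytes as seen with an explicitly chosen charset, standing in for
  // whatever Charset.defaultCharset() returns on a given machine.
  static byte[] bytesIn(String s, Charset cs) {
    return s.getBytes(cs);
  }

  // True when both charsets encode the string to the same byte sequence,
  // so a hash over those bytes would be the same on both "machines".
  static boolean sameBytes(String s, Charset a, Charset b) {
    return Arrays.equals(s.getBytes(a), s.getBytes(b));
  }

  public static void main(String[] args) {
    String text = "caf\u00e9";   // contains one non-ASCII character
    String decimal = "12.34";    // ASCII only

    System.out.println("default charset here: " + Charset.defaultCharset());
    System.out.println("non-ASCII stable: "
        + sameBytes(text, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
    System.out.println("ASCII decimal stable: "
        + sameBytes(decimal, StandardCharsets.UTF_8, StandardCharsets.ISO_8859_1));
  }
}
```

Passing an explicit charset such as StandardCharsets.UTF_8 to getBytes removes the dependence on the machine's configuration. Whether that is the reasoning behind the decimal case in the diff above is not stated in this message.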