[ https://issues.apache.org/jira/browse/ORC-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495149#comment-15495149 ]
ASF GitHub Bot commented on ORC-101:
------------------------------------
Github user prasanthj commented on a diff in the pull request:
https://github.com/apache/orc/pull/60#discussion_r79097833
--- Diff: java/core/src/java/org/apache/orc/impl/WriterImpl.java ---
@@ -1901,7 +2014,11 @@ void writeBatch(ColumnVector vector, int offset,
HiveDecimal value = vec.vector[0].getHiveDecimal();
indexStatistics.updateDecimal(value);
if (createBloomFilter) {
- bloomFilter.addString(value.toString());
+ String str = value.toString();
+ if (bloomFilter != null) {
+ bloomFilter.addString(str);
--- End diff --
Can you please add a comment here explaining why UTF-8 is not used for decimals?
> Correct the use of the default charset in the bloomfilter
> ---------------------------------------------------------
>
> Key: ORC-101
> URL: https://issues.apache.org/jira/browse/ORC-101
> Project: Orc
> Issue Type: Improvement
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
>
> Currently ORC's bloom filter depends on the default character set, which
> isn't constant between computers.
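
For illustration only, here is a minimal Java sketch of the problem the issue describes: the same string produces different bytes under different charsets, so anything hashed from those bytes (such as a bloom filter entry) differs too. The class name CharsetDemo and the helper encode are hypothetical and are not part of ORC; the sketch names two charsets explicitly to stand in for two machines with different defaults.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not ORC code.
public class CharsetDemo {

    // Encode with an explicitly named charset. Calling s.getBytes() with no
    // argument would instead use the JVM's default charset, which can vary
    // from one machine to another.
    static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        String text = "caf\u00e9"; // contains one non-ASCII character
        byte[] utf8 = encode(text, StandardCharsets.UTF_8);
        byte[] latin1 = encode(text, StandardCharsets.ISO_8859_1);

        // The non-ASCII character is two bytes in UTF-8 and one in ISO-8859-1.
        System.out.println(Arrays.toString(utf8));   // [99, 97, 102, -61, -87]
        System.out.println(Arrays.toString(latin1)); // [99, 97, 102, -23]

        // An ASCII-only string encodes to the same bytes under both charsets.
        String digits = "123.45";
        System.out.println(Arrays.equals(
            encode(digits, StandardCharsets.UTF_8),
            encode(digits, StandardCharsets.ISO_8859_1))); // true
    }
}
```

Passing the charset explicitly, as encode does here, keeps the bytes fed to a hash independent of the machine that runs the code.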