[ https://issues.apache.org/jira/browse/ORC-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495147#comment-15495147 ]
ASF GitHub Bot commented on ORC-101:
------------------------------------
Github user prasanthj commented on a diff in the pull request:
https://github.com/apache/orc/pull/60#discussion_r79097639
--- Diff: java/core/src/java/org/apache/orc/util/BloomFilter.java ---
@@ -130,7 +125,7 @@ public void addString(String val) {
     if (val == null) {
       add(null);
     } else {
-      add(val.getBytes());
+      add(val.getBytes(Charset.defaultCharset()));
--- End diff --
Should we make UTF-8 the default and provide an alternate addString()
overload that accepts a Charset?
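
To make the suggestion concrete, here is a minimal sketch of the two-overload
shape. It is not the ORC implementation: the class name StringBloomSketch, the
testString() methods, the bit-array size, and the toy hash are all assumptions
made up for illustration. Only the idea of a UTF-8 default plus a
Charset-accepting addString() comes from the comment above.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.BitSet;

// Hypothetical stand-in for illustration only; not org.apache.orc.util.BloomFilter.
final class StringBloomSketch {
  private static final int NUM_BITS = 1024;   // assumed size, chosen arbitrarily
  private static final int NUM_HASHES = 3;    // assumed hash count
  private final BitSet bits = new BitSet(NUM_BITS);

  // Default overload: always encodes as UTF-8, so the same string sets the
  // same bits regardless of the JVM's default charset.
  public void addString(String val) {
    addString(val, StandardCharsets.UTF_8);
  }

  // Alternate overload for callers that need a specific encoding.
  public void addString(String val, Charset charset) {
    add(val == null ? null : val.getBytes(charset));
  }

  public boolean testString(String val) {
    return testString(val, StandardCharsets.UTF_8);
  }

  public boolean testString(String val, Charset charset) {
    return test(val == null ? null : val.getBytes(charset));
  }

  private void add(byte[] bytes) {
    for (int i = 0; i < NUM_HASHES; i++) {
      bits.set(position(bytes, i));
    }
  }

  private boolean test(byte[] bytes) {
    for (int i = 0; i < NUM_HASHES; i++) {
      if (!bits.get(position(bytes, i))) {
        return false;
      }
    }
    return true;
  }

  // Toy hash for the sketch (a real filter would use a stronger hash).
  // Arrays.hashCode returns 0 for a null array, which covers the null case.
  private static int position(byte[] bytes, int seed) {
    int h = Arrays.hashCode(bytes) * 31 + seed * 0x9E3779B9;
    return Math.floorMod(h, NUM_BITS);
  }
}
```

With this shape, a writer and a reader on machines with different default
charsets both go through the UTF-8 path unless they explicitly ask for
something else, so a non-ASCII string hashes to the same bits on both sides.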
> Correct the use of the default charset in the bloomfilter
> ---------------------------------------------------------
>
> Key: ORC-101
> URL: https://issues.apache.org/jira/browse/ORC-101
> Project: Orc
> Issue Type: Improvement
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
>
> Currently ORC's bloom filter depends on the default character set, which
> isn't constant between computers.