[ https://issues.apache.org/jira/browse/ORC-101?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15495045#comment-15495045 ]
ASF GitHub Bot commented on ORC-101:
------------------------------------
Github user prasanthj commented on a diff in the pull request:
https://github.com/apache/orc/pull/60#discussion_r79093595
--- Diff: java/core/src/java/org/apache/orc/OrcConf.java ---
@@ -105,6 +105,12 @@
"dictionary or not will be retained thereafter."),
BLOOM_FILTER_COLUMNS("orc.bloom.filter.columns",
"orc.bloom.filter.columns",
"", "List of columns to create bloom filters for when writing."),
+ BLOOM_FILTER_WRITE_VERSION("orc.bloom.filter.write.version",
+ "orc.bloom.filter.write.version",
+ OrcFile.BloomFilterVersion.UTF8.toString(),
+ "Which version of the bloom filter should we write."),
--- End diff --
Any reason to have this config? If we are going to default to UTF8 anyway,
we don't need this, right?
> Correct the use of the default charset in the bloomfilter
> ---------------------------------------------------------
>
> Key: ORC-101
> URL: https://issues.apache.org/jira/browse/ORC-101
> Project: Orc
> Issue Type: Improvement
> Reporter: Owen O'Malley
> Assignee: Owen O'Malley
>
> Currently ORC's bloom filter depends on the default character set, which
> isn't constant between computers.
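
To illustrate the problem described in the issue, here is a minimal sketch; it is not ORC's actual bloom filter code. It shows that the bytes produced for the same string depend on the charset used, so anything hashed from `String.getBytes()` with no argument can differ between machines. The class name `CharsetDemo` and the helper `encode` are hypothetical names chosen for this example.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class CharsetDemo {
    // Encode with an explicitly named charset: the result is the same
    // on every machine, regardless of platform settings.
    public static byte[] encode(String s, Charset cs) {
        return s.getBytes(cs);
    }

    public static void main(String[] args) {
        String value = "caf\u00e9"; // "café", contains one non-ASCII character

        byte[] utf8 = encode(value, StandardCharsets.UTF_8);
        byte[] latin1 = encode(value, StandardCharsets.ISO_8859_1);
        // No charset argument: uses Charset.defaultCharset(), which
        // depends on the JVM and platform configuration.
        byte[] platform = value.getBytes();

        System.out.println("default charset: " + Charset.defaultCharset());
        System.out.println("UTF-8:      " + Arrays.toString(utf8));
        System.out.println("ISO-8859-1: " + Arrays.toString(latin1));
        System.out.println("platform:   " + Arrays.toString(platform));
    }
}
```

The UTF-8 encoding of this string is 5 bytes and the ISO-8859-1 encoding is 4 bytes, so a filter built from one byte sequence would not match a lookup that hashes the other. Passing an explicit charset such as `StandardCharsets.UTF_8` removes that dependency.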