[ https://issues.apache.org/jira/browse/SPARK-12417?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Dongjoon Hyun resolved SPARK-12417.
-----------------------------------
Resolution: Fixed
Fix Version/s: 2.0.0
This has been fixed since 2.0.0.
{code}
scala> spark.version
res0: String = 2.0.0
scala> Seq((1,2)).toDF("a", "b").write.option("orc.bloom.filter.columns", "*").orc("/tmp/orc200")
{code}
{code}
$ hive --orcfiledump /tmp/orc200/part-r-00007-d36ca145-1e23-4d3a-ba99-09506e4ed8cc.snappy.orc
...
Stripes:
Stripe: offset: 3 data: 12 rows: 1 tail: 92 index: 1390
Stream: column 0 section ROW_INDEX start: 3 length 11
Stream: column 0 section BLOOM_FILTER start: 14 length 426
Stream: column 1 section ROW_INDEX start: 440 length 24
Stream: column 1 section BLOOM_FILTER start: 464 length 456
Stream: column 2 section ROW_INDEX start: 920 length 24
Stream: column 2 section BLOOM_FILTER start: 944 length 449
Stream: column 1 section DATA start: 1393 length 6
Stream: column 2 section DATA start: 1399 length 6
...
{code}
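The check above is done by eye: each column should have a BLOOM_FILTER stream next to its ROW_INDEX stream. A minimal sketch of automating that check follows, assuming the dump lines keep the "Stream: column N section KIND start: X length Y" layout shown above. The class and method names (BloomFilterStreams, columnsWithBloomFilter) are hypothetical and not part of Spark, Hive, or ORC.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper (not part of Spark, Hive, or ORC): scans the text
// printed by "hive --orcfiledump" for BLOOM_FILTER streams.
public class BloomFilterStreams {
    // Assumes the line layout shown in the dump above, e.g.
    //   Stream: column 1 section BLOOM_FILTER start: 464 length 456
    private static final Pattern STREAM = Pattern.compile(
        "Stream:\\s+column\\s+(\\d+)\\s+section\\s+(\\w+)\\s+start:\\s+(\\d+)\\s+length\\s+(\\d+)");

    /** Returns the column ids that have a BLOOM_FILTER stream, in order of appearance. */
    public static List<Integer> columnsWithBloomFilter(String dump) {
        List<Integer> columns = new ArrayList<>();
        Matcher m = STREAM.matcher(dump);
        while (m.find()) {
            if ("BLOOM_FILTER".equals(m.group(2))) {
                columns.add(Integer.parseInt(m.group(1)));
            }
        }
        return columns;
    }

    public static void main(String[] args) {
        // Stream lines copied from the stripe dump above.
        String dump = String.join("\n",
            "Stream: column 0 section ROW_INDEX start: 3 length 11",
            "Stream: column 0 section BLOOM_FILTER start: 14 length 426",
            "Stream: column 1 section ROW_INDEX start: 440 length 24",
            "Stream: column 1 section BLOOM_FILTER start: 464 length 456",
            "Stream: column 2 section ROW_INDEX start: 920 length 24",
            "Stream: column 2 section BLOOM_FILTER start: 944 length 449",
            "Stream: column 1 section DATA start: 1393 length 6",
            "Stream: column 2 section DATA start: 1399 length 6");
        System.out.println(columnsWithBloomFilter(dump)); // prints [0, 1, 2]
    }
}
```

An empty result for a file written with "orc.bloom.filter.columns" set would correspond to the behavior reported in this issue.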
> Orc bloom filter options are not propagated during file write in spark
> ----------------------------------------------------------------------
>
> Key: SPARK-12417
> URL: https://issues.apache.org/jira/browse/SPARK-12417
> Project: Spark
> Issue Type: Improvement
> Components: SQL
> Reporter: Rajesh Balamohan
> Assignee: Apache Spark
> Priority: Minor
> Fix For: 2.0.0
>
> Attachments: SPARK-12417.1.patch
>
>
> ORC bloom filters are supported by the version of Hive used in Spark 1.5.2.
> However, when an ORC file is written with the bloom filter option set, the
> option is ignored.
> For example, the following write does not create bloom filters even though
> the option is specified.
> {noformat}
> Map<String, String> orcOption = new HashMap<String, String>();
> orcOption.put("orc.bloom.filter.columns", "*");
> hiveContext.sql("select * from accounts where effective_date='2015-12-30'")
>   .write().format("orc").options(orcOption).save("/tmp/accounts");
> {noformat}