[
https://issues.apache.org/jira/browse/SPARK-4213?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Michael Armbrust resolved SPARK-4213.
-------------------------------------
Resolution: Fixed
Issue resolved by pull request 3083
[https://github.com/apache/spark/pull/3083]
> SparkSQL - ParquetFilters - No support for LT, LTE, GT, GTE operators
> ---------------------------------------------------------------------
>
> Key: SPARK-4213
> URL: https://issues.apache.org/jira/browse/SPARK-4213
> Project: Spark
> Issue Type: Bug
> Components: SQL
> Affects Versions: 1.2.0
> Environment: CDH5.2, Hive 0.13.1, Spark 1.2 snapshot (commit hash
> 76386e1a23c)
> Reporter: Terry Siu
> Priority: Blocker
> Fix For: 1.2.0
>
>
> When I issue an hql query against a HiveContext and the predicate compares a
> column of string type using one of the LT, LTE, GT, or GTE operators, I get
> the following error:
> scala.MatchError: StringType (of class
> org.apache.spark.sql.catalyst.types.StringType$)
> Looking at the code in org.apache.spark.sql.parquet.ParquetFilters,
> StringType is absent from the corresponding functions for creating these
> filters.
> To reproduce, in a Hive 0.13.1 shell, I created the following table (at a
> specified DB):
> create table sparkbug (
>   id int,
>   event string
> ) stored as parquet;
> Insert some sample data:
> insert into table sparkbug select 1, '2011-06-18' from <some table> limit 1;
> insert into table sparkbug select 2, '2012-01-01' from <some table> limit 1;
> Launch a spark shell and create a HiveContext connected to the metastore
> where the table above is located:
> import org.apache.spark.sql._
> import org.apache.spark.sql.SQLContext
> import org.apache.spark.sql.hive.HiveContext
> val hc = new HiveContext(sc)
> hc.setConf("spark.sql.shuffle.partitions", "10")
> hc.setConf("spark.sql.hive.convertMetastoreParquet", "true")
> hc.setConf("spark.sql.parquet.compression.codec", "snappy")
> import hc._
> hc.hql("select * from <db>.sparkbug where event >= '2011-12-01'")
> A scala.MatchError will appear in the output.
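
The failure mode described above can be sketched in isolation. This is a minimal illustration only, not the code in org.apache.spark.sql.parquet.ParquetFilters and not the change made in pull request 3083. The object ParquetFilterSketch, its DataType stand-ins, and the functions gtEqBefore and gtEqAfter are hypothetical names invented for this example. It shows how a pattern match that lacks a case for StringType throws scala.MatchError at runtime, and one possible way a match can cover the type and fall back for anything it does not handle.

```scala
import scala.util.Try

object ParquetFilterSketch {
  // Hypothetical stand-ins for Catalyst data types (assumption: these are
  // NOT Spark's classes, they only mimic the shape of the problem).
  abstract class DataType
  case object IntegerType extends DataType
  case object StringType extends DataType
  case object BooleanType extends DataType

  // Before: the match has no case for StringType, so passing it throws
  // scala.MatchError at runtime.
  def gtEqBefore(column: String, dataType: DataType): String = dataType match {
    case IntegerType => s"int column '$column' >= value"
  }

  // After (one possible remedy): cover StringType explicitly and return None
  // for any type without a filter, so the caller can skip filter creation
  // instead of failing.
  def gtEqAfter(column: String, dataType: DataType): Option[String] = dataType match {
    case IntegerType => Some(s"int column '$column' >= value")
    case StringType  => Some(s"string column '$column' >= value")
    case _           => None
  }
}

// Usage: the first call yields a Failure wrapping scala.MatchError,
// the second yields a defined Option.
val before = Try(ParquetFilterSketch.gtEqBefore("event", ParquetFilterSketch.StringType))
val after  = ParquetFilterSketch.gtEqAfter("event", ParquetFilterSketch.StringType)
```

Returning an Option rather than throwing is a design choice made for this sketch: a query whose predicate cannot be turned into a filter could then still run without it.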