GitHub user sarutak opened a pull request:

    https://github.com/apache/spark/pull/3083

    [SPARK-4213] ParquetFilters - No support for LT, LTE, GT, GTE operators

    The following description is quoted from JIRA:
    
    When I issue an HQL query against a HiveContext where my predicate uses a
    column of string type with one of the LT, LTE, GT, or GTE operators, I get
    the following error:
    scala.MatchError: StringType (of class org.apache.spark.sql.catalyst.types.StringType$)
    Looking at the code in org.apache.spark.sql.parquet.ParquetFilters,
    StringType is absent from the corresponding functions for creating these
    filters.
    To reproduce, in a Hive 0.13.1 shell, I created the following table (in a
    specified database):
    create table sparkbug (
    id int,
    event string
    ) stored as parquet;
    Insert some sample data:
    insert into table sparkbug select 1, '2011-06-18' from <some table> limit 1;
    insert into table sparkbug select 2, '2012-01-01' from <some table> limit 1;
    Launch a Spark shell and create a HiveContext connected to the metastore
    where the table above is located.
    import org.apache.spark.sql._
    import org.apache.spark.sql.SQLContext
    import org.apache.spark.sql.hive.HiveContext
    val hc = new HiveContext(sc)
    hc.setConf("spark.sql.shuffle.partitions", "10")
    hc.setConf("spark.sql.hive.convertMetastoreParquet", "true")
    hc.setConf("spark.sql.parquet.compression.codec", "snappy")
    import hc._
    hc.hql("select * from <db>.sparkbug where event >= '2011-12-01'")
    A scala.MatchError will appear in the output.
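
The failure mode described above can be sketched without Spark. This is a minimal illustration only: `DataType`, `IntegerType`, `StringType`, and the `createGtEq*` functions below are hypothetical stand-ins, not the actual Catalyst classes or the ParquetFilters code changed by this pull request. It assumes the filter builders pattern-match on the column type and simply have no case for strings.

```scala
// Hypothetical stand-ins for Catalyst data types; NOT Spark's real classes.
trait DataType
case object IntegerType extends DataType
case object StringType extends DataType

object FilterSketch {
  // Mirrors the reported failure: there is no case for StringType,
  // so matching on it throws scala.MatchError at runtime.
  def createGtEqBroken(dt: DataType): String = dt match {
    case IntegerType => "int >= filter"
  }

  // Adding a StringType case lets the same call produce a filter.
  def createGtEqFixed(dt: DataType): String = dt match {
    case IntegerType => "int >= filter"
    case StringType  => "string >= filter"
  }

  def main(args: Array[String]): Unit = {
    val threw =
      try { createGtEqBroken(StringType); false }
      catch { case _: MatchError => true }
    println(s"broken version threw MatchError: $threw")
    println(createGtEqFixed(StringType))
  }
}
```

The string results are placeholders for whatever filter object the real code builds; the point is only that a missing `case` in a `match` surfaces as `scala.MatchError` rather than as an unsupported-filter fallback.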

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/sarutak/spark SPARK-4213

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/3083.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #3083
    
----
commit 9a1fae7fc87ad32af78ba843ab8a4457a9f8c067
Author: Kousuke Saruta <[email protected]>
Date:   2014-11-04T00:50:46Z

    Fixed ParquetFilters so that compare Strings

----


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---

