GitHub user piaozhexiu commented on the pull request:

    https://github.com/apache/spark/pull/6921#issuecomment-116243679
  
    @marmbrus I got all the unit tests passing now. Can you please review my 
patch? This is no longer WIP.
    
    I incorporated all your suggestions. A few things I should mention:
    * All the BinaryComparison operators (=, <, >, <=, >=) are supported for 
both integral and string key types.
    * In v0.13, integral types should not be encoded as strings. For example, 
`int_key = "1"` throws an error while `int_key = 1` works. In fact, this was 
why I was seeing many unit test failures before. Fixing it makes all unit 
tests pass now.
    * In v0.12, integral types should be encoded as strings. I ran some test 
queries against a v0.12 metastore server in my environment, and I saw an error. 
But if I encode integral types as strings (i.e. `int_key = "1"`), the error goes 
away. So this behavior differs from v0.13. My question is, are you going 
to support v0.12 in the 1.5 release? If so, we might need to add some logic to 
encode integral types as strings only for v0.12.
    * Predicates are not pushed down when 
`spark.sql.hive.convertMetastoreParquet` is set to true. This is because in 
`HiveMetastoreCatalog`, the full list of partitions is needed to convert Hive 
metadata into Parquet (before `HiveTableScan` is invoked). Please let me know if 
this can be optimized. 
    * Hive `varchar` type cannot be pushed down into the Hive metastore, so I 
had to remove varchar keys from partition pruning predicates in 
`HiveTableScan`. But since Catalyst treats varchar as string, it was not 
possible to tell whether an expression whose data type is Catalyst string is a 
Hive varchar or string. So I needed to pass in the Hive partition keys schema 
in addition to raw Catalyst `Expression` objects to `HiveShim`.
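
    To make the quoting and varchar points above concrete, here is a minimal, 
self-contained sketch. It is not the code in the patch: `KeyType`, `Comparison`, 
`PartitionFilterSketch`, `toFilter` and the `quoteIntegrals` flag are all 
hypothetical names, and the real logic works on Catalyst `Expression` objects 
in `HiveShim`. Joining predicates with `and` is also an assumption here.

```scala
// Hypothetical, simplified model of a partition key's Hive type.
sealed trait KeyType
case object IntegralKey extends KeyType
case object StringKey extends KeyType
case object VarcharKey extends KeyType

// One comparison of a partition key against a literal, e.g. int_key >= 1.
final case class Comparison(key: String, op: String, value: String)

object PartitionFilterSketch {
  // The comparison operators listed in the comment above.
  private val supportedOps = Set("=", "<", ">", "<=", ">=")

  // Integral literals stay bare unless quoteIntegrals is set (the v0.12
  // behavior described above); all other literals are double-quoted.
  // Note: this sketch does not escape quotes embedded in the value.
  private def encode(value: String, keyType: KeyType, quoteIntegrals: Boolean): String =
    keyType match {
      case IntegralKey if !quoteIntegrals => value
      case _                              => "\"" + value + "\""
    }

  // Builds a metastore filter string. Predicates on varchar keys, on keys
  // missing from the schema, or with unsupported operators are dropped,
  // so they would have to be evaluated after the partitions are fetched.
  def toFilter(
      predicates: Seq[Comparison],
      schema: Map[String, KeyType],
      quoteIntegrals: Boolean): String =
    predicates
      .filter(p => supportedOps.contains(p.op))
      .flatMap { p =>
        schema.get(p.key) match {
          case Some(VarcharKey) | None => None
          case Some(t) =>
            Some(s"${p.key} ${p.op} ${encode(p.value, t, quoteIntegrals)}")
        }
      }
      .mkString(" and ")
}
```

    With a schema of `ds` (string), `hr` (integral) and `region` (varchar), the 
predicates `ds = 2015-06-27`, `hr >= 12` and `region = us` give 
`ds = "2015-06-27" and hr >= 12` when `quoteIntegrals` is false, and 
`ds = "2015-06-27" and hr >= "12"` when it is true. The varchar predicate is 
left out in both cases, which is why the key schema has to be passed in 
alongside the predicates.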
    
    Thanks!

