Re: [SparkSql] Casting of Predicate Literals

2020-08-26 Thread Chao Sun
Thanks Bart. I'll give it a try. Presto has done something very similar (thanks DB for finding this!). They published an article ([1]) last year with a very thorough analysis of all the cases, which I think can be used as a reference for the implementation in Spark. [1]:

Re: [SparkSql] Casting of Predicate Literals

2020-08-26 Thread Bart Samwel
IMO it's worth an attempt. The previous attempts seem to be closed because of a general sense that this gets messy and leads to lots of special cases, but that's just how it is. This optimization would make the difference between getting sub-par performance for using some of these datatypes to

Re: [SparkSql] Casting of Predicate Literals

2020-08-25 Thread Chao Sun
Hi, I just realized there have already been multiple attempts at this issue in the past. From the discussion, it seems the preferred approach is to eliminate the casts before they get pushed to data sources, at least for a few common cases such as numeric types. However, a few PRs following this

Re: [SparkSql] Casting of Predicate Literals

2020-08-24 Thread Chao Sun
> Currently we can't. This is something we should improve, by either pushing down the cast to the data source, or simplifying the predicates to eliminate the cast. Hi all, I've created https://issues.apache.org/jira/browse/SPARK-32694 to track this. Comments on the JIRA are welcome. On Wed, Aug

Re: [SparkSql] Casting of Predicate Literals

2020-08-19 Thread Wenchen Fan
Currently we can't. This is something we should improve, by either pushing down the cast to the data source, or simplifying the predicates to eliminate the cast. On Wed, Aug 19, 2020 at 5:09 PM Bart Samwel wrote: > And how are we doing here on integer pushdowns? If someone does e.g. >

Re: [SparkSql] Casting of Predicate Literals

2020-08-19 Thread Bart Samwel
And how are we doing here on integer pushdowns? If someone does e.g. CAST(short_col AS LONG) < 1000, can we still push down "short_col < 1000" without the cast? On Tue, Aug 4, 2020 at 6:55 PM Russell Spitzer wrote: > Thanks! That's exactly what I was hoping for! Thanks for finding the Jira >
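For readers following the thread, the rewrite being asked about can be sketched in a few lines. This is only an illustration of the idea, not Spark's optimizer code: the function name `unwrap_cast`, the `INT_RANGES` table, and the string-based predicate representation are all hypothetical. It assumes the cast widens an integral column (so every column value survives the cast unchanged) and that the predicate is used as a filter, where a NULL result drops the row just like FALSE.

```python
# Hypothetical sketch of removing a widening cast from a comparison.
# Not Spark's implementation; names and representation are made up here.

# Value ranges of the integral column types considered in this sketch.
INT_RANGES = {
    "byte": (-2**7, 2**7 - 1),
    "short": (-2**15, 2**15 - 1),
    "int": (-2**31, 2**31 - 1),
    "long": (-2**63, 2**63 - 1),
}


def unwrap_cast(col, col_type, op, literal):
    """Rewrite `CAST(col AS <wider integral type>) <op> literal` so that
    the predicate refers to the bare column and can be pushed down.

    Returns a predicate string. Assumes filter semantics: a row whose
    predicate evaluates to NULL is dropped, same as FALSE.
    """
    if op not in ("<", "<=", ">", ">=", "="):
        raise ValueError(f"unsupported operator: {op}")
    lo, hi = INT_RANGES[col_type]

    # Literal fits in the column's own type: compare directly, no cast.
    if lo <= literal <= hi:
        return f"{col} {op} {literal}"

    # Literal is outside the column's range, so the comparison has the
    # same outcome for every non-NULL value of the column.
    if literal > hi:
        always_true = op in ("<", "<=")
    else:  # literal < lo
        always_true = op in (">", ">=")

    # "Always true" still excludes NULLs, hence IS NOT NULL.
    return f"{col} IS NOT NULL" if always_true else "FALSE"


print(unwrap_cast("short_col", "short", "<", 1000))
print(unwrap_cast("short_col", "short", "<", 100000))
print(unwrap_cast("short_col", "short", "=", 100000))
```

The out-of-range branch is where the special cases pile up: once the literal cannot be represented in the column's type, the comparison collapses to a constant for all non-NULL rows, and NULL handling has to be preserved explicitly.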

Re: [SparkSql] Casting of Predicate Literals

2020-08-04 Thread Russell Spitzer
Thanks! That's exactly what I was hoping for! Thanks for finding the Jira for me! On Tue, Aug 4, 2020 at 11:46 AM Wenchen Fan wrote: > I think this is not a problem in 3.0 anymore, see > https://issues.apache.org/jira/browse/SPARK-27638 > > On Wed, Aug 5, 2020 at 12:08 AM Russell Spitzer >

Re: [SparkSql] Casting of Predicate Literals

2020-08-04 Thread Xiao Li
Hi, Russell, You might be hitting one of the other cases in which CAST blocks predicate pushdown. If the Cast was added by users and it changes the actual type, we are unable to optimize it automatically because it could change the query correctness. If it was added by our type coercion rules
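A small model of why a user-written cast that changes the type cannot simply be dropped: the sketch below assumes a cast from double to integer that truncates toward zero, and the helper `cast_double_to_int` is a hypothetical stand-in for that cast, not a Spark API. Filtering on the cast value and filtering on the raw value select different rows.

```python
import math


def cast_double_to_int(x):
    # Hypothetical stand-in for a truncating CAST(double AS INT);
    # assumes in-range, non-NULL input.
    return math.trunc(x)


rows = [0.5, 1.0, 1.5, 2.0]

# WHERE CAST(double_col AS INT) = 1
with_cast = [x for x in rows if cast_double_to_int(x) == 1]

# WHERE double_col = 1  (the cast naively removed)
without_cast = [x for x in rows if x == 1]

print(with_cast)     # [1.0, 1.5]
print(without_cast)  # [1.0]
```

The row 1.5 matches only with the cast in place, so pushing down the cast-free predicate would silently lose it.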

Re: [SparkSql] Casting of Predicate Literals

2020-08-04 Thread Wenchen Fan
I think this is not a problem in 3.0 anymore, see https://issues.apache.org/jira/browse/SPARK-27638 On Wed, Aug 5, 2020 at 12:08 AM Russell Spitzer wrote: > I've just run into this issue again with another user and I feel like most > folks here have seen some flavor of this at some point. > >