Hi Mich,

Did you check the URL Josh referred to? The cast for string comparisons is what makes a predicate such as `c_date >= "2016"` acceptable at all: the timestamp column is cast to a string and the two strings are compared lexicographically, character by character, which agrees with chronological order for zero-padded ISO-8601 values (so it is not just the first character that gets compared).
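For example, in the spark-shell (an illustrative, untested snippet; the table and data are made up):

  import java.sql.Timestamp

  // A tiny in-memory table with a timestamp column, for illustration only.
  val df = sqlContext.createDataFrame(Seq(
    (1, Timestamp.valueOf("2015-05-28 10:00:00")),
    (2, Timestamp.valueOf("2016-02-01 09:30:00"))
  )).toDF("id", "registration")
  df.registerTempTable("events")

  // Comparing the timestamp column with a string makes the analyzer wrap
  // the column in a cast, so the plan shows something like:
  //   Filter (cast(registration#1 as string) >= 2016)
  sqlContext.sql("SELECT * FROM events WHERE registration >= '2016'").explain(true)

  // Zero-padded ISO-8601 strings compare lexicographically in the same
  // order as the dates they represent:
  assert("2015-05-28" < "2015-05-29" && "2016-01-01" >= "2016")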
// maropu

On Fri, Apr 15, 2016 at 10:30 AM, Hyukjin Kwon <gurwls...@gmail.com> wrote:

> Hi,
>
> String comparison itself is pushed down fine, but the problem is dealing
> with the Cast.
>
> It was pushed down before but was reverted
> (https://github.com/apache/spark/pull/8049).
>
> Several fixes were tried, e.g. https://github.com/apache/spark/pull/11005,
> but none of them made it in.
>
> In short, it is not being pushed down because it is unsafe to resolve the
> cast (e.g. long to integer).
>
> As a workaround, the implementation of the Solr data source could be
> changed to one based on CatalystScan, which takes all the filters.
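> A rough, untested sketch of the idea (the class name, schema, and the
> single pattern handled below are only illustrative):
>
>   import org.apache.spark.rdd.RDD
>   import org.apache.spark.sql.{Row, SQLContext}
>   import org.apache.spark.sql.catalyst.expressions._
>   import org.apache.spark.sql.sources.{BaseRelation, CatalystScan}
>   import org.apache.spark.sql.types._
>
>   class SolrCatalystRelation(override val sqlContext: SQLContext)
>       extends BaseRelation with CatalystScan {
>
>     override def schema: StructType =
>       StructType(Seq(StructField("registration", TimestampType)))
>
>     // Unlike PrunedFilteredScan, which only receives predicates that
>     // survive translation to sources.Filter (a comparison still wrapped
>     // in a Cast does not), CatalystScan receives the raw Catalyst
>     // expressions, so the data source can handle the cast itself.
>     override def buildScan(requiredColumns: Seq[Attribute],
>                            filters: Seq[Expression]): RDD[Row] = {
>       filters.foreach {
>         case GreaterThanOrEqual(Cast(col: Attribute, StringType), Literal(v, _)) =>
>           // e.g. translate to a Solr range query: registration:[v TO *]
>           println(s"pushable range filter: ${col.name} >= $v")
>         case other =>
>           println(s"left for Spark to evaluate: $other")
>       }
>       sqlContext.sparkContext.emptyRDD[Row] // placeholder scan
>     }
>   }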
> CatalystScan is not designed to be binary compatible across releases,
> though it seems some consider it stable now, as mentioned here:
> https://github.com/apache/spark/pull/10750#issuecomment-175400704.
>
> Thanks!
>
> 2016-04-15 3:30 GMT+09:00 Mich Talebzadeh <mich.talebza...@gmail.com>:
>
>> Hi Josh,
>>
>> Can you please clarify whether date comparisons as two strings work at
>> all?
>>
>> I was under the impression that with string comparison only the first
>> characters are compared?
>>
>> Thanks
>>
>> Dr Mich Talebzadeh
>>
>> LinkedIn:
>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>
>> http://talebzadehmich.wordpress.com
>>
>> On 14 April 2016 at 19:26, Josh Rosen <joshro...@databricks.com> wrote:
>>
>>> AFAIK this is not being pushed down because it involves an implicit cast
>>> and we currently don't push casts into data sources or scans; see
>>> https://github.com/databricks/spark-redshift/issues/155 for a
>>> possibly-related discussion.
>>>
>>> On Thu, Apr 14, 2016 at 10:27 AM Mich Talebzadeh <
>>> mich.talebza...@gmail.com> wrote:
>>>
>>>> Are you comparing strings here or timestamps?
>>>>
>>>> Filter ((cast(registration#37 as string) >= 2015-05-28) &&
>>>> (cast(registration#37 as string) <= 2015-05-29))
>>>>
>>>> Dr Mich Talebzadeh
>>>>
>>>> LinkedIn:
>>>> https://www.linkedin.com/profile/view?id=AAEAAAAWh2gBxianrbJd6zP6AcPCCdOABUrV8Pw
>>>>
>>>> http://talebzadehmich.wordpress.com
>>>>
>>>> On 14 April 2016 at 18:04, Kiran Chitturi <kiran.chitt...@lucidworks.com>
>>>> wrote:
>>>>
>>>>> Hi,
>>>>>
>>>>> Timestamp range filter queries in SQL are not getting pushed down to
>>>>> the PrunedFilteredScan instances; the filtering is happening at the
>>>>> Spark layer.
>>>>>
>>>>> The physical plan for timestamp range queries does not show the pushed
>>>>> filters, whereas for range queries on other types it does.
>>>>>
>>>>> Please see below for code and examples.
>>>>>
>>>>> Examples:
>>>>>
>>>>> 1. Range filter queries on Timestamp types
>>>>>
>>>>> code:
>>>>>
>>>>>> sqlContext.sql("SELECT * from events WHERE `registration` >=
>>>>>> '2015-05-28' AND `registration` <= '2015-05-29'")
>>>>>
>>>>> Full example:
>>>>> https://github.com/lucidworks/spark-solr/blob/master/src/test/scala/com/lucidworks/spark/EventsimTestSuite.scala#L151
>>>>> plan:
>>>>> https://gist.github.com/kiranchitturi/4a52688c9f0abe3d4b2bd8b938044421#file-time-range-sql
>>>>>
>>>>> 2. Range filter queries on Long types
>>>>>
>>>>> code:
>>>>>
>>>>>> sqlContext.sql("SELECT * from events WHERE `length` >= '700' and
>>>>>> `length` <= '1000'")
>>>>>
>>>>> Full example:
>>>>> https://github.com/lucidworks/spark-solr/blob/master/src/test/scala/com/lucidworks/spark/EventsimTestSuite.scala#L151
>>>>> plan:
>>>>> https://gist.github.com/kiranchitturi/4a52688c9f0abe3d4b2bd8b938044421#file-length-range-sql
>>>>>
>>>>> The SolrRelation class we use extends PrunedFilteredScan:
>>>>> https://github.com/lucidworks/spark-solr/blob/master/src/main/scala/com/lucidworks/spark/SolrRelation.scala#L37
>>>>>
>>>>> Since Solr supports date ranges, I would like the timestamp filters to
>>>>> be pushed down to the Solr query.
>>>>>
>>>>> Are there limitations on the types of filters that are passed down
>>>>> with Timestamp types?
>>>>> Is there something I should do in my code to fix this?
>>>>>
>>>>> Thanks,
>>>>> --
>>>>> Kiran Chitturi

--
---
Takeshi Yamamuro