Stephen: The data in the column is in 'MM/dd/yyyy' format. I mentioned 'yyyy-MM-dd' when I was asking about implicit date conversion and comparison values. It seems to do string comparison (see below), not explicit date conversion.
The following query produces the wrong result, presumably it's doing a string comparison. 0: jdbc:drill:zk=local> select columns[11] from myCsv where columns[11] < '2012-01-01'; +------------+ | EXPR$0 | +------------+ | 2/4/2012 | | 2/22/2012 | | 2/2/2015 | | 1/17/2013 | | 2/25/2013 | This query produces incorrect results, too. 0: jdbc:drill:zk=local> select columns[11] from myCsv where columns[11] < '03/01/2012'; +------------+ | EXPR$0 | +------------+ +------------+ No rows selected (0.267 seconds) Explicitly, albeit verbosely, casting values to dates works. On Thu, Feb 12, 2015 at 6:20 PM, Hao Zhu <[email protected]> wrote: > Actually per my tests, below 3 methods can generate the same results. > > > select columns[0] from dfs.tmp.`test.csv` where cast(columns[0] as date) > > cast('2010-01-01' as date); > +------------+ > | EXPR$0 | > +------------+ > | 2014-01-01 | > | 2010-01-02 | > | 2010-01-03 | > | 2010-01-04 | > +------------+ > 4 rows selected (0.16 seconds) > > > select columns[0] from dfs.tmp.`test.csv` where cast(columns[0] as > varchar(10)) > cast('2010-01-01' as varchar(10)); > +------------+ > | EXPR$0 | > +------------+ > | 2014-01-01 | > | 2010-01-02 | > | 2010-01-03 | > | 2010-01-04 | > +------------+ > 4 rows selected (0.138 seconds) > > > select columns[0] from dfs.tmp.`test.csv` where columns[0] > > '2010-01-01'; > +------------+ > | EXPR$0 | > +------------+ > | 2014-01-01 | > | 2010-01-02 | > | 2010-01-03 | > | 2010-01-04 | > +------------+ > 4 rows selected (0.145 seconds) > > > > On Thu, Feb 12, 2015 at 3:13 PM, Steven Phillips <[email protected]> > wrote: > > > What exactly was the result. I would expect it would implicitly cast the > > string to a date for the comparison. > > > > On Thu, Feb 12, 2015 at 2:38 PM, Minnow Noir <[email protected]> > wrote: > > > > > Yes. > > > On Feb 12, 2015 5:36 PM, "Steven Phillips" <[email protected]> > > wrote: > > > > > > > did you try the form: > > > > where columns[2] > date '2010-01-01' > > > > > > > > On Thu, Feb 12, 2015 at 2:28 PM, Aditya <[email protected]> > > wrote: > > > > > > > > > In a CSV file, every field is a VARCHAR. > > > > > > > > > > On Thu, Feb 12, 2015 at 12:47 PM, Minnow Noir < > [email protected]> > > > > > wrote: > > > > > > > > > > > I'm trying to write a WHERE clause that returns rows from a CSV > > file > > > > > where > > > > > > the value of the date column is greater or less than a comparison > > > value > > > > > > (e.g., where columns[2] > '2010-01-01'). > > > > > > > > > > > > It looks like Drill cannot automatically infer date values in CSV > > > > files, > > > > > > and that one must manually/explicitly tell Drill that both the > > value > > > in > > > > > the > > > > > > date column and the comparison value in the where clause > predicate > > > are > > > > > > dates, so that it can correctly compare values. > > > > > > > > > > > > OTOH, select <fields> from <plugin> WHERE to_date(<column>, > > > <format>) < > > > > > > to_date(<value>, <format>) seems to work, although it's verbose. > > > > > > > > > > > > Do I have this right, and is there a less verbose way to handle > > this? > > > > > > > > > > > > Thanks > > > > > > > > > > > > > > > > > > > > > > > > > > > -- > > > > Steven Phillips > > > > Software Engineer > > > > > > > > mapr.com > > > > > > > > > > > > > > > -- > > Steven Phillips > > Software Engineer > > > > mapr.com > > >
