Yes in the second query, SELECT columns[8] FROM `allData.csv` limit 10; Drill reads and returns results as varchar (in the format that it was in the input csv file)
and in the case where we cast the column to TIME we see three digits after SS field. Should we return HH:MM:SS.sss or Should we return HH:MM:SS (which is the SQL standard format for TIME), when we cast columns to TIME datatype ? Thanks, Khurram On Mon, Jun 8, 2015 at 3:00 PM, Abdel Hakim Deneche <[email protected]> wrote: > for the 2nd query (without cast to time) I think Drill just displays the > column as VARCHAR. You are only seeing 2 digits after SS. because that's > how the data is stored in the file (I suppose). > > I don't know what's the standard is, but looking at SQL Server > documentation, you can have up to 3 digits after the period: > > Milliseconds can be preceded by either a colon (:) or a period (.). If a > > colon is used, the number means thousandths-of-a-second. If a period is > > used, a single digit means tenths-of-a-second, two digits mean > > hundredths-of-a-second, and three digits mean thousandths-of-a-second. > > > > On Mon, Jun 8, 2015 at 2:47 PM, Khurram Faraaz <[email protected]> > wrote: > > > Hi All, > > > > Can someone please review this ? > > > > SQL standard (ISO/IEC 9075-1:2011(E)) specifies that a value of datatype > > TIME comprises values of DATETIME fields HOUR, MINUTE and SECONDS. > > (HH:MM:SS) > > > > The question here is for the same input data Drill and Postgres return > > different output/results, when we cast the column to TIME in Drill. > > > > Drill 1.0 results below are with cast to TIME. Note that there are three > > digits after the seconds. (HH:MM:SS.sss) > > > > {code} > > > > 0: jdbc:drill:schema=dfs.tmp> SELECT cast( columns[8] as TIME ) FROM > > `allData.csv` limit 10; > > > > *+---------------+* > > > > *| ** EXPR$0 ** |* > > > > *+---------------+* > > > > *| *08:16:08.580 * |* > > > > *| *09:11:49.170 * |* > > > > *| *18:44:25.430 * |* > > > > *| *20:52:08.560 * |* > > > > *| *19:46:10.420 * |* > > > > *| *15:21:34.390 * |* > > > > *| *10:31:55.240 * |* > > > > *| *01:37:47.310 * |* > > > > *| *01:50:05.110 * |* > > > > *| *11:28:25.100 * |* > > > > *+---------------+* > > > > 10 rows selected (0.173 seconds) > > > > > > explain plan for above query > > > > > > *| *00-00 Screen > > > > 00-01 SelectionVectorRemover > > > > 00-02 Limit(fetch=[10]) > > > > 00-03 Project(EXPR$0=[CAST(ITEM($0, 8)):TIME(0)]) > > > > 00-04 Scan(groupscan=[EasyGroupScan > > [selectionRoot=/tmp/allData.csv, numFiles=1, columns=[`columns`[8]], > > files=[maprfs:///tmp/allData.csv]]]) > > > > > > {code} > > > > > > Drill results below are without cast to TIME, these are same as Postgres > > results. However, we should note that there are only two digits after SS > in > > the results, and this is not as per the SQL specification for TIME > > datatype. > > > > > > {code} > > > > 0: jdbc:drill:schema=dfs.tmp> SELECT columns[8] FROM `allData.csv` limit > > 10; > > > > *+--------------+* > > > > *| ** EXPR$0 ** |* > > > > *+--------------+* > > > > *| *8:16:8.58 * |* > > > > *| *9:11:49.17 * |* > > > > *| *18:44:25.43 * |* > > > > *| *20:52:8.56 * |* > > > > *| *19:46:10.42 * |* > > > > *| *15:21:34.39 * |* > > > > *| *10:31:55.24 * |* > > > > *| *1:37:47.31 * |* > > > > *| *1:50:5.11 * |* > > > > *| *11:28:25.1 * |* > > > > *+--------------+* > > > > 10 rows selected (0.264 seconds) > > > > > > explain plan for query that does not cast to TIME > > > > > > *| *00-00 Screen > > > > 00-01 SelectionVectorRemover > > > > 00-02 Limit(fetch=[10]) > > > > 00-03 Project(EXPR$0=[ITEM($0, 8)]) > > > > 00-04 Scan(groupscan=[EasyGroupScan > > [selectionRoot=/tmp/allData.csv, numFiles=1, columns=[`columns`[8]], > > files=[maprfs:///tmp/allData.csv]]]) > > > > {code} > > > > > > Postgres 9.3 results are > > > > > > {code} > > > > postgres=# SELECT col_tm FROM all_typs_tbl limit 10; > > > > col_tm > > > > ------------- > > > > 08:16:08.58 > > > > 09:11:49.17 > > > > 18:44:25.43 > > > > 20:52:08.56 > > > > 19:46:10.42 > > > > 15:21:34.39 > > > > 10:31:55.24 > > > > 01:37:47.31 > > > > 01:50:05.11 > > > > 11:28:25.1 > > > > (10 rows) > > {code} > > > > Thanks, > > Khurram > > > > > > -- > > Abdelhakim Deneche > > Software Engineer > > <http://www.mapr.com/> > > > Now Available - Free Hadoop On-Demand Training > < > http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available > > >
