for the 2nd query (without cast to time) I think Drill just displays the column as VARCHAR. You are only seeing 2 digits after SS. because that's how the data is stored in the file (I suppose).
I don't know what's the standard is, but looking at SQL Server documentation, you can have up to 3 digits after the period: Milliseconds can be preceded by either a colon (:) or a period (.). If a > colon is used, the number means thousandths-of-a-second. If a period is > used, a single digit means tenths-of-a-second, two digits mean > hundredths-of-a-second, and three digits mean thousandths-of-a-second. On Mon, Jun 8, 2015 at 2:47 PM, Khurram Faraaz <[email protected]> wrote: > Hi All, > > Can someone please review this ? > > SQL standard (ISO/IEC 9075-1:2011(E)) specifies that a value of datatype > TIME comprises values of DATETIME fields HOUR, MINUTE and SECONDS. > (HH:MM:SS) > > The question here is for the same input data Drill and Postgres return > different output/results, when we cast the column to TIME in Drill. > > Drill 1.0 results below are with cast to TIME. Note that there are three > digits after the seconds. (HH:MM:SS.sss) > > {code} > > 0: jdbc:drill:schema=dfs.tmp> SELECT cast( columns[8] as TIME ) FROM > `allData.csv` limit 10; > > *+---------------+* > > *| ** EXPR$0 ** |* > > *+---------------+* > > *| *08:16:08.580 * |* > > *| *09:11:49.170 * |* > > *| *18:44:25.430 * |* > > *| *20:52:08.560 * |* > > *| *19:46:10.420 * |* > > *| *15:21:34.390 * |* > > *| *10:31:55.240 * |* > > *| *01:37:47.310 * |* > > *| *01:50:05.110 * |* > > *| *11:28:25.100 * |* > > *+---------------+* > > 10 rows selected (0.173 seconds) > > > explain plan for above query > > > *| *00-00 Screen > > 00-01 SelectionVectorRemover > > 00-02 Limit(fetch=[10]) > > 00-03 Project(EXPR$0=[CAST(ITEM($0, 8)):TIME(0)]) > > 00-04 Scan(groupscan=[EasyGroupScan > [selectionRoot=/tmp/allData.csv, numFiles=1, columns=[`columns`[8]], > files=[maprfs:///tmp/allData.csv]]]) > > > {code} > > > Drill results below are without cast to TIME, these are same as Postgres > results. However, we should note that there are only two digits after SS in > the results, and this is not as per the SQL specification for TIME > datatype. > > > {code} > > 0: jdbc:drill:schema=dfs.tmp> SELECT columns[8] FROM `allData.csv` limit > 10; > > *+--------------+* > > *| ** EXPR$0 ** |* > > *+--------------+* > > *| *8:16:8.58 * |* > > *| *9:11:49.17 * |* > > *| *18:44:25.43 * |* > > *| *20:52:8.56 * |* > > *| *19:46:10.42 * |* > > *| *15:21:34.39 * |* > > *| *10:31:55.24 * |* > > *| *1:37:47.31 * |* > > *| *1:50:5.11 * |* > > *| *11:28:25.1 * |* > > *+--------------+* > > 10 rows selected (0.264 seconds) > > > explain plan for query that does not cast to TIME > > > *| *00-00 Screen > > 00-01 SelectionVectorRemover > > 00-02 Limit(fetch=[10]) > > 00-03 Project(EXPR$0=[ITEM($0, 8)]) > > 00-04 Scan(groupscan=[EasyGroupScan > [selectionRoot=/tmp/allData.csv, numFiles=1, columns=[`columns`[8]], > files=[maprfs:///tmp/allData.csv]]]) > > {code} > > > Postgres 9.3 results are > > > {code} > > postgres=# SELECT col_tm FROM all_typs_tbl limit 10; > > col_tm > > ------------- > > 08:16:08.58 > > 09:11:49.17 > > 18:44:25.43 > > 20:52:08.56 > > 19:46:10.42 > > 15:21:34.39 > > 10:31:55.24 > > 01:37:47.31 > > 01:50:05.11 > > 11:28:25.1 > > (10 rows) > {code} > > Thanks, > Khurram > -- Abdelhakim Deneche Software Engineer <http://www.mapr.com/> Now Available - Free Hadoop On-Demand Training <http://www.mapr.com/training?utm_source=Email&utm_medium=Signature&utm_campaign=Free%20available>
