Or you can use a simple predicate to filter out the header. Something like

Select ….. from …. where columns[0] <> '"date1"';

In case it doesn’t display correctly, it is a single quote ', then a double quote ", then date1, followed by a double quote and then a single quote.

-Andries

On Apr 1, 2015, at 11:58 PM, Junjun Olympia wrote:

> While waiting for DRILL-951
> <https://issues.apache.org/jira/browse/DRILL-951>, maybe you can use
> something like this:
>
> select sum(cast(trim(columns[6]) as int)) from HDFS.`/test.csv` where
> trim(columns[6]) similar to '^(\+|-)?[0-9]+(\.[0-9]+)?';
>
> Cheers,
>
> Junjun
>
>
> On Thu, Apr 2, 2015 at 2:43 PM, Mahesh Sankaran wrote:
>
>> We are waiting for Apache Drill 1.0. Thanks for the information.
>>
>> On Thu, Apr 2, 2015 at 12:04 PM, Aman Sinha wrote:
>>
>>> The exact release date depends on a variety of factors - I will let folks
>>> who manage the release timeline chime in.
>>>
>>> On Wed, Apr 1, 2015 at 11:19 PM, Mahesh Sankaran wrote:
>>>
>>>> Thank you, Aman. May I know the release date of Apache Drill 1.0?
>>>>
>>>> On Thu, Apr 2, 2015 at 11:40 AM, Aman Sinha wrote:
>>>>
>>>>> Hi Mahesh,
>>>>> Please see https://issues.apache.org/jira/browse/DRILL-951 for the
>>>>> issue of CSV headers. It is a feature that will be addressed in an
>>>>> upcoming release (currently tagged for 1.0).
>>>>>
>>>>> Aman
>>>>>
>>>>> On Wed, Apr 1, 2015 at 10:52 PM, Mahesh Sankaran wrote:
>>>>>
>>>>>> Hi,
>>>>>> I am currently working with Apache Drill to analyse CSV files. My
>>>>>> problem is that if the CSV file has headers, we can't run any sum
>>>>>> query. It shows the following errors.
>>>>>>
>>>>>> 0: jdbc:drill:zk=nn01:2181,dn02:2181,dn03:218> select sum(cast(columns[6] as int)) from HDFS.`/test.csv` limit 10;
>>>>>> Query failed: RemoteRpcException: Failure while running fragment., rcvdbyte
>>>>>> [ 584925d6-dab6-42ce-8eb3-fa7abfb0e0f2 on nn01:31010 ]
>>>>>> [ 584925d6-dab6-42ce-8eb3-fa7abfb0e0f2 on nn01:31010 ]
>>>>>>
>>>>>>
>>>>>> Error: exception while executing query: Failure while executing query.
>>>>>> (state=,code=0)
>>>>>>
>>>>>> *But the above query works well without headers. Is there any way to
>>>>>> sum the columns in CSV files with headers in Apache Drill?*
>>>>>>
>>>>>> *This is our example file:*
>>>>>> 0: jdbc:drill:zk=nn01:2181,dn02:2181,dn03:218> select * from HDFS.`/test.csv` limit 10;
>>>>>> +------------+------------+
>>>>>> |  columns   |    dir0    |
>>>>>> +------------+------------+
>>>>>> | ["date1","time1","srcip","dstip","service","sentbyte","rcvdbyte"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","0","193"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","0","166"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","60","359"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","10.10.50.195","106.10.193.45","php","717","359","0","0"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.36","9064","0","0"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.37","9064","0","0"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.38","9064","0","0"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.34","9064","0","0"] | nn01:9000 |
>>>>>> | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.44","9064","0","0"] | nn01:9000 |
>>>>>>
>>>>>>
>>>>>> Thanks and Regards,
>>>>>>
>>>>>> Mahesh Sankaran
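
For anyone who wants to see what the two filters suggested in this thread do to the sample rows, here is a rough offline sketch in Python. It is not Drill code and says nothing about how Drill evaluates these predicates; it only mimics the idea (skip the row whose first field is the header name, or keep only rows whose seventh field looks numeric). The names `rows`, `sum_skipping_header` and `sum_numeric_only` are made up for this illustration, and the sketch assumes the double quotes shown in the query output are not part of the stored values.

```python
import re

# Sample rows copied from the `select * ... limit 10` output quoted in this thread.
rows = [
    ["date1", "time1", "srcip", "dstip", "service", "sentbyte", "rcvdbyte"],
    ["2015-01-01", "00:00:00", "10.10.100.74", "192.168.0.12", "DNS", "0", "193"],
    ["2015-01-01", "00:00:00", "10.10.100.74", "192.168.0.12", "DNS", "0", "166"],
    ["2015-01-01", "00:00:00", "10.10.100.74", "192.168.0.12", "DNS", "60", "359"],
    ["2015-01-01", "00:00:00", "10.10.50.195", "106.10.193.45", "php", "717", "359", "0", "0"],
    ["2015-01-01", "00:00:00", "111.123.180.44", "117.239.67.36", "9064", "0", "0"],
    ["2015-01-01", "00:00:00", "111.123.180.44", "117.239.67.37", "9064", "0", "0"],
    ["2015-01-01", "00:00:00", "111.123.180.44", "117.239.67.38", "9064", "0", "0"],
    ["2015-01-01", "00:00:00", "111.123.180.44", "117.239.67.34", "9064", "0", "0"],
    ["2015-01-01", "00:00:00", "111.123.180.44", "117.239.67.44", "9064", "0", "0"],
]

# Same shape as the pattern in the "similar to" suggestion: optional sign,
# digits, optional decimal part. fullmatch() requires the whole value to match.
NUMERIC = re.compile(r"(\+|-)?[0-9]+(\.[0-9]+)?")


def sum_skipping_header(rows, header_value="date1", index=6):
    """Sum field `index`, dropping rows whose first field equals the header name."""
    return sum(int(row[index]) for row in rows if row[0] != header_value)


def sum_numeric_only(rows, index=6):
    """Sum field `index`, keeping only rows where that field looks numeric.

    The sample data contains only integers; int() would raise on a decimal
    value even though the pattern admits one.
    """
    values = (row[index].strip() for row in rows)
    return sum(int(value) for value in values if NUMERIC.fullmatch(value))


if __name__ == "__main__":
    print(sum_skipping_header(rows))
    print(sum_numeric_only(rows))
```

On these ten rows both approaches drop only the header row, so they give the same total. The header-equality check depends on knowing the header name in advance, while the numeric check would also silently drop any data row with a malformed value in that field.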
