While waiting for DRILL-951
<https://issues.apache.org/jira/browse/DRILL-951>, maybe you can use
something like this:

select sum(cast(trim(columns[6]) as int)) from HDFS.`/test.csv` where
trim(columns[6]) similar to '^(\+|-)?[0-9]+(\.[0-9]+)?';
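
To illustrate what that filter is meant to do, here is a rough Python sketch that runs outside Drill. It is not how Drill evaluates the query. The function name `sum_numeric` is made up for this example, the pattern is narrowed to integers because the values are cast to int, and the list holds the columns[6] values from the sample rows quoted below:

```python
import re

# Integer-only variant of the pattern in the query above (an assumption made
# for this sketch). re.fullmatch already requires the whole string to match,
# so no leading '^' is needed here.
INTEGER = re.compile(r"(\+|-)?[0-9]+")

def sum_numeric(values):
    """Sum the values that look like integers after trimming; skip the rest."""
    total = 0
    for raw in values:
        text = raw.strip()           # same role as trim() in the query
        if INTEGER.fullmatch(text):  # same role as the WHERE filter
            total += int(text)       # same role as cast(... as int)
    return total

# columns[6] of the header row followed by the data rows from the thread
column_6 = ["rcvdbyte", "193", "166", "359", "359", "0", "0", "0", "0", "0"]
print(sum_numeric(column_6))
```

The header value "rcvdbyte" fails the pattern, so it never reaches the cast, which is the whole point of the workaround.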

Cheers,

Junjun


On Thu, Apr 2, 2015 at 2:43 PM, Mahesh Sankaran <[email protected]>
wrote:

> We are waiting for Apache Drill 1.0. Thanks for the information.
>
> On Thu, Apr 2, 2015 at 12:04 PM, Aman Sinha <[email protected]> wrote:
>
> > The exact release date depends on a variety of factors - I will let folks
> > who manage the release timeline chime in.
> >
> > On Wed, Apr 1, 2015 at 11:19 PM, Mahesh Sankaran <[email protected]>
> > wrote:
> >
> > > Thank you, Aman. May I know the release date of Apache Drill 1.0?
> > >
> > > On Thu, Apr 2, 2015 at 11:40 AM, Aman Sinha <[email protected]> wrote:
> > >
> > > > Hi Mahesh,
> > > > Please see https://issues.apache.org/jira/browse/DRILL-951 for the issue
> > > > of CSV headers. It is a feature that will be addressed in an upcoming
> > > > release (currently tagged for 1.0).
> > > >
> > > > Aman
> > > >
> > > > On Wed, Apr 1, 2015 at 10:52 PM, Mahesh Sankaran <[email protected]>
> > > > wrote:
> > > >
> > > > > Hi,
> > > > > I am currently using Apache Drill to analyse CSV files. My problem is
> > > > > that if the CSV file has a header row, we can't run any sum query. It
> > > > > shows the following errors.
> > > > >
> > > > > 0: jdbc:drill:zk=nn01:2181,dn02:2181,dn03:218> select sum(cast(columns[6] as int)) from HDFS.`/test.csv` limit 10;
> > > > > Query failed: RemoteRpcException: Failure while running fragment., rcvdbyte
> > > > > [ 584925d6-dab6-42ce-8eb3-fa7abfb0e0f2 on nn01:31010 ]
> > > > > [ 584925d6-dab6-42ce-8eb3-fa7abfb0e0f2 on nn01:31010 ]
> > > > >
> > > > >
> > > > > Error: exception while executing query: Failure while executing query. (state=,code=0)
> > > > >
> > > > > *But the above query works well without headers. Is there any way to
> > > > > sum the columns in CSV files with headers in Apache Drill?*
> > > > >
> > > > > *This is our example file:*
> > > > > 0: jdbc:drill:zk=nn01:2181,dn02:2181,dn03:218> select * from HDFS.`/test.csv` limit 10;
> > > > > +------------+------------+
> > > > > |  columns   |    dir0    |
> > > > > +------------+------------+
> > > > > | ["date1","time1","srcip","dstip","service","sentbyte","rcvdbyte"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","0","193"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","0","166"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.100.74","192.168.0.12","DNS","60","359"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","10.10.50.195","106.10.193.45","php","717","359","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.36","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.37","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.38","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.34","9064","0","0"] | nn01:9000  |
> > > > > | ["2015-01-01","00:00:00","111.123.180.44","117.239.67.44","9064","0","0"] | nn01:9000  |
> > > > >
> > > > >
> > > > > Thanks and Regards,
> > > > >
> > > > > Mahesh Sankaran
> > > > >
> > > >
> > >
> >
>
