May be you have to cast it to integer. Sudhakar Thota Sent from my iPhone
> On Feb 8, 2015, at 6:54 PM, Minnow Noir <[email protected]> wrote: > > The error actually happens when I remove the header row from the file. > > cat test2.csv > "Ed",100 > "Pete",200 > "Ed",100 > "Pete",400 > > > 0: jdbc:drill:zk=local> select columns[0], columns[1] from > dfs.`/data/test2.csv`; > +------------+------------+ > | EXPR$0 | EXPR$1 | > +------------+------------+ > | "Ed" | 100 | > | "Pete" | 200 | > | "Ed" | 100 | > | "Pete" | 400 | > +------------+------------+ > 4 rows selected (0.837 seconds) > > 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) from > dfs.`/data/test2.csv` group by columns[0]; > Query failed: Query failed: Failure while running fragment., Only COUNT > aggregate function supported for Boolean type [ > a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 ] > [ a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 ] > > > Error: exception while executing query: Failure while executing query. > (state=,code=0) > > Also, I don't see anything on the Drill wiki about filtering out the header > row. How is that done? > > Thanks > > > On Sun, Feb 8, 2015 at 9:23 PM, Andries Engelbrecht < > [email protected]> wrote: > >> You need to filter out the header line on the CSV file as you are trying >> to sum a string in column1. >> >> >> —Andries >> >> >>> On Feb 8, 2015, at 6:12 PM, Minnow Noir <[email protected]> wrote: >>> >>> I'm trying to perform a basic query in order to learn Drill, but getting >> an >>> the error message in the subject line. >>> >>> I created a dead simple CSV file on disk. Note that Sales values are not >>> quoted. >>> >>> cat test.csv >>> Employee,Sales >>> Ed,100 >>> Pete,200 >>> Ed,100 >>> Pete,400 >>> >>> When I query it without performing a sum, it returns the expected values. >>> >>> 0: jdbc:drill:zk=local> select columns[0], columns[1] as TotalSales from >>> dfs.`/data/test.csv`; >>> +------------+------------+ >>> | EXPR$0 | TotalSales | >>> +------------+------------+ >>> | Employee | Sales | >>> | Ed | 100 | >>> | Pete | 200 | >>> | Ed | 100 | >>> | Pete | 400 | >>> +------------+------------+ >>> 5 rows selected (0.11 seconds) >>> >>> However, if I throw a sum() in there, I get the confusing error message >> in >>> the subject line: >>> >>> 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) as TotalSales >>> from dfs.` >>> /data/test.csv` group by columns[0]; >>> Query failed: Query failed: Failure while running fragment., Only COUNT >>> aggregate function supported for Boolean type [ >>> bfd34bd1-2fac-4d9e-a9bd-26bced552120 on sandbox.hortonworks.com:31010 ] >>> [ bfd34bd1-2fac-4d9e-a9bd-26bced552120 on sandbox.hortonworks.com:31010 >> ] >>> >>> The message seems to be saying that Drill interpreted the Sales column >> data >>> as being Boolean somehow, and therefore, the only function that can be >>> called is count(). It's not clear why Drill would interpret the Sales >>> column values as being Boolean. >>> >>> Some Googling turned up this thread, which says the error also occurs for >>> character data: https://issues.apache.org/jira/browse/DRILL-1998 >>> >>> Of course, it's not clear why Drill would interpret 100, 200, etc. as >> being >>> character data unless it's getting thrown off by the header row...which >> of >>> course is present in every CSV and TSV file. However, I created a copy >> of >>> test.csv *without* the header row, and reran the query, but got the same >>> error. >>> >>> Any ideas what's causing the issue and how to resolve it? >>> >>> Thanks >> >>
