May be you have to cast it to integer.

Sudhakar Thota
Sent from my iPhone

> On Feb 8, 2015, at 6:54 PM, Minnow Noir <[email protected]> wrote:
> 
> The error actually happens when I remove the header row from the file.
> 
> cat test2.csv
> "Ed",100
> "Pete",200
> "Ed",100
> "Pete",400
> 
> 
> 0: jdbc:drill:zk=local> select columns[0], columns[1] from
> dfs.`/data/test2.csv`;
> +------------+------------+
> |   EXPR$0   |   EXPR$1   |
> +------------+------------+
> | "Ed"       | 100        |
> | "Pete"     | 200        |
> | "Ed"       | 100        |
> | "Pete"     | 400        |
> +------------+------------+
> 4 rows selected (0.837 seconds)
> 
> 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) from
> dfs.`/data/test2.csv` group by columns[0];
> Query failed: Query failed: Failure while running fragment., Only COUNT
> aggregate function supported for Boolean type [
> a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 ]
> [ a9344c36-ebb4-4b9d-9c7d-75a0f7e52edd on sandbox.hortonworks.com:31010 ]
> 
> 
> Error: exception while executing query: Failure while executing query.
> (state=,code=0)
> 
> Also, I don't see anything on the Drill wiki about filtering out the header
> row.  How is that done?
> 
> Thanks
> 
> 
> On Sun, Feb 8, 2015 at 9:23 PM, Andries Engelbrecht <
> [email protected]> wrote:
> 
>> You need to filter out the header line on the CSV file as you are trying
>> to sum a string in column1.
>> 
>> 
>> —Andries
>> 
>> 
>>> On Feb 8, 2015, at 6:12 PM, Minnow Noir <[email protected]> wrote:
>>> 
>>> I'm trying to perform a basic query in order to learn Drill, but getting
>> an
>>> the error message in the subject line.
>>> 
>>> I created a dead simple CSV file on disk.  Note that Sales values are not
>>> quoted.
>>> 
>>> cat test.csv
>>> Employee,Sales
>>> Ed,100
>>> Pete,200
>>> Ed,100
>>> Pete,400
>>> 
>>> When I query it without performing a sum, it returns the expected values.
>>> 
>>> 0: jdbc:drill:zk=local> select columns[0], columns[1] as TotalSales from
>>> dfs.`/data/test.csv`;
>>> +------------+------------+
>>> |   EXPR$0   | TotalSales |
>>> +------------+------------+
>>> | Employee   | Sales      |
>>> | Ed         | 100        |
>>> | Pete       | 200        |
>>> | Ed         | 100        |
>>> | Pete       | 400        |
>>> +------------+------------+
>>> 5 rows selected (0.11 seconds)
>>> 
>>> However, if I throw a sum() in there, I get the confusing error message
>> in
>>> the subject line:
>>> 
>>> 0: jdbc:drill:zk=local> select columns[0], sum(columns[1]) as TotalSales
>>> from dfs.`
>>> /data/test.csv` group by columns[0];
>>> Query failed: Query failed: Failure while running fragment., Only COUNT
>>> aggregate function supported for Boolean type [
>>> bfd34bd1-2fac-4d9e-a9bd-26bced552120 on sandbox.hortonworks.com:31010 ]
>>> [ bfd34bd1-2fac-4d9e-a9bd-26bced552120 on sandbox.hortonworks.com:31010
>> ]
>>> 
>>> The message seems to be saying that Drill interpreted the Sales column
>> data
>>> as being Boolean somehow, and therefore, the only function that can be
>>> called is count().  It's not clear why Drill would interpret the Sales
>>> column values as being Boolean.
>>> 
>>> Some Googling turned up this thread, which says the error also occurs for
>>> character data: https://issues.apache.org/jira/browse/DRILL-1998
>>> 
>>> Of course, it's not clear why Drill would interpret 100, 200, etc. as
>> being
>>> character data unless it's getting thrown off by the header row...which
>> of
>>> course is present in every CSV and TSV file.  However, I created a copy
>> of
>>> test.csv *without* the header row, and reran the query, but got the same
>>> error.
>>> 
>>> Any ideas what's causing the issue and how to resolve it?
>>> 
>>> Thanks
>> 
>> 

Reply via email to