Can you please try this, since you mention column has empty values try using 
null instead of zero, because 0 (zero) is  not null.


CASE WHEN `Capacity` = '' THEN CAST(null AS integer) ELSE CAST(`Capacity` AS 
integer)

END AS `Capacity`


Thanks,

Khurram

________________________________
From: Divya Gehlot <divya.htco...@gmail.com>
Sent: Monday, August 14, 2017 12:14:26 PM
To: user@drill.apache.org
Subject: UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema 
changes

Hi,
I am getting below error :

> UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes Fragment 0:0


*Some background to understand the error scenario : *

I have  data saved in parquet files in multidirectory structure.
One of the column of the table is empty values so I used below to handle
null values

> CASE WHEN `Capacity` = '' THEN CAST(0 AS INT) ELSE CAST(`Capacity` AS INT)
> END AS `Capacity`


My query does some aggregation like sum of the column values .My query
looks like below :
SELECT
`COUNTRY`,
`YEAR`,
`MONTH`,
`DAY`,
`Capacity`,
SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
FROM `dfs`.`default`.`./parquetfiles/countries/`
WHERE `YEAR` =2017 AND `MONTH` = 2
group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`

When I include `Capacity` column in above query it gives me below error :

> UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes


When I exclude the `Capacity` column the query runs fine and I could view
the expected results .

Few approaches/verification  which I tried after browsing the internet
regarding the error whic i encounter :
1 : alter session set `planner.enable_streamagg` = false;
    alter session set `planner.enable_hashjoin` = false;

Still the same error



>  UNSUPPORTED_OPERATION ERROR: Hash aggregate does not support schema
> changes Fragment


2 : somebody suggested that if  some of your files are empty you will
encounter the error but I checked and confirm that none of my parquet file
is empty.

3.I have same schema across all the files .

4. If I query the files individually there is no error and It works fine .
For instance if my query is
SELECT
`COUNTRY`,
`YEAR`,
`MONTH`,
`DAY`,
`Capacity`,
SUM(CAST(`UserCount` AS INT)) AS `Dailyusers`
FROM `dfs`.`default`.`./parquetfiles/countries/china/0_0_0.parquet`
WHERE `YEAR` =2017 AND `MONTH` = 2
group by `COUNTRY`,`YEAR`,`MONTH`,`DAY`,`Office``Capacity`

Above query works fine for all the countries which I have under countries
directory.
But if I query until directory countries its throws error .

Appreciate the help!

Thanks,
Divya

Reply via email to