Hi, I've been using Impala in production for a while. Since yesterday, some queries have been failing with "Memory limit exceeded". I then tried a very simple count query, and it fails with the same error.
The query is:

    select count(0) from adhoc_data_fast.log where day>='2017-04-04' and day<='2017-04-06';

The response in the Impala shell is:

    Query submitted at: 2017-04-06 00:41:00 (Coordinator: http://szq7.appadhoc.com:25000)
    Query progress can be monitored at: http://szq7.appadhoc.com:25000/query_plan?query_id=4947a3fecd146df4:734bcc1d00000000
    WARNINGS: Memory limit exceeded
    GzipDecompressor failed to allocate 54525952000 bytes.

I have many nodes, and each of them has plenty of memory available (~60 GB). The query fails almost immediately after I run it, while the nodes show almost no memory usage.

The table "adhoc_data_fast.log" is an Avro table compressed with gzip and partitioned by the field "day". Each partition has no more than one billion rows.

My Impala version is:

    hdfs@szq7:/home/ubuntu$ impalad --version
    impalad version 2.7.0-cdh5.9.1 RELEASE (build 24ad6df788d66e4af9496edb26ac4d1f1d2a1f2c)
    Built on Wed Jan 11 13:39:25 PST 2017

Can anyone help with this? Thanks very much!
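
One note on scale: the 54525952000 bytes in the error is a single allocation request, which is close to the total memory of one node. Below is a quick Python sketch of the unit conversion. It is plain arithmetic, not Impala output, and the function names are only illustrative.

```python
# Convert the byte count from the GzipDecompressor error into larger units.
# The helper names are illustrative and not part of Impala.

FAILED_ALLOCATION_BYTES = 54525952000  # value copied from the error message


def bytes_to_mib(n_bytes: int) -> float:
    """Return the size in mebibytes (1 MiB = 2**20 bytes)."""
    return n_bytes / 2**20


def bytes_to_gib(n_bytes: int) -> float:
    """Return the size in gibibytes (1 GiB = 2**30 bytes)."""
    return n_bytes / 2**30


if __name__ == "__main__":
    print(f"{bytes_to_mib(FAILED_ALLOCATION_BYTES):.0f} MiB")
    print(f"{bytes_to_gib(FAILED_ALLOCATION_BYTES):.2f} GiB")
```

So the decompressor asked for roughly 50.8 GiB in one request, even though the nodes showed almost no memory in use at the time.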
