Apparently you have a gzipped file that is >=50GB. You either need to
break up those files, or run on larger machines.
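For context, the allocation size in the error message quoted below works out to roughly 51 GiB, which already exceeds what a ~60 GB node can hand to a single decompressor once Impala's own memory limits are accounted for. A quick sanity check (plain arithmetic, nothing Impala-specific):

```python
# Allocation size reported by "GzipDecompressor failed to allocate ... bytes"
# in the error message quoted below.
failed_alloc_bytes = 54525952000

# Convert to GiB (1 GiB = 1024**3 bytes).
gib = failed_alloc_bytes / 1024**3
print(f"{gib:.2f} GiB")  # prints "50.78 GiB"
```

Because a gzip stream is not splittable, one scanner has to buffer that entire decompressed stream, so the query fails immediately regardless of how much memory the other nodes have free.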

On Wed, Apr 5, 2017 at 9:52 AM, Bin Wang <[email protected]> wrote:
> Hi,
>
> I've been using Impala in production for a while, but since yesterday some
> queries have been reporting "memory limit exceeded". I then tried a very
> simple count query, and it still exceeds the memory limit.
>
> The query is:
>
> select count(0) from adhoc_data_fast.log where day>='2017-04-04' and
> day<='2017-04-06';
>
> And the response in the Impala shell is:
>
> Query submitted at: 2017-04-06 00:41:00 (Coordinator:
> http://szq7.appadhoc.com:25000)
> Query progress can be monitored at:
> http://szq7.appadhoc.com:25000/query_plan?query_id=4947a3fecd146df4:734bcc1d00000000
> WARNINGS:
> Memory limit exceeded
> GzipDecompressor failed to allocate 54525952000 bytes.
>
> I have many nodes, and each of them has lots of memory available (~60 GB).
> The query fails very quickly after I execute it, and the nodes show almost
> no memory usage.
>
> The table "adhoc_data_fast.log" is an Avro table, compressed with gzip and
> partitioned by the field "day". Each partition has no more than one billion
> rows.
>
> My Impala version is:
>
> hdfs@szq7:/home/ubuntu$ impalad --version
> impalad version 2.7.0-cdh5.9.1 RELEASE (build
> 24ad6df788d66e4af9496edb26ac4d1f1d2a1f2c)
> Built on Wed Jan 11 13:39:25 PST 2017
>
> Can anyone help with this? Thanks very much!
>