On Wed, Apr 5, 2017 at 10:14 AM, Bin Wang <[email protected]> wrote:
> Will Impala load the whole file into memory? That sounds horrible. And
> with "show partitions adhoc_data_fast.log", the compressed files are no
> bigger than 4GB:

The *uncompressed* size of one of your files is about 50GB, which is the
54525952000 bytes in your error message. The gzip decompressor needs to
allocate memory for that.
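
As a rough sanity check on that figure, here is a small sketch that converts
the byte count from the warning into GiB and compares it with the per-node
memory mentioned in the original post. Reading "~ 60 GB" as 60 GiB is an
assumption, and the names below are illustrative only, not anything from
Impala.

```python
# Figure taken from the warning quoted in this thread:
#   "GzipDecompressor failed to allocate 54525952000 bytes."
FAILED_ALLOC_BYTES = 54525952000

# Assumption: "~ 60 GB" per node is read here as 60 GiB; the thread does not
# say which unit was meant or how much of it Impala is allowed to use.
NODE_MEMORY_BYTES = 60 * 1024**3


def bytes_to_gib(n: int) -> float:
    """Convert a byte count to GiB (1 GiB = 1024**3 bytes)."""
    return n / 1024**3


alloc_gib = bytes_to_gib(FAILED_ALLOC_BYTES)
share = FAILED_ALLOC_BYTES / NODE_MEMORY_BYTES

print(f"requested allocation: {alloc_gib:.2f} GiB")
print(f"share of one node's memory: {share:.0%}")
```

Under those assumptions the single allocation is about 50.78 GiB, roughly 85%
of one node's memory, so it can fail even when the node is otherwise idle.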

>
> | 2017-04-04 | -1    | 46     | 2.69GB   | NOT CACHED   | NOT CACHED | AVRO   | false             | hdfs://hfds-service/user/hive/warehouse/adhoc_data_fast.db/log/2017-04-04 |
> | 2017-04-05 | -1    | 25     | 3.42GB   | NOT CACHED   | NOT CACHED | AVRO   | false             | hdfs://hfds-service/user/hive/warehouse/adhoc_data_fast.db/log/2017-04-05 |
>
>
> On Thu, Apr 6, 2017 at 12:58 AM, Marcel Kornacker <[email protected]> wrote:
>>
>> Apparently you have a gzipped file that is >=50GB. You either need to
>> break up those files, or run on larger machines.
>>
>> On Wed, Apr 5, 2017 at 9:52 AM, Bin Wang <[email protected]> wrote:
>> > Hi,
>> >
>> > I've been using Impala in production for a while. But since yesterday,
>> > some queries have reported "memory limit exceeded". I then tried a very
>> > simple count query, and it also fails with "memory limit exceeded".
>> >
>> > The query is:
>> >
>> > select count(0) from adhoc_data_fast.log where day>='2017-04-04' and
>> > day<='2017-04-06';
>> >
>> > And the response in the Impala shell is:
>> >
>> > Query submitted at: 2017-04-06 00:41:00 (Coordinator:
>> > http://szq7.appadhoc.com:25000)
>> > Query progress can be monitored at:
>> >
>> > http://szq7.appadhoc.com:25000/query_plan?query_id=4947a3fecd146df4:734bcc1d00000000
>> > WARNINGS:
>> > Memory limit exceeded
>> > GzipDecompressor failed to allocate 54525952000 bytes.
>> >
>> > I have many nodes and each of them has lots of memory available (~ 60
>> > GB). The query fails very quickly after I execute it, and the nodes
>> > have almost no memory usage at the time.
>> >
>> > The table "adhoc_data_fast.log" is an Avro table compressed with gzip
>> > and partitioned by the field "day". Each partition has no more than
>> > one billion rows.
>> >
>> > My Impala version is:
>> >
>> > hdfs@szq7:/home/ubuntu$ impalad --version
>> > impalad version 2.7.0-cdh5.9.1 RELEASE (build
>> > 24ad6df788d66e4af9496edb26ac4d1f1d2a1f2c)
>> > Built on Wed Jan 11 13:39:25 PST 2017
>> >
>> > Can anyone help with this? Thanks very much!
>> >
