Hi Prasanth,

Thank you for the helpful information. I have been using the default ORC stripe size, which I believe is 67,108,864 bytes.
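For reference, both the stripe size and the compression buffer size can also be overridden per table. A minimal sketch, assuming the standard ORC table property keys (orc.stripe.size and orc.compress.size) and using the table name as it appears later in the thread; the smaller values below are purely illustrative, not recommendations:

  ALTER TABLE dynamic_partitioned_table SET TBLPROPERTIES (
    'orc.stripe.size'='16777216',    -- 16 MB stripes (default is 67108864)
    'orc.compress.size'='16384'      -- 16 KB compression buffers (default is 262144)
  );

Table properties like these should only affect files written after the change; existing files keep the settings they were written with.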
I was able to remove the constant value for dt as you suggested, and I set hive.optimize.sort.dynamic.partition=true. I saw in the new explain plan that the partitions are now being sorted, and rather than a map-only job I now have a map-reduce job, as expected. For some reason, though, I'm still getting the same error as before. The logs say the map is 100% complete and the reduce is 99% complete, and then the job fails with:

Container [pid=8962,containerID=container_e26_1460661845156_55073_01_005311] is running beyond physical memory limits. Current usage: 2.0 GB of 2 GB physical memory used; 2.7 GB of 4.2 GB virtual memory used. Killing container.

This seems strange, since there should be plenty of memory for 30 * 5 * 256 KB, which is only about 38 MB. I have tried reducing the compression buffer size from 256 KB to 16 KB and even 1 KB by setting "orc.compress.size" in the dynamic_partitioned_table properties, but the job still fails with the same error.

Thanks again for your explanations so far. I'll keep working on this and let you know if I work it out.

Matt

On Mon, May 2, 2016 at 3:43 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:

> Hi Matt
>
> So it looks like you are hitting the issue that I had mentioned previously. You might need to apply the patch from HIVE-12893. Alternatively, if dt has only one possible value, then it is better to remove the constant value for dt and the WHERE condition. This will enable the sorted dynamic partition optimization, which is more scalable when the combined partition count is huge.
>
> What is the stripe size that you are using?
>
> The reason why this causes OOM for ORC: ORC needs to buffer the incoming rows in a columnar way before writing them to the file. It buffers until the configured stripe size is reached, and then the entire stripe gets flushed. This is usually not a problem when there are few ORC writers. When there are multiple concurrent writers, the available memory is shared across all of them. In the case of dynamic partitioning, there will be 1 writer per partition and per bucket in each mapper/reducer.
>
> If there are 100 partitions, 4 buckets, and 25 columns, then the memory requirement will be 100 * 4 * 25 * 5 (approx. number of internal streams per column) * 256 KB (compression buffer size). This can get really huge as the number of partitions increases. The way around this memory requirement is to reduce the number of ORC writers. hive.optimize.sort.dynamic.partition sorts the data on the partition columns and bucket number, so there will be only 1 writer per mapper/reducer, reducing the memory requirement to 25 * 5 * 256 KB, which is much more manageable. If this needs to be reduced further, reduce the compression buffer size.
>
> Because of the bug outlined in HIVE-12893, in your case the optimization to have a single ORC writer is not kicking in, causing the OOM.
>
> Thanks
> Prasanth
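As a rough sketch of Prasanth's arithmetic above (assuming the default 262,144-byte compression buffer and roughly 5 streams per column, using his example figures; runnable only if your Hive version accepts a SELECT with no FROM clause, otherwise treat it as plain arithmetic):

  -- Per-task ORC writer buffer estimate, before and after the sorted
  -- dynamic partition optimization, using the example figures above.
  SELECT 100 * 4 * 25 * 5 * 262144 AS bytes_without_sort_opt,  -- ~13.1 GB per task
         25 * 5 * 262144           AS bytes_with_sort_opt;     -- ~33 MB per task

With the optimization not kicking in (per HIVE-12893), the first figure is roughly what a single task can end up trying to buffer, which dwarfs a 2 GB container.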
> On May 2, 2016, at 3:30 PM, Matt Olson <maolso...@gmail.com> wrote:
>
> Hi Prasanth,
>
> Here is the explain plan for the insert query:
>
> OK
> STAGE DEPENDENCIES:
>   Stage-1 is a root stage
>   Stage-7 depends on stages: Stage-1 , consists of Stage-4, Stage-3, Stage-5
>   Stage-4
>   Stage-0 depends on stages: Stage-4, Stage-3, Stage-6
>   Stage-2 depends on stages: Stage-0
>   Stage-3
>   Stage-5
>   Stage-6 depends on stages: Stage-5
>
> STAGE PLANS:
>   Stage: Stage-1
>     Map Reduce
>       Map Operator Tree:
>           TableScan
>             alias: original_table
>             Statistics: Num rows: 44962613 Data size: 264560040271 Basic stats: COMPLETE Column stats: NONE
>             Select Operator
>               expressions: ...
>               outputColumnNames: _col0, _col1, _col2, _col3, _col4, _col5, _col6, _col7, _col8, _col9, _col10, _col11, _col12, _col13, _col14, _col15, _col16, _col17, _col18, _col19, _col20, _col21, _col22, _col23, _col24, _col25, _col26, _col27
>               Statistics: Num rows: 44962613 Data size: 264560040271 Basic stats: COMPLETE Column stats: NONE
>               File Output Operator
>                 compressed: true
>                 Statistics: Num rows: 44962613 Data size: 264560040271 Basic stats: COMPLETE Column stats: NONE
>                 table:
>                     input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
>                     serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
>                     name: pin.dynamic_partitioned_table
>
>   Stage: Stage-7
>     Conditional Operator
>
>   Stage: Stage-4
>     Move Operator
>       files:
>           hdfs directory: true
>           destination: hdfs://ci-ocean/mnt/tmp/hive-molson/molson/9f6b1ce0-f71a-4c87-9440-77f09e3860eb/hive_2016-05-02_20-14-12_260_7512820923555713567-1/-ext-10000
>
>   Stage: Stage-0
>     Move Operator
>       tables:
>           partition:
>             dt 2016-04-05
>             title_id
>             title_id_type
>           replace: true
>           table:
>               input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>               output format: org.apache.hadoop.hive.ql.io.orc.OrcOutputFormat
>               serde: org.apache.hadoop.hive.ql.io.orc.OrcSerde
>               name: pin.dynamic_partitioned_table
>
>   Stage: Stage-2
>     Stats-Aggr Operator
>
>   Stage: Stage-3
>     Merge File Operator
>       Map Operator Tree:
>           ORC File Merge Operator
>       merge level: stripe
>       input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>
>   Stage: Stage-5
>     Merge File Operator
>       Map Operator Tree:
>           ORC File Merge Operator
>       merge level: stripe
>       input format: org.apache.hadoop.hive.ql.io.orc.OrcInputFormat
>
>   Stage: Stage-6
>     Move Operator
>       files:
>           hdfs directory: true
>           destination: hdfs://ci-ocean/mnt/tmp/hive-molson/molson/9f6b1ce0-f71a-4c87-9440-77f09e3860eb/hive_2016-05-02_20-14-12_260_7512820923555713567-1/-ext-10000
>
> Thank you,
> Matt
>
> On Mon, May 2, 2016 at 12:48 PM, Prasanth Jayachandran <pjayachand...@hortonworks.com> wrote:
>
>> Hi
>>
>> Can you please post the explain plan for your insert query? I suspect the sorted dynamic partition optimization is bailing out because of the constant value for the 'dt' column. If you are not seeing a reducer, then it is likely not using the sorted dynamic partition optimization. You are probably hitting this bug: https://issues.apache.org/jira/browse/HIVE-12893
>> I can confirm whether that's the case by looking at the explain plan.
>>
>> Thanks
>> Prasanth
>>
>> On May 2, 2016, at 2:24 PM, Ryan Harris <ryan.har...@zionsbancorp.com> wrote:
>>
>> Reading this:
>> "but when I add 2000 new titles with 300 rows each"
>> I'm thinking that you are over-partitioning your data....
>> I'm not sure exactly how that relates to the OOM error you are getting (it may not)....I'd test things out partitioning by date-only.... maybe date + title_type, but adding 2000+ dynamic partitions that each have 300 rows of data in them is asking for problems in Hive IMO...
>>
>> From: Matt Olson [mailto:maolso...@gmail.com]
>> Sent: Friday, April 29, 2016 7:50 PM
>> To: user@hive.apache.org
>> Subject: Container out of memory: ORC format with many dynamic partitions
>>
>> Hi all,
>>
>> I am using Hive 1.0.1 and trying to do a simple insert into an ORC table, creating dynamic partitions. I am selecting from a table partitioned by dt and category, and inserting into a table partitioned by dt, title, and title_type. Other than the partitioning, the tables have the same schemas. Both title and title_type are fields in the first table, and when I insert into the second table, I am using them to create dynamic partitions. The .q file with the CREATE and INSERT statements is copied below.
>>
>> SET hive.optimize.sort.dynamic.partition=true;
>> SET hive.exec.orc.memory.pool=1.0;
>> SET hive.exec.max.dynamic.partitions = 5000;
>> SET hive.exec.max.dynamic.partitions.pernode = 5000;
>> SET hive.merge.mapfiles = true;
>> SET mapred.min.split.size=134217728;
>> SET mapred.min.split.size.per.node=134217728;
>> SET mapred.min.split.size.per.rack=134217728;
>> SET mapred.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
>> SET mapred.map.output.compression.codec=com.hadoop.compression.lzo.LzoCodec;
>> SET mapred.max.split.size=134217728;
>> SET hive.map.aggr.hash.percentmemory=0.125;
>> SET hive.exec.parallel=true;
>> SET hive.exec.compress.intermediate=true;
>> SET hive.exec.compress.output=true;
>> SET mapred.map.child.java.opts=-Xmx2048M;
>> SET mapred.child.java.opts=-Xmx2048M;
>> SET mapred.task.profile=false;
>>
>> CREATE EXTERNAL TABLE IF NOT EXISTS dynamic_partition_table (
>>   field1 string,
>>   field2 string,
>>   ...
>>   field26 string
>> )
>> PARTITIONED BY (dt string, title string, title_type string)
>> STORED AS ORC
>> LOCATION '/hive/warehouse/partitioned_table'
>> TBLPROPERTIES ("orc.compress.size"="16000");
>>
>> INSERT OVERWRITE TABLE dynamic_partition_table PARTITION (dt="2016-04-05", title, title_type)
>> SELECT
>>   field1,
>>   field2,
>>   ...
>>   title,
>>   title_type
>> FROM original_table
>> WHERE dt = "2016-04-05";
>>
>> The original table has about 250 GB of data for 2016-04-05, and about 260 different titles (some titles have very little data, some have ~20 GB). There is generally only one title_type per title. The INSERT action succeeds on that data set, but when I add 2000 new titles with 300 rows each to the original table, I get the following error during the INSERT:
>>
>> Container [pid=6278,containerID=container_e26_1460661845156_49295_01_000244] is running beyond physical memory limits. Current usage: 2.2 GB of 2 GB physical memory used; 2.7 GB of 4.2 GB virtual memory used. Killing container.
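One knob that is not suggested anywhere in this thread, but relates directly to the message above: the 2 GB limit is the MapReduce container size, which can be raised along with the task heap. A minimal sketch, assuming the MRv2 property names and purely illustrative sizes:

  SET mapreduce.map.memory.mb=4096;         -- container size for map tasks
  SET mapreduce.reduce.memory.mb=4096;      -- container size for reduce tasks
  SET mapreduce.map.java.opts=-Xmx3276m;    -- keep the heap around 80% of the container
  SET mapreduce.reduce.java.opts=-Xmx3276m;

This only buys headroom; the number of concurrent ORC writers remains the underlying issue.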
>> I've found a couple of questions online about this same error message for ORC files with lots of dynamic partitions, on an older version of Hive:
>> https://qnalist.com/questions/4836037/hive-0-12-orc-heap-issues-on-write
>>
>> Based on that and the information about configuration properties at https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties#ConfigurationProperties-ORCFileFormat, I have tried setting hive.exec.orc.memory.pool=1.0 in order to give as much heap space as possible to the ORC file writers. As you can see from the CREATE TABLE statement, I also decreased orc.compress.size from the default 256 KB to 16 KB. After making these changes, the INSERT is still failing with the "beyond physical memory limits" error.
>>
>> I've tried inserting into a table stored as RCFile rather than ORC, and in that case the action succeeds even with the additional 2000 titles.
>>
>> Can anyone explain how exactly the two ORC parameters above affect the writing of dynamic partitions in ORC files, and why I'm not getting the OOM error when I use the RCFile format instead? I'd also appreciate any suggestions for other tuning I could do to fix the memory management when using ORC.
>>
>> Thanks for any help,
>> Matt
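As an illustration of Ryan's earlier suggestion to partition more coarsely (dt plus title_type only, keeping title as an ordinary column), here is a sketch of how the table and insert might look; the table name is hypothetical and the column list is abbreviated:

  CREATE EXTERNAL TABLE IF NOT EXISTS coarse_partition_table (
    field1 string,
    field2 string,
    field26 string,
    title string                                   -- title is now a plain column
  )
  PARTITIONED BY (dt string, title_type string)    -- only a handful of dynamic partitions
  STORED AS ORC
  LOCATION '/hive/warehouse/coarse_partition_table'
  TBLPROPERTIES ("orc.compress.size"="16000");

  INSERT OVERWRITE TABLE coarse_partition_table PARTITION (dt="2016-04-05", title_type)
  SELECT
    field1,
    field2,
    field26,
    title,
    title_type
  FROM original_table
  WHERE dt = "2016-04-05";

With only a few open partition combinations per task, each mapper/reducer holds far fewer concurrent ORC writers, which is the same effect the sorted dynamic partition optimization is meant to achieve.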