Review Request 57353: Intern Properties objects referenced from PartitionDesc to reduce memory pressure.

Misha Dmitriev Mon, 06 Mar 2017 17:22:36 -0800

-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/57353/
-----------------------------------------------------------


Review request for hive, Chaozhong Yang, Alan Gates, Rui Li, Prasanth_J, Sergio 
Pena, Sahil Takiar, Vihang Karajgaonkar, and Xuefu Zhang.


Bugs: HIVE-16079
    https://issues.apache.org/jira/browse/HIVE-16079


Repository: hive-git


Description
-------

When multiple Hive queries run concurrently, each query instance
creates its own copy of org.apache.hadoop.hive.ql.metadata.Partition
and org.apache.hadoop.hive.ql.plan.PartitionDesc for every table
partition. In my benchmark, described in HIVE-16079, we have 2000
partitions and 50 concurrent queries running over them, so in the
worst case we end up with 2000*50=100,000 instances of Partition and
PartitionDesc in memory. These objects themselves collectively take
just ~2% of memory. However, the other data structures that each of
them references take a lot more. In particular, Properties objects
take more than 20% of memory. With 50 concurrent read-only queries,
there are 50 identical copies of Properties for each partition.
That's a huge waste of memory.

This change introduces a new class that extends Properties, called
CopyOnFirstWriteProperties. It uses a single interned copy of
Properties whenever possible. A private copy is created only when one
of the methods that modify properties is called. When this class is
used, memory consumption by Properties falls from 20% to 5-6%.
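
To illustrate the idea, here is a minimal sketch of the copy-on-first-write
technique. It is not the code in the attached diff: the class name
CopyOnFirstWriteSketch, the POOL map, and the sharesStorageWith() helper
are hypothetical names chosen for this example, and only a few of the
Properties methods are overridden. It also assumes that the source
Properties is a plain instance without a defaults chain.

```java
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative sketch only (hypothetical class, not the one in the patch).
 * Instances with equal contents share one interned Properties object until
 * the first mutating call, at which point a private copy is made.
 */
class CopyOnFirstWriteSketch extends Properties {
    // Hypothetical intern pool keyed by content. Pooled instances are never
    // mutated after insertion, so their hashCode stays stable. A real
    // interner would probably hold entries weakly to avoid a leak.
    private static final Map<Properties, Properties> POOL = new ConcurrentHashMap<>();

    private Properties delegate; // interned (shared) or private copy
    private boolean owned;       // true once a private copy has been made

    CopyOnFirstWriteSketch(Properties source) {
        // Snapshot the entries so later changes to 'source' cannot leak in.
        Properties snapshot = new Properties();
        snapshot.putAll(source);
        Properties prior = POOL.putIfAbsent(snapshot, snapshot);
        delegate = (prior != null) ? prior : snapshot;
    }

    // Reads go straight to whichever object currently backs this instance.
    @Override
    public synchronized String getProperty(String key) {
        return delegate.getProperty(key);
    }

    @Override
    public synchronized int size() {
        return delegate.size();
    }

    // Writes first detach from the shared instance.
    @Override
    public synchronized Object setProperty(String key, String value) {
        ensureOwned();
        return delegate.setProperty(key, value);
    }

    @Override
    public synchronized Object remove(Object key) {
        ensureOwned();
        return delegate.remove(key);
    }

    // Replace the shared delegate with a private copy, once.
    private void ensureOwned() {
        if (!owned) {
            Properties copy = new Properties();
            copy.putAll(delegate);
            delegate = copy;
            owned = true;
        }
    }

    // Helper for the example: do both instances use the same backing object?
    synchronized boolean sharesStorageWith(CopyOnFirstWriteSketch other) {
        return delegate == other.delegate;
    }
}
```

In this sketch, two instances built from equal Properties share the same
backing object until one of them calls setProperty() or remove(). After
that call the writer has its own copy and the other instance still sees
the original values. A complete version would need to route every read
and write method of Properties (put, get, entrySet, load, and so on)
through the same logic.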


Diffs
-----

  common/src/java/org/apache/hadoop/hive/common/CopyOnFirstWriteProperties.java 
PRE-CREATION 
  ql/src/java/org/apache/hadoop/hive/ql/exec/SerializationUtilities.java 
247d5890ea8131404b9543d22876ca4c052578e0 
  ql/src/java/org/apache/hadoop/hive/ql/plan/PartitionDesc.java 
d05c1c68fdb7296c0346d73967071da1ebe7bb72 


Diff: https://reviews.apache.org/r/57353/diff/1/


Testing
-------


Thanks,

Misha Dmitriev
