-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/31800/
-----------------------------------------------------------
(Updated March 20, 2015, 3:59 p.m.)

Review request for hive, Ryan Blue and cheng xu.

Changes
-------

New patch with changes.

Bugs: HIVE-9658
    https://issues.apache.org/jira/browse/HIVE-9658

Repository: hive-git

Description
-------

This patch passes primitive Java objects directly to the Hive object inspectors instead of wrapping them in primitive Writable objects, which reduces memory usage.

I did not do the same for the more complex types, such as binary, decimal, and date/timestamp, because their Writable objects are needed in other parts of the code, and creating them later is slower (fewer ops/s) than creating them up front.

Diffs (updated)
-----

  itests/hive-jmh/src/main/java/org/apache/hive/benchmark/storage/ColumnarStorageBench.java 4f6985cd13017ce37f4f0c100b16a27aa5b02f8b
  ql/src/java/org/apache/hadoop/hive/ql/exec/vector/VectorColumnAssignFactory.java c915f728fc9b27da0fabefab5d8f5faa53640b78
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/MapredParquetInputFormat.java 0391229723cc3ecef551fa44b8456b0d2ac93fb5
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/VectorizedParquetInputFormat.java d7edd52614771857d1b21971a66894841c248ef9
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ConverterParent.java 6ff6b473c9f1867bc14bb597094ddb92487cc954
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/DataWritableRecordConverter.java a43661eb54ba29692c07c264584b5aecf648ef99
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/ETypeConverter.java 3fc012970e23bbc188ce2a2e2ba0b04bc6f22317
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveCollectionConverter.java f1c8b6f13718b37f590263e5b35ed6c327f5cf4f
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveGroupConverter.java c6d03a19029d5bcc86b998dd7a8609973648c103
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/HiveStructConverter.java f95d15eddc21bc432fa53572de5756751a13341a
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/convert/Repeated.java ee57b31dac53d99af0c5a520f51102796ca32fd3
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/DataWritableReadSupport.java 57ae7a9740d55b407cadfc8bc030593b29f90700
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/read/ParquetRecordReaderWrapper.java a26199612cf338e336f210f29acb0398c536e1f9
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/AbstractParquetMapInspector.java 49bf1c5325833993f4c09efdf1546af560783c28
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ArrayWritableObjectInspector.java 609188206f88e296d893b84bcaaab53f974e6b7d
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/DeepParquetHiveMapInspector.java 143d72e76502d4877e8208181d9743259051dcea
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ObjectArrayWritableObjectInspector.java PRE-CREATION
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveArrayInspector.java bde0dcbb3978ba47b15ae2c9bbe2f87ed3984ab1
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/ParquetHiveSerDe.java 7fd5e9612d4e3c9bf3b816bc48dbdbe59fb8a5a8
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/StandardParquetHiveMapInspector.java 22250b30a14d52907fb22d4f44b93c7633c6a89e
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetByteInspector.java 864f56292fa4856df155f546064e4a6732cc663f
  ql/src/java/org/apache/hadoop/hive/ql/io/parquet/serde/primitive/ParquetShortInspector.java 39f265777c7e164382117e3902c3b6e491295f70
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/AbstractTestParquetDirect.java 3a476731e31bf38822f0d530f0aea2eadb675a49
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestArrayCompatibility.java d45d8eeb9e8a61f254098ab15d0305fc71152abd
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestDataWritableWriter.java 8f03c5b403332f7b36b2271a2246a0fc90b3bfba
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapStructures.java 3c7401ffbe88ce66b96f9cceab4e9c3d6267f8fe
  ql/src/test/org/apache/hadoop/hive/ql/io/parquet/TestMapredParquetInputFormat.java 1a54bf5797efd5859c9e665bcc7134168e5d193f
  serde/src/java/org/apache/hadoop/hive/serde2/io/ObjectArrayWritable.java PRE-CREATION

Diff: https://reviews.apache.org/r/31800/diff/

Testing
-------

Performance tests were run to validate this change.

Schema: int,double,boolean,string,array<int>,map<string,string>,struct<a:int,b:int>

- JMH (microbenchmarks) calls on Parquet reads.

  Before: 579 ops/s
  After:  651 ops/s

- YourKit Java Profiler to measure the memory objects recorded.
  Reading 20,000 random rows (10 times).

  Before:
    Objects recorded: 1,863,610
    Objects size: 42,373,808
    Total memory usage: 29%

  After:
    Objects recorded: 1,596,804
    Objects size: 34,192,832
    Total memory usage: 24%

All tests were run multiple times and produced the same results.

Thanks,

Sergio Pena
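
For reference, the relative changes implied by the Testing figures can be worked out with a few lines of Java. This is only a sketch for reading the numbers, not part of the patch; the class and method names (BenchmarkDelta, percentChange) are hypothetical, and the inputs are the before/after values reported in the Testing section.

```java
import java.util.Locale;

// Hypothetical helper, not part of the HIVE-9658 patch: it only restates
// the before/after figures from the Testing section as relative changes.
final class BenchmarkDelta {

    // Relative change from 'before' to 'after', in percent.
    // Positive means an increase, negative means a reduction.
    static double percentChange(double before, double after) {
        if (before == 0.0) {
            throw new IllegalArgumentException("before must be non-zero");
        }
        return (after - before) / before * 100.0;
    }

    public static void main(String[] args) {
        // JMH throughput on Parquet reads (ops/s).
        print("Throughput", percentChange(579, 651));
        // YourKit figures for reading 20,000 random rows, 10 times.
        print("Objects recorded", percentChange(1_863_610, 1_596_804));
        print("Objects size", percentChange(42_373_808, 34_192_832));
    }

    private static void print(String label, double pct) {
        // Locale.ROOT keeps the decimal separator stable across machines.
        System.out.println(String.format(Locale.ROOT, "%s: %+.1f%%", label, pct));
    }
}
```

With these inputs, throughput rises by roughly 12.4%, while objects recorded and objects size drop by roughly 14.3% and 19.3%. Total memory usage falls by 5 percentage points (29% to 24%).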