[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects
[ https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14320321#comment-14320321 ] Sergio Peña commented on HIVE-9605: --- This test passes in 'parquet' branch. The patch required the HIVE-9333 patch in order to run correctly. Remove parquet nested objects from wrapper writable objects --- Key: HIVE-9605 URL: https://issues.apache.org/jira/browse/HIVE-9605 Project: Hive Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch Parquet nested types are using an extra wrapper object (ArrayWritable) as a wrapper of map and list elements. This extra object is not needed and causing unnecessary memory allocations. An example of code is on HiveCollectionConverter.java: {noformat} public void end() { parent.set(index, wrapList(new ArrayWritable( Writable.class, list.toArray(new Writable[list.size()]; } {noformat} This object is later unwrapped on AbstractParquetMapInspector, i.e.: {noformat} final Writable[] mapContainer = ((ArrayWritable) data).get(); final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get(); for (final Writable obj : mapArray) { ... } {noformat} We should get rid of this wrapper object to save time and memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects
[ https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14319061#comment-14319061 ] Hive QA commented on HIVE-9605: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12698243/HIVE-9605.4.patch {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 7541 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testArrayOfArrays org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testArrayType org.apache.hadoop.hive.ql.io.parquet.TestDataWritableWriter.testMapType {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2780/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2780/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2780/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12698243 - PreCommit-HIVE-TRUNK-Build Remove parquet nested objects from wrapper writable objects --- Key: HIVE-9605 URL: https://issues.apache.org/jira/browse/HIVE-9605 Project: Hive Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch Parquet nested types are using an extra wrapper object (ArrayWritable) as a wrapper of map and list elements. This extra object is not needed and causing unnecessary memory allocations. An example of code is on HiveCollectionConverter.java: {noformat} public void end() { parent.set(index, wrapList(new ArrayWritable( Writable.class, list.toArray(new Writable[list.size()]; } {noformat} This object is later unwrapped on AbstractParquetMapInspector, i.e.: {noformat} final Writable[] mapContainer = ((ArrayWritable) data).get(); final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get(); for (final Writable obj : mapArray) { ... } {noformat} We should get rid of this wrapper object to save time and memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects
[ https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318403#comment-14318403 ] Sergio Peña commented on HIVE-9605: --- Patch reviewed on RB. Remove parquet nested objects from wrapper writable objects --- Key: HIVE-9605 URL: https://issues.apache.org/jira/browse/HIVE-9605 Project: Hive Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch Parquet nested types are using an extra wrapper object (ArrayWritable) as a wrapper of map and list elements. This extra object is not needed and causing unnecessary memory allocations. An example of code is on HiveCollectionConverter.java: {noformat} public void end() { parent.set(index, wrapList(new ArrayWritable( Writable.class, list.toArray(new Writable[list.size()]; } {noformat} This object is later unwrapped on AbstractParquetMapInspector, i.e.: {noformat} final Writable[] mapContainer = ((ArrayWritable) data).get(); final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get(); for (final Writable obj : mapArray) { ... } {noformat} We should get rid of this wrapper object to save time and memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects
[ https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14318592#comment-14318592 ] Brock Noland commented on HIVE-9605: +1 pending tests Remove parquet nested objects from wrapper writable objects --- Key: HIVE-9605 URL: https://issues.apache.org/jira/browse/HIVE-9605 Project: Hive Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9605.3.patch, HIVE-9605.4.patch Parquet nested types are using an extra wrapper object (ArrayWritable) as a wrapper of map and list elements. This extra object is not needed and causing unnecessary memory allocations. An example of code is on HiveCollectionConverter.java: {noformat} public void end() { parent.set(index, wrapList(new ArrayWritable( Writable.class, list.toArray(new Writable[list.size()]; } {noformat} This object is later unwrapped on AbstractParquetMapInspector, i.e.: {noformat} final Writable[] mapContainer = ((ArrayWritable) data).get(); final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get(); for (final Writable obj : mapArray) { ... } {noformat} We should get rid of this wrapper object to save time and memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-9605) Remove parquet nested objects from wrapper writable objects
[ https://issues.apache.org/jira/browse/HIVE-9605?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14313436#comment-14313436 ] Hive QA commented on HIVE-9605: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12697518/HIVE-9605.2.patch {color:red}ERROR:{color} -1 due to 14 failed/errored test(s), 7540 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAmbiguousSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroPrimitiveInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testAvroSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testHiveRequiredGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testMultiFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewOptionalGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testNewRequiredGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftPrimitiveInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testThriftSingleFieldGroupInList org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testUnannotatedListOfGroups org.apache.hadoop.hive.ql.io.parquet.TestArrayCompatibility.testUnannotatedListOfPrimitives org.apache.hadoop.hive.ql.io.parquet.TestParquetSerDe.testParquetHiveSerDe org.apache.hadoop.hive.ql.io.parquet.serde.TestAbstractParquetMapInspector.testEmptyContainer org.apache.hadoop.hive.ql.io.parquet.serde.TestParquetHiveArrayInspector.testEmptyContainer {noformat} Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2729/testReport Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-TRUNK-Build/2729/console Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-TRUNK-Build-2729/ Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 14 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12697518 - PreCommit-HIVE-TRUNK-Build Remove parquet nested objects from wrapper writable objects --- Key: HIVE-9605 URL: https://issues.apache.org/jira/browse/HIVE-9605 Project: Hive Issue Type: Sub-task Affects Versions: 0.14.0 Reporter: Sergio Peña Assignee: Sergio Peña Attachments: HIVE-9605.2.patch Parquet nested types are using an extra wrapper object (ArrayWritable) as a wrapper of map and list elements. This extra object is not needed and causing unnecessary memory allocations. An example of code is on HiveCollectionConverter.java: {noformat} public void end() { parent.set(index, wrapList(new ArrayWritable( Writable.class, list.toArray(new Writable[list.size()]; } {noformat} This object is later unwrapped on AbstractParquetMapInspector, i.e.: {noformat} final Writable[] mapContainer = ((ArrayWritable) data).get(); final Writable[] mapArray = ((ArrayWritable) mapContainer[0]).get(); for (final Writable obj : mapArray) { ... } {noformat} We should get rid of this wrapper object to save time and memory. -- This message was sent by Atlassian JIRA (v6.3.4#6332)