[jira] [Commented] (HIVE-15507) Nested column pruning: fix issue when selecting struct field from array/map element
[ https://issues.apache.org/jira/browse/HIVE-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773909#comment-15773909 ]

Chao Sun commented on HIVE-15507:
---

I believe the test failures are unrelated. [~Ferd], can you take a look at this patch? thanks.

> Nested column pruning: fix issue when selecting struct field from array/map element
> ---
>
> Key: HIVE-15507
> URL: https://issues.apache.org/jira/browse/HIVE-15507
> Project: Hive
> Issue Type: Sub-task
> Components: Logical Optimizer, Physical Optimizer, Serializers/Deserializers
> Affects Versions: 2.2.0
> Reporter: Chao Sun
> Assignee: Chao Sun
> Attachments: 15507.1.patch
>
>
> When running the following query:
> {code}
> SELECT count(col), arr[0].f
> FROM tbl
> GROUP BY arr[0].f
> {code}
> where {{arr}} is an array of struct with field {{f}}. Nested column pruning
> will fail. This is because we currently process {{GenericUDFIndex}} in the
> same way as any other UDF. In this case, it will generate path {{arr.f}},
> which will not match the struct type info when doing the pruning.
> Same thing for map.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
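The mismatch described in HIVE-15507 can be illustrated with a toy type model. This is only a sketch under stated assumptions: the classes `Type`, `Primitive`, `ListType`, `StructType` and both lookup methods are hypothetical and are not Hive's type-info classes, and stepping through list element types is just one plausible way to make the path {{arr.f}} resolve, not necessarily what 15507.1.patch does.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public final class NestedPathSketch {
    private NestedPathSketch() {}

    // Toy type model, for illustration only (not Hive's TypeInfo hierarchy).
    public interface Type {}

    public static final class Primitive implements Type {}

    public static final class ListType implements Type {
        final Type element;
        public ListType(Type element) { this.element = element; }
    }

    public static final class StructType implements Type {
        final Map<String, Type> fields = new LinkedHashMap<>();
        public StructType with(String name, Type type) {
            fields.put(name, type);
            return this;
        }
    }

    // Strict lookup: every remaining path step must name a field of a struct.
    public static boolean resolvesStrict(Type type, String[] path, int i) {
        if (i == path.length) return true;
        if (!(type instanceof StructType)) return false;
        Type next = ((StructType) type).fields.get(path[i]);
        return next != null && resolvesStrict(next, path, i + 1);
    }

    // Assumed alternative: unwrap list element types before matching a field name.
    public static boolean resolvesThroughLists(Type type, String[] path, int i) {
        while (type instanceof ListType) {
            type = ((ListType) type).element;
        }
        if (i == path.length) return true;
        if (!(type instanceof StructType)) return false;
        Type next = ((StructType) type).fields.get(path[i]);
        return next != null && resolvesThroughLists(next, path, i + 1);
    }

    public static void main(String[] args) {
        // Column arr has type array<struct<f:...>>.
        Type arr = new ListType(new StructType().with("f", new Primitive()));
        // Remainder of the path "arr.f" after the column name.
        String[] path = {"f"};
        System.out.println("strict lookup resolves:       " + resolvesStrict(arr, path, 0));
        System.out.println("list-aware lookup resolves:   " + resolvesThroughLists(arr, path, 0));
    }
}
```

The strict lookup fails because the column type is a list rather than a struct, which mirrors the "will not match the struct type info" symptom in the description; a map-valued column would need the same kind of unwrapping for its value type.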
[jira] [Commented] (HIVE-15507) Nested column pruning: fix issue when selecting struct field from array/map element
[ https://issues.apache.org/jira/browse/HIVE-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773881#comment-15773881 ] Hive QA commented on HIVE-15507: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844611/15507.1.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 10866 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=144) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_char_mapjoin1.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=93) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2719/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2719/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2719/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: 
TestsFailedException: 4 tests failed {noformat} This message is automatically generated. ATTACHMENT ID: 12844611 - PreCommit-HIVE-Build
[jira] [Updated] (HIVE-15507) Nested column pruning: fix issue when selecting struct field from array/map element
[ https://issues.apache.org/jira/browse/HIVE-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15507: Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15507) Nested column pruning: fix issue when selecting struct field from array/map element
[ https://issues.apache.org/jira/browse/HIVE-15507?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15507: Attachment: 15507.1.patch
[jira] [Updated] (HIVE-15488) Native Vector MapJoin fails when trying to serialize BigTable rows that have (unreferenced) complex types
[ https://issues.apache.org/jira/browse/HIVE-15488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Matt McCline updated HIVE-15488:

Resolution: Fixed
Status: Resolved (was: Patch Available)

> Native Vector MapJoin fails when trying to serialize BigTable rows that have (unreferenced) complex types
> ---
>
> Key: HIVE-15488
> URL: https://issues.apache.org/jira/browse/HIVE-15488
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Reporter: Matt McCline
> Assignee: Matt McCline
> Priority: Critical
> Fix For: 2.2.0
>
> Attachments: HIVE-15488.01.patch
>
>
> When creating VectorSerializeRow we need to exclude any complex types.
[jira] [Updated] (HIVE-15488) Native Vector MapJoin fails when trying to serialize BigTable rows that have (unreferenced) complex types
[ https://issues.apache.org/jira/browse/HIVE-15488?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-15488: Fix Version/s: 2.2.0
[jira] [Commented] (HIVE-15488) Native Vector MapJoin fails when trying to serialize BigTable rows that have (unreferenced) complex types
[ https://issues.apache.org/jira/browse/HIVE-15488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773740#comment-15773740 ] Matt McCline commented on HIVE-15488: Committed to master.
[jira] [Commented] (HIVE-15488) Native Vector MapJoin fails when trying to serialize BigTable rows that have (unreferenced) complex types
[ https://issues.apache.org/jira/browse/HIVE-15488?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773726#comment-15773726 ] Matt McCline commented on HIVE-15488: Test failures are unrelated.
[jira] [Commented] (HIVE-8373) OOM for a simple query with spark.master=local [Spark Branch]
[ https://issues.apache.org/jira/browse/HIVE-8373?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773646#comment-15773646 ]

Andrew Sears commented on HIVE-8373:

[~lirui] Would be best to control the upper bounds to limit effects of OOM on the entire system. There are some memory leaks in Spark that may be caught by setting this to a lower setting.

> OOM for a simple query with spark.master=local [Spark Branch]
> ---
>
> Key: HIVE-8373
> URL: https://issues.apache.org/jira/browse/HIVE-8373
> Project: Hive
> Issue Type: Bug
> Components: Spark
> Reporter: Xuefu Zhang
> Assignee: liyunzhang_intel
>
> I have a straightforward query to run in Spark local mode, but get an OOM
> even though the data volume is tiny:
> {code}
> Exception in thread "Spark Context Cleaner"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Spark Context Cleaner"
> Exception in thread "Executor task launch worker-1"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Executor task launch worker-1"
> Exception in thread "Keep-Alive-Timer"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Keep-Alive-Timer"
> Exception in thread "Driver Heartbeater"
> Exception: java.lang.OutOfMemoryError thrown from the UncaughtExceptionHandler in thread "Driver Heartbeater"
> {code}
> The query is:
> {code}
> select product_name, avg(item_price) as avg_price from product join item on
> item.product_pk=product.product_pk group by product_name order by avg_price;
> {code}
[jira] [Work started] (HIVE-15055) Column pruning for nested fields in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Work on HIVE-15055 started by Chao Sun.
---

> Column pruning for nested fields in Parquet
> ---
>
> Key: HIVE-15055
> URL: https://issues.apache.org/jira/browse/HIVE-15055
> Project: Hive
> Issue Type: New Feature
> Components: Logical Optimizer, Physical Optimizer, Serializers/Deserializers
> Reporter: Chao Sun
> Assignee: Chao Sun
> Labels: performance
> Attachments: benchmark-hos.pdf, design-doc-nested-column-pruning.pdf
>
>
> Some columnar file formats such as Parquet also store the fields of a struct
> type column by column, using the encoding described in the Google Dremel paper.
> It is very common in big data for data to be stored in structs while queries
> only need a subset of the fields in the structs. However, Hive presently still
> needs to read the whole struct regardless of whether all fields are selected.
> Therefore, pruning unwanted sub-fields in structs or nested fields at file
> reading time would be a big performance boost for such scenarios.
[jira] [Commented] (HIVE-15503) LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators
[ https://issues.apache.org/jira/browse/HIVE-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773505#comment-15773505 ]

Gunther Hagleitner commented on HIVE-15503:
---

LGTM +1. [~gopalv] and [~sershe] might also be interested. For the TODOs in the code, can you open jiras? Or are you planning to resolve before commit?

> LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators
> ---
>
> Key: HIVE-15503
> URL: https://issues.apache.org/jira/browse/HIVE-15503
> Project: Hive
> Issue Type: Bug
> Affects Versions: 2.2.0
> Reporter: Prasanth Jayachandran
> Assignee: Prasanth Jayachandran
> Priority: Critical
> Attachments: HIVE-15503.1.patch, HIVE-15503.2.patch
>
>
> {code}
> ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: maxHashTblMemory = (long) (memoryPercentage * Runtime.getRuntime().maxMemory());
> ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:// Total Free Memory = maxMemory() - Used Memory;
> ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:long totalFreeMemory = Runtime.getRuntime().maxMemory() -
> {code}
> This will not work very well with LLAP because of the memory sharing by
> executors.
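To make the concern in HIVE-15503 concrete, here is a minimal sketch of why sizing against the JVM-wide `Runtime.getRuntime().maxMemory()` overestimates what a single executor can use when several executors share one JVM. The class `ExecutorMemoryBudget`, the even split across executors, the executor count of 4 and the 0.5 fraction are all assumptions made for illustration; they are not taken from the attached patches and are not how LLAP necessarily divides memory.

```java
public final class ExecutorMemoryBudget {
    private ExecutorMemoryBudget() {}

    /** Hypothetical helper: evenly splits a shared heap across executors. */
    public static long perExecutorBytes(long sharedHeapBytes, int numExecutors) {
        if (numExecutors <= 0) {
            throw new IllegalArgumentException("numExecutors must be positive");
        }
        return sharedHeapBytes / numExecutors;
    }

    /** Same arithmetic as the quoted GroupByOperator line, but against an explicit budget. */
    public static long maxHashTableBytes(double memoryPercentage, long budgetBytes) {
        return (long) (memoryPercentage * budgetBytes);
    }

    public static void main(String[] args) {
        long jvmMax = Runtime.getRuntime().maxMemory();
        int numExecutors = 4;           // assumed executor count, for illustration only
        double memoryPercentage = 0.5;  // assumed fraction, for illustration only

        // Sizing against the whole JVM heap, as in the quoted code.
        long jvmWide = maxHashTableBytes(memoryPercentage, jvmMax);
        // Sizing against this executor's assumed share of the heap.
        long shared = maxHashTableBytes(memoryPercentage, perExecutorBytes(jvmMax, numExecutors));

        System.out.println("JVM-wide sizing:     " + jvmWide + " bytes");
        System.out.println("per-executor sizing: " + shared + " bytes");
    }
}
```

If every executor sized its hash table with the JVM-wide figure, the combined budget would exceed the heap by roughly the number of executors, which is the failure mode the issue description points at.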
[jira] [Updated] (HIVE-15055) Column pruning for nested fields in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15055: Attachment: benchmark-hos.pdf
[jira] [Updated] (HIVE-15055) Column pruning for nested fields in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15055: Attachment: (was: benchmark-hos.pdf)
[jira] [Updated] (HIVE-15055) Column pruning for nested fields in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15055: Attachment: benchmark-hos.pdf [~Ferd], sure - updated.
[jira] [Updated] (HIVE-15055) Column pruning for nested fields in Parquet
[ https://issues.apache.org/jira/browse/HIVE-15055?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Chao Sun updated HIVE-15055: Attachment: (was: benchmark-hos.pdf)
[jira] [Commented] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15773148#comment-15773148 ] Hive QA commented on HIVE-15433: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844581/HIVE-15433-branch-1.2.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2718/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2718/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2718/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-12-23 15:51:33.263 + [[ -n /usr/lib/jvm/java-7-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64 + export PATH=/usr/lib/jvm/java-7-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-7-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-2718/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z branch-1.2 ]] + [[ -d apache-github-branch-1.2-source ]] + [[ ! -d apache-github-branch-1.2-source/.git ]] + [[ ! 
-d apache-github-branch-1.2-source ]] + date '+%Y-%m-%d %T.%3N' 2016-12-23 15:51:33.309 + cd apache-github-branch-1.2-source + git fetch origin From https://github.com/apache/hive 6b64b98..e39db2a branch-1 -> origin/branch-1 3d4b436..2301035 branch-2.1 -> origin/branch-2.1 46e7657..d00196c hive-14535 -> origin/hive-14535 444af20..7befe8e master -> origin/master * [new branch] master-15147 -> origin/master-15147 * [new tag] rel/release-2.1.1 -> rel/release-2.1.1 * [new tag] release-2.1.1-rc0 -> release-2.1.1-rc0 * [new tag] release-2.1.1-rc1 -> release-2.1.1-rc1 + git reset --hard HEAD HEAD is now at 643e9c0 HIVE-14964: Failing Test: Fix TestBeelineArgParsing tests (Zoltan Haindrich reviewed by Ferdinand Xu, Siddharth Seth) + git clean -f -d + git checkout branch-1.2 Already on 'branch-1.2' Your branch is up-to-date with 'origin/branch-1.2'. + git reset --hard origin/branch-1.2 HEAD is now at 643e9c0 HIVE-14964: Failing Test: Fix TestBeelineArgParsing tests (Zoltan Haindrich reviewed by Ferdinand Xu, Siddharth Seth) + git merge --ff-only origin/branch-1.2 Already up-to-date. 
+ date '+%Y-%m-%d %T.%3N' 2016-12-23 15:51:41.543 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file metastore/if/hive_metastore.thrift patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java patching file ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java + [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven -Phadoop-2 ANTLR Parser Generator Version 3.4 org/apache/hadoop/hive/metastore/parser/Filter.g [ERROR] COMPILATION ERROR : [ERROR] /data/hiveptest/working/apache-github-branch-1.2-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java:[1456,5] method does not override or implement a method from a supertype [ERROR] /data/hiveptest/working/apache-github-branch-1.2-source/metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java:[2051,11] method create_table_with_environment_context in interface org.apache.hadoop.hive.metastore.api.ThriftHiveMetastore.Iface cannot be applied to given types; required: org.apache.hadoop.hive.metastore.api.Table,org.apache.hadoop.hive.metastore.api.EnvironmentContext found: org.apache.hadoop.hive.metastore.api.Table,org.apache.hadoop.hive.metastore.api.EnvironmentContext,boolean reason: actual and formal argument lists 
differ in le
[jira] [Updated] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Alina Abramova updated HIVE-15433:
--

Attachment: HIVE-15433-branch-1.2.patch

> setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
> ---
>
> Key: HIVE-15433
> URL: https://issues.apache.org/jira/browse/HIVE-15433
> Project: Hive
> Issue Type: Bug
> Components: Hive
> Affects Versions: 1.0.0, 1.2.0, 2.0.0
> Reporter: Alina Abramova
> Assignee: Alina Abramova
> Fix For: 1.2.0
>
> Attachments: HIVE-15433-branch-1.2.patch, HIVE-15433.1.patch
>
>
> Setting hive.warehouse.subdir.inherit.perms in HIVE won't have any effect. It
> will always take the default value from HiveConf until you define it in
> hive-site.xml.
[jira] [Updated] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alina Abramova updated HIVE-15433: -- Fix Version/s: (was: 2.1.0) 1.2.0
[jira] [Commented] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772999#comment-15772999 ] Hive QA commented on HIVE-15433: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844570/HIVE-15433.1.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2717/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2717/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2717/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-12-23 14:22:03.178 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-2717/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-12-23 14:22:03.180 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7befe8e HIVE-15487: LLAP: Improvements to random selection while scheduling + git clean -f -d Removing ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/VectorizerReason.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/OperatorExplainVectorization.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorAppMasterEventDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorFileSinkDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorFilterDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorLimitDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorMapJoinInfo.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorSMBJoinDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorSelectDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorSparkHashTableSinkDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorSparkPartitionPruningSinkDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorTableScanDesc.java Removing ql/src/java/org/apache/hadoop/hive/ql/plan/VectorizationCondition.java + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7befe8e HIVE-15487: LLAP: Improvements to random selection while scheduling + git merge --ff-only origin/master Already up-to-date. 
+ date '+%Y-%m-%d %T.%3N' 2016-12-23 14:22:04.249 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch Going to apply patch with: patch -p0 patching file metastore/if/hive_metastore.thrift patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveAlterHandler.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStore.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/HiveMetaStoreClient.java patching file metastore/src/java/org/apache/hadoop/hive/metastore/Warehouse.java patching file ql/src/java/org/apache/hadoop/hive/ql/metadata/SessionHiveMetaStoreClient.java + [[ maven == \m\a\v\e\n ]] + rm -rf /data/hiveptest/working/maven/org/apache/hive + mvn -B clean install -DskipTests -T 4 -q -Dmaven.repo.local=/data/hiveptest/working/maven ANTLR Parser Generator Version 3.5.2 Output file /data/hiveptest/working/apache-github-source-source/metastore/target/generated-sources/antlr3/org/apache/hadoop/hive/metastore/parser/FilterParser.java does not exist: must build /data/hiveptest/working/apache-github-source-source/metastore/src/java/org/apache/hadoop/hive/metastore/parser/Filter.g org/apache/hadoop/hive/metastore/parser/Filter.g [ERROR] COMPILATION ERROR : [ERROR] /data/hiveptest/working/apache-github-source-source/metastore/src/java/org/
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772993#comment-15772993 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844565/HIVE-11394.095.patch {color:green}SUCCESS:{color} +1 due to 159 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 113 failed/errored test(s), 10836 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=144) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_char_mapjoin1.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=146) 
[load_dyn_part5.q,vector_complex_join.q,orc_llap.q,vectorization_pushdown.q,cbo_gby_empty.q,vectorization_short_regress.q,cbo_gby.q,auto_sortmerge_join_1.q,lineage3.q,cross_product_check_1.q,cbo_join.q,vector_struct_in.q,bucketmapjoin3.q,current_date_timestamp.q,orc_ppd_schema_evol_2a.q,groupby2.q,schema_evol_text_vec_table.q,vectorized_join46.q,orc_ppd_date.q,multiMapJoin1.q,sample10.q,vector_outer_join1.q,vector_char_simple.q,dynpart_sort_optimization_acid.q,auto_sortmerge_join_2.q,bucketizedhiveinputformat.q,leftsemijoin.q,special_character_in_tabnames_1.q,cte_mat_2.q,vectorization_8.q] org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vector_complex_all] (batchId=55) org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver[vectorized_date_funcs] (batchId=69) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=135) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_complex] (batchId=147) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_part_all_primitive] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_orc_vec_table] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part] (batchId=148) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vec_part_all_primitive] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_complex] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_part_all_primitive] (batchId=149) 
org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[schema_evol_text_vecrow_table] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_complex_all] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_left_outer_join2] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_left_outer_join] (batchId=141) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_leftsemi_mapjoin] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mapjoin_reduce] (batchId=152) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_mr_diff_schema_alias] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_multi_insert] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_null_projection] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_nullsafe_join] (batchId=153) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_number_compare_projection] (batchId=139) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_nvl] (batchId=151) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_orderby_5] (batchId=145) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_join0] (batchId=149) org.apache.hadoop.hive.cli.TestMiniLlapLocalCliDriver.testCliDriver[vector_outer_jo
[jira] [Commented] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772981#comment-15772981 ] Alina Abramova commented on HIVE-15433: --- I've changed hive_metastore.thrift, and I think that for the patch to actually work you need to regenerate the files derived from this Thrift file. ROOT CAUSE: the metastore server does not pick up Hive configuration values that were set for the session through the Hive CLI. The same issue affects other data structures related to the create functionality. > setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in > hive configuration > > > Key: HIVE-15433 > URL: https://issues.apache.org/jira/browse/HIVE-15433 > Project: Hive > Issue Type: Bug > Components: Hive >Affects Versions: 1.0.0, 1.2.0, 2.0.0 >Reporter: Alina Abramova >Assignee: Alina Abramova > Fix For: 2.1.0 > > Attachments: HIVE-15433.1.patch > > > Setting hive.warehouse.subdir.inherit.perms in Hive has no effect. It > will always take the default value from HiveConf unless you define it in > hive-site.xml. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
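The root cause described in the comment above can be sketched in a few lines of Java. This is only an illustration under stated assumptions, not the approach taken by HIVE-15433.1.patch: the class and method names are hypothetical, plain maps stand in for the server's and the session's configuration, and the default value "true" is a placeholder rather than the real HiveConf default.

```java
import java.util.HashMap;
import java.util.Map;

/** Illustration only: a session-level setting that never reaches a server-side lookup. */
final class SessionConfSketch {
    static final String KEY = "hive.warehouse.subdir.inherit.perms";

    /** Lookup that consults only the server's own configuration (e.g. loaded from hive-site.xml). */
    static String serverOnlyLookup(Map<String, String> serverConf, String key, String defaultValue) {
        return serverConf.getOrDefault(key, defaultValue);
    }

    /** Hypothetical alternative: overrides sent along with the request take precedence. */
    static String lookupWithSessionOverrides(Map<String, String> serverConf,
                                             Map<String, String> sessionOverrides,
                                             String key, String defaultValue) {
        String override = sessionOverrides.get(key);
        return override != null ? override : serverConf.getOrDefault(key, defaultValue);
    }

    public static void main(String[] args) {
        Map<String, String> serverConf = new HashMap<>();       // nothing defined in hive-site.xml
        Map<String, String> sessionOverrides = new HashMap<>();
        sessionOverrides.put(KEY, "false");                      // what the user set in the session

        String assumedDefault = "true";                          // placeholder default, an assumption
        System.out.println("server-only lookup:     "
                + serverOnlyLookup(serverConf, KEY, assumedDefault));
        System.out.println("with session overrides: "
                + lookupWithSessionOverrides(serverConf, sessionOverrides, KEY, assumedDefault));
    }
}
```

In the first lookup the session value is simply invisible to the server, which matches the reported symptom of the setting falling back to the default until it is defined in hive-site.xml.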
[jira] [Updated] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alina Abramova updated HIVE-15433: -- Fix Version/s: 2.1.0 Status: Patch Available (was: Open)
[jira] [Updated] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alina Abramova updated HIVE-15433: -- Attachment: HIVE-15433.1.patch
[jira] [Assigned] (HIVE-15433) setting hive.warehouse.subdir.inherit.perms in HIVE won't overwrite it in hive configuration
[ https://issues.apache.org/jira/browse/HIVE-15433?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Alina Abramova reassigned HIVE-15433: - Assignee: Alina Abramova
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: Patch Available (was: In Progress) > Enhance EXPLAIN display for vectorization > - > > Key: HIVE-11394 > URL: https://issues.apache.org/jira/browse/HIVE-11394 > Project: Hive > Issue Type: Bug > Components: Hive >Reporter: Matt McCline >Assignee: Matt McCline >Priority: Critical > Fix For: 2.2.0 > > Attachments: HIVE-11394.01.patch, HIVE-11394.02.patch, > HIVE-11394.03.patch, HIVE-11394.04.patch, HIVE-11394.05.patch, > HIVE-11394.06.patch, HIVE-11394.07.patch, HIVE-11394.08.patch, > HIVE-11394.09.patch, HIVE-11394.091.patch, HIVE-11394.092.patch, > HIVE-11394.093.patch, HIVE-11394.094.patch, HIVE-11394.095.patch > > > Add detail to the EXPLAIN output showing why Map or Reduce work is not > vectorized. > New syntax is: EXPLAIN VECTORIZATION \[ONLY\] > \[SUMMARY|OPERATOR|EXPRESSION|DETAIL\] > The ONLY option suppresses most non-vectorization elements. > SUMMARY shows vectorization information for the PLAN (is vectorization > enabled) and a summary of Map and Reduce work. > OPERATOR shows vectorization information for operators, e.g. Filter > Vectorization. It includes all information of SUMMARY, too. > EXPRESSION shows vectorization information for expressions, e.g. > predicateExpression. It includes all information of SUMMARY and OPERATOR, > too. > DETAIL shows the most detailed vectorization information. > It includes all information of SUMMARY, OPERATOR, and EXPRESSION, too. > By default, ONLY is not in effect and the detail level is SUMMARY. > --- > Here are some examples: > EXPLAIN VECTORIZATION example: > (Note the PLAN VECTORIZATION, Map Vectorization, Reduce Vectorization > sections) > Since SUMMARY is the default, it is the output of EXPLAIN VECTORIZATION > SUMMARY. 
> Under Reducer 3’s "Reduce Vectorization:" you’ll see > notVectorizedReason: Aggregation Function UDF avg parameter expression for > GROUPBY operator: Data type struct of > Column\[VALUE._col2\] not supported > For Reducer 2’s "Reduce Vectorization:" you’ll see "groupByVectorOutput:": > "false", which means the node has a GROUP BY with an AVG or some other aggregator > that outputs a non-PRIMITIVE type (e.g. STRUCT), so all downstream operators > run in row mode, i.e. without vector output. > If "usesVectorUDFAdaptor:": "false" were "true" instead, it would mean that at > least one vectorized expression is using VectorUDFAdaptor. > Likewise, "allNative:": "false" becomes "true" when all operators are native. > Today, GROUP BY and FILE SINK are not native. MAP JOIN and REDUCE SINK are > conditionally native. FILTER and SELECT are native. > {code} > PLAN VECTORIZATION: > enabled: true > enabledConditionsMet: [hive.vectorized.execution.enabled IS true] > STAGE DEPENDENCIES: > Stage-1 is a root stage > Stage-0 depends on stages: Stage-1 > STAGE PLANS: > Stage: Stage-1 > Tez > ... > Edges: > Reducer 2 <- Map 1 (SIMPLE_EDGE) > Reducer 3 <- Reducer 2 (SIMPLE_EDGE) > ... 
> Vertices: > Map 1 > Map Operator Tree: > TableScan > alias: alltypesorc > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Select Operator > expressions: cint (type: int) > outputColumnNames: cint > Statistics: Num rows: 12288 Data size: 36696 Basic stats: > COMPLETE Column stats: COMPLETE > Group By Operator > keys: cint (type: int) > mode: hash > outputColumnNames: _col0 > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Reduce Output Operator > key expressions: _col0 (type: int) > sort order: + > Map-reduce partition columns: _col0 (type: int) > Statistics: Num rows: 5775 Data size: 17248 Basic > stats: COMPLETE Column stats: COMPLETE > Execution mode: vectorized, llap > LLAP IO: all inputs > Map Vectorization: > enabled: true > enabledConditionsMet: > hive.vectorized.use.vectorized.input.format IS true > groupByVectorOutput: true > inputFileFormats: > org.apache.hadoop.hive.ql.io.orc.OrcInputFormat > allNative: false > usesVectorUDFAdaptor: false > vectorized: true > Reducer 2 >
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.095.patch
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: In Progress (was: Patch Available)
[jira] [Commented] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772567#comment-15772567 ] Hive QA commented on HIVE-11394: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844550/HIVE-11394.094.patch {color:red}ERROR:{color} -1 due to build exiting with an error Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2715/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2715/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2715/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests exited with: NonZeroExitCodeException Command 'bash /data/hiveptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ date '+%Y-%m-%d %T.%3N' 2016-12-23 11:08:58.263 + [[ -n /usr/lib/jvm/java-8-openjdk-amd64 ]] + export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + JAVA_HOME=/usr/lib/jvm/java-8-openjdk-amd64 + export PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + PATH=/usr/lib/jvm/java-8-openjdk-amd64/bin/:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m ' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m ' + export 'MAVEN_OPTS=-Xmx1g ' + MAVEN_OPTS='-Xmx1g ' + cd /data/hiveptest/working/ + tee /data/hiveptest/logs/PreCommit-HIVE-Build-2715/source-prep.txt + [[ false == \t\r\u\e ]] + mkdir -p maven ivy + [[ git = \s\v\n ]] + [[ git = \g\i\t ]] + [[ -z master ]] + [[ -d apache-github-source-source ]] + [[ ! -d apache-github-source-source/.git ]] + [[ ! 
-d apache-github-source-source ]] + date '+%Y-%m-%d %T.%3N' 2016-12-23 11:08:58.266 + cd apache-github-source-source + git fetch origin + git reset --hard HEAD HEAD is now at 7befe8e HIVE-15487: LLAP: Improvements to random selection while scheduling + git clean -f -d + git checkout master Already on 'master' Your branch is up-to-date with 'origin/master'. + git reset --hard origin/master HEAD is now at 7befe8e HIVE-15487: LLAP: Improvements to random selection while scheduling + git merge --ff-only origin/master Already up-to-date. + date '+%Y-%m-%d %T.%3N' 2016-12-23 11:08:59.249 + patchCommandPath=/data/hiveptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hiveptest/working/scratch/build.patch + [[ -f /data/hiveptest/working/scratch/build.patch ]] + chmod +x /data/hiveptest/working/scratch/smart-apply-patch.sh + /data/hiveptest/working/scratch/smart-apply-patch.sh /data/hiveptest/working/scratch/build.patch error: patch failed: ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java:2479 error: ql/src/java/org/apache/hadoop/hive/ql/optimizer/physical/Vectorizer.java: patch does not apply The patch does not appear to apply with p0, p1, or p2 + exit 1 ' {noformat} This message is automatically generated. ATTACHMENT ID: 12844550 - PreCommit-HIVE-Build
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Status: Patch Available (was: Reopened)
[jira] [Updated] (HIVE-11394) Enhance EXPLAIN display for vectorization
[ https://issues.apache.org/jira/browse/HIVE-11394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matt McCline updated HIVE-11394: Attachment: HIVE-11394.094.patch Warm this patch back up.
[jira] [Commented] (HIVE-15503) LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators
[ https://issues.apache.org/jira/browse/HIVE-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772536#comment-15772536 ] Hive QA commented on HIVE-15503: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844536/HIVE-15503.2.patch {color:red}ERROR:{color} -1 due to no test(s) being added or modified. {color:red}ERROR:{color} -1 due to 5 failed/errored test(s), 10896 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_basic] (batchId=134) org.apache.hadoop.hive.cli.TestMiniLlapCliDriver.testCliDriver[orc_ppd_schema_evol_3a] (batchId=135) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93) org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_5] (batchId=92) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2714/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2714/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2714/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 5 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12844536 - PreCommit-HIVE-Build > LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators > --- > > Key: HIVE-15503 > URL: https://issues.apache.org/jira/browse/HIVE-15503 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-15503.1.patch, HIVE-15503.2.patch > > > {code} > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: > maxHashTblMemory = (long) (memoryPercentage * > Runtime.getRuntime().maxMemory()); > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:// Total Free > Memory = maxMemory() - Used Memory; > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:long > totalFreeMemory = Runtime.getRuntime().maxMemory() - > {code} > This will not work very well with LLAP because of the memory sharing by > executors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
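To illustrate the concern in the description above: `Runtime.getRuntime().maxMemory()` reports the heap limit of the whole JVM, so when several executors share one process, each operator that sizes its hash table from that value assumes it owns the entire heap. The sketch below shows one possible way to scale the budget per executor. It is not taken from the attached patch; the class name, the method name, and the even split across executors are assumptions made for illustration only.

```java
/**
 * Illustrative sketch only. ExecutorMemoryBudget and maxHashTableMemory are
 * hypothetical names, and the even split across executors is an assumption,
 * not necessarily what the HIVE-15503 patch does.
 */
final class ExecutorMemoryBudget {
    private ExecutorMemoryBudget() {}

    /**
     * Returns the hash table budget for one executor, given the memory limit
     * of the whole process and the number of executors sharing it.
     */
    static long maxHashTableMemory(long processMaxMemory, int numExecutors, double memoryPercentage) {
        if (numExecutors < 1) {
            throw new IllegalArgumentException("numExecutors must be >= 1");
        }
        if (memoryPercentage < 0.0 || memoryPercentage > 1.0) {
            throw new IllegalArgumentException("memoryPercentage must be in [0, 1]");
        }
        // Divide first, so that each executor only sees its own share of the heap.
        long perExecutorMemory = processMaxMemory / numExecutors;
        return (long) (memoryPercentage * perExecutorMemory);
    }

    public static void main(String[] args) {
        long processMax = Runtime.getRuntime().maxMemory();
        int numExecutors = 4; // assumed executor count for the demo

        // What the quoted code computes today: a fraction of the whole heap.
        System.out.println("whole-JVM budget:    " + maxHashTableMemory(processMax, 1, 0.5));
        // What a single executor could reasonably claim when the heap is shared.
        System.out.println("per-executor budget: " + maxHashTableMemory(processMax, numExecutors, 0.5));
    }
}
```

With a single executor the result matches the existing computation, so a container running one task per JVM would be unaffected by this kind of change.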
[jira] [Commented] (HIVE-15448) ChangeManager for replication
[ https://issues.apache.org/jira/browse/HIVE-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772458#comment-15772458 ] Hive QA commented on HIVE-15448: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844533/HIVE-15448.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10868 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestMiniLlapLocalCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=144) [vectorized_rcfile_columnar.q,vector_elt.q,explainuser_1.q,multi_insert.q,tez_dml.q,vector_bround.q,schema_evol_orc_acid_table.q,vector_when_case_null.q,orc_ppd_schema_evol_1b.q,vector_join30.q,vectorization_11.q,cte_3.q,update_tmp_table.q,vector_decimal_cast.q,groupby_grouping_id2.q,vector_decimal_round.q,tez_smb_empty.q,orc_merge6.q,vector_char_mapjoin1.q,vector_decimal_trailing.q,cte_5.q,tez_union.q,vector_decimal_2.q,columnStatsUpdateForStatsOptimizer_1.q,vector_outer_join3.q,schema_evol_text_vec_part_all_complex.q,tez_dynpart_hashjoin_2.q,auto_sortmerge_join_12.q,offset_limit.q,tez_union_multiinsert.q] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_2] (batchId=93) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2713/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2713/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2713/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12844533 - PreCommit-HIVE-Build > ChangeManager for replication > - > > Key: HIVE-15448 > URL: https://issues.apache.org/jira/browse/HIVE-15448 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-15448.1.patch, HIVE-15448.2.patch > > > The change manager implementation as described in > https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development#HiveReplicationv2Development-Changemanagement. > This issue tracks the infrastructure code. Hooking into actions will be > tracked in another ticket. > ReplChangeManager includes: > * a method to generate a checksum > * a method to convert a file path to a cm path > * a method to move a table/partition/file into cm > * a thread to clear cm files once they expire -- This message was sent by Atlassian JIRA (v6.3.4#6332)
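The responsibilities listed for ReplChangeManager can be sketched roughly as follows. This is a standalone illustration, not the code in the attached patch: the class `ChangeManagerSketch`, its method names, the choice of SHA-256, the `cmRoot + "/" + checksum` layout, and the retention check are all assumptions. The move-into-cm step is omitted because it depends on the file system API.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

/**
 * Illustrative sketch only. All names and the path layout are hypothetical;
 * the real ReplChangeManager in the patch may differ in every detail.
 */
final class ChangeManagerSketch {
    private ChangeManagerSketch() {}

    /** Generates a hex checksum of the file content (SHA-256 is an assumed choice). */
    static String checksum(byte[] content) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(content)) {
                hex.append(String.format("%02x", b & 0xff));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 is not available", e);
        }
    }

    /** Maps file content to a location under the cm root, keyed by its checksum. */
    static String toCmPath(String cmRoot, byte[] content) {
        return cmRoot + "/" + checksum(content);
    }

    /** Decides whether a cm file is old enough for the cleaner thread to remove. */
    static boolean isExpired(long lastModifiedMillis, long nowMillis, long retentionMillis) {
        return nowMillis - lastModifiedMillis > retentionMillis;
    }

    public static void main(String[] args) {
        byte[] content = "row data".getBytes(StandardCharsets.UTF_8);
        System.out.println(toCmPath("/cmroot", content));
        System.out.println(isExpired(0L, 10_000L, 5_000L));
    }
}
```

Keying the cm location by checksum means two files with identical content map to the same cm path, which is one reason a checksum-based layout is attractive for this kind of store.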
[jira] [Commented] (HIVE-3491) Expose column names to UDFs
[ https://issues.apache.org/jira/browse/HIVE-3491?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772391#comment-15772391 ] Jan Filipiak commented on HIVE-3491: Hello, I just want to let people know that I actually do this asterisk thing now and ran into exactly this limitation. Now I will probably need to do the mapping to the user-expected StructOi output by ordinal position, which makes schema evolution a lot harder :( Does anyone plan on getting this done? > Expose column names to UDFs > --- > > Key: HIVE-3491 > URL: https://issues.apache.org/jira/browse/HIVE-3491 > Project: Hive > Issue Type: New Feature > Components: Query Processor, UDF >Reporter: Adam Kramer > > If I run > SELECT MY_FUNC(a.foo, b.bar) FROM baz1 a JOIN baz2 b; > ...the parsed query structure (i.e., that "foo" and "bar" are the names of the > columns) should be available to the UDF in some manner. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15487) LLAP: Improvements to random selection while scheduling
[ https://issues.apache.org/jira/browse/HIVE-15487?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15487: - Resolution: Fixed Fix Version/s: 2.2.0 Status: Resolved (was: Patch Available) Test failures are not related to this patch. Committed to master. > LLAP: Improvements to random selection while scheduling > --- > > Key: HIVE-15487 > URL: https://issues.apache.org/jira/browse/HIVE-15487 > Project: Hive > Issue Type: Bug > Components: llap >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran > Fix For: 2.2.0 > > Attachments: HIVE-15487.1.patch > > > Currently the LLAP scheduler picks a random host when no locality information > is specified or when all requested hosts are busy serving other requests with > forced locality. In such cases, we can pick the next available node in > a consistent order, instead of selecting randomly, to get better locality. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
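The idea in the description above, picking the next available node in a consistent order rather than a random one, can be sketched as a wrap-around scan over an ordered host list. This is only an illustration of the selection rule: the class `ConsistentNodePicker`, the method `pickNextAvailable`, and the use of plain host-name strings are assumptions and do not reflect the actual LLAP scheduler code.

```java
import java.util.List;
import java.util.Optional;
import java.util.Set;

/**
 * Illustrative sketch only. The names and data structures are hypothetical
 * and are not taken from the HIVE-15487 patch.
 */
final class ConsistentNodePicker {
    private ConsistentNodePicker() {}

    /**
     * Scans the hosts in their fixed order, starting at the requested host
     * (or at the first host when no locality is requested or the requested
     * host is unknown), and returns the first host that is not busy.
     */
    static Optional<String> pickNextAvailable(List<String> orderedHosts,
                                              Set<String> busyHosts,
                                              String requestedHost) {
        int n = orderedHosts.size();
        if (n == 0) {
            return Optional.empty();
        }
        int index = requestedHost == null ? -1 : orderedHosts.indexOf(requestedHost);
        int start = Math.max(index, 0);
        for (int i = 0; i < n; i++) {
            String candidate = orderedHosts.get((start + i) % n);
            if (!busyHosts.contains(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty(); // every host is busy
    }

    public static void main(String[] args) {
        List<String> hosts = List.of("node-a", "node-b", "node-c");
        // node-b is requested but busy, so the scan moves on to node-c.
        System.out.println(pickNextAvailable(hosts, Set.of("node-b"), "node-b"));
    }
}
```

Because the fallback is deterministic, repeated requests for the same busy host land on the same neighbour, which is where the locality benefit over random selection would come from.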
[jira] [Updated] (HIVE-15503) LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators
[ https://issues.apache.org/jira/browse/HIVE-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15503: - Attachment: (was: HIVE-15503.WIP.patch) > LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators > --- > > Key: HIVE-15503 > URL: https://issues.apache.org/jira/browse/HIVE-15503 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-15503.1.patch, HIVE-15503.2.patch > > > {code} > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: > maxHashTblMemory = (long) (memoryPercentage * > Runtime.getRuntime().maxMemory()); > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:// Total Free > Memory = maxMemory() - Used Memory; > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:long > totalFreeMemory = Runtime.getRuntime().maxMemory() - > {code} > This will not work very well with LLAP because of the memory sharing by > executors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15503) LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators
[ https://issues.apache.org/jira/browse/HIVE-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772377#comment-15772377 ] Prasanth Jayachandran commented on HIVE-15503: -- RB link: https://reviews.apache.org/r/55010/ [~hagleitn] can you please review the patch? > LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators > --- > > Key: HIVE-15503 > URL: https://issues.apache.org/jira/browse/HIVE-15503 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-15503.1.patch, HIVE-15503.2.patch, > HIVE-15503.WIP.patch > > > {code} > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: > maxHashTblMemory = (long) (memoryPercentage * > Runtime.getRuntime().maxMemory()); > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:// Total Free > Memory = maxMemory() - Used Memory; > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:long > totalFreeMemory = Runtime.getRuntime().maxMemory() - > {code} > This will not work very well with LLAP because of the memory sharing by > executors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15503) LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators
[ https://issues.apache.org/jira/browse/HIVE-15503?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Prasanth Jayachandran updated HIVE-15503: - Attachment: HIVE-15503.2.patch Added TopNHash as well. > LLAP: Fix use of Runtime.getRuntime.maxMemory in Hive operators > --- > > Key: HIVE-15503 > URL: https://issues.apache.org/jira/browse/HIVE-15503 > Project: Hive > Issue Type: Bug >Affects Versions: 2.2.0 >Reporter: Prasanth Jayachandran >Assignee: Prasanth Jayachandran >Priority: Critical > Attachments: HIVE-15503.1.patch, HIVE-15503.2.patch, > HIVE-15503.WIP.patch > > > {code} > ql/src/java/org/apache/hadoop/hive/ql/exec/GroupByOperator.java: > maxHashTblMemory = (long) (memoryPercentage * > Runtime.getRuntime().maxMemory()); > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:// Total Free > Memory = maxMemory() - Used Memory; > ql/src/java/org/apache/hadoop/hive/ql/exec/TopNHash.java:long > totalFreeMemory = Runtime.getRuntime().maxMemory() - > {code} > This will not work very well with LLAP because of the memory sharing by > executors. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Updated] (HIVE-15448) ChangeManager for replication
[ https://issues.apache.org/jira/browse/HIVE-15448?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Dai updated HIVE-15448: -- Attachment: HIVE-15448.2.patch Address Thejas/Sushanth's comments. > ChangeManager for replication > - > > Key: HIVE-15448 > URL: https://issues.apache.org/jira/browse/HIVE-15448 > Project: Hive > Issue Type: Sub-task > Components: repl >Reporter: Daniel Dai >Assignee: Daniel Dai > Attachments: HIVE-15448.1.patch, HIVE-15448.2.patch > > > The change manager implementation as described in > https://cwiki.apache.org/confluence/display/Hive/HiveReplicationv2Development#HiveReplicationv2Development-Changemanagement. > This issue tracks the infrastructure code. Hooking into actions will be > tracked in another ticket. > ReplChangeManager includes: > * a method to generate a checksum > * a method to convert a file path to a cm path > * a method to move a table/partition/file into cm > * a thread to clear cm files once they expire -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (HIVE-15499) Nested column pruning: don't prune paths when a SerDe is used only for serializing
[ https://issues.apache.org/jira/browse/HIVE-15499?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15772250#comment-15772250 ] Hive QA commented on HIVE-15499: Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12844520/HIVE-15499.2.patch {color:green}SUCCESS:{color} +1 due to 1 test(s) being added or modified. {color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 10880 tests executed *Failed tests:* {noformat} TestDerbyConnector - did not produce a TEST-*.xml file (likely timed out) (batchId=234) TestSparkCliDriver - did not produce a TEST-*.xml file (likely timed out) (batchId=129) [groupby6_map.q,groupby2_noskew_multi_distinct.q,load_dyn_part12.q,scriptfile1.q,join15.q,auto_join17.q,join_hive_626.q,tez_join_tests.q,auto_join21.q,join_view.q,join_cond_pushdown_4.q,vectorization_0.q,union_null.q,auto_join3.q,vectorization_decimal_date.q] org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver[explainanalyze_4] (batchId=93) {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/2712/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/2712/console Test logs: http://104.198.109.242/logs/PreCommit-HIVE-Build-2712/ Messages: {noformat} Executing org.apache.hive.ptest.execution.TestCheckPhase Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests exited with: TestsFailedException: 3 tests failed {noformat} This message is automatically generated. 
ATTACHMENT ID: 12844520 - PreCommit-HIVE-Build > Nested column pruning: don't prune paths when a SerDe is used only for > serializing > -- > > Key: HIVE-15499 > URL: https://issues.apache.org/jira/browse/HIVE-15499 > Project: Hive > Issue Type: Sub-task > Components: Serializers/Deserializers >Reporter: Chao Sun >Assignee: Chao Sun > Attachments: HIVE-15499.1.patch, HIVE-15499.2.patch > > > In {{FileSinkOperator}}, a serializer is created to write output data. When > initializing it we should not read the > {{ColumnProjectionUtils.READ_NESTED_COLUMN_PATH_CONF_STR}} property since > this is only used for the read path, and the path may not match the schema > for the output table (for instance, in the case of insert). -- This message was sent by Atlassian JIRA (v6.3.4#6332)