[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061269#comment-14061269 ]

Matt McCline commented on HIVE-7262:

Discarded the original review because it referenced the wrong repository. The new review is https://reviews.apache.org/r/23459/

Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
--
Key: HIVE-7262
URL: https://issues.apache.org/jira/browse/HIVE-7262
Project: Hive
Issue Type: Sub-task
Reporter: Matt McCline
Assignee: Matt McCline
Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch, HIVE-7262.3.patch

In ptf.q, create the part table with STORED AS ORC and SET hive.vectorized.execution.enabled=true;

Queries fail to find the BLOCKOFFSET virtual column during vectorization and suffer an exception.

ERROR vector.VectorizationContext (VectorizationContext.java:getInputColumnIndex(186)) - The column BLOCK__OFFSET__INSIDE__FILE is not in the vectorization context column map.

Jitendra pointed out that the routine that returns the VectorizationContext in Vectorize.java needs to add virtual columns to the map, too.

--
This message was sent by Atlassian JIRA (v6.2#6252)
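For readers unfamiliar with the failure mode, here is a minimal Java sketch of a name-to-index column map that rejects a column it was never told about, and of the suggested remedy of registering the virtual column as well. The class ColumnMapSketch, its methods, and the table column names are hypothetical stand-ins for illustration; this is not Hive's VectorizationContext and it does not reproduce the patch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a vectorization column map; not Hive's actual class.
class ColumnMapSketch {
    // Maps a column name to its position in the (imagined) vectorized row batch.
    private final Map<String, Integer> columnMap = new LinkedHashMap<>();

    ColumnMapSketch(List<String> tableColumns) {
        for (String name : tableColumns) {
            addColumn(name);
        }
    }

    // Registers a column at the next free index; a repeated name keeps its index.
    void addColumn(String name) {
        columnMap.putIfAbsent(name, columnMap.size());
    }

    // Looks a column up and fails loudly when it was never registered.
    int getInputColumnIndex(String name) {
        Integer index = columnMap.get(name);
        if (index == null) {
            throw new IllegalStateException(
                "The column " + name + " is not in the vectorization context column map.");
        }
        return index;
    }

    public static void main(String[] args) {
        // Illustrative table columns only; assumed, not taken from the thread.
        ColumnMapSketch ctx = new ColumnMapSketch(Arrays.asList("p_partkey", "p_name", "p_mfgr"));
        try {
            ctx.getInputColumnIndex("BLOCK__OFFSET__INSIDE__FILE");
        } catch (IllegalStateException e) {
            System.out.println("Lookup failed: " + e.getMessage());
        }
        // Registering the virtual column makes the same lookup succeed.
        ctx.addColumn("BLOCK__OFFSET__INSIDE__FILE");
        System.out.println("Index: " + ctx.getInputColumnIndex("BLOCK__OFFSET__INSIDE__FILE"));
    }
}
```

Having a real index for the column would only address the lookup; whether the vectorized readers can actually populate a virtual column is a separate question that this sketch does not model.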
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061365#comment-14061365 ]

Jitendra Nath Pandey commented on HIVE-7262:

+1, lgtm
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061373#comment-14061373 ]

Matt McCline commented on HIVE-7262:

Note that vectorized_ptf.q is a copy of ptf.q with the table changed to ORC format.
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061478#comment-14061478 ]

Hive QA commented on HIVE-7262:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12655619/HIVE-7262.3.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5719 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMiniTezCliDriver.testCliDriver_temp_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_ql_rewrite_gbtoidx
org.apache.hive.hcatalog.pig.TestHCatLoader.testReadDataPrimitiveTypes
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/782/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/782/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-782/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12655619
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14061535#comment-14061535 ]

Matt McCline commented on HIVE-7262:

These failures are unrelated to this change.
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057078#comment-14057078 ]

Eric Hanson commented on HIVE-7262:
---

[~mmccline] put a code review at: https://reviews.apache.org/r/23186/. Matt, if you could attach this to your JIRAs in the future, that'd be great.

Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14057085#comment-14057085 ]

Eric Hanson commented on HIVE-7262:
---

Matt, can you upload your patch to your ReviewBoard page? I didn't see a View Diff button. I see you did include a link above -- sorry I missed that.

Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14049519#comment-14049519 ]

Hive QA commented on HIVE-7262:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653480/HIVE-7262.2.patch

{color:red}ERROR:{color} -1 due to 2 failed/errored test(s), 5672 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_schemeAuthority
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/656/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/656/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-656/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 2 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653480

Attachments: HIVE-7262.1.patch, HIVE-7262.2.patch
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048142#comment-14048142 ]

Matt McCline commented on HIVE-7262:

After talking to Harish: the issue is that we should not try to vectorize pure or 'true' table functions like NOOP. We should vectorize PTF only when it is used strictly for windowing. Then the automatically added virtual columns like FILENAME and BLOCKOFFSET will get pruned away very early and not be an issue.

Separately, there is another issue, HIVE-5570 (Handle virtual columns and schema evolution in vector code path), for when someone actually wants one of those virtual columns.

Issue Type: Bug
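The rule proposed in this comment (vectorize a PTF only when it is strictly windowing) could be sketched roughly as below. The PtfOp type, its windowingOnly flag, and mayVectorize are hypothetical names invented for illustration; Hive's real plan and operator classes are not used, and how Hive distinguishes windowing from a true table function is not shown in this thread.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical model of the PTF operators in a task; not Hive's plan classes.
class PtfGateSketch {
    static final class PtfOp {
        final String functionName;
        final boolean windowingOnly; // assumed flag: true when the PTF exists only for windowing

        PtfOp(String functionName, boolean windowingOnly) {
            this.functionName = functionName;
            this.windowingOnly = windowingOnly;
        }
    }

    // A task may be vectorized only if every PTF operator in it is windowing-only.
    static boolean mayVectorize(List<PtfOp> ptfOps) {
        for (PtfOp op : ptfOps) {
            if (!op.windowingOnly) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        PtfOp windowing = new PtfOp("windowing", true);
        PtfOp noop = new PtfOp("noop", false); // a 'true' table function such as NOOP
        System.out.println(mayVectorize(Arrays.asList(windowing)));
        System.out.println(mayVectorize(Arrays.asList(windowing, noop)));
    }
}
```

A task with no PTF operators at all passes this check trivially, which matches the intent that the gate only restricts PTF usage.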
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048314#comment-14048314 ]

Matt McCline commented on HIVE-7262:

The change here is to detect virtual columns and skip vectorization for those MapWork (and soon ReduceWork) tasks.

Issue Type: Bug
Attachments: HIVE-7262.1.patch
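As a rough illustration of the approach described in this comment, the Java sketch below checks the columns a task references against a set of virtual column names and declines to vectorize when any of them appears. The class and method names are hypothetical, the set of names is a partial list chosen for illustration, and the actual patch may detect virtual columns quite differently.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical guard; not the code from the HIVE-7262 patch.
class VirtualColumnGuardSketch {
    // Partial, illustrative list of Hive virtual column names.
    static final Set<String> VIRTUAL_COLUMNS = new HashSet<>(Arrays.asList(
        "INPUT__FILE__NAME",
        "BLOCK__OFFSET__INSIDE__FILE",
        "ROW__OFFSET__INSIDE__BLOCK"));

    // True when any referenced column is a known virtual column.
    static boolean referencesVirtualColumn(List<String> referencedColumns) {
        for (String name : referencedColumns) {
            if (VIRTUAL_COLUMNS.contains(name)) {
                return true;
            }
        }
        return false;
    }

    // The task falls back to row-mode execution when a virtual column is referenced.
    static boolean shouldVectorize(List<String> referencedColumns) {
        return !referencesVirtualColumn(referencedColumns);
    }

    public static void main(String[] args) {
        System.out.println(shouldVectorize(Arrays.asList("p_name", "p_size")));
        System.out.println(shouldVectorize(Arrays.asList("p_name", "BLOCK__OFFSET__INSIDE__FILE")));
    }
}
```

The trade-off in this style of fix is that affected tasks lose the vectorization speed-up but run correctly, leaving full virtual column support to HIVE-5570.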
[jira] [Commented] (HIVE-7262) Partitioned Table Function (PTF) query fails on ORC table when attempting to vectorize
[ https://issues.apache.org/jira/browse/HIVE-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14048411#comment-14048411 ]

Hive QA commented on HIVE-7262:
---

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12653264/HIVE-7262.1.patch

{color:red}ERROR:{color} -1 due to 3 failed/errored test(s), 5672 tests executed

*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_vectorized_ptf
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_root_dir_external_table
org.apache.hive.jdbc.miniHS2.TestHiveServer2.testConnection
{noformat}

Test results: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/634/testReport
Console output: http://ec2-174-129-184-35.compute-1.amazonaws.com/jenkins/job/PreCommit-HIVE-Build/634/console
Test logs: http://ec2-174-129-184-35.compute-1.amazonaws.com/logs/PreCommit-HIVE-Build-634/

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests exited with: TestsFailedException: 3 tests failed
{noformat}

This message is automatically generated.

ATTACHMENT ID: 12653264

Issue Type: Bug
Attachments: HIVE-7262.1.patch