[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769935#comment-13769935 ] Hive QA commented on HIVE-4961: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603642/HIVE-4961.4-vectorization.patch {color:red}ERROR:{color} -1 due to 8 failed/errored test(s), 3954 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump org.apache.hcatalog.cli.TestPermsGrp.testCustomPerms org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.mapreduce.TestHCatExternalHCatNonPartitioned.testHCatNonPartitionedTable org.apache.hive.hcatalog.mapreduce.TestHCatExternalPartitioned.testHCatPartitionedTable {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/786/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/786/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 8 tests failed {noformat} This message is automatically generated.
Create bridge for custom UDFs to operate in vectorized mode --- Key: HIVE-4961 URL: https://issues.apache.org/jira/browse/HIVE-4961 Project: Hive Issue Type: Sub-task Affects Versions: vectorization-branch Reporter: Eric Hanson Assignee: Eric Hanson Fix For: vectorization-branch Attachments: HIVE-4961.1-vectorization.patch, HIVE-4961.2-vectorization.patch, HIVE-4961.3-vectorization.patch, HIVE-4961.4-vectorization.patch, vectorUDF.4.patch, vectorUDF.5.patch, vectorUDF.8.patch, vectorUDF.9.patch Suppose you have a custom UDF myUDF() that you've created to extend hive. The goal of this JIRA is to create a facility where if you run a query that uses myUDF() in an expression, the query will run in vectorized mode. This would be a general-purpose bridge for custom UDFs that users add to Hive. It would work with existing UDFs. I'm considering a separate JIRA for a new kind of custom UDF implementation that is vectorized from the beginning, to optimize performance. That is not covered by this JIRA. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770251#comment-13770251 ] Eric Hanson commented on HIVE-4961: --- See the design specification attached to HIVE-4160 for a design description of this patch.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768679#comment-13768679 ] Hive QA commented on HIVE-4961: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603380/HIVE-4961.3-vectorization.patch {color:red}ERROR:{color} -1 due to 11 failed/errored test(s), 3954 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump org.apache.hcatalog.listener.TestNotificationListener.testAMQListener org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hive.hcatalog.pig.TestHCatStorer.testPartColsInData org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreMultiTables org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreWithNoSchema {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/763/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/763/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 11 tests failed {noformat} This message is automatically generated.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768839#comment-13768839 ] Eric Hanson commented on HIVE-4961: --- As far as I can tell, the 11 test failures reported in the last test run are not related to this patch.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13768969#comment-13768969 ] Eric Hanson commented on HIVE-4961: --- I ran the failing tests on my machine on a clean version of the vectorization branch without my patch. These tests failed even without my patch: org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump These tests would not run in a way that produced output in ant testreport, and my changes should not affect them: org.apache.hcatalog.listener.TestNotificationListener.testAMQListener org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask org.apache.hive.hcatalog.pig.TestHCatStorer.testPartColsInData org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreInPartiitonedTbl org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreMultiTables org.apache.hive.hcatalog.pig.TestHCatStorer.testStoreWithNoSchema
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769044#comment-13769044 ] Ashutosh Chauhan commented on HIVE-4961: The hcatalog tests are flaky and we can ignore them. However, none of the Hive tests fail on trunk, although the failures are not likely related to your patch. I have seen {{input4.q}} and {{plan_json.q}} fail consistently only on the vectorization branch, so they need to be debugged on the branch. I am not sure about the orc tests, but if they fail regardless of the patch, I think this patch is good to go.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13769064#comment-13769064 ] Ashutosh Chauhan commented on HIVE-4961: I have a few concerns regarding code organization.
* We should create new packages ql/exec/vector/udf, ql/exec/vector/udf/legacy and ql/exec/vector/udf/generic.
* We should put classes VectorUDFAdaptor and VectorUDFArgDesc in vector/udf.
* We should put LongUDF in vector/udf/legacy.
* We should put GenericUDFIsNull in vector/udf/generic.
You can choose to do this in a follow-up patch or in this one. I am fine either way, let me know.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13767701#comment-13767701 ] Hive QA commented on HIVE-4961: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12603172/HIVE-4961.2-vectorization.patch {color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 3955 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hive.hcatalog.pig.TestOrcHCatStorer.testStoreTableMulti org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json org.apache.hadoop.hive.ql.exec.vector.util.TestUDF.initializationError {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/749/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/749/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase Tests failed with: TestsFailedException: 6 tests failed {noformat} This message is automatically generated.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766671#comment-13766671 ] Hive QA commented on HIVE-4961: --- {color:red}Overall{color}: -1 at least one tests failed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12602912/HIVE-4961.1-vectorization.patch {color:red}ERROR:{color} -1 due to 49 failed/errored test(s), 3955 tests executed *Failed tests:* {noformat} org.apache.hadoop.hive.ql.parse.TestParse.testParse_input2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby2 org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDump org.apache.hadoop.hive.ql.parse.TestParse.testParse_cast1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input8 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join8 org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testSequenceTableWriteRead org.apache.hadoop.hive.ql.io.orc.TestFileDump.testDictionaryThreshold org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_part1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample1 org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteRead org.apache.hadoop.hive.ql.parse.TestParse.testParse_join7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_subq org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input20
org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_when org.apache.hadoop.hive.ql.parse.TestParse.testParse_input_testxpath2 org.apache.hcatalog.cli.TestPermsGrp.testCustomPerms org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample7 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input6 org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testSequenceTableWriteReadMR org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_case_sensitivity org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample2 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_input9 org.apache.hive.hcatalog.mapreduce.TestSequenceFileReadWrite.testTextTableWriteReadMR org.apache.hadoop.hive.ql.parse.TestParse.testParse_union org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf6 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby3 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join5 org.apache.hadoop.hive.ql.parse.TestParse.testParse_join6 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_input4 org.apache.hadoop.hive.cli.TestCliDriver.testCliDriver_plan_json org.apache.hadoop.hive.ql.parse.TestParse.testParse_input1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_groupby1 org.apache.hadoop.hive.ql.parse.TestParse.testParse_sample4 org.apache.hadoop.hive.ql.parse.TestParse.testParse_udf_case org.apache.hadoop.hive.ql.exec.vector.expressions.TestUDF.initializationError {noformat} Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/718/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/718/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Executing org.apache.hive.ptest.execution.ExecutionPhase Executing org.apache.hive.ptest.execution.ReportingPhase 
Tests failed with: TestsFailedException: 49 tests failed {noformat} This message is automatically generated.
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13766047#comment-13766047 ] Eric Hanson commented on HIVE-4961: --- Code review available on ReviewBoard: https://reviews.apache.org/r/14113/
[jira] [Commented] (HIVE-4961) Create bridge for custom UDFs to operate in vectorized mode
[ https://issues.apache.org/jira/browse/HIVE-4961?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13749189#comment-13749189 ] Eric Hanson commented on HIVE-4961: --- Completed a working version of the bridge that allows custom UDFs that are subclasses of UDF to work in vectorized mode. It supports UDFs with evaluate() methods that take and return boxed types (e.g. Long), Writable types (e.g. LongWritable) and standard primitive types (e.g. long). Generic UDFs are not supported; that will be the subject of a future patch. I did manual testing for a large set of UDFs taking and returning the types supported by vectorization: tinyint, smallint, int, bigint, float, double, boolean, string, timestamp. UDFs with one argument and with multiple arguments were tested, as were both constant and variable arguments. Including the tests with the patch, or doing another patch with end-to-end tests, is yet to be done.
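For readers unfamiliar with the idea, the following is a minimal, self-contained sketch of what a row-to-vector bridge of this kind might look like. It is not the code from the patch and does not use any Hive classes: VectorBridgeSketch, AddOneUDF, LongColumn and apply() are hypothetical names invented for illustration, and the sketch assumes a single bigint argument handled through an evaluate(Long) method. It only shows the general technique of looking up a row-mode evaluate() method by reflection and calling it once per row of a column batch, with null flags carried alongside the values.

```java
import java.lang.reflect.Method;

public class VectorBridgeSketch {

    /** Hypothetical stand-in for a legacy row-mode UDF: one evaluate() call per row. */
    public static class AddOneUDF {
        public Long evaluate(Long x) {
            return x == null ? null : x + 1;
        }
    }

    /** Hypothetical, simplified column of longs with a per-row null flag. */
    public static class LongColumn {
        public final long[] vector;
        public final boolean[] isNull;

        public LongColumn(int size) {
            vector = new long[size];
            isNull = new boolean[size];
        }
    }

    /** Applies a row-mode evaluate(Long) method to each of the first 'size' rows. */
    public static LongColumn apply(Object udf, LongColumn in, int size) throws Exception {
        // Look up the row-mode method once per batch, not once per row.
        Method evaluate = udf.getClass().getMethod("evaluate", Long.class);
        LongColumn out = new LongColumn(size);
        for (int i = 0; i < size; i++) {
            // Box the column value (or pass null) so the row-mode method can consume it.
            Long arg = in.isNull[i] ? null : Long.valueOf(in.vector[i]);
            Object result = evaluate.invoke(udf, arg);
            if (result == null) {
                out.isNull[i] = true;
            } else {
                out.vector[i] = (Long) result;
            }
        }
        return out;
    }

    public static void main(String[] args) throws Exception {
        LongColumn in = new LongColumn(3);
        in.vector[0] = 10;
        in.vector[1] = 20;
        in.isNull[2] = true;
        LongColumn out = apply(new AddOneUDF(), in, 3);
        // prints: 11 21 true
        System.out.println(out.vector[0] + " " + out.vector[1] + " " + out.isNull[2]);
    }
}
```

Because a bridge like this still boxes each value and makes one reflective call per row, it mainly lets a query containing a custom UDF stay in vectorized mode; it does not by itself make the UDF as fast as an expression written for vectors, which is presumably why a separately designed vectorized UDF interface is left to a different JIRA.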