[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785659#comment-13785659 ]

Hudson commented on HIVE-4642:

FAILURE: Integrated in Hive-trunk-hadoop2-ptest #123 (See [https://builds.apache.org/job/Hive-trunk-hadoop2-ptest/123/])
HIVE-4642 : Implement vectorized RLIKE and REGEXP filter expressions (Teddy Choi via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528917)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java

Implement vectorized RLIKE and REGEXP filter expressions

Key: HIVE-4642
URL: https://issues.apache.org/jira/browse/HIVE-4642
Project: Hive
Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
Fix For: 0.13.0
Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, HIVE-4642.6.patch.txt, HIVE-4642.7.patch.txt, HIVE-4642.8.patch.txt, HIVE-4642.8-vectorization.patch, Hive-Vectorized-Query-Execution-Design-rev10.docx

See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases.

--
This message was sent by Atlassian JIRA
(v6.1#6144)
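For readers unfamiliar with what a "vectorized filter expression" means in the issue description above, here is a minimal sketch of the general idea. It is not the code committed for HIVE-4642 and does not use Hive's classes: the class `VectorizedRegexFilterSketch`, the plain `String[]` column, and the `selected` index array are all assumptions made for illustration. The filter walks a whole batch of rows in one call, reuses a single `Matcher`, and compacts the indices of the surviving rows in place.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class VectorizedRegexFilterSketch {

    /**
     * Hypothetical batch filter: selected[0..size) holds the indices of the
     * rows that are still live. Rows whose value contains a match for the
     * pattern are kept; the array is compacted in place and the new number
     * of live rows is returned.
     */
    static int filter(String[] column, int[] selected, int size, Pattern pattern) {
        Matcher matcher = pattern.matcher("");   // one Matcher reused for the whole batch
        int newSize = 0;
        for (int j = 0; j < size; j++) {
            int row = selected[j];
            String value = column[row];
            // NULL values never pass the filter; find() allows a match anywhere
            if (value != null && matcher.reset(value).find()) {
                selected[newSize++] = row;
            }
        }
        return newSize;
    }

    public static void main(String[] args) {
        String[] column = {"hive", "pig", null, "archive", "hbase"};
        int[] selected = {0, 1, 2, 3, 4};
        int n = filter(column, selected, selected.length, Pattern.compile("hive$"));
        for (int j = 0; j < n; j++) {
            System.out.println(column[selected[j]]);   // prints "hive" then "archive"
        }
    }
}
```

Processing a batch per call avoids per-row virtual dispatch and lets the compiled pattern and matcher be shared across rows, which is one plausible reading of "optimize it ... for the common cases".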
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785648#comment-13785648 ]

Hudson commented on HIVE-4642:

ABORTED: Integrated in Hive-trunk-hadoop2 #473 (See [https://builds.apache.org/job/Hive-trunk-hadoop2/473/])
HIVE-4642 : Implement vectorized RLIKE and REGEXP filter expressions (Teddy Choi via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528917)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785647#comment-13785647 ]

Hudson commented on HIVE-4642:

ABORTED: Integrated in Hive-trunk-h0.21 #2377 (See [https://builds.apache.org/job/Hive-trunk-h0.21/2377/])
HIVE-4642 : Implement vectorized RLIKE and REGEXP filter expressions (Teddy Choi via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528917)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13785839#comment-13785839 ]

Hudson commented on HIVE-4642:

SUCCESS: Integrated in Hive-trunk-hadoop1-ptest #189 (See [https://builds.apache.org/job/Hive-trunk-hadoop1-ptest/189/])
HIVE-4642 : Implement vectorized RLIKE and REGEXP filter expressions (Teddy Choi via Ashutosh Chauhan) (hashutosh: http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1528917)
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/AbstractFilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColLikeStringScalar.java
* /hive/trunk/ql/src/java/org/apache/hadoop/hive/ql/exec/vector/expressions/FilterStringColRegExpStringScalar.java
* /hive/trunk/ql/src/test/org/apache/hadoop/hive/ql/exec/vector/expressions/TestVectorStringExpressions.java
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13784588#comment-13784588 ]

Ashutosh Chauhan commented on HIVE-4642:

+1
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13783648#comment-13783648 ]

Jitendra Nath Pandey commented on HIVE-4642:

+1 for the patch. However, the vectorization work is now committed to trunk. Please rebase the patch against trunk. All vectorization work is now happening directly on trunk.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13781992#comment-13781992 ]

Eric Hanson commented on HIVE-4642:

I agree, it looks like the tests that failed are not related to your patch.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780027#comment-13780027 ]

Eric Hanson commented on HIVE-4642:

The cleanest way I've found is to pull hive, git checkout the 'vectorization' branch, create a new branch locally for the patch, say teddy/hive-4642, then create the patch. When it is done and tested, pull vectorization to make it up to date, then merge vectorization into your work branch. Finally, diff the heads of the two branches. E.g.

git checkout vectorization
git checkout -b teddy/hive-4642
... work ...
git commit -m ...
git checkout vectorization
git pull
git checkout teddy/hive-4642
git merge vectorization
git diff vectorization teddy/hive-4642 > HIVE-4642.n-vectorization.patch

(where n is a serial number for the patch, like 1, 2, 3, ...)

Then upload the patch to the JIRA. The naming convention using n-branch_name in the patch name allows the automated tester to try to apply your patch to the correct branch. The workflow above makes it likely the patch will apply correctly to the vectorization branch.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
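To make the n-branch_name naming convention described above concrete, the following sketch shows how a tool could derive the target branch from an attachment name. This is purely illustrative: the class `PatchNameSketch`, the regular expression, and the fallback to "trunk" when no branch suffix is present are assumptions, not the actual logic of the Hive pre-commit tester.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PatchNameSketch {

    // Assumed shape: ISSUE-KEY.n[-branch_name].patch[.txt]
    private static final Pattern NAME =
        Pattern.compile("^([A-Z]+-\\d+)\\.(\\d+)(?:-([A-Za-z0-9_]+))?\\.patch(?:\\.txt)?$");

    /**
     * Returns the branch encoded in the patch file name, "trunk" when the
     * name has no branch suffix (an assumption made for this sketch), or
     * empty when the name does not follow the assumed convention at all.
     */
    static Optional<String> branchOf(String fileName) {
        Matcher m = NAME.matcher(fileName);
        if (!m.matches()) {
            return Optional.empty();
        }
        String branch = m.group(3);
        return Optional.of(branch == null ? "trunk" : branch);
    }

    public static void main(String[] args) {
        System.out.println(branchOf("HIVE-4642.8-vectorization.patch")); // Optional[vectorization]
        System.out.println(branchOf("HIVE-4642.7.patch.txt"));           // Optional[trunk]
        System.out.println(branchOf("HIVE-4642-1.patch"));               // Optional.empty
    }
}
```

Under this reading, an attachment such as HIVE-4642.7.patch.txt carries no branch and would be tried against trunk, which is one possible explanation for a patch written against the vectorization branch failing to apply.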
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780030#comment-13780030 ]

Eric Hanson commented on HIVE-4642:

Correction: you need to pull vectorization first, before creating your work branch.

git checkout vectorization
git pull
git checkout -b teddy/hive-4642
... work ...
git commit -m ...
git checkout vectorization
git pull
git checkout teddy/hive-4642
git merge vectorization
git diff vectorization teddy/hive-4642 > HIVE-4642.n-vectorization.patch
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780153#comment-13780153 ]

Teddy Choi commented on HIVE-4642:

Sorry. My branch was not clean; it had been merged. I should be more careful. My new patch is from a clean branch, so it should not have any problems.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780522#comment-13780522 ]

Hive QA commented on HIVE-4642:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12605494/HIVE-4642.8-vectorization.patch

{color:red}ERROR:{color} -1 due to 6 failed/errored test(s), 4053 tests executed
*Failed tests:*
{noformat}
org.apache.hcatalog.api.TestHCatClient.testBasicDDLCommands
org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.mapreduce.TestHCatHiveCompatibility.testUnpartedReadWrite
org.apache.hive.hcatalog.pig.TestOrcHCatLoaderComplexSchema.testTupleInBagInTupleInBag
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/938/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/938/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 6 tests failed
{noformat}

This message is automatically generated.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780600#comment-13780600 ]

Hive QA commented on HIVE-4642:

{color:red}Overall{color}: -1 at least one tests failed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12605494/HIVE-4642.8-vectorization.patch

{color:red}ERROR:{color} -1 due to 4 failed/errored test(s), 4053 tests executed
*Failed tests:*
{noformat}
org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
org.apache.hive.hcatalog.fileformats.TestOrcDynamicPartitioned.testHCatDynamicPartitionedTable
org.apache.hive.hcatalog.mapreduce.TestHCatExternalDynamicPartitioned.testHCatDynamicPartitionedTableMultipleTask
org.apache.hive.hcatalog.mapreduce.TestHCatPartitioned.testHCatPartitionedTable
{noformat}

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/939/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/939/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Executing org.apache.hive.ptest.execution.ExecutionPhase
Executing org.apache.hive.ptest.execution.ReportingPhase
Tests failed with: TestsFailedException: 4 tests failed
{noformat}

This message is automatically generated.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13780640#comment-13780640 ]

Teddy Choi commented on HIVE-4642:

I doubt that those errors are caused by this patch, because the failing tests don't use LIKE/REGEXP functions.
* HIVE-5380 fails org.apache.hadoop.hive.cli.TestMinimrCliDriver.testCliDriver_bucket_num_reducers
* HIVE-5199 fails org.apache.hive.hcatalog.mapreduce.TestHCatExternalPartitioned

If I missed something, please tell me.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13779413#comment-13779413 ]

Hive QA commented on HIVE-4642:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12605231/HIVE-4642.7.patch.txt

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/920/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/920/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-920/source-prep.txt
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf
+ svn update
Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1526730.
At revision 1526730.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0 to p2
+ exit 1
'
{noformat}

This message is automatically generated.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13770480#comment-13770480 ]

Jitendra Nath Pandey commented on HIVE-4642:

[~teddy.choi] Thanks for taking care of the serialization part. I think the patch doesn't apply because of the recent changes to the branch. Please rebase the patch again.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13761859#comment-13761859 ]

Hive QA commented on HIVE-4642:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12602047/HIVE-4642.6.patch.txt

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/671/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/671/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-671/source-prep.txt
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
++ awk '{print $2}'
++ egrep -v '^X|^Performing status on external'
++ svn status --no-ignore
+ rm -rf
+ svn update
Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 152.
At revision 152.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0 to p2
+ exit 1
'
{noformat}

This message is automatically generated.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754249#comment-13754249 ]

Jitendra Nath Pandey commented on HIVE-4642:

Please make these new expressions serializable. The HIVE-4959 changes will mandate that all vectorized expressions are serializable. The VectorExpression class already implements the Serializable interface.
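One common way to meet a serializability requirement like the one above for a regex-based expression is sketched below. It is not the code from the HIVE-4642 patch and does not extend Hive's VectorExpression: the class `RegexFilterSketch` and its members are hypothetical. The idea is to serialize only the pattern string and mark the `java.util.regex.Matcher` as transient (Matcher itself is not Serializable), rebuilding it lazily after deserialization. Using `find()` so that the pattern may match anywhere in the value is also an assumption about the intended semantics.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RegexFilterSketch implements Serializable {
    private static final long serialVersionUID = 1L;

    private final String regex;          // the only serialized state
    private transient Matcher matcher;   // not Serializable; rebuilt on first use

    public RegexFilterSketch(String regex) {
        this.regex = regex;
    }

    public boolean matches(String value) {
        if (matcher == null) {
            // First call, or first call after deserialization
            matcher = Pattern.compile(regex).matcher("");
        }
        return matcher.reset(value).find();
    }

    /** Serializes and deserializes the filter in memory. */
    static RegexFilterSketch roundTrip(RegexFilterSketch filter)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(filter);
        }
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            return (RegexFilterSketch) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        RegexFilterSketch copy = roundTrip(new RegexFilterSketch("^ab+c"));
        System.out.println(copy.matches("abbbc"));   // prints true
        System.out.println(copy.matches("xyz"));     // prints false
    }
}
```

Keeping the compiled state transient means the serialized form stays small and independent of the regex engine's internals, at the cost of one recompilation per deserialized copy.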
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13754309#comment-13754309 ]

Teddy Choi commented on HIVE-4642:

[~jnp] Okay, I'll do it soon. :)
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13736359#comment-13736359 ]

Hive QA commented on HIVE-4642:

{color:red}Overall{color}: -1 no tests executed

Here are the results of testing the latest attachment:
https://issues.apache.org/jira/secure/attachment/12597351/HIVE-4642.5.patch.txt

Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/391/testReport
Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/391/console

Messages:
{noformat}
Executing org.apache.hive.ptest.execution.PrepPhase
Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]]
+ export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128'
+ cd /data/hive-ptest/working/
+ tee /data/hive-ptest/logs/PreCommit-HIVE-Build-391/source-prep.txt
+ mkdir -p maven ivy
+ [[ svn = \s\v\n ]]
+ [[ -n '' ]]
+ [[ -d apache-svn-trunk-source ]]
+ [[ ! -d apache-svn-trunk-source/.svn ]]
+ [[ ! -d apache-svn-trunk-source ]]
+ cd apache-svn-trunk-source
+ svn revert -R .
Reverted 'ql/src/java/org/apache/hadoop/hive/ql/optimizer/MapJoinProcessor.java'
++ egrep -v '^X|^Performing status on external'
++ awk '{print $2}'
++ svn status --no-ignore
+ rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen ql/src/test/results/clientpositive/auto_join_reordering_values.q.out ql/src/test/queries/clientpositive/auto_join_reordering_values.q
+ svn update
Fetching external item into 'hcatalog/src/test/e2e/harness'
External at revision 1512979.
At revision 1512979.
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh
+ patchFilePath=/data/hive-ptest/working/scratch/build.patch
+ [[ -f /data/hive-ptest/working/scratch/build.patch ]]
+ chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh
+ /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch
The patch does not appear to apply with p0 to p2
+ exit 1
'
{noformat}

This message is automatically generated.
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13730498#comment-13730498 ] Hive QA commented on HIVE-4642: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12596277/HIVE-4642.4.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/322/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/322/console Messages: {noformat} Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-322/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . Reverted 'ql/src/java/org/apache/hadoop/hive/ql/processors/DfsProcessor.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen ql/src/test/results/clientpositive/dfscmd.q.out ql/src/test/queries/clientpositive/dfscmd.q + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1510878. At revision 1510878. 
+ patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13721516#comment-13721516 ] Teddy Choi commented on HIVE-4642: -- My pleasure. :) Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13720059#comment-13720059 ] Eric Hanson commented on HIVE-4642: --- +1 Looks good. Thanks again, Teddy. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718319#comment-13718319 ] Hive QA commented on HIVE-4642: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593818/HIVE-4642.3.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/163/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/163/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-163/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'jdbc/src/test/org/apache/hive/jdbc/TestJdbcDriver2.java' Reverted 'jdbc/src/java/org/apache/hive/jdbc/HiveStatement.java' Reverted 'service/src/java/org/apache/hive/service/cli/operation/SQLOperation.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/exec/TaskRunner.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/history/HiveHistory.java' Reverted 'ql/src/java/org/apache/hadoop/hive/ql/Driver.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen + svn update Fetching external item into 'hcatalog/src/test/e2e/harness' External at revision 1506523. At revision 1506523. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13718539#comment-13718539 ] Eric Hanson commented on HIVE-4642: --- Thanks Teddy! Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13716573#comment-13716573 ] Eric Hanson commented on HIVE-4642: --- Hi Teddy, how's it going with this? When do you think you can finish this up? Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717829#comment-13717829 ] Teddy Choi commented on HIVE-4642: -- Hello Eric. I uploaded a patch, and I will upload its design specification tonight. The patch has more detailed comments and tests, and it addresses your review comments. I'm sorry for the delay. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13717981#comment-13717981 ] Hive QA commented on HIVE-4642: --- {color:red}Overall{color}: -1 no tests executed Here are the results of testing the latest attachment: https://issues.apache.org/jira/secure/attachment/12593818/HIVE-4642.3.patch.txt Test results: https://builds.apache.org/job/PreCommit-HIVE-Build/160/testReport Console output: https://builds.apache.org/job/PreCommit-HIVE-Build/160/console Messages: {noformat} Executing org.apache.hive.ptest.execution.CleanupPhase Executing org.apache.hive.ptest.execution.PrepPhase Tests failed with: NonZeroExitCodeException: Command 'bash /data/hive-ptest/working/scratch/source-prep.sh' failed with exit status 1 and output '+ [[ -n '' ]] + export 'ANT_OPTS=-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + ANT_OPTS='-Xmx1g -XX:MaxPermSize=256m -Dhttp.proxyHost=localhost -Dhttp.proxyPort=3128' + cd /data/hive-ptest/working/ + tee /data/hive-ptest/logs/PreCommit-HIVE-Build-160/source-prep.txt + mkdir -p maven ivy + [[ svn = \s\v\n ]] + [[ -n '' ]] + [[ -d apache-svn-trunk-source ]] + [[ ! -d apache-svn-trunk-source/.svn ]] + [[ ! -d apache-svn-trunk-source ]] + cd apache-svn-trunk-source + svn revert -R . 
Reverted 'hcatalog/src/test/e2e/templeton/README.txt' Reverted 'hcatalog/src/test/e2e/templeton/drivers/TestDriverCurl.pm' Reverted 'hcatalog/src/test/e2e/templeton/build.xml' Reverted 'hcatalog/webhcat/svr/src/main/java/org/apache/hcatalog/templeton/tool/TempletonControllerJob.java' ++ egrep -v '^X|^Performing status on external' ++ awk '{print $2}' ++ svn status --no-ignore + rm -rf build hcatalog/build hcatalog/src/test/e2e/templeton/tests/jobsubmission2.conf hcatalog/core/build hcatalog/storage-handlers/hbase/build hcatalog/server-extensions/build hcatalog/webhcat/svr/build hcatalog/webhcat/java-client/build hcatalog/hcatalog-pig-adapter/build common/src/gen + svn update A ql/src/test/queries/clientpositive/create_view_translate.q A ql/src/test/queries/clientpositive/view_cast.q A ql/src/test/results/clientpositive/create_view_translate.q.out A ql/src/test/results/clientpositive/view_cast.q.out U ql/src/java/org/apache/hadoop/hive/ql/parse/UnparseTranslator.java Fetching external item into 'hcatalog/src/test/e2e/harness' Updated external to revision 1506396. Updated to revision 1506396. + patchCommandPath=/data/hive-ptest/working/scratch/smart-apply-patch.sh + patchFilePath=/data/hive-ptest/working/scratch/build.patch + [[ -f /data/hive-ptest/working/scratch/build.patch ]] + chmod +x /data/hive-ptest/working/scratch/smart-apply-patch.sh + /data/hive-ptest/working/scratch/smart-apply-patch.sh /data/hive-ptest/working/scratch/build.patch The patch does not appear to apply with p0 to p2 + exit 1 ' {noformat} This message is automatically generated. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, HIVE-4642.3.patch.txt See title. I will add more details next week. 
The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13702148#comment-13702148 ] Eric Hanson commented on HIVE-4642: --- I will look at this today. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13698047#comment-13698047 ] Teddy Choi commented on HIVE-4642: -- Review request on https://reviews.apache.org/r/12235/ Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13683558#comment-13683558 ] Eric Hanson commented on HIVE-4642: --- It's great that you are learning about regular expression implementation algorithms. If you can come up with an approach that allows you to compile the regular expression once into a good internal format when you build the vectorized FilterStringColRegExpStringScalar class instance, that will be good. Then you can re-use the internal format (say some kind of FA) for each batch. Be careful to make sure the common cases are fast. Don't make the project too big. I am not sure how much time it will take to implement a fully general regexp matcher. If you think you can do it in the next month or two, fine. If it takes longer, maybe you should think of a different approach. If it looks like the project will become too big, consider focusing just on common special cases (like matching phone numbers, URLs, email addresses, various number formats, etc.), then use an existing RegExp matcher when the pattern is not one of the limited class of expressions your new code can handle. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
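The "compile once, reuse for each batch" idea in the comment above can be sketched in a few lines of Java. This is only an illustration under stated assumptions: CompiledRegexFilter, filterBatch, and the byte[][] batch layout are hypothetical stand-ins, not Hive's actual vectorized classes, and the use of find() (substring semantics) is an assumption about how RLIKE should match.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical stand-in for a vectorized filter; not Hive's actual class. */
final class CompiledRegexFilter {
  // The regex is compiled once, at construction time, and the same
  // Matcher is reused for every row of every batch.
  private final Matcher matcher;

  CompiledRegexFilter(String regex) {
    this.matcher = Pattern.compile(regex).matcher("");
  }

  /**
   * Filters one batch of UTF-8 encoded rows. Writes the indexes of the
   * matching rows into selected and returns how many rows matched.
   */
  int filterBatch(byte[][] rows, int size, int[] selected) {
    int kept = 0;
    for (int i = 0; i < size; i++) {
      String value = new String(rows[i], StandardCharsets.UTF_8);
      // reset() rebinds the existing Matcher instead of allocating a new one.
      if (matcher.reset(value).find()) {
        selected[kept++] = i;
      }
    }
    return kept;
  }
}
```

The sketch still allocates one String per row; it only shows where the compiled form lives and how it is shared across batches.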
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13682299#comment-13682299 ] Teddy Choi commented on HIVE-4642: -- [~ehans], I read some books and papers about regular expressions. According to http://en.wikipedia.org/wiki/Regular_expression#Implementations_and_running_times , DFA construction time and backtracking NFA running time are exponential in the worst case, so they are not good solutions. According to http://swtch.com/~rsc/regexp/regexp1.html and http://www.cs.ucdavis.edu/~green/papers/techrept02.pdf , a lazy DFA could be a good choice. Its construction time is O(n), and its running time is at most O(m^2*n). If there are enough target strings and they share a similar pattern, then the average running time will approach O(n). So it looks promising. It's hard to find a lazy DFA regular expression engine for Java. java.util.regex is a traditional NFA, and JRegex is a DFA. Jakarta Regexp is retired, and so is Jakarta ORO. The others have not been updated for years. I'm surprised that it's hard to find one, so I think it would be good to implement one myself. Fortunately, it doesn't seem hard to implement. If you know of an alternative solution, please let me know. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
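For readers unfamiliar with the lazy DFA idea mentioned above, here is a minimal Java sketch. It is not code from any patch on this issue. To stay short it skips regex parsing entirely and assumes a hand-built NFA for the single pattern (a|b)*abb; the class name LazyDfa and its methods are hypothetical. DFA states (sets of NFA states) and their transitions are computed only when an input first needs them, then cached for later strings.

```java
import java.util.BitSet;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical lazy DFA over a hand-built NFA for (a|b)*abb. */
final class LazyDfa {
  private static final int START = 0;
  private static final int ACCEPT = 3;

  // Hard-coded NFA transitions (an assumption made to avoid a regex parser).
  private static int[] nfaStep(int state, char c) {
    if (state == 0 && c == 'a') return new int[] {0, 1};
    if (state == 0 && c == 'b') return new int[] {0};
    if (state == 1 && c == 'b') return new int[] {2};
    if (state == 2 && c == 'b') return new int[] {3};
    return new int[0];
  }

  // DFA transitions discovered so far:
  // DFA state (a set of NFA states) -> input char -> next DFA state.
  private final Map<BitSet, Map<Character, BitSet>> cache = new HashMap<>();

  /** Number of DFA states that have at least one cached transition. */
  int cachedStates() {
    return cache.size();
  }

  /** Returns true if the whole input matches (a|b)*abb. */
  boolean matches(CharSequence input) {
    BitSet current = new BitSet();
    current.set(START);
    for (int i = 0; i < input.length(); i++) {
      char c = input.charAt(i);
      Map<Character, BitSet> row = cache.computeIfAbsent(current, k -> new HashMap<>());
      BitSet next = row.get(c);
      if (next == null) {
        // First time this (state, char) pair is seen: build the transition now.
        next = new BitSet();
        for (int s = current.nextSetBit(0); s >= 0; s = current.nextSetBit(s + 1)) {
          for (int t : nfaStep(s, c)) {
            next.set(t);
          }
        }
        row.put(c, next);
      }
      current = next;
    }
    return current.get(ACCEPT);
  }
}
```

Once the cache is warm, each input character costs one map lookup, which is the intuition behind the claim that many similar strings bring the average cost down.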
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13679690#comment-13679690 ] Eric Hanson commented on HIVE-4642: --- This sounds good. Think about additional common cases for matching such as seeing if a text string is a phone number, email address, or URL. If you can broaden the use cases you accelerate to cover those, that would be a plus. If these are too complicated, you can defer them until later, maybe in a different JIRA. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13678996#comment-13678996 ] Teddy Choi commented on HIVE-4642: -- I found that most methods of the FilterStringColRegExpStringScalar class are the same as those of the FilterStringColLikeStringScalar class from HIVE-4548, so I revised my spec again. {panel} Create an AbstractFilterStringColLikeStringScalar class and move up all methods of the FilterStringColLikeStringScalar class except the parseSimplePattern() method. Make the FilterStringColLikeStringScalar and FilterStringColRegExpStringScalar classes extend AbstractFilterStringColLikeStringScalar. Implement the constructors and the parseSimplePattern() method differently in each class. The class hierarchy will be: {noformat} AbstractFilterStringColLikeStringScalar + FilterStringColRegExpStringScalar + FilterStringColLikeStringScalar {noformat} Evaluate a REGEXP pattern .\*abc as the LIKE pattern %abc when abc contains literal characters only. Likewise, evaluate abc.\* as abc%, .\*abc.\* as %abc%, and abc as abc; evaluate all other patterns as they are. Cache a Matcher member instance in the AbstractFilterStringColLikeStringScalar class and call Matcher#reset(CharSequence). Optimize patterns containing only _ (or .) and literal characters. {panel} Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
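The pattern mapping in the spec above might look roughly like the following Java sketch. It is an assumption-laden illustration, not the code of the attached patches: RegexFastPath, Kind, and test are hypothetical names, "literal characters" is narrowed to letters, digits, and spaces, and the fallback uses whole-string matches() so that it agrees with the spec's "abc as abc" rule. Whether that agrees with RLIKE's real matching semantics is not verified here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical classifier that maps simple REGEXP patterns to cheap string checks. */
final class RegexFastPath {
  enum Kind { EXACT, PREFIX, SUFFIX, CONTAINS, COMPLEX }

  // Assumed definition of "literal characters only": letters, digits, spaces.
  private static final Pattern LITERAL = Pattern.compile("[A-Za-z0-9 ]+");

  final Kind kind;
  final String literal;            // text to compare, or null for COMPLEX
  private final Matcher fallback;  // cached Matcher, used only for COMPLEX

  RegexFastPath(String regex) {
    String core = regex;
    boolean leading = core.startsWith(".*");
    if (leading) core = core.substring(2);
    boolean trailing = core.endsWith(".*");
    if (trailing) core = core.substring(0, core.length() - 2);

    if (LITERAL.matcher(core).matches()) {
      // .*abc -> %abc (SUFFIX), abc.* -> abc% (PREFIX),
      // .*abc.* -> %abc% (CONTAINS), abc -> abc (EXACT)
      kind = leading ? (trailing ? Kind.CONTAINS : Kind.SUFFIX)
                     : (trailing ? Kind.PREFIX : Kind.EXACT);
      literal = core;
      fallback = null;
    } else {
      kind = Kind.COMPLEX;
      literal = null;
      fallback = Pattern.compile(regex).matcher("");
    }
  }

  boolean test(String value) {
    switch (kind) {
      case EXACT:    return value.equals(literal);
      case PREFIX:   return value.startsWith(literal);
      case SUFFIX:   return value.endsWith(literal);
      case CONTAINS: return value.contains(literal);
      default:       return fallback.reset(value).matches(); // reuse the cached Matcher
    }
  }
}
```

Any pattern the classifier does not recognize falls through to the cached Matcher, so the fast paths never change which patterns are accepted.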
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13674426#comment-13674426 ] Teddy Choi commented on HIVE-4642: -- Here is my draft spec. Please leave a comment. The base version can be easily implemented with the basic template and the UDFRegExp class. It will be expensive and needs further optimization. Problem: A regular expression matcher is about 10 or more times slower than a prefix/suffix matcher (as shown in HIVE-4548). Because the Pattern is already compiled, it's hard to optimize the Pattern further. Matchers don't depend on each other, so they can be distributed over threads. Also, the base version will create new objects on every call. These can be implemented more efficiently. Goal: Reduce object creation per call, and distribute the matching load over multiple threads. Cache and reuse a compiled pattern, a byte buffer, a char buffer, and a UTF-8 decoder, as in HIVE-4548. Divide matching tasks into groups, and run each group on a different thread, or apply the producer-consumer pattern. If there are enough idle CPU cores, total execution time will be reduced significantly. If it is feasible, implement prefix/suffix matchers for further optimization. People may prefer the LIKE filter for simpler filtering, so these matchers may not be used frequently, but they will run faster. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
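The object-reuse part of the draft spec above (a cached pattern, char buffer, and UTF-8 decoder) can be sketched in Java as follows. This is a minimal single-threaded illustration, not the HIVE-4548 code: ReusableUtf8Matcher and its find method are hypothetical names, and replacing malformed input rather than failing is an assumption made for the sketch.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper that decodes UTF-8 bytes and matches without per-row Strings. */
final class ReusableUtf8Matcher {
  private final Matcher matcher;
  private final CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
      .onMalformedInput(CodingErrorAction.REPLACE)
      .onUnmappableCharacter(CodingErrorAction.REPLACE);
  private CharBuffer chars = CharBuffer.allocate(64);

  ReusableUtf8Matcher(String regex) {
    this.matcher = Pattern.compile(regex).matcher("");
  }

  /** Returns true if the regex is found in bytes[start, start + length). */
  boolean find(byte[] bytes, int start, int length) {
    // UTF-8 never decodes to more chars than it has bytes,
    // so a buffer of `length` chars is always large enough.
    if (chars.capacity() < length) {
      chars = CharBuffer.allocate(length);
    }
    chars.clear();
    decoder.reset();
    // wrap() creates only a thin view over the existing array; no bytes are copied.
    ByteBuffer in = ByteBuffer.wrap(bytes, start, length);
    decoder.decode(in, chars, true);
    decoder.flush(chars);
    chars.flip();
    // CharBuffer implements CharSequence, so the Matcher can read it directly.
    return matcher.reset(chars).find();
  }
}
```

The char buffer only grows, so after the longest row in a batch has been seen, no further char buffers are allocated.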
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13674690#comment-13674690 ] Eric Hanson commented on HIVE-4642: --- I think this sounds good, except that using multi-threaded parallelism is not a good idea here. We should rely on getting parallelism for large data sets by having multiple splits processed in parallel in different processes. Using fine-grain multi-threaded parallelism within a process only for purposes of speeding up RLIKE/REGEXP does not seem appropriate. I'd recommend focusing on the fastest operation you can get within a single thread, at least for common patterns, or maybe even all possible patterns. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13675467#comment-13675467 ] Teddy Choi commented on HIVE-4642: -- I see. I was not sure about parallelization. I'll focus on a single thread. Thank you for the feedback. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira
[jira] [Commented] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions
[ https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=13671867#comment-13671867 ] Eric Hanson commented on HIVE-4642: --- This is for a later milestone, not the first milestone we are stabilizing in June. But we can start now. Implement vectorized RLIKE and REGEXP filter expressions Key: HIVE-4642 URL: https://issues.apache.org/jira/browse/HIVE-4642 Project: Hive Issue Type: Sub-task Reporter: Eric Hanson Assignee: Teddy Choi See title. I will add more details next week. The goal is (a) make this work correctly and (b) optimize it as well as possible, at least for the common cases. -- This message is automatically generated by JIRA. If you think it was sent incorrectly, please contact your JIRA administrators For more information on JIRA, see: http://www.atlassian.com/software/jira