[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-10-03 Thread Ashutosh Chauhan (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ashutosh Chauhan updated HIVE-4642:
---

   Resolution: Fixed
Fix Version/s: 0.13.0
   Status: Resolved  (was: Patch Available)

Committed to trunk. Thanks, Teddy!

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Fix For: 0.13.0

 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 HIVE-4642.6.patch.txt, HIVE-4642.7.patch.txt, HIVE-4642.8.patch.txt, 
 HIVE-4642.8-vectorization.patch, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



--
This message was sent by Atlassian JIRA
(v6.1#6144)


[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-09-27 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.8.patch.txt

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 HIVE-4642.6.patch.txt, HIVE-4642.7.patch.txt, HIVE-4642.8.patch.txt, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-09-27 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.8-vectorization.patch

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 HIVE-4642.6.patch.txt, HIVE-4642.7.patch.txt, HIVE-4642.8.patch.txt, 
 HIVE-4642.8-vectorization.patch, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-09-26 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.7.patch.txt

I attach a rebased version of the last patch.

The problem was that plan serialization does not use setter/getter methods, so 
the checker member variable was never assigned after deserialization. Now it 
is assigned in the evaluate() method. The tests pass without any misleading errors.

I wish that this would be the last patch. :P
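
For readers unfamiliar with this class of bug, here is a minimal sketch of the fix described above: derived state that the serializer does not restore is rebuilt on first use inside evaluate(). LazyPatternFilter and roundTrip are hypothetical names, and plain Java serialization stands in for Hive's plan serialization, so this shows the idea only and is not the code from the patch.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.regex.Pattern;

// Hypothetical stand-in for a filter expression; not the class from the patch.
class LazyPatternFilter implements Serializable {
    private static final long serialVersionUID = 1L;

    // Serialized along with the plan.
    private final String pattern;

    // Derived state: not serialized, so it is null after deserialization.
    private transient Pattern compiled;

    LazyPatternFilter(String pattern) {
        this.pattern = pattern;
    }

    boolean evaluate(String value) {
        // Rebuild the derived state on first use instead of relying on a
        // setter or constructor having run on the deserialized copy.
        if (compiled == null) {
            compiled = Pattern.compile(pattern);
        }
        return compiled.matcher(value).matches();
    }

    // Serializes and deserializes the filter (assumption: Java serialization
    // is used here only to simulate shipping a plan to a task).
    static LazyPatternFilter roundTrip(LazyPatternFilter filter) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(filter);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                return (LazyPatternFilter) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

The null check costs one branch per evaluate() call, which is negligible if evaluate() is invoked once per batch of rows rather than once per row.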

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 HIVE-4642.6.patch.txt, HIVE-4642.7.patch.txt, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-09-09 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Status: Patch Available  (was: Open)

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 HIVE-4642.6.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-09-08 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.6.patch.txt

Added support for serialization.

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 HIVE-4642.6.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-08-29 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Status: Open  (was: Patch Available)

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-08-11 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.5.patch.txt

I uploaded the 4th patch with incorrect contents. This 5th patch corrects it.

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, HIVE-4642.5.patch.txt, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-08-05 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.4.patch.txt

The 4th patch contains the following changes:
- Added code to AbstractFilterStringColLikeStringScalar.java to evaluate child 
expressions.
- Removed misleading comments.

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, HIVE-4642.4.patch.txt, 
 Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-07-24 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: Hive-Vectorized-Query-Execution-Design-rev10.docx

I wrote a "LIKE and REGEXP expressions" section under the Filter operator. The 
added text follows.
{quote}
Filter condition expressions

LIKE and REGEXP expressions:

LIKE and REGEXP expressions find any strings fitting a pattern. They compile a 
pattern on creation, and find strings on evaluation.
Both kinds of expression use the Java regular expression package. REGEXP 
expressions use the package as it is. But LIKE expressions have different 
grammar, so they need conversion. “%” is converted to “.*” and “_” is converted 
to “.”. AbstractFilterStringColLikeStringScalar class defines common behaviors. 
FilterStringColLikeStringScalar class and FilterStringColRegExpStringScalar 
class implement differences.
Some patterns are simple and frequently used, such as prefix match, suffix 
match, middle match, exact match, and phone numbers. These have optimized 
implementations that evaluate directly on byte arrays to avoid the cost of 
UTF-8 decoding.
{quote}
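
The LIKE-to-regex conversion described in the quoted text can be sketched as follows. This is an illustration under stated assumptions, not the code from the patch: LikeToRegex, convert, and like are hypothetical names, LIKE escape sequences are not handled, and literal text is quoted with Pattern.quote so that characters such as "." keep their literal meaning. Using Pattern.DOTALL so that "_" and "%" also match line terminators is likewise an assumption.

```java
import java.util.regex.Pattern;

// Hypothetical helper; the names are not taken from the patch.
class LikeToRegex {

    // Converts a LIKE pattern to a Java regular expression:
    // "%" becomes ".*", "_" becomes ".", everything else is matched literally.
    static String convert(String likePattern) {
        StringBuilder regex = new StringBuilder();
        StringBuilder literal = new StringBuilder();
        for (int i = 0; i < likePattern.length(); i++) {
            char c = likePattern.charAt(i);
            if (c == '%' || c == '_') {
                flushLiteral(literal, regex);
                regex.append(c == '%' ? ".*" : ".");
            } else {
                literal.append(c);
            }
        }
        flushLiteral(literal, regex);
        return regex.toString();
    }

    // Appends the pending literal run, quoted so regex metacharacters are inert.
    private static void flushLiteral(StringBuilder literal, StringBuilder regex) {
        if (literal.length() > 0) {
            regex.append(Pattern.quote(literal.toString()));
            literal.setLength(0);
        }
    }

    // Full-string match, as LIKE requires. A real expression would compile
    // the pattern once on creation rather than on every call.
    static boolean like(String value, String likePattern) {
        return Pattern.compile(convert(likePattern), Pattern.DOTALL)
                .matcher(value)
                .matches();
    }
}
```

For example, like("hive-4642", "hive%") is true, while like("abc", "a.c") is false because the dot is treated as a literal character.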

This file was edited in Word for Mac 2011, so it may have compatibility issues.

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt, Hive-Vectorized-Query-Execution-Design-rev10.docx


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-07-23 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.3.patch.txt

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-07-23 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Status: Patch Available  (was: In Progress)

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch, 
 HIVE-4642.3.patch.txt


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-07-02 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642.2.patch

After HIVE-4548 was applied, the previous patch no longer applies to the 
vectorization branch, because both patches change 
FilterStringColLikeStringScalar.

This patch applies to the vectorization branch.

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch, HIVE-4642.2.patch


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.



[jira] [Updated] (HIVE-4642) Implement vectorized RLIKE and REGEXP filter expressions

2013-06-30 Thread Teddy Choi (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4642?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Teddy Choi updated HIVE-4642:
-

Attachment: HIVE-4642-1.patch

I wrote draft code. It needs more comments, tests, and refactoring.

I agree that FA generation will be a heavy job, so I didn't implement it. 
Common phone number patterns are covered with a simple fixed automaton. I will 
add more simple automata.
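
To illustrate what a "simple fixed automaton" for phone numbers might look like, here is a minimal sketch that works directly on UTF-8 bytes. The ddd-ddd-dddd layout, the full-slice match, and the names PhoneNumberAutomaton and matches are assumptions made for this example; the patterns actually covered by the patch may differ.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical sketch; the layout below is an assumed phone number format.
class PhoneNumberAutomaton {

    // One state per position: 'd' expects an ASCII digit, any other
    // character expects exactly that byte.
    private static final byte[] LAYOUT =
            "ddd-ddd-dddd".getBytes(StandardCharsets.US_ASCII);

    // Returns true if bytes[start, start + length) matches the layout.
    // No UTF-8 decoding is needed: every byte of a multi-byte UTF-8
    // character has its high bit set, so it can equal neither a digit
    // nor '-' and is rejected by the comparisons below.
    static boolean matches(byte[] bytes, int start, int length) {
        if (length != LAYOUT.length) {
            return false;
        }
        for (int state = 0; state < LAYOUT.length; state++) {
            byte b = bytes[start + state];
            if (LAYOUT[state] == 'd') {
                if (b < '0' || b > '9') {
                    return false;
                }
            } else if (b != LAYOUT[state]) {
                return false;
            }
        }
        return true;
    }
}
```

Because the automaton is fixed, there are no transition tables to build at query time, which is the trade-off against general FA generation mentioned above.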

There are already hard-coded decisions, and more will come. So I introduced an 
interface that generalizes decisions. It may reduce performance a little bit.

Class hierarchy:

AbstractFilterStringColLikeStringScalar
+ FilterStringColLikeStringScalar
+ FilterStringColRegExpStringScalar

AbstractFilter...#Checker
+ AbstractFilter...#BeginChecker
+ AbstractFilter...#EndChecker
+ AbstractFilter...#MiddleChecker
+ AbstractFilter...#NoneChecker
+ AbstractFilter...#AnyCharChecker
+ AbstractFilter...#ComplexChecker
+ FilterStringColRegExpStringScalar#PhoneNumberChecker

AbstractFilter...#CheckerFactory
+ Filter...Like...#LikeBeginCheckerFactory
+ Filter...Like...#LikeEndCheckerFactory
+ Filter...Like...#LikeMiddleCheckerFactory
+ Filter...Like...#LikeNoneCheckerFactory
+ Filter...Like...#LikeAnyCharCheckerFactory
+ Filter...Like...#LikeComplexCheckerFactory
+ Filter...RegExp...#RegExpBeginCheckerFactory
+ Filter...RegExp...#RegExpEndCheckerFactory
+ Filter...RegExp...#RegExpMiddleCheckerFactory
+ Filter...RegExp...#RegExpNoneCheckerFactory
+ Filter...RegExp...#RegExpAnyCharCheckerFactory
+ Filter...RegExp...#RegExpComplexCheckerFactory
+ Filter...RegExp...#RegExpPhoneNumberCheckerFactory
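
The Checker/CheckerFactory split above can be sketched roughly as follows: each factory inspects the pattern and either returns a specialized checker or declines, and the complex (regex) checker is the fallback. Only the names Checker, CheckerFactory, BeginChecker, and ComplexChecker come from the hierarchy; the method signatures, the LikeCheckers class, and the selection loop are assumptions for illustration, and LIKE escape sequences are not handled.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Pattern;

// Decides whether one string value, given as a UTF-8 byte slice, matches.
interface Checker {
    boolean check(byte[] bytes, int start, int length);
}

// Returns a checker if this factory can handle the pattern, else null.
interface CheckerFactory {
    Checker tryCreate(String pattern);
}

// Prefix match on raw bytes; no UTF-8 decoding.
class BeginChecker implements Checker {
    private final byte[] prefix;

    BeginChecker(String prefix) {
        this.prefix = prefix.getBytes(StandardCharsets.UTF_8);
    }

    public boolean check(byte[] bytes, int start, int length) {
        if (length < prefix.length) {
            return false;
        }
        for (int i = 0; i < prefix.length; i++) {
            if (bytes[start + i] != prefix[i]) {
                return false;
            }
        }
        return true;
    }
}

// General fallback: decode the slice and run the compiled regex.
class ComplexChecker implements Checker {
    private final Pattern compiled;

    ComplexChecker(String regex) {
        this.compiled = Pattern.compile(regex);
    }

    public boolean check(byte[] bytes, int start, int length) {
        String value = new String(bytes, start, length, StandardCharsets.UTF_8);
        return compiled.matcher(value).matches();
    }
}

// Hypothetical selection logic for LIKE patterns.
class LikeCheckers {
    // Handles patterns of the form "abc%" with no other wildcard.
    static final CheckerFactory BEGIN = pattern -> {
        int n = pattern.length();
        if (n < 2 || pattern.charAt(n - 1) != '%') {
            return null;
        }
        String prefix = pattern.substring(0, n - 1);
        if (prefix.indexOf('%') >= 0 || prefix.indexOf('_') >= 0) {
            return null;
        }
        return new BeginChecker(prefix);
    };

    // Accepts every pattern by converting it to a regular expression.
    static final CheckerFactory COMPLEX = pattern -> {
        StringBuilder regex = new StringBuilder();
        for (char c : pattern.toCharArray()) {
            if (c == '%') {
                regex.append(".*");
            } else if (c == '_') {
                regex.append('.');
            } else {
                regex.append(Pattern.quote(String.valueOf(c)));
            }
        }
        return new ComplexChecker(regex.toString());
    };

    // Ordered from most specialized to most general.
    private static final List<CheckerFactory> FACTORIES = Arrays.asList(BEGIN, COMPLEX);

    static Checker select(String likePattern) {
        for (CheckerFactory factory : FACTORIES) {
            Checker checker = factory.tryCreate(likePattern);
            if (checker != null) {
                return checker;
            }
        }
        throw new IllegalStateException("COMPLEX accepts every pattern");
    }
}
```

The selection runs once per expression, so the per-row cost of the indirection is one interface call, which is the small performance reduction mentioned above.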

 Implement vectorized RLIKE and REGEXP filter expressions
 

 Key: HIVE-4642
 URL: https://issues.apache.org/jira/browse/HIVE-4642
 Project: Hive
  Issue Type: Sub-task
Reporter: Eric Hanson
Assignee: Teddy Choi
 Attachments: HIVE-4642-1.patch


 See title. I will add more details next week. The goal is (a) make this work 
 correctly and (b) optimize it as well as possible, at least for the common 
 cases.
