[ 
https://issues.apache.org/jira/browse/PHOENIX-1287?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14360263#comment-14360263
 ] 

Shuxiong Ye commented on PHOENIX-1287:
--------------------------------------

Hi [~jamestaylor], 

Most of the code in ByteBasedLikeExpression would be the same as in 
LikeExpression, and I think that would make future maintenance and development 
harder. For example, if we wanted to change the logic of LikeExpression, we 
would have to update ByteBasedLikeExpression as well.

My plan is:

1. Pass a USE_BYTE_BASE_REGEX option to the regex expressions (LikeExpression, 
RegexpReplaceFunction, RegexpSplitFunction, RegexpSubstrFunction).
2. Each of these expressions uses the appropriate pattern matcher according to 
the option.
3. A pattern matcher factory produces either the j.u.regex-based matcher or the 
byte-based one.
4. The interface of the base pattern matcher looks like:
{code:java}
Pattern compile(ImmutableBytesWritable ptr);
Matcher matcher(ImmutableBytesWritable ptr);
void replace(ImmutableBytesWritable ptr, ImmutableBytesWritable outputPtr);
{code}
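
A rough sketch of how points 3 and 4 could fit together is below. It is only an illustration under several assumptions: the names BytesPattern, JavaRegexPattern and PatternFactory are hypothetical, plain byte[] stands in for ImmutableBytesWritable so the example has no HBase dependency, the interface is reduced to a match and a replace method, and the byte-based (joni) branch is left as a placeholder rather than a real implementation.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

// Hypothetical engine-neutral interface; byte[] stands in for ImmutableBytesWritable.
interface BytesPattern {
    boolean matches(byte[] input);
    byte[] replaceAll(byte[] input, byte[] replacement);
}

// j.u.regex-backed implementation: decodes the bytes to String on every call,
// which is the overhead a byte-based engine is meant to avoid.
final class JavaRegexPattern implements BytesPattern {
    private final Pattern pattern;

    JavaRegexPattern(byte[] regex) {
        this.pattern = Pattern.compile(new String(regex, StandardCharsets.UTF_8));
    }

    @Override
    public boolean matches(byte[] input) {
        return pattern.matcher(new String(input, StandardCharsets.UTF_8)).matches();
    }

    @Override
    public byte[] replaceAll(byte[] input, byte[] replacement) {
        String source = new String(input, StandardCharsets.UTF_8);
        String with = new String(replacement, StandardCharsets.UTF_8);
        return pattern.matcher(source).replaceAll(with).getBytes(StandardCharsets.UTF_8);
    }
}

// Hypothetical factory: expressions ask for a pattern and never name the engine.
final class PatternFactory {
    private PatternFactory() {}

    static BytesPattern compile(byte[] regex, boolean useByteBasedRegex) {
        if (useByteBasedRegex) {
            // Placeholder: a joni-backed implementation would be returned here.
            throw new UnsupportedOperationException("byte-based engine is not part of this sketch");
        }
        return new JavaRegexPattern(regex);
    }
}
```

With this shape, LikeExpression and the Regexp* functions would only call PatternFactory.compile(...) with the option value, so the expression logic stays in one place regardless of which engine is selected.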

How about this?

Thanks.


> Use the joni byte[] regex engine in place of j.u.regex
> ------------------------------------------------------
>
>                 Key: PHOENIX-1287
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-1287
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>              Labels: gsoc2015
>
> See HBASE-11907. We'd get a 2x perf benefit plus it's driven off of byte[] 
> instead of strings. Thanks for the pointer, [~apurtell].



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
