[ 
https://issues.apache.org/jira/browse/HBASE-4612?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13130020#comment-13130020
 ] 

Eran Kutner commented on HBASE-4612:
------------------------------------

Hi Jonathan, thanks for the feedback! See answers inline:

{quote}There's no explanation of the behavior anywhere. In the constructors and 
addPrefix() methods, you should document that this creates an OR condition 
across all of the prefixes, correct?{quote} - good point, added some more 
explanations.
{quote}No need to instantiate a new comparator all the time (use 
Bytes.BYTES_COMPARATOR){quote} - Didn't know it existed. Changed.
{quote}Something seems odd when you keep adding to the end of a List and then 
sort. How about a TreeSet? You can easily ignore dupes that way.{quote} - This 
is intentional. Sorting is done only during initialization but accessing a 
ArrayList, which is actually based on an array, is much more efficient than 
accessing a tree, so I sacrifice the aesthetics of the code for better runtime 
performance.
{quote}There's no input verification so, for example, you could pass a null to 
the constructor or an empty byte[][] and have some strange behavior. Like it 
will instantiate okay but then you'll get server-side NPEs or IOOB.{quote} - 
it's a good point but I've looked and no other filter is validating its input 
either. I can throw a InvalidArgumentException but don't know if it's a good 
idea considering it's not the norm.
{quote}this.prefixes.size() == 0 -> this.prefixes.isEmpty(){quote} - ok, 
changed.
{quote}your comment at the top of filterColumn, i wouldn't exactly call it a 
workaround, but it's a good comment. looking at the logic, it seems like 
correct behavior would be that it can be called with current == size() but it 
would be a bug if current > size(), right? should you add an assert or throw an 
exception?{quote} - well it is kind of a workaround, because as an individual 
filter I expect not be called again after returning NEXT_ROW, however, when 
used with FilterList the filter does get called again which puts it in an 
ilegal state, so it has to explicitly handle that case. That is also why it 
can't throw an exception in that scenario, because it seems to be happening 
normally when used with FilterList. as for "current" it has to be smaller than 
size() or it would be outside the bounds of the array.


                
> Allow ColumnPrefixFilter to support multiple prefixes
> -----------------------------------------------------
>
>                 Key: HBASE-4612
>                 URL: https://issues.apache.org/jira/browse/HBASE-4612
>             Project: HBase
>          Issue Type: Improvement
>          Components: filters
>    Affects Versions: 0.90.4
>            Reporter: Eran Kutner
>            Assignee: Eran Kutner
>            Priority: Minor
>             Fix For: 0.94.0
>
>         Attachments: HBASE-4612-0.90.patch
>
>
> When having a lot of columns grouped by name I've found that it would be very 
> useful to be able to scan them using multiple prefixes, allowing to fetch 
> specific groups in one scan, without fetching the entire row. This is 
> impossible to achieve using a FilterList, so I've added such support to the 
> existing ColmnPrefixFilter while keeping backward compatibility.
> The attached patch is based on 0.90.4, I noticed that the 0.92 branch has a 
> new method to support instantiating filters using Thrift. I'm not sure how 
> the serialization works there so I didn't implement that, but the rest of my 
> code should work in 0.92 as well.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        

Reply via email to