[jira] [Commented] (PHOENIX-29) Add custom filter to more efficiently navigate KeyValues in row

Anoop Sam John (JIRA) Thu, 20 Feb 2014 01:05:23 -0800

    [ https://issues.apache.org/jira/browse/PHOENIX-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13906778#comment-13906778 ]


Anoop Sam John commented on PHOENIX-29:
---------------------------------------

Regarding the case
select a, b from t where c = 5

Yes, we create the ColumnProjectionFilter after the step that adds the empty 
column to the Scan, so the ColumnProjectionFilter will also contain the empty 
column.

bq.1) You're doing a SELECT * wildcard projection. In this case, we want to 
keep all the KeyValue (and basically not include your filter).
Yes, in this case we will not add the new filter to the Scan.
{code}
for (Entry<byte[], NavigableSet<byte[]>> entry : familyMap.entrySet()) {
    if (entry.getValue() != null) {
        NavigableSet<ImmutableBytesPtr> cols = new TreeSet<ImmutableBytesPtr>();
        for (byte[] col : entry.getValue()) {
            cols.add(new ImmutableBytesPtr(col));
        }
        columnsTracker.put(new ImmutableBytesPtr(entry.getKey()), cols);
    }
}
if (!columnsTracker.isEmpty()) {
    for (ImmutableBytesPtr f : columnsTracker.keySet()) {
        // This addFamily will remove explicit cols in scan familyMap and make it as entire row.
        // We don't want the ExplicitColumnTracker to be used. Instead we have the ColumnProjectionFilter
        scan.addFamily(f.get());
    }
    ScanUtil.andFilterAtEndButBefore(scan, new ColumnProjectionFilter(columnsTracker),
            PageFilter.class.getName());
}
{code}
In that case the family map will contain the family key, but its value will be 
null. That leaves columnsTracker empty, so we will not add the new filter.
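
To make the wildcard behaviour concrete, here is a minimal sketch of the same 
decision using plain Java collections. It is not Phoenix or HBase code: the 
class name, the use of String instead of byte[] and ImmutableBytesPtr, and the 
"EMPTY" placeholder for the empty column are all assumptions made for 
illustration.

```java
import java.util.Arrays;
import java.util.Map;
import java.util.NavigableSet;
import java.util.TreeMap;
import java.util.TreeSet;

// Hypothetical stand-in for the patch logic; names are illustrative only.
final class ProjectionTrackerSketch {

    // Copies only the families that list explicit qualifiers (non-null value),
    // mirroring the null check in the loop from the patch.
    static Map<String, NavigableSet<String>> buildTracker(
            Map<String, NavigableSet<String>> familyMap) {
        Map<String, NavigableSet<String>> tracker = new TreeMap<>();
        for (Map.Entry<String, NavigableSet<String>> e : familyMap.entrySet()) {
            if (e.getValue() != null) {
                tracker.put(e.getKey(), new TreeSet<>(e.getValue()));
            }
        }
        return tracker;
    }

    // The projection filter is only worth adding when something was tracked.
    static boolean needsProjectionFilter(Map<String, NavigableSet<String>> familyMap) {
        return !buildTracker(familyMap).isEmpty();
    }

    public static void main(String[] args) {
        // Wildcard projection: family present, qualifier set is null.
        Map<String, NavigableSet<String>> wildcard = new TreeMap<>();
        wildcard.put("cf", null);

        // select a, b from t where c = 5, with the empty column already added.
        Map<String, NavigableSet<String>> explicit = new TreeMap<>();
        explicit.put("cf", new TreeSet<>(Arrays.asList("a", "b", "c", "EMPTY")));

        System.out.println(needsProjectionFilter(wildcard));
        System.out.println(needsProjectionFilter(explicit));
    }
}
```

Because the tracker is built from the family map after the empty column has 
been added, the explicit case carries that column along without any extra 
handling.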

So you are saying that in the case of a mapped view, 
projector.isProjectEmptyKeyValue() will be false and I have to add the empty 
column to my column tracker? I do not fully understand how that helps.

So of the 3 concerns you raised above, 2 are not issues. Only the mapped view 
case is something I still have to work on.

> Add custom filter to more efficiently navigate KeyValues in row
> ---------------------------------------------------------------
>
>                 Key: PHOENIX-29
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-29
>             Project: Phoenix
>          Issue Type: Bug
>            Reporter: James Taylor
>         Attachments: PHOENIX-29.patch, PHOENIX-29_V2.patch
>
>
> Currently HBase is 50% faster at selecting the first KV in a row than in 
> selecting any other column. The reason is that when you project a column into 
> a Scan, HBase uses its ExplicitColumnTracker which does a reseek to the 
> column. The only case where this is not necessary is when the column is the 
> first one.
> In most cases (unless you have thousands of versions), it'd be more efficient 
> to just do a NEXT instead of a reseek (especially if your KV is the next 
> one). We can provide our own custom filter through which we pass two lists:
> 1) all KVs referenced in the select expressions. These are the only ones that 
> need to be returned back to the client which is another advantage we'd get 
> writing this custom filter.
> 2) all KVs referenced in the WHERE clause.
> The filter could sort the KVs using the standard KeyValue.COMPARATOR and 
> merge between them and the incoming KVs, using NEXT instead of a reseek. We 
> could potentially use a reseek if the number of columns in the table is 
> beyond a certain threshold.
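
The merge idea in the description can be sketched with plain Java collections. 
This is only an illustration under stated assumptions: column names are 
Strings compared in natural order rather than KeyValues compared with 
KeyValue.COMPARATOR, and the class and method names are hypothetical. Each 
loop iteration stands for one NEXT; no reseek is modelled.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

// Hypothetical sketch of merging sorted wanted columns with sorted incoming ones.
final class NextMergeSketch {

    // Walks the row's columns in order and keeps those present in 'wanted'.
    // Both inputs must be sorted with the same ordering.
    static List<String> project(List<String> incomingSorted, SortedSet<String> wanted) {
        List<String> kept = new ArrayList<>();
        Iterator<String> want = wanted.iterator();
        String target = want.hasNext() ? want.next() : null;
        for (String col : incomingSorted) {
            // Advance past wanted names that sort before the current column.
            while (target != null && target.compareTo(col) < 0) {
                target = want.hasNext() ? want.next() : null;
            }
            if (target == null) {
                break; // no wanted columns remain for this row
            }
            if (target.equals(col)) {
                kept.add(col); // include this column
            }
            // otherwise fall through to the next incoming column (a NEXT)
        }
        return kept;
    }

    public static void main(String[] args) {
        List<String> row = Arrays.asList("a", "b", "c", "d", "e");
        SortedSet<String> wanted = new TreeSet<>(Arrays.asList("b", "d"));
        System.out.println(project(row, wanted));
    }
}
```

The cost is one comparison per incoming column, which is why the description 
suggests falling back to a reseek only when the table has more columns than 
some threshold.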



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
