[jira] [Commented] (KUDU-1363) Add Multiple column range predicates for the same column in a single scan

Dan Burkert (JIRA) Wed, 13 Apr 2016 17:38:14 -0700

    [ https://issues.apache.org/jira/browse/KUDU-1363?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15240323#comment-15240323 ]


Dan Burkert commented on KUDU-1363:
-----------------------------------

Hi Sameer,

That's correct.  The basic steps for adding a new predicate type are:

1) On the C++ side, add the predicate type to the kudu::ColumnPredicate class in 
column_predicate.h/cc. This includes updating Merge and Evaluate to work with 
the new predicate type.  You will also need to update some of the scan 
optimization logic in scan_spec.cc and the partition pruning logic in 
partition_pruner.cc to account for the new predicate type.

2) Add the new predicate type to the ColumnPredicatePB message.

3) Expose the predicate type in the public API (KuduPredicate in 
scan_predicate.h)

4) Add the predicate type to the Java client in the KuduPredicate class.
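
As a rough illustration of what "Merge" and "Evaluate" in step 1 would have to 
mean for an IN-style predicate, here is a minimal standalone sketch. It is 
written in Java only for readability; the class InListPredicate and its methods 
are hypothetical, are not part of Kudu or its client API, and do not mirror the 
actual kudu::ColumnPredicate code. The assumption is that merging two 
predicates on the same column keeps their intersection, since predicates on a 
scan are ANDed together.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.SortedSet;
import java.util.TreeSet;

/** Hypothetical IN-list predicate on one string column; not Kudu code. */
final class InListPredicate {
    private final String columnName;
    private final SortedSet<String> values;

    InListPredicate(String columnName, Collection<String> values) {
        this.columnName = columnName;
        // Sort and de-duplicate the values once, up front.
        this.values = Collections.unmodifiableSortedSet(new TreeSet<>(values));
    }

    String columnName() { return columnName; }

    SortedSet<String> values() { return values; }

    /** Evaluate: true when the cell holds one of the listed values. */
    boolean evaluate(String cell) {
        return cell != null && values.contains(cell);
    }

    /** Merge: predicates on one column are ANDed, so keep the intersection. */
    InListPredicate merge(InListPredicate other) {
        if (!columnName.equals(other.columnName)) {
            throw new IllegalArgumentException("predicates must target the same column");
        }
        TreeSet<String> common = new TreeSet<>(values);
        common.retainAll(other.values);
        return new InListPredicate(columnName, common);
    }

    /** An empty list can never match, so a scan planner could skip the scan. */
    boolean matchesNothing() { return values.isEmpty(); }

    public static void main(String[] args) {
        InListPredicate ab = new InListPredicate("column_name", List.of("A", "B"));
        InListPredicate bc = new InListPredicate("column_name", List.of("B", "C"));
        InListPredicate merged = ab.merge(bc);
        System.out.println(merged.values());       // [B]
        System.out.println(merged.evaluate("B"));  // true
        System.out.println(merged.evaluate("A"));  // false
    }
}
```

The matchesNothing() check hints at the kind of short-circuit that scan 
optimization or partition pruning could apply; how that would fit into 
scan_spec.cc and partition_pruner.cc is not shown here.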

Note that if the goal is to have a multi-get API for retrieving multiple rows 
where you know the primary key for each, an IN predicate will be quite 
inefficient (it will require a full table scan).  We have discussed a multi-get 
API in the past, and would definitely be open to contributions on this problem 
as well.
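
To make the cost difference concrete, here is a toy sketch that contrasts 
filtering every row against a key list with looking each key up directly. The 
in-memory TreeMap stands in for a table sorted by primary key, and the class 
MultiGetSketch, its methods, and the "steps" counter are all made up for 
illustration; none of this is Kudu code or a proposed API.

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.TreeSet;

/** Toy comparison of a filtered full scan and a keyed multi-get; not Kudu code. */
final class MultiGetSketch {

    /** Matching row values plus a count of rows visited or keys probed. */
    static final class Result {
        final List<String> rows;
        final int steps;

        Result(List<String> rows, int steps) {
            this.rows = rows;
            this.steps = steps;
        }
    }

    /** IN-style filter: visits every row and tests its key for membership. */
    static Result scanWithInFilter(SortedMap<String, String> table, Set<String> keys) {
        List<String> rows = new ArrayList<>();
        int steps = 0;
        for (Map.Entry<String, String> entry : table.entrySet()) {
            steps++;
            if (keys.contains(entry.getKey())) {
                rows.add(entry.getValue());
            }
        }
        return new Result(rows, steps);
    }

    /** Multi-get: one lookup per requested key, in key order. */
    static Result multiGet(SortedMap<String, String> table, Collection<String> keys) {
        List<String> rows = new ArrayList<>();
        int steps = 0;
        for (String key : new TreeSet<>(keys)) {
            steps++;
            String value = table.get(key);
            if (value != null) {
                rows.add(value);
            }
        }
        return new Result(rows, steps);
    }

    public static void main(String[] args) {
        SortedMap<String, String> table = new TreeMap<>();
        for (int i = 0; i < 1000; i++) {
            table.put(String.format("k%04d", i), "v" + i);
        }
        Set<String> keys = Set.of("k0003", "k0500", "missing");

        Result scan = scanWithInFilter(table, keys);
        Result get = multiGet(table, keys);
        System.out.println(scan.rows + " after " + scan.steps + " steps");
        System.out.println(get.rows + " after " + get.steps + " steps");
    }
}
```

Both methods return the same rows; the filtered scan does work proportional to 
the table size, while the multi-get does work proportional to the number of 
keys requested.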

> Add Multiple column range predicates for the same column in a single scan
> -------------------------------------------------------------------------
>
>                 Key: KUDU-1363
>                 URL: https://issues.apache.org/jira/browse/KUDU-1363
>             Project: Kudu
>          Issue Type: New Feature
>            Reporter: Chris George
>
> Currently, adding multiple column range predicates for the same column 
> essentially ANDs the predicates together, which causes no results to be 
> returned. 
> Supporting this would greatly improve performance, since I could complete in 
> one scan what would otherwise take two.
> As an example using the Java API:
> ColumnRangePredicate columnRangePredicateColumnNameA = new ColumnRangePredicate(
>     new ColumnSchema.ColumnSchemaBuilder("column_name", Type.STRING).build());
> columnRangePredicateColumnNameA.setLowerBound("A");
> columnRangePredicateColumnNameA.setUpperBound("A");
> ColumnRangePredicate columnRangePredicateColumnNameB = new ColumnRangePredicate(
>     new ColumnSchema.ColumnSchemaBuilder("column_name", Type.STRING).build());
> columnRangePredicateColumnNameB.setLowerBound("B");
> columnRangePredicateColumnNameB.setUpperBound("B");
> which would be equivalent to:
> select * from some_table where column_name="A" or column_name="B"



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
