[ 
https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14536341#comment-14536341
 ] 

Piotr Kołaczkowski commented on CASSANDRA-8576:
-----------------------------------------------

Some comments were not addressed.
{noformat}
              boolean containToken;
                for (Range<Token> subrange : ranges)
                {
                    //make sure subrange contains the token
                    containToken = false;
                    if (token != null)
                    {
                        if (subrange.contains(token))
                            containToken = true;
                        else
                            continue;
                    }

                    ColumnFamilySplit split =
                            new ColumnFamilySplit(
                                    factory.toString(subrange.left),
                                    factory.toString(subrange.right),
                                    subSplit.getRow_count(),
                                    endpoints);

                    if (containToken)
                        split.setPartitionKeyEqQuery(containToken);
                    logger.debug("adding {}", split);
{noformat}
Multiple code smells in this fragment:
* The boolean flag is declared in a needlessly broad scope. If a variable is used only inside a loop, it should be declared inside the loop.
* The continue is controlled by a boolean flag.
* The if (containToken) check is redundant: the code is equivalent without it.

I simplified it for you:
{noformat}
                for (Range<Token> subrange : ranges)
                {
                    boolean containsToken = token != null && subrange.contains(token);
                    if (token == null || containsToken) {
                        ColumnFamilySplit split =
                            new ColumnFamilySplit(
                                factory.toString(subrange.left),
                                factory.toString(subrange.right),
                                subSplit.getRow_count(),
                                endpoints);
                        split.setPartitionKeyEqQuery(containsToken);
                        logger.debug("adding {}", split);
                        splits.add(split);
                    }
                }
{noformat}
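
To check that the two loops really behave the same, here is a minimal, self-contained sketch. SimpleRange and SimpleSplit are hypothetical stand-ins for Range<Token> and ColumnFamilySplit (they are not the Cassandra classes), tokens are modelled as plain longs, ranges are treated as (left, right] with no wrap-around, and the original loop is assumed to end with splits.add(split), which the quoted fragment cuts off before.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class SplitFilterSketch
{
    // Hypothetical stand-in for Range<Token>: (left, right], no wrap-around.
    static final class SimpleRange
    {
        final long left;
        final long right;

        SimpleRange(long left, long right)
        {
            this.left = left;
            this.right = right;
        }

        boolean contains(long token)
        {
            return token > left && token <= right;
        }
    }

    // Hypothetical stand-in for ColumnFamilySplit; the flag defaults to false.
    static final class SimpleSplit
    {
        final long left;
        final long right;
        boolean partitionKeyEqQuery;

        SimpleSplit(long left, long right)
        {
            this.left = left;
            this.right = right;
        }
    }

    // Shape of the original loop: flag declared outside, continue in the else branch.
    static List<SimpleSplit> originalLoop(List<SimpleRange> ranges, Long token)
    {
        List<SimpleSplit> splits = new ArrayList<>();
        boolean containToken;
        for (SimpleRange subrange : ranges)
        {
            containToken = false;
            if (token != null)
            {
                if (subrange.contains(token))
                    containToken = true;
                else
                    continue;
            }
            SimpleSplit split = new SimpleSplit(subrange.left, subrange.right);
            if (containToken)
                split.partitionKeyEqQuery = containToken;
            splits.add(split); // assumed: not visible in the quoted fragment
        }
        return splits;
    }

    // Shape of the simplified loop: flag scoped to the loop body, no continue.
    static List<SimpleSplit> simplifiedLoop(List<SimpleRange> ranges, Long token)
    {
        List<SimpleSplit> splits = new ArrayList<>();
        for (SimpleRange subrange : ranges)
        {
            boolean containsToken = token != null && subrange.contains(token);
            if (token == null || containsToken)
            {
                SimpleSplit split = new SimpleSplit(subrange.left, subrange.right);
                split.partitionKeyEqQuery = containsToken;
                splits.add(split);
            }
        }
        return splits;
    }

    // Renders splits as text so the two results can be compared.
    static String describe(List<SimpleSplit> splits)
    {
        StringBuilder sb = new StringBuilder();
        for (SimpleSplit s : splits)
            sb.append('(').append(s.left).append(',').append(s.right).append("]:")
              .append(s.partitionKeyEqQuery).append(';');
        return sb.toString();
    }

    public static void main(String[] args)
    {
        List<SimpleRange> ranges = Arrays.asList(new SimpleRange(0, 10), new SimpleRange(10, 20));
        for (Long token : new Long[] { null, 15L, 99L })
            System.out.println(token + ": "
                               + describe(originalLoop(ranges, token)).equals(describe(simplifiedLoop(ranges, token))));
    }
}
```

With no token, both versions keep every subrange and leave the flag false; with a token, both keep only the subrange containing it and set the flag to true.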





> Primary Key Pushdown For Hadoop
> -------------------------------
>
>                 Key: CASSANDRA-8576
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-8576
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Hadoop
>            Reporter: Russell Alexander Spitzer
>            Assignee: Alex Liu
>             Fix For: 2.1.x
>
>         Attachments: 8576-2.1-branch.txt, 8576-trunk.txt, 
> CASSANDRA-8576-v2-2.1-branch.txt
>
>
> I've heard reports from several users that they would like to have predicate 
> pushdown functionality for Hadoop (Hive in particular) based services. 
> Example use case:
> * Table with wide partitions, one per customer
> * Application team has HQL they would like to run on a single customer
> * Currently, time to complete scales with the number of customers, since the 
>   Input Format can't push down the primary key predicate
> * The current implementation requires a full table scan (since it can't 
>   recognize that a single partition was specified)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
