[
https://issues.apache.org/jira/browse/CASSANDRA-8576?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14492709#comment-14492709
]
Piotr Kołaczkowski commented on CASSANDRA-8576:
-----------------------------------------------
{noformat}
@@ -79,6 +90,7 @@ public class ColumnFamilySplit extends InputSplit implements Writable, org.apach
     {
         out.writeUTF(startToken);
         out.writeUTF(endToken);
+        out.writeBoolean(partitionKeyEqQuery);
         out.writeInt(dataNodes.length);
{noformat}
This is going to break mixed-version clusters. Hadoop tasks will fail in
hard-to-diagnose ways on a cluster with some nodes on 2.1.4 and some on 2.1.5.
It is very unfortunate that split serialization doesn't write a length or
version header first, which would let us detect the mismatch properly on the
clients. Are you sure we want to merge this feature in the middle of 2.1.x?
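To illustrate what a version header would buy us, here is a minimal sketch of a split format that writes a version number first. This is not the current ColumnFamilySplit code: the class VersionedSplitSketch, its nested Split type, and the VERSION_1/VERSION_2 constants are hypothetical names, and plain java.io streams stand in for the Hadoop Writable plumbing. The reader can then skip the new field for old payloads and fail with a clear message on a version it does not know.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical sketch: none of these names exist in Cassandra.
class VersionedSplitSketch {
    static final int VERSION_1 = 1;               // layout without partitionKeyEqQuery
    static final int VERSION_2 = 2;               // layout with partitionKeyEqQuery
    static final int CURRENT_VERSION = VERSION_2;

    static final class Split {
        final String startToken;
        final String endToken;
        final boolean partitionKeyEqQuery;
        final String[] dataNodes;

        Split(String startToken, String endToken, boolean partitionKeyEqQuery, String[] dataNodes) {
            this.startToken = startToken;
            this.endToken = endToken;
            this.partitionKeyEqQuery = partitionKeyEqQuery;
            this.dataNodes = dataNodes;
        }
    }

    // Writes the version header first, then only the fields that version defines.
    static byte[] serialize(Split split, int version) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            DataOutputStream out = new DataOutputStream(bytes);
            out.writeInt(version);
            out.writeUTF(split.startToken);
            out.writeUTF(split.endToken);
            if (version >= VERSION_2) {
                out.writeBoolean(split.partitionKeyEqQuery);
            }
            out.writeInt(split.dataNodes.length);
            for (String node : split.dataNodes) {
                out.writeUTF(node);
            }
            out.flush();
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Reads the header and rejects unknown versions instead of misparsing the payload.
    static Split deserialize(byte[] data) {
        try {
            DataInputStream in = new DataInputStream(new ByteArrayInputStream(data));
            int version = in.readInt();
            if (version < VERSION_1 || version > CURRENT_VERSION) {
                throw new IllegalStateException("Unsupported split version: " + version);
            }
            String startToken = in.readUTF();
            String endToken = in.readUTF();
            // Older writers never set the flag, so default it to false.
            boolean partitionKeyEqQuery = version >= VERSION_2 && in.readBoolean();
            String[] dataNodes = new String[in.readInt()];
            for (int i = 0; i < dataNodes.length; i++) {
                dataNodes[i] = in.readUTF();
            }
            return new Split(startToken, endToken, partitionKeyEqQuery, dataNodes);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

Note that this only helps going forward: since the existing format has no header at all, introducing one is itself an incompatible change for splits exchanged with older nodes.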
> Primary Key Pushdown For Hadoop
> -------------------------------
>
> Key: CASSANDRA-8576
> URL: https://issues.apache.org/jira/browse/CASSANDRA-8576
> Project: Cassandra
> Issue Type: Improvement
> Components: Hadoop
> Reporter: Russell Alexander Spitzer
> Assignee: Alex Liu
> Fix For: 2.1.5
>
> Attachments: 8576-2.1-branch.txt, 8576-trunk.txt
>
>
> I've heard reports from several users that they would like to have predicate
> pushdown functionality for Hadoop-based services (Hive in particular).
> Example use case:
> - Table with wide partitions, one per customer
> - Application team has HQL they would like to run on a single customer
> - Currently, time to complete scales with the number of customers, since the
>   input format can't push down the primary key predicate
> - The current implementation requires a full table scan (since it can't
>   recognize that a single partition was specified)
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)