Too many splits for ColumnFamily with only a few rows
-----------------------------------------------------
Key: CASSANDRA-1050
URL: https://issues.apache.org/jira/browse/CASSANDRA-1050
Project: Cassandra
Issue Type: Bug
Components: Hadoop
Affects Versions: 0.6
Reporter: Joost Ouwerkerk
Fix For: 0.6.2
ColumnFamilyInputFormat creates splits for the entire Keyspace rather than for
the ColumnFamily being read. If one ColumnFamily has 100 million rows and
another has only 100 rows, the number of splits will be 1,526 (assuming 64k
rows per split) for either one, since it is based on the total number of unique
keys across the whole keyspace, and not on the number of rows in the
ColumnFamily. The 100-row ColumnFamily should need only a single split.
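
The arithmetic behind the 1,526 figure can be sketched as follows. This is only
an illustration, not the ColumnFamilyInputFormat code: the helper name
split_count is hypothetical, "64k" is taken to mean 65,536 rows, and the two
ColumnFamilies are assumed to share no row keys.

```python
import math

# Assumption: "64k rows per split" means 64 * 1024 = 65,536 rows.
ROWS_PER_SPLIT = 64 * 1024


def split_count(row_count, rows_per_split=ROWS_PER_SPLIT):
    """Hypothetical helper: number of splits needed to cover row_count rows."""
    return max(1, math.ceil(row_count / rows_per_split))


large_cf_rows = 100_000_000  # the 100 million row ColumnFamily
small_cf_rows = 100          # the 100 row ColumnFamily

# Reported behaviour: the count is derived from all unique keys in the
# keyspace (assumed disjoint between the two ColumnFamilies).
keyspace_wide = split_count(large_cf_rows + small_cf_rows)

# Expected behaviour: the count is derived from each ColumnFamily's own rows.
per_cf_large = split_count(large_cf_rows)
per_cf_small = split_count(small_cf_rows)

print("keyspace-wide:", keyspace_wide)
print("large ColumnFamily only:", per_cf_large)
print("small ColumnFamily only:", per_cf_small)
```

Under these assumptions the keyspace-wide count is 1,526, while a count based
on the small ColumnFamily alone would be 1, so a job over that ColumnFamily
gets 1,525 more splits than it needs.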