[ https://issues.apache.org/jira/browse/CASSANDRA-3777?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis resolved CASSANDRA-3777.
---------------------------------------

    Resolution: Not A Problem

See http://wiki.apache.org/cassandra/FAQ#range_ghosts and CASSANDRA-3982
                
> get_range_slices() always returns list of KeySlice containing all available 
> rows even if column size is empty
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: CASSANDRA-3777
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-3777
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: Core
>    Affects Versions: 1.0.7
>         Environment: Debian Squeeze
>            Reporter: bert Passek
>
> Hi,
> we are using Cassandra to store data in super column families with a date as 
> their name. We would like to iterate over only those keys that contain data 
> matching a given slice range (e.g. a certain day). However, 
> get_range_slices() always returns all rows, including those for which 
> getColumnsSize() on the KeySlice is 0.
> In combination with Hadoop we use ColumnFamilyInputFormat, which currently 
> only supports SliceRanges. In our setup we might have billions of rows within 
> a column family. Even when a slice range is set, we always have to iterate 
> over all row keys, which in my opinion doesn't make sense.
> Let's have a look at a very simple example:
>         Cassandra.Client client = ConfigHelper.createConnection("localhost", 9160, true);
>         client.set_keyspace("Foo");
>
>         // Select only columns whose names start with "I@1327273200".
>         SlicePredicate predicate = new SlicePredicate();
>         SliceRange sliceRange = new SliceRange();
>         sliceRange.start = Util.bb("I@1327273200");
>         sliceRange.finish = Util.bb("I@1327273200~");
>         predicate.slice_range = sliceRange;
>
>         // Empty start and end keys request the full range of row keys.
>         KeyRange keyRange = new KeyRange();
>         keyRange.start_key = Util.bb("");
>         keyRange.end_key = Util.bb("");
>         List<KeySlice> rows = client.get_range_slices(new ColumnParent("Bar"), predicate,
>                 keyRange, ConsistencyLevel.ONE);
>
>         // Print every returned key with the number of columns that matched.
>         for (KeySlice slice : rows)
>         {
>             System.out.println("key: " + new String(slice.getKey()) + ", columns: " + slice.getColumnsSize());
>         }
> This is the output:
> key: I@1327359600@14@2074@478@32798@80445@2011@138@205@4320@0, columns: 0
> key: I@1327273200@12@1151@139@801@1728@2033@138@219@4476@0, columns: 1
> key: I@1327359600@14@2055@359@1032@2078@2011@138@205@4320@0, columns: 0
> key: I@1327359600@14@1151@139@801@1728@2011@138@205@4320@0, columns: 0
> key: I@1327273200@12@2074@478@32798@80445@2033@138@219@4476@0, columns: 1
> key: I@1327273200@12@2055@359@1032@2079@2033@138@219@4476@0, columns: 1
> Searching by slice range works fine, but row keys that do not match the 
> given slice range are still part of the result list. We filter out such 
> key slices by checking their column count, but it would make more sense to 
> get only the keys we are looking for (those with a column count > 0).
> ColumnFamilyRecordReader creates sorted maps from the result list, which means 
> creating billions of maps and passing them to the mapper, only for them to be 
> thrown away because they are empty.
> The question is: is there a way, using slice ranges, to get only those 
> key slices which match the given slice range? Or is there a reason why the 
> behaviour is as described above?
> Best Regards
> Bert Passek
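
The client-side filtering described in the report (dropping key slices whose column count is 0) could look roughly like the sketch below. This is an illustration only, not Cassandra code: `Row` is a hypothetical stand-in for the Thrift `KeySlice`, `EmptyRowFilter` and `nonEmpty` are made-up names, and the keys and column name are shortened sample values chosen so that the example runs without a cluster.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

final class EmptyRowFilter {
    // Hypothetical stand-in for Thrift's KeySlice: a row key plus the
    // columns that matched the slice predicate (possibly none).
    static final class Row {
        final String key;
        final List<String> columns;

        Row(String key, List<String> columns) {
            this.key = key;
            this.columns = columns;
        }
    }

    // Keep only the rows that have at least one column in the slice range.
    static List<Row> nonEmpty(List<Row> rows) {
        List<Row> result = new ArrayList<>();
        for (Row row : rows) {
            if (!row.columns.isEmpty()) {
                result.add(row);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Shortened, made-up sample data: one empty row, one matching row.
        List<Row> rows = Arrays.asList(
            new Row("I@1327359600@14", Collections.<String>emptyList()),
            new Row("I@1327273200@12", Arrays.asList("I@1327273200@a")));

        for (Row row : nonEmpty(rows)) {
            System.out.println("key: " + row.key + ", columns: " + row.columns.size());
        }
    }
}
```

Note that a filter like this only hides the empty rows from the caller. Every row key is still returned by the range query and still has to be iterated once, which is the cost the report is concerned about.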

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
