[ https://issues.apache.org/jira/browse/CASSANDRA-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jeremy Hanna updated CASSANDRA-2855:
------------------------------------

    Attachment: 2855-v2.txt

v2 is tested to skip results with no columns and tombstones. Also fixed a case where an exception would occur because lastRow looked at the altered set of rows.

> Skip rows with empty columns when slicing entire row
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2855
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Jeremy Hanna
>            Assignee: Jeremy Hanna
>            Priority: Minor
>              Labels: hadoop
>             Fix For: 0.7.9, 0.8.3
>
>         Attachments: 2855-v2.txt
>
>
> We have been finding that range ghosts appear in results from Hadoop via Pig.
> This could also happen if rows don't have data for the slice predicate that
> is given. This leads to a painful amount of defensive checking on the Pig
> side, especially in the case of range ghosts.
> We would like to add an option to skip rows that have no column values in
> them. That functionality existed before in core Cassandra but was removed
> because of the performance penalty of that checking. However, the Hadoop
> support in the RecordReader is batch oriented anyway, so individual row
> reading performance isn't as much of an issue. We would also make it an
> optional config parameter for each job, so people wouldn't have to incur
> that penalty if they are confident that there won't be any empty rows or
> they don't care.
> It could be a parameter, cassandra.skip.empty.rows, set to true/false.
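To make the proposed behavior concrete, here is a minimal sketch of the filtering logic in Python. It is an illustration only and not the attached patch: the job configuration is modeled as a plain dict, the row shape and the function names (skip_empty_rows_enabled, read_batch) are hypothetical, and only the property name cassandra.skip.empty.rows comes from the description above. It also shows the point from the v2 comment: the key used to resume paging is taken from the unfiltered batch, so dropping empty rows cannot disturb it.

```python
from typing import Dict, List, Optional, Tuple

# Property name suggested in the issue description; how it is actually read
# from the Hadoop job configuration is an assumption here.
SKIP_EMPTY_ROWS_KEY = "cassandra.skip.empty.rows"

# Hypothetical row shape: (row key, [(column name, value), ...]).
Row = Tuple[str, List[Tuple[str, str]]]


def skip_empty_rows_enabled(conf: Dict[str, str]) -> bool:
    """Read the true/false option; it defaults to false so the check is opt-in."""
    return conf.get(SKIP_EMPTY_ROWS_KEY, "false").strip().lower() == "true"


def read_batch(batch: List[Row], conf: Dict[str, str]) -> Tuple[Optional[str], List[Row]]:
    """Return (last key of the unfiltered batch, rows to hand to the job)."""
    # Take the paging cursor from the original batch, before any filtering.
    last_key = batch[-1][0] if batch else None
    if skip_empty_rows_enabled(conf):
        # Drop range ghosts and rows with no data for the slice predicate.
        rows = [(key, columns) for key, columns in batch if columns]
    else:
        rows = list(batch)
    return last_key, rows
```

With the option set to "true", a batch such as [("a", [("c1", "v1")]), ("ghost", [])] yields only row "a", while the resume key is still "ghost". With the option unset, both rows are passed through unchanged.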