[ 
https://issues.apache.org/jira/browse/CASSANDRA-2855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13072557#comment-13072557
 ] 

Jeremy Hanna edited comment on CASSANDRA-2855 at 7/28/11 10:16 PM:
-------------------------------------------------------------------

Added a configuration property cassandra.skip.empty.results which defaults to 
false.  We can't skip only completely empty rows, because the result of a 
slice predicate gives no way to tell whether the complete row is empty.

      was (Author: jeromatron):
    Added a configuration property cassandra.skip.empty.results which defaults 
to false.  We can't skip just complete empty rows because there is no way to 
tell if the complete row is empty based on a result that is a slice predicate.
  
> Skip rows with empty columns when slicing entire row
> ----------------------------------------------------
>
>                 Key: CASSANDRA-2855
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-2855
>             Project: Cassandra
>          Issue Type: Improvement
>          Components: API
>            Reporter: Jeremy Hanna
>            Assignee: Jeremy Hanna
>            Priority: Minor
>              Labels: hadoop
>             Fix For: 0.7.9, 0.8.3
>
>         Attachments: 2855-v2.txt, 2855-v3.txt
>
>
> We have been finding that range ghosts appear in results from Hadoop via Pig. 
>  Empty rows can also appear in results when rows don't have data for the 
> slice predicate that is given.  This forces a painful amount of defensive 
> checking on the Pig side, especially in the case of range ghosts.
> We would like to add an option to skip rows that have no column values in 
> them.  That functionality used to exist in core Cassandra but was removed 
> because of the performance penalty of that check.  However, the Hadoop 
> support in the RecordReader is batch oriented anyway, so individual row-read 
> performance isn't as much of an issue.  We would also make it an optional 
> config parameter for each job, so people wouldn't have to incur that penalty 
> if they are confident that there won't be any empty rows or they don't care.
> The parameter could be cassandra.skip.empty.rows, with values true/false.
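
For illustration only, here is a minimal sketch of the skipping behavior described above. It is not the attached patch: the class name, the helper methods, and the use of java.util.Properties as a stand-in for the job configuration are all assumptions. Only the property name cassandra.skip.empty.results and its default of false come from the comment. Note that a range ghost and a live row with no columns matching the slice look identical here, which is why the check can only be "no columns returned".

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;

// Hypothetical sketch; not the code from 2855-v3.txt.
class SkipEmptyRowsSketch {
    // Property name from the comment; how the real patch reads it is not shown here.
    static final String SKIP_EMPTY_KEY = "cassandra.skip.empty.results";

    // Returns true only if the job config sets the flag to "true" (default false).
    static boolean skipEmpty(Properties jobConf) {
        return Boolean.parseBoolean(jobConf.getProperty(SKIP_EMPTY_KEY, "false"));
    }

    // rows maps a row key to the columns returned for the slice predicate.
    // When skipEmpty is true, rows with no returned columns are dropped.
    static Map<String, List<String>> filter(Map<String, List<String>> rows,
                                            boolean skipEmpty) {
        Map<String, List<String>> out = new LinkedHashMap<>();
        for (Map.Entry<String, List<String>> e : rows.entrySet()) {
            if (skipEmpty && e.getValue().isEmpty()) {
                continue; // range ghost, or no data for this slice
            }
            out.put(e.getKey(), e.getValue());
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, List<String>> rows = new LinkedHashMap<>();
        rows.put("row1", Arrays.asList("name", "age"));
        rows.put("ghost", new ArrayList<String>()); // no columns returned
        rows.put("row3", Arrays.asList("email"));

        Properties jobConf = new Properties();
        jobConf.setProperty(SKIP_EMPTY_KEY, "true");

        // Prints [row1, row3]
        System.out.println(filter(rows, skipEmpty(jobConf)).keySet());
    }
}
```

With the flag left unset, filter returns all three rows, so jobs that don't opt in see no change in behavior.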

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
