[ 
https://issues.apache.org/jira/browse/CASSANDRA-4789?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13474171#comment-13474171
 ] 

Will Oberman commented on CASSANDRA-4789:
-----------------------------------------

The problem definitely occurs when widerows=true is used with keys that map 
to only one column.  Here is my patch against 1.1.5 (I also have a one-line 
patch to fix the wide rows bug):

112a113
>     private ByteBuffer lastKey;
116d116
< 
153a154
>                           //check key == lastKey?
159a161
>                       lastKey = null;
177a180
>                   lastKey = (ByteBuffer)reader.getCurrentKey();
185a189,200
>                   if(lastKey != null && !(key.equals(lastKey))) // last key only had one value
>                   {
>                       tuple.append(new DataByteArray(lastKey.array(), lastKey.position()+lastKey.arrayOffset(), lastKey.limit()+lastKey.arrayOffset()));
>                       for (Map.Entry<ByteBuffer, IColumn> entry : lastRow.entrySet())
>                       {
>                           bag.add(columnToTuple(entry.getValue(), cfDef, parseType(cfDef.getComparator_type())));
>                       }
>                       tuple.append(bag);
>                       lastKey = key;
>                       lastRow = (SortedMap<ByteBuffer,IColumn>)reader.getCurrentValue();
>                       return tuple;
>                   }
194a210
>                   lastKey = null;
551c567
<             widerows = Boolean.valueOf(System.getProperty(PIG_WIDEROW_INPUT));
---
>             widerows = Boolean.valueOf(System.getenv(PIG_WIDEROW_INPUT));
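
The last hunk swaps a JVM system property lookup for an environment variable 
lookup.  As a rough illustration of the difference (not part of the patch), 
the sketch below checks the environment first and falls back to the system 
property.  The class WideRowFlag and its methods are hypothetical, and the 
string value of PIG_WIDEROW_INPUT is assumed to match the constant's name.

```java
import java.util.Map;
import java.util.Properties;

// Hypothetical helper, not part of CassandraStorage.
final class WideRowFlag {
    // Assumption: the constant's value matches its name.
    static final String PIG_WIDEROW_INPUT = "PIG_WIDEROW_INPUT";

    // Environment variable wins; the -D system property is only a fallback.
    public static boolean isEnabled(Map<String, String> env, Properties props) {
        String value = env.get(PIG_WIDEROW_INPUT);
        if (value == null) {
            value = props.getProperty(PIG_WIDEROW_INPUT);
        }
        // parseBoolean(null) is false, so an unset flag means "disabled".
        return Boolean.parseBoolean(value);
    }

    // Convenience overload reading the real process environment and JVM properties.
    public static boolean isEnabled() {
        return isEnabled(System.getenv(), System.getProperties());
    }
}
```

With this shape, either "export PIG_WIDEROW_INPUT=true" in the shell or 
"-DPIG_WIDEROW_INPUT=true" on the JVM command line would turn the flag on.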
                
> CassandraStorage.getNextWide produces corrupt data
> --------------------------------------------------
>
>                 Key: CASSANDRA-4789
>                 URL: https://issues.apache.org/jira/browse/CASSANDRA-4789
>             Project: Cassandra
>          Issue Type: Bug
>          Components: Hadoop
>    Affects Versions: 1.1.5
>            Reporter: Will Oberman
>            Assignee: Brandon Williams
>
> This took me a while to track down.  I'm seeing the problem in the "key 
> changes" case.  The intended behavior (as far as I can tell) when the key 
> changes is that the method returns the current tuple and picks up where it 
> left off on the next call to getNextWide().  The problem I'm seeing is that 
> sometimes the current key advances between method calls and sometimes it 
> does not.  Not advancing is the correct behavior, since the code saves the 
> value into an instance variable; when the key does advance there is a 
> key/value mismatch (the values for two different keys end up glued 
> together).  I think the problem might be related to keys that have only a 
> single column.  I'm still trying to track that down to help solve this 
> case...
> Maybe this will be clearer if I paste some of the logging I added to the 
> class.  The log messages are fairly self-documenting (I hope):
> ...lots of previous logging...
> enter getNextWide
> hasNext = true
> set key = dVNhbXAxMzQ3ODM1OA%3D%3D
> lastRow != null
> added 1 items to bag from lastRow
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> key changed, new key = 669392df09572d0045b964bc65f86a2c
> exit getNextWide
> enter getNextWide
> hasNext = true
> //!!!THIS IS THE PROBLEM HERE I THINK!!!
> //!!!Usually the key here == key before "exit getNextWide"!!!
> set key = 5f900ee4bb1850f8cf387cc3d5fc23ca
> //!!! lastRow is data for 669392df09572d0045b964bc65f86a2c !!! 
> //!!! but it's being added to key 5f900ee4bb1850f8cf387cc3d5fc23ca !!!
> lastRow != null
> added 1 items to bag from lastRow
> //!!! Here are the real values for 5f900ee4bb1850f8cf387cc3d5fc23ca !!!
> added 1 items to bag from row
> hasNext = true
> added 1 items to bag from row
> hasNext = true
> key changed, new key = 50438549-cdb6-8c44-f93a-d18d7daeffd8
> exit getNextWide
> enter getNextWide
> hasNext = true
> set key = 50438549-cdb6-8c44-f93a-d18d7daeffd8
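
The mismatch in the log can be modeled with a much simplified sketch: when a 
call ends on a key change, the key that owns the carried-over data is saved 
together with that data, so the next call emits it under its own key even if 
that key had only one column.  This is not the CassandraStorage code; the 
class WideRowGrouper is hypothetical, and plain strings stand in for the 
ByteBuffer keys and IColumn values.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified model of a "one tuple per key" reader loop.
final class WideRowGrouper {
    private final Iterator<String[]> reader; // each element is {key, column}
    private String lastKey;    // key that owns the carried-over column
    private String lastColumn; // column read past the previous key boundary

    public WideRowGrouper(List<String[]> input) {
        this.reader = input.iterator();
    }

    // Returns [key, column, column, ...] for the next key, or null when done.
    public List<String> next() {
        List<String> tuple = new ArrayList<String>();
        String current = lastKey;
        if (lastKey != null) {
            // Emit the carried-over column under the key it was read with.
            tuple.add(lastKey);
            tuple.add(lastColumn);
            lastKey = null;
            lastColumn = null;
        }
        while (reader.hasNext()) {
            String[] entry = reader.next();
            if (current == null) {
                current = entry[0];
                tuple.add(current);
            }
            if (!entry[0].equals(current)) {
                // Key changed: remember both the key and the data, then return.
                lastKey = entry[0];
                lastColumn = entry[1];
                return tuple;
            }
            tuple.add(entry[1]);
        }
        return tuple.isEmpty() ? null : tuple;
    }
}
```

Feeding it (a,1), (a,2), (b,1), (c,1), (c,2) yields one tuple per key, with 
the single-column key b returned on its own rather than merged into c.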

