[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value

2013-04-25 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5504:
--

Attachment: 5504-v3.txt

Thanks for the patch, Oleksandr.

It looks to me like the root of the problem is that 
{{key.put(this.getCurrentKey())}} destructively modifies currentKey.  Attached 
is a patch to duplicate the buffer first.

This has the added benefit that we don't have to impose any overhead on the new 
mapreduce api to solve this problem in the old mapred one.
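
To illustrate the point above, here is a minimal, self-contained sketch of why a plain `put` is destructive and why duplicating first avoids it. It uses only `java.nio.ByteBuffer`. The class and method names are hypothetical and are not taken from the attached patch or from ColumnFamilyRecordReader.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; not Cassandra code.
class DuplicateBeforePut {
    /** Copies src into dst directly. This advances src's position to its limit. */
    static void copyDestructively(ByteBuffer dst, ByteBuffer src) {
        dst.clear();
        dst.put(src);
        dst.flip();
    }

    /** Copies a duplicate of src into dst. src's own position and limit are untouched. */
    static void copyViaDuplicate(ByteBuffer dst, ByteBuffer src) {
        dst.clear();
        dst.put(src.duplicate());
        dst.flip();
    }

    public static void main(String[] args) {
        ByteBuffer currentKey = ByteBuffer.wrap("row-42".getBytes(StandardCharsets.UTF_8));
        ByteBuffer key = ByteBuffer.allocate(16);

        copyViaDuplicate(key, currentKey);
        System.out.println("after duplicate + put: remaining = " + currentKey.remaining()); // 6

        copyDestructively(key, currentKey);
        System.out.println("after plain put:       remaining = " + currentKey.remaining()); // 0
    }
}
```

`duplicate()` shares the underlying bytes but has its own position and limit, so the copy costs no extra allocation of key data.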

 Eternal iteration when using newer hadoop version due to next() call and 
 empty key value
 

 Key: CASSANDRA-5504
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Oleksandr Petrov
Priority: Critical
 Attachments: 5504-v3.txt, patch2.diff, patch.diff


 With newer Hadoop versions, the call to 
 next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value)
 in ColumnFamilyRecordReader calls `key.clear();`, which empties the key. 
 As a result, StaticRowIterator and WideRowIterator misbehave: by the time 
 Iterables.getLast(rows).key is read, the key is already empty, so Hadoop 
 requests the same range over and over.
 The attached patch/diff adds a lastRowKey (ByteBuffer) field and saves it, 
 along with the rows, for the next iteration, so that the query for the 
 next range starts from the correct key.
 The patch is based on the 1.2.3 release.
 Tested against Cassandra 1.2.3 with Hadoop 1.0.3, 1.0.4 and 0.20.2.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value

2013-04-25 Thread Jonathan Ellis (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Jonathan Ellis updated CASSANDRA-5504:
--

         Priority: Minor  (was: Critical)
Affects Version/s: 1.2.0  (was: 1.2.3)
    Fix Version/s: 1.2.5

While investigating whether this was also a problem in 1.1, I found that this 
was fixed for 1.1.7 in CASSANDRA-4834, with the same .duplicate() solution, but 
not merged forward.  I've applied this fix to the 1.2 branch.

 Eternal iteration when using newer hadoop version due to next() call and 
 empty key value
 

 Key: CASSANDRA-5504
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.0
Reporter: Oleksandr Petrov
Priority: Minor
 Fix For: 1.2.5

 Attachments: 5504-v3.txt, patch2.diff, patch.diff





[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value

2013-04-22 Thread Oleksandr Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Petrov updated CASSANDRA-5504:


Description: 
With newer Hadoop versions, the call to 

next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value)

in ColumnFamilyRecordReader calls `key.clear();`, which empties the key. As a 
result, StaticRowIterator and WideRowIterator misbehave: by the time 
Iterables.getLast(rows).key is read, the key is already empty, so Hadoop 
requests the same range over and over.

The attached patch/diff adds a lastRowKey (ByteBuffer) field and saves it, 
along with the rows, for the next iteration, so that the query for the next 
range starts from the correct key.

The patch is based on the 1.2.3 release.

Tested against Cassandra 1.2.3 with Hadoop 1.0.3, 1.0.4 and 0.20.2.
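
The idea of saving the last row key separately can be sketched as a toy pager. This is only an illustration under stated assumptions: `LastRowKeyPager`, `nextBatch`, `sortedKeys` and the in-memory key list are hypothetical stand-ins, not the code in the attached patch or in ColumnFamilyRecordReader.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical toy pager; not Cassandra code.
class LastRowKeyPager {
    private final List<ByteBuffer> sortedKeys; // stand-in for the sorted row keys of one split
    private final int batchSize;
    private ByteBuffer lastRowKey;             // saved when a batch is fetched, never handed out

    LastRowKeyPager(List<ByteBuffer> sortedKeys, int batchSize) {
        this.sortedKeys = sortedKeys;
        this.batchSize = batchSize;
    }

    /** Returns the next batch of keys, or an empty list when the split is exhausted. */
    List<ByteBuffer> nextBatch() {
        List<ByteBuffer> batch = new ArrayList<>();
        for (ByteBuffer k : sortedKeys) {
            if (batch.size() == batchSize) break;
            // Start strictly after the saved key; on the first call there is none.
            if (lastRowKey == null || k.compareTo(lastRowKey) > 0) {
                batch.add(k.duplicate());
            }
        }
        if (!batch.isEmpty()) {
            // Save an independent view now, before any caller can consume the batch.
            lastRowKey = batch.get(batch.size() - 1).duplicate();
        }
        return batch;
    }

    static ByteBuffer key(String s) {
        return ByteBuffer.wrap(s.getBytes(StandardCharsets.UTF_8));
    }

    public static void main(String[] args) {
        LastRowKeyPager pager = new LastRowKeyPager(Arrays.asList(key("a"), key("b"), key("c")), 2);
        int batches = 0;
        List<ByteBuffer> batch;
        while (!(batch = pager.nextBatch()).isEmpty()) {
            // Simulate a consumer that drains every key it is given.
            for (ByteBuffer k : batch) k.position(k.limit());
            batches++;
        }
        System.out.println("batches fetched: " + batches); // 2
    }
}
```

If `lastRowKey` were instead the same buffer object handed to the consumer, draining it would leave it empty. An empty buffer compares lower than every non-empty key, so each call would return the first batch again, which is the eternal iteration described in this issue.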

 Eternal iteration when using newer hadoop version due to next() call and 
 empty key value
 

 Key: CASSANDRA-5504
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Reporter: Oleksandr Petrov
Priority: Critical




[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value

2013-04-22 Thread Oleksandr Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Petrov updated CASSANDRA-5504:


Attachment: patch.diff

 Eternal iteration when using newer hadoop version due to next() call and 
 empty key value
 

 Key: CASSANDRA-5504
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Oleksandr Petrov
Priority: Critical
 Attachments: patch.diff





[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value

2013-04-22 Thread Oleksandr Petrov (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Oleksandr Petrov updated CASSANDRA-5504:


Attachment: patch2.diff

 Eternal iteration when using newer hadoop version due to next() call and 
 empty key value
 

 Key: CASSANDRA-5504
 URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
 Project: Cassandra
  Issue Type: Bug
  Components: Hadoop
Affects Versions: 1.2.3
Reporter: Oleksandr Petrov
Priority: Critical
 Attachments: patch2.diff, patch.diff


