[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5504:
--------------------------------------

    Attachment: 5504-v3.txt

Thanks for the patch, Oleksandr.

It looks to me like the root of the problem is that {{key.put(this.getCurrentKey())}} destructively modifies currentKey. Attached is a patch to duplicate the buffer first.

This has the added benefit that we don't have to impose any overhead on the new mapreduce api to solve this problem in the old mapred one.

Eternal iteration when using newer hadoop version due to next() call and empty key value

    Key: CASSANDRA-5504
    URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
    Project: Cassandra
    Issue Type: Bug
    Components: Hadoop
    Affects Versions: 1.2.3
    Reporter: Oleksandr Petrov
    Priority: Critical
    Attachments: 5504-v3.txt, patch2.diff, patch.diff

Currently, when using newer hadoop versions, due to the call to next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value) within ColumnFamilyRecordReader, because `key.clear();` is called, key is emptied. That causes the StaticRowIterator and WideRowIterator to glitch, namely, when Iterables.getLast(rows).key is called, key is already empty. This will cause Hadoop to request the same range again and again all the time.

Please see the attached patch/diff, it simply adds lastRowKey (ByteBuffer) and saves it for the next iteration along with all the rows, this allows query for the next range to be fully correct.

This patch is branched from 1.2.3 version.

Tested against Cassandra 1.2.3, with Hadoop 1.0.3, 1.0.4 and 0.20.2

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
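The "destructively modifies" point in the comment above can be reproduced with plain java.nio, independent of Cassandra. The following is a minimal sketch, not the attached patch: the class and method names (KeyCopyDemo, copyDestructive, copyWithDuplicate) and the sample key are made up for illustration. It relies only on the documented behaviour that ByteBuffer.put(ByteBuffer) advances the source buffer's position, while duplicate() returns a view with its own position over the same bytes.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

class KeyCopyDemo {
    // Copies src into dst by passing src directly to put().
    // put(ByteBuffer) advances src's position to its limit, so src
    // looks empty (remaining() == 0) to whoever reads it next.
    static void copyDestructive(ByteBuffer dst, ByteBuffer src) {
        dst.clear();
        dst.put(src);
        dst.rewind();
    }

    // Same copy, but through a duplicate: the duplicate shares the
    // bytes and has an independent position, so src is left untouched.
    static void copyWithDuplicate(ByteBuffer dst, ByteBuffer src) {
        dst.clear();
        dst.put(src.duplicate());
        dst.rewind();
    }

    public static void main(String[] args) {
        // Hypothetical row key, standing in for the reader's current key.
        ByteBuffer currentKey = ByteBuffer.wrap("row-42".getBytes(StandardCharsets.UTF_8));
        ByteBuffer callerKey = ByteBuffer.allocate(16);

        copyWithDuplicate(callerKey, currentKey);
        System.out.println(currentKey.remaining()); // prints 6

        copyDestructive(callerKey, currentKey);
        System.out.println(currentKey.remaining()); // prints 0
    }
}
```

The duplicate shares the underlying bytes, so no key data is copied twice; only the position, limit and mark are independent.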
[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jonathan Ellis updated CASSANDRA-5504:
--------------------------------------

    Priority: Minor (was: Critical)
    Affects Version/s: 1.2.0 (was: 1.2.3)
    Fix Version/s: 1.2.5

While investigating whether this was also a problem in 1.1, I found that this was fixed for 1.1.7 in CASSANDRA-4834, with the same .duplicate() solution, but not merged forward. I've applied this fix to the 1.2 branch.

Eternal iteration when using newer hadoop version due to next() call and empty key value

    Key: CASSANDRA-5504
    URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
    Project: Cassandra
    Issue Type: Bug
    Components: Hadoop
    Affects Versions: 1.2.0
    Reporter: Oleksandr Petrov
    Priority: Minor
    Fix For: 1.2.5
    Attachments: 5504-v3.txt, patch2.diff, patch.diff
[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksandr Petrov updated CASSANDRA-5504:
----------------------------------------

    Description:

Currently, when using newer hadoop versions, due to the call to next(ByteBuffer key, SortedMap<ByteBuffer, IColumn> value) within ColumnFamilyRecordReader, because `key.clear();` is called, key is emptied. That causes the StaticRowIterator and WideRowIterator to glitch, namely, when Iterables.getLast(rows).key is called, key is already empty. This will cause Hadoop to request the same range again and again all the time.

Please see the attached patch/diff, it simply adds lastRowKey (ByteBuffer) and saves it for the next iteration along with all the rows, this allows query for the next range to be fully correct.

This patch is branched from 1.2.3 version.

Tested against Cassandra 1.2.3, with Hadoop 1.0.3, 1.0.4 and 0.20.2

Eternal iteration when using newer hadoop version due to next() call and empty key value

    Key: CASSANDRA-5504
    URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
    Project: Cassandra
    Issue Type: Bug
    Components: Hadoop
    Reporter: Oleksandr Petrov
    Priority: Critical
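The looping behaviour described in this report, and the lastRowKey idea, can be illustrated with a toy pager. This is only a sketch under stated assumptions and not the code from patch.diff: PagingSketch, fetchAfter and scan are hypothetical names, the "cluster" is a sorted list of strings, and the start of the next range is taken to be the key of the last row in the previous batch. It assumes that handing each row key to the caller drains the key buffer, so a pager that reads the last row's key afterwards sees an empty start key.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class PagingSketch {
    // Hypothetical stand-in for a range query: returns up to batchSize
    // keys that sort strictly after startKey. An empty startKey means
    // "from the beginning". Reads startKey through a duplicate.
    static List<ByteBuffer> fetchAfter(List<String> sortedKeys, ByteBuffer startKey, int batchSize) {
        String start = StandardCharsets.UTF_8.decode(startKey.duplicate()).toString();
        List<ByteBuffer> batch = new ArrayList<>();
        for (String k : sortedKeys) {
            if (k.compareTo(start) > 0 && batch.size() < batchSize) {
                batch.add(ByteBuffer.wrap(k.getBytes(StandardCharsets.UTF_8)));
            }
        }
        return batch;
    }

    // Pages through all keys. Returns the number of rows handed out,
    // or -1 if the scan had not finished after maxFetches range queries.
    static int scan(List<String> sortedKeys, int batchSize, boolean saveLastRowKey, int maxFetches) {
        ByteBuffer startKey = ByteBuffer.allocate(0);
        ByteBuffer callerKey = ByteBuffer.allocate(64);
        int rows = 0;
        for (int fetch = 0; fetch < maxFetches; fetch++) {
            List<ByteBuffer> batch = fetchAfter(sortedKeys, startKey, batchSize);
            if (batch.isEmpty()) {
                return rows;
            }
            // Remember the last key before any row is handed to the caller.
            ByteBuffer lastRowKey = batch.get(batch.size() - 1).duplicate();
            for (ByteBuffer rowKey : batch) {
                callerKey.clear();
                callerKey.put(rowKey); // drains rowKey, as in the report
                callerKey.rewind();
                rows++;
            }
            // Without the saved copy, the last row's key is now empty and
            // the next query starts from the beginning again.
            startKey = saveLastRowKey ? lastRowKey : batch.get(batch.size() - 1);
        }
        return -1;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("a", "b", "c", "d", "e");
        System.out.println(scan(keys, 2, true, 10));  // prints 5
        System.out.println(scan(keys, 2, false, 10)); // prints -1 (same range re-requested)
    }
}
```

Saving the key up front and not draining it in the first place both break the loop in this toy model; the sketch only shows why an emptied last key makes the pager ask for the same range again.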
[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksandr Petrov updated CASSANDRA-5504:
----------------------------------------

    Attachment: patch.diff

Eternal iteration when using newer hadoop version due to next() call and empty key value

    Key: CASSANDRA-5504
    URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
    Project: Cassandra
    Issue Type: Bug
    Components: Hadoop
    Affects Versions: 1.2.3
    Reporter: Oleksandr Petrov
    Priority: Critical
    Attachments: patch.diff
[jira] [Updated] (CASSANDRA-5504) Eternal iteration when using newer hadoop version due to next() call and empty key value
[ https://issues.apache.org/jira/browse/CASSANDRA-5504?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Oleksandr Petrov updated CASSANDRA-5504:
----------------------------------------

    Attachment: patch2.diff

Eternal iteration when using newer hadoop version due to next() call and empty key value

    Key: CASSANDRA-5504
    URL: https://issues.apache.org/jira/browse/CASSANDRA-5504
    Project: Cassandra
    Issue Type: Bug
    Components: Hadoop
    Affects Versions: 1.2.3
    Reporter: Oleksandr Petrov
    Priority: Critical
    Attachments: patch2.diff, patch.diff