Re: Exceptions on 0.7.0

Stu Hood Wed, 23 Feb 2011 00:35:24 -0800

I expect that this problem was due to
https://issues.apache.org/jira/browse/CASSANDRA-2216. I'll make noise to
try to get the fix released soon as 0.7.3.


On Tue, Feb 22, 2011 at 5:41 AM, David Boxenhorn <da...@lookin2.com> wrote:

> Thanks, Shimi. I'll keep you posted if we make progress. Riptano is working
> on this problem too.
>
> On Tue, Feb 22, 2011 at 3:30 PM, shimi <shim...@gmail.com> wrote:
>
>> I didn't solve it.
>> Since it is a test cluster, I deleted all the data. I copied some sstables
>> from my production cluster and tried again; this time I didn't have this
>> problem.
>> I am planning on removing everything from this test cluster. I will start
>> all over again with 0.6.x, then load it with tens of GB of data (not an
>> sstable copy) and test the upgrade again.
>>
>> I made a mistake by not backing up the data files before I upgraded.
>>
>> Shimi
>>
>> On Tue, Feb 22, 2011 at 2:24 PM, David Boxenhorn <da...@lookin2.com> wrote:
>>
>>> Shimi,
>>>
>>> I am getting the same error that you report here. What did you do to
>>> solve it?
>>>
>>> David
>>>
>>>
>>> On Thu, Feb 10, 2011 at 2:54 PM, shimi <shim...@gmail.com> wrote:
>>>
>>>> I upgraded the version on all the nodes but I still get the exceptions.
>>>> I ran cleanup on one of the nodes but I don't think there is any cleanup
>>>> going on.
>>>>
>>>> Another weird thing that I see is:
>>>> INFO [CompactionExecutor:1] 2011-02-10 12:08:21,353 CompactionIterator.java (line 135) Compacting large row
>>>> 33353135373036383536323735333838383638303536303639313532313238373630323034313a446f20322e384c20656e67696e657320686176652061646a75737461626c65206c696674657273
>>>> (725849473109 bytes) incrementally
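
A quick way to read the row key in the log line above is to decode it from hex and to put the reported size into familiar units. The sketch below is only illustrative: the helper names are hypothetical, and it assumes the key is hex-encoded ASCII, which the thread does not state.

```python
# Row key as printed in the log line above (the two wrapped lines joined).
ROW_KEY_HEX = (
    "333531353730363835363237353338383836383035363036393135323132383"
    "73630323034313a446f20322e384c20656e67696e657320686176652061646a"
    "75737461626c65206c696674657273"
)

# Row size reported by the "Compacting large row" message, in bytes.
REPORTED_ROW_BYTES = 725849473109


def decode_row_key(hex_key: str) -> str:
    """Decode a hex-encoded key, replacing any non-ASCII bytes."""
    return bytes.fromhex(hex_key).decode("ascii", errors="replace")


def bytes_to_gib(size_in_bytes: int) -> float:
    """Convert a byte count to GiB (2**30 bytes)."""
    return size_in_bytes / 2**30


if __name__ == "__main__":
    print(decode_row_key(ROW_KEY_HEX))
    print(f"{bytes_to_gib(REPORTED_ROW_BYTES):.1f} GiB")
```

A reported size in the hundreds of GiB for a single row, against a largest production row of 10259, is consistent with the size field having been misread rather than with a genuinely huge row.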
>>>>
>>>> In my production version the largest row is 10259. It shouldn't be
>>>> different in this case.
>>>>
>>>> The first exception is being thrown on 3 nodes during compaction.
>>>> The second exception (Internal error processing get_range_slices) is
>>>> being thrown all the time by a fourth node. I disabled gossip and any client
>>>> traffic to it and I still get the exceptions.
>>>> Is it possible to boot a node with gossip disabled?
>>>>
>>>> Shimi
>>>>
>>>> On Thu, Feb 10, 2011 at 11:11 AM, aaron morton <aa...@thelastpickle.com> wrote:
>>>>
>>>>> It should be possible to repair: install the new version and kick off
>>>>> nodetool repair.
>>>>>
>>>>> If you are uncertain, search for CASSANDRA-1992 on the list; there has
>>>>> been some discussion. You can also wait until some people in the States
>>>>> wake up if you want to be extra sure.
>>>>>
>>>>> The number is the number of columns the iterator is going to return
>>>>> from the row. I'm guessing that because this is happening during
>>>>> compaction, it asked for the maximum possible number of columns.
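
One way to sanity-check implausible counts like the 1986622313 and 1970563183 reported in the errors is to look at the same four bytes as text. This is a diagnostic sketch, not something anyone in the thread did: the helper name is hypothetical, and it assumes the counts were read as big-endian 32-bit integers, which is how Java's DataInput.readInt works.

```python
import struct


def int_as_ascii(value: int) -> str:
    """Pack a signed 32-bit int big-endian and show its bytes as ASCII."""
    return struct.pack(">i", value).decode("ascii", errors="replace")


if __name__ == "__main__":
    # The two counts that appear in the exceptions quoted in this thread.
    for count in (1986622313, 1970563183):
        print(count, repr(int_as_ascii(count)))
    # 1986622313 'visi'
    # 1970563183 'utho'
```

Both values come out as printable lowercase letters, which hints that the reader may have been interpreting column data as a length or count field. That would fit a deserialization mismatch, but it is circumstantial evidence only.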
>>>>>
>>>>> Aaron
>>>>>
>>>>>
>>>>>
>>>>> On 10 Feb 2011, at 21:37, shimi wrote:
>>>>>
>>>>> On 10 Feb 2011, at 13:42, Dan Hendry wrote:
>>>>>
>>>>>  Out of curiosity, do you really have on the order of 1,986,622,313
>>>>> elements (I believe elements=keys) in the cf?
>>>>>
>>>>> Dan
>>>>>
>>>>> No. I was puzzled by the numbers too.
>>>>>
>>>>>
>>>>> On Thu, Feb 10, 2011 at 10:30 AM, aaron morton <
>>>>> aa...@thelastpickle.com> wrote:
>>>>>
>>>>>> Shimi,
>>>>>> You may be seeing the result of CASSANDRA-1992. Are you able to test
>>>>>> with the most recent 0.7 build?
>>>>>> https://hudson.apache.org/hudson/job/Cassandra-0.7/
>>>>>>
>>>>>>
>>>>>> Aaron
>>>>>>
>>>>> I will. I hope the data was not corrupted.
>>>>>
>>>>>
>>>>>
>>>>>> On 10 Feb 2011, at 13:42, Dan Hendry wrote:
>>>>>>
>>>>>> Out of curiosity, do you really have on the order of 1,986,622,313
>>>>>> elements (I believe elements=keys) in the cf?
>>>>>>
>>>>>> Dan
>>>>>>
>>>>>>  *From:* shimi [mailto:shim...@gmail.com]
>>>>>> *Sent:* February-09-11 15:06
>>>>>> *To:* user@cassandra.apache.org
>>>>>> *Subject:* Exceptions on 0.7.0
>>>>>>
>>>>>> I have a 4-node test cluster where I am testing the port from 0.6.X to 0.7.0.
>>>>>> On 3 out of the 4 nodes I get exceptions in the log.
>>>>>> I am using RP.
>>>>>> Changes that I made:
>>>>>> 1. changed the replication factor from 3 to 4
>>>>>> 2. configured the nodes to use Dynamic Snitch
>>>>>> 3. RR of 0.33
>>>>>>
>>>>>> I ran repair on 2 nodes before I noticed the errors. One of them is
>>>>>> having the first error and the other the second.
>>>>>> I restarted the nodes but I still get the exceptions.
>>>>>>
>>>>>> I get the following exception from 2 nodes:
>>>>>>  WARN [CompactionExecutor:1] 2011-02-09 19:50:51,281 BloomFilter.java
>>>>>> (line 84) Cannot provide an optimal Bloom
>>>>>> Filter for 1986622313 elements (1/4 buckets per element).
>>>>>> ERROR [CompactionExecutor:1] 2011-02-09 19:51:10,190 AbstractCassandraDaemon.java (line 91) Fatal exception in thread Thread[CompactionExecutor:1,1,main]
>>>>>> java.io.IOError: java.io.EOFException
>>>>>>         at
>>>>>> org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:105)
>>>>>>         at
>>>>>> org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:34)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.CollatingIterator.set(CollatingIterator.java:284)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.CollatingIterator.least(CollatingIterator.java:326)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.CollatingIterator.next(CollatingIterator.java:230)
>>>>>>         at
>>>>>> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:68)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>>>>         at
>>>>>> com.google.common.collect.Iterators$7.computeNext(Iterators.java:604)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnIndexer.serializeInternal(ColumnIndexer.java:76)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnIndexer.serialize(ColumnIndexer.java:50)
>>>>>>         at
>>>>>> org.apache.cassandra.io.LazilyCompactedRow.<init>(LazilyCompactedRow.java:88)
>>>>>>         at
>>>>>> org.apache.cassandra.io.CompactionIterator.getCompactedRow(CompactionIterator.java:136)
>>>>>>         at
>>>>>> org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:107)
>>>>>>         at
>>>>>> org.apache.cassandra.io.CompactionIterator.getReduced(CompactionIterator.java:42)
>>>>>>         at
>>>>>> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.FilterIterator.setNextObject(FilterIterator.java:183)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.FilterIterator.hasNext(FilterIterator.java:94)
>>>>>>         at
>>>>>> org.apache.cassandra.db.CompactionManager.doCompaction(CompactionManager.java:323)
>>>>>>         at
>>>>>> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:122)
>>>>>>         at
>>>>>> org.apache.cassandra.db.CompactionManager$1.call(CompactionManager.java:92)
>>>>>>         at
>>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
>>>>>>         at java.util.concurrent.FutureTask.run(FutureTask.java:138)
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>         at java.lang.Thread.run(Thread.java:619)
>>>>>>  Caused by: java.io.EOFException
>>>>>>         at
>>>>>> java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
>>>>>>         at
>>>>>> org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:280)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:76)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>>>>>         at
>>>>>> org.apache.cassandra.io.sstable.SSTableIdentityIterator.next(SSTableIdentityIterator.java:101)
>>>>>>         ... 29 more
>>>>>>
>>>>>>
>>>>>> On another node I get:
>>>>>>
>>>>>> ERROR [pool-1-thread-2] 2011-02-09 19:48:32,137 Cassandra.java (line 2876) Internal error processing get_range_slices
>>>>>> java.lang.RuntimeException: error reading 1 of 1970563183
>>>>>>         at
>>>>>> org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:82)
>>>>>>         at
>>>>>> org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:39)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>>>>         at
>>>>>> org.apache.cassandra.db.columniterator.SSTableSliceIterator.hasNext(SSTableSliceIterator.java:108)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.CollatingIterator.anyHasNext(CollatingIterator.java:364)
>>>>>>         at
>>>>>> org.apache.commons.collections.iterators.CollatingIterator.hasNext(CollatingIterator.java:217)
>>>>>>         at
>>>>>> org.apache.cassandra.db.RowIteratorFactory$3.getReduced(RowIteratorFactory.java:136)
>>>>>>         at
>>>>>> org.apache.cassandra.db.RowIteratorFactory$3.getReduced(RowIteratorFactory.java:106)
>>>>>>         at
>>>>>> org.apache.cassandra.utils.ReducingIterator.computeNext(ReducingIterator.java:73)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:136)
>>>>>>         at
>>>>>> com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:131)
>>>>>>         at
>>>>>> org.apache.cassandra.db.RowIterator.hasNext(RowIterator.java:49)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnFamilyStore.getRangeSlice(ColumnFamilyStore.java:1294)
>>>>>>         at
>>>>>> org.apache.cassandra.service.StorageProxy.getRangeSlice(StorageProxy.java:438)
>>>>>>         at
>>>>>> org.apache.cassandra.thrift.CassandraServer.get_range_slices(CassandraServer.java:473)
>>>>>>         at
>>>>>> org.apache.cassandra.thrift.Cassandra$Processor$get_range_slices.process(Cassandra.java:2868)
>>>>>>         at
>>>>>> org.apache.cassandra.thrift.Cassandra$Processor.process(Cassandra.java:2555)
>>>>>>         at
>>>>>> org.apache.cassandra.thrift.CustomTThreadPoolServer$WorkerProcess.run(CustomTThreadPoolServer.java:167)
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
>>>>>>         at
>>>>>> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
>>>>>>         at java.lang.Thread.run(Thread.java:619)
>>>>>> Caused by: java.io.EOFException
>>>>>>         at
>>>>>> java.io.RandomAccessFile.readFully(RandomAccessFile.java:383)
>>>>>>         at
>>>>>> org.apache.cassandra.utils.FBUtilities.readByteArray(FBUtilities.java:280)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:94)
>>>>>>         at
>>>>>> org.apache.cassandra.db.ColumnSerializer.deserialize(ColumnSerializer.java:35)
>>>>>>         at
>>>>>> org.apache.cassandra.db.columniterator.SimpleSliceReader.computeNext(SimpleSliceReader.java:78)
>>>>>>         ... 21 more
>>>>>>
>>>>>> Any idea what went wrong?
>>>>>> Shimi
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>
