Re: hanging validation compaction

2017-04-13 Thread Roland Otta
oh yes .. you are absolutely right
thank you!

i will provide all the necessary info in the created jira issue

On Thu, 2017-04-13 at 15:09 +0200, benjamin roth wrote:
you should be able to find that out by scrubbing the corresponding table(s) and 
seeing which one hangs.
i guess the debug log tells you which sstable is being scrubbed.

2017-04-13 15:07 GMT+02:00 Roland Otta:
i made a copy and also have the permission to upload sstables for that 
particular column_family

is it possible to track down which sstable of that cf is affected or should i 
upload all of them?


br,
roland


On Thu, 2017-04-13 at 13:57 +0200, benjamin roth wrote:
I think that's a good reproduction case for the issue - you should copy the 
sstable away for further testing. Are you allowed to upload the broken sstable 
to JIRA?

2017-04-13 13:15 GMT+02:00 Roland Otta:
sorry .. i have to correct myself .. the problem still persists.

tried nodetool scrub now for the table ... but scrub is also stuck at the same 
percentage

id                                   compaction type keyspace table    completed total     unit  progress
380e4980-2037-11e7-a9a4-a5f3eec2d826 Validation      bds      ad_event 805955242 841258085 bytes 95.80%
fb17b8b0-2039-11e7-a9a4-a5f3eec2d826 Scrub           bds      ad_event 805961728 841258085 bytes 95.80%
Active compaction remaining time :   0h00m00s

according to the thread dump it's the same issue

Stack trace:
com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)

Re: IPv6-only host, can't seem to get Cassandra to bind to a public port

2017-04-13 Thread Khaja, Raziuddin (NIH/NLM/NCBI) [C]
You are welcome Martijn! Glad to have been able to help.
Best,
-Razi

On 4/13/17, 12:13 PM, "Martijn Pieters"  wrote:

On 13/04/2017, 15:06, "Khaja, Raziuddin (NIH/NLM/NCBI) [C]" 
 wrote:
> Looking at your original message: 
> http://www.mail-archive.com/user@cassandra.apache.org/msg51736.html
> 
>   I see you edited etc/cassandra/cassandra-env.sh, by changing:
>
>   
>+#JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
>
>+JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv6Addresses=true"   
>
>   First, I don’t think there is an option java.net.preferIPv6Addresses, 
> so I would recommend removing that line.

The option does exist, see 
https://docs.oracle.com/javase/8/docs/api/java/net/doc-files/net-properties.html.
 I had tried both with and without the IPv6 option.

> Second, I believe that starting in apache-cassandra-3.2, enabling/disabling 
> the option has been moved to a file called *jvm.options* 

[snip evidence of the option moving]

> My guess right now is that you may have upgraded Cassandra from a version 
> older than 3.1 and somehow your config files are not compatible with 3.10? 

**BINGO**. Indeed, I had at some point downgraded to a different Cassandra 
version in an attempt to resolve issues with `cqlsh` (which hardcodes a CQL 
version). As a result the option was being applied **twice**, in 
/etc/cassandra/cassandra-env.sh and via /etc/cassandra/jvm.options.

Removing the switch from both locations now lets Cassandra bind to IPv6.

I can now finally drop the SSH tunnel forwarding the port in my test 
cluster.

Thanks!

Re: IPv6-only host, can't seem to get Cassandra to bind to a public port

2017-04-13 Thread Martijn Pieters
On 13/04/2017, 15:06, "Khaja, Raziuddin (NIH/NLM/NCBI) [C]" 
 wrote:
> Looking at your original message: 
> http://www.mail-archive.com/user@cassandra.apache.org/msg51736.html
>  
>  
>   I see you edited etc/cassandra/cassandra-env.sh, by changing:
>
>   
>+#JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
>
>+JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv6Addresses=true"   
>
>   First, I don’t think there is an option java.net.preferIPv6Addresses, so I 
> would recommend removing that line.

The option does exist, see 
https://docs.oracle.com/javase/8/docs/api/java/net/doc-files/net-properties.html.
 I had tried both with and without the IPv6 option.

> Second, I believe that starting in apache-cassandra-3.2, enabling/disabling 
> the option has been moved to a file called *jvm.options* 

[snip evidence of the option moving]

> My guess right now is that you may have upgraded Cassandra from a version 
> older than 3.1 and somehow your config files are not compatible with 3.10? 

**BINGO**. Indeed, I had at some point downgraded to a different Cassandra 
version in an attempt to resolve issues with `cqlsh` (which hardcodes a CQL 
version). As a result the option was being applied **twice**, in 
/etc/cassandra/cassandra-env.sh and via /etc/cassandra/jvm.options.

Removing the switch from both locations now lets Cassandra bind to IPv6.
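For anyone hitting the same duplication, something along these lines should show 
the switch in both locations at once (paths as above):

    grep -n 'preferIPv' /etc/cassandra/cassandra-env.sh /etc/cassandra/jvm.options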

I can now finally drop the SSH tunnel forwarding the port in my test cluster.

Thanks!

How to stress test collections in Cassandra Stress

2017-04-13 Thread eugene miretsky
Hi,

I'm trying to do a stress test on a table with a collection column, but
cannot figure out how to do that.

I tried

table_definition: |
  CREATE TABLE list (
customer_id bigint,
items list,
PRIMARY KEY (customer_id));

columnspec:
  - name: customer_id
size: fixed(64)
population: norm(0..40M)
  - name: items
cluster: fixed(40)

When running the benchmark, I get: java.io.IOException: Operation x10 on
key(s) [27056313]: Error executing: (NoSuchElementException)
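(Note for archive readers: angle brackets may have been stripped above; in CQL a
collection column needs an element type, so the table definition presumably read
something like the sketch below, with text as an assumed element type.)

    table_definition: |
      CREATE TABLE list (
        customer_id bigint,
        items list<text>,
        PRIMARY KEY (customer_id));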


Re: IPv6-only host, can't seem to get Cassandra to bind to a public port

2017-04-13 Thread Khaja, Raziuddin (NIH/NLM/NCBI) [C]
Hi Martijn, 

Looking at your original message: 
http://www.mail-archive.com/user@cassandra.apache.org/msg51736.html

I see you edited etc/cassandra/cassandra-env.sh, by changing:

+#JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv4Stack=true"
+JVM_OPTS="$JVM_OPTS -Djava.net.preferIPv6Addresses=true"

First, I don’t think there is an option java.net.preferIPv6Addresses, so I 
would recommend removing that line.

Second, I believe that starting in apache-cassandra-3.2, enabling/disabling 
the option has been moved to a file called *jvm.options* 

./2.1.16/apache-cassandra-2.1.16/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./2.2.4/apache-cassandra-2.2.4/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./2.2.5/apache-cassandra-2.2.5/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./2.2.6/apache-cassandra-2.2.6/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./2.2.7/apache-cassandra-2.2.7/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.0/apache-cassandra-3.0.0/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.1/apache-cassandra-3.0.1/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.2/apache-cassandra-3.0.2/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.3/apache-cassandra-3.0.3/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.4/apache-cassandra-3.0.4/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.5/apache-cassandra-3.0.5/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.6/apache-cassandra-3.0.6/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.7/apache-cassandra-3.0.7/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.8/apache-cassandra-3.0.8/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.0.9/apache-cassandra-3.0.9/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.1/apache-cassandra-3.1/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.1.1/apache-cassandra-3.1.1/conf/cassandra-env.sh:JVM_OPTS="$JVM_OPTS 
-Djava.net.preferIPv4Stack=true"
./3.2/apache-cassandra-3.2/conf/jvm.options:-Djava.net.preferIPv4Stack=true
./3.2.1/apache-cassandra-3.2.1/conf/jvm.options:-Djava.net.preferIPv4Stack=true
./3.3/apache-cassandra-3.3/conf/jvm.options:-Djava.net.preferIPv4Stack=true
./3.4/apache-cassandra-3.4/conf/jvm.options:-Djava.net.preferIPv4Stack=true
./3.5/apache-cassandra-3.5/conf/jvm.options:-Djava.net.preferIPv4Stack=true
./3.6/apache-cassandra-3.6/conf/jvm.options:-Djava.net.preferIPv4Stack=true
./3.7/apache-cassandra-3.7/conf/jvm.options:-Djava.net.preferIPv4Stack=true
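For reference, a listing like the one above can be produced with something along 
these lines (directory layout assumed):

    grep -r -e '-Djava.net.preferIPv4Stack=true' ./*/apache-cassandra-*/conf/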

My guess right now is that you may have upgraded Cassandra from a version older 
than 3.1 and somehow your config files are not compatible with 3.10? 

-Razi


On 4/13/17, 5:41 AM, "Martijn Pieters"  wrote:

From my original email: 
http://www.mail-archive.com/user@cassandra.apache.org/msg51736.html:

> My configuration changes:
>
> listen_address: 
> listen_interface_prefer_ipv6: true

listen_interface is commented out. I've just now tried again with "# 
listen_interface_prefer_ipv6: false" (option commented out), but the error 
persists. 

I've also rebooted the system, in case the upgrade from base 16.04 to 
16.04.2 left something in a funky state.

On 12/04/2017, 21:39, "Khaja, Raziuddin (NIH/NLM/NCBI) [C]" 
 wrote:

Are you specifying both the listen_address and listen_interface, or 
just one of the two?

Send an example of the following 3 lines.  Here is what I have on my 
2.1.16 cluster that uses ipv6:

listen_address: ::hhh::h::hhh:h
# listen_interface: eth0
# listen_interface_prefer_ipv6: false

Also, looking at my config, I can confirm that it is unnecessary or 
wrong to escape the ipv6 address with \ as I suggested before.

-Razi

On 4/12/17, 4:05 PM, "Martijn Pieters"  wrote:

From: "Khaja, Raziuddin (NIH/NLM/NCBI) [C]" 

> Maybe you have to escape the IPV6 addresses in the cassandra.yaml 
in the same way.
> I think it’s worth a try.

Nope, no luck. You get an error instead:

ERROR [main] 2017-04-12 20:03:46,899 CassandraDaemon.java:752 - 
Exception encountered during startup: Unknown listen_address 
'\:\:\:\:\:h\:hh\:h'

(actual address digits replaced with h characters).

Martijn


 

Re: hanging validation compaction

2017-04-13 Thread benjamin roth
you should be able to find that out by scrubbing the corresponding table(s)
and seeing which one hangs.
i guess the debug log tells you which sstable is being scrubbed.
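for illustration, with the keyspace/table from this thread (log path and log
message wording are assumptions, adjust to your install):

    nodetool scrub bds ad_event
    grep -i scrubbing /var/log/cassandra/debug.log | tail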

2017-04-13 15:07 GMT+02:00 Roland Otta :

> i made a copy and also have the permission to upload sstables for that
> particular column_family
>
> is it possible to track down which sstable of that cf is affected or
> should i upload all of them?
>
>
> br,
> roland
>
>
> On Thu, 2017-04-13 at 13:57 +0200, benjamin roth wrote:
>
> I think that's a good reproduction case for the issue - you should copy the
> sstable away for further testing. Are you allowed to upload the broken
> sstable to JIRA?
>
> 2017-04-13 13:15 GMT+02:00 Roland Otta :
>
> sorry .. i have to correct myself .. the problem still persists.
>
> tried nodetool scrub now for the table ... but scrub is also stuck at the
> same percentage
>
> id                                   compaction type keyspace table    completed total     unit  progress
> 380e4980-2037-11e7-a9a4-a5f3eec2d826 Validation      bds      ad_event 805955242 841258085 bytes 95.80%
> fb17b8b0-2039-11e7-a9a4-a5f3eec2d826 Scrub           bds      ad_event 805961728 841258085 bytes 95.80%
> Active compaction remaining time :   0h00m00s
>
> according to the thread dump it's the same issue
>
> Stack trace:
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
> org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)
> org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:481)
> 

Re: hanging validation compaction

2017-04-13 Thread Roland Otta
i made a copy and also have the permission to upload sstables for that 
particular column_family

is it possible to track down which sstable of that cf is affected or should i 
upload all of them?


br,
roland


On Thu, 2017-04-13 at 13:57 +0200, benjamin roth wrote:
I think that's a good reproduction case for the issue - you should copy the 
sstable away for further testing. Are you allowed to upload the broken sstable 
to JIRA?

2017-04-13 13:15 GMT+02:00 Roland Otta:
sorry .. i have to correct myself .. the problem still persists.

tried nodetool scrub now for the table ... but scrub is also stuck at the same 
percentage

id                                   compaction type keyspace table    completed total     unit  progress
380e4980-2037-11e7-a9a4-a5f3eec2d826 Validation      bds      ad_event 805955242 841258085 bytes 95.80%
fb17b8b0-2039-11e7-a9a4-a5f3eec2d826 Scrub           bds      ad_event 805961728 841258085 bytes 95.80%
Active compaction remaining time :   0h00m00s

according to the thread dump it's the same issue

Stack trace:
com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:481)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:609)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:526)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)

Migrating to LCS : Disk Size recommendation clashes

2017-04-13 Thread Amit Singh F
Hi All,

We are in the process of migrating from STCS to LCS and was just doing a few 
reads online. Below is the excerpt from the DataStax recommendation on data size:

Doc link : 
https://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningHardware.html

[screenshots of the sizing recommendation omitted]

There is also one more recommendation which hints that disk size can be 
limited to 10 TB (worst case). Excerpt below:

Doc link : 
http://www.datastax.com/dev/blog/leveled-compaction-in-apache-cassandra

[screenshot omitted]

So are there any restrictions/scenarios due to which 600GB is the preferred 
size for LCS?
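For reference, the switch itself is a single schema change; a minimal sketch with 
hypothetical keyspace/table names and the default 160 MB target sstable size:

    ALTER TABLE my_keyspace.my_table
      WITH compaction = {'class': 'LeveledCompactionStrategy', 'sstable_size_in_mb': 160};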

Thanks & Regards
Amit Singh



Re: WriteTimeoutException with LWT after few milliseconds

2017-04-13 Thread benjamin roth
I found out that when the WTEs occur, there was already another process
inserting the same primary key, because I found duplicates in some places
that perfectly match the WTE logs.

Does anybody know why this throws a WTE instead of returning [applied] =
false?
This is quite confusing!
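For context, a cleanly lost race on the same key surfaces as [applied] = false 
rather than an exception; a sketch with the statement from the log (the 
contention explanation in the comments is a guess, see cas_contention_timeout_in_ms 
in cassandra.yaml):

    INSERT INTO log_moment_import ("source", "reference", "user_id", "moment_id", "date", "finished")
    VALUES (3, '1305821272790495', 65675537, 0, '2017-04-12 13:00:51', NULL)
    IF NOT EXISTS;
    -- clean loss: returns a row with [applied] = false plus the existing values
    -- heavy contention: the Paxos round itself can time out, which is reported as
    -- a WriteTimeoutException even though the row may still end up written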

2017-04-12 17:41 GMT+02:00 Carlos Rolo :

> You can try to use TRACING to debug the situation, but for an LWT to fail
> so fast, the most probable cause is what you stated: "It is possible that
> there are concurrent inserts on the same PK - actually thats the reason why
> I use LWTs." AKA, someone inserted first.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin:
> *linkedin.com/in/carlosjuzarterolo
> *
> Mobile: +351 918 918 100 <+351%20918%20918%20100>
> www.pythian.com
>
> On Wed, Apr 12, 2017 at 3:51 PM, Roland Otta 
> wrote:
>
>> sorry .. ignore my comment ...
>>
>> i missed your comment that the record is in the table ...
>>
>> On Wed, 2017-04-12 at 16:48 +0200, Roland Otta wrote:
>>
>> Hi Benjamin,
>>
>> it's unlikely that i can assist you .. but nevertheless ... i give it a
>> try ;-)
>>
>> what's your consistency level for the insert?
>> what if one or more nodes are marked down and proper consistency can't be
>> achieved?
>> of course the error message does not indicate that problem (as it says
>> it's a timeout)... but in that case you would get an instant error for
>> inserts, wouldn't you?
>>
>> br,
>> roland
>>
>>
>>
>> On Wed, 2017-04-12 at 15:09 +0200, benjamin roth wrote:
>>
>> Hi folks,
>>
>> Can someone explain why that occurs?
>>
>> Write timeout after 0.006s
>> Query: 'INSERT INTO log_moment_import ("source", "reference", "user_id",
>> "moment_id", "date", "finished") VALUES (3, '1305821272790495', 65675537,
>> 0, '2017-04-12 13:00:51', NULL) IF NOT EXISTS
>> Primary key and partition key is source + reference
>> Message: Operation timed out - received only 1 responses.
>>
>> This appears every now and then in the log. When I check the for the
>> record in the table, it is there.
>> I could explain that if the WTE occurred after the configured write
>> timeout, but it happens within a few milliseconds.
>> Is this caused by lock contention? It is possible that there are
>> concurrent inserts on the same PK - actually that's the reason why I use
>> LWTs.
>>
>> Thanks!
>>
>>
>
> --


Re: hanging validation compaction

2017-04-13 Thread benjamin roth
I think that's a good reproduction case for the issue - you should copy the
sstable away for further testing. Are you allowed to upload the broken
sstable to JIRA?

2017-04-13 13:15 GMT+02:00 Roland Otta :

> sorry .. i have to correct myself .. the problem still persists.
>
> tried nodetool scrub now for the table ... but scrub is also stuck at the
> same percentage
>
> id                                   compaction type keyspace table    completed total     unit  progress
> 380e4980-2037-11e7-a9a4-a5f3eec2d826 Validation      bds      ad_event 805955242 841258085 bytes 95.80%
> fb17b8b0-2039-11e7-a9a4-a5f3eec2d826 Scrub           bds      ad_event 805961728 841258085 bytes 95.80%
> Active compaction remaining time :   0h00m00s
>
> according to the thread dump it's the same issue
>
> Stack trace:
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
> org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)
> org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:481)
> org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:609)
> org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:526)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
> org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:110)
> org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:173)
> 

Re: hanging validation compaction

2017-04-13 Thread Roland Otta
sorry .. i have to correct myself .. the problem still persists.

tried nodetool scrub now for the table ... but scrub is also stuck at the same 
percentage

id                                   compaction type keyspace table    completed total     unit  progress
380e4980-2037-11e7-a9a4-a5f3eec2d826 Validation      bds      ad_event 805955242 841258085 bytes 95.80%
fb17b8b0-2039-11e7-a9a4-a5f3eec2d826 Scrub           bds      ad_event 805961728 841258085 bytes 95.80%
Active compaction remaining time :   0h00m00s
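(for reference, the status table above is presumably the output of:

    nodetool compactionstats
)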

according to the thread dump it's the same issue

Stack trace:
com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$65/60401277.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.LimitingRebufferer.rebuffer(LimitingRebufferer.java:54)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$85/168219100.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:503)
org.apache.cassandra.db.compaction.Scrubber$RowMergingSSTableIterator.next(Scrubber.java:481)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:609)
org.apache.cassandra.db.compaction.Scrubber$OrderCheckerIterator.computeNext(Scrubber.java:526)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
org.apache.cassandra.db.ColumnIndex.buildRowIndex(ColumnIndex.java:110)
org.apache.cassandra.io.sstable.format.big.BigTableWriter.append(BigTableWriter.java:173)
org.apache.cassandra.io.sstable.SSTableRewriter.append(SSTableRewriter.java:135)
org.apache.cassandra.io.sstable.SSTableRewriter.tryAppend(SSTableRewriter.java:156)
org.apache.cassandra.db.compaction.Scrubber.tryAppend(Scrubber.java:319)
org.apache.cassandra.db.compaction.Scrubber.scrub(Scrubber.java:214)
org.apache.cassandra.db.compaction.CompactionManager.scrubOne(CompactionManager.java:966)

Re: hanging validation compaction

2017-04-13 Thread benjamin roth
What if you run it again with cache enabled?

On 13.04.2017 12:04, "Roland Otta" wrote:

> i did 2 restarts before which did not help
>
> after that i have set for testing purposes file_cache_size_in_mb: 0 and
> buffer_pool_use_heap_if_exhausted: false and restarted again
>
> after that it worked ... but it also could be that it just worked by
> accident after the last restart and is not related to my config changes
>
> On Thu, 2017-04-13 at 11:58 +0200, benjamin roth wrote:
>
> If you restart the server, does the same validation complete successfully?
> If not, have you tried scrubbing the affected sstables?
>
> 2017-04-13 11:43 GMT+02:00 Roland Otta :
>
> thank you guys ... i will
>
> i just wanted to make sure that i am not doing something completely wrong
> before opening an issue
>
> br,
> roland
>
>
> On Thu, 2017-04-13 at 21:35 +1200, Nate McCall wrote:
>
> Not sure what is going on there either. Roland - can you open an issue
> with the information above:
> https://issues.apache.org/jira/browse/CASSANDRA
>
> On Thu, Apr 13, 2017 at 7:49 PM, benjamin roth  wrote:
>
> What I can tell you from that trace - given that this is the correct
> thread and it really hangs there:
>
> The validation is stuck when reading from an SSTable.
> Unfortunately I am no caffeine expert. It looks like the read is cached
> and after the read caffeine tries to drain the cache and this is stuck. I
> don't see the reason from that stack trace.
> Someone would have to dig deeper into caffeine to find the root cause.
>
> 2017-04-13 9:27 GMT+02:00 Roland Otta :
>
> i had a closer look at the validation executor thread (i hope that's what
> you meant)
>
> it seems the thread is always repeating stuff in
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebu
> ffer(ChunkCache.java:235)
>
> here is the full stack trace ...
>
> i am sorry .. but i have no clue what's happening there ..
>
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> 

Re: hanging validation compaction

2017-04-13 Thread Roland Otta
i did 2 restarts before which did not help

after that i have set for testing purposes file_cache_size_in_mb: 0 and 
buffer_pool_use_heap_if_exhausted: false and restarted again

after that it worked ... but it also could be that it just worked by accident 
after the last restart and is not related to my config changes
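for reference, the two settings as cassandra.yaml entries (values as tested
here; 0 disables the chunk cache):

    file_cache_size_in_mb: 0
    buffer_pool_use_heap_if_exhausted: false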

On Thu, 2017-04-13 at 11:58 +0200, benjamin roth wrote:
If you restart the server, does the same validation complete successfully?
If not, have you tried scrubbing the affected sstables?

2017-04-13 11:43 GMT+02:00 Roland Otta:
thank you guys ... i will

i just wanted to make sure that i am not doing something completely wrong 
before opening an issue

br,
roland


On Thu, 2017-04-13 at 21:35 +1200, Nate McCall wrote:
Not sure what is going on there either. Roland - can you open an issue with the 
information above:
https://issues.apache.org/jira/browse/CASSANDRA

On Thu, Apr 13, 2017 at 7:49 PM, benjamin roth wrote:
What I can tell you from that trace - given that this is the correct thread and 
it really hangs there:

The validation is stuck when reading from an SSTable.
Unfortunately I am no caffeine expert. It looks like the read is cached and 
after the read caffeine tries to drain the cache and this is stuck. I don't see 
the reason from that stack trace.
Someone would have to dig deeper into caffeine to find the root cause.

2017-04-13 9:27 GMT+02:00 Roland Otta:
i had a closer look at the validation executor thread (i hope that's what you 
meant)

it seems the thread is always repeating stuff in
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)

here is the full stack trace ...

i am sorry .. but i have no clue what's happening there ..

com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
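For anyone reproducing this, a thread dump like the one above can be captured
with something along these lines (process lookup and thread-pool name are
assumptions):

    jstack $(pgrep -f CassandraDaemon) | grep -A 40 ValidationExecutor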

Re: hanging validation compaction

2017-04-13 Thread benjamin roth
If you restart the server, does the same validation complete successfully?
If not, have you tried scrubbing the affected sstables?

2017-04-13 11:43 GMT+02:00 Roland Otta :

> thank you guys ... i will
>
> i just wanted to make sure that i am not doing something completely wrong
> before opening an issue
>
> br,
> roland
>
>
> On Thu, 2017-04-13 at 21:35 +1200, Nate McCall wrote:
>
> Not sure what is going on there either. Roland - can you open an issue
> with the information above:
> https://issues.apache.org/jira/browse/CASSANDRA
>
> On Thu, Apr 13, 2017 at 7:49 PM, benjamin roth  wrote:
>
> What I can tell you from that trace - given that this is the correct
> thread and it really hangs there:
>
> The validation is stuck when reading from an SSTable.
> Unfortunately I am no caffeine expert. It looks like the read is cached
> and after the read caffeine tries to drain the cache and this is stuck. I
> don't see the reason from that stack trace.
> Someone would have to dig deeper into caffeine to find the root cause.
>
> 2017-04-13 9:27 GMT+02:00 Roland Otta :
>
> i had a closer look at the validation executor thread (i hope that's what
> you meant)
>
> it seems the thread is always repeating stuff in
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebu
> ffer(ChunkCache.java:235)
>
> here is the full stack trace ...
>
> i am sorry .. but i have no clue what's happening there ..
>
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
> 

Re: hanging validation compaction

2017-04-13 Thread Roland Otta
thank you guys ... i will

i just wanted to make sure that i am not doing something completely wrong 
before opening an issue

br,
roland


On Thu, 2017-04-13 at 21:35 +1200, Nate McCall wrote:
Not sure what is going on there either. Roland - can you open an issue with the 
information above:
https://issues.apache.org/jira/browse/CASSANDRA

On Thu, Apr 13, 2017 at 7:49 PM, benjamin roth wrote:
What I can tell you from that trace - given that this is the correct thread and 
it really hangs there:

The validation is stuck when reading from an SSTable.
Unfortunately I am no caffeine expert. It looks like the read is cached and 
after the read caffeine tries to drain the cache and this is stuck. I don't see 
the reason from that stack trace.
Someone would have to dig deeper into caffeine to find the root cause.

2017-04-13 9:27 GMT+02:00 Roland Otta:
i had a closer look at the validation executor thread (i hope that's what you 
meant)

it seems the thread is always repeating stuff in
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)

here is the full stack trace ...

i am sorry .. but i have no clue what's happening there ..

com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)

Re: IPv6-only host, can't seem to get Cassandra to bind to a public port

2017-04-13 Thread Martijn Pieters
From my original email: 
http://www.mail-archive.com/user@cassandra.apache.org/msg51736.html:

> My configuration changes:
>
> listen_address: 
> listen_interface_prefer_ipv6: true

listen_interface is commented out. I've just now tried again with "# 
listen_interface_prefer_ipv6: false" (option commented out), but the error 
persists. 

I've also rebooted the system, in case the upgrade from base 16.04 to 16.04.2 
left something in a funky state.

On 12/04/2017, 21:39, "Khaja, Raziuddin (NIH/NLM/NCBI) [C]" 
 wrote:

Are you specifying both the listen_address and listen_interface, or just 
one of the two?

Send an example of the following 3 lines.  Here is what I have on my 
2.1.16 cluster that uses ipv6:

listen_address: ::hhh::h::hhh:h
# listen_interface: eth0
# listen_interface_prefer_ipv6: false

Also, looking at my config, I can confirm that it is unnecessary, or even
wrong, to escape the IPv6 address with \ as I suggested before.

-Razi

On 4/12/17, 4:05 PM, "Martijn Pieters"  wrote:

From: "Khaja, Raziuddin (NIH/NLM/NCBI) [C]" 
> Maybe you have to escape the IPV6 addresses in the cassandra.yaml in 
the same way.
> I think it’s worth a try.

Nope, no luck. You get an error instead:

ERROR [main] 2017-04-12 20:03:46,899 CassandraDaemon.java:752 - Exception encountered during startup: Unknown listen_address '\:\:\:\:\:h\:hh\:h'

(actual address digits replaced with h characters).

Martijn
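
A quick way to sanity-check the address format without restarting the node: as
far as I can tell, Cassandra resolves listen_address with java.net.InetAddress
(an assumption based on the startup error above, not a verified code path), so
a plain IPv6 literal should parse while a backslash-escaped one is treated as a
hostname and fails to resolve. A minimal sketch, using the documentation-only
address 2001:db8::1 as a stand-in:

import java.net.InetAddress;
import java.net.UnknownHostException;

public class ListenAddressCheck {
    public static void main(String[] args) {
        check("2001:db8::1");          // plain IPv6 literal - should parse
        check("\\:2001\\:db8\\:\\:1"); // backslash-escaped - should be rejected
    }

    static void check(String addr) {
        try {
            // the yaml value presumably ends up in a call like this
            InetAddress resolved = InetAddress.getByName(addr);
            System.out.println(addr + " -> ok: " + resolved);
        } catch (UnknownHostException e) {
            System.out.println(addr + " -> rejected: " + e.getMessage());
        }
    }
}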

Re: hanging validation compaction

2017-04-13 Thread Nate McCall
Not sure what is going on there either. Roland - can you open an issue with
the information above:
https://issues.apache.org/jira/browse/CASSANDRA

On Thu, Apr 13, 2017 at 7:49 PM, benjamin roth  wrote:

> What I can tell you from that trace - given that this is the correct
> thread and it really hangs there:
>
> The validation is stuck when reading from an SSTable.
> Unfortunately I am no Caffeine expert. It looks like the read is cached
> and after the read Caffeine tries to drain the cache, and this is where
> it is stuck. I can't see the reason from that stack trace alone.
> Someone would have to dig deeper into Caffeine to find the root cause.
>
> 2017-04-13 9:27 GMT+02:00 Roland Otta :
>
>> i had a closer look at the validation executor thread (i hope that's
>> what you meant)
>>
>> it seems the thread is always repeating stuff in
>> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
>>
>> here is the full stack trace ...
>>
>> i am sorry .. but i have no clue what's happening there ..
>>
>> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
>> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
>> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
>> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
>> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
>> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
>> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
>> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
>> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
>> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
>> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
>> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
>> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
>> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
>> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
>> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
>> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
>> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
>> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
>> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
>> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
>> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
>> org.apache.cassandra.db.Columns.apply(Columns.java:377)
>> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
>> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
>> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
>> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
>> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
>> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
>> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
>> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
>> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
>> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)

Re: force processing of pending hinted handoffs

2017-04-13 Thread benjamin roth
I encountered this situation once or twice as well but didn't succeed in
getting the old hints delivered. I just deleted the old hints and ran a
repair instead.

On 13.04.2017 10:35, "Roland Otta" wrote:

> unfortunately it does not.
>
> i guess this is intended for resuming hinted handoff handling in case it
> has been paused with pausehandoff before.
> i have tested it (resuming, as well as pausing & resuming) but it has no
> effect on those old hints
>
> On Thu, 2017-04-13 at 10:27 +0200, benjamin roth wrote:
>
> There is a nodetool command to resume hints. Maybe that helps?
>
> On 13.04.2017 09:42, "Roland Otta" wrote:
>
> oh ... the operation is deprecated according to the docs ...
>
>
> On Thu, 2017-04-13 at 07:40 +, Roland Otta wrote:
> > i figured out that there is an mbean
> > org.apache.cassandra.db:type=HintedHandoffManager with the operation
> > scheduleHintDelivery
> >
> > i guess that's what i would need in that case. at least the docs let me
> > think so:
> > http://javadox.com/org.apache.cassandra/cassandra-all/3.0.0/org/apache/cassandra/db/HintedHandOffManagerMBean.html
> >
> > but every time i try invoking that operation i get an
> > UnsupportedOperationException (tried it with hostname, ip and host-id
> > as parameters - every time the same exception)
> >
> >
> >
> > On Tue, 2017-04-11 at 07:40 +, Roland Otta wrote:
> > > hi,
> > >
> > > sometimes we have the problem that we have hinted handoffs (for
> > > example because of network problems between 2 DCs) that do not get
> > > processed even if the connection problem between the DCs recovers.
> > > Some of the files stay in the hints directory until we restart the
> > > node that contains the hints.
> > >
> > > after the restart of cassandra we can see the proper messages for
> > > the
> > > hints handling
> > >
> > > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Deleted hint file
> > > c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-1.hints
> > > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Finished hinted
> > > handoff of file c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-
> > > 1.hints to endpoint c429ad19-ee9f-4b5a-abcd-1da1516d1003
> > >
> > > is there a way (for example via jmx) to force a node to process
> > > outstanding hints instead of restarting the node?
> > > does anyone know what's the cause for not retrying to process those
> > > hints automatically?
> > >
> > > br,
> > > roland
> > >
>
>


Re: force processing of pending hinted handoffs

2017-04-13 Thread Roland Otta
unfortunately it does not.

i guess this is intended for resuming hinted handoff handling in case it has
been paused with pausehandoff before.
i have tested it (resuming, as well as pausing & resuming) but it has no
effect on those old hints

On Thu, 2017-04-13 at 10:27 +0200, benjamin roth wrote:
There is a nodetool command to resume hints. Maybe that helps?

On 13.04.2017 09:42, "Roland Otta" wrote:
oh ... the operation is deprecated according to the docs ...


On Thu, 2017-04-13 at 07:40 +, Roland Otta wrote:
> i figured out that there is an mbean
> org.apache.cassandra.db:type=HintedHandoffManager with the operation
> scheduleHintDelivery
>
> i guess that's what i would need in that case. at least the docs let me
> think so:
> http://javadox.com/org.apache.cassandra/cassandra-all/3.0.0/org/apache/cassandra/db/HintedHandOffManagerMBean.html
>
> but every time i try invoking that operation i get an
> UnsupportedOperationException (tried it with hostname, ip and host-id
> as parameters - every time the same exception)
>
>
>
> On Tue, 2017-04-11 at 07:40 +, Roland Otta wrote:
> > hi,
> >
> > sometimes we have the problem that we have hinted handoffs (for
> > example because of network problems between 2 DCs) that do not get
> > processed even if the connection problem between the DCs recovers.
> > Some of the files stay in the hints directory until we restart the
> > node that contains the hints.
> >
> > after the restart of cassandra we can see the proper messages for
> > the
> > hints handling
> >
> > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Deleted hint file
> > c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-1.hints
> > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Finished hinted
> > handoff of file c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-
> > 1.hints to endpoint c429ad19-ee9f-4b5a-abcd-1da1516d1003
> >
> > is there a way (for example via jmx) to force a node to process
> > outstanding hints instead of restarting the node?
> > does anyone know what's the cause for not retrying to process those
> > hints automatically?
> >
> > br,
> > roland
> >


Re: force processing of pending hinted handoffs

2017-04-13 Thread benjamin roth
There is a nodetool command to resume hints. Maybe that helps?

On 13.04.2017 09:42, "Roland Otta" wrote:

> oh ... the operation is deprecated according to the docs ...
>
>
> On Thu, 2017-04-13 at 07:40 +, Roland Otta wrote:
> > i figured out that there is an mbean
> > org.apache.cassandra.db:type=HintedHandoffManager with the operation
> > scheduleHintDelivery
> >
> > i guess that's what i would need in that case. at least the docs let me
> > think so:
> > http://javadox.com/org.apache.cassandra/cassandra-all/3.0.0/org/apache/cassandra/db/HintedHandOffManagerMBean.html
> >
> > but every time i try invoking that operation i get an
> > UnsupportedOperationException (tried it with hostname, ip and host-id
> > as parameters - every time the same exception)
> >
> >
> >
> > On Tue, 2017-04-11 at 07:40 +, Roland Otta wrote:
> > > hi,
> > >
> > > sometimes we have the problem that we have hinted handoffs (for
> > > example because of network problems between 2 DCs) that do not get
> > > processed even if the connection problem between the DCs recovers.
> > > Some of the files stay in the hints directory until we restart the
> > > node that contains the hints.
> > >
> > > after the restart of cassandra we can see the proper messages for
> > > the
> > > hints handling
> > >
> > > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Deleted hint file
> > > c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-1.hints
> > > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Finished hinted
> > > handoff of file c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-
> > > 1.hints to endpoint c429ad19-ee9f-4b5a-abcd-1da1516d1003
> > >
> > > is there a way (for example via jmx) to force a node to process
> > > outstanding hints instead of restarting the node?
> > > does anyone know what's the cause for not retrying to process those
> > > hints automatically?
> > >
> > > br,
> > > roland
> > >


Re: hanging validation compaction

2017-04-13 Thread benjamin roth
What I can tell you from that trace - given that this is the correct thread
and it really hangs there:

The validation is stuck when reading from an SSTable.
Unfortunately I am no Caffeine expert. It looks like the read is cached and
after the read Caffeine tries to drain the cache, and this is where it is
stuck. I can't see the reason from that stack trace alone.
Someone would have to dig deeper into Caffeine to find the root cause.
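
For context on those Caffeine frames: a size-bounded Caffeine cache records
each read in an internal ring buffer and drains that buffer in a maintenance
task after reads, which matches the afterRead -> scheduleDrainBuffers ->
maintenance chain in the trace. A minimal sketch of the mechanism (purely
illustrative, not Cassandra's actual ChunkCache wiring; the key type, loader
and sizes are made up):

import com.github.benmanes.caffeine.cache.Caffeine;
import com.github.benmanes.caffeine.cache.LoadingCache;

public class CaffeineDrainSketch {
    public static void main(String[] args) {
        // The size bound is what makes Caffeine track reads and later drain
        // its read buffers (the drainReadBuffer/maintenance frames above).
        LoadingCache<Long, byte[]> chunks = Caffeine.newBuilder()
                .maximumSize(1_024)
                // Same-thread executor, matching the MoreExecutors$DirectExecutor
                // frame in the trace: maintenance runs on the reading thread.
                .executor(Runnable::run)
                .build(offset -> new byte[4096]); // stand-in for "load chunk at offset"

        // Each get() records the read and may trigger
        // afterRead() -> scheduleDrainBuffers() on this very thread.
        for (long offset = 0; offset < 10_000; offset++) {
            chunks.get(offset % 2_048);
        }

        chunks.cleanUp(); // maintenance can also be forced explicitly
    }
}

Because of that direct executor, a drain that never finishes would show up as
a hung reader, which is consistent with the validation thread sitting in
drainTo here.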

2017-04-13 9:27 GMT+02:00 Roland Otta :

> i had a closer look at the validation executor thread (i hope that's what
> you meant)
>
> it seems the thread is always repeating stuff in
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
>
> here is the full stack trace ...
>
> i am sorry .. but i have no clue what's happening there ..
>
> com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
> com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
> com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
> com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
> com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
> com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
> com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
> com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
> org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
> org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
> org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
> org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
> org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
> org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
> org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
> org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
> org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
> org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
> org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
> org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
> org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
> org.apache.cassandra.db.Columns.apply(Columns.java:377)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
> org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
> org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
> org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
> org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)
> org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155)
> org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
> org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:500)
> 

Re: force processing of pending hinted handoffs

2017-04-13 Thread Roland Otta
oh ... the operation is deprecated according to the docs ...


On Thu, 2017-04-13 at 07:40 +, Roland Otta wrote:
> i figured out that there is an mbean
> org.apache.cassandra.db:type=HintedHandoffManager with the operation
> scheduleHintDelivery
>
> i guess that's what i would need in that case. at least the docs let me
> think so:
> http://javadox.com/org.apache.cassandra/cassandra-all/3.0.0/org/apache/cassandra/db/HintedHandOffManagerMBean.html
>
> but every time i try invoking that operation i get an
> UnsupportedOperationException (tried it with hostname, ip and host-id
> as parameters - every time the same exception)
> 
> 
> 
> On Tue, 2017-04-11 at 07:40 +, Roland Otta wrote:
> > hi,
> > 
> > sometimes we have the problem that we have hinted handoffs (for
> > example because of network problems between 2 DCs) that do not get
> > processed even if the connection problem between the DCs recovers.
> > Some of the files stay in the hints directory until we restart the
> > node that contains the hints.
> > 
> > after the restart of cassandra we can see the proper messages for
> > the
> > hints handling
> > 
> > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Deleted hint file
> > c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-1.hints
> > Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Finished hinted
> > handoff of file c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-
> > 1.hints to endpoint c429ad19-ee9f-4b5a-abcd-1da1516d1003
> > 
> > is there a way (for example via jmx) to force a node to process
> > outstanding hints instead of restarting the node?
> > does anyone know what's the cause for not retrying to process those
> > hints automatically?
> > 
> > br,
> > roland
> > 

Re: force processing of pending hinted handoffs

2017-04-13 Thread Roland Otta
i figured out that there is an mbean
org.apache.cassandra.db:type=HintedHandoffManager with the operation
scheduleHintDelivery

i guess that's what i would need in that case. at least the docs let me
think so:
http://javadox.com/org.apache.cassandra/cassandra-all/3.0.0/org/apache/cassandra/db/HintedHandOffManagerMBean.html

but every time i try invoking that operation i get an
UnsupportedOperationException (tried it with hostname, ip and host-id
as parameters - every time the same exception)
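
For reference, such an invocation looks roughly like this as a standalone JMX
client (host, port and the endpoint argument below are placeholders; per the
report above, on 3.10 the invoke() simply throws UnsupportedOperationException
whatever form of endpoint is passed):

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ScheduleHintDelivery {
    public static void main(String[] args) throws Exception {
        // 7199 is Cassandra's default JMX port; the hostname is a placeholder.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://bigd006:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName hhm = new ObjectName(
                    "org.apache.cassandra.db:type=HintedHandoffManager");
            // scheduleHintDelivery(String) as documented in the
            // HintedHandOffManagerMBean javadoc linked above; the argument
            // is a placeholder endpoint address.
            mbs.invoke(hhm, "scheduleHintDelivery",
                       new Object[] { "10.0.0.1" },
                       new String[] { "java.lang.String" });
        }
    }
}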



On Tue, 2017-04-11 at 07:40 +, Roland Otta wrote:
> hi,
> 
> sometimes we have the problem that we have hinted handoffs (for
> example because of network problems between 2 DCs) that do not get
> processed even if the connection problem between the DCs recovers.
> Some of the files stay in the hints directory until we restart the
> node that contains the hints.
> 
> after the restart of cassandra we can see the proper messages for the
> hints handling
> 
> Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Deleted hint file
> c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-1.hints
> Apr 11 09:28:56 bigd006 cassandra: INFO  07:28:56 Finished hinted
> handoff of file c429ad19-ee9f-4b5a-abcd-1da1516d1003-1491895717182-
> 1.hints to endpoint c429ad19-ee9f-4b5a-abcd-1da1516d1003
> 
> is there a way (for example via jmx) to force a node to process
> outstanding hints instead of restarting the node?
> does anyone know what's the cause for not retrying to process those
> hints automatically?
> 
> br,
> roland
> 
> 

Re: hanging validation compaction

2017-04-13 Thread Roland Otta
i had a closer look at the validation executor thread (i hope that's what you
meant)

it seems the thread is always repeating stuff in
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)

here is the full stack trace ...

i am sorry .. but i have no clue what's happening there ..

com.github.benmanes.caffeine.cache.BoundedLocalCache$$Lambda$64/2098345091.accept(Unknown Source)
com.github.benmanes.caffeine.cache.BoundedBuffer$RingBuffer.drainTo(BoundedBuffer.java:104)
com.github.benmanes.caffeine.cache.StripedBuffer.drainTo(StripedBuffer.java:160)
com.github.benmanes.caffeine.cache.BoundedLocalCache.drainReadBuffer(BoundedLocalCache.java:964)
com.github.benmanes.caffeine.cache.BoundedLocalCache.maintenance(BoundedLocalCache.java:918)
com.github.benmanes.caffeine.cache.BoundedLocalCache.performCleanUp(BoundedLocalCache.java:903)
com.github.benmanes.caffeine.cache.BoundedLocalCache$PerformCleanupTask.run(BoundedLocalCache.java:2680)
com.google.common.util.concurrent.MoreExecutors$DirectExecutor.execute(MoreExecutors.java:457)
com.github.benmanes.caffeine.cache.BoundedLocalCache.scheduleDrainBuffers(BoundedLocalCache.java:875)
com.github.benmanes.caffeine.cache.BoundedLocalCache.afterRead(BoundedLocalCache.java:748)
com.github.benmanes.caffeine.cache.BoundedLocalCache.computeIfAbsent(BoundedLocalCache.java:1783)
com.github.benmanes.caffeine.cache.LocalCache.computeIfAbsent(LocalCache.java:97)
com.github.benmanes.caffeine.cache.LocalLoadingCache.get(LocalLoadingCache.java:66)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:235)
org.apache.cassandra.cache.ChunkCache$CachingRebufferer.rebuffer(ChunkCache.java:213)
org.apache.cassandra.io.util.RandomAccessReader.reBufferAt(RandomAccessReader.java:65)
org.apache.cassandra.io.util.RandomAccessReader.reBuffer(RandomAccessReader.java:59)
org.apache.cassandra.io.util.RebufferingInputStream.read(RebufferingInputStream.java:88)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:66)
org.apache.cassandra.io.util.RebufferingInputStream.readFully(RebufferingInputStream.java:60)
org.apache.cassandra.utils.ByteBufferUtil.read(ByteBufferUtil.java:402)
org.apache.cassandra.db.marshal.AbstractType.readValue(AbstractType.java:420)
org.apache.cassandra.db.rows.Cell$Serializer.deserialize(Cell.java:245)
org.apache.cassandra.db.rows.UnfilteredSerializer.readSimpleColumn(UnfilteredSerializer.java:610)
org.apache.cassandra.db.rows.UnfilteredSerializer.lambda$deserializeRowBody$1(UnfilteredSerializer.java:575)
org.apache.cassandra.db.rows.UnfilteredSerializer$$Lambda$84/898489541.accept(Unknown Source)
org.apache.cassandra.utils.btree.BTree.applyForwards(BTree.java:1222)
org.apache.cassandra.utils.btree.BTree.apply(BTree.java:1177)
org.apache.cassandra.db.Columns.apply(Columns.java:377)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserializeRowBody(UnfilteredSerializer.java:571)
org.apache.cassandra.db.rows.UnfilteredSerializer.deserialize(UnfilteredSerializer.java:440)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:95)
org.apache.cassandra.io.sstable.SSTableSimpleIterator$CurrentFormatIterator.computeNext(SSTableSimpleIterator.java:73)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.io.sstable.SSTableIdentityIterator.hasNext(SSTableIdentityIterator.java:122)
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:100)
org.apache.cassandra.db.rows.LazilyInitializedUnfilteredRowIterator.computeNext(LazilyInitializedUnfilteredRowIterator.java:32)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:374)
org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:186)
org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:155)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:500)
org.apache.cassandra.db.rows.UnfilteredRowIterators$UnfilteredRowMergeIterator.computeNext(UnfilteredRowIterators.java:360)
org.apache.cassandra.utils.AbstractIterator.hasNext(AbstractIterator.java:47)
org.apache.cassandra.db.transform.BaseRows.hasNext(BaseRows.java:133)
org.apache.cassandra.db.rows.UnfilteredRowIterators.digest(UnfilteredRowIterators.java:178)
org.apache.cassandra.repair.Validator.rowHash(Validator.java:221)
org.apache.cassandra.repair.Validator.add(Validator.java:160)
org.apache.cassandra.db.compaction.CompactionManager.doValidationCompaction(CompactionManager.java:1364)
org.apache.cassandra.db.compaction.CompactionManager.access$700(CompactionManager.java:85)

Re: hanging validation compaction

2017-04-13 Thread benjamin roth
You should connect to the node with JConsole and see where the compaction
thread is stuck.
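
If JConsole is not at hand, the same stacks can be pulled over JMX from a
small client; a minimal sketch (host and port are placeholders, 7199 being
Cassandra's default JMX port, and it assumes remote JMX access is enabled):

import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import javax.management.MBeanServerConnection;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ValidationThreadDump {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://bigd006:7199/jmxrmi");
        try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ThreadMXBean threads = ManagementFactory.newPlatformMXBeanProxy(
                    mbs, ManagementFactory.THREAD_MXBEAN_NAME, ThreadMXBean.class);
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                // Validation compactions run on the ValidationExecutor pool
                // (see the nodetool tpstats output quoted below).
                if (info.getThreadName().contains("ValidationExecutor")) {
                    System.out.println(info.getThreadName());
                    for (StackTraceElement frame : info.getStackTrace()) {
                        System.out.println("    at " + frame);
                    }
                }
            }
        }
    }
}

Taking a few dumps some seconds apart and seeing the same frames each time is
what distinguishes a genuine hang from slow progress.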

2017-04-13 8:34 GMT+02:00 Roland Otta :

> hi,
>
> we have the following issue on our 3.10 development cluster.
>
> we are doing regular repairs with thelastpickle's fork of reaper.
> sometimes the repair (it is a full repair in that case) hangs because
> of a stuck validation compaction
>
> nodetool compactionstats gives me
> a1bb45c0-1fc6-11e7-81de-0fb0b3f5a345 Validation bds ad_event 805955242 841258085 bytes 95.80%
> we have here no more progress for hours
>
> nodetool tpstats shows
> ValidationExecutor 1 1 16186 0 0
>
> i checked the logs on the affected node and could not find any
> suspicious errors.
>
> anyone who already had this issue and knows how to cope with that?
>
> a restart of the node helps to finish the repair ... but i am not sure
> whether that somehow breaks the full repair
>
> bg,
> roland
>


hanging validation compaction

2017-04-13 Thread Roland Otta
hi,

we have the following issue on our 3.10 development cluster.

we are doing regular repairs with thelastpickle's fork of reaper.
sometimes the repair (it is a full repair in that case) hangs because
of a stuck validation compaction

nodetool compactionstats gives me
a1bb45c0-1fc6-11e7-81de-0fb0b3f5a345 Validation bds ad_event 805955242 841258085 bytes 95.80%
we have here no more progress for hours

nodetool tpstats shows
ValidationExecutor 1 1 16186 0 0

i checked the logs on the affected node and could not find any
suspicious errors.

anyone who already had this issue and knows how to cope with that?

a restart of the node helps to finish the repair ... but i am not sure
whether that somehow breaks the full repair

bg,
roland