Re: Benefit of LOCAL_SERIAL consistency

2016-12-08 Thread Sylvain Lebresne
On Fri, Dec 9, 2016 at 1:35 AM, Edward Capriolo 
wrote:
>
> I copied the wrong issue:
>
> The core issue was this: https://issues.apache.
> org/jira/browse/CASSANDRA-6123
>

Well, my previous remark applies equally well to this ticket, so let me just
copy-paste:
"That ticket has nothing to do with LWT. In fact, LWT is the one mechanism in
Cassandra where this ticket has no impact whatsoever, because the whole point
of the mechanism is to ensure timestamps are assigned in a collision-free
manner."


> Which I believe was one of the key issues created from the "call me maybe" posts.
>

The "call me maybe" blog post was not at exclusively about LWT and its
linearizability, so 6123 may have been an issue created following that
post, but it's
unrelated to LWT and again, don't affect it.


>
> 6123 references: this https://issues.apache.org/jira/browse/CASSANDRA-8892
>
>
> Which duplicates:
>
> https://issues.apache.org/jira/browse/CASSANDRA-6123
>
> So it is unclear to me what was resolved.
>

Again, it's unrelated to LWT, but I don't think there is anything unclear
here: 6123 is not resolved, as indicated by its JIRA "resolution". Maybe what
you found unclear is that CASSANDRA-8892 has the status "resolved", but that's
just JIRA ugliness: marking a ticket "duplicate" implies "resolving" it as far
as JIRA is concerned, so when you see "duplicate", you should basically ignore
that ticket and only look at the ticket it duplicates.


Re: When commitlog segment files are removed actually?

2016-12-08 Thread Satoshi Hikida
Hi, Yoshi

Thank you for your reply.

Of course I read the document you linked, but I'm still confused about the
deletion timing of the commit log segment files, because their total size
grows to 3GB even though there are many memtable flushes.

Anyway, I'll check the deletion timing of the commit log segment files while
changing the memtable size (memtable_heap_space_in_mb).
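
Here is a rough sketch of the kind of check I have in mind: sampling the
total size of the commit log directory so that sudden drops can be matched
against the memtable flush messages in system.log (the directory path and
the 5-second interval are placeholders for my environment):

    // Sketch: periodically print the total size of the commit log directory.
    // Drops in this number should line up with memtable flushes in system.log.
    import java.io.File;

    public class CommitLogSizeSampler {
        public static void main(String[] args) throws InterruptedException {
            File commitLogDir = new File("/var/lib/cassandra/commitlog"); // placeholder path
            while (true) {
                long totalBytes = 0;
                File[] segments = commitLogDir.listFiles();
                if (segments != null) {
                    for (File segment : segments) {
                        if (segment.isFile()) {
                            totalBytes += segment.length();
                        }
                    }
                }
                System.out.printf("%tT commitlog total: %.1f MB%n",
                        System.currentTimeMillis(), totalBytes / (1024.0 * 1024.0));
                Thread.sleep(5000); // placeholder interval
            }
        }
    }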

Regards,
Satoshi


On Thu, Dec 8, 2016 at 5:56 PM, Yoshi Kimoto  wrote:

> That is probably because the relevant memtable is flushed to an SSTable and
> the content of the commit log is not required any more, so the segments can
> be recycled.
>
> See: http://docs.datastax.com/en/cassandra/3.x/cassandra/
> dml/dmlHowDataWritten.html
>
>
>
> 2016年12月8日(木) 17:45 Satoshi Hikida :
>
>> Hi,
>>
>> I have a question about commit log.
>>
>> When are commit log segment files actually removed?
>>
>> I'm running a single-node cluster for a few weeks to test C* performance.
>> My simple test has been issuing only read and write requests to the
>> cluster, and the data size (SSTable size) is increasing monotonically.
>>
>> Here are customized C* settings in my environment.
>> - commitlog_total_space_in_mb: 8192
>> - memtable_flush_writers: 4
>> - memtable_heap_space_in_mb: 2048
>>
>> And here is the node specs used in my environment.
>> - CPU core: 4
>> - Memory: 16GB
>> - SSD: 100GB * 2 (one for commit log, the other for data)
>>
>> In this case, I expected that the total size of the commit log segment
>> files would increase up to 8GB. However, when the total size reaches
>> about 3GB, the commit log segment files are removed and the total size
>> drops to around 250MB.
>>
>> I wonder why the commit log segment files are removed before the total
>> size reaches 8GB (= commitlog_total_space_in_mb)? Could anyone give me
>> some advice?
>>
>>
>> Regards,
>> Satosh
>>
>>
>>
>>


Re: Benefit of LOCAL_SERIAL consistency

2016-12-08 Thread Edward Capriolo
On Thu, Dec 8, 2016 at 5:10 AM, Sylvain Lebresne 
wrote:

> > The reason you don't want to use SERIAL in multi-DC clusters
>
> I'm not a fan of blanket statements like that. There is a high cost to
> SERIAL consistency in multi-DC setups, but if you *need* global
> linearizability, then you have no choice and the latency may be acceptable
> for your use case. Take the example of using LWT to ensure no two users
> create accounts with the same name in your system: it's something you
> don't want to screw up, but it's also something for which a high-ish
> latency is probably acceptable. I don't think users would get super
> pissed off because registering a new account on some service takes 500ms.
>
> So yes it's costly, as are most things that willingly depend on cross-DC
> latency, but I don't think that means it's never ever useful.
>
> > So, I am not sure what a good use case for LOCAL_SERIAL is.
>
> Well, a good use case is when you're OK with operations within a
> datacenter being linearizable, but can accept that two operations in
> different datacenters are not. Imagine a service that pins a given user
> to a DC on login for various reasons; that service might be fine using
> LOCAL_SERIAL for operations confined to a given user session, since it
> knows they are DC-local.
>
> So I think both SERIAL and LOCAL_SERIAL have their uses, though we
> absolutely agree they are not meant to be used together. And it's
> certainly worth trying to design your system in a way that makes sure
> LOCAL_SERIAL is enough for you, if you can, since SERIAL is pretty
> costly. But that doesn't mean there aren't cases where you care more
> about global linearizability than latency: engineering is all about
> trade-offs.
>
> > I am not sure what the state of this is anymore, but I was under the
> > impression the linearizability of LWT was in question. I never heard it
> > specifically addressed.
>
> That's a pretty vague statement to make, let's not get into FUD. You
> "might" be thinking of a fairly old blog post by Aphyr that tested LWT in
> their very early days, and there were bugs indeed, but those were fixed a
> long time ago. Since then, his tests and much more were performed
> (http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)
> and no problem with linearizability that I know of has been found. Don't
> get me wrong, any code can have subtle bugs, and not finding problems
> doesn't guarantee there isn't one, but if someone has demonstrated legit
> problems with the linearizability of LWT, it's unknown to me and I'm
> watching this pretty carefully.
>
> I'll note, to be complete, that I'm not pretending the LWT implementation
> is perfect; it's not (it's slow, for one), and using them correctly can be
> more challenging than it may sound at first (mostly because you need to
> handle query timeouts properly and that's not always simple, sometimes
> requiring a more complex data model than you'd want), but those are not
> breaks of linearizability.
>
> > https://issues.apache.org/jira/browse/CASSANDRA-6106
>
> That ticket has nothing to do with LWT. In fact, LWT is the one mechanism
> in Cassandra where this ticket has no impact whatsoever, because the
> whole point of the mechanism is to ensure timestamps are assigned in a
> collision-free manner.
>
>
> On Thu, Dec 8, 2016 at 8:32 AM, Hiroyuki Yamada 
> wrote:
>
>> Hi DuyHai,
>>
>> Thank you for the comments.
>> Yes, that's exactly what I mean.
>> (Your comment is very helpful to support my opinion.)
>>
>> As you said, SERIAL with multi-DCs incurs a latency increase,
>> but it's a trade-off between latency and high availability, because one
>> DC can go down in a disaster.
>> I don't think there is any way to achieve global linearizability
>> without a latency increase, right?
>>
>> > Edward
>> Thank you for the ticket.
>> I'll read it through.
>>
>> Thanks,
>> Hiro
>>
>> On Thu, Dec 8, 2016 at 12:01 AM, Edward Capriolo 
>> wrote:
>> >
>> >
>> > On Wed, Dec 7, 2016 at 8:25 AM, DuyHai Doan 
>> wrote:
>> >>
>> >> The reason you don't want to use SERIAL in multi-DC clusters is the
>> >> prohibitive cost of lightweight transactions (in terms of latency),
>> >> especially if your data centers are separated by continents. A ping
>> >> from London to New York takes 52ms just from the speed of light in
>> >> optical cable. Since a LightWeight Transaction involves 4 network
>> >> round-trips, it means at least 200ms just for raw network transfer,
>> >> not even taking into account the cost of processing the operation.
>> >>
>> >> You're right to raise a warning about mixing LOCAL_SERIAL with
>> >> SERIAL. LOCAL_SERIAL guarantees you linearizability inside a DC;
>> >> SERIAL guarantees you linearizability across multiple DCs.
>> >>
>> >> If I have 3 DCs with RF = 3 each (total 9 replicas) and I did an
>> >> INSERT IF NOT EXISTS with 

Re: Huge files in level 1 and level 0 of LeveledCompactionStrategy

2016-12-08 Thread Sotirios Delimanolis
What do you mean? 
I'm logging the list of files when creating the CompactionTask and it's showing 
these
-rw-r--r-- 1 user group      3540336 Dec  7 23:40 lb-29715834-big-Data.db
-rw-r--r-- 1 user group      5997853 Dec  7 22:07 lb-29715833-big-Data.db
-rw-r--r-- 1 user group      5210561 Dec  7 20:34 lb-29715832-big-Data.db
-rw-r--r-- 1 user group      9280836 Dec  7 19:01 lb-29715831-big-Data.db
-rw-r--r-- 1 user group      6161434 Dec  7 17:26 lb-29715830-big-Data.db
-rw-r--r-- 1 user group      6370388 Dec  7 16:02 lb-29715829-big-Data.db
-rw-r--r-- 1 user group      5333813 Dec  7 14:37 lb-29715828-big-Data.db
-rw-r--r-- 1 user group      3201999 Dec  7 13:02 lb-29715827-big-Data.db
-rw-r--r-- 1 user group      2358003 Dec  7 11:08 lb-29715826-big-Data.db
-rw-r--r-- 1 user group      1529995 Dec  7 08:50 lb-29715825-big-Data.db
-rw-r--r-- 1 user group      4318164 Dec  7 06:18 lb-29715824-big-Data.db
-rw-r--r-- 1 user group      4992116 Dec  7 04:04 lb-29715823-big-Data.db
-rw-r--r-- 1 user group      3935687 Dec  7 02:21 lb-29715822-big-Data.db
-rw-r--r-- 1 user group     52641621 Dec  7 01:19 lb-29713870-big-Data.db
-rw-r--r-- 1 user group    210914040 Dec  7 01:19 lb-29713865-big-Data.db
-rw-r--r-- 1 user group    210009811 Dec  7 01:18 lb-29713861-big-Data.db
-rw-r--r-- 1 user group    209900194 Dec  7 01:18 lb-29713857-big-Data.db
-rw-r--r-- 1 user group    210341449 Dec  7 01:18 lb-29713852-big-Data.db
-rw-r--r-- 1 user group    209886959 Dec  7 01:05 lb-29713341-big-Data.db
-rw-r--r-- 1 user group    210582486 Dec  7 01:05 lb-29713338-big-Data.db
-rw-r--r-- 1 user group    211389548 Dec  7 01:00 lb-29713163-big-Data.db
-rw-r--r-- 1 user group    212258569 Dec  7 01:00 lb-29713158-big-Data.db
-rw-r--r-- 1 user group    210187074 Dec  7 00:58 lb-29713093-big-Data.db
-rw-r--r-- 1 user group  20741012144 Dec  4 12:57 lb-29685218-big-Data.db
-rw-r--r-- 1 user group 212429252597 Dec  1 08:20 lb-29678145-big-Data.db
-rw-r--r-- 1 user group  78444316655 Dec  1 03:58 lb-29677495-big-Data.db
-rw-r--r-- 1 user group 138933736915 Dec  1 03:41 lb-29677471-big-Data.db
-rw-r--r-- 1 user group   1299907313 Nov 28 16:45 lb-29675769-big-Data.db
-rw-r--r-- 1 user group   9133813452 Nov 28 12:57 lb-29675721-big-Data.db
-rw-r--r-- 1 user group  41138758221 Nov 28 12:40 lb-29675715-big-Data.db
-rw-r--r-- 1 user group  65810157304 Nov 28 06:42 lb-29675547-big-Data.db
-rw-r--r-- 1 user group  39054510979 Nov 25 13:43 lb-29672887-big-Data.db
-rw-r--r-- 1 user group  40439104157 Nov 23 11:16 lb-29670672-big-Data.db
Running sstablemetadata on these gives me:

SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715834-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715833-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715832-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715831-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715830-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715829-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715828-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715827-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715826-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715825-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715824-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715823-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29715822-big
SSTable max local deletion time: 2147483647
SSTable Level: 0
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713870-big
SSTable max local deletion time: 2147483647
SSTable Level: 1
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713865-big
SSTable max local deletion time: 2147483647
SSTable Level: 1
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713861-big
SSTable max local deletion time: 2147483647
SSTable Level: 1
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713857-big
SSTable max local deletion time: 2147483647
SSTable Level: 1
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713852-big
SSTable max local deletion time: 2147483647
SSTable Level: 1
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713341-big
SSTable max local deletion time: 2147483647
SSTable Level: 1
SSTable: /var/cassandra/data/Keyspace1/Table1/lb-29713338-big
SSTable max local deletion
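
For reference, a small sketch of how output like the above can be tallied
per level (it shells out to sstablemetadata and greps the "SSTable Level: N"
lines; the data directory path is a placeholder):

    // Sketch: run sstablemetadata over every Data.db file of one table and
    // count how many SSTables sit in each level.
    import java.io.BufferedReader;
    import java.io.File;
    import java.io.InputStreamReader;
    import java.util.TreeMap;
    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class LevelTally {
        public static void main(String[] args) throws Exception {
            File dataDir = new File("/var/cassandra/data/Keyspace1/Table1"); // placeholder
            Pattern levelLine = Pattern.compile("SSTable Level:\\s*(\\d+)");
            TreeMap<Integer, Integer> counts = new TreeMap<>();

            for (File f : dataDir.listFiles((dir, name) -> name.endsWith("-Data.db"))) {
                Process p = new ProcessBuilder("sstablemetadata", f.getPath())
                        .redirectErrorStream(true).start();
                try (BufferedReader r = new BufferedReader(
                        new InputStreamReader(p.getInputStream()))) {
                    String line;
                    while ((line = r.readLine()) != null) {
                        Matcher m = levelLine.matcher(line);
                        if (m.find()) {
                            counts.merge(Integer.parseInt(m.group(1)), 1, Integer::sum);
                        }
                    }
                }
                p.waitFor();
            }
            counts.forEach((level, n) ->
                    System.out.println("Level " + level + ": " + n + " sstables"));
        }
    }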

Imprecise Repair

2016-12-08 Thread Shalom Sagges
Hi Everyone,

I'm performing a repair as I usually do, but this time I got a weird
notification:
"Requested range intersects a local range but is not fully contained in
one; this would lead to imprecise repair".

I've never encountered this before during a repair.
The repair command that I ran is:
*nodetool repair -par -local mykeyspace mycolumnfamily*

The difference from other repairs I've done is adding *-local* and removing
*-pr*, since I'm adding nodes in the other DC (there are 2 DCs in the cluster)
and don't want the repair to interfere with the bootstrap.

I found CASSANDRA-7317 but saw it was fixed in 2.0.9. The version I'm using
is 2.0.14.
Any ideas?

Thanks!


Shalom Sagges
DBA
T: +972-74-700-4035


Re: node decommission throttled

2016-12-08 Thread Eric Evans
On Thu, Dec 8, 2016 at 3:27 AM, Aleksandr Ivanov  wrote:
> On sending side no high CPU/IO/etc utilization.
> But on receiving node I see that one "STREAM-IN" thread takes 100% CPU and
> it just doesn't scale by design since "Each stream is a single thread"
> (http://www.mail-archive.com/user@cassandra.apache.org/msg42095.html)

Right, that is your bottleneck.  There is no per-host concurrency; I'm
afraid there is not much you can do about it at the moment.

https://issues.apache.org/jira/browse/CASSANDRA-4663 might be relevant
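
If you want to double-check that attribution, here is a rough sketch that
finds the busiest thread via "top -H" and looks its name up in a jstack dump
(jstack prints the native thread id in hex as "nid=0x..."; the PID and the
%CPU column index are assumptions that may vary with your top version):

    // Sketch: find the busiest thread of the Cassandra process and look up
    // its name (e.g. a STREAM-IN thread) in a jstack thread dump.
    import java.io.BufferedReader;
    import java.io.InputStreamReader;

    public class HotThreadFinder {
        static String run(String... cmd) throws Exception {
            Process p = new ProcessBuilder(cmd).redirectErrorStream(true).start();
            StringBuilder out = new StringBuilder();
            try (BufferedReader r = new BufferedReader(
                    new InputStreamReader(p.getInputStream()))) {
                String line;
                while ((line = r.readLine()) != null) out.append(line).append('\n');
            }
            p.waitFor();
            return out.toString();
        }

        public static void main(String[] args) throws Exception {
            String pid = "12345"; // Cassandra process id (placeholder)
            long hotTid = -1;
            double hotCpu = -1;
            for (String line : run("top", "-H", "-b", "-n", "1", "-p", pid).split("\n")) {
                String[] cols = line.trim().split("\\s+");
                if (cols.length > 8 && cols[0].matches("\\d+")) {
                    try {
                        double cpu = Double.parseDouble(cols[8]); // %CPU column (layout-dependent)
                        if (cpu > hotCpu) {
                            hotCpu = cpu;
                            hotTid = Long.parseLong(cols[0]);
                        }
                    } catch (NumberFormatException ignored) { }
                }
            }
            System.out.printf("hottest thread: tid=%d cpu=%.1f%%%n", hotTid, hotCpu);

            // jstack reports the same native thread id in hex as nid=0x...
            String nid = "nid=0x" + Long.toHexString(hotTid);
            for (String line : run("jstack", pid).split("\n")) {
                if (line.contains(nid + " ")) System.out.println(line);
            }
        }
    }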

Cheers,

-- 
Eric Evans
john.eric.ev...@gmail.com


Re: Huge files in level 1 and level 0 of LeveledCompactionStrategy

2016-12-08 Thread Eric Evans
On Wed, Dec 7, 2016 at 6:35 PM, Sotirios Delimanolis
 wrote:
> I have a couple of SSTables that are humongous
>
> -rw-r--r-- 1 user group 138933736915 Dec  1 03:41 lb-29677471-big-Data.db
> -rw-r--r-- 1 user group  78444316655 Dec  1 03:58 lb-29677495-big-Data.db
> -rw-r--r-- 1 user group 212429252597 Dec  1 08:20 lb-29678145-big-Data.db
>
> sstablemetadata reports that these are all in SSTable Level 0. This table is
> running with
>
> compaction = {'sstable_size_in_mb': '200', 'tombstone_threshold': '0.25',
> 'tombstone_compaction_interval': '300', 'class':
> 'org.apache.cassandra.db.compaction.LeveledCompactionStrategy'}
>
> How could this happen?

The subject says levels 1 and 0; what does level 1 look like?

-- 
Eric Evans
john.eric.ev...@gmail.com


Re: C* 3.5 - Not all SSTables were removed in DTCS

2016-12-08 Thread Jacek Luczak
/* Sorry for the mess, accidental Ctrl-Enter */

Fellow C* users,

I've got a cluster of C* 3.5 serving a single keyspace with a DTCS table
and no deletes. We knew that data does not expire on time (even after
gc_grace_period) - that's something we wanted to investigate later,
eventually letting C* keep the data longer. The moment when C* decided to
remove old data happened yesterday, and on one node it reduced the data
size by 40% (removing ~350 SSTables). On the remaining nodes it was only
~5%. That's of course wrong; all nodes should go down by 40%. A closer
look showed that ~40% of the data is indeed out of TTL and maps to ~350
SSTables.

I've triggered a user-defined compaction on a few of the SSTables, but ...
nodetool compact exited with 0 a second later and the files remain
untouched (no entry in debug.log about a compaction). The files are
mmap()ed by the process as well.
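
For reference, user-defined compaction normally goes through JMX rather
than plain nodetool compact; here is a minimal sketch of that call (it
assumes the org.apache.cassandra.db:type=CompactionManager MBean and a
forceUserDefinedCompaction(String) operation taking comma-separated Data.db
file names; host, port and file names below are placeholders):

    // Sketch: ask the CompactionManager MBean to compact specific SSTables.
    import javax.management.MBeanServerConnection;
    import javax.management.ObjectName;
    import javax.management.remote.JMXConnector;
    import javax.management.remote.JMXConnectorFactory;
    import javax.management.remote.JMXServiceURL;

    public class ForceCompaction {
        public static void main(String[] args) throws Exception {
            JMXServiceURL url = new JMXServiceURL(
                    "service:jmx:rmi:///jndi/rmi://127.0.0.1:7199/jmxrmi"); // placeholder host/port
            try (JMXConnector connector = JMXConnectorFactory.connect(url)) {
                MBeanServerConnection conn = connector.getMBeanServerConnection();
                ObjectName compactionManager =
                        new ObjectName("org.apache.cassandra.db:type=CompactionManager");
                String files = "lb-1234-big-Data.db,lb-1235-big-Data.db"; // placeholder file names
                conn.invoke(compactionManager, "forceUserDefinedCompaction",
                            new Object[] { files },
                            new String[] { "java.lang.String" });
            }
        }
    }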

Is there a known bug for this behaviour?

Why did one node remove *all* the old data at once (there's data that
should have expired 3 months ago) while the others did not?

All nodes are equal in spec and configuration.

-Jacek


2016-12-08 14:11 GMT+01:00 Jacek Luczak :
> Fellow C* users,
>
> I've got a cluster of C* 3.5 serving a single keyspace with a DTCS table
> and no deletes. We knew that data does not expire on time (even after
> gc_grace_period) - that's something we wanted to investigate later,
> eventually letting C* keep the data longer. The moment when C* decided to
> remove old data happened yesterday, and on one node it reduced the data
> size by 40% (removing ~350 SSTables). On the remaining nodes it was only
> ~5%. That's of course wrong; all nodes should go down by 40%. A closer
> look showed that ~40% of the data is indeed out of TTL and maps to ~350
> SSTables.
>
> I've triggered a user-defined compaction on a few of the tables but ...
> nodetool compact


C* 3.5 - Not all SSTables were removed in DTCS

2016-12-08 Thread Jacek Luczak
Fellow C* users,

I've got a cluster of C* 3.5 serving a single keyspace with a DTCS table
and no deletes. We knew that data does not expire on time (even after
gc_grace_period) - that's something we wanted to investigate later,
eventually letting C* keep the data longer. The moment when C* decided to
remove old data happened yesterday, and on one node it reduced the data
size by 40% (removing ~350 SSTables). On the remaining nodes it was only
~5%. That's of course wrong; all nodes should go down by 40%. A closer
look showed that ~40% of the data is indeed out of TTL and maps to ~350
SSTables.

I've triggered a user-defined compaction on a few of the tables but ...
nodetool compact


Re: Benefit of LOCAL_SERIAL consistency

2016-12-08 Thread Sylvain Lebresne
> The reason you don't want to use SERIAL in multi-DC clusters

I'm not a fan of blanket statements like that. There is a high cost to
SERIAL consistency in multi-DC setups, but if you *need* global
linearizability, then you have no choice and the latency may be acceptable
for your use case. Take the example of using LWT to ensure no two users
create accounts with the same name in your system: it's something you don't
want to screw up, but it's also something for which a high-ish latency is
probably acceptable. I don't think users would get super pissed off because
registering a new account on some service takes 500ms.

So yes it's costly, as are most things that willingly depend on cross-DC
latency, but I don't think that means it's never ever useful.

> So, I am not sure what a good use case for LOCAL_SERIAL is.

Well, a good use case is when you're OK with operations within a datacenter
being linearizable, but can accept that two operations in different
datacenters are not. Imagine a service that pins a given user to a DC on
login for various reasons; that service might be fine using LOCAL_SERIAL
for operations confined to a given user session, since it knows they are
DC-local.

So I think both SERIAL and LOCAL_SERIAL have their uses, though we
absolutely agree they are not meant to be used together. And it's certainly
worth trying to design your system in a way that makes sure LOCAL_SERIAL is
enough for you, if you can, since SERIAL is pretty costly. But that doesn't
mean there aren't cases where you care more about global linearizability
than latency: engineering is all about trade-offs.

> I am not sure what the state of this is anymore, but I was under the
> impression the linearizability of LWT was in question. I never heard it
> specifically addressed.

That's a pretty vague statement to make, let's not get into FUD. You
"might" be thinking of a fairly old blog post by Aphyr that tested LWT in
their very early days, and there were bugs indeed, but those were fixed a
long time ago. Since then, his tests and much more were performed
(http://www.datastax.com/dev/blog/testing-apache-cassandra-with-jepsen)
and no problem with linearizability that I know of has been found. Don't
get me wrong, any code can have subtle bugs, and not finding problems
doesn't guarantee there isn't one, but if someone has demonstrated legit
problems with the linearizability of LWT, it's unknown to me and I'm
watching this pretty carefully.

I'll note, to be complete, that I'm not pretending the LWT implementation
is perfect; it's not (it's slow, for one), and using them correctly can be
more challenging than it may sound at first (mostly because you need to
handle query timeouts properly and that's not always simple, sometimes
requiring a more complex data model than you'd want), but those are not
breaks of linearizability.
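
To make the timeout point concrete, here is a minimal sketch with the Java
driver (3.x API assumed; the keyspace, table and column names are made up).
The idea is simply that a timed-out conditional write has an unknown
outcome, so you re-read at SERIAL before deciding whether to retry:

    // Minimal sketch: LWT insert with SERIAL linearizability and a fallback
    // read when the CAS write times out. Schema and names are hypothetical.
    import com.datastax.driver.core.Cluster;
    import com.datastax.driver.core.ConsistencyLevel;
    import com.datastax.driver.core.ResultSet;
    import com.datastax.driver.core.Row;
    import com.datastax.driver.core.Session;
    import com.datastax.driver.core.SimpleStatement;
    import com.datastax.driver.core.exceptions.WriteTimeoutException;

    public class CreateAccount {
        public static void main(String[] args) {
            try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
                 Session session = cluster.connect("myks")) {

                SimpleStatement insert = new SimpleStatement(
                        "INSERT INTO accounts (name, owner) VALUES ('alice', 'id-123') IF NOT EXISTS");
                insert.setConsistencyLevel(ConsistencyLevel.QUORUM);
                // SERIAL for global linearizability; LOCAL_SERIAL would confine it to the local DC.
                insert.setSerialConsistencyLevel(ConsistencyLevel.SERIAL);

                try {
                    ResultSet rs = session.execute(insert);
                    System.out.println(rs.wasApplied() ? "account created" : "name already taken");
                } catch (WriteTimeoutException e) {
                    // Outcome unknown: re-read at SERIAL to see who owns the name now.
                    SimpleStatement read = new SimpleStatement(
                            "SELECT owner FROM accounts WHERE name = 'alice'");
                    read.setConsistencyLevel(ConsistencyLevel.SERIAL);
                    Row row = session.execute(read).one();
                    System.out.println(row == null ? "still unknown, retry the insert"
                                                   : "owner is " + row.getString("owner"));
                }
            }
        }
    }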

> https://issues.apache.org/jira/browse/CASSANDRA-6106

That ticket has nothing to do with LWT. In fact, LWT is the one mechanism
in Cassandra where this ticket has no impact whatsoever, because the whole
point of the mechanism is to ensure timestamps are assigned in a
collision-free manner.


On Thu, Dec 8, 2016 at 8:32 AM, Hiroyuki Yamada  wrote:

> Hi DuyHai,
>
> Thank you for the comments.
> Yes, that's exactly what I mean.
> (Your comment is very helpful to support my opinion.)
>
> As you said, SERIAL with multi-DCs incurs a latency increase,
> but it's a trade-off between latency and high availability, because one
> DC can go down in a disaster.
> I don't think there is any way to achieve global linearizability
> without a latency increase, right?
>
> > Edward
> Thank you for the ticket.
> I'll read it through.
>
> Thanks,
> Hiro
>
> On Thu, Dec 8, 2016 at 12:01 AM, Edward Capriolo 
> wrote:
> >
> >
> > On Wed, Dec 7, 2016 at 8:25 AM, DuyHai Doan 
> wrote:
> >>
> >> The reason you don't want to use SERIAL in multi-DC clusters is the
> >> prohibitive cost of lightweight transactions (in terms of latency),
> >> especially if your data centers are separated by continents. A ping
> >> from London to New York takes 52ms just from the speed of light in
> >> optical cable. Since a LightWeight Transaction involves 4 network
> >> round-trips, it means at least 200ms just for raw network transfer,
> >> not even taking into account the cost of processing the operation.
> >>
> >> You're right to raise a warning about mixing LOCAL_SERIAL with SERIAL.
> >> LOCAL_SERIAL guarantees you linearizability inside a DC; SERIAL
> >> guarantees you linearizability across multiple DCs.
> >>
> >> If I have 3 DCs with RF = 3 each (total 9 replicas) and I did an
> >> INSERT IF NOT EXISTS with LOCAL_SERIAL in DC1, then it's possible that
> >> a subsequent INSERT IF NOT EXISTS on the same record succeeds when
> >> using SERIAL, because SERIAL on 9 replicas = at least 5 replicas.
> >> Those 5 replicas which respond can come from DC2 and DC3 and thus did
> >> not apply yet 

Re: node decommission throttled

2016-12-08 Thread Aleksandr Ivanov
Nope, no MVs

On Thu, Dec 8, 2016 at 11:31 AM, Benjamin Roth 
wrote:

> Just an educated guess: do you have materialized views? They are known to
> stream very slowly.
>
> Am 08.12.2016 10:28 schrieb "Aleksandr Ivanov" :
>
>> Yes, I use compression.
>> Tried without and it gave ~15% increase in speed, but is still too low
>> (~35Mbps)
>>
>> On sending side no high CPU/IO/etc utilization.
>> But on receiving node I see that one "STREAM-IN" thread takes 100% CPU
>> and it just doesn't scale by design since "Each stream is a single
>> thread" (http://www.mail-archive.com/user@cassandra.apache.org/msg42
>> 095.html)
>>
>>
>>
>>> > I'm trying to decommission one C* node from 6 nodes cluster and see
>>> that
>>> > outbound network traffic on this node doesn't go over ~30Mb/s.
>>> > Looks like it is throttled somewhere in C*
>>>
>>> Do you use compression?  Try taking a thread dump and see what the
>>> utilization of the sending threads are.
>>>
>>>
>>> --
>>> Eric Evans
>>> john.eric.ev...@gmail.com
>>>
>>
>>


Re: node decommission throttled

2016-12-08 Thread Benjamin Roth
Just an educated guess: do you have materialized views? They are known to
stream very slowly.

Am 08.12.2016 10:28 schrieb "Aleksandr Ivanov" :

> Yes, I use compression.
> Tried without and it gave ~15% increase in speed, but is still too low
> (~35Mbps)
>
> On sending side no high CPU/IO/etc utilization.
> But on receiving node I see that one "STREAM-IN" thread takes 100% CPU and
> it just doesn't scale by design since "Each stream is a single thread" (
> http://www.mail-archive.com/user@cassandra.apache.org/msg42095.html)
>
>
>
>> > I'm trying to decommission one C* node from 6 nodes cluster and see that
>> > outbound network traffic on this node doesn't go over ~30Mb/s.
>> > Looks like it is throttled somewhere in C*
>>
>> Do you use compression?  Try taking a thread dump and see what the
>> utilization of the sending threads are.
>>
>>
>> --
>> Eric Evans
>> john.eric.ev...@gmail.com
>>
>
>


Re: node decommission throttled

2016-12-08 Thread Aleksandr Ivanov
On sending side no high CPU/IO/etc utilization.

But on receiving node I see that one "STREAM-IN" thread takes 100% CPU and
it just doesn't scale by design since "Each stream is a single thread" (
http://www.mail-archive.com/user@cassandra.apache.org/msg42095.html)


> Maybe your system cannot stream faster. Is your CPU or HD/SSD fully
> utilized?
>
> Am 07.12.2016 16:07 schrieb "Eric Evans" :
>
>> On Tue, Dec 6, 2016 at 9:54 AM, Aleksandr Ivanov 
>> wrote:
>> > I'm trying to decommission one C* node from 6 nodes cluster and see that
>> > outbound network traffic on this node doesn't go over ~30Mb/s.
>> > Looks like it is throttled somewhere in C*
>>
>> Do you use compression?  Try taking a thread dump and see what the
>> utilization of the sending threads are.
>>
>>
>> --
>> Eric Evans
>> john.eric.ev...@gmail.com
>>
>


Re: node decommission throttled

2016-12-08 Thread Aleksandr Ivanov
Yes, I use compression.
Tried without and it gave ~15% increase in speed, but is still too low
(~35Mbps)

On sending side no high CPU/IO/etc utilization.
But on receiving node I see that one "STREAM-IN" thread takes 100% CPU and
it just doesn't scale by design since "Each stream is a single thread" (
http://www.mail-archive.com/user@cassandra.apache.org/msg42095.html)



> > I'm trying to decommission one C* node from 6 nodes cluster and see that
> > outbound network traffic on this node doesn't go over ~30Mb/s.
> > Looks like it is throttled somewhere in C*
>
> Do you use compression?  Try taking a thread dump and see what the
> utilization of the sending threads are.
>
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>


Re: When commitlog segment files are removed actually?

2016-12-08 Thread Yoshi Kimoto
That is probably because the relevant memtable is flushed to an SSTable and
the content of the commit log is not required any more, so the segments can
be recycled.

See:
http://docs.datastax.com/en/cassandra/3.x/cassandra/dml/dmlHowDataWritten.html



2016年12月8日(木) 17:45 Satoshi Hikida :

> Hi,
>
> I have a question about commit log.
>
> When are commit log segment files actually removed?
>
> I'm running a single-node cluster for a few weeks to test C* performance.
> My simple test has been issuing only read and write requests to the
> cluster, and the data size (SSTable size) is increasing monotonically.
>
> Here are customized C* settings in my environment.
> - commitlog_total_space_in_mb: 8192
> - memtable_flush_writers: 4
> - memtable_heap_space_in_mb: 2048
>
> And here is the node specs used in my environment.
> - CPU core: 4
> - Memory: 16GB
> - SSD: 100GB * 2 (one for commit log, the other for data)
>
> In this case, I expected that the total size of the commit log segment
> files would increase up to 8GB. However, when the total size reaches
> about 3GB, the commit log segment files are removed and the total size
> drops to around 250MB.
>
> I wonder why the commit log segment files are removed before the total
> size reaches 8GB (= commitlog_total_space_in_mb)? Could anyone give me
> some advice?
>
>
> Regards,
> Satosh
>
>
>
>


When commitlog segment files are removed actually?

2016-12-08 Thread Satoshi Hikida
Hi,

I have a question about commit log.

When are commit log segment files actually removed?

I'm running a single-node cluster for a few weeks to test C* performance.
My simple test has been issuing only read and write requests to the
cluster, and the data size (SSTable size) is increasing monotonically.

Here are customized C* settings in my environment.
- commitlog_total_space_in_mb: 8192
- memtable_flush_writers: 4
- memtable_heap_space_in_mb: 2048

And here is the node specs used in my environment.
- CPU core: 4
- Memory: 16GB
- SSD: 100GB * 2 (one for commit log, the other for data)

In this case, I expected that the total size of the commit log segment
files would increase up to 8GB. However, when the total size reaches
about 3GB, the commit log segment files are removed and the total size
drops to around 250MB.

I wonder why the commit log segment files are removed before the total
size reaches 8GB (= commitlog_total_space_in_mb)? Could anyone give me
some advice?


Regards,
Satosh