Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread Paul Pollack
Thanks again guys, this has been a major blocker for us and I think we've
made some real progress with your advice.

We have gone ahead with Lerh's suggestion and the cluster is operating much
more smoothly while the new node compacts. We read at quorum, so in the
event that we don't make it within the hinted handoff window, at least
there won't be inconsistent data from reads.

Kurt - what we've been observing is that after the node finishes getting
data streamed to it from other nodes, it goes into state UN and only then
starts the compactions; in this case it has about 130 pending. While it's
still joining we don't see an I/O bottleneck. I think the reason this is
an issue for us is that our nodes generally are not OK: they're constantly
maxing out their disk throughput and have long queues, which is why we're
trying to increase capacity by both adding nodes and switching to RAIDed
disks. Under normal operating circumstances they're already pushed to
their limits, so when the node gets backed up on compactions it really is
enough to tip over the cluster.
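
For reference, here's roughly how we've been watching that on a node (just
a sketch; it assumes sysstat's iostat is installed):

# Extended device stats every 5 seconds; watch %util (how close the data
# volume is to saturation) and the average request queue length
# (avgqu-sz, or aqu-sz on newer sysstat versions).
iostat -x 5

# Pending compactions and the current compaction throttle on the node:
nodetool compactionstats
nodetool getcompactionthroughput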

That's helpful to know regarding sstableofflinerelevel; in my dry run it
did appear that it would shuffle even more SSTables into L0.

On Mon, Sep 11, 2017 at 11:50 PM, kurt greaves  wrote:

>
>> Kurt - We're on 3.7, and our approach was to try throttling compaction
>> throughput as much as possible rather than the opposite. I had found some
>> resources that suggested unthrottling to let it get it over with, but
>> wasn't sure if this would really help in our situation since the I/O pipe
>> was already fully saturated.
>>
>
> You should unthrottle during bootstrap as the node won't receive read
> queries until it finishes streaming and joins the cluster. It seems
> unlikely that you'd be bottlenecked on I/O during the bootstrapping
> process. If you were, you'd certainly have bigger problems. The aim is to
> clear out the majority of compactions *before* the node joins and starts
> servicing reads. You might also want to increase concurrent_compactors.
> Typical advice is same as # CPU cores, but you might want to increase it
> for the bootstrapping period.
>
> sstableofflinerelevel could help but I wouldn't count on it. Usage is
> pretty straightforward but you may find that a lot of the existing SSTables
> in L0 just get put back in L0 anyways, which is where the main compaction
> backlog comes from. Plus you have to take the node offline which may not be
> ideal. In this case I would suggest the strategy Lerh suggested as being
> more viable.
>
> Regardless, if the rest of your nodes are OK (and you don't have RF1/using
> CL=ALL) Cassandra should pretty effectively route around the slow node so a
> single node backed up on compactions shouldn't be a big deal.
>


Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread kurt greaves
>
>
> Kurt - We're on 3.7, and our approach was to try throttling compaction
> throughput as much as possible rather than the opposite. I had found some
> resources that suggested unthrottling to let it get it over with, but
> wasn't sure if this would really help in our situation since the I/O pipe
> was already fully saturated.
>

You should unthrottle during bootstrap as the node won't receive read
queries until it finishes streaming and joins the cluster. It seems
unlikely that you'd be bottlenecked on I/O during the bootstrapping
process. If you were, you'd certainly have bigger problems. The aim is to
clear out the majority of compactions *before* the node joins and starts
servicing reads. You might also want to increase concurrent_compactors.
Typical advice is same as # CPU cores, but you might want to increase it
for the bootstrapping period.
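
Something like this on the joining node (a rough sketch from memory, so
double-check against your version):

# Remove the compaction throughput cap while the node is still joining
# (0 means unlimited), and put the throttle back before it serves reads:
nodetool setcompactionthroughput 0
nodetool compactionstats    # watch pending compactions drain

# concurrent_compactors is a cassandra.yaml setting applied at startup, so
# raise it before starting the bootstrap, e.g.:
#   concurrent_compactors: 8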

sstableofflinerelevel could help but I wouldn't count on it. Usage is
pretty straightforward but you may find that a lot of the existing SSTables
in L0 just get put back in L0 anyways, which is where the main compaction
backlog comes from. Plus you have to take the node offline which may not be
ideal. In this case I would suggest the strategy Lerh suggested as being
more viable.
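
For reference, usage is roughly the following, run with Cassandra stopped
on that node (keyspace and table names are placeholders):

# Dry run first: prints the proposed releveling without rewriting anything
sstableofflinerelevel --dry-run <keyspace> <table>

# Then for real, and start Cassandra again afterwards
sstableofflinerelevel <keyspace> <table>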

Regardless, if the rest of your nodes are OK (and you don't have RF1/using
CL=ALL) Cassandra should pretty effectively route around the slow node so a
single node backed up on compactions shouldn't be a big deal.


Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread Lerh Chuan Low
Hi Paul,

Agh, I don't have any experience with sstableofflinerelevel. Maybe Kurt
does, sorry.

Also, if it wasn't obvious, to add the node back to the cluster once it is
done, run the same 3 commands with enable substituted for disable. It feels
like it will take some time to get through all the compactions, likely more
than the hinted handoff window, so do make sure you are querying Cassandra
with strong consistency after you rejoin the node. Good luck!

Lerh

On 12 September 2017 at 11:53, Aaron Wykoff 
wrote:

> Unsubscribe
>
> On Mon, Sep 11, 2017 at 4:48 PM, Paul Pollack 
> wrote:
>
>> Hi,
>>
>> We run a 48-node cluster that stores counts in wide rows. Each node uses
>> roughly 1TB of space on a 2TB EBS gp2 drive for the data directory, with
>> LeveledCompactionStrategy. We have been trying to bootstrap new nodes that
>> use a RAID 0 configuration over two 1TB EBS drives to raise the I/O
>> throughput cap from 160 MB/s to 250 MB/s (AWS limits). Every time a node
>> finishes streaming it is bombarded by a large number of compactions. We see
>> CPU load on the new node spike extremely high and CPU load on all the other
>> nodes in the cluster drop unreasonably low. Meanwhile our app's write
>> latency to this cluster averages 10 seconds or greater. We've already tried
>> throttling compaction throughput to 1 MB/s, and we've always had
>> concurrent_compactors set to 2, but the disk is still saturated. In every
>> case we have had to shut down the Cassandra process on the new node to
>> resume acceptable operations.
>>
>> We're currently upgrading all of our clients to use the 3.11.0 version of
>> the DataStax Python driver, which will allow us to add our next newly
>> bootstrapped node to a blacklist, hoping that if it doesn't accept writes
>> the rest of the cluster can serve them adequately (as is the case whenever
>> we turn down the bootstrapping node), and allow it to finish its
>> compactions.
>>
>> We were also interested in hearing if anyone has had much luck using the
>> sstableofflinerelevel tool, and if this is a reasonable approach for our
>> issue.
>>
>> One of my colleagues found a post where a user had a similar issue and
>> found that bloom filters had an extremely high false positive ratio, and
>> although I didn't check that during any of these attempts to bootstrap it
>> seems to me like if we have that many compactions to do we're likely to
>> observe that same thing.
>>
>> Would appreciate any guidance anyone can offer.
>>
>> Thanks,
>> Paul
>>
>
>


Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread Aaron Wykoff
Unsubscribe

On Mon, Sep 11, 2017 at 4:48 PM, Paul Pollack 
wrote:

> Hi,
>
> We run a 48-node cluster that stores counts in wide rows. Each node uses
> roughly 1TB of space on a 2TB EBS gp2 drive for the data directory, with
> LeveledCompactionStrategy. We have been trying to bootstrap new nodes that
> use a RAID 0 configuration over two 1TB EBS drives to raise the I/O
> throughput cap from 160 MB/s to 250 MB/s (AWS limits). Every time a node
> finishes streaming it is bombarded by a large number of compactions. We see
> CPU load on the new node spike extremely high and CPU load on all the other
> nodes in the cluster drop unreasonably low. Meanwhile our app's write
> latency to this cluster averages 10 seconds or greater. We've already tried
> throttling compaction throughput to 1 MB/s, and we've always had
> concurrent_compactors set to 2, but the disk is still saturated. In every
> case we have had to shut down the Cassandra process on the new node to
> resume acceptable operations.
>
> We're currently upgrading all of our clients to use the 3.11.0 version of
> the DataStax Python driver, which will allow us to add our next newly
> bootstrapped node to a blacklist, hoping that if it doesn't accept writes
> the rest of the cluster can serve them adequately (as is the case whenever
> we turn down the bootstrapping node), and allow it to finish its
> compactions.
>
> We were also interested in hearing if anyone has had much luck using the
> sstableofflinerelevel tool, and if this is a reasonable approach for our
> issue.
>
> One of my colleagues found a post where a user had a similar issue and
> found that bloom filters had an extremely high false positive ratio, and
> although I didn't check that during any of these attempts to bootstrap it
> seems to me like if we have that many compactions to do we're likely to
> observe that same thing.
>
> Would appreciate any guidance anyone can offer.
>
> Thanks,
> Paul
>


Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread Paul Pollack
Thanks for the responses Lerh and Kurt!

Lerh - We had been considering those particular nodetool commands but were
hesitant to perform them on a production node without either testing
adequately in a dev environment or getting some feedback from someone who
knew what they were doing (such as yourself), so thank you for that! Your
point about the blacklist makes complete sense. So I think we'll probably
end up running those after the node finishes streaming and we confirm that
the blacklist is not improving latency. Just out of curiosity, do you have
any experience with sstableofflinerelevel? Is this something that would be
helpful to run with any kind of regularity?

Kurt - We're on 3.7, and our approach was to try throttling compaction
throughput as much as possible rather than the opposite. I had found some
resources that suggested unthrottling to let it get it over with, but
wasn't sure if this would really help in our situation since the I/O pipe
was already fully saturated.

Best,
Paul

On Mon, Sep 11, 2017 at 9:16 PM, kurt greaves  wrote:

> What version are you using? There are improvements to streaming with LCS
> in 2.2.
> Also, are you unthrottling compaction throughput while the node is
> bootstrapping?
>


Re: load distribution that I can't explain

2017-09-11 Thread kurt greaves
Your first query will effectively have to perform table scans to satisfy
what you are asking. If a query requires ALLOW FILTERING to be specified,
it means that Cassandra can't really optimise that query in any way and
it's going to have to query a lot of data (all of it...) to satisfy the
result.
Because you've only specified one attribute of the partitioning key,
Cassandra doesn't know where to look for that data, and will need to query
all of it to find partitions matching that restriction.

If you want to select distinct you should probably do it in a distributed
manner using token range scans, however this is generally not a good use
case for Cassandra. If you really need to know your partitioning keys you
should probably store them in a separate cache.
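
To illustrate, a rough sketch of one token range scan against your table
(shown through cqlsh just for clarity; in practice you'd issue these from
the Java driver, one query per range, and filter for the id1 values you
care about on the client side):

cqlsh -e "SELECT DISTINCT id1, id2, id3
          FROM myks.table1
          WHERE token(id1, id2, id3) >= -9223372036854775808
            AND token(id1, id2, id3) < -9000000000000000000;"
# Repeat with the next token range until the whole ring has been covered
# (-2^63 to 2^63 - 1 with the default Murmur3Partitioner).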



Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread kurt greaves
What version are you using? There are improvements to streaming with LCS in
2.2.
Also, are you unthrottling compaction throughput while the node is
bootstrapping?


Re: Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread Lerh Chuan Low
Hi Paul,

The new node will certainly have a lot of compactions to deal with being
LCS. Have you tried performing the following on the new node once it has
joined?

*nodetool disablebinary && nodetool disablethrift && nodetool disablegossip*

This will disconnect Cassandra from the cluster, but not stop Cassandra
itself. At this point you can unthrottle compactions and let it compact
away. When it is done compacting, you can re-add it to the cluster and run
a repair if it has been out for over 3 hours. I don't think adding a
blacklist will help much, because as long as the data you insert replicates
to the node (which is slow), it will slow down the whole cluster.
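
To spell the whole sequence out, roughly (a sketch; 3 hours is just the
default max_hint_window_in_ms and 16 MB/s is just the default compaction
throughput):

# Take the node out of client and cluster traffic, but keep the process up:
nodetool disablebinary && nodetool disablethrift && nodetool disablegossip

# Unthrottle compactions (0 means unlimited) and watch them drain:
nodetool setcompactionthroughput 0
nodetool compactionstats

# Once pending compactions are down, rejoin and restore the throttle:
nodetool enablegossip && nodetool enablethrift && nodetool enablebinary
nodetool setcompactionthroughput 16

# If the node was out longer than the hint window, repair it:
nodetool repair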

As long as you have that node in the cluster, it will slow down everything.

Hope this helps you in some way :)

On 12 September 2017 at 09:48, Paul Pollack 
wrote:

> Hi,
>
> We run a 48-node cluster that stores counts in wide rows. Each node uses
> roughly 1TB of space on a 2TB EBS gp2 drive for the data directory, with
> LeveledCompactionStrategy. We have been trying to bootstrap new nodes that
> use a RAID 0 configuration over two 1TB EBS drives to raise the I/O
> throughput cap from 160 MB/s to 250 MB/s (AWS limits). Every time a node
> finishes streaming it is bombarded by a large number of compactions. We see
> CPU load on the new node spike extremely high and CPU load on all the other
> nodes in the cluster drop unreasonably low. Meanwhile our app's write
> latency to this cluster averages 10 seconds or greater. We've already tried
> throttling compaction throughput to 1 MB/s, and we've always had
> concurrent_compactors set to 2, but the disk is still saturated. In every
> case we have had to shut down the Cassandra process on the new node to
> resume acceptable operations.
>
> We're currently upgrading all of our clients to use the 3.11.0 version of
> the DataStax Python driver, which will allow us to add our next newly
> bootstrapped node to a blacklist, hoping that if it doesn't accept writes
> the rest of the cluster can serve them adequately (as is the case whenever
> we turn down the bootstrapping node), and allow it to finish its
> compactions.
>
> We were also interested in hearing if anyone has had much luck using the
> sstableofflinerelevel tool, and if this is a reasonable approach for our
> issue.
>
> One of my colleagues found a post where a user had a similar issue and
> found that bloom filters had an extremely high false positive ratio, and
> although I didn't check that during any of these attempts to bootstrap it
> seems to me like if we have that many compactions to do we're likely to
> observe that same thing.
>
> Would appreciate any guidance anyone can offer.
>
> Thanks,
> Paul
>


Re: Cassandra downgrade of 2.1.15 to 2.1.12

2017-09-11 Thread Michael Shuler
On 09/11/2017 06:29 PM, Mark Furlong wrote:
> I have a requirement to test a downgrade of 2.1.15 to 2.1.12. Can
> someone please identify how to achieve this?

Downgrades have never been officially supported, but this is a
relatively small step. Testing it out is definitely a good thing. Since
protocols and on-disk sstable versions should be the same, I'd say work
backwards through NEWS.txt and see what you think about how it affects
your specific usage. I'd also be wary of the fixed bugs you will
re-introduce on downgrade (CHANGES.txt).

https://github.com/apache/cassandra/blob/cassandra-2.1.15/NEWS.txt#L16-L44
https://github.com/apache/cassandra/blob/cassandra-2.1.15/CHANGES.txt#L1-L100

As for the actual software downgrade, it depends on install method.
`wget` the 2.1.12 tar or deb files and `tar -xzvf` or `dpkg -i` them.
Here's where you can find the old versions of artifacts:

tar:
http://archive.apache.org/dist/cassandra/2.1.12/
deb:
http://archive.apache.org/dist/cassandra/debian/pool/main/c/cassandra/

This definitely would not work on a major release downgrade like 2.2.x
to 2.1.x, since the sstable versions would be different, but in your
2.1.15 to 2.1.12 example, this might "just work".
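
For the deb route, a rough sketch (untested; the exact package file name in
the archive may differ, and if you have a matching cassandra-tools package
installed you'd downgrade that too):

# Stop the node cleanly
nodetool drain
sudo service cassandra stop

# Pull the 2.1.12 package from the archive and install it over 2.1.15
wget http://archive.apache.org/dist/cassandra/debian/pool/main/c/cassandra/cassandra_2.1.12_all.deb
sudo dpkg -i cassandra_2.1.12_all.deb

sudo service cassandra start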

-- 
Kind regards,
Michael

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Bootstrapping node on Cassandra 3.7 causes cluster-wide performance issues

2017-09-11 Thread Paul Pollack
Hi,

We run a 48-node cluster that stores counts in wide rows. Each node uses
roughly 1TB of space on a 2TB EBS gp2 drive for the data directory, with
LeveledCompactionStrategy. We have been trying to bootstrap new nodes that
use a RAID 0 configuration over two 1TB EBS drives to raise the I/O
throughput cap from 160 MB/s to 250 MB/s (AWS limits). Every time a node
finishes streaming it is bombarded by a large number of compactions. We see
CPU load on the new node spike extremely high and CPU load on all the other
nodes in the cluster drop unreasonably low. Meanwhile our app's write
latency to this cluster averages 10 seconds or greater. We've already tried
throttling compaction throughput to 1 MB/s, and we've always had
concurrent_compactors set to 2, but the disk is still saturated. In every
case we have had to shut down the Cassandra process on the new node to
resume acceptable operations.

We're currently upgrading all of our clients to use the 3.11.0 version of
the DataStax Python driver, which will allow us to add our next newly
bootstrapped node to a blacklist, hoping that if it doesn't accept writes
the rest of the cluster can serve them adequately (as is the case whenever
we turn down the bootstrapping node), and allow it to finish its
compactions.

We were also interested in hearing if anyone has had much luck using the
sstableofflinerelevel tool, and if this is a reasonable approach for our
issue.

One of my colleagues found a post where a user had a similar issue and
found that bloom filters had an extremely high false positive ratio, and
although I didn't check that during any of these attempts to bootstrap it
seems to me like if we have that many compactions to do we're likely to
observe that same thing.
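
For what it's worth, next time we bootstrap I plan to check that with
something along these lines (keyspace and table names are placeholders):

# Per-table bloom filter stats, including false positives and false ratio
nodetool tablestats <keyspace>.<table> | grep -i "bloom filter"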

Would appreciate any guidance anyone can offer.

Thanks,
Paul


Cassandra downgrade of 2.1.15 to 2.1.12

2017-09-11 Thread Mark Furlong
I have a requirement to test a downgrade of 2.1.15 to 2.1.12. Can someone 
please identify how to achieve this?

Thanks,
Mark Furlong

Sr. Database Administrator

mfurl...@ancestry.com
M: 801-859-7427
O: 801-705-7115
1300 W Traverse Pkwy
Lehi, UT 84043



load distribution that I can't explain

2017-09-11 Thread kaveh minooie

Hi everyone,

So I have a 2-node (node1, node2) Cassandra 3.11 cluster on which I have a
keyspace with a replication factor of 2. This keyspace has only this table:


CREATE KEYSPACE myks WITH replication = {'class': 'SimpleStrategy', 
'replication_factor': '2'}  AND durable_writes = true;


CREATE TABLE myks.table1 (
id1 int,
id2 int,
id3 int,
att1 int,
PRIMARY KEY ((id1, id2, id3), att1)
) WITH CLUSTERING ORDER BY (att1 ASC)
AND bloom_filter_fp_chance = 0.01
AND caching = {'keys': 'ALL', 'rows_per_partition': 'NONE'}
AND comment = ''
AND compaction = {'class': 'org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy', 'max_threshold': '32', 'min_threshold': '4'}
AND compression = {'chunk_length_in_kb': '64', 'class': 'org.apache.cassandra.io.compress.LZ4Compressor'}
AND crc_check_chance = 1.0
AND dclocal_read_repair_chance = 0.1
AND default_time_to_live = 0
AND gc_grace_seconds = 864000
AND max_index_interval = 2048
AND memtable_flush_period_in_ms = 0
AND min_index_interval = 128
AND read_repair_chance = 0.0
AND speculative_retry = '99PERCENTILE';


I run two tasks against this table:

Task one involves reading first:

"SELECT DISTINCT id1, id2, id3 FROM table1 WHERE id1 = :id1-value ALLOW 
FILTERING;";


and then, for each result, reading:
"SELECT COUNT( att1 ) FROM table1 WHERE id1 = :id1-value AND id2 = 
:id2-value AND id3 = :id3-value ;";


and once done adds new data by executing this:

"INSERT INTO table1 ( id1, id2, id3, att1 ) VALUES ( :id1-value, 
:id2-value, :id3-value, :att1-value ) USING TTL ;"


as long as there is data for different id1s. All of these are at CL ONE,
or ANY for the insert.


Task two only does the select part but doesn't add any new data, again
for a hundred different id1 values in each run. These are Java
applications and use com.datastax.driver.


My problem is that when I am running these tasks, especially task one, I
always see a lot more CPU load on node2 than on node1 (on average a ratio
of 10 to 1, and sometimes even as high as 30 to 1). Both of these nodes
have the same spec. I don't know how to explain this or which
configuration parameter I need to look into, and I couldn't find anything
on-line either. Any hint or suggestion would be really appreciated.


thanks,

--
Kaveh Minooie

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Re: Do not use Cassandra 3.11.0+ or Cassandra 3.0.12+

2017-09-11 Thread CPC
Hi,

Is this bug fixed in DSE 5.1.3? As I understand it, calling the JMX
getTombStoneRatio triggers that bug. We are using OpsCenter as well; do you
have any idea whether OpsCenter uses/calls this method?

Thanks

On Aug 29, 2017 6:35 AM, "Jeff Jirsa"  wrote:

> I shouldn't actually say I don't think it can happen on 3.0 - I haven't
> seen this happen on 3.0 without some other code change to enable it, but
> like I said, we're still investigating.
>
> --
> Jeff Jirsa
>
>
> > On Aug 28, 2017, at 8:30 PM, Jeff Jirsa  wrote:
> >
> > For what it's worth, I don't think this impacts 3.0 without adding some
> other code change (the reporter of the bug on 3.0 had added custom metrics
> that exposed a concurrency issue).
> >
> > We're looking at it on 3.11. I think 13038 made it far more likely to
> occur, but I think it could have happened pre-13038 as well (would take
> some serious luck with your deletion time distribution though - the
> rounding in 13038 does make it more likely, but the race was already there).
> >
> > --
> > Jeff Jirsa
> >
> >
> >> On Aug 28, 2017, at 8:24 PM, Jay Zhuang 
> wrote:
> >>
> >> We've been using 3.0.12+ for a few months and haven't seen an issue like
> >> that. Do we know what could trigger the problem? Or is 3.0.x really
> >> impacted?
> >>
> >> Thanks,
> >> Jay
> >>
> >>> On 8/28/17 6:02 AM, Hannu Kröger wrote:
> >>> Hello,
> >>>
> >>> Current latest Cassandra version (3.11.0, possibly also 3.0.12+) has a
> race
> >>> condition that causes Cassandra to create broken sstables (stats file
> in
> >>> sstables to be precise).
> >>>
> >>> Bug described here:
> >>> https://issues.apache.org/jira/browse/CASSANDRA-13752
> >>>
> >>> This change might be causing it (but not sure):
> >>> https://issues.apache.org/jira/browse/CASSANDRA-13038
> >>>
> >>> Other related issues:
> >>> https://issues.apache.org/jira/browse/CASSANDRA-13718
> >>> https://issues.apache.org/jira/browse/CASSANDRA-13756
> >>>
> >>> I would not recommend using 3.11.0 nor upgrading to 3.0.12 or higher
> before
> >>> this is fixed.
> >>>
> >>> Cheers,
> >>> Hannu
> >>>
> >>
> >> -
> >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: user-h...@cassandra.apache.org
> >>
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Cassandra 3.7 repair error messages

2017-09-11 Thread Paul Pollack
Thanks Erick, and sorry it took me so long to respond; I had to turn my
attention to other things. It definitely looks like there had been some
network blips going on with that node for a while before we saw it marked
down from every other node's perspective. Additionally, my original comment
that all of the failure messages referred to the same node was incorrect;
it seems like every few hours it would start to log messages for other
nodes in turn.

I went through the logs on all of the other nodes that were reported failed
from .204's perspective and found that they all failed to create a merkle
tree. We decided to set the consistency level for reads on this cluster to
quorum, which has at least prevented any data inconsistencies, with no
noticeable performance loss as far as we can tell.

To answer your last question, I did once successfully run a repair on a
different node. It ran in about 12 hours or so.

I think before I dig further into why this repair could not run to
completion I have to address some other issues with the cluster -- namely
that we're hitting the Amazon EBS throughput cap on the data volumes
for our nodes, which is causing our disk queue length to get big and
cluster-wide throughput to tank.

Thanks again for your help,
Paul

On Wed, Aug 30, 2017 at 9:54 PM, Erick Ramirez  wrote:

> No, it isn't normal for sessions to fail and you will need to investigate.
> You need to review the logs on node .204 to determine why the session
> failed. For example, did it timeout because of a very large sstable? Or did
> the connection get truncated after a while?
>
> You will need to address the cause of those failures. It could be external
> to the nodes, e.g. firewall closing the socket so you might need to
> configure TCP keep_alive. 33 hours sounds like a really long time. Have you
> successfully run a repair on this cluster before?
>
> On Thu, Aug 31, 2017 at 11:39 AM, Paul Pollack 
> wrote:
>
>> Hi,
>>
>> I'm trying to run a repair on a node in my Cassandra cluster, version 3.7,
>> and was hoping someone may be able to shed light on an error message that
>> keeps cropping up.
>>
>> I started the repair on a node after discovering that it somehow became
>> partitioned from the rest of the cluster, e.g. nodetool status on all other
>> nodes showed it as DN, and on the node itself showed all other nodes as DN.
>> After restarting the Cassandra daemon the node seemed to re-join the
>> cluster just fine, so I began a repair.
>>
>> The repair has been running for about 33 hours (first incremental repair
>> on this cluster), and every so often I'll see a line like this:
>>
>> [2017-08-31 00:18:16,300] Repair session f7ae4e71-8ce3-11e7-b466-79eba0383e4f
>> for range [(-5606588017314999649,-5604469721630340065],
>> (9047587767449433379,9047652965163017217]] failed with error Endpoint /
>> 20.0.122.204 died (progress: 9%)
>>
>> Every one of these lines refers to the same node, 20.0.122.204.
>>
>> I'm mostly looking for guidance here. Do these errors indicate that the
>> entire repair will be worthless, or just for token ranges shared by these
>> two nodes? Is it normal to see error messages of this nature and for a
>> repair not to terminate?
>>
>> Thanks,
>> Paul
>>
>
>


Re: Weird error (unable to start cassandra)

2017-09-11 Thread Kant Kodali
I had to do brew upgrade jemalloc to fix this issue.

On Mon, Sep 11, 2017 at 4:25 AM, Kant Kodali  wrote:

> Hi All,
>
> I am trying to start Cassandra 3.11 on Mac OS Sierra 10.12.6. When I invoke
> the cassandra binary I get the following error:
>
> java(2981,0x7fffedb763c0) malloc: *** malloc_zone_unregister() failed for
> 0x7fffedb6c000
>
> I have xcode version 8.3.3 installed (latest). Any clue ?
>
> Thanks!
>


Weird error (unable to start cassandra)

2017-09-11 Thread Kant Kodali
Hi All,

I am trying to start Cassandra 3.11 on Mac OS Sierra 10.12.6. When I invoke
the cassandra binary I get the following error:

java(2981,0x7fffedb763c0) malloc: *** malloc_zone_unregister() failed for
0x7fffedb6c000

I have xcode version 8.3.3 installed (latest). Any clue ?

Thanks!


Re: Self-healing data integrity?

2017-09-11 Thread DuyHai Doan
Agree

 A tricky detail about streaming is that:

1) On the sender side, the node just sends the SSTable (without any other
components like CRC files, partition index, partition summary etc...)
2) The sender does not even bother to de-serialize the SSTable data, it
just sends the stream of bytes by reading the SSTable content directly
from disk
3) On the receiver side, the node receives the byte stream and needs to
serialize it in memory to rebuild all the SSTable components (CRC files,
partition index, partition summary ...)

So the consequences are:

a. there is a bottleneck on the receiving side because of serialization
b. if there is bit rot in an SSTable, since CRC files are not sent, there is
no chance to detect it from the receiving side
c. if we want to include CRC checks in the streaming path, it would require a
whole review of the streaming architecture, not just adding a feature

On Sat, Sep 9, 2017 at 10:06 PM, Jeff Jirsa  wrote:

> (Which isn't to say that someone shouldn't implement this; they should,
> and there's probably a JIRA to do so already written, but it's a project of
> volunteers, and nobody has volunteered to do the work yet)
>
> --
> Jeff Jirsa
>
>
> On Sep 9, 2017, at 12:59 PM, Jeff Jirsa  wrote:
>
> There is, but they aren't consulted on the streaming paths (only on normal
> reads)
>
>
> --
> Jeff Jirsa
>
>
> On Sep 9, 2017, at 12:02 PM, DuyHai Doan  wrote:
>
> Jeff,
>
>  With default compression enabled on each table, isn't there CRC files
> created along side with SSTables that can help detecting bit-rot ?
>
>
> On Sat, Sep 9, 2017 at 7:50 PM, Jeff Jirsa  wrote:
>
>> Cassandra doesn't do that automatically - it can guarantee consistency on
>> read or write via ConsistencyLevel on each query, and it can run active
>> (AntiEntropy) repairs. But active repairs must be scheduled (by human or
>> cron or by third party script like http://cassandra-reaper.io/), and to
>> be pedantic, repair only fixes consistency issue, there's some work to be
>> done to properly address/support fixing corrupted replicas (for example,
>> repair COULD send a bit flip from one node to all of the others)
>>
>>
>>
>> --
>> Jeff Jirsa
>>
>>
>> On Sep 9, 2017, at 1:07 AM, Ralph Soika  wrote:
>>
>> Hi,
>>
>> I am searching for a big data storage solution for the Imixs-Workflow
>> project. I started with Hadoop until I became aware of the
>> 'small-file-problem'. So I am considering using Cassandra now.
>>
>> But Hadoop has one important feature for me. The replicator continuously
>> examines whether data blocks are consistent across all datanodes. This will
>> detect disk errors and automatically move data from defective blocks to
>> working blocks. I think this is called 'self-healing mechanism'.
>>
>> Is there a similar feature in Cassandra too?
>>
>>
>> Thanks for help
>>
>> Ralph
>>
>>
>>
>> --
>>
>>
>


Re: disable reads from node till hints are fully synced

2017-09-11 Thread laxmikanth sadula
Hi,

If you are using the DataStax Java driver, I think this might work.

http://docs.datastax.com/en/latest-java-driver-api/com/datastax/driver/core/policies/WhiteListPolicy.html

On Sep 11, 2017 2:28 AM, "Jeff Jirsa"  wrote:

> There's not - you can disable native/binary to make it less likely, but
> you can't stop reads entirely because you need gossip up in order to have
> hints deliver
>
> What you can do is use severity to make the dynamic snitch MUCH less
> likely to pick that node (and disable binary so it's not a coordinator).
> That often works for what you're trying to do, though it's imperfect.
> Brandon described this a bit here:
>
> https://www.datastax.com/dev/blog/dynamic-snitching-in-
> cassandra-past-present-and-future
>
>
>
> --
> Jeff Jirsa
>
>
> On Sep 10, 2017, at 1:28 PM, Aleksandr Ivanov  wrote:
>
> Hello,
>
> From time to time we have situations where a node is down for a longer
> period (but less than max_hint_window_in_ms). After the node is up and
> hints are actively syncing to the affected node, clients get inconsistent
> data (the clients use LOCAL_ONE consistency for performance reasons).
>
> Is there any way to disable reads from such a node until hints are fully
> synced?
>
> Regards,
> Aleksandr
>
>