Re: LWT broken?

2018-02-09 Thread Jonathan Haddad
If you want consistent reads you have to use the CL that enforces it.
There’s no way around it.
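
For what it's worth, below is a minimal sketch of that allocation path with the
DataStax Java driver (3.x API, an already-connected Session, and the hash_id
table from Mahdi's pseudocode are assumed; the blob type for the hash column is
a guess). The only real change from the pseudocode is that the read after a
lost LWT is done at SERIAL instead of ONE:

import com.datastax.driver.core.*;
import java.nio.ByteBuffer;

public class HashIdAllocator {
    private final Session session;

    public HashIdAllocator(Session session) {
        this.session = session;
    }

    /** Returns the id mapped to this hash, allocating candidateId via LWT if absent. */
    public long allocate(ByteBuffer hash, long candidateId) {
        // Conditional insert: Paxos (SERIAL) picks exactly one winner per hash.
        Statement insert = new SimpleStatement(
                "INSERT INTO hash_id (hash, id) VALUES (?, ?) IF NOT EXISTS", hash, candidateId)
                .setConsistencyLevel(ConsistencyLevel.QUORUM)
                .setSerialConsistencyLevel(ConsistencyLevel.SERIAL);
        if (session.execute(insert).wasApplied()) {
            return candidateId;              // we won the LWT race
        }
        // We lost: read the winning value at SERIAL, not ONE, so any in-flight
        // Paxos round for this partition is completed before we return.
        Statement read = new SimpleStatement(
                "SELECT id FROM hash_id WHERE hash = ?", hash)
                .setConsistencyLevel(ConsistencyLevel.SERIAL);
        return session.execute(read).one().getLong("id");
    }
}

Reading at SERIAL after losing the race is what closes the window a CL=ONE read
leaves open: it forces any accepted-but-uncommitted Paxos proposal to be
finished before the result comes back.
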
On Fri, Feb 9, 2018 at 2:35 PM Mahdi Ben Hamida  wrote:

> In this case, we only write using CAS (code guarantees that). We also
> never update, just insert if not exist. Once a hash exists, it never
> changes (it may get deleted later and that'll be a CAS delete as well).
>
> --
> Mahdi.
>
> On 2/9/18 1:38 PM, Jeff Jirsa wrote:
>
>
>
> On Fri, Feb 9, 2018 at 1:33 PM, Mahdi Ben Hamida 
> wrote:
>
>> Under what circumstances would we be reading inconsistent results? Is
>> there a case where we end up reading a value that ultimately never gets
>> written?
>>
>>
>>
>
> If you ever write the same value with CAS and without CAS (different code
> paths both updating the same value), you're using CAS wrong, and
> inconsistencies can happen.
>
>
>
>


Re: LWT broken?

2018-02-09 Thread Mahdi Ben Hamida
In this case, we only write using CAS (code guarantees that). We also 
never update, just insert if not exist. Once a hash exists, it never 
changes (it may get deleted later and that'll be a CAS delete as well).
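
For illustration, a CAS delete of that mapping could look roughly like the
sketch below with the DataStax Java driver (not our actual code; the hash_id
table is the one from my original mail, and the blob hash column is assumed):

import com.datastax.driver.core.*;
import java.nio.ByteBuffer;

public final class HashIdCleanup {
    // CAS-style delete: only remove the mapping if it still points at the id
    // we expect, so a concurrent re-allocation of the same hash is not clobbered.
    public static boolean deleteMapping(Session session, ByteBuffer hash, long expectedId) {
        Statement delete = new SimpleStatement(
                "DELETE FROM hash_id WHERE hash = ? IF id = ?", hash, expectedId)
                .setConsistencyLevel(ConsistencyLevel.QUORUM)
                .setSerialConsistencyLevel(ConsistencyLevel.SERIAL);
        return session.execute(delete).wasApplied();
    }
}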


--
Mahdi.

On 2/9/18 1:38 PM, Jeff Jirsa wrote:



On Fri, Feb 9, 2018 at 1:33 PM, Mahdi Ben Hamida wrote:


Under what circumstances would we be reading inconsistent results? Is
there a case where we end up reading a value that ultimately never gets
written?




If you ever write the same value with CAS and without CAS (different 
code paths both updating the same value), you're using CAS wrong, and 
inconsistencies can happen.







Re: LWT broken?

2018-02-09 Thread Jeff Jirsa
On Fri, Feb 9, 2018 at 1:33 PM, Mahdi Ben Hamida  wrote:

> Under what circumstances would we be reading inconsistent results? Is
> there a case where we end up reading a value that ultimately never gets
> written?
>
>
>

If you ever write the same value with CAS and without CAS (different code
paths both updating the same value), you're using CAS wrong, and
inconsistencies can happen.


Re: LWT broken?

2018-02-09 Thread Mahdi Ben Hamida

Hi Stefan,

I was hoping we could avoid the cost of a serial read (which I assume is
a lot more expensive than a regular read due to the Paxos requirements).
I actually do a serial read at line #9 (i.e., when we lose the LWT and
have to read the winning value) and that still fails to ensure the
uniqueness guarantees. Under what circumstances would we be reading
inconsistent results? Is there a case where we end up reading a value
that ultimately never gets written?


Thanks !

--
Mahdi.

On 2/9/18 12:52 PM, Stefan Podkowinski wrote:


I'd not recommend using any consistency level but serial for reading 
tables updated by LWT operations. Otherwise you might end up reading 
inconsistent results.



On 09.02.18 08:06, Mahdi Ben Hamida wrote:


Hello,

I'm running a 2.0.17 cluster (I know, I know, need to upgrade) with 
46 nodes across 3 racks (& RF=3). I'm seeing that under high 
contention, LWT may actually not guarantee uniqueness. With a total 
of 16 million LWT transactions (with peak LWT concurrency around 
5k/sec), I found 38 conflicts that should have been impossible. I was 
wondering if there were any known issues that make LWT broken for 
this old version of cassandra.


I use LWT to guarantee that a 128 bit number (hash) maps to a unique 
64 bit number (id). There could be a large number of threads trying 
to allocate an id for a given hash.


I do the following logic (slightly more complicated than this due to 
timeout handling)


 1  existing_id = SELECT id FROM hash_id WHERE hash=computed_hash | consistency = ONE
 2  if existing_id != null:
 3    return existing_id
 4  new_id = generateUniqueId()
 5  result = INSERT INTO hash_id (hash, id) VALUES (computed_hash, new_id) IF NOT EXISTS | consistency = QUORUM, serialConsistency = SERIAL
 6  if result == [applied]  // i.e. we won the LWT
 7    return new_id
 8  else  // we lost the LWT, fetch the winning value
 9    existing_id = SELECT id FROM hash_id WHERE hash=computed_hash | consistency = ONE
10    return existing_id

Is there anything flawed about this?
I do the reads at lines #1 and #9 at a consistency of ONE. Would that
cause uncommitted changes to be seen (i.e., dirty reads)? Should they be
done at SERIAL consistency instead? My understanding is that only one
transaction will be able to apply the write (at QUORUM), so doing a
read at a consistency of ONE will either return null, or I would
get the id that won the LWT race.


Any help is appreciated. I've been banging my head on this issue 
(thinking it was a bug in the code) for some time now.


--
Mahdi.






Re: LWT broken?

2018-02-09 Thread Stefan Podkowinski
I'd not recommend using any consistency level but serial for reading
tables updated by LWT operations. Otherwise you might end up reading
inconsistent results.


On 09.02.18 08:06, Mahdi Ben Hamida wrote:
>
> Hello,
>
> I'm running a 2.0.17 cluster (I know, I know, need to upgrade) with 46
> nodes across 3 racks (& RF=3). I'm seeing that under high contention,
> LWT may actually not guarantee uniqueness. With a total of 16 million
> LWT transactions (with peak LWT concurrency around 5k/sec), I found 38
> conflicts that should have been impossible. I was wondering if there
> were any known issues that make LWT broken for this old version of
> cassandra.
>
> I use LWT to guarantee that a 128 bit number (hash) maps to a unique
> 64 bit number (id). There could be a large number of threads trying to
> allocate an id for a given hash.
>
> I do the following logic (slightly more complicated than this due to
> timeout handling)
>
>  1  existing_id = SELECT id FROM hash_id WHERE hash=computed_hash | consistency = ONE
>  2  if existing_id != null:
>  3    return existing_id
>  4  new_id = generateUniqueId()
>  5  result = INSERT INTO hash_id (hash, id) VALUES (computed_hash, new_id) IF NOT EXISTS | consistency = QUORUM, serialConsistency = SERIAL
>  6  if result == [applied]  // i.e. we won the LWT
>  7    return new_id
>  8  else  // we lost the LWT, fetch the winning value
>  9    existing_id = SELECT id FROM hash_id WHERE hash=computed_hash | consistency = ONE
> 10    return existing_id
>
> Is there anything flawed about this?
> I do the reads at lines #1 and #9 at a consistency of ONE. Would that
> cause uncommitted changes to be seen (i.e., dirty reads)? Should they be
> done at SERIAL consistency instead? My understanding is that only one
> transaction will be able to apply the write (at QUORUM), so doing a
> read at a consistency of ONE will either return null, or I would
> get the id that won the LWT race.
>
> Any help is appreciated. I've been banging my head on this issue
> (thinking it was a bug in the code) for some time now.
>
> -- 
> Mahdi.



Re: GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread Stefan Podkowinski
Deleting data "without undue delay" in Cassandra can be implemented by
using crypto shredding and pseudonymization strategies in your data
model. All you have to do is to make sure that throwing away a person's
data encryption key will make it impossible to restore personal data and
impossible to resolve any pseudonyms associated with that person.
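
As a rough sketch of those mechanics (the table names user_keys / user_profile
and the key handling below are made up for illustration, not a complete
key-management design): every piece of personal data is written encrypted under
a per-person AES key that lives in its own row, and "forgetting" the person
boils down to deleting that one key row.

import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;
import javax.crypto.spec.SecretKeySpec;
import java.security.SecureRandom;

public class CryptoShredding {

    // One AES key per person; the key bytes live in their own table, e.g. (hypothetical):
    //   CREATE TABLE user_keys (user_id uuid PRIMARY KEY, key blob);
    //   CREATE TABLE user_profile (user_id uuid PRIMARY KEY, email_enc blob, iv blob);
    public static SecretKey newUserKey() throws Exception {
        KeyGenerator gen = KeyGenerator.getInstance("AES");
        gen.init(128);
        return gen.generateKey();
    }

    public static byte[] encrypt(SecretKey key, byte[] iv, byte[] plaintext) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.ENCRYPT_MODE, key, new GCMParameterSpec(128, iv));
        return c.doFinal(plaintext);
    }

    public static byte[] decrypt(byte[] keyBytes, byte[] iv, byte[] ciphertext) throws Exception {
        Cipher c = Cipher.getInstance("AES/GCM/NoPadding");
        c.init(Cipher.DECRYPT_MODE, new SecretKeySpec(keyBytes, "AES"), new GCMParameterSpec(128, iv));
        return c.doFinal(ciphertext);
    }

    public static void main(String[] args) throws Exception {
        SecretKey key = newUserKey();
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        byte[] enc = encrypt(key, iv, "alice@example.com".getBytes("UTF-8"));
        // "Right to be forgotten": DELETE FROM user_keys WHERE user_id = ?;
        // once the key row is gone, email_enc in user_profile is unrecoverable.
        System.out.println(new String(decrypt(key.getEncoded(), iv, enc), "UTF-8"));
    }
}

The encrypted blobs that remain in SSTables, snapshots and backups are then just
noise once the key is gone, which is what keeps the "without undue delay" part
tractable.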


On 09.02.18 17:10, Nicolas Guyomar wrote:
> Hi everyone,
>
> Because of GDPR we really face the need to support “Right to Be
> Forgotten” requests => https://gdpr-info.eu/art-17-gdpr/  stating that
> /"the controller shall have the obligation to erase personal data
> *without undue delay*"/
>
> Because I usually meet customers that do not have that many clients,
> modeling one partition per client is almost always possible, easing
> deletion by partition key.
>
> Then, apart from triggering a manual compaction on impacted tables
> using STCS, I do not see how I can be GDPR compliant.
>
> I'm kind of surprised not to find any thread on that matter on the ML,
> do you guys have any modeling strategy that would make it easier to
> get rid of data ? 
>
> Thank you for any given advice
>
> Nicolas



Re: GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread J. D. Jordan
In the times I have run into similar requirements from legislation or standards,
the fact that SELECT no longer returns the data has been enough for all auditors I
have worked with.
Otherwise you get down into screwy requirements of needing to zero out all
unused sectors on your disks to actually remove the data, make sure nothing
has the drive sectors cached somewhere, and other such things.

-Jeremiah

> On Feb 9, 2018, at 1:54 PM, Jon Haddad  wrote:
> 
> A layer violation?  Seriously?  Technical solutions exist to solve business
> problems and I’m 100% fine with introducing the former to solve the latter.
> 
> Look, if the goal is to purge information out of the DB as quickly as 
> possible from a lot of accounts, the fastest way to do it is to hijack the 
> fact that you’re constantly rewriting data through compaction and (ab)use it. 
>  It avoids the overhead of tombstones, and can be implemented in a way that
> allows you to perform a single write / edit a text file / some other
> trivial system and immediately start removing customer data.  It’s an
> incredibly efficient way of bulk removing customer data.  
> 
> The wording around "The Right To Be Forgotten” is a little vague [1], and I 
> don’t know if "the right to be forgotten entitles the data subject to have 
> the data controller erase his/her personal data” means that tombstones are 
> OK.  If you tombstone some row using TWCS, it will literally *never* be 
> deleted off disk, as opposed to using DeletingCompactionStrategy where it 
> could easily be removed without leaving data lying around in SSTables.  I’ve
> done this already for this *exact* use case and know it works and works very 
> well.
> 
> The debate around what is the “correct” way to solve the problem is a 
> dogmatic one and I don’t have any interest in pursuing it any further.  I’ve 
> simply offered a solution that I know works because I’ve done it, which is 
> what the OP asked for.
> 
> [1] https://www.eugdpr.org/key-changes.html
> 
>> On Feb 9, 2018, at 10:33 AM, Dor Laor  wrote:
>> 
>> I think you're introducing a layer violation. GDPR is a business requirement 
>> and
>> compaction is an implementation detail. 
>> 
>> IMHO it's enough to delete the partition using regular CQL.
>> It's true that it won't be deleted immediately but it will be eventually
>> deleted (welcome to eventual consistency ;).
>> 
>> Even with user defined compaction, compaction may not be running instantly, 
>> repair will be required,
>> there are other nodes in the cluster, maybe partitioned nodes with the data. 
>> There is data in snapshots
>> and backups.
>> 
>> The business idea is to delete the data in a fast, reasonable time for
>> humans, making it
>> first unreachable and later deleting it completely.
>> 
>>> On Fri, Feb 9, 2018 at 8:51 AM, Jonathan Haddad  wrote:
>>> That might be fine for a one off but is totally impractical at scale or 
>>> when using TWCS. 
 On Fri, Feb 9, 2018 at 8:39 AM DuyHai Doan  wrote:
 Or use the new user-defined compaction option recently introduced, 
 provided you can determine over which SSTables a partition is spread
 
> On Fri, Feb 9, 2018 at 5:23 PM, Jon Haddad  wrote:
> Give this a read through:
> 
> https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy
> 
> Basically you write your own logic for how stuff gets forgotten, then you 
> can recompact every sstable with upgradesstables -a.  
> 
> Jon
> 
> 
>> On Feb 9, 2018, at 8:10 AM, Nicolas Guyomar  
>> wrote:
>> 
>> Hi everyone,
>> 
>> Because of GDPR we really face the need to support “Right to Be 
>> Forgotten” requests => https://gdpr-info.eu/art-17-gdpr/  stating that 
>> "the controller shall have the obligation to erase personal data without 
>> undue delay"
>> 
>> Because I usually meet customers that do not have that many clients,
>> modeling one partition per client is almost always possible, easing
>> deletion by partition key.
>>
>> Then, apart from triggering a manual compaction on impacted tables
>> using STCS, I do not see how I can be GDPR compliant.
>> 
>> I'm kind of surprised not to find any thread on that matter on the ML, 
>> do you guys have any modeling strategy that would make it easier to get 
>> rid of data ? 
>> 
>> Thank you for any given advice
>> 
>> Nicolas
> 
 
>> 
> 


Re: GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread Dor Laor
I think you're introducing a layer violation. GDPR is a business
requirement and
compaction is an implementation detail.

IMHO it's enough to delete the partition using regular CQL.
It's true that it won't be deleted immediately but it will be eventually
deleted (welcome to eventual consistency ;).

Even with user defined compaction, compaction may not be running instantly,
repair will be required,
there are other nodes in the cluster, maybe partitioned nodes with the
data. There is data in snapshots
and backups.

The business idea is to delete the data in a fast, reasonable time for
humans, making it
first unreachable and later deleting it completely.

On Fri, Feb 9, 2018 at 8:51 AM, Jonathan Haddad  wrote:

> That might be fine for a one off but is totally impractical at scale or
> when using TWCS.
> On Fri, Feb 9, 2018 at 8:39 AM DuyHai Doan  wrote:
>
>> Or use the new user-defined compaction option recently introduced,
>> provided you can determine over which SSTables a partition is spread
>>
>> On Fri, Feb 9, 2018 at 5:23 PM, Jon Haddad  wrote:
>>
>>> Give this a read through:
>>>
>>> https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy
>>>
>>> Basically you write your own logic for how stuff gets forgotten, then
>>> you can recompact every sstable with upgradesstables -a.
>>>
>>> Jon
>>>
>>>
>>> On Feb 9, 2018, at 8:10 AM, Nicolas Guyomar 
>>> wrote:
>>>
>>> Hi everyone,
>>>
>>> Because of GDPR we really face the need to support “Right to Be
>>> Forgotten” requests => https://gdpr-info.eu/art-17-gdpr/  stating that *"the
>>> controller shall have the obligation to erase personal data without undue
>>> delay"*
>>>
>>> Because I usually meet customers that do not have that many clients,
>>> modeling one partition per client is almost always possible, easing
>>> deletion by partition key.
>>>
>>> Then, apart from triggering a manual compaction on impacted tables
>>> using STCS, I do not see how I can be GDPR compliant.
>>>
>>> I'm kind of surprised not to find any thread on that matter on the ML,
>>> do you guys have any modeling strategy that would make it easier to get rid
>>> of data ?
>>>
>>> Thank you for any given advice
>>>
>>> Nicolas
>>>
>>>
>>>
>>


Re: GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread Jonathan Haddad
That might be fine for a one off but is totally impractical at scale or
when using TWCS.
On Fri, Feb 9, 2018 at 8:39 AM DuyHai Doan  wrote:

> Or use the new user-defined compaction option recently introduced,
> provided you can determine over which SSTables a partition is spread
>
> On Fri, Feb 9, 2018 at 5:23 PM, Jon Haddad  wrote:
>
>> Give this a read through:
>>
>>
>> https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy
>>
>> Basically you write your own logic for how stuff gets forgotten, then you
>> can recompact every sstable with upgradesstables -a.
>>
>> Jon
>>
>>
>> On Feb 9, 2018, at 8:10 AM, Nicolas Guyomar 
>> wrote:
>>
>> Hi everyone,
>>
>> Because of GDPR we really face the need to support “Right to Be
>> Forgotten” requests => https://gdpr-info.eu/art-17-gdpr/  stating that *"the
>> controller shall have the obligation to erase personal data without undue
>> delay"*
>>
>> Because I usually meet customers that do not have that many clients,
>> modeling one partition per client is almost always possible, easing
>> deletion by partition key.
>>
>> Then, apart from triggering a manual compaction on impacted tables using
>> STCS, I do not see how I can be GDPR compliant.
>>
>> I'm kind of surprised not to find any thread on that matter on the ML, do
>> you guys have any modeling strategy that would make it easier to get rid of
>> data ?
>>
>> Thank you for any given advice
>>
>> Nicolas
>>
>>
>>
>


Re: GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread DuyHai Doan
Or use the new user-defined compaction option recently introduced, provided
you can determine over which SSTables a partition is spread

On Fri, Feb 9, 2018 at 5:23 PM, Jon Haddad  wrote:

> Give this a read through:
>
> https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy
>
> Basically you write your own logic for how stuff gets forgotten, then you
> can recompact every sstable with upgradesstables -a.
>
> Jon
>
>
> On Feb 9, 2018, at 8:10 AM, Nicolas Guyomar 
> wrote:
>
> Hi everyone,
>
> Because of GDPR we really face the need to support “Right to Be Forgotten”
> requests => https://gdpr-info.eu/art-17-gdpr/  stating that *"the
> controller shall have the obligation to erase personal data without undue
> delay"*
>
> Because I usually meet customers that do not have that many clients,
> modeling one partition per client is almost always possible, easing
> deletion by partition key.
>
> Then, apart from triggering a manual compaction on impacted tables using
> STCS, I do not see how I can be GDPR compliant.
>
> I'm kind of surprised not to find any thread on that matter on the ML, do
> you guys have any modeling strategy that would make it easier to get rid of
> data ?
>
> Thank you for any given advice
>
> Nicolas
>
>
>


Re: GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread Jon Haddad
Give this a read through:

https://github.com/protectwise/cassandra-util/tree/master/deleting-compaction-strategy
 


Basically you write your own logic for how stuff gets forgotten, then you can 
recompact every sstable with upgradesstables -a.  

Jon


> On Feb 9, 2018, at 8:10 AM, Nicolas Guyomar  wrote:
> 
> Hi everyone,
> 
> Because of GDPR we really face the need to support “Right to Be Forgotten” 
> requests => https://gdpr-info.eu/art-17-gdpr/ 
>   stating that "the controller shall have 
> the obligation to erase personal data without undue delay"
> 
> Because I usually meet customers that do not have that many clients, modeling
> one partition per client is almost always possible, easing deletion by
> partition key.
>
> Then, apart from triggering a manual compaction on impacted tables using
> STCS, I do not see how I can be GDPR compliant.
> 
> I'm kind of surprised not to find any thread on that matter on the ML, do you 
> guys have any modeling strategy that would make it easier to get rid of data 
> ? 
> 
> Thank you for any given advice
> 
> Nicolas



GDPR, Right to Be Forgotten, and Cassandra

2018-02-09 Thread Nicolas Guyomar
Hi everyone,

Because of GDPR we really face the need to support “Right to Be Forgotten”
requests => https://gdpr-info.eu/art-17-gdpr/  stating that *"the
controller shall have the obligation to erase personal data without undue
delay"*

Because I usually meet customers that do not have that many clients,
modeling one partition per client is almost always possible, easing
deletion by partition key.

Then, apart from triggering a manual compaction on impacted tables using
STCS, I do not see how I can be GDPR compliant.

I'm kind of surprised not to find any thread on that matter on the ML, do
you guys have any modeling strategy that would make it easier to get rid of
data ?

Thank you for any given advice

Nicolas


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-09 Thread vincent gromakowski
It will clearly follow your colleagues' approach on the postgresql operator
https://github.com/zalando-incubator/postgres-operator

Just watch my repo for a first working beta version in the coming weeks
https://github.com/vgkowski/cassandra-operator


2018-02-09 15:20 GMT+01:00 Oleksandr Shulgin :

> On Fri, Feb 9, 2018 at 1:01 PM, vincent gromakowski <
> vincent.gromakow...@gmail.com> wrote:
>
>> Working on a Kubernetes operator for Cassandra (Alpha stage...)
>>
>
> I would love to learn more about your approach.  Do you have anything to
> show already?  Design docs / prototype?
>
> --
> Alex
>
>


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-09 Thread Oleksandr Shulgin
On Fri, Feb 9, 2018 at 1:01 PM, vincent gromakowski <
vincent.gromakow...@gmail.com> wrote:

> Working on a Kubernetes operator for Cassandra (Alpha stage...)
>

I would love to learn more about your approach.  Do you have anything to
show already?  Design docs / prototype?

--
Alex


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-09 Thread vincent gromakowski
Working on a Kubernetes operator for Cassandra (Alpha stage...)

On Feb 9, 2018 12:56 PM, "Oleksandr Shulgin" 
wrote:

> On Fri, Feb 9, 2018 at 12:46 AM, Krish Donald 
> wrote:
>
>> Hi All,
>>
>> What kind of automation do you have for Cassandra-related operations on AWS,
>> like restacking, restarting the cluster, changing cassandra.yaml
>> parameters, etc.?
>>
>
> We wrote some scripts customized for Zalando's STUPS platform:
> https://github.com/zalando-stups/planb-cassandra  (Warning! messy Python
> inside)
>
> We deploy EBS-backed instances with AWS EC2 auto-recovery enabled.
> Cassandra runs inside Docker on the EC2 hosts.
>
> The EBS setup allows us to perform rolling restarts / binary updates
> without streaming.
>
> Updating configuration parameters is a bit tricky since there are many
> places where different stuff is configured: cassandra-env.sh, jvm.options,
> cassandra.yaml and environment variables.  We don't have a comprehensive
> answer to that yet.
>
> Cheers,
> --
> Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
> 127-59-707
>
>


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-09 Thread Oleksandr Shulgin
On Fri, Feb 9, 2018 at 12:46 AM, Krish Donald  wrote:

> Hi All,
>
> What kind of automation do you have for Cassandra-related operations on AWS,
> like restacking, restarting the cluster, changing cassandra.yaml
> parameters, etc.?
>

We wrote some scripts customized for Zalando's STUPS platform:
https://github.com/zalando-stups/planb-cassandra  (Warning! messy Python
inside)

We deploy EBS-backed instances with AWS EC2 auto-recovery enabled.
Cassandra runs inside Docker on the EC2 hosts.

The EBS setup allows us to perform rolling restarts / binary updates
without streaming.

Updating configuration parameters is a bit tricky since there are many
places where different stuff is configured: cassandra-env.sh, jvm.options,
cassandra.yaml and environment variables.  We don't have a comprehensive
answer to that yet.

Cheers,
-- 
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
127-59-707


Re: Refresh from Prod to Dev

2018-02-09 Thread Pradeep Chhetri
Hi Anshu,

We used to have similar requirements in my workplace.

We tried multiple options, like taking a snapshot and restoring it, but the
best one that worked for us was building a pre-prod cluster with the same
number of nodes as the production cluster, doing a parallel scp of the data
directly from production to pre-prod, and then running nodetool refresh.

On Fri, 9 Feb 2018 at 12:03 PM, Anshu Vajpayee 
wrote:

> Team,
>
> I want to validate and do a POC on production data. Data on production
> is huge.  What could be the optimal method to move the data from Prod to a
> Dev environment?  I know there are a few solutions, but which is the most
> efficient method to refresh the dev env?
>
> --
> C*heers,
> Anshu V
>
>
>


Re: Refresh from Prod to Dev

2018-02-09 Thread Rahul Singh
If you have an equivalent number of nodes, then use snapshot to back up and then
restore them on Dev. You will need to create the schema on the Dev box. The
CFiD will be different, so at most you may have to rename the Prod sstable dirs
to match what's on Dev.

Another method is to use sstableloader if you don’t have an equivalent number 
of nodes.

Otherwise, if you can throw away Dev, just take everything from Prod and bring
it up in a new Dev.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Feb 9, 2018, 1:18 AM -0500, Anshu Vajpayee , wrote:
> Team,
>
> I want to validate and do a POC on production data. Data on production is huge.
> What could be the optimal method to move the data from Prod to a Dev environment?
> I know there are a few solutions, but which is the most efficient method to
> refresh the dev env?
>
> --
> C*heers,
> Anshu V
>
>


Re: Bootstrapping fails with < 128GB RAM ...

2018-02-09 Thread Jürgen Albersdorfer
Hi Jon,
should I register on JIRA and open an issue, or will you do so?
I'm currently trying to bootstrap another node, with 100 GB RAM this time,
and I'm recording Java heap memory over time via JConsole and Top Threads, and
monitoring the debug.log.

There, in the debug.log, I can see that the other nodes seem to
immediately start hinting to the joining node, indicated by the following
logs, of which I have hundreds per second in my debug.log:

DEBUG [MutationStage-27] 2018-02-09 12:06:03,241 HintVerbHandler.java:95 -
Failed to apply hint
java.util.concurrent.CompletionException:
org.apache.cassandra.exceptions.WriteTimeoutException: Operation timed out
- received only 0 responses.
at
java.util.concurrent.CompletableFuture.encodeThrowable(CompletableFuture.java:292)
~[na:1.8.0_151]
at
java.util.concurrent.CompletableFuture.completeThrowable(CompletableFuture.java:308)
~[na:1.8.0_151]
at
java.util.concurrent.CompletableFuture.uniAccept(CompletableFuture.java:647)
~[na:1.8.0_151]
at
java.util.concurrent.CompletableFuture$UniAccept.tryFire(CompletableFuture.java:632)
~[na:1.8.0_151]
at
java.util.concurrent.CompletableFuture.postComplete(CompletableFuture.java:474)
~[na:1.8.0_151]
at
java.util.concurrent.CompletableFuture.completeExceptionally(CompletableFuture.java:1977)
~[na:1.8.0_151]
at
org.apache.cassandra.db.Keyspace.applyInternal(Keyspace.java:523)
~[apache-cassandra-3.11.1.jar:3.11.1]
at
org.apache.cassandra.db.Keyspace.lambda$applyInternal$0(Keyspace.java:538)
~[apache-cassandra-3.11.1.jar:3.11.1]
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
~[na:1.8.0_151]
at
org.apache.cassandra.concurrent.AbstractLocalAwareExecutorService$FutureTask.run(AbstractLocalAwareExecutorService.java:162)
~[apache-cassandra-3.11.1.jar:3.11.1]
at
org.apache.cassandra.concurrent.SEPWorker.run(SEPWorker.java:109)
~[apache-cassandra-3.11.1.jar:3.11.1]
at java.lang.Thread.run(Thread.java:748) ~[na:1.8.0_151]
Caused by: org.apache.cassandra.exceptions.WriteTimeoutException: Operation
timed out - received only 0 responses.
... 6 common frames omitted

Could this be connected? Maybe it is causing the excessive RAM requirement?

Thanks so far, regards
Juergen

2018-02-07 19:49 GMT+01:00 Jon Haddad :

> It would be extremely helpful to get some info about your heap.  At a bare
> minimum, a histogram of the heap dump would be useful, but ideally a full
> heap dump would be best.
>
> jmap  -dump:live,format=b,file=heap.bin PID
>
> Taking a look at that in YourKit should give some pretty quick insight
> into what kinds of objects are allocated then we can get to the bottom of
> the issue.  This should be moved to a JIRA
> (https://issues.apache.org/jira/secure/Dashboard.jspa) in order to track and
> fix it, if you could
> attach that heap dump it would be very helpful.
>
> Jon
>
>
> On Feb 7, 2018, at 6:11 AM, Nicolas Guyomar 
> wrote:
>
Ok then, following up on the wild guess: because you have quite a lot of
concurrent compactors, maybe it is too many concurrent compactions for the
JVM to deal with (taking into account that your load average of 106 seems
really high IMHO)
>
55 GB of data is not that much; you can try to reduce those concurrent
compactors to make sure your box is not under too much stress (how many
compactions do you have running in parallel during bootstrap?)
>
In the end, it does seem that you're gonna have to share some heap dump
for further investigation (sorry, I'm not gonna be much help on this matter)
>
> On 7 February 2018 at 14:43, Jürgen Albersdorfer 
> wrote:
>
>> Hi Nicolas,
>>
>> Do you know how many sstables is this new node suppose to receive ?
>>
>>
>> If I can find out this via nodetool netstats, then this would be 619 as
>> following:
>>
>> # nodetool netstats
>> Bootstrap b95371e0-0c0a-11e8-932b-f775227bf21c
>> /192.168.1.215 - Receiving 71 files, 7744612158 bytes total. Already
>> received 0 files, 893897583 bytes total
>> /192.168.1.214 - Receiving 58 files, 5693392001 bytes total. Already
>> received 0 files, 1078372756 bytes total
>> /192.168.1.206 - Receiving 52 files, 3389096409 bytes total. Already
>> received 3 files, 508592758 bytes total
>> /192.168.1.213 - Receiving 59 files, 6041633329 bytes total. Already
>> received 0 files, 1038760653 bytes total
>> /192.168.1.231 - Receiving 79 files, 7579181689 bytes total. Already
>> received 4 files, 38387859 bytes total
>> /192.168.1.208 - Receiving 51 files, 3272885123 bytes total. Already
>> received 3 files, 362450903 bytes total
>> /192.168.1.207 - Receiving 56 files, 3028344200 bytes total. Already
>> received 3 files, 57790197 bytes total
>> /192.168.1.232 - Receiving 79 files, 7268716317 bytes total. Already
>> received 1 files, 

Re: Hints folder missing in Cassandra

2018-02-09 Thread Nicolas Guyomar
Hi,

There is no piece of code in Cassandra that would remove this folder. You
should start looking elsewhere, as other people mentioned (Chef, Ansible
and so on). Good luck!


On 8 February 2018 at 22:54, test user  wrote:

> Does anyone have more input on the missing hints folder, or rather on why it
> gets deleted?
>
> Has anyone run into this scenario before?
>
> Regards,
> Cassandra User
>
> On Wed, Feb 7, 2018 at 9:21 PM, test user  wrote:
>
>> The problem is that even though I get that O_RDONLY WARN message, if I try to
>> navigate to the path where the hints folder is stored, the folder is not
>> present.
>> I cannot check the permissions on that folder; it's already missing, deleted
>> somehow.
>>
>> I believe everything runs as the root user.
>>
>> I do see a lot of sstable activity performed by MemTableFlushWriter
>> (ColumnFamilyStore), CompactionExecutor, PerDiskFlushWriter (MemTable)
>> before and after this WARN message.
>>
>> It is not a space issue, I checked that already.
>>
>>
>>
>> On Wed, Feb 7, 2018 at 3:49 PM, Nate McCall 
>> wrote:
>>
>>>
>>> The environment is built using established images for Cassandra 3.10.
 Unfortunately the debug log does not indicate any errors before I start
 seeing the WARN for the missing hints folder. I understand that hint files will
 be deleted after replay is complete, but I am not sure of the root cause of why
 the hints folder is getting deleted.
 When I look at nodetool status or nodetool ring, it indicates that
 all nodes are up and running in a normal state; no node went down. Also, I do
 not see anything in the debug logs indicating that a node went down. In such a
 scenario, I am not sure why HintsWriterExecutor would get triggered.


>>> That error code (O_RDONLY) in the log message indicates that the hints
>>> folder has had its permission bits set to read only.
>>>
>>> We've had several issues with some of the tools doing this type of thing
>>> when they are run as the root user. Is this specific node one on which you
>>> use any of the tools like sstableloader or similar? If so, are you running
>>> them as root?
>>>
>>> Another thought - if it is on a different partition than the data
>>> directory, is there free space left on the underlying device holding:
>>> /var/lib/cassandra/hints?
>>>
>>>
>>> --
>>> -
>>> Nate McCall
>>> Wellington, NZ
>>> @zznate
>>>
>>> CTO
>>> Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>
>>
>