LWT broken?

2018-02-08 Thread Mahdi Ben Hamida

Hello,

I'm running a 2.0.17 cluster (I know, I know, need to upgrade) with 46 
nodes across 3 racks (& RF=3). I'm seeing that under high contention, 
LWT may actually not guarantee uniqueness. With a total of 16 million 
LWT transactions (with peak LWT concurrency around 5k/sec), I found 38 
conflicts that should have been impossible. I was wondering if there 
were any known issues that make LWT broken for this old version of 
Cassandra.


I use LWT to guarantee that a 128 bit number (hash) maps to a unique 64 
bit number (id). There could be a large number of threads trying to 
allocate an id for a given hash.


I do the following logic (slightly more complicated than this due to 
timeout handling)


 1  existing_id = SELECT id FROM hash_id WHERE hash = computed_hash | consistency = ONE
 2  if existing_id != null:
 3    return existing_id
 4  new_id = generateUniqueId()
 5  result = INSERT INTO hash_id (hash, id) VALUES (computed_hash, new_id) IF NOT EXISTS | consistency = QUORUM, serialConsistency = SERIAL
 6  if result == [applied]  // i.e. we won the LWT
 7    return new_id
 8  else  // we lost the LWT, fetch the winning value
 9    existing_id = SELECT id FROM hash_id WHERE hash = computed_hash | consistency = ONE
10    return existing_id

Is there anything flawed about this?
I do the reads at lines #1 and #9 at a consistency of ONE. Would that 
cause uncommitted changes to be seen (i.e., dirty reads)? Should it be 
SERIAL consistency instead? My understanding is that only one 
transaction will be able to apply the write (at quorum), so doing a read 
at a consistency of ONE will either result in a null, or I would get the 
id that won the LWT race.
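
For comparison, here is a minimal sketch of the same flow with two changes: the read is expected to run at SERIAL, and the losing path takes the winning id from the LWT response instead of issuing a second read at ONE. It rests on two points: a SERIAL read completes any in-flight Paxos round before returning, and a rejected IF NOT EXISTS insert returns the existing row alongside [applied] = false. The function name and parameters are hypothetical; the session is assumed to behave like the DataStax Python driver's Session (execute(), ResultSet.one(), ResultSet.was_applied). This is an illustration under those assumptions, not a confirmed explanation of the 38 conflicts.

```python
from typing import Callable


def get_or_allocate_id(session, select_stmt, insert_stmt, hash_value,
                       generate_id: Callable[[], int]) -> int:
    """Return the id mapped to hash_value, allocating one via LWT if absent.

    Assumptions (not from the original post):
      select_stmt: "SELECT id FROM hash_id WHERE hash = %s", built by the
                   caller with consistency level SERIAL
      insert_stmt: "INSERT INTO hash_id (hash, id) VALUES (%s, %s) IF NOT EXISTS",
                   built with consistency QUORUM and serial consistency SERIAL
    Timeout handling is omitted here, as in the original pseudocode; a timed-out
    LWT has an unknown outcome and needs a SERIAL re-read before retrying.
    """
    # Step 1: a SERIAL read finishes any pending Paxos round before answering.
    row = session.execute(select_stmt, (hash_value,)).one()
    if row is not None:
        return row.id

    # Steps 4-5: try to claim the hash with a conditional insert.
    new_id = generate_id()
    result = session.execute(insert_stmt, (hash_value, new_id))
    if result.was_applied:
        return new_id  # we won the LWT

    # We lost: the rejected insert's response already carries the existing row,
    # so no follow-up read at ONE is needed.
    return result.one().id
```

Taking the id from the rejected insert avoids depending on a single replica having already received the commit, which a read at ONE cannot guarantee.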


Any help is appreciated. I've been banging my head on this issue 
(thinking it was a bug in the code) for some time now.


--
Mahdi.



Refresh from Prod to Dev

2018-02-08 Thread Anshu Vajpayee
Team,

I want to validate and run a POC on production data. The data on production
is huge. What would be the optimal method to move the data from the Prod to
the Dev environment? I know there are a few solutions, but which is the most
efficient way to refresh the dev environment?

-- 
Cheers,
Anshu V


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-08 Thread Lerh Chuan Low
Terraform, Packer and Ansible do pretty decently; you may have to do some
smarts around replacing nodes and attaching the right volumes to replaced
nodes. If you could get Kubernetes working with Cassandra (beyond the
readily available guides) then I think you'll be a total baller.

On 9 February 2018 at 15:45, daemeon reiydelle  wrote:

> Terraform plus ansible. Put ok but messy. 5-30,000 nodes and infra
>
>
> Daemeon (Dæmœn) Reiydelle
> USA 1.415.501.0198
>
> On Thu, Feb 8, 2018, 15:57 Ben Wood  wrote:
>
>> Shameless plug of our (DC/OS) Apache Cassandra service:
>> https://docs.mesosphere.com/services/cassandra/2.0.3-3.0.14.
>>
>> You must run DC/OS, but it will handle:
>> Restarts
>> Replacement of nodes
>> Modification of configuration
>> Backups and Restores (to S3)
>>
>> On Thu, Feb 8, 2018 at 3:46 PM, Krish Donald 
>> wrote:
>>
>>> Hi All,
>>>
>>> What kind of Automation you have for Cassandra related operations on AWS
>>> like restacking, restart of the cluster , changing cassandra.yaml
>>> parameters etc ?
>>>
>>> Thanks
>>>
>>>
>>
>>
>> --
>> Ben Wood
>> Software Engineer - Data Agility
>> Mesosphere
>>
>


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-08 Thread daemeon reiydelle
Terraform plus ansible. Put ok but messy. 5-30,000 nodes and infra


Daemeon (Dæmœn) Reiydelle
USA 1.415.501.0198

On Thu, Feb 8, 2018, 15:57 Ben Wood  wrote:

> Shameless plug of our (DC/OS) Apache Cassandra service:
> https://docs.mesosphere.com/services/cassandra/2.0.3-3.0.14.
>
> You must run DC/OS, but it will handle:
> Restarts
> Replacement of nodes
> Modification of configuration
> Backups and Restores (to S3)
>
> On Thu, Feb 8, 2018 at 3:46 PM, Krish Donald  wrote:
>
>> Hi All,
>>
>> What kind of Automation you have for Cassandra related operations on AWS
>> like restacking, restart of the cluster , changing cassandra.yaml
>> parameters etc ?
>>
>> Thanks
>>
>>
>
>
> --
> Ben Wood
> Software Engineer - Data Agility
> Mesosphere
>


Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-08 Thread Romain Hardouin
 
At Teads we use Terraform, Chef, Packer and Rundeck for our AWS infrastructure.
I'll publish a blog post on Medium that talks about that; it's in the pipeline.
Terraform is awesome.

Best,
Romain

On Friday, 9 February 2018 at 00:57:01 UTC+1, Ben Wood wrote:

> Shameless plug of our (DC/OS) Apache Cassandra service:
> https://docs.mesosphere.com/services/cassandra/2.0.3-3.0.14.
>
> You must run DC/OS, but it will handle:
> Restarts
> Replacement of nodes
> Modification of configuration
> Backups and Restores (to S3)
>
> On Thu, Feb 8, 2018 at 3:46 PM, Krish Donald  wrote:
>
>> Hi All,
>>
>> What kind of Automation you have for Cassandra related operations on AWS
>> like restacking, restart of the cluster , changing cassandra.yaml
>> parameters etc ?
>>
>> Thanks
>
> --
> Ben Wood
> Software Engineer - Data Agility
> Mesosphere

Re: What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-08 Thread Ben Wood
Shameless plug of our (DC/OS) Apache Cassandra service:
https://docs.mesosphere.com/services/cassandra/2.0.3-3.0.14.

You must run DC/OS, but it will handle:
Restarts
Replacement of nodes
Modification of configuration
Backups and Restores (to S3)

On Thu, Feb 8, 2018 at 3:46 PM, Krish Donald  wrote:

> Hi All,
>
> What kind of Automation you have for Cassandra related operations on AWS
> like restacking, restart of the cluster , changing cassandra.yaml
> parameters etc ?
>
> Thanks
>
>


-- 
Ben Wood
Software Engineer - Data Agility
Mesosphere


What kind of Automation you have for Cassandra related operations on AWS ?

2018-02-08 Thread Krish Donald
Hi All,

What kind of Automation you have for Cassandra related operations on AWS
like restacking, restart of the cluster , changing cassandra.yaml
parameters etc ?

Thanks


Re: Hints folder missing in Cassandra

2018-02-08 Thread test user
Does anyone have more input on the missing hints folder, or rather on why it
gets deleted?

Has anyone run into this scenario before?

Regards,
Cassandra User

On Wed, Feb 7, 2018 at 9:21 PM, test user  wrote:

> The problem is even though, I get that O_RDONLY WARN message, if I try to
> navigate to the path where the hints folder is stored, the folder is not
> present.
> I cannot check the permissions on that folder, its already missing, got
> deleted somehow.
>
> I believe everything runs as the root user.
>
> I do see a lot of sstable activity performed by MemTableFlushWriter
> (ColumnFamilyStore), CompactionExecutor, PerDiskFlushWriter (MemTable)
> before and after this WARN message.
>
> It is not a space issue, I checked that already.
>
>
>
> On Wed, Feb 7, 2018 at 3:49 PM, Nate McCall 
> wrote:
>
>>
>>> The environment is built using established images for Cassandra 3.10.
>>> Unfortunately the debug log does not indicate any errors before I start
>>> seeing the WARN for missing hints folder. I understand that hints file will
>>> be deleted after replay is complete, but not sure of the root cause of why
>>> the hints folder is getting deleted.
>>> When I look at the nodetool status or nodetool ring - it indicates that
>>> all nodes are up and running in normal state, no node went down. Also, I do
>>> not see anything the debug logs indicating that a node went down. In such a
>>> scenario, I am not sure why HintsWriterExecutor would get triggered.
>>>
>>>
>> That error code (O_RDONLY) in the log message indicates that the hints
>> folder has had its permission bits set to read only.
>>
>> We've had several issues with some of the tools doing this type of thing
>> when they are run as the root user. Is this specific node one on which you
>> use any of the tools like sstableloader or similar? If so, are you running
>> them as root?
>>
>> Another thought - if it is on a different partition than the data
>> directory, is there free space left on the underlying device holding:
>> /var/lib/cassandra/hints?
>>
>>
>> --
>> Nate McCall
>> Wellington, NZ
>> @zznate
>>
>> CTO
>> Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>
>


Re: Add column if it does not exist?

2018-02-08 Thread Eric Stevens
To hop on what Jon said, if your concern is automatic application of schema
migrations, you want to be very careful with this.  I'd consider it an
unsolved problem in Cassandra for some methods of schema application.

The failed ALTER is not what you have to worry about, it's two successful
ALTERs that will be the problem.  Cassandra is eventually consistent,
including for schema changes, and they are also not isolated.  If two
identical ALTER commands both succeed, you will end up in a schema
disagreement.  This is not fun to recover from.

You MUST coordinate your schema migrations FROM A SINGLE HOST ONLY.
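
A minimal sketch of the "check, then ALTER" step for a single coordinating process, using the driver-side schema metadata mentioned further down the thread. The function name and the validation patterns are hypothetical; `metadata` is assumed to look like the DataStax Python driver's `cluster.metadata` (keyspaces -> tables -> columns dictionaries) and `session` like its Session. The check and the ALTER are not atomic, so this does nothing to make concurrent runs safe.

```python
import re

# Conservative patterns (an assumption for this sketch): unquoted lowercase
# identifiers and simple or collection type names only.
_IDENTIFIER = re.compile(r"^[a-z][a-z0-9_]*$")
_CQL_TYPE = re.compile(r"^[a-z][a-z0-9_<>, ]*$")


def add_column_if_missing(session, metadata, keyspace, table, column, cql_type) -> bool:
    """Add `column` to `keyspace.table` unless the schema metadata already lists it.

    Returns True if an ALTER was issued, False if the column was already present.
    Run this from ONE host only; two concurrent runs can both pass the check.
    """
    for name in (keyspace, table, column):
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unexpected identifier: {name!r}")
    if not _CQL_TYPE.match(cql_type):
        raise ValueError(f"unexpected CQL type: {cql_type!r}")

    # Column names are keyed as stored in the schema (lowercase unless quoted).
    columns = metadata.keyspaces[keyspace].tables[table].columns
    if column in columns:
        return False

    # Identifiers cannot be bound as query parameters, hence the validation above.
    session.execute(f"ALTER TABLE {keyspace}.{table} ADD {column} {cql_type}")
    return True
```

Checking the metadata first avoids parsing the exception message, but it is only a convenience for the one process that owns migrations.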

On Wed, Feb 7, 2018 at 12:23 PM Rahul Singh 
wrote:

> Yeah. I saw one such migration run concurrently via a Spark job; it
> created 4 Cfids and migrated data. It was a nightmare to clean up the
> duplicated sstables.
>
> Schema alteration and data migration should always be handled by
> applications separate from the actual system.
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Feb 7, 2018, 12:39 PM -0600, Jon Haddad , wrote:
>
> All of the drivers also have keyspace / table metadata. For instance:
> https://datastax.github.io/python-driver/api/cassandra/metadata.html
>
> I’d be *really* careful how you use this.  A lot of teams want to just
> deploy their code to a couple hundred servers and let them race to apply
> the ALTER.  That will not be a fun time.  I advise against using LWT to
> manage this as well. If you’re looking to apply schema changes like this,
> I’d let a single app server manage it and avoid the headache of concurrency.
>
> Jon
>
> On Feb 6, 2018, at 8:13 PM, Irtiza Ali  wrote:
>
> Hello,
>
> this link might also be helpful to you for querying table schema.
>
> Link:
> https://docs.datastax.com/en/cql/3.3/cql/cql_using/useQuerySystemTable.html
>
> Best, Irtiza
>
>
>
> On Tue, Feb 6, 2018 at 9:55 PM, Oliver Ruebenacker 
> wrote:
>
>>
>>  Hello,
>>
>>   Is there a describe query in CQL? I don't see one on
>> http://cassandra.apache.org/doc/latest/cql/index.html.
>>
>>   I also can't find such a query in the DataStax Java driver API.
>>
>>   Thanks!
>>
>>  Best, Oliver
>>
>> On Tue, Feb 6, 2018 at 11:48 AM, Irtiza Ali  wrote:
>>
>>> Hello.
>>>
>>> Another thing that you can try is to use the describe table query to
>>> get the table schema and parse it. Once done, you can check whether the
>>> column exists or not.
>>>
>>>
>>> With Regards
>>> Irtiza Ali
>>>
>>> On 6 Feb 2018 21:35, "Oliver Ruebenacker"  wrote:
>>>
>>>>   Thanks for the response!
>>>>
>>>>   So, the best solution I can come up with is catching the
>>>> InvalidQueryException and checking whether its message contains the phrase
>>>> "conflicts with an existing column". Seems to work, but super-ugly.
>>>>
>>>>   I do assume that in general, if a request fails, it does not
>>>> permanently change the data in Cassandra, right?
>>>>
>>>>   It would be great if alter-add could have an if-not-exists clause.
>>>> Would that be hard to implement?
>>>>
>>>>   I could not find a standard CQL way of asking what columns exist. Did
>>>> I miss it? Would it be hard to implement?
>>>>
>>>>   I get that we're only eventually consistent anyway.
>>>>
>>>>   Thanks!
>>>>
>>>>  Best, Oliver
>>>>
>>>> On Mon, Feb 5, 2018 at 5:12 PM, Rahul Singh <
>>>> rahul.xavier.si...@gmail.com> wrote:
>>>>
>>>>> Yeah, you can handle the exception; what I meant was that it wouldn't
>>>>> cause harm to the DB
>>>>>
>>>>> --
>>>>> Rahul Singh
>>>>> rahul.si...@anant.us
>>>>>
>>>>> Anant Corporation
>>>>>
>>>>> On Feb 5, 2018, 5:07 PM -0500, Oliver Ruebenacker ,
>>>>> wrote:
>>>>>
>>>>> Well, it does throw an InvalidQueryException if the column already
>>>>> exists.
>>>>>
>>>>> On Mon, Feb 5, 2018 at 4:44 PM, Rahul Singh <
>>>>> rahul.xavier.si...@gmail.com> wrote:
>>>>>
>>>>>> Since CQL != SQL, there isn't a syntactical way. Just run the alter
>>>>>> table command and it shouldn't be an issue if it's there.
>>>>>>
>>>>>> --
>>>>>> Rahul Singh
>>>>>> rahul.si...@anant.us
>>>>>>
>>>>>> Anant Corporation
>>>>>>
>>>>>> On Feb 5, 2018, 4:15 PM -0500, Oliver Ruebenacker ,
>>>>>> wrote:
>>>>>>
>>>>>>
>>>>>>  Hello,
>>>>>>
>>>>>>   What's the easiest way to add a column to a table but only if it
>>>>>> does not exist? Thanks!
>>>>>>
>>>>>>  Best, Oliver
>>>>>>
>>>>>> --
>>>>>> Oliver Ruebenacker
>>>>>> Senior Software Engineer, Diabetes Portal
>>>>>> , Broad Institute
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Oliver Ruebenacker
>>>>> Senior Software Engineer, Diabetes Portal
>>>>> , Broad Institute
>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>>  --
>>>>  Oliver Ruebenacker
>>>>  Senior Software Engineer, Diabetes Portal
>>>>  ,