One time major deletion/purge vs periodic deletion

2018-03-05 Thread Charulata Sharma (charshar)
Hi,

  Wanted the community’s feedback on deciding the schedule of Archive and 
Purge job.
Is it better to purge a large volume of data at regular intervals (e.g. run the 
archive/purge job once every 3 months) or to purge smaller amounts more 
frequently (run the job weekly)?

Some estimates on the number of deletes performed: up to 80-90K rows purged 
every 3 months vs. ~10K deletes every week.

Thanks,
Charu



Re: Cassandra Daemon not coming up

2018-03-05 Thread mahesh rajamani
I did not add any user and disk space was fine.



On Tue, Feb 27, 2018, 11:33 Rahul Singh 
wrote:

> Were there any changes to the system such as permissions, etc. Did you add
> users / change auth scheme?
>
> On Feb 27, 2018, 10:27 AM -0600, ZAIDI, ASAD A , wrote:
>
> Can you check if you’ve enough disk space available ?
>
> ~Asad
>
>
>
> *From:* mahesh rajamani [mailto:rajamani.mah...@gmail.com]
> *Sent:* Tuesday, February 27, 2018 10:11 AM
> *To:* user@cassandra.apache.org
> *Subject:* Cassandra Daemon not coming up
>
>
>
> I am using Cassandra version 3.0.9 on a 12-node cluster. I have multiple
> nodes down after a restart. Cassandra is not coming up, with an assert
> error as below. When running in debug mode it fails while performing an
> operation on "resource_role_permissons_index" in the system_auth keyspace.
> Please let me know how to bring Cassandra back up from this state.
>
>
>
> Logs from system.log
>
>
>
> INFO  [main] 2018-02-27 15:43:24,005 ColumnFamilyStore.java:389 -
> Initializing system_schema.columns
>
>
> INFO  [main] 2018-02-27 15:43:24,012 ColumnFamilyStore.java:389 -
> Initializing system_schema.triggers
>
>
> INFO  [main] 2018-02-27 15:43:24,019 ColumnFamilyStore.java:389 -
> Initializing system_schema.dropped_columns
>
>
> INFO  [main] 2018-02-27 15:43:24,029 ColumnFamilyStore.java:389 -
> Initializing system_schema.views
>
>
> INFO  [main] 2018-02-27 15:43:24,038 ColumnFamilyStore.java:389 -
> Initializing system_schema.types
>
>
> INFO  [main] 2018-02-27 15:43:24,049 ColumnFamilyStore.java:389 -
> Initializing system_schema.functions
>
>
> INFO  [main] 2018-02-27 15:43:24,061 ColumnFamilyStore.java:389 -
> Initializing system_schema.aggregates
>
>
> INFO  [main] 2018-02-27 15:43:24,072 ColumnFamilyStore.java:389 -
> Initializing system_schema.indexes
>
>
> ERROR [main] 2018-02-27 15:43:24,127 CassandraDaemon.java:709 - Exception
> encountered during startup
>
>
> java.lang.AssertionError: null
>
>
>
> at
> org.apache.cassandra.db.marshal.CompositeType.getInstance(CompositeType.java:103)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.config.CFMetaData.rebuild(CFMetaData.java:311)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.config.CFMetaData.&lt;init&gt;(CFMetaData.java:288)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.config.CFMetaData.create(CFMetaData.java:366)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.schema.SchemaKeyspace.fetchTable(SchemaKeyspace.java:954)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.schema.SchemaKeyspace.fetchTables(SchemaKeyspace.java:928)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspace(SchemaKeyspace.java:891)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.schema.SchemaKeyspace.fetchKeyspacesWithout(SchemaKeyspace.java:868)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.schema.SchemaKeyspace.fetchNonSystemKeyspaces(SchemaKeyspace.java:856)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:136)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.config.Schema.loadFromDisk(Schema.java:126)
> ~[apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:239)
> [apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:568)
> [apache-cassandra-3.0.9.jar:3.0.9]
>
>
> at
> org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:696)
> [apache-cassandra-3.0.9.jar:3.0.9]
>
>
>
> --
>
> Regards,
> Mahesh Rajamani
>
>


paging through cql query on django

2018-03-05 Thread Daniel Santos
I have two queries. One that gives me the first page from a Cassandra table, 
and another one that retrieves the successive pages. The first one is like:

select * from images_by_user where token(iduser) = token(5) limit 10 allow 
filtering;

The successive ones are :

select * from images_by_user where token(iduser) = token(5) and imagekey > 
90b18881-ccd3-4ed4-8cdf-d71eb99b3505 limit 10 allow filtering;

where the image key is the last one on the first page.

I have 13 rows in the table. The first query returns 10 rows both in cqlsh and 
in the application (consistency level ONE, just for development).
The second query only returns results in cqlsh, not in the application. Below 
is the Cassandra database engine configuration I have in the Django application:

'ENGINE': 'django_cassandra_engine',
'NAME': 'xekmypic',
'HOST': 'localhost',
'OPTIONS': {
    'replication': {
        'strategy_class': 'SimpleStrategy',
        'replication_factor': 1
    },
    'connection': {
        'consistency': ConsistencyLevel.LOCAL_ONE,
        'retry_connect': True
        # + All connection options for cassandra.cluster.Cluster()
    }
}

The cassandra version I am using is doc-3.0.9 and below is my virtual env pip 
list :

cassandra-driver (3.9.0)
Cython (0.25)
Django (1.11)
django-cassandra-engine (1.1.0)
mysqlclient (1.3.10)
olefile (0.44)
Pillow (4.1.0)
pip (7.1.2)
python-memcached (1.58)
pytz (2017.2)
setuptools (18.2)
six (1.10.0)

Why does the second page not return any results in the application, while it 
does at the cqlsh prompt?
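The "last key" cursor pattern above can be reproduced outside the database to check the paging logic itself. The sketch below uses hypothetical data and names; in a real application, the driver's automatic paging (`fetch_size` plus `paging_state` on the result) is usually simpler than manual token paging with ALLOW FILTERING:

```python
# Generic sketch of "last-key" cursor paging -- the pattern the two CQL
# queries above implement. Data and names are hypothetical.

def fetch_page(rows, page_size, last_key=None):
    """rows: list of (imagekey, payload) sorted by imagekey,
    i.e. the clustering order within one partition."""
    if last_key is None:
        page = rows[:page_size]
    else:
        page = [r for r in rows if r[0] > last_key][:page_size]
    return page

# 13 rows in the partition, as in the question.
rows = sorted((f"key-{i:02d}", f"img{i}") for i in range(13))

page1 = fetch_page(rows, 10)
page2 = fetch_page(rows, 10, last_key=page1[-1][0])

assert len(page1) == 10 and len(page2) == 3
```

If the second page comes back empty in the app but not in cqlsh, comparing the exact statement each side sends (e.g. via driver request logging) is the quickest way to spot a differing comparison value or an extra predicate added by the ORM.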


-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org



Rocksandra blog post

2018-03-05 Thread Dikang Gu
As some of you already know, the Instagram Cassandra team is working on a
project to use RocksDB as Cassandra's storage engine.

Today we published a blog post about the work we have done and, more
excitingly, the benchmark metrics from an AWS environment.

Check it out here:
https://engineering.instagram.com/open-sourcing-a-10x-reduction-in-apache-cassandra-tail-latency-d64f86b43589

Thanks
Dikang


Re: Read latency

2018-03-05 Thread D. Salvatore
Hi Jeff,
Thank you very much for your response.
Your considerations are definitely right but, at this point, I just want to
consider the Cassandra response time on different Azure VMs size.

Yes, YCSB's GC can impact it, but the total time YCSB spent in GC is ~3% of
the total experiment time (as reported at the end of the YCSB run). In the
results I got, the YCSB latency is more than 4 times higher than the one
gathered from each node.

Thanks
Salvatore

2018-03-05 14:59 GMT+00:00 Jeff Jirsa :

>
>
>
> On Mar 5, 2018, at 6:52 AM, D. Salvatore  wrote:
>
> Hello everyone,
> I am benchmarking a Cassandra installation on Azure composed of 4 nodes
> (Standard_D2S_V3 - 2vCPU and 8GB ram) with a replication factor of 2.
>
>
> Bit smaller than most people would want to run in production.
>
> To benchmark this testbed, I am using a single YCSB instance with the
> workload C (100% read request), a Consistency level ONE and only 10 clients
> ( so very low load).
>
>
> Be sure that you understand the difference between your benchmark and your
> prod use case - especially differences in data model and consistency levels.
>
>
> However, I found that the average latency gathered by YCSB is much
> different from the one obtained from the JMX (org.apache.cassandra.metrics:
> type=ClientRequest,scope=Read,name=Latency)
> I ran the workload for over an hour and:
> - YCSB returns around 4000 us as average latency
> - JMX returns around 780 us as average latency
>
> It is quite an important difference in performance that I don't know how
> to justify.
> Do you have any idea where this difference comes from?
>
>
> Could be app side pauses (ycsb being java itself, could be seeing jvm gc
> on the load testing servers)
>
>
> Regarding the throughput, the value is the same across both readings
> (~2400 ops/sec)
>
> Thanks in advance
> Salvatore
>
>


Re: system.size_estimates - safe to remove sstables?

2018-03-05 Thread Chris Lohfink
Any chance the space is used by snapshots? What files exist there that are 
taking up the space?

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar  
> wrote:
> 
> Hi all,
> 
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it shows 
> up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> output.
> 
> This is while the other node is chugging along - shows only 25MiB consumed by 
> size_estimates (du -sh output).
> 
> Any idea why this discrepancy?
> Is it safe to remove the size_estimates sstables from the affected node and 
> restart the service?
> 
> Thanks,
> Kunal





Re: system.size_estimates - safe to remove sstables?

2018-03-05 Thread Chris Lohfink
Unless you are using Spark or Hadoop, nothing consumes the data in that table 
(unless you have tooling that may use it, like OpsCenter), so you are safe to 
just truncate it, or rm the SSTables while the instance is offline. If you do 
use that table, you can then run `nodetool refreshsizeestimates` to rebuild it, 
or just wait for it to be repopulated automatically (every 5 minutes).

Chris

> On Mar 5, 2018, at 1:02 AM, Kunal Gangakhedkar  
> wrote:
> 
> Hi all,
> 
> I have a 2-node cluster running cassandra 2.1.18.
> One of the nodes has run out of disk space and died - almost all of it shows 
> up as occupied by size_estimates CF.
> Out of 296GiB, 288GiB shows up as consumed by size_estimates in 'du -sh' 
> output.
> 
> This is while the other node is chugging along - shows only 25MiB consumed by 
> size_estimates (du -sh output).
> 
> Any idea why this discrepancy?
> Is it safe to remove the size_estimates sstables from the affected node and 
> restart the service?
> 
> Thanks,
> Kunal





Re: Seed nodes of DC2 creating own versions of system keyspaces

2018-03-05 Thread Jeff Jirsa



> On Mar 5, 2018, at 6:40 AM, Oleksandr Shulgin  
> wrote:
> 
> Hi,
> 
> We were deploying a second DC today with 3 seed nodes (30 nodes in total) and 
> we have noticed that all seed nodes reported the following:
> 
> INFO  10:20:50 Create new Keyspace: KeyspaceMetadata{name=system_traces, 
> params=KeyspaceParams{durable_writes=true, 
> replication=ReplicationParams{class=org.apache.cassandra.locator.SimpleStrategy,
>  replication_factor=2}}, ...
> 
> followed by similar lines for system_distributed and system_auth.  Is this to 
> be expected?


They’re written with timestamp=0 to ensure they’re created at least once, but 
if you’ve ever issued an ALTER to the table or keyspace, your modified version 
will win through normal schema reconciliation process.
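Jeff's point can be sketched as a tiny last-write-wins merge. This is illustrative Python only, not Cassandra's actual schema code, and the timestamps and definitions are made up:

```python
# Sketch of the last-write-wins reconciliation described above
# (illustrative only; not Cassandra's real implementation). The built-in
# system_* keyspace definitions carry writetime 0, so any operator
# ALTER -- which gets a current microsecond timestamp -- wins the merge.

def reconcile(local, remote):
    """Each schema version is (timestamp, definition); keep the newer."""
    return local if local[0] >= remote[0] else remote

builtin = (0, "SimpleStrategy, RF=2")             # seed's bootstrap default
altered = (1520245000000000, "NTS, DC1:3 DC2:3")  # operator's ALTER KEYSPACE

assert reconcile(builtin, altered) == altered
assert reconcile(altered, builtin) == altered     # order doesn't matter
```

This is why the seeds logging "Create new Keyspace" is harmless: the timestamp-0 creation can never overwrite a definition the operator has altered.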


> 
> Cassandra version is 3.0.15.  The DC2 was added to NTS replication setting 
> for all of the non-local keyspaces in advance, even before starting any of 
> the new nodes.  The schema versions reported by `nodetool describecluster' 
> are consistent across DCs, that is: all nodes are on the same version.
> 
> All new nodes use auto_bootstrap=true (in order for 
> allocate_tokens_for_keyspace=mydata_ks to take effect), the seeds ignore this 
> setting and report it.  The non-seed nodes didn't try to create the system 
> keyspaces on their own.
> 
> I would expect that even if we don't add the DC2 in advance, the new nodes 
> should be able to learn about existing system keyspaces and wouldn't try to 
> create their own.  Ultimately we will run `nodetool rebuild' on every node in 
> DC2, but I would like to understand why this schema disagreement occurs 
> initially.
> 
> Thanks,
> -- 
> Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176 
> 127-59-707
> 


Re: Read latency

2018-03-05 Thread Jeff Jirsa



> On Mar 5, 2018, at 6:52 AM, D. Salvatore  wrote:
> 
> Hello everyone,
> I am benchmarking a Cassandra installation on Azure composed of 4 nodes 
> (Standard_D2S_V3 - 2vCPU and 8GB ram) with a replication factor of 2. 

Bit smaller than most people would want to run in production.

> To benchmark this testbed, I am using a single YCSB instance with the 
> workload C (100% read request), a Consistency level ONE and only 10 clients ( 
> so very low load).

Be sure that you understand the difference between your benchmark and your prod 
use case - especially differences in data model and consistency levels.

> 
> However, I found that the average latency gathered by YCSB is much 
> different from the one obtained from the JMX 
> (org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency)
> I ran the workload for over an hour and:
> - YCSB returns around 4000 us as average latency
> - JMX returns around 780 us as average latency
> 
> It is quite an important difference in performance that I don't know how to 
> justify. 
> Do you have any idea where this difference comes from?

Could be app side pauses (ycsb being java itself, could be seeing jvm gc on the 
load testing servers)

> 
> Regarding the throughput, the value is the same across both readings 
> (~2400 ops/sec)
> 
> Thanks in advance
> Salvatore
> 


Read latency

2018-03-05 Thread D. Salvatore
Hello everyone,
I am benchmarking a Cassandra installation on Azure composed of 4 nodes
(Standard_D2S_V3 - 2vCPU and 8GB ram) with a replication factor of 2.
To benchmark this testbed, I am using a single YCSB instance with the
workload C (100% read request), a Consistency level ONE and only 10 clients
( so very low load).

However, I found that the average latency gathered by YCSB is much
different from the one obtained from the JMX
(org.apache.cassandra.metrics:type=ClientRequest,scope=Read,name=Latency)
I ran the workload for over an hour and:
- YCSB returns around 4000 us as average latency
- JMX returns around 780 us as average latency

It is quite an important difference in performance that I don't know how
to justify.
Do you have any idea where this difference comes from?

Regarding the throughput, the value is the same across both readings
(~2400 ops/sec)

Thanks in advance
Salvatore
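One way to see why the two numbers can legitimately differ: the JMX ClientRequest Read latency is measured at the coordinator, while YCSB times the whole round trip. The toy decomposition below uses the reported 780 us for the coordinator; every other number is hypothetical, just to show the shape of the gap:

```python
# Toy decomposition of client-observed vs coordinator-observed latency.
# All numbers except coordinator_us are hypothetical. The JMX
# ClientRequest Read latency covers only the coordinator's handling;
# YCSB additionally sees network round trip, driver overhead, and
# occasional client-side JVM pauses.

coordinator_us = 780          # ~ what JMX reports
network_rtt_us = 1200         # inter-VM round trip (assumed)
driver_overhead_us = 400      # serialization, connection-pool queueing (assumed)
avg_client_pause_us = 1600    # amortized YCSB-side GC / scheduling stalls (assumed)

client_observed_us = (coordinator_us + network_rtt_us
                      + driver_overhead_us + avg_client_pause_us)

assert client_observed_us == 3980   # in the ballpark of YCSB's ~4000 us
```

Measuring ping RTT between the YCSB VM and the cluster, and enabling GC logging on the YCSB JVM, would let each assumed term be replaced with a real one.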


Seed nodes of DC2 creating own versions of system keyspaces

2018-03-05 Thread Oleksandr Shulgin
Hi,

We were deploying a second DC today with 3 seed nodes (30 nodes in total)
and we have noticed that all seed nodes reported the following:

INFO  10:20:50 Create new Keyspace: KeyspaceMetadata{name=system_traces,
params=KeyspaceParams{durable_writes=true,
replication=ReplicationParams{class=org.apache.cassandra.locator.SimpleStrategy,
replication_factor=2}}, ...

followed by similar lines for system_distributed and system_auth.  Is this
to be expected?

Cassandra version is 3.0.15.  The DC2 was added to NTS replication setting
for all of the non-local keyspaces in advance, even before starting any of
the new nodes.  The schema versions reported by `nodetool describecluster'
are consistent across DCs, that is: all nodes are on the same version.

All new nodes use auto_bootstrap=true (in order for
allocate_tokens_for_keyspace=mydata_ks to take effect), the seeds ignore
this setting and report it.  The non-seed nodes didn't try to create the
system keyspaces on their own.

I would expect that even if we don't add the DC2 in advance, the new nodes
should be able to learn about existing system keyspaces and wouldn't try to
create their own.  Ultimately we will run `nodetool rebuild' on every node
in DC2, but I would like to understand why this schema disagreement occurs
initially.

Thanks,
-- 
Oleksandr "Alex" Shulgin | Database Engineer | Zalando SE | Tel: +49 176
127-59-707


Re: How do counter updates work?

2018-03-05 Thread Hannu Kröger
So just to clarify, we have two different use cases:
- TIMEUUID is there for client-side generation of unique row ids. It's great 
for that.
- Cassandra counters are not very good for row-id generation and are better 
suited to e.g. those use cases I listed before.

Hannu
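For reference, a TIMEUUID value corresponds to a standard version 1 (time-based) UUID, which clients can mint with no coordination — e.g. with Python's standard library:

```python
# A v1 (time-based) UUID is what Cassandra's TIMEUUID column type
# stores: it embeds a 100-ns timestamp plus node/clock-sequence bits,
# so clients can generate unique, roughly time-ordered row ids without
# any shared state -- the use case contrasted with counters above.
import uuid

a = uuid.uuid1()
b = uuid.uuid1()

assert a != b            # unique without any coordination
assert a.version == 1    # time-based variant, i.e. a valid TIMEUUID value
# The embedded timestamp is recoverable, which is what makes TIMEUUID
# clustering columns sortable by creation time:
assert b.time >= a.time
```

This is why TIMEUUID covers the "unique row id" need, while counters remain for aggregation-style workloads.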


> On 5 Mar 2018, at 16:34, Javier Pareja  wrote:
> 
> Doesn't cassandra have TIMEUUID for these use cases?
> 
> Anyways, hopefully someone can help me better understand possible delays when 
> writing a counter.
> 
> F Javier Pareja
> 
> On Mon, Mar 5, 2018 at 1:54 PM, Hannu Kröger wrote:
> Traditionally auto increment counters have been used to generate SQL row IDs. 
> This is what Kyrylo probably is here referring to.
> 
> Cassandra counters are better for tracking e.g. usage patterns, web site 
> visitors, statistics, etc. 
> 
> For accurate counting (e.g. for generating IDs) those counters are not good 
> because they are inaccurate in certain cases.
> 
> Hannu
> 
>> On 5 Mar 2018, at 15:50, Javier Pareja wrote:
>> 
>> Hi Kyrulo,
>> 
>> I don't understand how UUIDs are related to counters, but I use counters to 
>> increment the value of a cell in an atomic manner. I could try reading the 
>> value and then writing to the cell but then I would lose the atomicity of 
>> the update.
>> 
>> F Javier Pareja
>> 
>> On Mon, Mar 5, 2018 at 1:32 PM, Kyrylo Lebediev wrote:
>> Hello!
>> 
>> Can't answer your question but there is another one: "why do we need to 
>> maintain counters with their known limitations (and I've heard of some 
>> issues with implementation of counters in Cassandra), when there exist 
>> really effective uuid generation algorithms which allow us to generate 
>> unique values?"
>> https://begriffs.com/posts/2018-01-01-sql-keys-in-depth.html (the article 
>> is about keys in RDBMS's, but its statements are true for NoSQL as well)
>> 
>> Regards, 
>> Kyrill
>> From: jpar...@gmail.com on behalf of Javier Pareja
>> Sent: Monday, March 5, 2018 1:55:14 PM
>> To: user@cassandra.apache.org 
>> Subject: How do counter updates work?
>>  
>> Hello everyone,
>> 
>> I am trying to understand how cassandra counter writes work in more detail 
>> but all that I could find is this: 
>> https://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
>>  
>> 
>> From there I was able to extract the following process:
>> [inline diagram: counter write paths]

Re: How do counter updates work?

2018-03-05 Thread Javier Pareja
Doesn't cassandra have TIMEUUID for these use cases?

Anyways, hopefully someone can help me better understand possible delays
when writing a counter.

F Javier Pareja

On Mon, Mar 5, 2018 at 1:54 PM, Hannu Kröger  wrote:

> Traditionally auto increment counters have been used to generate SQL row
> IDs. This is what Kyrylo probably is here referring to.
>
> Cassandra counters are better for tracking e.g. usage patterns, web site
> visitors, statistics, etc.
>
> For accurate counting (e.g. for generating IDs) those counters are not
> good because they are inaccurate in certain cases.
>
> Hannu
>
> On 5 Mar 2018, at 15:50, Javier Pareja  wrote:
>
> Hi Kyrulo,
>
> I don't understand how UUIDs are related to counters, but I use counters
> to increment the value of a cell in an atomic manner. I could try reading
> the value and then writing to the cell but then I would lose the atomicity
> of the update.
>
> F Javier Pareja
>
> On Mon, Mar 5, 2018 at 1:32 PM, Kyrylo Lebediev 
> wrote:
>
>> Hello!
>>
>> Can't answer your question but there is another one: "why do we need to
>> maintain counters with their known limitations (and I've heard of some
>> issues with implementation of counters in Cassandra), when there exist
>> really effective uuid generation algorithms which allow us to generate
>> unique values?"
>> https://begriffs.com/posts/2018-01-01-sql-keys-in-depth.html
>> (the
>> article is about keys in RDBMS's, but its statements are true for NoSQL as
>> well)
>>
>> Regards,
>> Kyrill
>> --
>> *From:* jpar...@gmail.com  on behalf of Javier Pareja
>> 
>> *Sent:* Monday, March 5, 2018 1:55:14 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* How do counter updates work?
>>
>> Hello everyone,
>>
>> I am trying to understand how cassandra counter writes work in more
>> detail but all that I could find is this:
>> https://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
>> From there I was able to extract the following process:
>> [inline diagram: counter write paths]
>> 
>>
>> PATH 1 will be much quicker than PATH 2 and its bottleneck (assuming HDD
>> drives) will be the commitlog
>> PATH 2 will need at least an access to disk to do a read (potentially
>> even in a different machine) and an access to disk to do a write to the
>> commitlog. This is at least twice as slow as PATH 1.
>>
>> This is all the info that I could get from the internet but a lot is
>> missing. For example, there is no information about how the counter lock is
>> acquired, is there a shared lock across all the nodes?
>>
>> Hope I am 

Re: How do counter updates work?

2018-03-05 Thread Hannu Kröger
Traditionally auto increment counters have been used to generate SQL row IDs. 
This is what Kyrylo probably is here referring to.

Cassandra counters are better for tracking e.g. usage patterns, web site 
visitors, statistics, etc. 

For accurate counting (e.g. for generating IDs) those counters are not good 
because they are inaccurate in certain cases.

Hannu

> On 5 Mar 2018, at 15:50, Javier Pareja  wrote:
> 
> Hi Kyrulo,
> 
> I don't understand how UUIDs are related to counters, but I use counters to 
> increment the value of a cell in an atomic manner. I could try reading the 
> value and then writing to the cell but then I would lose the atomicity of the 
> update.
> 
> F Javier Pareja
> 
> On Mon, Mar 5, 2018 at 1:32 PM, Kyrylo Lebediev wrote:
> Hello!
> 
> Can't answer your question but there is another one: "why do we need to 
> maintain counters with their known limitations (and I've heard of some issues 
> with implementation of counters in Cassandra), when there exist really 
> effective uuid generation algorithms which allow us to generate unique 
> values?"
> https://begriffs.com/posts/2018-01-01-sql-keys-in-depth.html (the article 
> is about keys in RDBMS's, but its statements are true for NoSQL as well)
> 
> Regards, 
> Kyrill
> From: jpar...@gmail.com on behalf of Javier Pareja
> Sent: Monday, March 5, 2018 1:55:14 PM
> To: user@cassandra.apache.org 
> Subject: How do counter updates work?
>  
> Hello everyone,
> 
> I am trying to understand how cassandra counter writes work in more detail 
> but all that I could find is this: 
> https://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
>  
> 
> From there I was able to extract the following process:
> [inline diagram: counter write paths]
>  
>  
> 
> PATH 1 will be much quicker than PATH 2 and its bottleneck (assuming HDD 
> drives) will be the commitlog
> PATH 2 will need at least an access to disk to do a read (potentially even in 
> a different machine) and an access to disk to do a write to the commitlog. 
> This is at least twice as slow as PATH 1.
> 
> This is all the info that I could get from the internet but a lot is missing. 
> For example, there is no information about how the counter lock is acquired, 
> is there a shared lock across 

Re: How do counter updates work?

2018-03-05 Thread Javier Pareja
Hi Kyrulo,

I don't understand how UUIDs are related to counters, but I use counters to
increment the value of a cell in an atomic manner. I could try reading the
value and then writing to the cell but then I would lose the atomicity of
the update.

F Javier Pareja

On Mon, Mar 5, 2018 at 1:32 PM, Kyrylo Lebediev 
wrote:

> Hello!
>
> Can't answer your question but there is another one: "why do we need to
> maintain counters with their known limitations (and I've heard of some
> issues with implementation of counters in Cassandra), when there exist
> really effective uuid generation algorithms which allow us to generate
> unique values?"
> https://begriffs.com/posts/2018-01-01-sql-keys-in-depth.html
>
> (the
> article is about keys in RDBMS's, but its statements are true for NoSQL as
> well)
>
>
> Regards,
> Kyrill
> --
> *From:* jpar...@gmail.com  on behalf of Javier Pareja <
> pareja.jav...@gmail.com>
> *Sent:* Monday, March 5, 2018 1:55:14 PM
> *To:* user@cassandra.apache.org
> *Subject:* How do counter updates work?
>
> Hello everyone,
>
> I am trying to understand how cassandra counter writes work in more detail
> but all that I could find is this:
> https://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
> From there I was able to extract the following process:
> [inline diagram: counter write paths]
>
>
> PATH 1 will be much quicker than PATH 2 and its bottleneck (assuming HDD
> drives) will be the commitlog
> PATH 2 will need at least an access to disk to do a read (potentially even
> in a different machine) and an access to disk to do a write to the
> commitlog. This is at least twice as slow as PATH 1.
>
> This is all the info that I could get from the internet but a lot is
> missing. For example, there is no information about how the counter lock is
> acquired, is there a shared lock across all the nodes?
>
> Hope I am not oversimplifying things, but I think this will be useful to
> better understand how to tune up the system.
>
> Thanks in advance.
>
> F Javier Pareja
>


Re: How do counter updates work?

2018-03-05 Thread Kyrylo Lebediev
Hello!

Can't answer your question but there is another one: "why do we need to 
maintain counters with their known limitations (and I've heard of some issues 
with implementation of counters in Cassandra), when there exist really 
effective uuid generation algorithms which allow us to generate unique values?"
https://begriffs.com/posts/2018-01-01-sql-keys-in-depth.html (the article is 
about keys in RDBMS's, but its statements are true for NoSQL as well)


Regards,
Kyrill


From: jpar...@gmail.com  on behalf of Javier Pareja 

Sent: Monday, March 5, 2018 1:55:14 PM
To: user@cassandra.apache.org
Subject: How do counter updates work?

Hello everyone,

I am trying to understand how cassandra counter writes work in more detail but 
all that I could find is this: 
https://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
>From there I was able to extract the following process:
[inline diagram: counter write paths]

PATH 1 will be much quicker than PATH 2 and its bottleneck (assuming HDD 
drives) will be the commitlog
PATH 2 will need at least an access to disk to do a read (potentially even in a 
different machine) and an access to disk to do a write to the commitlog. This 
is at least twice as slow as PATH 1.

This is all the info that I could get from the internet but a lot is missing. 
For example, there is no information about how the counter lock is acquired, is 
there a shared lock across all the nodes?

Hope I am not oversimplifying things, but I think this will be useful to better 
understand how to tune up the system.

Thanks in advance.

F Javier Pareja


Re: [External] Re: Which version is the best version to run now?

2018-03-05 Thread Tom van der Woerdt
We run on the order of a thousand Cassandra nodes in production. Most of
that is 3.0.16, but new clusters are defaulting to 3.11.2 and some older
clusters have been upgraded to it as well.

All of the bugs I encountered in 3.11.x were also seen in 3.0.x, but 3.11.x
seems to get more love from the community wrt patches. This is why I'd
recommend 3.11.x for new projects.

Stay away from any of the 2.x series, they're going EOL soonish and the
newer versions are very stable.

Tom van der Woerdt
Site Reliability Engineer

Booking.com B.V.
Vijzelstraat 66-80 Amsterdam 1017HL Netherlands
The world's #1 accommodation site
43 languages, 198+ offices worldwide, 120,000+ global destinations,
1,550,000+ room nights booked every day
No booking fees, best price always guaranteed
Subsidiary of Booking Holdings Inc. (NASDAQ: BKNG)

On Sat, Mar 3, 2018 at 12:25 AM, Jeff Jirsa  wrote:

> I’d personally be willing to run 3.0.16
>
> 3.11.2 or 3 whatever should also be similar, but I haven’t personally
> tested it at any meaningful scale
>
>
> --
> Jeff Jirsa
>
>
> On Mar 2, 2018, at 2:37 PM, Kenneth Brotman 
> wrote:
>
> Seems like a lot of people are running old versions of Cassandra.  What is
> the best version, most reliable stable version to use now?
>
>
>
> Kenneth Brotman
>
>


How do counter updates work?

2018-03-05 Thread Javier Pareja
Hello everyone,

I am trying to understand how cassandra counter writes work in more detail
but all that I could find is this:
https://www.datastax.com/dev/blog/whats-new-in-cassandra-2-1-a-better-implementation-of-counters
From there I was able to extract the following process:
[inline diagram: counter write paths]


PATH 1 will be much quicker than PATH 2 and its bottleneck (assuming HDD
drives) will be the commitlog
PATH 2 will need at least an access to disk to do a read (potentially even
in a different machine) and an access to disk to do a write to the
commitlog. This is at least twice as slow as PATH 1.

This is all the info that I could get from the internet but a lot is
missing. For example, there is no information about how the counter lock is
acquired, is there a shared lock across all the nodes?

Hope I am not oversimplifying things, but I think this will be useful to
better understand how to tune up the system.

Thanks in advance.

F Javier Pareja
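The read-before-write path asked about in this thread can be sketched with a simplified per-replica shard model. This is illustrative only — the real implementation differs in many details — but it shows why there is no cluster-wide lock, only a local one, and why a counter write still costs a read plus a write:

```python
# Simplified model of post-2.1 counter internals (illustrative, not the
# real implementation). Each replica owns one shard of the counter and
# must read-then-rewrite it under a *local* lock -- there is no shared
# cluster-wide lock -- which is why a counter write costs a read plus a
# write (PATH 2 above). The counter's value is the sum of all shards.
import threading

class CounterReplica:
    def __init__(self, node_id):
        self.node_id = node_id
        self.clock = 0        # logical clock for this node's shard
        self.value = 0        # this node's shard of the total
        self.lock = threading.Lock()  # the per-replica counter lock

    def apply_increment(self, delta):
        with self.lock:       # local read-modify-write, not cross-node
            self.clock += 1
            self.value += delta

def counter_value(replicas):
    # A counter read merges (sums) the shards from the replicas.
    return sum(r.value for r in replicas)

replicas = [CounterReplica(n) for n in ("n1", "n2", "n3")]
replicas[0].apply_increment(5)    # a write led by replica n1
replicas[1].apply_increment(2)    # a later write led by n2
replicas[2].apply_increment(-1)

assert counter_value(replicas) == 6
```

In this model the lock contention is per replica, so tuning concerns are local disk read latency and lock hold time on each node, not any cross-node coordination.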


Re: vnodes: high availability

2018-03-05 Thread Kyrylo Lebediev
What's the reason behind this negative effect of enabling dynamic_snitch?

Is this true for all C* versions in which this feature is implemented?

Is it because node latencies change too dynamically/sporadically, while the 
dynamic snitch's scores adapt more slowly than required and can't keep up with 
these changes?

If dynamic_snitch is disabled, what algorithm is used to determine which 
replica should serve a read (full data requests, not digest requests)?


Regards,

Kyrill


From: Jon Haddad  on behalf of Jon Haddad 

Sent: Thursday, January 18, 2018 12:49:02 AM
To: user
Subject: Re: vnodes: high availability

I *strongly* recommend disabling dynamic snitch.  I’ve seen it make latency 
jump 10x.

dynamic_snitch: false is your friend.



On Jan 17, 2018, at 2:00 PM, Kyrylo Lebediev wrote:

Avi,
If we prefer to have better balancing [like absence of hotspots during a node 
down event etc], large number of vnodes is a good solution.
Personally, I wouldn't prefer any balancing over overall resiliency  (and in 
case of non-optimal setup, larger number of nodes in a cluster decreases 
overall resiliency, as far as I understand.)

Talking about hotspots, there is a number of features helping to mitigate the 
issue, for example:
  - dynamic snitch [if a node is overloaded it won't be queried]
  - throttling of streaming operations

Thanks,
Kyrill


From: Avi Kivity
Sent: Wednesday, January 17, 2018 2:50 PM
To: user@cassandra.apache.org; kurt greaves
Subject: Re: vnodes: high availability

On the flip side, a large number of vnodes is also beneficial. For example, if 
you add a node to a 20-node cluster with many vnodes, each existing node will 
contribute 5% of the data towards the new node, and all nodes will participate 
in streaming (meaning the impact on any single node will be limited, and 
completion time will be faster).

With a low number of vnodes, only a few nodes participate in streaming, which 
means that the cluster is left unbalanced and the impact on each streaming node 
is greater (or that completion time is slower).

Similarly, with a high number of vnodes, if a node is down its work is 
distributed equally among all nodes. With a low number of vnodes the cluster 
becomes unbalanced.

Overall I recommend high vnode count, and to limit the impact of failures in 
other ways (smaller number of large nodes vs. larger number of small nodes).

btw, rack-aware topology improves the multi-failure problem but at the cost of 
causing imbalance during maintenance operations. I recommend using rack-aware 
topology only if you really have racks with single-points-of-failure, not for 
other reasons.

On 01/17/2018 05:43 AM, kurt greaves wrote:
Even with a low amount of vnodes you're asking for a bad time. Even if you 
managed to get down to 2 vnodes per node, you're still likely to include double 
the amount of nodes in any streaming/repair operation, which will likely be 
very problematic for incremental repairs, and you still won't be able to easily 
reason about which nodes are responsible for which token ranges. It's still 
quite likely that a loss of 2 nodes would mean some portion of the ring is down 
(at QUORUM).

At the moment I'd say steer clear of vnodes and use single tokens if you can; a 
lot of work still needs to be done to ensure smooth operation of C* while using 
vnodes, and they are much more difficult to reason about (which is probably the 
reason no one has bothered to do the math). If you're really keen on the math, 
your best bet is to do it yourself, because it's not a point of interest for 
many C* devs, plus probably a lot of us wouldn't remember enough math to know 
how to approach it.

If you want to get out of this situation you'll need to do a DC migration to a 
new DC with a better configuration of snitch/replication strategy/racks/tokens.


On 16 January 2018 at 21:54, Kyrylo Lebediev wrote:
Thank you for this valuable info, Jon.
I guess both you and Alex are referring to improved vnodes allocation method  
https://issues.apache.org/jira/browse/CASSANDRA-7032 which was implemented in 
3.0.
Based on your info and comments in the ticket it's really a bad idea to have 
small number of vnodes for the versions using old allocation method because of 
hot-spots, so it's not an option for my particular case (v.2.1) :(

[As far as I can see from the source code this new method wasn't backported to 
2.1.]


Regards,
Kyrill
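For anyone keen on the math mentioned in this thread, a quick Monte Carlo sketch illustrates the core effect. The model is deliberately simplified (tokens placed uniformly at random, SimpleStrategy-style replication where a range's replicas are the next RF distinct nodes clockwise, RF=3, QUORUM = 2 replicas) and is not Cassandra's actual allocator:

```python
# Monte Carlo sketch of the vnode availability math discussed above.
# Simplified model (NOT Cassandra's real allocator): tokens are placed
# uniformly at random, replicas of a range are the next RF distinct
# nodes clockwise, RF=3, and QUORUM needs 2 of the 3 replicas.
import random

def replica_sets(ring, rf):
    """Distinct RF-node replica sets over a sorted (token, node) ring."""
    n, sets = len(ring), set()
    for i in range(n):
        reps, j = [], i
        while len(reps) < rf:
            node = ring[j % n][1]
            if node not in reps:
                reps.append(node)
            j += 1
        sets.add(frozenset(reps))
    return sets

def p_two_down_breaks_quorum(n_nodes, vnodes, rf=3, trials=50):
    """Fraction of random rings where losing nodes 0 and 1 leaves some
    range with fewer than 2 live replicas (i.e. QUORUM unavailable)."""
    hits = 0
    for t in range(trials):
        rng = random.Random(t)
        ring = sorted((rng.random(), nd)
                      for nd in range(n_nodes) for _ in range(vnodes))
        down = {0, 1}
        if any(len(s & down) >= 2 for s in replica_sets(ring, rf)):
            hits += 1
    return hits / trials

many = p_two_down_breaks_quorum(20, 256, trials=20)
one = p_two_down_breaks_quorum(20, 1, trials=200)

assert many > 0.95   # with 256 vnodes, nearly every node pair co-replicates
assert one < many    # with single tokens, most 2-node failures keep QUORUM
```

The sketch supports the point made in the thread: with many vnodes, almost every pair of nodes shares some replica set, so any two simultaneous failures will take some range below QUORUM, whereas with single tokens only adjacent pairs do.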