Re: group by select queries

2018-02-01 Thread DuyHai Doan
Worth digging into the source code of GROUP BY, but as far as I remember, using GROUP BY without any aggregation function will lead to C* picking just the first row at hand (or maybe the last, not sure on this point). About ordering, since the grouping is on a component of partition key, do not

Re: TWCS not deleting expired sstables

2018-02-01 Thread Thakrar, Jayesh
Thank you so much Kurt - that helped!! So here's the thing, I was logged into the server as my own id and had Cassandra binaries under my home directory with the default configuration. Cassandra was running on the server with a service account and with its own user-id and configuration. I

Documentation question

2018-02-01 Thread Mikael
Hi! I have a commercial application that uses Cassandra and the manual for it includes a link to the Cassandra documentation, but I will include the most basic stuff in the manual like how to set up Cassandra, configuration and so on, so the question is if I need to write this myself from

Re: Old tombstones not being cleaned up

2018-02-01 Thread James Shaw
I see leveled compaction is used; if it's last, it will have to stay until the next level compaction happens, then it will be gone, right? On Thu, Feb 1, 2018 at 2:33 AM, Bo Finnerup Madsen wrote: > Hi, > > We are running a small 9 node Cassandra v2.1.17 cluster. The cluster >

Re: Upgrading sstables not using all available compaction slots on version 2.2

2018-02-01 Thread Oleksandr Shulgin
On Thu, Feb 1, 2018 at 9:23 AM, Oleksandr Shulgin < oleksandr.shul...@zalando.de> wrote: > On 1 Feb 2018 06:51, "kurt greaves" wrote: > > Would you be able to create a JIRA ticket for this? Not sure if this is > still a problem in 3.0+ but worth creating a ticket to

RE: group by select queries

2018-02-01 Thread Modha, Digant
Jira created: CASSANDRA-14209. From: kurt greaves [mailto:k...@instaclustr.com] Sent: Thursday, February 01, 2018 12:38 AM To: User Subject: Re: group by select queries Seems problematic. Would you be able to create a JIRA ticket with the above information/examples? On 30 January 2018 at

Re: Documentation question

2018-02-01 Thread Alain RODRIGUEZ
Hello Mikael, I believe that this licence applies to the documentation: https://github.com/apache/cassandra/blob/trunk/LICENSE.txt#L90-L129. The documentation is in the same repository as the code now. I believe you can, but with some rules to respect. C*heers, --- Alain

Re: multiple tables vs. partitions and TTL

2018-02-01 Thread Alain RODRIGUEZ
Hello Marcus, We want to store bigger amounts of data (> 30mio rows containing blobs) > This should be perfectly fine for a partition. We usually recommend up to 100 MB per partition, which is a soft limit, as some use cases work very well with bigger partitions. will be deleted depending on the
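
To put the 100 MB soft limit into numbers, here is a rough back-of-the-envelope sketch in Python. Only the 30 million rows and the 100 MB figure come from the thread; the 10 KiB average row size and the function name are assumptions made purely for illustration, and real on-disk sizes also depend on compression and storage overhead.

```python
# Soft limit per partition mentioned in the message above (100 MB).
SOFT_LIMIT_BYTES = 100 * 1024 * 1024


def min_partitions(total_rows, avg_row_bytes, limit_bytes=SOFT_LIMIT_BYTES):
    """Smallest number of partitions that keeps each one under the limit,
    assuming rows are spread evenly (an idealisation)."""
    total_bytes = total_rows * avg_row_bytes
    # Integer ceiling division without floating point.
    return -(-total_bytes // limit_bytes)


# Hypothetical average row size of 10 KiB; not a figure from the thread.
partitions = min_partitions(total_rows=30_000_000, avg_row_bytes=10 * 1024)
print(partitions)
```

With these assumed numbers the data would need to be spread over a few thousand partitions to stay near the soft limit, so the choice of partition key matters more than the raw row count.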

Re: multiple tables vs. partitions and TTL

2018-02-01 Thread James Shaw
If it were me, I would go with 1 table; I just think it is too much labor to manage many tables, and I also wonder how reliable switching tables would be. Regarding tombstones, you may try some ways to fight them: a reasonable partition size (a big partition with many tombstones will be a problem); don't query tombstones where possible, in

RE: Old tombstones not being cleaned up

2018-02-01 Thread ZAIDI, ASAD A
Make data consistent (run repair), reduce gc_grace_seconds (try setting it to 0 temporarily, though be careful, as this can affect hinted handoff!) and set the table’s compaction sub-property, i.e. unchecked_tombstone_compaction, to true. Compaction will take care of tombstones! From: Jonathan Haddad
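
A minimal sketch of what the suggested table change could look like, written as a small Python helper that only builds the CQL text. The keyspace and table names are hypothetical placeholders, and the compaction class has to be the one your table already uses, since the `compaction` map is set as a whole and must include `class`. Nothing is executed here; review the statement before running it in cqlsh, and remember to restore gc_grace_seconds afterwards.

```python
def tombstone_cleanup_cql(keyspace, table, compaction_class, gc_grace_seconds):
    """Build an ALTER TABLE statement that sets gc_grace_seconds and
    enables unchecked_tombstone_compaction for the given strategy."""
    compaction = (
        "{'class': '%s', 'unchecked_tombstone_compaction': 'true'}"
        % compaction_class
    )
    return (
        f"ALTER TABLE {keyspace}.{table} "
        f"WITH gc_grace_seconds = {gc_grace_seconds} "
        f"AND compaction = {compaction};"
    )


# my_keyspace / my_table are placeholders, not names from the thread.
statement = tombstone_cleanup_cql(
    "my_keyspace", "my_table", "LeveledCompactionStrategy", 0
)
print(statement)
```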

Re: Nodes show different number of tokens than initially

2018-02-01 Thread Oleksandr Shulgin
On Thu, Feb 1, 2018 at 5:19 AM, Jeff Jirsa wrote: > >> The reason I find it surprising, is that it makes very little *sense* to >> put a token belonging to a node from one DC between tokens of nodes from >> another one. >> > > I don't want to really turn this into an argument

Re: Old tombstones not being cleaned up

2018-02-01 Thread Jonathan Haddad
Changing the default TTL doesn’t change the TTL on the existing data, only new data. It’s only set if you don’t supply one yourself. On Wed, Jan 31, 2018 at 11:35 PM Bo Finnerup Madsen wrote: > Hi, > > We are running a small 9 node Cassandra v2.1.17 cluster. The cluster >

RE: Old tombstones not being cleaned up

2018-02-01 Thread Steinmaurer, Thomas
Did you start with a 9 node cluster from the beginning or did you extend / scale out your cluster (with vnodes) beyond the replication factor? If the latter applies and if you are deleting by explicit deletes and not via TTL, then nodes might not see the deletes anymore, as a node might not own

Re: Not what I‘ve expected Performance

2018-02-01 Thread Jürgen Albersdorfer
I changed it a little to use spark.sql, extracted a partitioning key table as you did with the userid, and joined it to my table. Copying and saving this to Cassandra seemed, in a first test, to utilize every bit of performance the cluster can provide. I don't yet know why the first code did

Re: unable to start cassandra 3.11.1

2018-02-01 Thread Kant Kodali
Hi Justin, I am using java version "1.8.0_162" Java(TM) SE Runtime Environment (build 1.8.0_162-b12) Thanks! On Thu, Feb 1, 2018 at 2:40 PM, Justin Cameron wrote: > Unfortunately C* 3.11.1 is incompatible with the latest version of Java. > You'll need to either

RE: Old tombstones not being cleaned up

2018-02-01 Thread ZAIDI, ASAD A
No, it doesn’t. The unchecked_tombstone_compaction sub-property is common to STCS, DTCS & LCS. You can also use the jmxterm tool and invoke a different compaction on a single node if you desire. From: Bo Finnerup Madsen [mailto:bo.gunder...@gmail.com] Sent: Thursday, February 01, 2018 3:17 PM

Re: Old tombstones not being cleaned up

2018-02-01 Thread Bo Finnerup Madsen
We do not use TTL anywhere... records are inserted and deleted "manually" by our software. On Thu, 1 Feb 2018 at 18:29, Jonathan Haddad wrote: > Changing the default TTL doesn’t change the TTL on the existing data, only > new data. It’s only set if you don’t supply one

Converting from Apache Cassandra 2.2.6 to 3.x

2018-02-01 Thread William Boutin
We are converting our product from Apache Cassandra 2.2.6 to 3.x. What issues may we run into when we convert? [Ericsson] WILLIAM L. BOUTIN Engineer IV - Sftwr BMDA PADB DSE DU CC NGEE Ericsson 1 Ericsson Drive, US PI06 1.S747 Piscataway, NJ, 08854, USA Phone (913)

RE: Converting from Apache Cassandra 2.2.6 to 3.x

2018-02-01 Thread ZAIDI, ASAD A
You may want to upgrade your Python and Java/JDK versions along with the Cassandra upgrade. Please refer to CHANGES.txt for all updates & improvements made in your selected 3.x version. From: William Boutin [mailto:william.bou...@ericsson.com] Sent: Thursday, February 01, 2018 4:49 PM To:

Re: Not what I‘ve expected Performance

2018-02-01 Thread kurt greaves
That extra code is not necessary, it's just to only retrieve a sampling of let's. You don't want it if you're copying the whole table. It sounds like you're taking the right approach, probably just need some more tuning. Might be on the Cassandra side as well (concurrent_reads/writes). On 1 Feb.

unable to start cassandra 3.11.1

2018-02-01 Thread Kant Kodali
Hi All, I am unable to start cassandra 3.11.1. Below is the stack trace. Exception (java.lang.AbstractMethodError) encountered during startup:

Re: Old tombstones not being cleaned up

2018-02-01 Thread Bo Finnerup Madsen
I have forced several compactions without the tombstones being cleaned. Compactions were forced both using "nodetool compact" and by changing the compaction algorithm from leveled to size-tiered and back... On Thu, 1 Feb 2018 at 15:54, James Shaw wrote: > i see leveled compaction

Re: Old tombstones not being cleaned up

2018-02-01 Thread Bo Finnerup Madsen
I almost tried that today :) I ran a repair and changed the compaction algorithm from leveled to size-tiered and back. This definitely forced a compaction, but the tombstones are still there. Will setting unchecked_tombstone_compaction force another type of compaction? On Thu, 1 Feb 2018 at

Re: unable to start cassandra 3.11.1

2018-02-01 Thread Justin Cameron
Unfortunately C* 3.11.1 is incompatible with the latest version of Java. You'll need to either downgrade to Java 1.8.0.151-5 or wait for C* 3.11.2 (see https://issues.apache.org/jira/browse/CASSANDRA-14173 for details) On Fri, 2 Feb 2018 at 09:35 Kant Kodali wrote: > Hi All,
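
A small sketch for checking a Java 8 version string against the advice above. The 151 threshold is simply the update number named in the message, treated here as the last known-good one; that cut-off is an assumption, so check CASSANDRA-14173 for the exact affected releases. The function names are made up for this example.

```python
import re

# Assumption: update 151 is the last Java 8 update that works with
# C* 3.11.1, based only on the downgrade advice in the message above.
LAST_KNOWN_GOOD_UPDATE = 151


def java8_update(version):
    """Return the update number from a string such as '1.8.0_162' or
    '1.8.0_162-b12', or None if it is not a Java 8 version string."""
    match = re.fullmatch(r"1\.8\.0_(\d+)(?:-\S+)?", version.strip())
    return int(match.group(1)) if match else None


def likely_breaks_cassandra_3_11_1(version):
    """True if the Java 8 update is newer than the assumed known-good one."""
    update = java8_update(version)
    return update is not None and update > LAST_KNOWN_GOOD_UPDATE


print(likely_breaks_cassandra_3_11_1("1.8.0_162"))
print(likely_breaks_cassandra_3_11_1("1.8.0_151"))
```

The version string itself can be taken from the output of `java -version`.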

Re: unable to start cassandra 3.11.1

2018-02-01 Thread Kant Kodali
OK, I saw the ticket; it looks like this java version "1.8.0_162" won't work! On Thu, Feb 1, 2018 at 2:43 PM, Kant Kodali wrote: > Hi Justin, > > I am using > > java version "1.8.0_162" > > Java(TM) SE Runtime Environment (build 1.8.0_162-b12) > > > Thanks! > > On Thu, Feb 1, 2018

Re: CDC usability and future development

2018-02-01 Thread Jay Zhuang
We did a POC to improve the CDC feature as an interface ( https://github.com/ngcc/ngcc2017/blob/master/CassandraDataIngestion.pdf), so the user doesn't have to read the commit log directly. We deployed the change to a test cluster and are doing more tests with production traffic; will send out the design

Re: Setting min_index_interval to 1?

2018-02-01 Thread Nate McCall
> > > Another was the crazy idea I started with of setting min_index_interval to > 1. My guess was that this would cause it to read all index entries, and > effectively have them all cached permanently. And it would read them > straight out of the SSTables on every restart. Would this work? Other

Re: Converting from Apache Cassandra 2.2.6 to 3.x

2018-02-01 Thread Christopher Lord
If upgrading to 3.11.x be aware of the dropped support for the legacy auth tables. The relevant notes from NEWS.txt: The authentication & authorization subsystems have been redesigned to support role based access control (RBAC), resulting in a change to the schema of the system_auth

Setting min_index_interval to 1?

2018-02-01 Thread Dan Kinder
Hi, I have an unusual case here: I'm wondering what will happen if I set min_index_interval to 1. Here's the logic. Suppose I have a table where I really want to squeeze as many reads/sec out of it as possible, and where the row data size is much larger than the keys. E.g. the keys are a few

Re: Nodes show different number of tokens than initially

2018-02-01 Thread kurt greaves
So one time I tried to understand why only a single node could have a token, and it appeared that it came over the fence from facebook and has been kept ever since. Personally I don't think it's necessary, and agree that it is kind of problematic (but there's probably lots of stuff that relies on

Re: Old tombstones not being cleaned up

2018-02-01 Thread Bo Finnerup Madsen
We did start with a 3 node cluster and a RF of 3, then added another 3 nodes and again another 3 nodes. So it is a good guess :) But I have run both repair and cleanup against the table on all nodes; would that not have removed any stray partitions? On Thu, 1 Feb 2018 at 22:31, Steinmaurer,

RE: Old tombstones not being cleaned up

2018-02-01 Thread Steinmaurer, Thomas
Right. In this case, cleanup should have done the necessary work here. Thomas From: Bo Finnerup Madsen [mailto:bo.gunder...@gmail.com] Sent: Friday, 02 February 2018 06:59 To: user@cassandra.apache.org Subject: Re: Old tombstones not being cleaned up We did start with a 3 node cluster and a RF

Re: Nodes show different number of tokens than initially

2018-02-01 Thread Oleksandr Shulgin
On Fri, Feb 2, 2018 at 2:37 AM, kurt greaves wrote: > So one time I tried to understand why only a single node could have a > token, and it appeared that it came over the fence from facebook and has > been kept ever since. Personally I don't think it's necessary, and agree

multiple tables vs. partitions and TTL

2018-02-01 Thread Marcus Haarmann
Hi experts, I have a design issue here: We want to store bigger amounts of data (> 30mio rows containing blobs) which will be deleted depending on the type of data on a monthly basis (not in the same order as the data entered the system). Some data would survive for two months only, other

Re: Not what I‘ve expected Performance

2018-02-01 Thread Jürgen Albersdorfer
Hi Kurt, thanks for your response. I did indeed use Spark, which I forgot to mention, and I did it nearly the same way as in the example you gave me, just without that .select(PK).sample(false, 0.1) instruction, whose purpose I don't actually get - and maybe that's the key to the

Re: Upgrading sstables not using all available compaction slots on version 2.2

2018-02-01 Thread Oleksandr Shulgin
On 1 Feb 2018 06:51, "kurt greaves" wrote: Would you be able to create a JIRA ticket for this? Not sure if this is still a problem in 3.0+ but worth creating a ticket to investigate. It'd be really helpful if you could try and reproduce on 3.0.15 or 3.11.1 to see if it's an