Large temporary files generated during cleaning up

2017-06-19 Thread wxn...@zjqunshuo.com
Hi, Cleaning up is generating temporary files which are occupying a lot of disk space. I noticed that for every source sstable file, it generates 4 temporary files, and two of them are almost as large as the source sstable file. If there are two concurrent cleaning tasks running, I have to leave the

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-19 Thread Subroto Barua
Here is the response from Datastax support/dev: In a list each item is its own cell. Append adds a new cell sorted at basically "current server time uuid"; prepend adds at "-current server time uuid". User-supplied timestamps are used for the cell timestamp when specified. Inserting the

Re: Cassandra is always CP or AP in terms of CAP theorem

2017-06-19 Thread Justin Cameron
This is achieved through a combination of replication factor (RF) and consistency level (CL): Replication factor is tied to your schema (more specifically, it is configured at the keyspace level) and specifies how many copies of each piece of data are kept. Consistency level is associated (either
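As a rough illustration of how RF and CL interact (a sketch added for clarity, not taken from the thread): a QUORUM operation must reach floor(RF/2) + 1 replicas, and a read is guaranteed to overlap the latest acknowledged write when the number of replicas written plus the number of replicas read exceeds RF. The function names below are hypothetical helpers, not part of any Cassandra API.

```python
def quorum(rf: int) -> int:
    """Number of replicas a QUORUM operation must reach for a given RF."""
    return rf // 2 + 1


def reads_see_latest_write(rf: int, write_replicas: int, read_replicas: int) -> bool:
    """True when every read set must overlap every write set (W + R > RF)."""
    return write_replicas + read_replicas > rf


def tolerated_replica_failures(rf: int, replicas_required: int) -> int:
    """How many replicas can be down while the operation can still succeed."""
    return rf - replicas_required
```

With RF = 3, writing and reading at QUORUM (2 replicas each) leans toward consistency while still tolerating one replica being down; writing and reading at ONE leans toward availability but allows stale reads.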

Cassandra is always CP or AP in terms of CAP theorem

2017-06-19 Thread Kaushal Shriyan
Hi, I am reading about the CAP theorem, and Cassandra satisfies either CP or AP. I am not sure how we take care of the Availability property or the Consistency property. Are there any examples to understand it better? Please help me understand if I am completely wrong. Thanks in Advance. Regards, Kaushal

Re: CQL: fails to COPY FROM with null values

2017-06-19 Thread Stefania Alborghetti
It doesn't work because of the white space. By default the NULL value is an empty string; extra white space is not trimmed automatically. This should work: ce98d62a-3666-4d3a-ae2f-df315ad448aa,Jonsson,Malcom,,2001-01-19 17:55:17+ You can change the string representing missing values with
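If the CSV file cannot be fixed by hand, one option is to trim the fields before running COPY FROM, so that whitespace-only fields become truly empty. The following is a minimal sketch using only the Python standard library; the function name `strip_csv_fields` and the sample row are hypothetical and not from the thread.

```python
import csv
import io


def strip_csv_fields(text: str) -> str:
    """Return CSV text with leading/trailing whitespace removed from every field."""
    reader = csv.reader(io.StringIO(text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    for row in reader:
        # A field that held only spaces becomes an empty string,
        # which matches the default NULL representation described above.
        writer.writerow([field.strip() for field in row])
    return out.getvalue()


if __name__ == "__main__":
    # Hypothetical row: the fourth column contains a single space.
    print(strip_csv_fields("id-1,Jonsson,Malcom, ,2001-01-19\n"), end="")
```

The cleaned text can then be written to a new file and loaded in place of the original.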

Re: Don't print Ping caused error logs

2017-06-19 Thread Eric Plowe
The driver has load balancing policies built in. Behind a load balancer you'd lose the benefit of things like the TokenAwarePolicy. On Mon, Jun 19, 2017 at 3:49 PM Jonathan Haddad wrote: > The driver grabs all the cluster information from the nodes you provide > the driver and

RE: Adding nodes and cleanup

2017-06-19 Thread ZAIDI, ASAD A
I think the token ranges that are already cleaned/completed and potentially streamed down to the additional node won’t be cleaned again, so potentially you’ll need to run cleanup once again. Can you stop cleanup, add the additional node, and start cleanup over again, so as to get the nodes clean in a single shot?

Re: Don't print Ping caused error logs

2017-06-19 Thread Jonathan Haddad
The driver grabs all the cluster information from the nodes you provide the driver and connects automatically to the rest. You don't need (and shouldn't use) a load balancer. Jon On Mon, Jun 19, 2017 at 12:28 PM Daniel Hölbling-Inzko < daniel.hoelbling-in...@bitmovin.com> wrote: > Just out of

Re: Don't print Ping caused error logs

2017-06-19 Thread Daniel Hölbling-Inzko
Just out of curiosity, how do you then make sure all nodes get the same amount of traffic from clients without having to maintain a manual contact points list of all cassandra nodes in the client applications? Especially with big C* deployments this sounds like a lot of work whenever

Adding nodes and cleanup

2017-06-19 Thread Mark Furlong
I have added a few nodes and now am running some cleanups. Can I add an additional node while these cleanups are running? What are the ramifications of doing this? Mark Furlong Sr. Database Administrator mfurl...@ancestry.com M: 801-859-7427 O: 801-705-7115 1300

Re: SASI index on datetime column does not filter on minutes

2017-06-19 Thread Tobias Eriksson
Thanx guys, it was the timezone thingi … Adding + did the trick select lastname,firstname,dateofbirth from playground.individual where dateofbirth < '2001-01-01T10:00:00' and dateofbirth > '2000-11-18 17:55:17+'; -Tobias From: DuyHai Doan Date: Monday, 19 June
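The effect described here can be sketched outside Cassandra: a timestamp literal without an explicit offset is interpreted in whatever default zone the system applies, so it can denote a different instant than the same literal with an offset. The snippet below is an illustration only; the helper name `to_utc`, the `+00:00` offset and the two-hour default are assumptions chosen for the example, not values from the thread.

```python
from datetime import datetime, timedelta, timezone


def to_utc(text: str, assumed_offset_hours: int = 0) -> datetime:
    """Parse an ISO-like timestamp and normalise it to UTC.

    If the text carries no offset, `assumed_offset_hours` stands in for the
    default zone the system would apply (an assumption for this sketch).
    """
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone(timedelta(hours=assumed_offset_hours)))
    return parsed.astimezone(timezone.utc)


if __name__ == "__main__":
    explicit = to_utc("2000-11-18 17:55:17+00:00")
    implicit = to_utc("2000-11-18 17:55:17", assumed_offset_hours=2)
    # The two literals differ by the assumed default offset.
    print(explicit - implicit)
```

A range predicate built from the offset-less literal is therefore shifted by the default offset, which can look like a filter that ignores hours or minutes.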

RE: Partition range incremental repairs

2017-06-19 Thread ZAIDI, ASAD A
A few options that you can consider to improve repair time are: § Un-throttle streamthroughput & interdcstreamthroughput, at least for the duration of repair. § Increase the number of job threads, i.e. use the -j option § Use subrange repair options § implement jumbo frames on your internode-

RE: Secondary Index

2017-06-19 Thread ZAIDI, ASAD A
If you’re only creating an index so that your query works, think again! You’ll be storing a secondary index on each node, and queries involving the index could create issues (slowness!!) down the road when the index on multiple nodes is involved and not maintained! Tables involving a lot of inserts/delete

Secondary Index

2017-06-19 Thread techpyaasa .
Hi, I want to create an index on an already existing table which has more than 3 GB/node. We are using c*-2.1.17 with 2 DCs, each DC with 3 groups and each group with 7 nodes (42 nodes in total in the cluster). So is it OK to create an index on this table now, or will it cause any problems? If it's OK, how much

Re: SASI index on datetime column does not filter on minutes

2017-06-19 Thread DuyHai Doan
The + in the date format is necessary to specify timezone On Mon, Jun 19, 2017 at 5:38 PM, Hannu Kröger wrote: > Hello, > > I tried the same thing with 3.10 which I happened to have at hand and that > seems to work. > > cqlsh:test> select lastname,firstname,dateofbirth

Re: SASI index on datetime column does not filter on minutes

2017-06-19 Thread Hannu Kröger
Hello, I tried the same thing with 3.10 which I happened to have at hand and that seems to work. cqlsh:test> select lastname,firstname,dateofbirth from individuals where dateofbirth < '2001-01-01T10:00:00' and dateofbirth > '2000-11-18 17:59:18'; lastname | firstname | dateofbirth

SASI index on datetime column does not filter on minutes

2017-06-19 Thread Tobias Eriksson
Hi I have a table like this (Cassandra 3.5) Table id uuid, lastname text, firstname text, address_id uuid, dateofbirth timestamp, PRIMARY KEY (id, lastname, firstname) And a SASI index like this create custom index indv_birth ON playground.individual(dateofbirth) USING

CQL: fails to COPY FROM with null values

2017-06-19 Thread Tobias Eriksson
Hi, I am trying to copy a file of CSV data into a table, but I get an error since sometimes one of the columns (which is a UUID) is empty. Is this a bug or what am I missing? Here is what it looks like: Table id uuid, lastname text, firstname text, address_id uuid, dateofbirth

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-19 Thread Thakrar, Jayesh
Subroto, Cassandra docs say otherwise. Writing list data is accomplished with a JSON-style syntax. To write a record using INSERT, specify the entire list as a JSON array. Note: An INSERT will always replace the entire list. Maybe you can elaborate/shed some more light? Thanks, Jayesh

Re: Partition range incremental repairs

2017-06-19 Thread Chris Stokesmore
Does anyone have any more thoughts on this at all? Struggling to understand it. > On 9 Jun 2017, at 11:32, Chris Stokesmore > wrote: > > Hi Anuj, > > Thanks for the reply. > > 1). We are using Cassandra 2.2.8, and our repair commands we are comparing > are >

Re: Don't print Ping caused error logs

2017-06-19 Thread Akhil Mehra
Just in case you are not aware, using a load balancer is an anti-pattern. Please refer to (http://docs.datastax.com/en/landing_page/doc/landing_page/planning/planningAntiPatterns.html#planningAntiPatterns__AntiPatLoadBal

Re: Cleaning up related issue

2017-06-19 Thread Akhil Mehra
The nodetool cleanup docs explain this increase in disk space usage. "Running the nodetool cleanup command causes a temporary increase in disk space usage proportional to the size of your largest SSTable. Disk I/O occurs when running this command."
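A back-of-the-envelope way to reason about the headroom this implies is sketched below. It is an estimate under stated assumptions, not documented Cassandra behavior: it assumes each SSTable being cleaned needs temporary space of roughly `factor` times its own size, and that the largest SSTables may be processed at the same time. The function name and the `factor` parameter are hypothetical.

```python
from typing import Iterable


def cleanup_headroom_bytes(
    sstable_sizes: Iterable[int],
    concurrent_tasks: int = 1,
    factor: float = 1.0,
) -> int:
    """Worst-case temporary space estimate for concurrent cleanup tasks.

    Assumption: each SSTable under cleanup needs about `factor` times its
    own size in temporary files, and the worst case is that the largest
    SSTables are cleaned simultaneously.
    """
    largest = sorted(sstable_sizes, reverse=True)[:concurrent_tasks]
    return int(sum(largest) * factor)


if __name__ == "__main__":
    sizes_gb = [120, 80, 40, 10]  # hypothetical SSTable sizes in GB
    print(cleanup_headroom_bytes(sizes_gb, concurrent_tasks=2))
```

Comparing the result with the free space on the data volume gives a quick sanity check before starting cleanup.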

Don't print Ping caused error logs

2017-06-19 Thread wxn...@zjqunshuo.com
Hi, Our cluster nodes are behind an SLB (Service Load Balancer) with a VIP, and the Cassandra client accesses the cluster via the VIP. System.log prints the below IOException every several seconds. I guess it's the SLB service which pings port 9042 of the Cassandra node periodically and caused the

Re: Re: Cleaning up related issue

2017-06-19 Thread wxn...@zjqunshuo.com
Akhil, I agree with you that the node still has unwanted data, but why does it have more data than before cleaning up? More background: Before cleaning up, the node had 790GB of data. After cleaning up, I assumed it should have less data. But in fact it has 1000GB of data, which is larger than I expected.

Re: Cleaning up related issue

2017-06-19 Thread Akhil Mehra
When you add a new node into the cluster, data is streamed from all the old nodes into the newly added node. The new node is now responsible for data previously stored on the old nodes. The cleanup process removes unwanted data after adding a new node to the cluster. In your case clean up failed

Re: Re: Cleaning up related issue

2017-06-19 Thread wxn...@zjqunshuo.com
Thanks for the quick response. It's the existing node where the cleanup failed. It has a larger volume than other nodes. From: Akhil Mehra Date: 2017-06-19 14:56 To: wxn002 CC: user Subject: Re: Cleaning up related issue Is the node with the large volume a new node or an existing node. If it is

Re: Cleaning up related issue

2017-06-19 Thread Akhil Mehra
Is the node with the large volume a new node or an existing node? If it is an existing node, is this the one where the nodetool cleanup failed? Cheers, Akhil > On 19/06/2017, at 6:40 PM, wxn...@zjqunshuo.com wrote: > > Hi, > After adding a new node, I started cleaning up task to remove the old

Cleaning up related issue

2017-06-19 Thread wxn...@zjqunshuo.com
Hi, After adding a new node, I started a cleanup task to remove the old data on the other 4 nodes. All went well except on one node. The cleanup took hours and the Cassandra daemon crashed on the third node. I checked the node and found the crash was because of OOM. The Cassandra data volume

Re: Question: Behavior of inserting a list multiple times with same timestamp

2017-06-19 Thread Subroto Barua
This is expected behavior. We learned about this issue/feature at the current site (we use Dse 5.08) Subroto > On Jun 18, 2017, at 10:29 PM, Zhongxiang Zheng wrote: > > Hi all, > > I have a question about a behavior when insert a list with specifying > timestamp. > > It