Re: Cassandra is consuming a lot of disk space

2016-01-13 Thread Carlos Rolo
You can check if the snapshot exists in the snapshot folder.
Repairs stream sstables over, which can temporarily increase disk space. But I
think Carlos Alonso might be correct: running compactions might be the
issue.
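
For reference, a minimal sketch of the snapshot check described above, assuming the
default data directory (adjust to your data_file_directories):

    du -sh /var/lib/cassandra/data/*/*/snapshots 2>/dev/null | sort -h
    nodetool listsnapshots
    nodetool clearsnapshot    # removes all snapshots; only once you're sure they're no longer needed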

Regards,

Carlos Juzarte Rolo
Cassandra Consultant

Pythian - Love your data

rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
www.pythian.com

On Wed, Jan 13, 2016 at 9:24 AM, Carlos Alonso  wrote:

> I'd have a look also at possible running compactions.
>
> If you have big column families with STCS then large compactions may be
> happening.
>
> Check it with nodetool compactionstats
>
> Carlos Alonso | Software Engineer | @calonso 
>
> On 13 January 2016 at 05:22, Kevin O'Connor  wrote:
>
>> Have you tried restarting? It's possible there's open file handles to
>> sstables that have been compacted away. You can verify by doing lsof and
>> grepping for DEL or deleted.
>>
>> If it's not that, you can run nodetool cleanup on each node to scan all
>> of the sstables on disk and remove anything that it's not responsible for.
>> Generally this would only work if you added nodes recently.
>>
>>
>> On Tuesday, January 12, 2016, Rahul Ramesh  wrote:
>>
>>> We have a 2 node Cassandra cluster with a replication factor of 2.
>>>
>>> The load factor on the nodes is around 350Gb
>>>
>>> Datacenter: Cassandra
>>> ==
>>> Address  RackStatus State   LoadOwns
>>>Token
>>>
>>> -5072018636360415943
>>> 172.31.7.91  rack1   Up Normal  328.5 GB100.00%
>>> -7068746880841807701
>>> 172.31.7.92  rack1   Up Normal  351.7 GB100.00%
>>> -5072018636360415943
>>>
>>> However, if I use df -h,
>>>
>>> /dev/xvdf   252G  223G   17G  94% /HDD1
>>> /dev/xvdg   493G  456G   12G  98% /HDD2
>>> /dev/xvdh   197G  167G   21G  90% /HDD3
>>>
>>>
>>> HDD1, 2 and 3 contain only Cassandra data. It amounts to close to 1 TB on one
>>> of the machines and close to 650 GB on the other.
>>>
>>> I started repair 2 days ago; after running repair, the amount of disk
>>> space consumption has actually increased.
>>> I also checked if this is because of snapshots. nodetool listsnapshot
>>> intermittently lists a snapshot, but it goes away after some time.
>>>
>>> Can somebody please help me understand:
>>> 1. Why is so much disk space consumed?
>>> 2. Why did it increase after repair?
>>> 3. Is there any way to recover from this state?
>>>
>>>
>>> Thanks,
>>> Rahul
>>>
>>>
>


Re: Cassandra is consuming a lot of disk space

2016-01-13 Thread Carlos Alonso
I'd have a look also at possible running compactions.

If you have big column families with STCS then large compactions may be
happening.

Check it with nodetool compactionstats
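
If compactions do turn out to be the issue, a rough sketch of how to watch and
throttle them (the throughput value below is only an example):

    nodetool compactionstats               # pending tasks and bytes remaining
    nodetool getcompactionthroughput       # current limit in MB/s
    nodetool setcompactionthroughput 16    # temporarily lower the limit, e.g. to 16 MB/s
    # concurrent_compactors in cassandra.yaml is the other knob, but changing it needs a restart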

Carlos Alonso | Software Engineer | @calonso 

On 13 January 2016 at 05:22, Kevin O'Connor  wrote:

> Have you tried restarting? It's possible there's open file handles to
> sstables that have been compacted away. You can verify by doing lsof and
> grepping for DEL or deleted.
>
> If it's not that, you can run nodetool cleanup on each node to scan all of
> the sstables on disk and remove anything that it's not responsible for.
> Generally this would only work if you added nodes recently.
>
>
> On Tuesday, January 12, 2016, Rahul Ramesh  wrote:
>
>> We have a 2 node Cassandra cluster with a replication factor of 2.
>>
>> The load factor on the nodes is around 350Gb
>>
>> Datacenter: Cassandra
>> ==
>> Address  RackStatus State   LoadOwns
>>Token
>>
>>   -5072018636360415943
>> 172.31.7.91  rack1   Up Normal  328.5 GB100.00%
>>   -7068746880841807701
>> 172.31.7.92  rack1   Up Normal  351.7 GB100.00%
>>   -5072018636360415943
>>
>> However, if I use df -h,
>>
>> /dev/xvdf   252G  223G   17G  94% /HDD1
>> /dev/xvdg   493G  456G   12G  98% /HDD2
>> /dev/xvdh   197G  167G   21G  90% /HDD3
>>
>>
>> HDD1, 2 and 3 contain only Cassandra data. It amounts to close to 1 TB on one
>> of the machines and close to 650 GB on the other.
>>
>> I started repair 2 days ago; after running repair, the amount of disk
>> space consumption has actually increased.
>> I also checked if this is because of snapshots. nodetool listsnapshot
>> intermittently lists a snapshot, but it goes away after some time.
>>
>> Can somebody please help me understand:
>> 1. Why is so much disk space consumed?
>> 2. Why did it increase after repair?
>> 3. Is there any way to recover from this state?
>>
>>
>> Thanks,
>> Rahul
>>
>>


Re: Node stuck when joining a Cassandra 2.2.0 cluster

2016-01-13 Thread Carlos Alonso
Hi Robert.

I'm thinking of upgrading hardware in place. Can you please elaborate a bit
more on how to use the auto_bootstrap=false + hibernate repair technique?

Cheers!

Carlos Alonso | Software Engineer | @calonso 

On 6 January 2016 at 11:10, Herbert Fischer 
wrote:

> Hi,
>
> Thanks for the tip.
>
> I found that one keyspace was kinda corrupted. It was previously
> scrubbed/deleted but there were files left on the servers, so it was in a
> strange state. After removing it from the filesystem I was able to add the
> new node to the cluster. Since this keyspace was in an unknown state, I
> could not find it through the cfId from the error messages.
>
> best
>
> On 5 January 2016 at 22:33, Robert Coli  wrote:
>
>> On Tue, Jan 5, 2016 at 3:01 AM, Herbert Fischer <
>> herbert.fisc...@crossengage.io> wrote:
>>
>>> We run a small Cassandra 2.2.0 cluster, with 5 nodes, on bare-metal
>>> servers and we are going to replace those nodes with other nodes. I planned
>>> to add all the new nodes first, one-by-one, and later remove the old ones,
>>> one-by-one.
>>>
>>
>> It sounds like your bootstraps are hanging. Your streams should restart
>> after an hour, but probably you want to figure out why they're hanging...
>>
>> You can also use the auto_bootstrap=false+hibernate repair method for
>> this process. That's probably what I'd do if I was upgrading the hardware
>> of nodes in place.
>>
>> =Rob
>>
>>
>
>
>
> --
> Herbert Fischer | Senior IT Architect
> CrossEngage GmbH (haftungsbeschränkt) | Julie-Wolfthorn-Straße 1 | 10115
> Berlin
>
> E-Mail: herbert.fisc...@crossengage.io
> Web: www.crossengage.io
>
> Amtsgericht Berlin-Charlottenburg | HRB 169537 B
> Geschäftsführer: Dr. Markus Wübben, Manuel Hinz | USt-IdNr.: DE301504202
>


Cassandra DSE Solr - search JSON content in column

2016-01-13 Thread Joseph Tech
Hi,

Is it possible in DSE Cassandra Solr to search for JSON content within a
column?
We store a complex JSON in a column of type "text", very simplified version
below.

{
"userId": "user100",
"addressList": [{
"addressId": "100",
"address": "100 ABC Street"
}],
"userName": "user11"
}

In this, can a search return all records that have address = "100 ABC Street"?

Thanks,
Joseph


Spark Cassandra Java Connector: records missing despite consistency=ALL

2016-01-13 Thread Dennis Birkholz

Hi together,

we use Cassandra to log event data and process it every 15 minutes with 
Spark. We are using the Cassandra Java Connector for Spark.


Randomly, our Spark runs produce too few output records because no data 
is returned from Cassandra for a several-minute window of input data. 
When querying the data (with cqlsh), after multiple tries, the data 
eventually becomes available.


To solve the problem, we tried to use consistency=ALL when reading the 
data in Spark. We use the 
CassandraJavaUtil.javaFunctions().cassandraTable() method and have set 
"spark.cassandra.input.consistency.level"="ALL" on the config when 
creating the Spark context. The problem persists but according to 
http://stackoverflow.com/a/25043599 using a consistency level of ONE on 
the write side (which we use) and ALL on the READ side should be 
sufficient for data consistency.
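
For completeness, the same property can also be passed at submit time instead of in 
code; a rough sketch, where the host, class and jar names are just placeholders:

    spark-submit \
      --conf spark.cassandra.connection.host=10.0.0.1 \
      --conf spark.cassandra.input.consistency.level=ALL \
      --class com.example.EventAggregation \
      event-aggregation.jar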


I would really appreciate if someone could give me a hint how to fix 
this problem, thanks!


Greets,
Dennis

P.s.:
some information about our setup:
Cassandra 2.1.12 in a two Node configuration with replication factor=2
Spark 1.5.1
Cassandra Java Driver 2.2.0-rc3
Spark Cassandra Java Connector 2.10-1.5.0-M2


Re: Spark Cassandra Java Connector: records missing despite consistency=ALL

2016-01-13 Thread Alex Popescu
Dennis,

You'll have better chances to get an answer on the
spark-cassandra-connector mailing list
https://groups.google.com/a/lists.datastax.com/forum/#!forum/spark-connector-user
or on IRC #spark-cassandra-connector

On Wed, Jan 13, 2016 at 4:17 AM, Dennis Birkholz 
wrote:

> Hi together,
>
> we use Cassandra to log event data and process it every 15 minutes with Spark.
> We are using the Cassandra Java Connector for Spark.
>
> Randomly, our Spark runs produce too few output records because no data is
> returned from Cassandra for a several-minute window of input data. When
> querying the data (with cqlsh), after multiple tries, the data eventually
> becomes available.
>
> To solve the problem, we tried to use consistency=ALL when reading the
> data in Spark. We use the
> CassandraJavaUtil.javaFunctions().cassandraTable() method and have set
> "spark.cassandra.input.consistency.level"="ALL" on the config when creating
> the Spark context. The problem persists but according to
> http://stackoverflow.com/a/25043599 using a consistency level of ONE on
> the write side (which we use) and ALL on the READ side should be sufficient
> for data consistency.
>
> I would really appreciate if someone could give me a hint how to fix this
> problem, thanks!
>
> Greets,
> Dennis
>
> P.s.:
> some information about our setup:
> Cassandra 2.1.12 in a two Node configuration with replication factor=2
> Spark 1.5.1
> Cassandra Java Driver 2.2.0-rc3
> Spark Cassandra Java Connector 2.10-1.5.0-M2
>



-- 
Bests,

Alex Popescu | @al3xandru
Sen. Product Manager @ DataStax


Re: New node has high network and disk usage.

2016-01-13 Thread Anuj Wadehra
Hi,
Revisiting the thread, I can see that nodetool status had both good and bad 
nodes at the same time. How do you replace nodes? When you say bad node, I 
understand that the node is no longer usable even though Cassandra is UP? Is that 
correct?
If a node is in bad shape and not working, adding a new node may trigger 
streaming huge amounts of data from the bad node too. Have you considered using the 
procedure for replacing a dead node?
Please share the latest nodetool status.
nodetool output shared earlier:
 `nodetool status` output:

    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address   Load       Tokens  Owns   Host ID                               Rack
    UN  A (Good)  252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
    UN  B (Good)  245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
    UN  C (Good)  254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
    UN  D (Bad)   163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
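
(For reference, a bare-bones sketch of the dead-node replacement procedure mentioned
above; exact steps depend on your version, so check the documentation first:)

    # on the fresh replacement node, before its first start, in cassandra-env.sh:
    JVM_OPTS="$JVM_OPTS -Dcassandra.replace_address=<ip_of_dead_node>"
    # start Cassandra; it takes over the dead node's tokens and streams their data.
    # remove the flag again before any subsequent restart.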



Thanks,
Anuj

Sent from Yahoo Mail on Android

On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin wrote:

Hi all,
We’ve spent a few days running things but are in the same position. To add some 
more flavour:

   - We have a 3-node ring, replication factor = 3. We’ve been running in this 
     configuration for a few years without any real issues
   - Nodes 2 & 3 are much newer than node 1. These two nodes were brought in to 
     replace two other nodes which had failed RAID0 configuration and thus were 
     lacking in disk space.
   - When node 2 was brought into the ring, it exhibited high CPU wait, IO and 
     load metrics
   - We subsequently brought 3 into the ring: as soon as 3 was fully 
     bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal levels. 
     Those same stats on 3, however, sky-rocketed
   - We’ve confirmed configuration across all three nodes are identical and in 
     line with the recommended production settings
   - We’ve run a full repair
   - Node 2 is currently running compactions, 1 & 3 aren’t and have no pending
   - There is no GC happening from what I can see. Node 1 has a GC log, but 
     that’s not been written to since May last year

What we’re seeing at the moment is similar and normal stats on nodes 1 & 2, but 
high CPU wait, IO and load stats on 3. As a snapshot:

   1. Load: 3.96, CPU wait: 30.8%, Disk Read Ops: 408/s
   2. Load: 5.88, CPU wait: 14.6%, Disk Read Ops: 275/s
   3. Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s

Can you recommend any next steps?
Griff

On 6 January 2016 at 17:31, Anuj Wadehra  wrote:

Hi Vickrum,
I would have proceeded with diagnosis as follows:
1. Analysis of the sar report to check system health (CPU, memory, swap, disk, etc.). 
The system seems to be overloaded. This is evident from the mutation drops.
2. Make sure that all recommended Cassandra production settings available at the 
DataStax site are applied; disable zone reclaim and THP.
3. Run a full repair on the bad node and check the data size. The node is owner of the 
maximum token range but has significantly lower data. I doubt that bootstrapping happened 
properly.
4. Compactionstats shows 22 pending compactions. Try throttling compactions by 
reducing concurrent compactors or compaction throughput.
5. Analyze logs to make sure bootstrapping happened without errors.
6. Look for other common performance problems such as GC pauses to make sure 
that dropped mutations are not caused by GC pauses.

Thanks,
Anuj

Sent from Yahoo Mail on Android

On Wed, 6 Jan, 2016 at 10:12 pm, Vickrum Loi wrote:

# nodetool compactionstats
pending tasks: 22
   compaction type   keyspace               table                      completed   total         unit   progress
        Compaction   production_analytics   interactions               240410213   161172668724  bytes  0.15%
        Compaction   production_decisions   decisions.decisions_q_idx  120815385   226295183     bytes  53.39%
Active compaction remaining time :   2h39m58s

Worth mentioning that compactions haven't been running on this node 
particularly often. The node's been performing badly regardless of whether it's 
compacting or not.

On 6 January 2016 at 16:35, Jeff Ferland  wrote:

What’s your output of `nodetool compactionstats`?

On Jan 6, 2016, at 7:26 AM, Vickrum Loi  wrote:
Hi,

We recently added a new node to our cluster in order to replace a node that 
died (hardware failure we believe). For the next two weeks it had high disk and 
network activity. We replaced the server, but it's happened again. We've looked 
into memory allowances, disk performance, number of connections, and all the 
nodetool stats, but can't find the cause of the issue.

`nodetool tpstats`[0] shows a lot of active and pending threads, in comparison 
to the rest of the cluster, but that's likely a symptom, not a 

Re: New node has high network and disk usage.

2016-01-13 Thread James Griffin
Hi Anuj,

Below is the output of nodetool status. The nodes were replaced following
the instructions in the Datastax documentation for replacing running nodes,
since the nodes were running fine; it was just that the servers had been
incorrectly initialised and thus had less disk space. The status below
shows node 2 has significantly higher load; however, as I say, 2 is operating
normally and is running compactions, so I guess that's not an issue?

Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address Load   Tokens  Owns   Host ID
Rack
UN  1   253.59 GB  256 31.7%
 6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
UN  2   302.23 GB  256 35.3%
 faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
UN  3   265.02 GB  256 33.1%
 74b15507-db5c-45df-81db-6e5bcb7438a3  rack1

Griff

On 13 January 2016 at 18:12, Anuj Wadehra  wrote:

> Hi,
>
> Revisiting the thread I can see that nodetool status had both good and bad
> nodes at same time. How do you replace nodes? When you say bad node..I
> understand that the node is no more usable even though Cassandra is UP? Is
> that correct?
>
> If a node is in bad shape and not working, adding new node may trigger
> streaming huge data from bad node too. Have you considered using the
> procedure for replacing a dead node?
>
> Please share Latest nodetool status.
>
> nodetool output shared earlier:
>
>  `nodetool status` output:
>
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address Load   Tokens  Owns   Host
> ID   Rack
> UN  A (Good)252.37 GB  256 23.0%
> 9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
> UN  B (Good)245.91 GB  256 24.4%
> 6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
> UN  C (Good)254.79 GB  256 23.7%
> f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
> UN  D (Bad) 163.85 GB  256 28.8%
> faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>
>
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
> 
>
> On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin
>  wrote:
> Hi all,
>
> We’ve spent a few days running things but are in the same position. To add
> some more flavour:
>
>
>- We have a 3-node ring, replication factor = 3. We’ve been running in
>this configuration for a few years without any real issues
>- Nodes 2 & 3 are much newer than node 1. These two nodes were brought
>in to replace two other nodes which had failed RAID0 configuration and thus
>were lacking in disk space.
>- When node 2 was brought into the ring, it exhibited high CPU wait,
>IO and load metrics
>- We subsequently brought 3 into the ring: as soon as 3 was fully
>bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal
>levels. Those same stats on 3, however, sky-rocketed
>- We’ve confirmed configuration across all three nodes are identical
>and in line with the recommended production settings
>- We’ve run a full repair
>- Node 2 is currently running compactions, 1 & 3 aren’t and have no
>pending
>- There is no GC happening from what I can see. Node 1 has a GC log,
>but that’s not been written to since May last year
>
>
> What we’re seeing at the moment is similar and normal stats on nodes 1 &
> 2, but high CPU wait, IO and load stats on 3. As a snapshot:
>
>
>1. Load: 3.96, CPU wait: 30.8%, Disk Read Ops: 408/s
>2. Load: 5.88, CPU wait: 14.6%, Disk Read Ops: 275/s
>3. Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s
>
>
> Can you recommend any next steps?
>
> Griff
>
> On 6 January 2016 at 17:31, Anuj Wadehra  wrote:
>
>> Hi Vickrum,
>>
>> I would have proceeded with diagnosis as follows:
>>
>> 1. Analysis of sar report to check system health -cpu memory swap disk
>> etc.
>> System seems to be overloaded. This is evident from mutation drops.
>>
>> 2. Make sure that all recommended Cassandra production settings
>> available at the DataStax site are applied; disable zone reclaim and THP.
>>
>> 3. Run a full repair on the bad node and check the data size. The node is owner of the
>> maximum token range but has significantly lower data. I doubt that
>> bootstrapping happened properly.
>>
>> 4. Compactionstats shows 22 pending compactions. Try throttling
>> compactions by reducing concurrent compactors or compaction throughput.
>>
>> 5. Analyze logs to make sure bootstrapping happened without errors.
>>
>> 6. Look for other common performance problems such as GC pauses to make
>> sure that dropped mutations are not caused by GC pauses.
>>
>>
>> Thanks
>> Anuj
>>
>> Sent from Yahoo Mail on Android
>> 
>>
>> On Wed, 6 Jan, 2016 at 10:12 pm, Vickrum Loi
>>  wrote:
>> # nodetool compactionstats
>> 

Re: New node has high network and disk usage.

2016-01-13 Thread Anuj Wadehra
Node 2 has slightly higher data but that should be ok. Not sure how read ops 
are so high when no IO-intensive activity such as repair or compaction is 
running on node 3. Maybe you can try investigating the logs to see what's happening.
Others on the mailing list could also share their views on the situation.

Thanks,
Anuj


Sent from Yahoo Mail on Android

On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin wrote:

Hi Anuj,
Below is the output of nodetool status. The nodes were replaced following the 
instructions in Datastax documentation for replacing running nodes since the 
nodes were running fine, it was that the servers had been incorrectly 
initialised and they thus had less disk space. The status below shows 2 has 
significantly higher load, however as I say 2 is operating normally and is 
running compactions, so I guess that's not an issue?
Datacenter: datacenter1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns   Host ID                               Rack
UN  1        253.59 GB  256     31.7%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
UN  2        302.23 GB  256     35.3%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
UN  3        265.02 GB  256     33.1%  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
Griff

On 13 January 2016 at 18:12, Anuj Wadehra  wrote:

Hi,
Revisiting the thread I can see that nodetool status had both good and bad 
nodes at same time. How do you replace nodes? When you say bad node..I 
understand that the node is no more usable even though Cassandra is UP? Is that 
correct?
If a node is in bad shape and not working, adding new node may trigger 
streaming huge data from bad node too. Have you considered using the procedure 
for replacing a dead node?
Please share Latest nodetool status.
nodetool output shared earlier:
 `nodetool status` output:

    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address Load   Tokens  Owns   Host ID   
    Rack
    UN  A (Good)    252.37 GB  256 23.0%  
9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
    UN  B (Good)    245.91 GB  256 24.4%  
6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
    UN  C (Good)    254.79 GB  256 23.7%  
f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
    UN  D (Bad) 163.85 GB  256 28.8%  
faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1



Thanks,
Anuj

Sent from Yahoo Mail on Android

On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin wrote:

Hi all,
We’ve spent a few days running things but are in the same position. To add some 
more flavour:
   
   - We have a 3-node ring, replication factor = 3. We’ve been running in this 
configuration for a few years without any real issues
   - Nodes 2 & 3 are much newer than node 1. These two nodes were brought in to 
replace two other nodes which had failed RAID0 configuration and thus were 
lacking in disk space.
   - When node 2 was brought into the ring, it exhibited high CPU wait, IO and 
load metrics
   - We subsequently brought 3 into the ring: as soon as 3 was fully 
bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal levels. 
Those same stats on 3, however, sky-rocketed
   - We’ve confirmed configuration across all three nodes are identical and in 
line with the recommended production settings
   - We’ve run a full repair
   - Node 2 is currently running compactions, 1 & 3 aren’t and have no pending
   - There is no GC happening from what I can see. Node 1 has a GC log, but 
that’s not been written to since May last year

What we’re seeing at the moment is similar and normal stats on nodes 1 & 2, but 
high CPU wait, IO and load stats on 3. As a snapshot:
   
   - Load: 3.96, CPU wait: 30.8%, Disk Read Ops: 408/s
   - Load: 5.88, CPU wait: 14.6%, Disk Read Ops: 275/s 
   - Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s 

Can you recommend any next steps? 
Griff

On 6 January 2016 at 17:31, Anuj Wadehra  wrote:

Hi Vickrum,
I would have proceeded with diagnosis as follows:
1. Analysis of the sar report to check system health (CPU, memory, swap, disk, etc.). 
The system seems to be overloaded. This is evident from the mutation drops.
2. Make sure that all recommended Cassandra production settings available at the 
DataStax site are applied; disable zone reclaim and THP.
3. Run a full repair on the bad node and check the data size. The node is owner of the 
maximum token range but has significantly lower data. I doubt that bootstrapping happened 
properly.
4. Compactionstats shows 22 pending compactions. Try throttling compactions by 
reducing concurrent compactors or compaction throughput.
5. Analyze logs to make sure bootstrapping happened without errors.
6. Look for other common performance problems such as GC pauses to make sure 
that dropped mutations are not caused by GC pauses.

Thanks,
Anuj

Sent from Yahoo Mail 

Re: Help debugging a very slow query

2016-01-13 Thread Jeff Jirsa
Very large partitions create a lot of garbage during reads: 
https://issues.apache.org/jira/browse/CASSANDRA-9754 - you will see significant 
GC pauses trying to read from large enough partitions. 

I suspect GC, though it’s odd that you’re unable to see it. 



From:  Bryan Cheng
Reply-To:  "user@cassandra.apache.org"
Date:  Wednesday, January 13, 2016 at 12:40 PM
To:  "user@cassandra.apache.org"
Subject:  Help debugging a very slow query

Hi list, 

Would appreciate some insight into some irregular performance we're seeing.

We have a column family that has become problematic recently. We've noticed a 
few queries take enormous amounts of time, and seem to clog up read resources 
on the machine (read pending tasks pile up, then immediately are relieved). 

I've included the output of cfhistograms on this keyspace[1]. The latencies 
sampled do not include one of these problematic partitions, but show two 
things: 1) the vast majority of queries to this table seem to be healthy, and 
2) that the maximum partition size is absurd (4139110981 bytes).

This particular cf is not expected to be updated beyond an initial set of 
writes, but can be read many times. The data model includes several hashes that 
amount to a few KB at most, a set that can hit ~30-40 entries, and three 
lists that reach a hundred or so entries at most. There doesn't appear to 
be any material difference in the size or character of the data saved between 
"good" and "bad" partitions. Often, the same extremely slow partition queried 
with consistency ONE returns cleanly and very quickly against other replicas.

I've included a trace of one of these slow returns[2], which I find very 
strange: The vast majority of operations are very quick, but the final step is 
extremely slow. Nothing exceeds 2ms until the final "Read 1 live and 0 
tombstone cells" which takes a whopping 69 seconds [!!]. We've checked our 
garbage collection in this time period and have not noticed any significant 
collections.

As far as I can tell, the trace doesn't raise any red flags, and we're largely 
stumped.

We've got two main questions:

1) What's up with the megapartition? What's the best way to debug this? Our 
data model is largely write once, we don't do any updates. We do DELETE, but 
the partitions that are giving us issues haven't been removed. We had some 
suspicions on https://issues.apache.org/jira/browse/CASSANDRA-10547, but that 
seems to largely be triggered by UPDATE operations.

2) What could cause the Read to take such an absurd amount of time when it's a 
pair of sstables and the memtable being examined, and its just a single cell 
being read? We originally suspected just memory pressure from huge sstables, 
but without a corresponding GC this seems unlikely?

Any ideas?

Thanks in advance!

--Bryan


[1] 
Percentile  SSTables Write Latency  Read LatencyPartition Size  
  Cell Count
  (micros)  (micros)   (bytes)  

50% 1.00 35.00 72.00  1109  
  14
75% 1.00 50.00149.00  1331  
  17
95% 1.00 72.00924.00  4768  
  35
98% 2.00103.00   1597.00  9887  
  72
99% 2.00149.00   1597.00 14237  
 103
Min 0.00 15.00 25.0043  
   0
Max 2.00258.00   6866.004139110981  
   20501

[2]
[
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f524890-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Parsing select * from pooltx where hash = 
0x5f805c68d66e7d271361e7774a7eeec0591eb5197d4f420126cea83171f0a8ff;",
"Source": "172.31.54.46",
"SourceElapsed": 26
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f526fa0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Preparing statement",
"Source": "172.31.54.46",
"SourceElapsed": 79
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f52bdc0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Executing single-partition query on pooltx",
"Source": "172.31.54.46",
"SourceElapsed": 1014
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f52e4d0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Acquiring sstable references",
"Source": "172.31.54.46",
"SourceElapsed": 1016
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f530be0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Merging memtable tombstones",
"Source": "172.31.54.46",
"SourceElapsed": 1029
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f5332f0-ba2f-11e5-8729-e1d125cb9b2d",

Re: New node has high network and disk usage.

2016-01-13 Thread James Griffin
I think I was incorrect in assuming GC wasn't an issue due to the lack of
logs. Comparing jstat output on nodes 2 & 3 shows some fairly marked
differences, though comparing the startup flags on the two machines shows
the GC config is identical:

$ jstat -gcutil
   S0     S1     E      O      P      YGC     YGCT       FGC   FGCT     GCT
2  5.08   0.00   55.72  18.24  59.90  25986   619.827    28    1.597    621.424
3  0.00   0.00   22.79  17.87  59.99  422600  11225.979  668   57.383   11283.361

Here's typical output for iostat on nodes 2 & 3 as well:

$ iostat -dmx md0

  Device:  rrqm/s  wrqm/s  r/s      w/s   rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
2 md0      0.00    0.00    339.00   0.00  9.77   0.00   59.00     0.00      0.00   0.00     0.00     0.00   0.00
3 md0      0.00    0.00    2069.00  1.00  85.85  0.00   84.94     0.00      0.00   0.00     0.00     0.00   0.00
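
Side note: since there doesn't seem to be a current GC log on the busy node, this is
roughly what we'd add to capture one, plus a way to keep sampling jstat (paths and the
pgrep pattern are assumptions for a stock install):

    # cassandra-env.sh
    JVM_OPTS="$JVM_OPTS -Xloggc:/var/log/cassandra/gc.log"
    JVM_OPTS="$JVM_OPTS -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintGCApplicationStoppedTime"

    # sample GC utilisation once a second, 60 samples, against the running daemon
    jstat -gcutil $(pgrep -f CassandraDaemon) 1000 60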

Griff

On 13 January 2016 at 18:36, Anuj Wadehra  wrote:

> Node 2 has slightly higher data but that should be ok. Not sure how read
> ops are so high when no IO intensive activity such as repair and compaction
> is running on node 3. Maybe you can try investigating logs to see what's
> happening.
>
> Others on the mailing list could also share their views on the situation.
>
> Thanks
> Anuj
>
>
>
> Sent from Yahoo Mail on Android
> 
>
> On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin
>  wrote:
> Hi Anuj,
>
> Below is the output of nodetool status. The nodes were replaced following
> the instructions in Datastax documentation for replacing running nodes
> since the nodes were running fine, it was that the servers had been
> incorrectly initialised and they thus had less disk space. The status below
> shows 2 has significantly higher load, however as I say 2 is operating
> normally and is running compactions, so I guess that's not an issue?
>
> Datacenter: datacenter1
> ===
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address Load   Tokens  Owns   Host ID
>   Rack
> UN  1   253.59 GB  256 31.7%
>  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
> UN  2   302.23 GB  256 35.3%
>  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
> UN  3   265.02 GB  256 33.1%
>  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
>
> Griff
>
> On 13 January 2016 at 18:12, Anuj Wadehra  wrote:
>
>> Hi,
>>
>> Revisiting the thread I can see that nodetool status had both good and
>> bad nodes at same time. How do you replace nodes? When you say bad node..I
>> understand that the node is no more usable even though Cassandra is UP? Is
>> that correct?
>>
>> If a node is in bad shape and not working, adding new node may trigger
>> streaming huge data from bad node too. Have you considered using the
>> procedure for replacing a dead node?
>>
>> Please share Latest nodetool status.
>>
>> nodetool output shared earlier:
>>
>>  `nodetool status` output:
>>
>> Status=Up/Down
>> |/ State=Normal/Leaving/Joining/Moving
>> --  Address Load   Tokens  Owns   Host
>> ID   Rack
>> UN  A (Good)252.37 GB  256 23.0%
>> 9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
>> UN  B (Good)245.91 GB  256 24.4%
>> 6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
>> UN  C (Good)254.79 GB  256 23.7%
>> f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
>> UN  D (Bad) 163.85 GB  256 28.8%
>> faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
>>
>>
>>
>> Thanks
>> Anuj
>>
>> Sent from Yahoo Mail on Android
>> 
>>
>> On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin
>>  wrote:
>> Hi all,
>>
>> We’ve spent a few days running things but are in the same position. To
>> add some more flavour:
>>
>>
>>- We have a 3-node ring, replication factor = 3. We’ve been running
>>in this configuration for a few years without any real issues
>>- Nodes 2 & 3 are much newer than node 1. These two nodes were
>>brought in to replace two other nodes which had failed RAID0 configuration
>>and thus were lacking in disk space.
>>- When node 2 was brought into the ring, it exhibited high CPU wait,
>>IO and load metrics
>>- We subsequently brought 3 into the ring: as soon as 3 was fully
>>bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal
>>levels. Those same stats on 3, however, sky-rocketed
>>- We’ve confirmed configuration across all three nodes are identical
>>and in line with the recommended production settings
>>- We’ve run a full repair
>>- Node 2 is currently running compactions, 1 & 3 aren’t and have no
>>pending
>>- There is no GC happening from what I can see. Node 1 has a GC log,
>>

Re: Help debugging a very slow query

2016-01-13 Thread Robert Coli
On Wed, Jan 13, 2016 at 12:40 PM, Bryan Cheng  wrote:

> 1) What's up with the megapartition? What's the best way to debug this?
> Our data model is largely write once, we don't do any updates. We do
> DELETE, but the partitions that are giving us issues haven't been removed.
> We had some suspicions on
> https://issues.apache.org/jira/browse/CASSANDRA-10547, but that seems to
> largely be triggered by UPDATE operations.
>

Modern versions of Cassandra log the partition key of large partitions when
they are compacted.

Assuming no data model problems, large partitions are frequently the
partition for the key 'None' or some other application level error.
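
A hedged sketch of where to look: the warning normally lands in the system log at
compaction time (the exact wording differs between versions), and the threshold is
tunable in cassandra.yaml on recent releases:

    grep -iE 'large (row|partition)' /var/log/cassandra/system.log
    # where available: compaction_large_partition_warning_threshold_mb in cassandra.yaml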

=Rob


Help debugging a very slow query

2016-01-13 Thread Bryan Cheng
Hi list,

Would appreciate some insight into some irregular performance we're seeing.

We have a column family that has become problematic recently. We've noticed
a few queries take enormous amounts of time, and seem to clog up read
resources on the machine (read pending tasks pile up, then immediately are
relieved).

I've included the output of cfhistograms on this keyspace[1]. The latencies
sampled do not include one of these problematic partitions, but show two
things: 1) the vast majority of queries to this table seem to be healthy,
and 2) that the maximum partition size is absurd (4139110981 bytes).

This particular cf is not expected to be updated beyond an initial set of
writes, but can be read many times. The data model includes several hashes
that amount to a few KB at most, a set that can hit ~30-40 entries,
and three lists that reach a hundred or so entries at most. There
doesn't appear to be any material difference in the size or character of
the data saved between "good" and "bad" partitions. Often, the same
extremely slow partition queried with consistency ONE returns cleanly and
very quickly against other replicas.

I've included a trace of one of these slow returns[2], which I find very
strange: The vast majority of operations are very quick, but the final step
is extremely slow. Nothing exceeds 2ms until the final "Read 1 live and 0
tombstone cells" which takes a whopping 69 seconds [!!]. We've checked our
garbage collection in this time period and have not noticed any significant
collections.

As far as I can tell, the trace doesn't raise any red flags, and we're
largely stumped.

We've got two main questions:

1) What's up with the megapartition? What's the best way to debug this? Our
data model is largely write once, we don't do any updates. We do DELETE,
but the partitions that are giving us issues haven't been removed. We had
some suspicions on https://issues.apache.org/jira/browse/CASSANDRA-10547,
but that seems to largely be triggered by UPDATE operations.

2) What could cause the Read to take such an absurd amount of time when
it's a pair of sstables and the memtable being examined, and its just a
single cell being read? We originally suspected just memory pressure from
huge sstables, but without a corresponding GC this seems unlikely?

Any ideas?

Thanks in advance!

--Bryan


[1]
Percentile  SSTables Write Latency  Read LatencyPartition Size
   Cell Count
  (micros)  (micros)   (bytes)

50% 1.00 35.00 72.00  1109
   14
75% 1.00 50.00149.00  1331
   17
95% 1.00 72.00924.00  4768
   35
98% 2.00103.00   1597.00  9887
   72
99% 2.00149.00   1597.00 14237
  103
Min 0.00 15.00 25.0043
0
Max 2.00258.00   6866.004139110981
20501

[2]

[
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f524890-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Parsing select * from pooltx where hash =
0x5f805c68d66e7d271361e7774a7eeec0591eb5197d4f420126cea83171f0a8ff;",
"Source": "172.31.54.46",
"SourceElapsed": 26
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f526fa0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Preparing statement",
"Source": "172.31.54.46",
"SourceElapsed": 79
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f52bdc0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Executing single-partition query on pooltx",
"Source": "172.31.54.46",
"SourceElapsed": 1014
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f52e4d0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Acquiring sstable references",
"Source": "172.31.54.46",
"SourceElapsed": 1016
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f530be0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Merging memtable tombstones",
"Source": "172.31.54.46",
"SourceElapsed": 1029
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f5332f0-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Bloom filter allows skipping sstable 387133",
"Source": "172.31.54.46",
"SourceElapsed": 1040
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f535a00-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Key cache hit for sstable 386331",
"Source": "172.31.54.46",
"SourceElapsed": 1046
  },
  {
"Sessionid": "4f51fa70-ba2f-11e5-8729-e1d125cb9b2d",
"Eventid": "4f538110-ba2f-11e5-8729-e1d125cb9b2d",
"Activity": "Seeking to 

Re: Cassandra is consuming a lot of disk space

2016-01-13 Thread Rahul Ramesh
Thanks for your suggestion.

Compaction was happening on one of the large tables. The disk space did not
decrease much after the compaction. So I ran an external compaction. The
disk space decreased by around 10%. However, it is still consuming close to
750 GB for a load of 250 GB.

I even restarted Cassandra thinking there may be some open files. However,
it didn't help much.

Is there any way to find out why so much disk space is being consumed?

I checked if there are any open files using lsof. There are not any open
files.

Recovery:
Just a wild thought.
I am using a replication factor of 2 and I have two nodes. If I delete the
complete data on one of the nodes, will I be able to recover all the data
from the active node?
I don't want to pursue this path as I want to find out the root cause of
the issue!


Any help will be greatly appreciated

Thank you,

Rahul






On Wed, Jan 13, 2016 at 3:37 PM, Carlos Rolo  wrote:

> You can check if the snapshot exists in the snapshot folder.
> Repairs stream sstables over, which can temporarily increase disk space. But
> I think Carlos Alonso might be correct. Running compactions might be the
> issue.
>
> Regards,
>
> Carlos Juzarte Rolo
> Cassandra Consultant
>
> Pythian - Love your data
>
> rolo@pythian | Twitter: @cjrolo | Linkedin: *linkedin.com/in/carlosjuzarterolo
> *
> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
> www.pythian.com
>
> On Wed, Jan 13, 2016 at 9:24 AM, Carlos Alonso  wrote:
>
>> I'd have a look also at possible running compactions.
>>
>> If you have big column families with STCS then large compactions may be
>> happening.
>>
>> Check it with nodetool compactionstats
>>
>> Carlos Alonso | Software Engineer | @calonso
>> 
>>
>> On 13 January 2016 at 05:22, Kevin O'Connor  wrote:
>>
>>> Have you tried restarting? It's possible there's open file handles to
>>> sstables that have been compacted away. You can verify by doing lsof and
>>> grepping for DEL or deleted.
>>>
>>> If it's not that, you can run nodetool cleanup on each node to scan all
>>> of the sstables on disk and remove anything that it's not responsible for.
>>> Generally this would only work if you added nodes recently.
>>>
>>>
>>> On Tuesday, January 12, 2016, Rahul Ramesh  wrote:
>>>
 We have a 2 node Cassandra cluster with a replication factor of 2.

 The load factor on the nodes is around 350Gb

 Datacenter: Cassandra
 ==
 Address  RackStatus State   LoadOwns
  Token

 -5072018636360415943
 172.31.7.91  rack1   Up Normal  328.5 GB100.00%
 -7068746880841807701
 172.31.7.92  rack1   Up Normal  351.7 GB100.00%
 -5072018636360415943

 However, if I use df -h,

 /dev/xvdf   252G  223G   17G  94% /HDD1
 /dev/xvdg   493G  456G   12G  98% /HDD2
 /dev/xvdh   197G  167G   21G  90% /HDD3


 HDD1, 2 and 3 contain only Cassandra data. It amounts to close to 1 TB on
 one of the machines and close to 650 GB on the other.

 I started repair 2 days ago, after running repair, the amount of disk
 space consumption has actually increased.
 I also checked if this is because of snapshots. nodetool listsnapshot
 intermittently lists a snapshot but it goes away after sometime.

 Can somebody please help me understand,
 1. why so much disk space is consumed?
 2. Why did it increase after repair?
 3. Is there any way to recover from this state.


 Thanks,
 Rahul


>>
>
> --
>
>
>
>


Re: Cassandra DSE Solr - search JSON content in column

2016-01-13 Thread Joseph Tech
Hi Jack,

I didn't exactly understand your suggestion. Could you please share an
example?

Thanks,
Joseph

On Thu, Jan 14, 2016 at 10:21 AM, Jack Krupansky 
wrote:

> For a nested object you can just concatenate the sequence of names with
> dots or some other separator and use that for each leaf value of the nested
> tree.
>
> -- Jack Krupansky
>
> On Wed, Jan 13, 2016 at 11:40 PM, Joseph Tech 
> wrote:
>
>> Thanks, Field Transformers is exactly what i was looking for. Mine is a
>> somewhat nested object, so will need to see how complex the transformer
>> would get, and if it would become a maintenance hassle later on; will try
>> this out and share feedback.
>>
>> -Joseph
>>
>> On Wed, Jan 13, 2016 at 8:31 PM, Russell Bradberry 
>> wrote:
>>
>>> You can use the full text wildcard search as mentioned. However, if you
>>> need something more specific like certain fields in the JSON indexed, you
>>> can use DSE SOLR field transformers.
>>> http://www.datastax.com/dev/blog/dse-field-transformers
>>>
>>> From: DuyHai Doan 
>>> Reply-To: 
>>> Date: Wednesday, January 13, 2016 at 9:10 AM
>>> To: 
>>> Subject: Re: Cassandra DSE Solr - search JSON content in column
>>>
>>> Try
>>>
>>> SELECT * FROM your_table WHERE solr_query='json:"*100 ABC Street*"';
>>>
>>> Warning: since you're storing in JSON format, searching data inside a
>>> JSON is equivalent to a wildcard search *xxx* and it is quite expensive,
>>> even for full text search engines like Solr
>>>
>>> On Wed, Jan 13, 2016 at 2:50 PM, Joseph Tech 
>>> wrote:
>>>
 Hi,

 Is it possible in DSE Cassandra Solr to search for JSON content within
 a column?
 We store a complex JSON in a column of type "text", very simplified
 version below.

 {
 "userId": "user100",
 "addressList": [{
 "addressId": "100",
 "address": "100 ABC Street"
 }],
 "userName": "user11"
 }

 In this, can search return all records that have address="100 ABC
 Street" ?

 Thanks,
 Joseph


>>>
>>
>


Re: Cassandra is consuming a lot of disk space

2016-01-13 Thread Jan Kesten
Hi Rahul,

just an idea, did you have a look at the data directories on disk 
(/var/lib/cassandra/data)? It could be that there are some from old keyspaces 
that have been deleted and snapshotted before. Try something like "du -sh 
/var/lib/cassandra/data/*" to verify which keyspace is consuming your space.
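
A slightly expanded sketch of that check, comparing what is on disk with what the
cluster still knows about (default path assumed):

    du -sh /var/lib/cassandra/data/* | sort -h
    cqlsh -e 'DESCRIBE KEYSPACES;'    # any directory on disk not listed here is a leftover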

Jan

Sent from my iPhone

> Am 14.01.2016 um 07:25 schrieb Rahul Ramesh :
> 
> Thanks for your suggestion. 
> 
> Compaction was happening on one of the large tables. The disk space did not 
> decrease much after the compaction. So I ran an external compaction. The disk 
> space decreased by around 10%. However it is still consuming close to 750Gb 
> for load of 250Gb. 
> 
> I even restarted cassandra thinking there may be some open files. However it 
> didnt help much. 
> 
> Is there any way to find out why so much of data is being consumed? 
> 
> I checked if there are any open files using lsof. There are not any open 
> files.
> 
> Recovery:
> Just a wild thought 
> I am using replication factor of 2 and I have two nodes. If I delete complete 
> data on one of the node, will I be able to recover all the data from the 
> active node? 
> I don't want to pursue this path as I want to find out the root cause of the 
> issue! 
> 
> 
> Any help will be greatly appreciated
> 
> Thank you,
> 
> Rahul
> 
> 
> 
> 
> 
> 
>> On Wed, Jan 13, 2016 at 3:37 PM, Carlos Rolo  wrote:
>> You can check if the snapshot exists in the snapshot folder.
>> Repairs stream sstables over, which can temporarily increase disk space. But I 
>> think Carlos Alonso might be correct. Running compactions might be the issue.
>> 
>> Regards,
>> 
>> Carlos Juzarte Rolo
>> Cassandra Consultant
>>  
>> Pythian - Love your data
>> 
>> rolo@pythian | Twitter: @cjrolo | Linkedin: linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
>> www.pythian.com
>> 
>>> On Wed, Jan 13, 2016 at 9:24 AM, Carlos Alonso  wrote:
>>> I'd have a look also at possible running compactions.
>>> 
>>> If you have big column families with STCS then large compactions may be 
>>> happening.
>>> 
>>> Check it with nodetool compactionstats
>>> 
>>> Carlos Alonso | Software Engineer | @calonso
>>> 
 On 13 January 2016 at 05:22, Kevin O'Connor  wrote:
 Have you tried restarting? It's possible there's open file handles to 
 sstables that have been compacted away. You can verify by doing lsof and 
 grepping for DEL or deleted. 
 
 If it's not that, you can run nodetool cleanup on each node to scan all of 
 the sstables on disk and remove anything that it's not responsible for. 
 Generally this would only work if you added nodes recently. 
 
 
> On Tuesday, January 12, 2016, Rahul Ramesh  wrote:
> We have a 2 node Cassandra cluster with a replication factor of 2. 
> 
> The load factor on the nodes is around 350Gb
> 
> Datacenter: Cassandra
> ==
> Address  RackStatus State   LoadOwns  
>   Token   
>   
>   -5072018636360415943
> 172.31.7.91  rack1   Up Normal  328.5 GB100.00%   
>   -7068746880841807701   
> 172.31.7.92  rack1   Up Normal  351.7 GB100.00%   
>   -5072018636360415943
> 
> However, if I use df -h, 
> 
> /dev/xvdf   252G  223G   17G  94% /HDD1
> /dev/xvdg   493G  456G   12G  98% /HDD2
> /dev/xvdh   197G  167G   21G  90% /HDD3
> 
> 
> HDD1, 2 and 3 contain only Cassandra data. It amounts to close to 1 TB on one 
> of the machines and close to 650 GB on the other. 
> 
> I started repair 2 days ago, after running repair, the amount of disk 
> space consumption has actually increased. 
> I also checked if this is because of snapshots. nodetool listsnapshot 
> intermittently lists a snapshot but it goes away after sometime. 
> 
> Can somebody please help me understand, 
> 1. why so much disk space is consumed?
> 2. Why did it increase after repair?
> 3. Is there any way to recover from this state.
> 
> 
> Thanks,
> Rahul
>> 
>> 
>> --
>> 
> 


Re: max connection per user

2016-01-13 Thread Robert Coli
On Wed, Jan 13, 2016 at 1:41 PM, oleg yusim  wrote:

> Quick question, here: does Cassandra have a configuration switch to limit
> number of connections per user (protection of DoS attack, security)?
>

Quick answer : no.

=Rob


max connection per user

2016-01-13 Thread oleg yusim
Greetings,

Quick question, here: does Cassandra have a configuration switch to limit
number of connections per user (protection of DoS attack, security)?

Thanks,

Oleg


Re: Recommendations for an embedded Cassandra and Unit Tests

2016-01-13 Thread Richard L. Burton III
I was hoping that the cqlsh project would expose a class that you can feed
a source file via Java.

The parsers in these other projects don't properly parse CQL. e.g., when
you encounter a semicolon within a string, ignore it and continue on
looking for the end of the string.

I ended up having separate *.cql files that I execute during the setup of
my tests. Not ideal, but it'll work.
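
(In case it's useful, one way to run such a file outside of Java is simply cqlsh; the
host and path here are specific to my setup:)

    cqlsh 127.0.0.1 -f src/test/resources/schema.cql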


On Tue, Jan 12, 2016 at 7:24 AM, DuyHai Doan  wrote:

> "What I'm noticing with these projects is that they don't handle CQL
> files properly"
>
> --> your concern is very legit. But handling CQL files properly is very
> complex, let me explain the reasons.
>
> A naive solution if you want to handle CQL syntax is to re-use the ANTLR
> grammar file here:
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/Cql.g
>
>  I've gone down this path in the past and it's nearly impossible, simply
> because the Cql.g grammar file is using a lot of "internal" Cassandra
> classes. Just look at the import block at the beginning of the file.
>
> At a higher level, we should clearly define the "scope" of a CQL script
> executor. Is it responsible for 1) parsing CQL statements or 2) validating
> CQL statements ?
>
> As far as I'm concerned, point 2) should be done by Cassandra. If we limit
> the scope of a script executor to point 1) it's sufficient.
>
> Indeed the remaining challenge is: how to split a block of input text
> that contains multiple CQL statements into a list of CQL statements that
> can be executed sequentially (or in parallel) by the Java driver?
>
> The Zeppelin Cassandra interpreter is using Scala combinator parser to
> define a minimum grammar to split differents CQL statements apart:
> https://github.com/doanduyhai/incubator-zeppelin/blob/CassandraInterpreter-V2/cassandra/src/main/scala/org/apache/zeppelin/cassandra/ParagraphParser.scala#L179-L198
>
> Until Cassandra 2.1, it's pretty easy, the semi-colon (;) can be used as
> statement separator. Since Cassandra 2.2 and the introduction of UDF, it's
> much more complex. Semi-colon can appears in Java source code block of the
> definition of a function so using it as separator no longer works.
>
> A complex regular expression like this:
> https://github.com/doanduyhai/incubator-zeppelin/blob/CassandraInterpreter-V2/cassandra/src/main/scala/org/apache/zeppelin/cassandra/ParagraphParser.scala#L55-L69
> is necessary to parse UDF creation statements correctly.
>
> In a nutshell, parsing (and even not validating) CQL is harder than most
> people think.
>
>
>
> On Mon, Jan 11, 2016 at 10:52 PM, Richard L. Burton III <
> mrbur...@gmail.com> wrote:
>
>> What I'm noticing with these projects is that they don't handle CQL files
>> properly. e.g., cassandra-unit dies when you have a string that contains ;
>> inside of it. The parsing logic they use is very primitive in the sense
>> they simple look for ; to denote the end of a statement.
>>
>> Is there any class in Cassandra I could use that given a *.cql file,
>> it'll return a list of statements inside of it?
>>
>> Looking at CQLParser, it's only good for parsing a single statement vs. a
>> file that contains multiple statements.
>>
>>
>> On Mon, Jan 11, 2016 at 3:06 PM, DuyHai Doan 
>> wrote:
>>
>>> Achilles 4.x does offer an embedded Cassandra server support with some
>>> utility classes like ScriptExecutor. It supports C* 2.2 currently :
>>>
>>> https://github.com/doanduyhai/Achilles/wiki/CQL-embedded-cassandra-server
>>> On 11 Jan 2016 at 20:47, "Richard L. Burton III" wrote:
>>>
 I'm looking to see what's recommended for an embedded version of
 Cassandra, just for unit testing.

 I'm looking at https://github.com/jsevellec/cassandra-unit/wiki but I
 wanted to see if there's was a better recommendation?

 --
 -Richard L. Burton III
 @rburton

>>>
>>
>>
>> --
>> -Richard L. Burton III
>> @rburton
>>
>
>


-- 
-Richard L. Burton III
@rburton


Re: Cassandra Performance on a Single Machine

2016-01-13 Thread Anurag Khandelwal
Hi John,

Thanks for responding!

The aim of this benchmark was not to benchmark Cassandra as an end-to-end 
distributed system, but to understand a breakdown of the performance. For 
instance, if we understand the performance characteristics that we can expect 
from a single-machine Cassandra instance with RF=Consistency=1, we can have a 
good estimate of what the distributed performance with higher replication 
factors and consistency levels is going to look like. Even in the ideal case, the 
performance improvement would scale at most linearly with more machines and 
replicas.

That being said, I still want to understand whether this is the performance I 
should expect for the setup I described; if the performance for the current 
setup can be improved, then clearly the performance for a production setup 
(with multiple nodes, replicas) would also improve. Does that make sense?

Thanks!
Anurag

> On Jan 6, 2016, at 9:31 AM, John Schulz  wrote:
> 
> Anurag,
> 
> Unless you are planning on continuing to use only one machine with RF=1, 
> benchmarking a single system using RF=Consistency=1 is mostly a waste of 
> time. If you are going to use RF=1 and a single host then why use Cassandra 
> at all. Plain old relational dbs should do the job just fine.
> 
> Cassandra is designed to be distributed. You won't get the full impact of how 
> it scales and the limits on scaling unless you benchmark a distributed 
> system. For example the scaling impact of secondary indexes will not be 
> visible on a single node.
> 
> John
> 
> 
> 
> 
> 
> 
> On Tue, Jan 5, 2016 at 3:16 PM, Anurag Khandelwal  > wrote:
> Hi,
> 
> I’ve been benchmarking Cassandra to get an idea of how the performance scales 
> with more data on a single machine. I just wanted to get some feedback to 
> whether these are the numbers I should expect.
> 
> The benchmarks are quite simple — I measure the latency and throughput for 
> two kinds of queries:
> 
> 1. get() queries - These fetch an entire row for a given primary key.
> 2. search() queries - These fetch all the primary keys for rows where a 
> particular column matches a particular value (e.g., “name” is “John Smith”). 
> 
> Indexes are constructed for all columns that are queried.
> 
> Dataset
> 
> The dataset used comprises of ~1.5KB records (on an average) when represented 
> as CSV; there are 105 attributes in each record.
> 
> Queries
> 
> For get() queries, randomly generated primary keys are used.
> 
> For search() queries, column values are selected such that their total number 
> of occurrences in the dataset is between 1 - 4000. For example, a query for  
> “name” = “John Smith” would only be performed if the number of rows that 
> contain the same lies between 1-4000.
> 
> The results for the benchmarks are provided below:
> 
> Latency Measurements
> 
> The latency measurements are an average of 1 queries.
> 
> 
> 
> 
> 
> Throughput Measurements
> 
> The throughput measurements were repeated for 1-16 client threads, and the 
> numbers reported for each input size is for the configuration (i.e., # client 
> threads) with the highest throughput.
> 
> 
> 
> 
> 
> Any feedback here would be greatly appreciated!
> 
> Thanks!
> Anurag
> 
> 
> 
> 
> -- 
> John H. Schulz
> Principal Consultant
> Pythian - Love your data
> 
> sch...@pythian.com  |  Linkedin 
> www.linkedin.com/pub/john-schulz/13/ab2/930/ 
> 
> Mobile: 248-376-3380
> www.pythian.com 
> --
> 
> 
> 
> 
> 



Re: Cassandra DSE Solr - search JSON content in column

2016-01-13 Thread Jack Krupansky
For a nested object you can just concatenate the sequence of names with
dots or some other separator and use that for each leaf value of the nested
tree.
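
To make that concrete with the simplified JSON from earlier in the thread: if the
transformer emits a flattened field such as addressList.address, a query could look
like the sketch below (table and field names are made up):

    cqlsh -e "SELECT * FROM ks.users WHERE solr_query = 'addressList.address:\"100 ABC Street\"';"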

-- Jack Krupansky

On Wed, Jan 13, 2016 at 11:40 PM, Joseph Tech  wrote:

> Thanks, Field Transformers is exactly what i was looking for. Mine is a
> somewhat nested object, so will need to see how complex the transformer
> would get, and if it would become a maintenance hassle later on; will try
> this out and share feedback.
>
> -Joseph
>
> On Wed, Jan 13, 2016 at 8:31 PM, Russell Bradberry 
> wrote:
>
>> You can use the full text wildcard search as mentioned. However, if you
>> need something more specific like certain fields in the JSON indexed, you
>> can use DSE SOLR field transformers.
>> http://www.datastax.com/dev/blog/dse-field-transformers
>>
>> From: DuyHai Doan 
>> Reply-To: 
>> Date: Wednesday, January 13, 2016 at 9:10 AM
>> To: 
>> Subject: Re: Cassandra DSE Solr - search JSON content in column
>>
>> Try
>>
>> SELECT * FROM your_table WHERE solr_query='json:"*100 ABC Street*"';
>>
>> Warning: since you're storing in JSON format, searching data inside a
>> JSON is equivalent to a wildcard search *xxx* and it is quite expensive,
>> even for full text search engines like Solr
>>
>> On Wed, Jan 13, 2016 at 2:50 PM, Joseph Tech 
>> wrote:
>>
>>> Hi,
>>>
>>> Is it possible in DSE Cassandra Solr to search for JSON content within a
>>> column?
>>> We store a complex JSON in a column of type "text", very simplified
>>> version below.
>>>
>>> {
>>> "userId": "user100",
>>> "addressList": [{
>>> "addressId": "100",
>>> "address": "100 ABC Street"
>>> }],
>>> "userName": "user11"
>>> }
>>>
>>> In this, can search return all records that have address="100 ABC
>>> Street" ?
>>>
>>> Thanks,
>>> Joseph
>>>
>>>
>>
>


Re: max connection per user

2016-01-13 Thread oleg yusim
OK Rob, I see what you're saying. Well, let's dive into the longer questions and
answers in this case a bit:

1) Is there any other approach Cassandra currently utilizes to mitigate DoS
attacks?
2) How about max connections per DB? I know Cassandra has this parameter in
the JDBC driver configuration, but what would be the suggested value not to exceed?

Thanks,

Oleg

On Wed, Jan 13, 2016 at 6:31 PM, Robert Coli  wrote:

> On Wed, Jan 13, 2016 at 1:41 PM, oleg yusim  wrote:
>
>> Quick question, here: does Cassandra have a configuration switch to limit
>> number of connections per user (protection of DoS attack, security)?
>>
>
> Quick answer : no.
>
> =Rob
>
>


Re: max connection per user

2016-01-13 Thread oleg yusim
Bryan - absolutely.

To give you a brief description of what I'm doing: I'm working for VMware
as a security architect, and they tasked me with creating a STIG (working
with DISA) for Cassandra DB. To create a STIG I would walk through the
Database SRG security controls and assess them against the Cassandra DB
configuration. As a result, I would have to address all the security
controls in the SRG, proposing mitigations where Cassandra can't meet a
control through configuration, and specifying the desired configuration
where it can.

At this particular place, I'm dealing with the following security control:

The DBMS must limit the number of concurrent sessions to an
organization-defined number per user for all accounts and/or account types.

Here is a brief dive into why it is needed:


Database management includes the ability to control the number of users and
user sessions utilizing a DBMS. Unlimited concurrent connections to the
DBMS could allow a successful Denial of Service (DoS) attack by exhausting
connection resources; and a system can also fail or be degraded by an
overload of legitimate users. Limiting the number of concurrent sessions
per user is helpful in reducing these risks.

This requirement addresses concurrent session control for a single account.
It does not address concurrent sessions by a single user via multiple
system accounts; and it does not deal with the total number of sessions
across all accounts.

The capability to limit the number of concurrent sessions per user must be
configured in or added to the DBMS (for example, by use of a logon
trigger), when this is technically feasible. Note that it is not sufficient
to limit sessions via a web server or application server alone, because
legitimate users and adversaries can potentially connect to the DBMS by
other means.

The organization will need to define the maximum number of concurrent
sessions by account type, by account, or a combination thereof. In deciding
on the appropriate number, it is important to consider the work
requirements of the various types of users. For example, 2 might be an
acceptable limit for general users accessing the database via an
application; but 10 might be too few for a database administrator using a
database management GUI tool, where each query tab and navigation pane may
count as a separate session.

(Sessions may also be referred to as connections or logons, which for the
purposes of this requirement are synonyms.)


Now with that in mind, a typical way to DoS a database would be to open more
connections than the database can support, bringing the server to its knees.
A typical way to counter it is to limit the number of concurrent user sessions
to two and the number of concurrent administrator sessions to 10.

With the answer Rob provided me with, I'm reduced to searching for a
mitigating control. That might be limiting the maximum number of connections
to the database to an amount the database can support for sure. I know the
JDBC driver has configuration switches that allow this. The question now
is - how many? What is the number of simultaneous connections Cassandra
would be able to bear?
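
For what it's worth, the server-side cap I'm looking at as a mitigating control
(hedged, since availability depends on the Cassandra version in use; if I
remember correctly these options appeared around 2.0.15 / 2.1.5) is the pair of
global and per-IP limits on native protocol connections in cassandra.yaml. The
numbers below are placeholders, not recommendations:

# cassandra.yaml - caps client connections per IP and in total (not per user)
native_transport_max_concurrent_connections: 1000
native_transport_max_concurrent_connections_per_ip: 50

Driver-side pool limits can complement this, but they only constrain
well-behaved clients, so they are not a control on their own.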

Thanks,

Oleg

On Wed, Jan 13, 2016 at 8:40 PM, Bryan Cheng  wrote:

> Are you actively exposing your database to users outside of your
> organization, or are you just asking about security best practices?
>
> If you mean the former, this isn't really a common use case and there
> isn't a huge amount out of the box that Cassandra will do to help.
>
> If you're just asking about security best-practices,
> http://www.datastax.com/wp-content/uploads/2014/04/WP-DataStax-Enterprise-Best-Practices.pdf
> has a brief blurb, and there are many resources online for securing
> Cassandra specifically and databases in general- the approaches are going
> to be largely the same.
>
> Can you describe what avenues you're expecting for either intrusion or DoS?
>
> On Wed, Jan 13, 2016 at 6:01 PM, oleg yusim  wrote:
>
>> OK Rob, I see what you're saying. Well, let's dive into the longer questions
>> and answers in this case a bit:
>>
>> 1) Is there any other approach Cassandra currently utilizes to mitigate
>> DoS attacks?
>> 2) How about max connections per DB? I know Cassandra has this parameter
>> in the JDBC driver configuration, but what would be the suggested value not to exceed?
>>
>> Thanks,
>>
>> Oleg
>>
>> On Wed, Jan 13, 2016 at 6:31 PM, Robert Coli 
>> wrote:
>>
>>> On Wed, Jan 13, 2016 at 1:41 PM, oleg yusim  wrote:
>>>
 Quick question, here: does Cassandra have a configuration switch to
 limit number of connections per user (protection of DoS attack, security)?

>>>
>>> Quick answer : no.
>>>
>>> =Rob
>>>
>>>
>>
>>
>


Re: Cassandra DSE Solr - search JSON content in column

2016-01-13 Thread Joseph Tech
Thanks, Field Transformers is exactly what I was looking for. Mine is a
somewhat nested object, so I will need to see how complex the transformer
would get, and whether it would become a maintenance hassle later on; I will
try this out and share feedback.

-Joseph

On Wed, Jan 13, 2016 at 8:31 PM, Russell Bradberry 
wrote:

> You can use the full text wildcard search as mentioned. However, if you
> need something more specific like certain fields in the JSON indexed, you
> can use DSE SOLR field transformers.
> http://www.datastax.com/dev/blog/dse-field-transformers
>
> From: DuyHai Doan 
> Reply-To: 
> Date: Wednesday, January 13, 2016 at 9:10 AM
> To: 
> Subject: Re: Cassandra DSE Solr - search JSON content in column
>
> Try
>
> SELECT * FROM your_table WHERE solr_query='json:"*100 ABC Street*"';
>
> Warning: since you're storing in JSON format, searching data inside a JSON
> is equivalent to a wildcard search *xxx* and it is quite expensive, even for
> full text search engines like Solr
>
> On Wed, Jan 13, 2016 at 2:50 PM, Joseph Tech 
> wrote:
>
>> Hi,
>>
>> Is it possible in DSE Cassandra Solr to search for JSON content within a
>> column?
>> We store a complex JSON in a column of type "text", very simplified
>> version below.
>>
>> {
>> "userId": "user100",
>> "addressList": [{
>> "addressId": "100",
>> "address": "100 ABC Street"
>> }],
>> "userName": "user11"
>> }
>>
>> In this, can search return all records that have address="100 ABC Street"
>> ?
>>
>> Thanks,
>> Joseph
>>
>>
>


Re: New node has high network and disk usage.

2016-01-13 Thread Anuj Wadehra
Ok. I saw dropped mutations on your cluster and full GC is a common cause for
that. Can you just search for the word GCInspector in system.log and share the
frequency of minor and full GC? Moreover, are you printing promotion failures
in the GC logs? Why is full GC getting triggered - promotion failures or
concurrent mode failures?
If you are on CMS, you need to fine-tune your heap options to address full GC.


Thanks,
Anuj
Sent from Yahoo Mail on Android 
 
  On Thu, 14 Jan, 2016 at 12:57 am, James Griffin wrote:

I think I was incorrect in assuming GC wasn't an issue due to the lack of logs.
Comparing jstat output on nodes 2 & 3 shows some fairly marked differences,
though comparing the startup flags on the two machines shows the GC config is
identical:

$ jstat -gcutil
   S0     S1     E      O      P     YGC       YGCT    FGC     FGCT        GCT
2  5.08   0.00  55.72  18.24  59.90   25986    619.827    28    1.597    621.424
3  0.00   0.00  22.79  17.87  59.99  422600  11225.979   668   57.383  11283.361
Here's typical output for iostat on nodes 2 & 3 as well:
$ iostat -dmx md0
  Device:  rrqm/s  wrqm/s     r/s   w/s  rMB/s  wMB/s  avgrq-sz  avgqu-sz  await  r_await  w_await  svctm  %util
2 md0        0.00    0.00  339.00  0.00   9.77   0.00     59.00      0.00   0.00     0.00     0.00   0.00   0.00
3 md0        0.00    0.00 2069.00  1.00  85.85   0.00     84.94      0.00   0.00     0.00     0.00   0.00   0.00
Griff

On 13 January 2016 at 18:36, Anuj Wadehra  wrote:

Node 2 has slightly higher data but that should be ok. Not sure how read ops
are so high when no IO-intensive activity such as repair or compaction is
running on node 3. Maybe you can try investigating the logs to see what's happening.
Others on the mailing list could also share their views on the situation.

Thanks,
Anuj


Sent from Yahoo Mail on Android 
 
 On Wed, 13 Jan, 2016 at 11:46 pm, James Griffin wrote:

Hi Anuj,

Below is the output of nodetool status. The nodes were replaced following the
instructions in the Datastax documentation for replacing running nodes, since
the nodes were running fine; it was just that the servers had been incorrectly
initialised and thus had less disk space. The status below shows 2 has
significantly higher load, however as I say 2 is operating normally and is
running compactions, so I guess that's not an issue?

Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address  Load       Tokens  Owns   Host ID                               Rack
UN  1        253.59 GB  256     31.7%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
UN  2        302.23 GB  256     35.3%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1
UN  3        265.02 GB  256     33.1%  74b15507-db5c-45df-81db-6e5bcb7438a3  rack1
Griff

On 13 January 2016 at 18:12, Anuj Wadehra  wrote:

Hi,

Revisiting the thread I can see that nodetool status had both good and bad
nodes at the same time. How do you replace nodes? When you say bad node, I
understand that the node is no longer usable even though Cassandra is UP? Is
that correct?

If a node is in bad shape and not working, adding a new node may trigger
streaming huge data from the bad node too. Have you considered using the
procedure for replacing a dead node?

Please share the latest nodetool status.
nodetool output shared earlier:
 `nodetool status` output:

    Status=Up/Down
    |/ State=Normal/Leaving/Joining/Moving
    --  Address   Load       Tokens  Owns   Host ID                               Rack
    UN  A (Good)  252.37 GB  256     23.0%  9cd2e58c-a062-48a4-8d3f-b7bd9ee0576f  rack1
    UN  B (Good)  245.91 GB  256     24.4%  6f0cfff2-babe-4de2-a1e3-6201228dee44  rack1
    UN  C (Good)  254.79 GB  256     23.7%  f4891729-9179-4f19-ab2c-50d387da7ac6  rack1
    UN  D (Bad)   163.85 GB  256     28.8%  faa5b073-6af4-4c80-b280-e7fdd61924d3  rack1



Thanks,
Anuj
Sent from Yahoo Mail on Android 
 
  On Wed, 13 Jan, 2016 at 10:34 pm, James Griffin wrote:

Hi all,

We’ve spent a few days running things but are in the same position. To add some
more flavour:

   - We have a 3-node ring, replication factor = 3. We’ve been running in this
   configuration for a few years without any real issues
   - Nodes 2 & 3 are much newer than node 1. These two nodes were brought in to
   replace two other nodes which had a failed RAID0 configuration and thus were
   lacking in disk space.
   - When node 2 was brought into the ring, it exhibited high CPU wait, IO and
   load metrics
   - We subsequently brought 3 into the ring: as soon as 3 was fully
   bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal
   levels. Those same stats on 3, however, sky-rocketed
   - We’ve confirmed configuration across all three nodes is identical and in
   line with the 

Re: max connection per user

2016-01-13 Thread Bryan Cheng
Are you actively exposing your database to users outside of your
organization, or are you just asking about security best practices?

If you mean the former, this isn't really a common use case and there isn't
a huge amount out of the box that Cassandra will do to help.

If you're just asking about security best-practices,
http://www.datastax.com/wp-content/uploads/2014/04/WP-DataStax-Enterprise-Best-Practices.pdf
has a brief blurb, and there are many resources online for securing
Cassandra specifically and databases in general- the approaches are going
to be largely the same.

Can you describe what avenues you're expecting for either intrusion or DoS?

On Wed, Jan 13, 2016 at 6:01 PM, oleg yusim  wrote:

> OK Rob, I see what you're saying. Well, let's dive into the longer questions
> and answers in this case a bit:
>
> 1) Is there any other approach Cassandra currently utilizes to mitigate
> DoS attacks?
> 2) How about max connections per DB? I know Cassandra has this parameter
> in the JDBC driver configuration, but what would be the suggested value not to exceed?
>
> Thanks,
>
> Oleg
>
> On Wed, Jan 13, 2016 at 6:31 PM, Robert Coli  wrote:
>
>> On Wed, Jan 13, 2016 at 1:41 PM, oleg yusim  wrote:
>>
>>> Quick question, here: does Cassandra have a configuration switch to
>>> limit number of connections per user (protection of DoS attack, security)?
>>>
>>
>> Quick answer : no.
>>
>> =Rob
>>
>>
>
>


Re: Cassandra DSE Solr - search JSON content in column

2016-01-13 Thread DuyHai Doan
Try

SELECT * FROM your_table WHERE solr_query='json:"*100 ABC Street*"';

Warning: since you're storing in JSON format, searching data inside a JSON
is equivalent to a wildcard search *xxx* and it is quite expensive, even for
full text search engines like Solr

On Wed, Jan 13, 2016 at 2:50 PM, Joseph Tech  wrote:

> Hi,
>
> Is it possible in DSE Cassandra Solr to search for JSON content within a
> column?
> We store a complex JSON in a column of type "text", very simplified
> version below.
>
> {
> "userId": "user100",
> "addressList": [{
> "addressId": "100",
> "address": "100 ABC Street"
> }],
> "userName": "user11"
> }
>
> In this, can search return all records that have address="100 ABC Street"
> ?
>
> Thanks,
> Joseph
>
>


Re: Sorting & pagination in apache cassandra 2.1

2016-01-13 Thread Narendra Sharma
In the example you gave, the primary key user_name is the row key. Since the
default partitioner is random, you are getting rows in random order.

Since each row has no clustering column, there is no further grouping of data.
Or, in simple terms, each row has one record and is being returned ordered by
column name.

To see some meaningful ordering there should be some clustering column
defined.

You can create additional column families to maintain ordering, or use
external solutions like Elasticsearch.
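
A minimal sketch of what that looks like (the table and column names are made
up for illustration):

CREATE TABLE user_events (
    user_name  text,
    event_time timestamp,
    payload    text,
    PRIMARY KEY (user_name, event_time)  -- user_name: partition key, event_time: clustering column
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Within a single partition, rows come back sorted by the clustering column;
-- ORDER BY can only follow or reverse that clustering order, not sort by arbitrary columns.
SELECT * FROM user_events WHERE user_name = 'user100' ORDER BY event_time ASC;
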
On Jan 12, 2016 10:07 PM, "anuja jain"  wrote:

> I understand the meaning of SSTable, but what's the reason behind sorting
> the table on the basis of int columns first?
> Is there any data type preference in Cassandra?
> Also, what is the alternative to creating materialised views if my
> Cassandra version is prior to 3.0 (specifically 2.1) and is already
> in production?
>
>
> On Wed, Jan 13, 2016 at 12:17 AM, Robert Coli 
> wrote:
>
>> On Mon, Jan 11, 2016 at 11:30 PM, anuja jain 
>> wrote:
>>
>>> 1 more question, what does it mean by "cassandra inherently sorts data"?
>>>
>>
>> SSTable = Sorted Strings Table.
>>
>> It doesn't contain "Strings" anymore, really, but that's a hint.. :)
>>
>> =Rob
>>
>
>


Re: Cassandra DSE Solr - search JSON content in column

2016-01-13 Thread Russell Bradberry
You can use the full text wildcard search as mentioned. However, if you need 
something more specific like certain fields in the JSON indexed, you can use 
DSE SOLR field transformers.  
http://www.datastax.com/dev/blog/dse-field-transformers

From:  DuyHai Doan 
Reply-To:  
Date:  Wednesday, January 13, 2016 at 9:10 AM
To:  
Subject:  Re: Cassandra DSE Solr - search JSON content in column

Try

SELECT * FROM your_table WHERE solr_query='json:"*100 ABC Street*"'; 

Warning: since you're storing in JSON format, searching data inside a JSON is 
equivalent to a wildcard search *xxx* and it is quite expensive, even for full 
text search engines like Solr

On Wed, Jan 13, 2016 at 2:50 PM, Joseph Tech  wrote:
Hi,

Is it possible in DSE Cassandra Solr to search for JSON content within a 
column? 
We store a complex JSON in a column of type "text", very simplified version 
below.

{
"userId": "user100",
"addressList": [{
"addressId": "100",
"address": "100 ABC Street"
}],
"userName": "user11"
}

In this, can search return all records that have address="100 ABC Street" ? 

Thanks,
Joseph





Re: New node has high network and disk usage.

2016-01-13 Thread James Griffin
 Hi all,

We’ve spent a few days running things but are in the same position. To add
some more flavour:


   - We have a 3-node ring, replication factor = 3. We’ve been running in
   this configuration for a few years without any real issues
   - Nodes 2 & 3 are much newer than node 1. These two nodes were brought
   in to replace two other nodes which had a failed RAID0 configuration and thus
   were lacking in disk space.
   - When node 2 was brought into the ring, it exhibited high CPU wait, IO
   and load metrics
   - We subsequently brought 3 into the ring: as soon as 3 was fully
   bootstrapped, the load, CPU wait and IO stats on 2 dropped to normal
   levels. Those same stats on 3, however, sky-rocketed
   - We’ve confirmed configuration across all three nodes is identical and
   in line with the recommended production settings
   - We’ve run a full repair
   - Node 2 is currently running compactions, 1 & 3 aren’t and have no
   pending
   - There is no GC happening from what I can see. Node 1 has a GC log, but
   that’s not been written to since May last year


What we’re seeing at the moment is similar: normal stats on nodes 1 & 2,
but high CPU wait, IO and load stats on 3. As a snapshot:


   1. Load: 3.96, CPU wait: 30.8%, Disk Read Ops: 408/s
   2. Load: 5.88, CPU wait: 14.6%, Disk Read Ops: 275/s
   3. Load: 58.15, CPU wait: 87.0%, Disk Read Ops: 2,408/s


Can you recommend any next steps?

Griff

On 6 January 2016 at 17:31, Anuj Wadehra  wrote:

> Hi Vickrum,
>
> I would have proceeded with diagnosis as follows:
>
> 1. Analysis of the sar report to check system health - CPU, memory, swap,
> disk, etc.
> The system seems to be overloaded. This is evident from mutation drops.
>
> 2. Make sure that all recommended Cassandra production settings available
> at the Datastax site are applied; disable zone reclaim and THP.
>
> 3. Run a full repair on the bad node and check data size. The node is owner
> of the maximum token range but has significantly lower data. I doubt that
> bootstrapping happened properly.
>
> 4. Compactionstats shows 22 pending compactions. Try throttling compactions
> by reducing concurrent compactors or compaction throughput.
>
> 5. Analyze logs to make sure bootstrapping happened without errors.
>
> 6. Look for other common performance problems such as GC pauses to make
> sure that dropped mutations are not caused by GC pauses.
>
>
> Thanks
> Anuj
>
> Sent from Yahoo Mail on Android
> 
>
> On Wed, 6 Jan, 2016 at 10:12 pm, Vickrum Loi
>  wrote:
> # nodetool compactionstats
> pending tasks: 22
>   compaction type   keyspace               table                       completed   total         unit   progress
>   Compaction        production_analytics   interactions                240410213   161172668724  bytes  0.15%
>   Compaction        production_decisions   decisions.decisions_q_idx   120815385   226295183     bytes  53.39%
> Active compaction remaining time :   2h39m58s
>
> Worth mentioning that compactions haven't been running on this node
> particularly often. The node's been performing badly regardless of whether
> it's compacting or not.
>
> On 6 January 2016 at 16:35, Jeff Ferland  wrote:
>
>> What’s your output of `nodetool compactionstats`?
>>
>> On Jan 6, 2016, at 7:26 AM, Vickrum Loi 
>> wrote:
>>
>> Hi,
>>
>> We recently added a new node to our cluster in order to replace a node
>> that died (hardware failure we believe). For the next two weeks it had high
>> disk and network activity. We replaced the server, but it's happened again.
>> We've looked into memory allowances, disk performance, number of
>> connections, and all the nodetool stats, but can't find the cause of the
>> issue.
>>
>> `nodetool tpstats`[0] shows a lot of active and pending threads, in
>> comparison to the rest of the cluster, but that's likely a symptom, not a
>> cause.
>>
>> `nodetool status`[1] shows the cluster isn't quite balanced. The bad node
>> (D) has less data.
>>
>> Disk Activity[2] and Network activity[3] on this node is far higher than
>> the rest.
>>
>> The only other difference this node has to the rest of the cluster is
>> that its on the ext4 filesystem, whereas the rest are ext3, but we've done
>> plenty of testing there and can't see how that would affect performance on
>> this node so much.
>>
>> Nothing of note in system.log.
>>
>> What should our next step be in trying to diagnose this issue?
>>
>> Best wishes,
>> Vic
>>
>> [0] `nodetool tpstats` output:
>>
>> Good node:
>> Pool Name               Active   Pending   Completed   Blocked   All time blocked
>> ReadStage                    0         0    46311521         0                  0
>> RequestResponseStage         0         0    23817366         0                  0
>> MutationStage                0         0    47389269         0                  0
>> ReadRepairStage