Re: Sorting & pagination in apache cassandra 2.1

2016-01-15 Thread Carlos Alonso
Hi Anuja.

Yeah, that's what he means. Before Cassandra 3.0 the modelling advice is to
have one table per query. This may sound weird from a relational
perspective, but the truth is that writes in Cassandra are very cheap, and
it's better to write multiple times and have quick and easy reads than to write
just once and have expensive reads.
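For illustration only (the table and column names below are invented for this sketch, not taken from the thread), a minimal CQL example of the one-table-per-query idea, with the same user data written to two query-specific tables:

-- Hypothetical denormalization: one table per read pattern.
CREATE TABLE users_by_username (
    username text PRIMARY KEY,
    email    text,
    country  text
);

CREATE TABLE users_by_country (
    country  text,
    username text,
    email    text,
    PRIMARY KEY (country, username)
);

-- Every write goes to both tables, so each query stays a cheap
-- single-partition read.
INSERT INTO users_by_username (username, email, country)
VALUES ('jdoe', 'jdoe@example.com', 'ES');

INSERT INTO users_by_country (country, username, email)
VALUES ('ES', 'jdoe', 'jdoe@example.com');

This is essentially what "maintaining your own materialized view tables" means before 3.0: the application performs the extra writes itself.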

Carlos Alonso | Software Engineer | @calonso 

On 15 January 2016 at 05:57, anuja jain  wrote:

> @Jonathan
> what do you mean by "you'll need to maintain your own materialized view
> tables"?
> does it mean we have to create new table for each query?
>
> On Wed, Jan 13, 2016 at 7:40 PM, Narendra Sharma <
> narendra.sha...@gmail.com> wrote:
>
>> In the example you gave, the primary key user_name is the row key. Since
>> the default partitioner is random, you are getting rows in random order.
>>
>> Since each row has no clustering column, there is no further grouping of data.
>> Or in simple terms, each row has one record and is being returned ordered by
>> column name.
>>
>> To see some meaningful ordering there should be some clustering column
>> defined.
>>
>> You can create additional column families to maintain ordering, or
>> use external solutions like Elasticsearch.
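As a rough illustration of the clustering-column point above (names are made up for this sketch, not from the thread), rows within a partition are stored sorted by the clustering column, so the query returns them in order with no client-side sorting:

-- Hypothetical table: rows in one user_name partition are kept sorted
-- by event_time, newest first.
CREATE TABLE user_events (
    user_name  text,
    event_time timestamp,
    event_type text,
    PRIMARY KEY (user_name, event_time)
) WITH CLUSTERING ORDER BY (event_time DESC);

-- Returns one user's events in storage order.
SELECT * FROM user_events WHERE user_name = 'jdoe';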
>> On Jan 12, 2016 10:07 PM, "anuja jain"  wrote:
>>
>>> I understand the meaning of SSTable, but what's the reason behind sorting
>>> the table on the basis of int columns first?
>>> Is there any data type preference in Cassandra?
>>> Also, what is the alternative to creating materialised views if my
>>> Cassandra version is prior to 3.0 (specifically 2.1) and is already
>>> in production?
>>>
>>>
>>> On Wed, Jan 13, 2016 at 12:17 AM, Robert Coli 
>>> wrote:
>>>
 On Mon, Jan 11, 2016 at 11:30 PM, anuja jain 
 wrote:

> 1 more question, what does it mean by "cassandra inherently sorts
> data"?
>

 SSTable = Sorted Strings Table.

 It doesn't contain "Strings" anymore, really, but that's a hint.. :)

 =Rob

>>>
>>>
>


Re: electricity outage problem

2016-01-15 Thread Adil
Hi,
we did a full restart of the cluster but nodetool status is still giving
incoherent info from different nodes: some nodes appear UP from one node but
appear DOWN from another, and as said, the log still shows the
message "received an invalid gossip generation for peer /x.x.x.x".
The Cassandra version is 2.1.2. We want to execute the purge operation as
explained here
https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_gossip_purge.html
but we can't find the peers folder. Should we do it via CQL by deleting the
peers content? Should we do it on all nodes?

thanks
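For reference (not an authoritative procedure; please verify against the docs for your exact version): on 2.1 the persisted ring/gossip information lives in the system.peers table rather than in a folder on disk, and the purge described on that page is usually paired with the -Dcassandra.load_ring_state=false startup flag. A hedged CQL sketch, with a placeholder IP, of inspecting and removing a stale entry on an affected node:

-- Sketch only: inspect the node's locally persisted view of its peers.
SELECT peer, host_id, release_version, schema_version FROM system.peers;

-- Remove a stale entry (placeholder IP) so it is re-learned via gossip
-- after the node is restarted.
DELETE FROM system.peers WHERE peer = '10.0.0.1';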


2016-01-12 17:42 GMT+01:00 Jack Krupansky :

> Sometimes you may have to clear out the saved Gossip state:
>
> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html
>
> Note the instruction about bringing up the seed nodes first. Normally seed
> nodes are only relevant when initially joining a node to a cluster (and
> then the Gossip state will be persisted locally), but if you clear the
> persisted Gossip state the seed nodes will again be needed to find the rest
> of the cluster.
>
> I'm not sure whether a power outage is the same as stopping and restarting
> an instance (AWS) in terms of whether the restarted instance retains its
> current public IP address.
>
>
>
> -- Jack Krupansky
>
> On Tue, Jan 12, 2016 at 10:02 AM, daemeon reiydelle 
> wrote:
>
>> This happens when there is insufficient time for nodes coming up to join
>> a network. It takes a few seconds for a node to come up, e.g. your seed
>> node. If you tell a node to join a cluster you can get this scenario
>> because of high network utilization as well. I wait 90 seconds after the
>> first (i.e. my first seed) node comes up to start the next one. Any nodes
>> that are seeds need some 60 seconds, so the additional 30 seconds is a
>> buffer. Additional nodes each wait 60 seconds before joining (although this
>> is a parallel tree for large clusters).
>>
>>
>>
>>
>>
>>
>> On Tue, Jan 12, 2016 at 6:56 AM, Adil  wrote:
>>
>>> Hi,
>>>
>>> we have two DCs with 5 nodes in each cluster. Yesterday there was an
>>> electricity outage that brought all nodes down. We restarted the clusters, but when
>>> we run nodetool status on DC1 it reports that some nodes are DN, and the
>>> strange thing is that running the command from a different node in DC1 doesn't
>>> report the same nodes as down. We have noticed this message in the log:
>>> "received an invalid gossip generation for peer". Does anyone know how to
>>> resolve this problem? Should we purge the gossip?
>>>
>>> thanks
>>>
>>> Adil
>>>
>>
>>
>


Re: electricity outage problem

2016-01-15 Thread daemeon reiydelle
A node needs a delay of about 60-90 seconds before it can start accepting
connections as a seed node. A seed node also needs time to accept a node
starting up and syncing to other nodes (on 10gbit the max is
only 1 or 2 new nodes; on 1gigabit it can handle at least 3-4 new nodes connecting).
In a large cluster (500 nodes) I see this weird condition where nodetool
status shows overlapping subsets of nodes, and the problem does not go away
even after an hour on a 10 gigabit network.



“Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming ‘Wow! What a Ride!’” - Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Fri, Jan 15, 2016 at 9:17 AM, Adil  wrote:

> Hi,
> we did full restart of the cluster but nodetool status still giving
> incoerent info from different nodes, some nodes appers UP from a node but
> appers DOWN from another, and in the log as is said still having the
> message "received an invalid gossip generation for peer /x.x.x.x"
> cassandra version is 2.1.2, we want to execute the purge operation as
> explained here
> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_gossip_purge.html
> but we don't found the peers folder, should we do it via cql deleting the
> peers content? should we do it for all nodes?
>
> thanks
>
>
> 2016-01-12 17:42 GMT+01:00 Jack Krupansky :
>
>> Sometimes you may have to clear out the saved Gossip state:
>>
>> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html
>>
>> Note the instruction about bringing up the seed nodes first. Normally
>> seed nodes are only relevant when initially joining a node to a cluster
>> (and then the Gossip state will be persisted locally), but if you clear te
>> persisted Gossip state the seed nodes will again be needed to find the rest
>> of the cluster.
>>
>> I'm not sure whether a power outage is the same as stopping and
>> restarting an instance (AWS) in terms of whether the restarted instance
>> retains its current public IP address.
>>
>>
>>
>> -- Jack Krupansky
>>
>> On Tue, Jan 12, 2016 at 10:02 AM, daemeon reiydelle 
>> wrote:
>>
>>> This happens when there is insufficient time for nodes coming up to join
>>> a network. It takes a few seconds for a node to come up, e.g. your seed
>>> node. If you tell a node to join a cluster you can get this scenario
>>> because of high network utilization as well. I wait 90 seconds after the
>>> first (i.e. my first seed) node comes up to start the next one. Any nodes
>>> that are seeds need some 60 seconds, so the additional 30 seconds is a
>>> buffer. Additional nodes each wait 60 seconds before joining (although this
>>> is a parallel tree for large clusters).
>>>
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Jan 12, 2016 at 6:56 AM, Adil  wrote:
>>>
 Hi,

 we have two DC with 5 nodes in each cluster, yesterday there was an
 electricity outage causing all nodes down, we restart the clusters but when
 we run nodetool status on DC1 it results that some nodes are DN, the
 strange thing is that running the command from diffrent node in DC1 doesn't
 give the same node in DC as own, we have noticed this message in the log
 "received an invalid gossip generation for peer", does anyone know how to
 resolve this problem? should we purge the gossip?

 thanks

 Adil

>>>
>>>
>>
>


Re: electricity outage problem

2016-01-15 Thread Adil
our case is not about accepting connections: some nodes receive a gossip
generation number greater than the local one. I looked at the peers and
local tables and can't find where the local one is stored.
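If it helps, I believe (worth double-checking on 2.1.x) the node's own persisted generation is kept in system.local rather than system.peers. A hedged sketch:

-- Sketch: the column name is assumed from memory; confirm with
-- DESCRIBE TABLE system.local before relying on it.
SELECT gossip_generation FROM system.local WHERE key = 'local';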

2016-01-15 17:54 GMT+01:00 daemeon reiydelle :

> Nodes need about 60-90 second delay before it can start accepting
> connections as a seed node. Also a seed node needs time to accept a node
> starting up, and syncing to other nodes (on 10gbit the max new nodes is
> only 1 or 2, on 1gigabit it can handle at least 3-4 new nodes connecting).
> In a large cluster (500 nodes) I see this wierd condition where nodetool
> status shows overlapping subsets of nodes, and the problem does not go away
> after even an hour on a 10 gigabit network).
>
>
>
>
> On Fri, Jan 15, 2016 at 9:17 AM, Adil  wrote:
>
>> Hi,
>> we did full restart of the cluster but nodetool status still giving
>> incoerent info from different nodes, some nodes appers UP from a node but
>> appers DOWN from another, and in the log as is said still having the
>> message "received an invalid gossip generation for peer /x.x.x.x"
>> cassandra version is 2.1.2, we want to execute the purge operation as
>> explained here
>> https://docs.datastax.com/en/cassandra/2.1/cassandra/operations/ops_gossip_purge.html
>> but we don't found the peers folder, should we do it via cql deleting the
>> peers content? should we do it for all nodes?
>>
>> thanks
>>
>>
>> 2016-01-12 17:42 GMT+01:00 Jack Krupansky :
>>
>>> Sometimes you may have to clear out the saved Gossip state:
>>>
>>> https://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_gossip_purge.html
>>>
>>> Note the instruction about bringing up the seed nodes first. Normally
>>> seed nodes are only relevant when initially joining a node to a cluster
>>> (and then the Gossip state will be persisted locally), but if you clear te
>>> persisted Gossip state the seed nodes will again be needed to find the rest
>>> of the cluster.
>>>
>>> I'm not sure whether a power outage is the same as stopping and
>>> restarting an instance (AWS) in terms of whether the restarted instance
>>> retains its current public IP address.
>>>
>>>
>>>
>>> -- Jack Krupansky
>>>
>>> On Tue, Jan 12, 2016 at 10:02 AM, daemeon reiydelle 
>>> wrote:
>>>
 This happens when there is insufficient time for nodes coming up to
 join a network. It takes a few seconds for a node to come up, e.g. your
 seed node. If you tell a node to join a cluster you can get this scenario
 because of high network utilization as well. I wait 90 seconds after the
 first (i.e. my first seed) node comes up to start the next one. Any nodes
 that are seeds need some 60 seconds, so the additional 30 seconds is a
 buffer. Additional nodes each wait 60 seconds before joining (although this
 is a parallel tree for large clusters).






 On Tue, Jan 12, 2016 at 6:56 AM, Adil  wrote:

> Hi,
>
> we have two DC with 5 nodes in each cluster, yesterday there was an
> electricity outage causing all nodes down, we restart the clusters but 
> when
> we run nodetool status on DC1 it results that some nodes are DN, the
> strange thing is that running the command from diffrent node in DC1 
> doesn't
> give the same node in DC as own, we have noticed this message in the log
> "received an invalid gossip generation for peer", does anyone know how to
> resolve this problem? should we purge the gossip?
>
> thanks
>
> Adil
>


>>>
>>
>


Re: Cassandra 3.1.1 with respect to HeapSpace

2016-01-15 Thread Jean Tremblay
Thank you Sebastián!

On 15 Jan 2016, at 19:09 , Sebastian Estevez 
> wrote:

The recommended (and default when available) heap size for Cassandra is 8GB and 
for New size it's 100mb per core.

Your milage may vary based on workload, hardware etc.

There are also some alternative JVM tuning schools of thought. See 
cassandra-8150 (large heap) and CASSANDRA-7486 (G1GC).



All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 15, 2016 at 4:00 AM, Jean Tremblay 
> 
wrote:
Thank you Sebastián for your useful advice. I managed restarting the nodes, but 
I needed to delete all the commit logs, not only the last one specified. 
Nevertheless I’m back in business.

Would there be a better memory configuration to select for my nodes in a C* 3 
cluster? Currently I use MAX_HEAP_SIZE=“6G" HEAP_NEWSIZE=“496M” for a 16M RAM 
node.

Thanks for your help.

Jean

On 15 Jan 2016, at 24:24 , Sebastian Estevez 
> wrote:

Try starting the other nodes. You may have to delete or mv the commitlog 
segment referenced in the error message for the node to come up since 
apparently it is corrupted.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Jan 14, 2016 at 1:00 PM, Jean Tremblay 
> 
wrote:
How can I restart?
It blocks with the error listed below.
Are my memory settings good for my configuration?

On 14 Jan 2016, at 18:30, Jake Luciani 
> wrote:

Yes you can restart without data loss.

Can you please include info about how much data you have loaded per node and 
perhaps what your schema looks like?

Thanks

On Thu, Jan 14, 2016 at 12:24 PM, Jean Tremblay 
> 
wrote:

Ok, I will open a ticket.

How could I restart my cluster without losing everything?
Would there be a better memory configuration to select for my nodes? Currently 
I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE="496M" for a 16GB RAM node.

Thanks

Jean

On 14 Jan 2016, at 18:19, Tyler Hobbs 
> wrote:

I don't think that's a known issue.  Can you open a ticket at 
https://issues.apache.org/jira/browse/CASSANDRA and attach your schema along 
with the commitlog files and the mutation that was saved to /tmp?

On Thu, Jan 14, 2016 at 10:56 AM, Jean Tremblay 
> 
wrote:
Hi,

I 

Re: CQL Composite Key Seen After Table Creation

2016-01-15 Thread Chris Burroughs

On 01/06/2016 04:47 PM, Robert Coli wrote:

On Wed, Jan 6, 2016 at 12:54 PM, Chris Burroughs 
wrote:

The problem with that approach is that manually editing the local schema
tables in a live cluster is wildly dangerous. I *think* this would work:


  * Make triple sure no schema changes are happening on the cluster.

  * Update schema tables on each node --> drain --> restart


I think that would work too, and probably be lower risk than modifying on
one and trying to get the others to pull via resetlocalschema. But I agree
it seems "wildly dangerous".


We did this, and a day later it appears successful.

I am still fuzzy on how schema "changes" propagate when you edit the 
schema tables directly and am unsure if the drain/restart rain dance was 
strictly necessary, but it felt safer. (Obviously even if I was sure 
now, that would not be behavior to count on, and I hope not to need to 
do this again.)




Re: Cassandra 3.1.1 with respect to HeapSpace

2016-01-15 Thread Sebastian Estevez
The recommended (and default when available) heap size for Cassandra is 8GB,
and for new size it's 100MB per core.

Your mileage may vary based on workload, hardware, etc.

There are also some alternative JVM tuning schools of thought. See
CASSANDRA-8150 (large heap) and CASSANDRA-7486 (G1GC).



All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 15, 2016 at 4:00 AM, Jean Tremblay <
jean.tremb...@zen-innovations.com> wrote:

> Thank you Sebastián for your useful advice. I managed restarting the
> nodes, but I needed to delete all the commit logs, not only the last one
> specified. Nevertheless I’m back in business.
>
> Would there be a better memory configuration to select for my nodes in a
> C* 3 cluster? Currently I use MAX_HEAP_SIZE=“6G" HEAP_NEWSIZE=“496M” for
> a 16M RAM node.
>
> Thanks for your help.
>
> Jean
>
> On 15 Jan 2016, at 24:24 , Sebastian Estevez <
> sebastian.este...@datastax.com> wrote:
>
>
> Try starting the other nodes. You may have to delete or mv the commitlog
> segment referenced in the error message for the node to come up since
> apparently it is corrupted.
>
> All the best,
>
> Sebastián Estévez
> Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com
>
> On Thu, Jan 14, 2016 at 1:00 PM, Jean Tremblay <
> jean.tremb...@zen-innovations.com> wrote:
>
>> How can I restart?
>> It blocks with the error listed below.
>> Are my memory settings good for my configuration?
>>
>> On 14 Jan 2016, at 18:30, Jake Luciani  wrote:
>>
>> Yes you can restart without data loss.
>>
>> Can you please include info about how much data you have loaded per node
>> and perhaps what your schema looks like?
>>
>> Thanks
>>
>> On Thu, Jan 14, 2016 at 12:24 PM, Jean Tremblay <
>> jean.tremb...@zen-innovations.com> wrote:
>>
>>>
>>> Ok, I will open a ticket.
>>>
>>> How could I restart my cluster without loosing everything ?
>>> Would there be a better memory configuration to select for my nodes?
>>> Currently I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE=“496M” for a 16M RAM node.
>>>
>>> Thanks
>>>
>>> Jean
>>>
>>> On 14 Jan 2016, at 18:19, Tyler Hobbs  wrote:
>>>
>>> I don't think that's a known issue.  Can you open a ticket at
>>> https://issues.apache.org/jira/browse/CASSANDRA and attach your schema
>>> along with the commitlog files and the mutation that was saved to /tmp?
>>>
>>> On Thu, Jan 14, 2016 at 10:56 AM, Jean Tremblay <
>>> jean.tremb...@zen-innovations.com> wrote:
>>>
 Hi,

I have a small Cassandra Cluster with 5 nodes, having 16GB of RAM.
 I use Cassandra 3.1.1.
 I use the following setup for the memory:
   MAX_HEAP_SIZE="6G"
 HEAP_NEWSIZE="496M"

 I have been loading a lot of data in this cluster over the last 24
hours. The system behaved, I think, very nicely. It was loading very fast,
and giving excellent read times. There were no error messages until this one:


 ERROR [SharedPool-Worker-35] 2016-01-14 17:05:23,602
 JVMStabilityInspector.java:139 - JVM state determined to be unstable.
 Exiting forcefully due to:
 java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.8.0_65]
 at 

Impact of Changing Compaction Strategy

2016-01-15 Thread Anuj Wadehra
Hi,
I need to understand whether all existing SSTables are recreated/updated when 
we change the compaction strategy from STCS to DTCS.

SSTables are immutable by design, but do we make an exception for such cases and 
update the same files when an ALTER statement is fired to change the compaction 
strategy?

Thanks
Anuj
Sent from Yahoo Mail on Android

Re: Impact of Changing Compaction Strategy

2016-01-15 Thread Jeff Jirsa
When you change compaction strategy, nothing happens until the next flush. On 
the next flush, the new compaction strategy will decide what to do – if you 
change from STCS to DTCS, it will look at various timestamps of files, and 
attempt to group them by time windows based on the sstable’s minTimestamp, and 
your DTCS base_time_seconds and an ever-growing multiple of min_threshold.

Generally, many people recommend doing a STCS major before changing to DTCS 
(essentially to force all sstables into the oldest possible bucket). Whether or 
not that’s ideal for you depends on why you’re using DTCS (do you want to cut 
down on compaction, or are you setting yourself up for efficient TTL 
expiration). If it’s the latter, you should be sure you understand the impact 
of STCS major on your TTL use case (no data will TTL out until all of the data 
currently on disk is ready to expire).
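For completeness, the change itself is only a table-metadata update; a minimal CQL sketch (keyspace, table, and option values are placeholders, not taken from this thread):

-- Hypothetical example: switch an existing table from STCS to DTCS.
-- Existing SSTable files are not rewritten by this statement; they are only
-- regrouped by future flushes and compactions, as described above.
ALTER TABLE my_keyspace.my_table
WITH compaction = {
    'class': 'DateTieredCompactionStrategy',
    'base_time_seconds': '3600',
    'min_threshold': '4'
};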



From:  Anuj Wadehra
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, January 15, 2016 at 10:16 AM
To:  "user@cassandra.apache.org"
Subject:  Impact of Changing Compaction Strategy

Hi, 

I need to understand whether all existing sstables are recreated/updated when 
we change compaction strategy from STCS to DTCS?


Sstables are immutable by design but do we take an exception for such cases and 
update same files when an Alter statement is fired to change the compaction 
strategy?


Thanks
Anuj

Sent from Yahoo Mail on Android





compaction throughput

2016-01-15 Thread Kai Wang
Hi,

I am trying to figure out the bottleneck of compaction on my node. The node
is CentOS 7 and has SSDs installed. The table is configured to use LCS.
Here is my compaction related configs in cassandra.yaml:

compaction_throughput_mb_per_sec: 160
concurrent_compactors: 4

I insert about 10G of data and start observing compaction.

*nodetool compaction* shows most of the time there is one compaction. Sometimes
there are 3-4 (I suppose this is controlled by concurrent_compactors).
During the compaction, I see one CPU core is 100%. At that point, disk IO
is about 20-25 M/s write which is much lower than the disk is capable of.
Even when there are 4 compactions running, I see CPU go to +400% but disk
IO is still at 20-25M/s write. I use *nodetool setcompactionthroughput 0*
to disable the compaction throttle but don't see any difference.

Does this mean compaction is CPU bound? If so, 20M/s is kinda low. Is there
any way to improve the throughput?

Thanks.


Re: compaction throughput

2016-01-15 Thread Kai Wang
I forgot to mention I am using C* 2.2.4.
On Jan 15, 2016 3:53 PM, "Kai Wang"  wrote:

> Hi,
>
> I am trying to figure out the bottleneck of compaction on my node. The
> node is CentOS 7 and has SSDs installed. The table is configured to use
> LCS. Here is my compaction related configs in cassandra.yaml:
>
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
>
> I insert about 10G of data and start observing compaction.
>
> *nodetool compaction* shows most of time there is one compaction.
> Sometimes there are 3-4 (I suppose this is controlled by
> concurrent_compactors). During the compaction, I see one CPU core is 100%.
> At that point, disk IO is about 20-25 M/s write which is much lower than
> the disk is capable of. Even when there are 4 compactions running, I see
> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
> setcompactionthroughput 0* to disable the compaction throttle but don't
> see any difference.
>
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there
> anyway to improve the throughput?
>
> Thanks.
>


Re: compaction throughput

2016-01-15 Thread Jeff Ferland
Compaction is generally CPU bound and relatively slow. Exactly why that is I’m 
uncertain.

> On Jan 15, 2016, at 12:53 PM, Kai Wang  wrote:
> 
> Hi,
> 
> I am trying to figure out the bottleneck of compaction on my node. The node 
> is CentOS 7 and has SSDs installed. The table is configured to use LCS. Here 
> is my compaction related configs in cassandra.yaml:
> 
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
> 
> I insert about 10G of data and start observing compaction.
> 
> nodetool compaction shows most of time there is one compaction. Sometimes 
> there are 3-4 (I suppose this is controlled by concurrent_compactors). During 
> the compaction, I see one CPU core is 100%. At that point, disk IO is about 
> 20-25 M/s write which is much lower than the disk is capable of. Even when 
> there are 4 compactions running, I see CPU go to +400% but disk IO is still 
> at 20-25M/s write. I use nodetool setcompactionthroughput 0 to disable the 
> compaction throttle but don't see any difference.
> 
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there 
> anyway to improve the throughput?
> 
> Thanks.



Re: compaction throughput

2016-01-15 Thread Sebastian Estevez
 *nodetool setcompactionthroughput 0*

Will only affect future compactions, not the ones that are currently
running.

All the best,


Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Fri, Jan 15, 2016 at 4:40 PM, Jeff Ferland  wrote:

> Compaction is generally CPU bound and relatively slow. Exactly why that is
> I’m uncertain.
>
> On Jan 15, 2016, at 12:53 PM, Kai Wang  wrote:
>
> Hi,
>
> I am trying to figure out the bottleneck of compaction on my node. The
> node is CentOS 7 and has SSDs installed. The table is configured to use
> LCS. Here is my compaction related configs in cassandra.yaml:
>
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
>
> I insert about 10G of data and start observing compaction.
>
> *nodetool compaction* shows most of time there is one compaction.
> Sometimes there are 3-4 (I suppose this is controlled by
> concurrent_compactors). During the compaction, I see one CPU core is 100%.
> At that point, disk IO is about 20-25 M/s write which is much lower than
> the disk is capable of. Even when there are 4 compactions running, I see
> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
> setcompactionthroughput 0* to disable the compaction throttle but don't
> see any difference.
>
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there
> anyway to improve the throughput?
>
> Thanks.
>
>
>


Re: compaction throughput

2016-01-15 Thread Jeff Jirsa
With SSDs, the typical recommendation is up to 0.8-1 compactor per core 
(depending on other load).  How many CPU cores do you have?


From:  Kai Wang
Reply-To:  "user@cassandra.apache.org"
Date:  Friday, January 15, 2016 at 12:53 PM
To:  "user@cassandra.apache.org"
Subject:  compaction throughput

Hi,

I am trying to figure out the bottleneck of compaction on my node. The node is 
CentOS 7 and has SSDs installed. The table is configured to use LCS. Here is my 
compaction related configs in cassandra.yaml:

compaction_throughput_mb_per_sec: 160
concurrent_compactors: 4

I insert about 10G of data and start observing compaction.

nodetool compaction shows most of time there is one compaction. Sometimes there 
are 3-4 (I suppose this is controlled by concurrent_compactors). During the 
compaction, I see one CPU core is 100%. At that point, disk IO is about 20-25 
M/s write which is much lower than the disk is capable of. Even when there are 
4 compactions running, I see CPU go to +400% but disk IO is still at 20-25M/s 
write. I use nodetool setcompactionthroughput 0 to disable the compaction 
throttle but don't see any difference.

Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there 
anyway to improve the throughput?

Thanks.





Re: compaction throughput

2016-01-15 Thread Kai Wang
Jeff & Sebastian,

Thanks for the reply. There are 12 cores but in my case C* only uses one
core most of the time. *nodetool compactionstats* shows there's only one
compactor running. I can see the C* process only uses one core. So I guess I
should've asked the question more clearly:

1. Is ~25 M/s a reasonable compaction throughput for one core?
2. Is there any configuration that affects single core compaction
throughput?
3. Is concurrent_compactors the only option to parallelize compaction? If
so, I guess it's the compaction strategy itself that decides when to
parallelize and when to block on one core. Then there's not much we can do
here.

Thanks.

On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
wrote:

> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
> (depending on other load).  How many CPU cores do you have?
>
>
> From: Kai Wang
> Reply-To: "user@cassandra.apache.org"
> Date: Friday, January 15, 2016 at 12:53 PM
> To: "user@cassandra.apache.org"
> Subject: compaction throughput
>
> Hi,
>
> I am trying to figure out the bottleneck of compaction on my node. The
> node is CentOS 7 and has SSDs installed. The table is configured to use
> LCS. Here is my compaction related configs in cassandra.yaml:
>
> compaction_throughput_mb_per_sec: 160
> concurrent_compactors: 4
>
> I insert about 10G of data and start observing compaction.
>
> *nodetool compaction* shows most of time there is one compaction.
> Sometimes there are 3-4 (I suppose this is controlled by
> concurrent_compactors). During the compaction, I see one CPU core is 100%.
> At that point, disk IO is about 20-25 M/s write which is much lower than
> the disk is capable of. Even when there are 4 compactions running, I see
> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
> setcompactionthroughput 0* to disable the compaction throttle but don't
> see any difference.
>
> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is there
> anyway to improve the throughput?
>
> Thanks.
>


Re: compaction throughput

2016-01-15 Thread Sebastian Estevez
Correct.

Why are you concerned with the raw throughput? Are you accumulating pending
compactions? Are you seeing high sstables-per-read statistics?

all the best,

Sebastián
On Jan 15, 2016 6:18 PM, "Kai Wang"  wrote:

> Jeff & Sebastian,
>
> Thanks for the reply. There are 12 cores but in my case C* only uses one
> core most of the time. *nodetool compactionstats* shows there's only one
> compactor running. I can see C* process only uses one core. So I guess I
> should've asked the question more clearly:
>
> 1. Is ~25 M/s a reasonable compaction throughput for one core?
> 2. Is there any configuration that affects single core compaction
> throughput?
> 3. Is concurrent_compactors the only option to parallelize compaction? If
> so, I guess it's the compaction strategy itself that decides when to
> parallelize and when to block on one core. Then there's not much we can do
> here.
>
> Thanks.
>
> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
> wrote:
>
>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>> (depending on other load).  How many CPU cores do you have?
>>
>>
>> From: Kai Wang
>> Reply-To: "user@cassandra.apache.org"
>> Date: Friday, January 15, 2016 at 12:53 PM
>> To: "user@cassandra.apache.org"
>> Subject: compaction throughput
>>
>> Hi,
>>
>> I am trying to figure out the bottleneck of compaction on my node. The
>> node is CentOS 7 and has SSDs installed. The table is configured to use
>> LCS. Here is my compaction related configs in cassandra.yaml:
>>
>> compaction_throughput_mb_per_sec: 160
>> concurrent_compactors: 4
>>
>> I insert about 10G of data and start observing compaction.
>>
>> *nodetool compaction* shows most of time there is one compaction.
>> Sometimes there are 3-4 (I suppose this is controlled by
>> concurrent_compactors). During the compaction, I see one CPU core is 100%.
>> At that point, disk IO is about 20-25 M/s write which is much lower than
>> the disk is capable of. Even when there are 4 compactions running, I see
>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
>> setcompactionthroughput 0* to disable the compaction throttle but don't
>> see any difference.
>>
>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
>> there anyway to improve the throughput?
>>
>> Thanks.
>>
>
>


Re: compaction throughput

2016-01-15 Thread Sebastian Estevez
LCS is IO intensive, but CPU is also relevant.

On slower disks compaction may not be cpu bound.

If you aren't seeing more than one compaction thread at a time, I suspect
your system is not compaction bound.

all the best,

Sebastián
On Jan 15, 2016 7:20 PM, "Kai Wang"  wrote:

> Sebastian,
>
> Because I have this impression that LCS is IO intensive and it's
> recommended only on SSDs. So I am curious to see how far it can stress
> those SSDs. But it turns out the most expensive part about LCS is not IO
> bound but CUP bound, or more precisely single core speed bound. This is a
> little surprising.
>
> Of course LCS is still superior in other aspects.
> On Jan 15, 2016 6:34 PM, "Sebastian Estevez" <
> sebastian.este...@datastax.com> wrote:
>
>> Correct.
>>
>> Why are you concerned with the raw throughput, are you accumulating
>> pending compactions? Are you seeing high sstables per read statistics?
>>
>> all the best,
>>
>> Sebastián
>> On Jan 15, 2016 6:18 PM, "Kai Wang"  wrote:
>>
>>> Jeff & Sebastian,
>>>
>>> Thanks for the reply. There are 12 cores but in my case C* only uses one
>>> core most of the time. *nodetool compactionstats* shows there's only
>>> one compactor running. I can see C* process only uses one core. So I guess
>>> I should've asked the question more clearly:
>>>
>>> 1. Is ~25 M/s a reasonable compaction throughput for one core?
>>> 2. Is there any configuration that affects single core compaction
>>> throughput?
>>> 3. Is concurrent_compactors the only option to parallelize compaction?
>>> If so, I guess it's the compaction strategy itself that decides when to
>>> parallelize and when to block on one core. Then there's not much we can do
>>> here.
>>>
>>> Thanks.
>>>
>>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
>>> wrote:
>>>
 With SSDs, the typical recommendation is up to 0.8-1 compactor per core
 (depending on other load).  How many CPU cores do you have?


 From: Kai Wang
 Reply-To: "user@cassandra.apache.org"
 Date: Friday, January 15, 2016 at 12:53 PM
 To: "user@cassandra.apache.org"
 Subject: compaction throughput

 Hi,

 I am trying to figure out the bottleneck of compaction on my node. The
 node is CentOS 7 and has SSDs installed. The table is configured to use
 LCS. Here is my compaction related configs in cassandra.yaml:

 compaction_throughput_mb_per_sec: 160
 concurrent_compactors: 4

 I insert about 10G of data and start observing compaction.

 *nodetool compaction* shows most of time there is one compaction.
 Sometimes there are 3-4 (I suppose this is controlled by
 concurrent_compactors). During the compaction, I see one CPU core is 100%.
 At that point, disk IO is about 20-25 M/s write which is much lower than
 the disk is capable of. Even when there are 4 compactions running, I see
 CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
 setcompactionthroughput 0* to disable the compaction throttle but
 don't see any difference.

 Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
 there anyway to improve the throughput?

 Thanks.

>>>
>>>


Re: compaction throughput

2016-01-15 Thread Kai Wang
Sebastian,

Because I have this impression that LCS is IO intensive and it's
recommended only on SSDs. So I am curious to see how far it can stress
those SSDs. But it turns out the most expensive part about LCS is not IO
bound but CPU bound, or more precisely single-core speed bound. This is a
little surprising.

Of course LCS is still superior in other aspects.
On Jan 15, 2016 6:34 PM, "Sebastian Estevez" 
wrote:

> Correct.
>
> Why are you concerned with the raw throughput, are you accumulating
> pending compactions? Are you seeing high sstables per read statistics?
>
> all the best,
>
> Sebastián
> On Jan 15, 2016 6:18 PM, "Kai Wang"  wrote:
>
>> Jeff & Sebastian,
>>
>> Thanks for the reply. There are 12 cores but in my case C* only uses one
>> core most of the time. *nodetool compactionstats* shows there's only one
>> compactor running. I can see C* process only uses one core. So I guess I
>> should've asked the question more clearly:
>>
>> 1. Is ~25 M/s a reasonable compaction throughput for one core?
>> 2. Is there any configuration that affects single core compaction
>> throughput?
>> 3. Is concurrent_compactors the only option to parallelize compaction? If
>> so, I guess it's the compaction strategy itself that decides when to
>> parallelize and when to block on one core. Then there's not much we can do
>> here.
>>
>> Thanks.
>>
>> On Fri, Jan 15, 2016 at 5:23 PM, Jeff Jirsa 
>> wrote:
>>
>>> With SSDs, the typical recommendation is up to 0.8-1 compactor per core
>>> (depending on other load).  How many CPU cores do you have?
>>>
>>>
>>> From: Kai Wang
>>> Reply-To: "user@cassandra.apache.org"
>>> Date: Friday, January 15, 2016 at 12:53 PM
>>> To: "user@cassandra.apache.org"
>>> Subject: compaction throughput
>>>
>>> Hi,
>>>
>>> I am trying to figure out the bottleneck of compaction on my node. The
>>> node is CentOS 7 and has SSDs installed. The table is configured to use
>>> LCS. Here is my compaction related configs in cassandra.yaml:
>>>
>>> compaction_throughput_mb_per_sec: 160
>>> concurrent_compactors: 4
>>>
>>> I insert about 10G of data and start observing compaction.
>>>
>>> *nodetool compaction* shows most of time there is one compaction.
>>> Sometimes there are 3-4 (I suppose this is controlled by
>>> concurrent_compactors). During the compaction, I see one CPU core is 100%.
>>> At that point, disk IO is about 20-25 M/s write which is much lower than
>>> the disk is capable of. Even when there are 4 compactions running, I see
>>> CPU go to +400% but disk IO is still at 20-25M/s write. I use *nodetool
>>> setcompactionthroughput 0* to disable the compaction throttle but don't
>>> see any difference.
>>>
>>> Does this mean compaction is CPU bound? If so 20M/s is kinda low. Is
>>> there anyway to improve the throughput?
>>>
>>> Thanks.
>>>
>>
>>


Re: Cassandra 3.1.1 with respect to HeapSpace

2016-01-15 Thread Jean Tremblay
Thank you Sebastián for your useful advice. I managed to restart the nodes, but 
I needed to delete all the commit logs, not only the last one specified. 
Nevertheless I’m back in business.

Would there be a better memory configuration to select for my nodes in a C* 3 
cluster? Currently I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE="496M" for a 16GB RAM 
node.

Thanks for your help.

Jean

On 15 Jan 2016, at 24:24 , Sebastian Estevez 
> wrote:

Try starting the other nodes. You may have to delete or mv the commitlog 
segment referenced in the error message for the node to come up since 
apparently it is corrupted.

All the best,

Sebastián Estévez
Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

On Thu, Jan 14, 2016 at 1:00 PM, Jean Tremblay 
> 
wrote:
How can I restart?
It blocks with the error listed below.
Are my memory settings good for my configuration?

On 14 Jan 2016, at 18:30, Jake Luciani 
> wrote:

Yes you can restart without data loss.

Can you please include info about how much data you have loaded per node and 
perhaps what your schema looks like?

Thanks

On Thu, Jan 14, 2016 at 12:24 PM, Jean Tremblay 
> 
wrote:

Ok, I will open a ticket.

How could I restart my cluster without loosing everything ?
Would there be a better memory configuration to select for my nodes? Currently 
I use MAX_HEAP_SIZE="6G" HEAP_NEWSIZE=“496M” for a 16M RAM node.

Thanks

Jean

On 14 Jan 2016, at 18:19, Tyler Hobbs 
> wrote:

I don't think that's a known issue.  Can you open a ticket at 
https://issues.apache.org/jira/browse/CASSANDRA and attach your schema along 
with the commitlog files and the mutation that was saved to /tmp?

On Thu, Jan 14, 2016 at 10:56 AM, Jean Tremblay 
> 
wrote:
Hi,

I have a small Cassandra Cluster with 5 nodes, having 16MB of RAM.
I use Cassandra 3.1.1.
I use the following setup for the memory:
  MAX_HEAP_SIZE="6G"
HEAP_NEWSIZE="496M"

I have been loading a lot of data in this cluster over the last 24 hours. The 
system behaved I think very nicely. It was loading very fast, and giving 
excellent read time. There was no error messages until this one:


ERROR [SharedPool-Worker-35] 2016-01-14 17:05:23,602 
JVMStabilityInspector.java:139 - JVM state determined to be unstable.  Exiting 
forcefully due to:
java.lang.OutOfMemoryError: Java heap space
at java.nio.HeapByteBuffer.<init>(HeapByteBuffer.java:57) ~[na:1.8.0_65]
at java.nio.ByteBuffer.allocate(ByteBuffer.java:335) ~[na:1.8.0_65]
at 
org.apache.cassandra.io.util.DataOutputBuffer.reallocate(DataOutputBuffer.java:126)
 ~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.io.util.DataOutputBuffer.doFlush(DataOutputBuffer.java:86) 
~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:132)
 ~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.io.util.BufferedDataOutputStreamPlus.write(BufferedDataOutputStreamPlus.java:151)
 ~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.utils.ByteBufferUtil.writeWithVIntLength(ByteBufferUtil.java:297)
 ~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.db.marshal.AbstractType.writeValue(AbstractType.java:374) 
~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.db.rows.BufferCell$Serializer.serialize(BufferCell.java:263)
 ~[apache-cassandra-3.1.1.jar:3.1.1]
at 
org.apache.cassandra.db.rows.UnfilteredSerializer.serialize(UnfilteredSerializer.java:183)
 ~[apache-cassandra-3.1.1.jar:3.1.1]
at