Re: Understanding C* -XX:MaxTenuringThreshold=1

2015-10-26 Thread John Schulz
You are correct. That is the intent of the MaxTenuringThreshold parameter.
Setting the value > 1 keeps objects in survivor space longer, if there is
space to keep them. The intent of keeping objects in survivor space for
longer than one ParNew cycle is to prevent relatively short-lived objects
from being promoted to old gen space; if they are just going to be
eliminated in the next CMS collection, it's better to keep them in survivor
space.
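
For illustration only, a hypothetical cassandra-env.sh tweak showing the
relevant knobs (the values are assumptions, not a recommendation; measure
your own workload before changing anything):

    # Let objects survive up to 4 ParNew cycles before promotion (illustrative)
    JVM_OPTS="$JVM_OPTS -XX:MaxTenuringThreshold=4"
    # Log per-age survivor occupancy to see how long objects actually live
    JVM_OPTS="$JVM_OPTS -XX:+PrintTenuringDistribution"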

On Mon, Oct 26, 2015 at 12:18 AM, qihuang.zheng <
qihuang.zh...@fraudmetrix.cn> wrote:

> As Cassandra's default is -XX:MaxTenuringThreshold=1, which means:
> on the first YGC, Eden's live objects are copied to Survivor (S0), and those
> survivor objects' age counter = 1.
> Then on the next YGC, live objects from Eden and S0 are copied to S1, and the
> objects still live from S0 have age counter = 2, which is larger than
> MaxTenuringThreshold=1, so they are promoted to the Old Gen.
>
> From this, I get these conclusions:
> 1. After a YGC, live Eden objects are copied to survivor (age=1); objects
> from survivor (age=2) are first copied to the other survivor and then
> transferred to the old gen.
> 2. The Old Gen's size increase after this YGC will not be larger than the
> size of the latest "from" survivor.
> 3. Survivor objects' age counter is always 1, as those with counter=2 are
> promoted to Old and then disappear from survivor.
>
> Please tell me if this is right.
> TKS.
>
> qihuang.zheng
>



-- 

John H. Schulz

Principal Consultant

Pythian - Love your data


sch...@pythian.com |  Linkedin www.linkedin.com/pub/john-schulz/13/ab2/930/

Mobile: 248-376-3380

www.pythian.com


Find partition row of Compacted partition maximum bytes

2015-10-26 Thread qihuang.zheng
I use nodetool cfstats to see the table's status, and find Compacted partition
maximum bytes: 190G.
Is there any way to find this largest wide partition row?
[qihuang.zheng@cass047202 cassandra]$ nodetool cfstats forseti.velocity
Keyspace: forseti
Read Count: 10470099
Read Latency: 1.3186399419909973 ms.
Write Count: 146970362
Write Latency: 0.06062576270989929 ms.
Pending Tasks: 0
Table: velocity
SSTable count: 2144
SSTables in each level: [1, 10, 96, 723, 1314, 0, 0, 0, 0]
Space used (live), bytes: 509031385679
Space used (total), bytes: 523815500936
Off heap memory used (total), bytes: 558210701
SSTable Compression Ratio: 0.23635049381008288
Number of keys (estimate): 269787648
Memtable cell count: 271431
Memtable data size, bytes: 141953019
Memtable switch count: 1713
Local read count: 10470099
Local read latency: 1.266 ms
Local write count: 146970371
Local write latency: 0.053 ms
Pending tasks: 0
Bloom filter false positives: 534721
Bloom filter false ratio: 0.13542
Bloom filter space used, bytes: 180529808
Bloom filter off heap memory used, bytes: 180512656
Index summary off heap memory used, bytes: 118613037
Compression metadata off heap memory used, bytes: 259085008
Compacted partition minimum bytes: 104
Compacted partition maximum bytes: 190420296972
Compacted partition mean bytes: 8656
Average live cells per slice (last five minutes): 0.0
Average tombstones per slice (last five minutes): 0.0
qihuang.zheng

Re: Find partition row of Compacted partition maximum bytes

2015-10-26 Thread DuyHai Doan
From C* 2.2.x

> nodetool help toppartitions

NAME
nodetool toppartitions - Sample and print the most active
partitions for
a given column family
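
For example, to sample the forseti.velocity table from the thread below for 10
seconds (the duration argument is in milliseconds; a sketch, so check nodetool
help toppartitions on your exact 2.2.x build):

    nodetool toppartitions forseti velocity 10000

Note this reports the most *active* partitions during the sampling window,
which is not necessarily the largest partition on disk.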



On Mon, Oct 26, 2015 at 7:54 AM, qihuang.zheng  wrote:

> I use nodetool cfstats to see the table's status, and find Compacted
> partition maximum bytes: 190G.
>
> Is there any way to find this largest wide partition row?
>
> [qihuang.zheng@cass047202 cassandra]$ nodetool cfstats forseti.velocity
> Keyspace: forseti
> Read Count: 10470099
> Read Latency: 1.3186399419909973 ms.
> Write Count: 146970362
> Write Latency: 0.06062576270989929 ms.
> Pending Tasks: 0
> Table: velocity
> SSTable count: 2144
> SSTables in each level: [1, 10, 96, 723, 1314, 0, 0, 0, 0]
> Space used (live), bytes: 509031385679
> Space used (total), bytes: 523815500936
> Off heap memory used (total), bytes: 558210701
> SSTable Compression Ratio: 0.23635049381008288
> Number of keys (estimate): 269787648
> Memtable cell count: 271431
> Memtable data size, bytes: 141953019
> Memtable switch count: 1713
> Local read count: 10470099
> Local read latency: 1.266 ms
> Local write count: 146970371
> Local write latency: 0.053 ms
> Pending tasks: 0
> Bloom filter false positives: 534721
> Bloom filter false ratio: 0.13542
> Bloom filter space used, bytes: 180529808
> Bloom filter off heap memory used, bytes: 180512656
> Index summary off heap memory used, bytes: 118613037
> Compression metadata off heap memory used, bytes: 259085008
> Compacted partition minimum bytes: 104
> Compacted partition maximum bytes: 190420296972
> Compacted partition mean bytes: 8656
> Average live cells per slice (last five minutes): 0.0
> Average tombstones per slice (last five minutes): 0.0
>
> qihuang.zheng
>


Re: Can consistency-levels be different for "read" and "write" in Datastax Java-Driver?

2015-10-26 Thread daemeon reiydelle
If one rethinks "consistency" to mean "copies returned" (on read) and "copies
written", then one can have different values for the former (set via the
DataStax driver) and the latter (enforced within Cassandra). The latter
changes eventual consistency (e.g. two copies must be written); the former can
speed up a result at the (slight) risk of stale data. I have no experience
with the former and just recall it from somewhere in the documentation;
n-copy eventual consistency is fine for all of my work.



"Life should not be a journey to the grave with the intention of arriving
safely in a pretty and well preserved body, but rather to skid in broadside
in a cloud of smoke, thoroughly used up, totally worn out, and loudly
proclaiming 'Wow! What a Ride!'" - Hunter Thompson

Daemeon C.M. Reiydelle
USA (+1) 415.501.0198
London (+44) (0) 20 8144 9872

On Mon, Oct 26, 2015 at 11:52 AM, Jonathan Haddad  wrote:

> What's your query?  Do you have IF NOT EXISTS in there?
>
> On Mon, Oct 26, 2015 at 11:17 AM Ajay Garg  wrote:
>
>> Right now, I have set up "LOCAL QUORUM" as the consistency level in the
>> driver, but it seems that "SERIAL" is being used during writes, and I
>> consistently get this error of type ::
>>
>> Cassandra timeout during write query at consistency SERIAL (3 replica
>> were required but only 0 acknowledged the write)
>>
>>
>> Am I missing something?
>>
>>
>>
>> --
>> Regards,
>> Ajay
>>
>


Re: how to grant permissions to OpsCenter keyspace?

2015-10-26 Thread Adam Holmberg
You need to quote the "OpsCenter" identifier to distinguish capital letters:
https://cassandra.apache.org/doc/cql3/CQL.html#identifiers
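
A minimal sketch (the user name opscenter_agent is a placeholder; grant to
whatever user OpsCenter authenticates as):

    cqlsh> USE "OpsCenter";
    cqlsh> GRANT ALL PERMISSIONS ON KEYSPACE "OpsCenter" TO opscenter_agent;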

Adam

On Mon, Oct 26, 2015 at 4:25 PM, Kai Wang  wrote:

> Hi,
>
> My understanding is that if I want to enable internal authentication and
> authorization on C* while still keeping OpsCenter working, I should grant
> all to OpsCenter space and describe/select on everything else. But when I
> try to grant permissions to or even switch into OpsCenter, cqlsh reports
> this:
>
> cassandra@cqlsh> use OpsCenter;
> InvalidRequest: code=2200 [Invalid query] message="Keyspace 'opscenter'
> does not exist"
>
> KS OpsCenter of course exists. I notice cqlsh returns keyspace name in
> lower case but in system.schema_keyspaces it shows OpsCenter. C* supports
> case sensitive keyspace names in DevCenter. But cqlsh converts everything
> to lower case?
>
> How can I grant permission to OpsCenter? Or a further question, how can I
> do case sensitive operations in cqlsh?
>
> Thanks.
>


how to grant permissions to OpsCenter keyspace?

2015-10-26 Thread Kai Wang
Hi,

My understanding is that if I want to enable internal authentication and
authorization on C* while still keeping OpsCenter working, I should grant
all to OpsCenter space and describe/select on everything else. But when I
try to grant permissions to or even switch into OpsCenter, cqlsh reports
this:

cassandra@cqlsh> use OpsCenter;
InvalidRequest: code=2200 [Invalid query] message="Keyspace 'opscenter'
does not exist"

KS OpsCenter of course exists. I notice cqlsh returns keyspace name in
lower case but in system.schema_keyspaces it shows OpsCenter. C* supports
case sensitive keyspace names in DevCenter. But cqlsh converts everything
to lower case?

How can I grant permission to OpsCenter? Or a further question, how can I
do case sensitive operations in cqlsh?

Thanks.


Re: how to grant permissions to OpsCenter keyspace?

2015-10-26 Thread Kai Wang
Thanks Adam.

On Mon, Oct 26, 2015 at 5:30 PM, Adam Holmberg 
wrote:

> You need to quote the "OpsCenter" identifier to distinguish capital
> letters:
> https://cassandra.apache.org/doc/cql3/CQL.html#identifiers
>
> Adam
>
> On Mon, Oct 26, 2015 at 4:25 PM, Kai Wang  wrote:
>
>> Hi,
>>
>> My understanding is that if I want to enable internal authentication and
>> authorization on C* while still keeping OpsCenter working, I should grant
>> all to OpsCenter space and describe/select on everything else. But when
>> I try to grant permissions to or even switch into OpsCenter, cqlsh
>> reports this:
>>
>> cassandra@cqlsh> use OpsCenter;
>> InvalidRequest: code=2200 [Invalid query] message="Keyspace 'opscenter'
>> does not exist"
>>
>> KS OpsCenter of course exists. I notice cqlsh returns keyspace name in
>> lower case but in system.schema_keyspaces it shows OpsCenter. C* supports
>> case sensitive keyspace names in DevCenter. But cqlsh converts everything
>> to lower case?
>>
>> How can I grant permission to OpsCenter? Or a further question, how can I
>> do case sensitive operations in cqlsh?
>>
>> Thanks.
>>
>
>


Cassandra Hadoop Integration

2015-10-26 Thread Kenji Fnu
Hi guys, I was wondering how Hadoop and Cassandra should integrate with
each other. Currently I am running a spring-data framework to integrate
the two. I wonder if there are other ways. Also, is it possible to access
Cassandra from a remote facility? Thanks a lot!


decommission too slow

2015-10-26 Thread qihuang.zheng
Recently we wanted to remove some C* nodes for data migration. C* version: 2.0.15.
We ran nodetool decommission with nohup: nohup nodetool decommission -h xxx
After 3 days of execution, it seems this process still hasn't finished!
The decommissioning node's data is nearly 400G.


1. I checked jps -lm, and the NodeCmd decommission process is still there.
2. The nohup.out file is always empty.
3. In OpsCenter, the node status is Leaving: 1 running task.
4. The running task is a compaction; I ran nodetool stop COMPACTION, but some
time later compaction happened again.
5. nodetool netstats shows it's leaving:
Mode: LEAVING
Unbootstrap 928e6be0-7950-11e5-9cfb-910d8a1425c3
….
Read Repair Statistics:
Attempted: 4746100
Mismatch (Blocking): 950
Mismatch (Background): 100746
Pool Name                Active   Pending    Completed
Commands                    n/a         0   1275402208
Responses                   n/a         0   1034430957


I don't know when the decommission will finish, or whether something is wrong
inside. Just 400G of data taking 3 days (and still unfinished) seems abnormal.
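
Two commands that may help track progress (a sketch; the grep filter is a
common convention for hiding completed streams, not an official flag):

    nodetool netstats | grep -v "100%"   # streams still in flight
    nodetool compactionstats             # pending/active compactions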


Tks, qihuang.zheng

Re: C* Table Changed and Data Migration with new primary key

2015-10-26 Thread qihuang.zheng
Tks Doan! We will try Spark, as our production environment already has Spark
1.4.1. But our C* version is 2.0.15, and since Spark 1.4 needs C* 2.1.5, when
I query I get the error "Failed to fetch size estimates for
system.size_estimates"; but that's not a serious problem.
I tried Spark 1.4.1 with Cassandra 2.0.15 in a test environment, and it's
really much faster than just using the Java driver API.
But we may meet some problems in the production environment, as our Spark
nodes are deployed completely separately from the Cassandra nodes.




qihuang.zheng


Original Message
From: DuyHai Doan doanduy...@gmail.com
To: user u...@cassandra.apache.org
Sent: Thursday, October 22, 2015 19:50
Subject: Re: C* Table Changed and Data Migration with new primary key


Use Spark to distribute the job of copying data all over the cluster and help
accelerate the migration. The Spark connector does auto-paging in the
background with the Java driver.
On 22 Oct 2015 at 11:03, "qihuang.zheng" qihuang.zh...@fraudmetrix.cn wrote:

I tried using the Java driver with the auto-paging query (setFetchSize)
instead of the token function, as Cassandra already has this feature.
Reference:
http://www.datastax.com/dev/blog/client-side-improvements-in-cassandra-2-0


But when I tried it in a test environment, reading only 1 million rows and
inserting into the 3 tables, it was too slow.
After running 20 minutes, an exception like NoHostAvailableException
happened, and of course the data didn't finish syncing.
And our production environment has nearly 25 billion rows, which is
unacceptable for this approach. Are there other ways?



Thanks & Regards,
qihuang.zheng


Original Message
From: Jeff Jirsa jeff.ji...@crowdstrike.com
To: user@cassandra.apache.org u...@cassandra.apache.org
Sent: Thursday, October 22, 2015 13:52
Subject: Re: C* Table Changed and Data Migration with new primary key


Because the data format has changed, you’ll need to read it out and write it 
back in again.


This means using either a driver (java, python, c++, etc), or something like 
spark.


In either case, split up the token range so you can parallelize it for 
significant speed improvements.
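
A minimal sketch of that token-range split with the Java driver (assumptions:
Murmur3Partitioner, driver 2.x, and the test1 schema from this thread; the
slice count and fetch size are illustrative, and the re-insert step is left
as a stub):

    import java.math.BigInteger;
    import java.util.concurrent.*;
    import com.datastax.driver.core.*;

    public class TokenRangeCopy {
        public static void main(String[] args) throws Exception {
            Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
            final Session session = cluster.connect("forseti");
            final int slices = 16;  // one full-scan slice per worker thread
            // Murmur3 tokens cover [-2^63, 2^63 - 1]; carve the span into equal slices.
            BigInteger min = BigInteger.valueOf(Long.MIN_VALUE);
            BigInteger span = BigInteger.valueOf(Long.MAX_VALUE).subtract(min).add(BigInteger.ONE);
            BigInteger step = span.divide(BigInteger.valueOf(slices));
            ExecutorService pool = Executors.newFixedThreadPool(slices);
            for (int i = 0; i < slices; i++) {
                final boolean last = (i == slices - 1);
                final long lo = min.add(step.multiply(BigInteger.valueOf(i))).longValue();
                final long hi = last ? Long.MAX_VALUE
                        : min.add(step.multiply(BigInteger.valueOf(i + 1))).longValue();
                pool.submit(new Runnable() {
                    public void run() {
                        // The last slice uses <= so the maximum token is not skipped.
                        String cql = "SELECT * FROM test1 WHERE token(attribute) >= " + lo
                                + (last ? " AND token(attribute) <= " : " AND token(attribute) < ")
                                + hi;
                        Statement scan = new SimpleStatement(cql).setFetchSize(1000);
                        for (Row row : session.execute(scan)) {
                            // write the row into test_global / test_partner / test_app here
                        }
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(7, TimeUnit.DAYS);
            cluster.close();
        }
    }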






From: "qihuang.zheng"
Reply-To: "user@cassandra.apache.org"
Date: Wednesday, October 21, 2015 at 6:18 PM
To: user
Subject: C* Table Changed and Data Migration with new primary key



Hi All:
 We have a table defined with only one partition key and some clustering keys.
CREATE TABLE test1 (
 attribute text,   
 partner text,  
 app text,
 "timestamp" bigint, 
 event text, 
 PRIMARY KEY ((attribute), partner, app, "timestamp")
)
And now we want to split original test1 table to 3 tables like this: 
test_global : PRIMARY KEY ((attribute), "timestamp")
test_partner: PRIMARY KEY ((attribute, partner), "timestamp")
test_app:     PRIMARY KEY ((attribute, partner, app), "timestamp")


Why do we want to split the original table? Because querying global data by
timestamp desc like this:
select * from test1 where attribute=? order by timestamp desc
is not supported in Cassandra, since ORDER BY must use the clustering keys in
their declared order. And a query like this:
select * from test1 where attribute=? order by partner desc, app desc,
timestamp desc
can't return the right global data ordered by timestamp desc.
After splitting the table we can run the global-data query correctly:
select * from test_global where attribute=? order by timestamp desc.
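
Spelled out for the global table, a sketch of the split target (the WITH
clause is my addition, not from the original post; it makes the
timestamp-desc read the natural storage order):

    CREATE TABLE test_global (
        attribute text,
        "timestamp" bigint,
        event text,
        PRIMARY KEY ((attribute), "timestamp")
    ) WITH CLUSTERING ORDER BY ("timestamp" DESC);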


Now we have a problem of data migration.
As I know, sstableloader is the easiest way, but it can't deal with a
different table name. (Am I right?)
And the COPY command in cqlsh can't fit our situation because our data is too
large. (10 nodes; one node has 400G of data)
I also tried the Java API, querying the original table and then inserting into
the 3 different split tables. But it seems too slow.


Any solution for quick data migration?
TKS!!


PS: Cass version: 2.0.15




Thanks & Regards,
qihuang.zheng

Re: Can consistency-levels be different for "read" and "write" in Datastax Java-Driver?

2015-10-26 Thread qihuang.zheng
Using the Java driver, both reads and writes go through a Statement. We use a
PreparedStatement like this:

    public PreparedStatement getPrepareSTMT(String cql) {
        PreparedStatement statement = session.prepare(cql);
        // Consistency level and retry policy are set per prepared statement,
        // so read statements and write statements can use different levels.
        statement.setConsistencyLevel(ConsistencyLevel.ONE)
                 .setRetryPolicy(FallthroughRetryPolicy.INSTANCE);
        return statement;
    }

This way you can set the ConsistencyLevel differently for read and write.
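
For instance, a hypothetical pair of statements (keyspace, table, and columns
are placeholders):

    PreparedStatement readStmt = getPrepareSTMT(
            "SELECT * FROM ks.tbl WHERE id = ?");
    readStmt.setConsistencyLevel(ConsistencyLevel.ONE);            // fast, may be stale
    PreparedStatement writeStmt = getPrepareSTMT(
            "INSERT INTO ks.tbl (id, v) VALUES (?, ?)");
    writeStmt.setConsistencyLevel(ConsistencyLevel.LOCAL_QUORUM);  // stronger writes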


Tks, qihuang.zheng




Original Message
From: Ajay Garg ajaygargn...@gmail.com
To: user u...@cassandra.apache.org
Sent: Tuesday, October 27, 2015 02:17
Subject: Can consistency-levels be different for "read" and "write" in Datastax
Java-Driver?


Right now, I have set up "LOCAL QUORUM" as the consistency level in the driver,
but it seems that "SERIAL" is being used during writes, and I consistently get 
this error of type ::

Cassandra timeout during write query at consistency SERIAL (3 replica were 
required but only 0 acknowledged the write)



Am I missing something?



-- 

Regards,
Ajay

Re: Can consistency-levels be different for "read" and "write" in Datastax Java-Driver?

2015-10-26 Thread Jonathan Haddad
What's your query?  Do you have IF NOT EXISTS in there?
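
(Background: IF NOT EXISTS turns a statement into a lightweight transaction,
whose Paxos phase runs at SERIAL consistency regardless of the statement's
regular consistency level; that would explain the SERIAL in the error below.
A hypothetical example, with table and columns as placeholders:)

    INSERT INTO users (id, email) VALUES (1, 'a@b.c') IF NOT EXISTS;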

On Mon, Oct 26, 2015 at 11:17 AM Ajay Garg  wrote:

> Right now, I have set up "LOCAL QUORUM" as the consistency level in the
> driver, but it seems that "SERIAL" is being used during writes, and I
> consistently get this error of type ::
>
> Cassandra timeout during write query at consistency SERIAL (3 replica
> were required but only 0 acknowledged the write)
>
>
> Am I missing something?
>
>
>
> --
> Regards,
> Ajay
>


unsubscribe

2015-10-26 Thread Brian Tarbox
-- 
http://about.me/BrianTarbox


Re: Find partition row of Compacted partition maximum bytes

2015-10-26 Thread Tushar Agrawal
Toppartitions provides the most active partitions.

I am trying to do the same thing. I was able to narrow down the largest
partition by looking at the warnings in system.log.

Given that I have the key, how do I see the entire data for that key?
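
(A full read of a 190 GB partition will almost certainly time out, so page
through it in slices instead. A sketch only; the partition-key column name
here is hypothetical:)

    SELECT * FROM forseti.velocity
    WHERE attribute = '<your key>'
    LIMIT 1000;   -- then continue from the last clustering key seen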

Thanks,
Tushar


> On Oct 26, 2015, at 4:21 AM, DuyHai Doan  wrote:
> 
> From C* 2.2.x
> 
> > nodetool help toppartitions
> 
> NAME
> nodetool toppartitions - Sample and print the most active partitions 
> for
> a given column family
> 
> 
> 
> On Mon, Oct 26, 2015 at 7:54 AM, qihuang.zheng  
> wrote:
>> I use nodetool cfstats to see table’s status, and find Compacted partition 
>> maximum bytes: 190G.  
>> Is there anyway to find this largest wide partition row?
>> [qihuang.zheng@cass047202 cassandra]$ nodetool cfstats forseti.velocity
>> Keyspace: forseti
>> Read Count: 10470099
>> Read Latency: 1.3186399419909973 ms.
>> Write Count: 146970362
>> Write Latency: 0.06062576270989929 ms.
>> Pending Tasks: 0
>> Table: velocity
>> SSTable count: 2144
>> SSTables in each level: [1, 10, 96, 723, 1314, 0, 0, 0, 0]
>> Space used (live), bytes: 509031385679
>> Space used (total), bytes: 523815500936
>> Off heap memory used (total), bytes: 558210701
>> SSTable Compression Ratio: 0.23635049381008288
>> Number of keys (estimate): 269787648
>> Memtable cell count: 271431
>> Memtable data size, bytes: 141953019
>> Memtable switch count: 1713
>> Local read count: 10470099
>> Local read latency: 1.266 ms
>> Local write count: 146970371
>> Local write latency: 0.053 ms
>> Pending tasks: 0
>> Bloom filter false positives: 534721
>> Bloom filter false ratio: 0.13542
>> Bloom filter space used, bytes: 180529808
>> Bloom filter off heap memory used, bytes: 180512656
>> Index summary off heap memory used, bytes: 118613037
>> Compression metadata off heap memory used, bytes: 259085008
>> Compacted partition minimum bytes: 104
>> Compacted partition maximum bytes: 190420296972
>> Compacted partition mean bytes: 8656
>> Average live cells per slice (last five minutes): 0.0
>> Average tombstones per slice (last five minutes): 0.0
>> qihuang.zheng
> 


Re : Data restore to a new cluster

2015-10-26 Thread sai krishnam raju potturi
Hi,
   we are working on a data backup and restore procedure to a new cluster.
We are following the DataStax documentation. It mentions a step:

"Restore the SSTable files snapshotted from the old cluster onto the new
cluster using the same directories"

http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_snapshot_restore_new_cluster.html

Could not find a mention of "SCHEMA" creation. Could somebody shed some
light on this? At what point do we create the "SCHEMA", if required?
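
(For reference, a rough sketch of the flow around that step; keyspace, table,
and tag names are placeholders:)

    nodetool snapshot -t mybackup mykeyspace       # on each old node
    # Copy .../mykeyspace/mytable/snapshots/mybackup/* into the new cluster's
    # data directory for the same keyspace/table (the table's schema must
    # already exist on the new cluster at this point), then:
    nodetool refresh mykeyspace mytable            # load the restored SSTables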


thanks
Sai


Can consistency-levels be different for "read" and "write" in Datastax Java-Driver?

2015-10-26 Thread Ajay Garg
Right now, I have set up "LOCAL QUORUM" as the consistency level in the
driver, but it seems that "SERIAL" is being used during writes, and I
consistently get this error of type ::

Cassandra timeout during write query at consistency SERIAL (3 replica were
required but only 0 acknowledged the write)


Am I missing something?


-- 
Regards,
Ajay