Far fewer connected native clients after node join

2015-11-15 Thread Michał Łowicki
Hi,

I'm using Python Driver 2.7.2 connected to a C* 2.1.11 cluster in two DCs. I
had to reboot and rejoin one node, and noticed that after the successful join
the number of connected native clients was much lower than on the other nodes
(blue line on the attached graph). It didn't fix itself after many hours, so I
restarted the newly joined node at ~9:50 and everything looked much better. I
guess the expected behaviour would be to have the same number of connected
clients after some time.




-- 
BR,
Michał Łowicki


Table modeling to search for various fields

2015-11-15 Thread Marlon Patrick
Hello guys,

In this presentation, slides 23 and 24 show a modeling technique in
Cassandra as follows:

Link: http://pt.slideshare.net/patrickmcfadin/become-a-super-modeler

- The car table has a primary key composed of the fields make, model, color,
and vehicle_id.

- The partition key consists of the fields make, model, and color (see the
sketch just below).
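
A sketch of what that schema would look like in CQL (the column types here are
assumed, since the slides only show the key layout):

CREATE TABLE car (
  make text,
  model text,
  color text,
  vehicle_id int,
  PRIMARY KEY ((make, model, color), vehicle_id)
);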

*The table is then populated this way:*

insert into car (make, model, color, vehicle_id)
values ('Ford','Mustang','Blue',1234);

insert into car (make, model, color, vehicle_id)
values ('Ford','','Blue',1234);

insert into car (make, model, color, vehicle_id)
values ('Ford','Mustang','',1234);

insert into car (make, model, color, vehicle_id)
values ('','Mustang','Blue',1234);

insert into car (make, model, color, vehicle_id)
values ('','Mustang','',1234);

insert into car (make, model, color, vehicle_id)
values ('','','Blue',1234);


*Finally, run queries like:*

select vehicle_id from car where make='Ford' and model='' and color='Blue';

select vehicle_id from car where make='' and model='' and color='Blue';


*Another way to serve the queries above would be to create one table per
search field, something like:*

select * from car_by_make where make='Ford';

select * from car_by_color where color='Blue';
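
Schemas for such per-field lookup tables might look like the following (a
sketch only; the table names come from the queries above and the column types
are assumed):

CREATE TABLE car_by_make (
  make text,
  vehicle_id int,
  PRIMARY KEY (make, vehicle_id)
);

CREATE TABLE car_by_color (
  color text,
  vehicle_id int,
  PRIMARY KEY (color, vehicle_id)
);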


*In the presentation's example, the queries only fetch the vehicle_id column.
If all columns were needed, something like this could be done:*

insert into car (make_key, model_key, color_key, vehicle_id, make_value,
model_value, color_value)
values ('Ford','','',1234, 'Ford', 'Mustang', 'Blue');

select vehicle_id, make_value, model_value, color_value from car where
make_key='Ford' and model_key='' and color_key='';


   1. Can it be said that these 3 strategies are all plausible?
   2. Is any of them considered more of a best practice than the others?
   3. Is there any I should NOT use at all?
   4. What are the advantages and disadvantages of each?

-- 
Best regards,

Marlon Patrick


Convert timeuuid to timestamp programmatically

2015-11-15 Thread Marlon Patrick
Hi guys,

Is there any way to convert a timeuuid to a timestamp (like dateOf)
programmatically using the DataStax Java driver?

-- 
Best regards,

Marlon Patrick


Re: Nested Collections in Cassandra

2015-11-15 Thread Neha Dave
Hi Jack,
Thanks for the response.

Do you think it's a good idea to have a separate table like:

CREATE TYPE metadata (
  key text,
  value set<text>,
  path_id uuid
  );

And then index it on value, so that a query like this works:
SELECT * from metadata where value CONTAINS 'FOX';

2. Can I use a composite column? Any ideas?

regards
neha
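
For comparison, a minimal sketch of the flattened-rows approach Jack suggests
in the quoted reply below (the table and column names here are hypothetical):

CREATE TABLE paths_by_metadata (
  meta_key text,
  meta_value text,
  path_id text,
  PRIMARY KEY ((meta_key, meta_value), path_id)
);

-- one row per (metadata key, value) pair of each path
INSERT INTO paths_by_metadata (meta_key, meta_value, path_id)
VALUES ('applicable-security-policy', 'SOX', '2');

-- "give me all the IDs where 'Security' contains 'SOX'"
SELECT path_id FROM paths_by_metadata
WHERE meta_key = 'applicable-security-policy' AND meta_value = 'SOX';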




On Sat, Nov 14, 2015 at 9:21 PM, Jack Krupansky wrote:

> You can only nest frozen collections and even then you can only access the
> full nested value, not individual entries within the nested map.
>
> So, in your example, you can only access mimetype and then must specify
> the full mime type value, which doesn't satisfy your query requirement.
>
> You will need to flatten your nesting into distinct rows with clustering
> keys. Then you can query a row with the mimetype in a clustering key. And
> add a clustering key for the mime value's name.
>
> -- Jack Krupansky
>
> On Sat, Nov 14, 2015 at 5:16 AM, Neha Dave  wrote:
>
>> Any Help?
>>
>> On Tue, Nov 10, 2015 at 7:44 PM, Neha Dave wrote:
>>
>>> How can we achieve nested collections in Cassandra?
>>>
>>> My requirement:
>>> metadata map ... Is it possible?
>>>
>>> Eg. 'mime-type' : 'MIME'
>>>   'Security'  : {'SOX','FOX'}
>>>
>>> The query will be: give me all the IDs where 'Security' : {'SOX'} OR
>>> where it contains 'SOX'.
>>>
>>> Is it possible?
>>> Can I use a UDT to do it?
>>>
>>> E.g. CQL:
>>>
>>> CREATE TYPE security (
>>>   number text,
>>>   tags set<text>
>>>   );
>>>
>>>
>>> CREATE TYPE listdata (
>>>   values set<text>
>>>   );
>>>
>>>   CREATE TABLE test_path (
>>>   path_id text PRIMARY KEY,
>>>   metadata map<text, frozen<listdata>>
>>>   );
>>>
>>> INSERT INTO test_path (path_id, metadata ) VALUES ( '2', { 'mime-type':
>>> {values : {'Mime'}},
>>> 'applicable-security-policy' : {values : {'SOX','FOX'}} });
>>>
>>>
>>> Query (which does not work) can be :
>>> SELECT * from test_path where metadata CONTAINS {values: {'FOX'},
>>> 'SOX'}} ;
>>> OR
>>> SELECT * from test_path where metadata CONTAINS {values: {'FOX'};
>>>
>>>
>>> Thanks
>>> Regards
>>> Neha
>>>
>>>
>>>
>>>
>>>
>>
>


Re: Repair time comparison for Cassandra 2.1.11

2015-11-15 Thread Badrjan
Nothing is being dropped, plus the processor is around 60% busy.
B.

15. Nov 2015 15:58 by anujw_2...@yahoo.co.in:


> Repair can take a long time if you have lots of inconsistent data. If you
> haven't restarted the nodes yet, you can run the nodetool tpstats command on all
> nodes to make sure that there are no mutation drops.
> Thanks
> Anuj
>
>
> Sent from Yahoo Mail on Android
>
> From: "badr...@tuta.io" <badr...@tuta.io>
> Date: Sun, 15 Nov, 2015 at 4:20 pm
> Subject: Repair time comparison for Cassandra 2.1.11
>
> Hi,
> I have a cluster of 4 machines with Cassandra 2.1.11, SSD drives, and 600 GB of
> data on each node (replication factor 3).
> When I run a partial repair on one node, it takes 50 hours to finish. Is that
> normal?
> B.

Re: Repair time comparison for Cassandra 2.1.11

2015-11-15 Thread Anuj Wadehra
OK. I don't have much experience with 2.1, as we are on 2.0.x. Are you using
sequential repair? If yes, parallel repair can be faster, but you need to make
sure that your application has sufficient room to run while the cluster is
running the repair.


Are you observing any WARN or ERROR messages in the logs while the repair is running?



50 hours seems like too much, considering your cluster is stable and you don't have
any dropped mutations on any of the nodes.




Thanks

Anuj



Sent from Yahoo Mail on Android

From:"Badrjan" 
Date:Sun, 15 Nov, 2015 at 5:39 pm
Subject:Re: Repair time comparison for Cassandra 2.1.11

Nothing is being dropped, plus the processor is around 60% busy.


B.

15. Nov 2015 15:58 by anujw_2...@yahoo.co.in:

Repair can take a long time if you have lots of inconsistent data. If you haven't
restarted the nodes yet, you can run the nodetool tpstats command on all nodes to make
sure that there are no mutation drops.


Thanks

Anuj

Sent from Yahoo Mail on Android

From:"badr...@tuta.io" 
Date:Sun, 15 Nov, 2015 at 4:20 pm
Subject:Repair time comparison for Cassandra 2.1.11

Hi,


I have a cluster of 4 machines with Cassandra 2.1.11, SSD drives, and 600 GB of data on
each node (replication factor 3).

When I run a partial repair on one node, it takes 50 hours to finish. Is that
normal?


B. 



Repair time comparison for Cassandra 2.1.11

2015-11-15 Thread badrjan
Hi,
I have a cluster of 4 machines with Cassandra 2.1.11, SSD drives, and 600 GB of data
on each node (replication factor 3).
When I run a partial repair on one node, it takes 50 hours to finish. Is that
normal?
B. 

Re: Repair time comparison for Cassandra 2.1.11

2015-11-15 Thread Anuj Wadehra
Repair can take a long time if you have lots of inconsistent data. If you haven't
restarted the nodes yet, you can run the nodetool tpstats command on all nodes to make
sure that there are no mutation drops.


Thanks

Anuj

Sent from Yahoo Mail on Android

From:"badr...@tuta.io" 
Date:Sun, 15 Nov, 2015 at 4:20 pm
Subject:Repair time comparison for Cassandra 2.1.11

Hi,


I have a cluster of 4 machines with Cassandra 2.1.11, SSD drives, and 600 GB of data on
each node (replication factor 3).

When I run a partial repair on one node, it takes 50 hours to finish. Is that
normal?


B. 



Re: Repair time comparison for Cassandra 2.1.11

2015-11-15 Thread Badrjan
Repairs are parallel. The only error-ish message I see in the nodetool log is
"Lost notification. You should check server log for repair status of
keyspace".
During the repair, most of the time was spent waiting for the Merkle trees
from the other nodes. I checked, and streaming was not the issue. So apparently
the issue is somewhere in the Merkle tree generation during validation, most
probably the part where the disk is being read.
Sequential repairs are off, so no anticompaction is being done.
B.

15. Nov 2015 16:22 by anujw_2...@yahoo.co.in:


> OK. I don't have much experience with 2.1, as we are on 2.0.x. Are you using
> sequential repair? If yes, parallel repair can be faster, but you need to
> make sure that your application has sufficient room to run while the cluster
> is running the repair.
> Are you observing any WARN or ERROR messages in the logs while the repair is
> running?
>
> 50 hours seems like too much, considering your cluster is stable and you don't
> have any dropped mutations on any of the nodes.
>
>
> Thanks
> Anuj
>
>
> Sent from Yahoo Mail on Android
>
> From: "Badrjan" <badr...@tuta.io>
> Date: Sun, 15 Nov, 2015 at 5:39 pm
> Subject: Re: Repair time comparison for Cassandra 2.1.11
>
> Nothing is being dropped, plus the processor is around 60% busy.
> B.
>
> 15. Nov 2015 15:58 by anujw_2...@yahoo.co.in:
>
>
>> Repair can take a long time if you have lots of inconsistent data. If you
>> haven't restarted the nodes yet, you can run the nodetool tpstats command on
>> all nodes to make sure that there are no mutation drops.
>> Thanks
>> Anuj
>>
>>
>> Sent from Yahoo Mail on Android
>>
>> From: "badr...@tuta.io" <badr...@tuta.io>
>> Date: Sun, 15 Nov, 2015 at 4:20 pm
>> Subject: Repair time comparison for Cassandra 2.1.11
>>
>> Hi,
>> I have a cluster of 4 machines with Cassandra 2.1.11, SSD drives, and 600 GB of
>> data on each node (replication factor 3).
>> When I run a partial repair on one node, it takes 50 hours to finish. Is
>> that normal?
>> B.

Re: Repair time comparison for Cassandra 2.1.11

2015-11-15 Thread Anuj Wadehra


For the error, you can see 
http://www.scriptscoop.net/t/3bac9a3307ac/cassandra-lost-notification-from-nodetool-repair.html


The lost notification should not be a problem. Please see
https://issues.apache.org/jira/browse/CASSANDRA-7909



In fact, we are also currently facing an issue where the Merkle tree is not received
from one or more nodes in a remote DC and the repair hangs forever. We will be
turning on debug logging, as some important TCP messages are logged at debug
level. We will also be monitoring netstats and tcpdump while the repair is
running. You can try similar things to troubleshoot.


Maybe more experienced guys can comment on this to help you :)


Thanks

Anuj



Sent from Yahoo Mail on Android 

From:"Badrjan" 
Date:Sun, 15 Nov, 2015 at 6:14 pm
Subject:Re: Repair time comparison for Cassandra 2.1.11

Repairs are parallel. The only error-ish message I see in the nodetool log
is

"Lost notification. You should check server log for repair status of keyspace"


During the repair, most of the time was spent waiting for the Merkle trees
from the other nodes. I checked, and streaming was not the issue. So apparently
the issue is somewhere in the Merkle tree generation during validation, most
probably the part where the disk is being read. Sequential repairs are off, so
no anticompaction is being done.


B.

15. Nov 2015 16:22 by anujw_2...@yahoo.co.in:

OK. I don't have much experience with 2.1, as we are on 2.0.x. Are you using
sequential repair? If yes, parallel repair can be faster, but you need to make
sure that your application has sufficient room to run while the cluster is
running the repair.


Are you observing any WARN or ERROR messages in the logs while the repair is running?



50 hours seems like too much, considering your cluster is stable and you don't have
any dropped mutations on any of the nodes.




Thanks

Anuj



Sent from Yahoo Mail on Android

From:"Badrjan" 
Date:Sun, 15 Nov, 2015 at 5:39 pm
Subject:Re: Repair time comparison for Cassandra 2.1.11

Nothing is being dropped, plus the processor is around 60% busy.


B.

15. Nov 2015 15:58 by anujw_2...@yahoo.co.in:

Repair can take a long time if you have lots of inconsistent data. If you haven't
restarted the nodes yet, you can run the nodetool tpstats command on all nodes to make
sure that there are no mutation drops.


Thanks

Anuj

Sent from Yahoo Mail on Android

From:"badr...@tuta.io" 
Date:Sun, 15 Nov, 2015 at 4:20 pm
Subject:Repair time comparison for Cassandra 2.1.11

Hi,


I have a cluster of 4 machines with Cassandra 2.1.11, SSD drives, and 600 GB of data on
each node (replication factor 3).

When I run a partial repair on one node, it takes 50 hours to finish. Is that
normal?


B. 



Re: Convert timeuuid to timestamp programmatically

2015-11-15 Thread Dongfeng Lu
You can use java.util.UUID.timestamp(), which returns a long.
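
For reference, a short sketch (not verified against a live cluster here):
UUID.timestamp() counts 100-nanosecond intervals from the UUID epoch
(1582-10-15), while the driver's com.datastax.driver.core.utils.UUIDs.unixTimestamp()
converts a time-based UUID to milliseconds since the Unix epoch, matching what
dateOf() gives you in CQL:

import java.util.Date;
import java.util.UUID;

import com.datastax.driver.core.utils.UUIDs;

public class TimeuuidToTimestamp {
    public static void main(String[] args) {
        // e.g. a timeuuid value read back from a row; here just a freshly generated one
        UUID timeuuid = UUIDs.timeBased();

        // Milliseconds since the Unix epoch, equivalent to dateOf(timeuuid) in CQL
        long millis = UUIDs.unixTimestamp(timeuuid);
        System.out.println(new Date(millis));

        // Raw UUID timestamp: 100-ns intervals since 1582-10-15; it needs converting
        // before it can be used as a wall-clock date
        System.out.println(timeuuid.timestamp());
    }
}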


On Sunday, November 15, 2015 9:20 AM, Marlon Patrick wrote:

Hi guys,
Is there any way to convert a timeuuid to a timestamp (like dateOf) programmatically
using the DataStax Java driver?
-- 
Best regards,

Marlon Patrick

  

UDT - Collection - Query

2015-11-15 Thread Neha Dave
CREATE TYPE metadata1 (
  key text,
  value set<text>
);

CREATE TABLE test_path3 (
  path_id text,
  mdata frozen<metadata1>,
  PRIMARY KEY (path_id, mdata)
);
CREATE INDEX metadata_teste_path3 ON test_path3 (mdata);

INSERT INTO test_path3 (path_id, mdata) VALUES ('2', {key: 'mime-type', value: {'Mime'}});

INSERT INTO test_path3 (path_id, mdata) VALUES ('1', {key: 'mime-type', value: {'Mime'}});

INSERT INTO test_path3 (path_id, mdata) VALUES ('1', {key: 'applicable-security-policy', value: {'SOX'}});

INSERT INTO test_path3 (path_id, mdata) VALUES ('1', {key: 'applicable-security-policy', value: {'FOX'}});





*Can I query something like the following?*

cqlsh:mykeyspace> SELECT * FROM test_path3 where mdata.value CONTAINS {'Mime'};
SyntaxException:
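
For what it's worth, individual UDT fields can't be referenced in a WHERE
clause. A sketch of the closest queries that should parse, using equality on
the whole frozen value (the cross-partition form assumes the secondary index
above supports full-value lookups on a frozen UDT in 2.1, which I have not
verified):

-- equality on the full frozen value within one partition (no index needed):
SELECT * FROM test_path3
WHERE path_id = '1' AND mdata = {key: 'mime-type', value: {'Mime'}};

-- with the secondary index on mdata, the same full-value equality across partitions:
SELECT * FROM test_path3
WHERE mdata = {key: 'mime-type', value: {'Mime'}};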
Thanks
regards
Neha