Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread George Sigletos
Unfortunately, DataStax decided to discontinue OpsCenter for open source
Cassandra, starting from version 2.2.

Pity.

On Wed, Jan 6, 2016 at 6:00 PM, Michael Shuler 
wrote:

> On 01/06/2016 10:55 AM, Michael Shuler wrote:
> > On 01/06/2016 01:47 AM, Wills Feng wrote:
> >> Looks like opscenter doesn't support cassandra 3.0?
> >
> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>
> It took me a minute to find the correct document:
>
>
> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>
> According to this version table, OpsCenter does not officially support
> Cassandra > 2.1.
>
> --
> Michael
>


Need Column Family Schema Suggestion

2016-01-26 Thread srungarapu vamsi
Hi,
I have the following use case:
A product (P) has 3 or more Devices associated with it. Each device (Di)
emits a set of names (size of the set is less than or equal to 250) every
minute.
Now the ask is: compute the function foo(product,hour), which is defined as
follows:
*foo*(*product*,*hour*) = number of names which are seen by all the
devices associated with the given *product* in the given *hour*.
Example:
Let's say product p1 has devices d1,d2,d3 associated with it.
Let's say S(d,h) is the *set* of names seen by device d in hour h.
So foo(p1,h) = length(S(d1,h) intersect S(d2,h) intersect S(d3,h))

I came up with the following approaches, but I am not convinced by them:
Approach A.
Create a column family with the following schema:
column family name : hour_data
hour_data(hour,product,name,device_id_set)
device_id_set is Set
Primary Key: (hour,product,name)
*Issue*:
I can't just run a query like SELECT COUNT(*) FROM hour_data where hour=
and product=p and length(device_id_set)=3 as querying on collections is not
possible

Approach B.
Create a column family with the following schema:
column family name : hour_data
hour_data(hour,product,name,num_devices_counter)
num_devices_counter is counter
Primary Key: (hour,product,name)
*Issue*:
I can't just run a query like SELECT COUNT(*) FROM hour_data where hour=
and product=p and num_devices_counter=3 as filtering on a counter column is
not possible

Approach C.
Column family schema:
hour_data(hour,device,name)
Primary Key: (hour,device,name)
If we have to compute foo(p1,h), then read the data for every device
from hour_data and perform the intersection in Spark.
*Issue*:
This is a heavy operation and demands multiple large machines.

Could you please help me in refining the Schemas or defining a new schema
to solve my problem?
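
For reference, the function described above can be sketched client-side in Python. This is only an illustration of the definition, assuming the per-device name sets for the hour have already been read (for example from the layout in Approach C). The function name, the input shape and the sample data are made up for this example, and it says nothing about how to push the computation into Cassandra itself.

```python
from functools import reduce


def foo(names_by_device):
    """Count the names seen by every device of one product in one hour.

    names_by_device maps a device id to S(d, h), the set of names that
    device emitted during the hour. (Hypothetical input shape.)
    """
    if not names_by_device:
        return 0
    # Intersect all per-device sets; what remains was seen by every device.
    common = reduce(set.intersection, names_by_device.values())
    return len(common)


# Example: product p1 with devices d1, d2, d3 in some hour h (made-up data).
hour_h = {
    "d1": {"alpha", "beta", "gamma"},
    "d2": {"beta", "gamma", "delta"},
    "d3": {"gamma", "beta"},
}
print(foo(hour_h))  # "beta" and "gamma" are common to all three devices
```

The cost is driven by reading every device's names for the hour, which is the part Approach C hands to Spark.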


Re: Need Column Family Schema Suggestion

2016-01-26 Thread srungarapu vamsi
Jack,
This is one of the analytics jobs I have to run. For the given problem, I
want to optimize the schema so that, instead of loading the data as an RDD
onto Spark machines, I can get the number directly from Cassandra queries.
The rationale is that I want to save on Spark machine costs :)

On Wed, 27 Jan 2016 at 02:07 Jack Krupansky 
wrote:

> Step 1 in data modeling in Cassandra is to define all of your queries. Are
> these in fact the ONLY queries that you need?
>
> If you are doing significant analytics, Spark is indeed the way to go.
>
> Cassandra works best for point queries and narrow slice queries (sequence
> of consecutive rows within a single partition).
>
> -- Jack Krupansky
>
> On Tue, Jan 26, 2016 at 4:46 AM, srungarapu vamsi <
> srungarapu1...@gmail.com> wrote:
>
>> Hi,
>> I have the following use case:
>> A product (P) has 3 or more Devices associated with it. Each device (Di)
>> emits a set of names (size of the set is less than or equal to 250) every
>> minute.
>> Now the ask is: Compute the function f(product,hour) which is defined as
>> follows:
>> *foo*(*product*,*hour*) = Number of strings which are seen by all the
>> devices associated with the given *product *in the given *hour*.
>> Example:
>> Lets say product p1 has devices d1,d2,d3 associated with it.
>> Lets say S(d,h) is the *set* of names seen by the device d in hour h.
>> So, now foo(p1,h) = length(S(d1,h) intersect S(d2,h) intersect S(d3,h))
>>
>> I came up with the following approaches but i am not convinced with them:
>> Approach A.
>> Create a column family with the following schema:
>> column family name : hour_data
>> hour_data(hour,product,name,device_id_set)
>> device_id_set is Set
>> Primary Key: (hour,product,name)
>> *Issue*:
>> I can't just run a query like SELECT COUNT(*) FROM hour_data where
>> hour= and product=p and length(device_id_set)=3 as querying on
>> collections is not possible
>>
>> Approach B.
>> Create a column family with the following schema:
>> column family name : hour_data
>> hour_data(hour,product,name,num_devices_counter)
>> num_devices_counter is counter
>> Primary Key: (hour,product,name)
>> *Issue*:
>> I can't just run a query like SELECT COUNT(*) FROM hour_data where
>> hour= and product=p and num_devices_counter=3 as querying on collections
>> is not possible
>>
>> Approach C.
>> Column family schema:
>> hour_data(hour,device,name)
>> Primary Key: (hour,device,name)
>> If we have to compute foo(p1,h) then read the data for every device from
>> hour_data and perform intersection in spark.
>> *Issue*:
>> This is a heavy operation and is demanding big and multiple machines.
>>
>> Could you please help me in refining the Schemas or defining a new schema
>> to solve my problem?
>>
>>
>


Re: Embedded cassandra

2016-01-26 Thread Enrico Olivelli
Thank you all for your feedback.
My team and I will take all these suggestions into account.

Cheers
Enrico

On Tue, 26 Jan 2016 at 23:51, Jack Krupansky wrote:

> There is no documented support for embedded Cassandra. Sure, there is a
> CassandraDaemon class and a EmbeddedCassandraService class, but they are
> intended for testing, not for use of the product.
>
> I have seen a couple of (old) references to people running embedded
> Cassandra (one in the official Wiki), but nothing in recent years.
>
> In short, maybe it might work, but be sure to inform your organization's
> management that it would not be supported. You'd be on your own. For
> example, even if you find a legitimate bug in Cassandra itself, the first
> thing we'd ask you to do is to provide a repro that uses normal Cassandra.
>
>
> -- Jack Krupansky
>
> On Tue, Jan 26, 2016 at 5:22 PM, Jonathan Haddad 
> wrote:
>
>> Launching a distributed database inside of an application server does not
>> make it easier to manage, it makes it a nightmare.
>>
>> Rebooting a node is easy, rebooting ALL your nodes when you do a
>> deployment is pointless.
>>
>> On Tue, Jan 26, 2016 at 2:14 PM Enrico Olivelli 
>> wrote:
>>
>>> Thanks for your replies.
>>> Currently my service uses a traditional JDBC database which is to be
>>> shared among all the peers. I'm looking for a shared-nothing db and
>>> Cassandra seems good.
>>> I'm already using HBase in production but it seems to me that Cassandra
>>> is more dynamic. No need for a distributed fs like HDFS, no need for a
>>> coordinator. From the docs it looks like a sort of peer-to-peer db.
>>> Your answers sound as if rebooting a node, or adding or removing one,
>>> is not such a simple and automatic operation?
>>>
>>> The other use case is that I want a db that can be launched inside the
>>> same process in order to make the system simpler to manage. Currently we
>>> use H2 for single-instance deployments but it is not good for production.
>>>
>>> --  Enrico
>>>
>>> On Tue, 26 Jan 2016 at 21:59, Jonathan Haddad wrote:
>>>
>>>> For the sake of argument... why do you think you should embed
>>>> Cassandra?  I'll be honest with you, making Cassandra restart every time
>>>> you want to upgrade your daemon sounds like a horrible idea.
>>>>
>>>> Run your 10 DB instances on their own and save yourself the operational
>>>> headache.
>>>>
>>>> On Tue, Jan 26, 2016 at 11:35 AM Richard L. Burton III <
>>>> mrbur...@gmail.com> wrote:
>>>>
>>>>> I'm certain you're going to get a lot of users on this mailing list
>>>>> telling you that's a bad idea. You should read up on Cassandra via
>>>>> datastax website to understand how Cassandra is designed and works.
>>>>>
>>>>> There's tools for monitoring and more.
>>>>>
>>>>> On Tue, Jan 26, 2016 at 2:31 PM, Enrico Olivelli
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>> I'm new to Cassandra. I'm evaluating to launch Cassandra daemon inside
>>>>>> the JVM of my program. This is essentially because I want the lifecycle
>>>>>> of Cassandra to be managed by my daemon.
>>>>>> My program can be launched on several machines (in the order of max
>>>>>> 10 instances) and every instance collaborate with others in a peer to
>>>>>> peer, fully decentralized way. Cassandra seems to be the best java db
>>>>>> which can be useful for my purpose.
>>>>>>
>>>>>> Is there any experience of Cassandra embedded in production systems ?
>>>>>>
>>>>>> Regards
>>>>>> Enrico Olivelli
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> -Richard L. Burton III
>>>>> @rburton
>>>>>

>


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Hannu Kröger
Is it really like that? Where does this info come from? I haven’t seen anything
"official" yet.

Hannu

> On 26 Jan 2016, at 15:07,  
>  wrote:
> 
> This is a very strange move considering how well DataStax has supported open 
> source Cassandra. I hope there is a reasonable and well-publicized 
> explanation for this apparent change in direction.
>  
>  
> Sean Durity
>  
> From: George Sigletos [mailto:sigle...@textkernel.nl] 
> Sent: Tuesday, January 26, 2016 4:09 AM
> To: user@cassandra.apache.org
> Subject: Re: opscenter doesn't work with cassandra 3.0
>  
> Unfortunately Datastax decided to discontinue Opscenter for open source 
> Cassandra, starting from version 2.2. 
> 
> Pitty
>  
> On Wed, Jan 6, 2016 at 6:00 PM, Michael Shuler wrote:
> On 01/06/2016 10:55 AM, Michael Shuler wrote:
> > On 01/06/2016 01:47 AM, Wills Feng wrote:
> >> Looks like opscenter doesn't support cassandra 3.0?
> >
> > This is correct. OpsCenter does not support Cassandra >= 3.0.
> 
> It took me a minute to find the correct document:
> 
> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>  
> 
> 
> According to this version table, OpsCenter does not officially support
> Cassandra > 2.1.
> 
> --
> Michael
>  
> 
> 



Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Julien Anguenot
If you install the latest OpsCenter you will get a banner stating the policy 
changes and linking to:

   http://docs.datastax.com/en/opscenter/5.2/opsc/opscPolicyChanges.html

On our side, we decided to go with Sematext SPM to replace OpsCenter for our 
2.1 cluster and thus be able to upgrade to 2.2.x and then 3.0.x before November 
this year.

   https://sematext.com/spm/integrations/cassandra-monitoring.html

The Cassandra integration is great and very easy to set up. They offer both SaaS
and on-premises options at a reasonable price.

They also say it should work with Cassandra 3.0.x (I have not tested that
myself yet).

   J.

> On Jan 26, 2016, at 7:19 AM, Hannu Kröger  wrote:
> 
> Is it really like that? Where does this info come from? I haven’t seen 
> anything “official" yet.
> 
> Hannu
> 
>> On 26 Jan 2016, at 15:07, > > > > wrote:
>> 
>> This is a very strange move considering how well DataStax has supported open 
>> source Cassandra. I hope there is a reasonable and well-publicized 
>> explanation for this apparent change in direction.
>>  
>>  
>> Sean Durity
>>  
>> From: George Sigletos [mailto:sigle...@textkernel.nl]
>> Sent: Tuesday, January 26, 2016 4:09 AM
>> To: user@cassandra.apache.org 
>> Subject: Re: opscenter doesn't work with cassandra 3.0
>>  
>> Unfortunately Datastax decided to discontinue Opscenter for open source 
>> Cassandra, starting from version 2.2. 
>> 
>> Pitty
>>  
>> On Wed, Jan 6, 2016 at 6:00 PM, Michael Shuler wrote:
>> On 01/06/2016 10:55 AM, Michael Shuler wrote:
>> > On 01/06/2016 01:47 AM, Wills Feng wrote:
>> >> Looks like opscenter doesn't support cassandra 3.0?
>> >
>> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>> 
>> It took me a minute to find the correct document:
>> 
>> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>>  
>> 
>> 
>> According to this version table, OpsCenter does not officially support
>> Cassandra > 2.1.
>> 
>> --
>> Michael
>>  
>> 
>> 
> 

--
Julien Anguenot (@anguenot)
USA +1.832.408.0344   
FR +33.7.86.85.70.44



Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

2016-01-26 Thread Alain RODRIGUEZ
Hi Dillon,

I will assume you're using Murmur3 and that you *don't* use vnodes (i.e. the
"num_tokens" option is commented out in cassandra.yaml), because if vnodes are
enabled this operation is useless (and maybe harmful, I'm not sure about that).

Can you give us the output of:

$ nodetool ring

It looks like you're trying to take a token that is already in use, as described.

> After I added one node into the existed cluster, I want to use "nodetool
> move" command:

Why not start the node with the right token, by setting "initial_token" to this
token? A move is quite a heavy operation, plus you will have to run *nodetool
cleanup* on all the nodes which have had their range reduced (the impacted node
+ replicas), which is also long and potentially heavy too.
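
As a side note for single-token (non-vnode) clusters: evenly spaced Murmur3Partitioner tokens can be computed with the usual formula, since the token range runs from -2^63 to 2^63-1. Below is a minimal Python sketch, assuming 3 nodes as in this thread; the function name is just for illustration. Each resulting value would either go into a node's initial_token before it bootstraps, or be passed to nodetool move on a node that is already in the ring.

```python
def balanced_tokens(node_count):
    """Return evenly spaced Murmur3Partitioner tokens for node_count nodes."""
    if node_count < 1:
        raise ValueError("node_count must be at least 1")
    # The Murmur3 token range spans 2**64 values, starting at -2**63.
    step = 2**64 // node_count
    return [i * step - 2**63 for i in range(node_count)]


# One token per node for a 3-node, single-token cluster.
for token in balanced_tokens(3):
    print(token)
```

None of these values should collide with a token already owned by another node, which is the situation that produces the error quoted below.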

-
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-01-26 5:19 GMT+01:00 土卜皿 :

> Hi, all
>After I added one node into the existed cluster, I want to use
> "nodetool move" command:
>
> ../cassandra-2.1.11/bin/nodetool -h 192.168.56.110 move
> -2696407920004217295
>
> I hope move  -2696407920004217295 (existed in 192.168.56.110) into
> 192.168.56.112, but I got the following error:
>
> [root@test-1 pengcz]# ../cassandra-2.1.11/bin/nodetool -h 192.168.56.110
> move -2696407920004217295
>
> error: target token -2696407920004217295 is already owned by another node.
>
> -- StackTrace --
>
> java.io.IOException: target token -2696407920004217295 is already owned by
> another node.
>
> at
> org.apache.cassandra.service.StorageService.move(StorageService.java:3479)
>
> What should I do for this, Thanks in advance!
>
>
> Dillon
>
>
>


Re: How to make the new cassandra cluster balanced after adding one or more nodes?

2016-01-26 Thread Alain RODRIGUEZ
Hi Dillon,

I advise you to keep writing in the same thread as long as it is about the
same issue, to avoid spreading information. I answered your first email :-).

C*heers,

-
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-01-26 6:50 GMT+01:00 土卜皿 :

> For testing here and describing the question simple, I used two nodes to
> build a cassandra(2.1.11) cluster (192.168.56.110 and 192.168.56.111),
> Now I added one additional node(192.168.56.112) to this cluster, and I
> hope my new ring balanced through using nodetool move command, but when I
> using the following steps:
>
>    1. Getting the 192.168.56.110's all token range, such as
>       981588427421702712 -- 1007755089748978774
>    2. Getting the new node's all token range, such as
>       5458173168911717635 -- 5458821955945522089
>    3. I executed the command:
>
> [root@test-1 pengcz]# ../cassandra-2.1.11/bin/nodetool  -h 192.168.56.110 
> -u admin -pw   admin4587 move 5458173168911717635
> error: target token 5458173168911717635 is already owned by another node.
> -- StackTrace --
> java.io.IOException: target token 5458173168911717635 is already owned by 
> another node.
>
>According to the article Load balancing
> said: If you add nodes
>to your cluster your ring will be unbalanced and only way to get perfect
>balance is to compute new tokens for every node and assign them to each
>node manually by using nodetool move command., I think I seemly
>understood nodetool move command wrong, But I don't know how to
>understand it and balance the new cluster? Any advice will be appreciated!
>
> Dillon
>


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Otis Gospodnetić
Hi,

As Julien pointed out, there is a good OpsCenter alternative at
https://sematext.com/spm/integrations/cassandra-monitoring.html

Questions/comments/feedback/milk/cookies are all welcome.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Wed, Jan 6, 2016 at 12:00 PM, Michael Shuler 
wrote:

> On 01/06/2016 10:55 AM, Michael Shuler wrote:
> > On 01/06/2016 01:47 AM, Wills Feng wrote:
> >> Looks like opscenter doesn't support cassandra 3.0?
> >
> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>
> It took me a minute to find the correct document:
>
>
> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>
> According to this version table, OpsCenter does not officially support
> Cassandra > 2.1.
>
> --
> Michael
>


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread DuyHai Doan
Hello Otis

Is the Sematext tool free or not? And if it is not free, is there a
"limited" open-source version?

On Tue, Jan 26, 2016 at 3:39 PM, Otis Gospodnetić <
otis.gospodne...@gmail.com> wrote:

> Hi,
>
> As Julien pointed out, there is a good OpsCenter alternative at
> https://sematext.com/spm/integrations/cassandra-monitoring.html
>
> Questions/comments/feedback/milk/cookies are all welcome.
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Wed, Jan 6, 2016 at 12:00 PM, Michael Shuler 
> wrote:
>
>> On 01/06/2016 10:55 AM, Michael Shuler wrote:
>> > On 01/06/2016 01:47 AM, Wills Feng wrote:
>> >> Looks like opscenter doesn't support cassandra 3.0?
>> >
>> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>>
>> It took me a minute to find the correct document:
>>
>>
>> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>>
>> According to this version table, OpsCenter does not officially support
>> Cassandra > 2.1.
>>
>> --
>> Michael
>>
>
>


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Otis Gospodnetić
Hi DuyHai,

SPM is not free, but there is a free plan, plus we have special pricing for
startups, non-profits, and educational institutions.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/


On Tue, Jan 26, 2016 at 9:59 AM, DuyHai Doan  wrote:

> Hello Otis
>
>  The Sematext tools, is it free or not ? And if not free, is there a
> "limited" open-source version ?
>
> On Tue, Jan 26, 2016 at 3:39 PM, Otis Gospodnetić <
> otis.gospodne...@gmail.com> wrote:
>
>> Hi,
>>
>> As Julien pointed out, there is a good OpsCenter alternative at
>> https://sematext.com/spm/integrations/cassandra-monitoring.html
>>
>> Questions/comments/feedback/milk/cookies are all welcome.
>>
>> Otis
>> --
>> Monitoring - Log Management - Alerting - Anomaly Detection
>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>
>>
>> On Wed, Jan 6, 2016 at 12:00 PM, Michael Shuler 
>> wrote:
>>
>>> On 01/06/2016 10:55 AM, Michael Shuler wrote:
>>> > On 01/06/2016 01:47 AM, Wills Feng wrote:
>>> >> Looks like opscenter doesn't support cassandra 3.0?
>>> >
>>> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>>>
>>> It took me a minute to find the correct document:
>>>
>>>
>>> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>>>
>>> According to this version table, OpsCenter does not officially support
>>> Cassandra > 2.1.
>>>
>>> --
>>> Michael
>>>
>>
>>
>


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Alain RODRIGUEZ
Hi all,

I can confirm that Sematext SPM is a working option for
monitoring Cassandra. I used it for 1+ year.

Their support is online most of the time; they are very friendly (starting
with Otis), understanding (go discuss with them if the only issue is
pricing, you'll find a way) and above all competent. You can ask them for new
features you would like to have.

The product itself gives a lot of *relevant* information out of the box.
The dashboards are already built and very helpful to tackle any issue. You
just install an agent and you're all set.

Here is an interview I did 1 year ago for planetcassandra about this topic:
http://www.planetcassandra.org/blog/interview/video-advertising-platform-teads-chose-cassandra-spm-and-opscenter-to-monitor-a-personalized-ad-experience/.
Hope it will be useful.

I have been away for 3 months, but my guess is the product has improved. I
was mainly missing the ring view from OpsCenter. That view is a nice way
to present Cassandra and also gives a really quick yet relevant view of
the C* cluster. Also, 2 DCs = 2 apps in Sematext, which is not great
compared to OpsCenter.

Glad to see you around Otis,

C*heers,

-
Alain Rodriguez
France

The Last Pickle
http://www.thelastpickle.com

2016-01-26 16:42 GMT+01:00 Otis Gospodnetić :

> Hi Duyhai,
>
> SPM is not free, but there is a free plan, plus we have special pricing
> for startups, non-profits, and education institutions.
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Tue, Jan 26, 2016 at 9:59 AM, DuyHai Doan  wrote:
>
>> Hello Otis
>>
>>  The Sematext tools, is it free or not ? And if not free, is there a
>> "limited" open-source version ?
>>
>> On Tue, Jan 26, 2016 at 3:39 PM, Otis Gospodnetić <
>> otis.gospodne...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> As Julien pointed out, there is a good OpsCenter alternative at
>>> https://sematext.com/spm/integrations/cassandra-monitoring.html
>>>
>>> Questions/comments/feedback/milk/cookies are all welcome.
>>>
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>
>>> On Wed, Jan 6, 2016 at 12:00 PM, Michael Shuler 
>>> wrote:
>>>
>>>> On 01/06/2016 10:55 AM, Michael Shuler wrote:
>>>> > On 01/06/2016 01:47 AM, Wills Feng wrote:
>>>> >> Looks like opscenter doesn't support cassandra 3.0?
>>>> >
>>>> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>>>>
>>>> It took me a minute to find the correct document:
>>>>
>>>>
>>>> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>>>>
>>>> According to this version table, OpsCenter does not officially support
>>>> Cassandra > 2.1.
>>>>
>>>> --
>>>> Michael
>>>>
>>>
>>>
>>
>


Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-26 Thread Francisco Reyes

On 01/22/2016 10:29 PM, Kevin Burton wrote:
> I sort of agree.. but we are also considering migrating to hourly
> tables.. and what if the single script doesn't run.
>
> I like having N nodes make changes like this because in my experience
> that central / single box will usually fail at the wrong time :-/
>
> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad wrote:
>
>> Instead of using ZK, why not solve your concurrency problem by
>> removing it?  By that, I mean simply have 1 process that creates
>> all your tables instead of creating a race condition intentionally?
>>
>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton wrote:
>>
>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>
>>> In 2.0 this worked fine.
>>>
>>> We have a bunch of automated scripts that go through and
>>> create tables... one per day.
>>>
>>> at midnight UTC our entire CQL went offline.. .took down our
>>> whole app.  ;-/
>>>
>>> The resolution was a full CQL shut down and then a drop table
>>> to remove the bad tables...
>>>
>>> pretty sure the issue was with schema disagreement.
>>>
>>> All our CREATE TABLE use IF NOT EXISTS but I think the IF
>>> NOT EXISTS only checks locally?
>>>
>>> My work around is going to be to use zookeeper to create a
>>> mutex lock during this operation.
>>>
>>> Any other things I should avoid?
>>>
>>> --
>>> We’re hiring if you know of any awesome Java Devops or Linux
>>> Operations Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>
> --
> We’re hiring if you know of any awesome Java Devops or Linux
> Operations Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile


One way to accomplish both (a single process doing the work, and multiple
machines being able to do it) is to have a control table.

You can have a table that lists which tables have been created, and force
consistency ALL. In this table you list the names of the tables created. If
a table name is in there, it doesn't need to be created again.


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Chris Lohfink
DataStax has a free program for startups
http://www.datastax.com/datastax-enterprise-for-startups

On Tue, Jan 26, 2016 at 9:42 AM, Otis Gospodnetić <
otis.gospodne...@gmail.com> wrote:

> Hi Duyhai,
>
> SPM is not free, but there is a free plan, plus we have special pricing
> for startups, non-profits, and education institutions.
>
> Otis
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
> On Tue, Jan 26, 2016 at 9:59 AM, DuyHai Doan  wrote:
>
>> Hello Otis
>>
>>  The Sematext tools, is it free or not ? And if not free, is there a
>> "limited" open-source version ?
>>
>> On Tue, Jan 26, 2016 at 3:39 PM, Otis Gospodnetić <
>> otis.gospodne...@gmail.com> wrote:
>>
>>> Hi,
>>>
>>> As Julien pointed out, there is a good OpsCenter alternative at
>>> https://sematext.com/spm/integrations/cassandra-monitoring.html
>>>
>>> Questions/comments/feedback/milk/cookies are all welcome.
>>>
>>> Otis
>>> --
>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>
>>>
>>> On Wed, Jan 6, 2016 at 12:00 PM, Michael Shuler 
>>> wrote:
>>>
>>>> On 01/06/2016 10:55 AM, Michael Shuler wrote:
>>>> > On 01/06/2016 01:47 AM, Wills Feng wrote:
>>>> >> Looks like opscenter doesn't support cassandra 3.0?
>>>> >
>>>> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>>>>
>>>> It took me a minute to find the correct document:
>>>>
>>>>
>>>> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>>>>
>>>> According to this version table, OpsCenter does not officially support
>>>> Cassandra > 2.1.
>>>>
>>>> --
>>>> Michael
>>>>
>>>
>>>
>>
>


Re: automated CREATE TABLE just nuked my cluster after a 2.0 -> 2.1 upgrade....

2016-01-26 Thread Eric Stevens
There's still a race condition there, because two clients could SELECT at
the same time as each other, then both INSERT.

You'd be better served with a CAS operation, and let Paxos guarantee
at-most-once execution.
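
To illustrate the difference, here is a small in-memory Python sketch. It is not driver code: the ControlTable class and the table name are hypothetical and only simulate the semantics. Check-then-insert leaves a window between the SELECT and the INSERT, whereas a conditional write such as CQL's INSERT ... IF NOT EXISTS makes the check and the write a single step, so only one caller sees it applied.

```python
import threading


class ControlTable:
    """In-memory stand-in for a control table keyed by table name."""

    def __init__(self):
        self._rows = {}
        # The lock plays the role that Paxos plays for a conditional write.
        self._lock = threading.Lock()

    def insert_if_not_exists(self, table_name, owner):
        """Atomically claim table_name; return True only for the first caller."""
        with self._lock:
            if table_name in self._rows:
                return False
            self._rows[table_name] = owner
            return True


def claim_concurrently(table_name, client_count):
    """Have client_count threads race to claim table_name; return the winners."""
    control = ControlTable()
    winners = []

    def worker(client_id):
        if control.insert_if_not_exists(table_name, client_id):
            # Only the winning client would go on to run CREATE TABLE.
            winners.append(client_id)

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(client_count)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return winners


print(len(claim_concurrently("events_2016_01_27", 8)))  # exactly one client wins
```

In this sketch every client that loses the race simply skips the CREATE TABLE; whether it should also wait for schema agreement before using the table is a separate question.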

On Tue, Jan 26, 2016 at 9:06 AM Francisco Reyes  wrote:

> On 01/22/2016 10:29 PM, Kevin Burton wrote:
>
> I sort of agree.. but we are also considering migrating to hourly tables..
> and what if the single script doesn't run.
>
> I like having N nodes make changes like this because in my experience that
> central / single box will usually fail at the wrong time :-/
>
>
>
> On Fri, Jan 22, 2016 at 6:47 PM, Jonathan Haddad 
> wrote:
>
>> Instead of using ZK, why not solve your concurrency problem by removing
>> it?  By that, I mean simply have 1 process that creates all your tables
>> instead of creating a race condition intentionally?
>>
>> On Fri, Jan 22, 2016 at 6:16 PM Kevin Burton  wrote:
>>
>>> Not sure if this is a bug or not or kind of a *fuzzy* area.
>>>
>>> In 2.0 this worked fine.
>>>
>>> We have a bunch of automated scripts that go through and create
>>> tables... one per day.
>>>
>>> at midnight UTC our entire CQL went offline.. .took down our whole app.
>>>  ;-/
>>>
>>> The resolution was a full CQL shut down and then a drop table to remove
>>> the bad tables...
>>>
>>> pretty sure the issue was with schema disagreement.
>>>
>>> All our CREATE TABLE use IF NOT EXISTS but I think the IF NOT EXISTS
>>> only checks locally?
>>>
>>> My work around is going to be to use zookeeper to create a mutex lock
>>> during this operation.
>>>
>>> Any other things I should avoid?
>>>
>>>
>>> --
>>> We’re hiring if you know of any awesome Java Devops or Linux Operations
>>> Engineers!
>>>
>>> Founder/CEO Spinn3r.com
>>> Location: *San Francisco, CA*
>>> blog: http://burtonator.wordpress.com
>>> … or check out my Google+ profile
>>> 
>>>
>>>
>
>
> --
> We’re hiring if you know of any awesome Java Devops or Linux Operations
> Engineers!
>
> Founder/CEO Spinn3r.com
> Location: *San Francisco, CA*
> blog: http://burtonator.wordpress.com
> … or check out my Google+ profile
> 
>
>
> One way to accomplish both, a single process doing the work and having
> multiple machines be able to do it, is to have a control table.
>
> You can have a table that lists what tables have been created and force
> concistency all. In this table you list the names of tables created. If a
> table name is in there, it doesn't need to be created again.
>


Embedded cassandra

2016-01-26 Thread Enrico Olivelli
Hi,
I'm new to Cassandra. I'm evaluating launching the Cassandra daemon inside the
JVM of my program. This is essentially because I want the lifecycle of
Cassandra to be managed by my daemon.
My program can be launched on several machines (on the order of 10 instances
at most) and every instance collaborates with the others in a peer-to-peer,
fully decentralized way. Cassandra seems to be the best Java db for my
purpose.

Does anyone have experience with Cassandra embedded in production systems?

Regards
Enrico Olivelli


Re: Embedded cassandra

2016-01-26 Thread Richard L. Burton III
I'm certain you're going to get a lot of users on this mailing list telling
you that's a bad idea. You should read up on Cassandra via the DataStax website
to understand how Cassandra is designed and how it works.

There are tools for monitoring and more.

On Tue, Jan 26, 2016 at 2:31 PM, Enrico Olivelli 
wrote:

> Hi,
> I'm new to Cassandra. I'm evaluating to launch Cassandra daemon inside the
> JVM of my program. This is essentially because I want the lifecycle of
> Cassandra to be managed by my daemon.
> My program can be launched on several machines (in the order of max 10
> instances) and every instance collaborate with others in a peer to peer,
> fully decentralized way. Cassandra seems to be the best java db which can
> be useful for my purpose.
>
> Is there any experience of Cassandra embedded in production systems ?
>
> Regards
> Enrico Olivelli
>



-- 
-Richard L. Burton III
@rburton


Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

2016-01-26 Thread Romain Hardouin


Hi Dillon,

CMIIW, but I suspect that you use vnodes and want to "move one of the 256
tokens to another node". If so, that's not possible: "nodetool move" is not
allowed with vnodes:
https://github.com/apache/cassandra/blob/cassandra-2.1.11/src/java/org/apache/cassandra/service/StorageService.java#L3488

*But* if you try "nodetool move" with a token that is already owned by a node,
that check is done *before* the vnodes check:
https://github.com/apache/cassandra/blob/cassandra-2.1.11/src/java/org/apache/cassandra/service/StorageService.java#L3479

If you use a single token, it seems you are trying to replace one node with
another... Maybe you could explain the problem that leads you to do a
nodetool move? (along with the nodetool ring output, as Alain suggested)

Best,
Romain

Re: Need Column Family Schema Suggestion

2016-01-26 Thread Jack Krupansky
Step 1 in data modeling in Cassandra is to define all of your queries. Are
these in fact the ONLY queries that you need?

If you are doing significant analytics, Spark is indeed the way to go.

Cassandra works best for point queries and narrow slice queries (sequence
of consecutive rows within a single partition).

-- Jack Krupansky

On Tue, Jan 26, 2016 at 4:46 AM, srungarapu vamsi 
wrote:

> Hi,
> I have the following use case:
> A product (P) has 3 or more Devices associated with it. Each device (Di)
> emits a set of names (size of the set is less than or equal to 250) every
> minute.
> Now the ask is: Compute the function foo(product,hour) which is defined as
> follows:
> *foo*(*product*,*hour*) = Number of strings which are seen by all the
> devices associated with the given *product *in the given *hour*.
> Example:
> Lets say product p1 has devices d1,d2,d3 associated with it.
> Lets say S(d,h) is the *set* of names seen by the device d in hour h.
> So, now foo(p1,h) = length(S(d1,h) intersect S(d2,h) intersect S(d3,h))
>
> I came up with the following approaches but i am not convinced with them:
> Approach A.
> Create a column family with the following schema:
> column family name : hour_data
> hour_data(hour,product,name,device_id_set)
> device_id_set is Set
> Primary Key: (hour,product,name)
> *Issue*:
> I can't just run a query like SELECT COUNT(*) FROM hour_data where
> hour= and product=p and length(device_id_set)=3 as querying on
> collections is not possible
>
> Approach B.
> Create a column family with the following schema:
> column family name : hour_data
> hour_data(hour,product,name,num_devices_counter)
> num_devices_counter is counter
> Primary Key: (hour,product,name)
> *Issue*:
> I can't just run a query like SELECT COUNT(*) FROM hour_data where
> hour= and product=p and num_devices_counter=3 as querying on collections
> is not possible
>
> Approach C.
> Column family schema:
> hour_data(hour,device,name)
> Primary Key: (hour,device,name)
> If we have to compute foo(p1,h) then read the data for every device from
> *hour_data* and perform intersection in spark.
> *Issue*:
> This is a heavy operation and is demanding big and multiple machines.
>
> Could you please help me in refining the Schemas or defining a new schema
> to solve my problem?
>
>
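
For reference, the function foo(product, hour) defined in the quoted question
is just the size of a set intersection. A minimal sketch in plain Python,
assuming the per-device name sets S(d, h) for one product and one hour have
already been read into memory (the dictionary layout and the sample names
below are made up for illustration):

```python
def foo(names_by_device):
    """Count the names seen by every device of one product in one hour.

    names_by_device maps device id -> set of names, i.e. S(d, h) for a fixed h.
    """
    if not names_by_device:
        return 0
    # Intersect the name sets of all devices, then count what is left.
    common = set.intersection(*map(set, names_by_device.values()))
    return len(common)


if __name__ == "__main__":
    hour_h = {
        "d1": {"alpha", "beta", "gamma"},
        "d2": {"beta", "gamma", "delta"},
        "d3": {"gamma", "beta"},
    }
    print(foo(hour_h))  # 2: "beta" and "gamma" are seen by all three devices
```

With at most 250 names per device per minute, the sets for one product and
hour are small; the cost described under Approach C comes from doing this for
every product and hour, not from a single intersection.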


Re: Embedded cassandra

2016-01-26 Thread Jonathan Haddad
For the sake of argument... why do you think you should embed Cassandra?
I'll be honest with you, making Cassandra restart every time you want to
upgrade your daemon sounds like a horrible idea.

Run your 10 DB instances on their own and save yourself the operational
headache.

On Tue, Jan 26, 2016 at 11:35 AM Richard L. Burton III 
wrote:

> I'm certain you're going to get a lot of users on this mailing list
> telling you that's a bad idea. You should read up on Cassandra via datastax
> website to understand how Cassandra is designed and works.
>
> There's tools for monitoring and more.
>
> On Tue, Jan 26, 2016 at 2:31 PM, Enrico Olivelli 
> wrote:
>
>> Hi,
>> I'm new to Cassandra. I'm evaluating to launch Cassandra daemon inside the
>> JVM of my program. This is essentially because I want the lifecycle of
>> Cassandra to be managed by my daemon.
>> My program can be launched on several machines (in the order of max 10
>> instances) and every instance collaborate with others in a peer to peer,
>> fully decentralized way. Cassandra seems to be the best java db which can
>> be useful for my purpose.
>>
>> Is there any experience of Cassandra embedded in production systems ?
>>
>> Regards
>> Enrico Olivelli
>>
>
>
>
> --
> -Richard L. Burton III
> @rburton
>


Re: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread Otis Gospodnetić
Hi,

On Tue, Jan 26, 2016 at 8:28 AM, Julien Anguenot 
wrote:

> If you install the latest OpsCenter you will get a banner stating the
> policy changes and linking to:
>
>http://docs.datastax.com/en/opscenter/5.2/opsc/opscPolicyChanges.html
>
> On our side, we decided to go with Sematext SPM to replace OpsCenter for
> our 2.1 cluster and thus be able to upgrade to 2.2.x and then 3.0.x before
> November this year.
>
>https://sematext.com/spm/integrations/cassandra-monitoring.html
>
> The Cassandra integration is great and very easy to setup. They offer both
> SaaS and on-premises at a reasonable price.
>
> As well, they say it should work with Cassandra 3.0.x. (have not tested
> yet myself)
>

We just tested SPM with Cassandra 3.x -- it works, metrics are there, no
issues.  Looks like some old Cassandra metrics may no longer be available
(e.g. local reads and local writes), but everything else seems to be
there.  If you use SPM with Cassandra 3.x and
see anything missing, just shout.

Otis
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/



>
>J.
>
> On Jan 26, 2016, at 7:19 AM, Hannu Kröger  wrote:
>
> Is it really like that? Where does this info come from? I haven’t seen
> anything “official" yet.
>
> Hannu
>
> On 26 Jan 2016, at 15:07,  <
> sean_r_dur...@homedepot.com> wrote:
>
> This is a very strange move considering how well DataStax has supported
> open source Cassandra. I hope there is a reasonable and well-publicized
> explanation for this apparent change in direction.
>
>
> Sean Durity
>
> *From:* George Sigletos [mailto:sigle...@textkernel.nl
> ]
> *Sent:* Tuesday, January 26, 2016 4:09 AM
> *To:* user@cassandra.apache.org
> *Subject:* Re: opscenter doesn't work with cassandra 3.0
>
> Unfortunately Datastax decided to discontinue Opscenter for open source
> Cassandra, starting from version 2.2.
>
> Pity
>
> On Wed, Jan 6, 2016 at 6:00 PM, Michael Shuler 
> wrote:
>
> On 01/06/2016 10:55 AM, Michael Shuler wrote:
> > On 01/06/2016 01:47 AM, Wills Feng wrote:
> >> Looks like opscenter doesn't support cassandra 3.0?
> >
> > This is correct. OpsCenter does not support Cassandra >= 3.0.
>
> It took me a minute to find the correct document:
>
>
> http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html
>
> According to this version table, OpsCenter does not officially support
> Cassandra > 2.1.
>
> --
> Michael
>
>
>
> --
>
> The information in this Internet Email is confidential and may be legally
> privileged. It is intended solely for the addressee. Access to this Email
> by anyone else is unauthorized. If you are not the intended recipient, any
> disclosure, copying, distribution or any action taken or omitted to be
> taken in reliance on it, is prohibited and may be unlawful. When addressed
> to our clients any opinions or advice contained in this Email are subject
> to the terms and conditions expressed in any applicable governing The Home
> Depot terms of business or client engagement letter. The Home Depot
> disclaims all responsibility and liability for the accuracy and content of
> this attachment and for any damages or losses arising from any
> inaccuracies, errors, viruses, e.g., worms, trojan horses, etc., or other
> items of a destructive nature, which may be contained in this attachment
> and shall not be liable for direct, indirect, consequential or special
> damages in connection with this e-mail message or its attachment.
>
>
>
> --
> Julien Anguenot (@anguenot)
> USA +1.832.408.0344
> FR +33.7.86.85.70.44
>
>


Re: Embedded cassandra

2016-01-26 Thread Enrico Olivelli
Thanks for your replies.
Actually my service uses a traditional jdbc database which is to be shared
among all the peers. I'm looking for a shared-nothing db and Cassandra
seems good.
I'm already using HBase in production but it seems to me that Cassandra is
more dynamic. No need for distributed fs like hdfs, no need for a
coordinator. From the docs it looks like a sort of peer to peer db.
Do your answers mean that rebooting, adding, or removing a node is not such
a simple and automatic operation?

The other use case is that I want a db that can be launched inside the same
process in order to make the system simpler to manage. Actually we use h2
for single instance deployments but it is not good for production.

--  Enrico

On Tue, 26 Jan 2016 at 21:59, Jonathan Haddad wrote:

> For the sake of argument... why do you think you should embed Cassandra?
> I'll be honest with you, making Cassandra restart every time you want to
> upgrade your daemon sounds like a horrible idea.
>
> Run your 10 DB instances on their own and save yourself the operational
> headache.
>
> On Tue, Jan 26, 2016 at 11:35 AM Richard L. Burton III 
> wrote:
>
>> I'm certain you're going to get a lot of users on this mailing list
>> telling you that's a bad idea. You should read up on Cassandra via datastax
>> website to understand how Cassandra is designed and works.
>>
>> There's tools for monitoring and more.
>>
>> On Tue, Jan 26, 2016 at 2:31 PM, Enrico Olivelli 
>> wrote:
>>
>>> Hi,
>>> I'm new to Cassandra. I'm evaluating to launch Cassandra daemon inside
>>> the JVM of my program. This is essentially because I want the lifecycle of
>>> Cassandra to be managed by my daemon.
>>> My program can be launched on several machines (in the order of max 10
>>> instances) and every instance collaborate with others in a peer to peer,
>>> fully decentralized way. Cassandra seems to be the best java db which can
>>> be useful for my purpose.
>>>
>>> Is there any experience of Cassandra embedded in production systems ?
>>>
>>> Regards
>>> Enrico Olivelli
>>>
>>
>>
>>
>> --
>> -Richard L. Burton III
>> @rburton
>>
>


Re: Embedded cassandra

2016-01-26 Thread Jonathan Haddad
Launching a distributed database inside of an application server does not
make it easier to manage, it makes it a nightmare.

Rebooting a node is easy, rebooting ALL your nodes when you do a deployment
is pointless.

On Tue, Jan 26, 2016 at 2:14 PM Enrico Olivelli  wrote:

> Thanks for your replies.
> Actually my service uses a traditional jdbc database which is to be shared
> among all the peers. I'm looking for a shared-nothing db and Cassandra
> seems good.
> I'm already using HBase in production but it seems to me that Cassandra is
> more dynamic. No need for distributed fs like hdfs, no need for a
> coordinator. From the docs it looks like a sort of peer to peer db.
> Your answers sound like a reboot of a node or the addition or the removal
> are not so simple and automatic operations ?
>
> The other use case is that I want a db that can be launched inside the
> same process in order to make the system simpler to manage. Actually we use
> h2 for single instance deployments but it is not good for production.
>
> --  Enrico
>
> On Tue, 26 Jan 2016 at 21:59, Jonathan Haddad wrote:
>
>> For the sake of argument... why do you think you should embed Cassandra?
>> I'll be honest with you, making Cassandra restart every time you want to
>> upgrade your daemon sounds like a horrible idea.
>>
>> Run your 10 DB instances on their own and save yourself the operational
>> headache.
>>
>> On Tue, Jan 26, 2016 at 11:35 AM Richard L. Burton III <
>> mrbur...@gmail.com> wrote:
>>
>>> I'm certain you're going to get a lot of users on this mailing list
>>> telling you that's a bad idea. You should read up on Cassandra via datastax
>>> website to understand how Cassandra is designed and works.
>>>
>>> There's tools for monitoring and more.
>>>
>>> On Tue, Jan 26, 2016 at 2:31 PM, Enrico Olivelli 
>>> wrote:
>>>
>>>> Hi,
>>>> I'm new to Cassandra. I'm evaluating to launch Cassandra daemon inside
>>>> the JVM of my program. This is essentially because I want the lifecycle of
>>>> Cassandra to be managed by my daemon.
>>>> My program can be launched on several machines (in the order of max 10
>>>> instances) and every instance collaborate with others in a peer to peer,
>>>> fully decentralized way. Cassandra seems to be the best java db which can
>>>> be useful for my purpose.
>>>>
>>>> Is there any experience of Cassandra embedded in production systems ?
>>>>
>>>> Regards
>>>> Enrico Olivelli
>>>>
>>>
>>>
>>>
>>> --
>>> -Richard L. Burton III
>>> @rburton
>>>
>>


Re: Logging

2016-01-26 Thread oleg yusim
Sam, Paulo,

Thank you very much for explanations and references.

Oleg

On Mon, Jan 25, 2016 at 10:08 AM, Sam Tunnicliffe  wrote:

> Paulo is correct in saying that C* doesn't have a direct equivalent of
> SecurityContextHolder. Authenticated principal info is retrievable from the
> QueryState during query execution but a) this isn't available to every
> method in the call chain and b) its scope is limited to the coordinator for
> the request. That is, it isn't serialized and included in the read/mutation
> messages which the coordinator distributes to the replicas. So you could
> produce a level of audit trail by providing a custom QueryHandler (See
> CASSANDRA-6659) that logs each statement along with the principal. But if
> the goal is indeed that "every log message in file should start with
> username of the user, who initiated this action", it isn't really
> feasible right now.
>
> On Mon, Jan 25, 2016 at 3:52 PM, Paulo Motta 
> wrote:
>
>> That would work, but afaik Cassandra doesn't have an equivalent of
>> RequestContextHolder/SecurityContextHolder that is able to retrieve the
>> user/session of a given thread/request (maybe I'm wrong as I'm no auth
>> expert), so if these don't exist we'd need to add equivalent to those or do
>> it via MDC (set the context when request arrives, propagate to down stream
>> threads, cleanup), which can become quite messy as shown in CASSANDRA-7276.
>>
>> For CQL statements perhaps the query tracing infrastructure could be
>> reused to provide that info, but that would require further investigation.
>> See CASSANDRA-1123 for more details on that.
>>
>> 2016-01-25 12:30 GMT-03:00 oleg yusim :
>>
>>> Paulo,
>>>
>>> Ideally - all the actions (security purposes, preserving completeness of
>>> the audit trail). How about this approach:
>>> http://www.codelord.net/2010/08/27/logging-with-a-context-users-in-logback-and-spring-security/
>>>  ?
>>> Would that work? Or would you rather suggest going the MDC way?
>>>
>>> Thanks,
>>>
>>> Oleg
>>>
>>> On Mon, Jan 25, 2016 at 9:23 AM, Paulo Motta 
>>> wrote:
>>>
>>>> What kind of actions? nodetool/system actions or cql statements?
>>>>
>>>> You could probably achieve identity-based logging with logback Mapped
>>>> Diagnostic Context (MDC - logback.qos.ch/manual/mdc.html), but you'd
>>>> need to patch your own Cassandra jars in many locations to provide that
>>>> information to the logging context, so not exactly a trivial thing to do.
>>>> We tried using that to print ks/cf names on log messages but it became a
>>>> bit messy due to the SEDA architecture as you need to patch executors to
>>>> inherit identifiers from parent threads and cleanup afterwards. See
>>>> CASSANDRA-7276 for more background.
>>>>
>>>> 2016-01-25 12:09 GMT-03:00 oleg yusim :
>>>>
>>>>> I want to try to re-phrase my question here... what I'm trying to
>>>>> achieve is identity-based logging. I.e. every log message in file should
>>>>> start with username of the user, who initiated this action. Would that be
>>>>> possible to achieve? If so, can you give me a brief example?
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Oleg
>>>>>
>>>>> On Thu, Jan 21, 2016 at 2:57 PM, oleg yusim 
>>>>> wrote:
>>>>>
>>>>>> Joel,
>>>>>>
>>>>>> Thanks for reference. What I'm trying to achieve, is to add the name
>>>>>> of the user, who initiated logged action. I tried c{5}, but what I see is
>>>>>> that;
>>>>>>
>>>>>> TRACE [GossipTasks:1] c{5} 2016-01-21 20:51:17,619 Gossiper.java:700
>>>>>> - Performing status check ...
>>>>>>
>>>>>> I think, I'm missing something here. Any suggestions?
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> Oleg
>>>>>>
>>>>>>
>>>>>>
>>>>>> On Thu, Jan 21, 2016 at 1:30 PM, Joel Knighton <
>>>>>> joel.knigh...@datastax.com> wrote:
>>>>>>
>>>>>>> Cassandra uses logback as its backend for logging.
>>>>>>>
>>>>>>> You can find information about configuring logging in Cassandra by
>>>>>>> searching for "Configuring logging" on docs.datastax.com and
>>>>>>> selecting the documentation for your version.
>>>>>>>
>>>>>>> The documentation for PatternLayouts (the pattern string about which
>>>>>>> you're asking) in logback is available in the logback manual under the
>>>>>>> section for Conversion Words
>>>>>>> http://logback.qos.ch/manual/layouts.html#conversionWord
>>>>>>>
>>>>>>>
>>>>>>> On Thu, Jan 21, 2016 at 1:21 PM, oleg yusim 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Greetings,
>>>>>>>>
>>>>>>>> Guys, can you, please, point me to documentation on how to
>>>>>>>> configure format of logs? I want make it clear, I'm talking about
>>>>>>>> formatting i.e. this:
>>>>>>>>
>>>>>>>> %-5level %date{HH:mm:ss,SSS} %msg%n
>>>>>>>>
>>>>>>>> What if I want to add another parameters into this string? Is there
>>>>>>>> a list of available parameters here and syntax?
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>>
>>>>>>>> Oleg
>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
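
The MDC pattern discussed in this thread (set the context when a request
arrives, let the log layout print it, clean up afterwards) can be illustrated
outside Cassandra. The sketch below is a Python analogue using the standard
logging and contextvars modules, not logback and not a Cassandra patch; in
logback itself an MDC value is printed with the %X{key} conversion word. The
names current_user, UserFilter, build_logger and handle_request are made up
for this example, and everything runs in a single thread, so it does not
attempt the cross-thread propagation that the thread describes as messy.

```python
import contextvars
import io
import logging

# Context variable playing the role of an MDC entry named "user" (hypothetical).
current_user = contextvars.ContextVar("current_user", default="-")


class UserFilter(logging.Filter):
    """Copy the current user into every record so the formatter can print it."""

    def filter(self, record):
        record.user = current_user.get()
        return True


def build_logger(stream):
    """Return a logger whose lines start with the user taken from the context."""
    logger = logging.getLogger("identity_demo")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.addFilter(UserFilter())
    handler.setFormatter(logging.Formatter("%(user)s %(levelname)-5s %(message)s"))
    logger.addHandler(handler)
    return logger


def handle_request(logger, user, statement):
    token = current_user.set(user)  # set the context when the request arrives
    try:
        logger.info("executing %s", statement)
    finally:
        current_user.reset(token)  # clean up so the user does not leak


if __name__ == "__main__":
    out = io.StringIO()
    log = build_logger(out)
    handle_request(log, "oleg", "SELECT * FROM t")
    log.info("background task")  # logged outside any request: user is "-"
    print(out.getvalue(), end="")
```

Messages logged outside a request carry the default "-" here, which mirrors
the limitation raised above: work that is not tied to an authenticated
request has no user to print.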

Re: Embedded cassandra

2016-01-26 Thread Jack Krupansky
There is no documented support for embedded Cassandra. Sure, there is a
CassandraDaemon class and an EmbeddedCassandraService class, but they are
intended for testing, not for use of the product.

I have seen a couple of (old) references to people running embedded
Cassandra (one in the official Wiki), but nothing in recent years.

In short, it might work, but be sure to inform your organization's
management that it would not be supported. You'd be on your own. For
example, even if you find a legitimate bug in Cassandra itself, the first
thing we'd ask you to do is to provide a repro that uses normal Cassandra.


-- Jack Krupansky

On Tue, Jan 26, 2016 at 5:22 PM, Jonathan Haddad  wrote:

> Launching a distributed database inside of an application server does not
> make it easier to manage, it makes it a nightmare.
>
> Rebooting a node is easy, rebooting ALL your nodes when you do a
> deployment is pointless.
>
> On Tue, Jan 26, 2016 at 2:14 PM Enrico Olivelli 
> wrote:
>
>> Thanks for your replies.
>> Actually my service uses a traditional jdbc database which is to be
>> shared among all the peers. I'm looking for a shared-nothing db and
>> Cassandra seems good.
>> I'm already using HBase in production but it seems to me that Cassandra
>> is more dynamic. No need for distributed fs like hdfs, no need for a
>> coordinator. From the docs it looks like a sort of peer to peer db.
>> Your answers sound like a reboot of a node or the addition or the removal
>> are not so simple and automatic operations ?
>>
>> The other use case is that I want a db that can be launched inside the
>> same process in order to make the system simpler to manage. Actually we use
>> h2 for single instance deployments but it is not good for production.
>>
>> --  Enrico
>>
>> On Tue, 26 Jan 2016 at 21:59, Jonathan Haddad wrote:
>>
>>> For the sake of argument... why do you think you should embed
>>> Cassandra?  I'll be honest with you, making Cassandra restart every time
>>> you want to upgrade your daemon sounds like a horrible idea.
>>>
>>> Run your 10 DB instances on their own and save yourself the operational
>>> headache.
>>>
>>> On Tue, Jan 26, 2016 at 11:35 AM Richard L. Burton III <
>>> mrbur...@gmail.com> wrote:
>>>
>>>> I'm certain you're going to get a lot of users on this mailing list
>>>> telling you that's a bad idea. You should read up on Cassandra via datastax
>>>> website to understand how Cassandra is designed and works.
>>>>
>>>> There's tools for monitoring and more.
>>>>
>>>> On Tue, Jan 26, 2016 at 2:31 PM, Enrico Olivelli 
>>>> wrote:
>>>>
>>>>> Hi,
>>>>> I'm new to Cassandra. I'm evaluating to launch Cassandra daemon inside
>>>>> the JVM of my program. This is essentially because I want the lifecycle of
>>>>> Cassandra to be managed by my daemon.
>>>>> My program can be launched on several machines (in the order of max 10
>>>>> instances) and every instance collaborate with others in a peer to peer,
>>>>> fully decentralized way. Cassandra seems to be the best java db which can
>>>>> be useful for my purpose.
>>>>>
>>>>> Is there any experience of Cassandra embedded in production systems ?
>>>>>
>>>>> Regards
>>>>> Enrico Olivelli
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Richard L. Burton III
>>>> @rburton
>>>>
>>>


Re: About cassandra's reblance when adding one or more nodes into the existed cluster?

2016-01-26 Thread 土卜皿
Hi Alain and Romain,

I am so sorry for this issue! I should not have used the command "nodetool
move" because I set "num_tokens: 256" in every node's cassandra.yaml.

However, I have new questions after adding two nodes into the cluster:

node1: 192.21.0.184
node2: 192.21.0.185

After starting the two nodes one by one, the first node 192.21.0.184 finished
joining immediately, but the second one 192.21.0.185 has taken several hours
to join and has still not finished.

Under 192.168.0.184:

[root@report-01 cassandra]# bin/nodetool compactionstats
pending tasks: 0

Under 192.168.0.185:

[root@report-02 cassandra]# bin/nodetool compactionstats
pending tasks: 11
compaction type  keyspace     table      completed  total         unit   progress
Compaction       testforuser  users1028  9439074    545972293     bytes  1.73%
Compaction       user_center  users      7566752    263673724274  bytes  0.00%
Active compaction remaining time :   4h22m27s

And:

[root@report-01 cassandra]# bin/nodetool status
Datacenter: DC1
===
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address       Load       Tokens  Owns  Host ID                               Rack
UN  192.21.0.135  120.83 GB  512     ?     11e1e80f-9c5f-4f7c-81f2-42d3b704d8e3  RAC1
UN  192.21.0.133  129.11 GB  512     ?     3e662ccb-fa2b-427b-9ca1-c2d3468bfbc9  RAC1
UN  192.21.0.131  152.25 GB  512     ?     60f763f3-09bc-4d6f-9301-494c93857fc1  RAC1
UJ  192.21.0.185  117.94 GB  256     ?     84c0dd16-6491-4bfb-b288-d4e410cd8c2a  RAC1
UN  192.21.0.184  649.03 MB  256     ?     4041c232-c110-4315-89a1-23ca53b851c2  RAC1

Node2's bootstrap was also interrupted several times because it got an error:

INFO  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Session
with /192.21.0.135 is complete
INFO  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Session
with /192.21.0.131 is complete
WARN  00:57:42 [Stream #8eb8cbe0-c488-11e5-baf9-918c8558de90] Stream failed
ERROR 00:57:42 Exception encountered during startup
java.lang.RuntimeException: Error during boostrap: Stream failed

So I restarted it and the join continued! I don't know why the two nodes
behave so differently. Thank you in advance!

Dillon

2016-01-27 4:33 GMT+08:00 Romain Hardouin :

>
>
> Hi Dillon,
>
> CMIIW I suspect that you use vnodes and you want to "move one of the 256
> tokens to another node". If yes, that's not possible.
> "nodetool move" is not allowed with vnodes:
> https://github.com/apache/cassandra/blob/cassandra-2.1.11/src/java/org/apache/cassandra/service/StorageService.java#L3488
>
> *But* if you try "nodetool move" with a token that is already owned by a
> node, the check is done *before* the vnodes check:
>
> https://github.com/apache/cassandra/blob/cassandra-2.1.11/src/java/org/apache/cassandra/service/StorageService.java#L3479
>
> If you use single token, it seems you try to replace a node by another
> one...
> Maybe you could explain what is the problem that leads you to do a
> nodetool move? (along with the nodetool ring output as Alain suggested)
>
> Best,
> Romain
>


RE: NullPointerException when trying to compact under 3.2

2016-01-26 Thread SEAN_R_DURITY
Can you wipe all the data directories, saved_cache, and commitlog and let the 
node bootstrap again?


Sean Durity

From: Nimi Wariboko Jr [mailto:n...@channelmeter.com]
Sent: Monday, January 25, 2016 6:59 PM
To: cassandra-u...@apache.org
Subject: NullPointerException when trying to compact under 3.2

Hi,

I recently upgraded from 2.1.12 to 3.2, and one issue I'm having is I can no 
longer read certain rows from a table. A simple SELECT * FROM `table` times 
out, only when the bad partition keys are reached. Trying to query the affected 
partition keys directly also causes a timeout.

I think the SSTable might be corrupted because nodetool compact (and repair) 
fails (although scrub succeeds). With Debug logging, a compact results in:

cassandra/data/cmpayments/report_payments-f675e45076ce11e5938129463d90c3f0/ma-3778-big-Data.db:level=0,
 
/var/lib/cassandra/data/cmpayments/report_payments-f675e45076ce11e5938129463d90c3f0/ma-3780-big-Data.db:level=0,
 
/var/lib/cassandra/data/cmpayments/report_payments-f675e45076ce11e5938129463d90c3f0/ma-3775-big-Data.db:level=0,
 
/var/lib/cassandra/data/cmpayments/report_payments-f675e45076ce11e5938129463d90c3f0/ma-3776-big-Data.db:level=0,
 
/var/lib/cassandra/data/cmpayments/report_payments-f675e45076ce11e5938129463d90c3f0/ma-3798-big-Data.db:level=0,
 ]
WARN  [CompactionExecutor:5] 2016-01-25 15:13:02,734 SSTableReader.java:261 - 
Reading cardinality from Statistics.db failed for 
/var/lib/cassandra/data/cmpayments/report_payments-f675e45076ce11e5938129463d90c3f0/ma-3778-big-Data.db
ERROR [CompactionExecutor:5] 2016-01-25 15:13:02,777 CassandraDaemon.java:195 - 
Exception in thread Thread[CompactionExecutor:5,1,main]
java.lang.NullPointerException: null

(the NPE has no stack trace).

On disk, the SSTable "ma-3778-big-Data.db" does not exist. Even if I scrub or
restart, Cassandra seems to always try to compact this nonexistent SSTable (the
SSTables on disk are numbered 3851+).

I'm assuming the SSTables are fubared, and I'd like to restore a snapshot, but
since this "ghost" sstable causes the compaction to fail, I'm unsure if
restoring a new set of sstables would actually solve the issue.

Nimi





RE: opscenter doesn't work with cassandra 3.0

2016-01-26 Thread SEAN_R_DURITY
This is a very strange move considering how well DataStax has supported open 
source Cassandra. I hope there is a reasonable and well-publicized explanation 
for this apparent change in direction.


Sean Durity

From: George Sigletos [mailto:sigle...@textkernel.nl]
Sent: Tuesday, January 26, 2016 4:09 AM
To: user@cassandra.apache.org
Subject: Re: opscenter doesn't work with cassandra 3.0

Unfortunately Datastax decided to discontinue Opscenter for open source 
Cassandra, starting from version 2.2.

Pity

On Wed, Jan 6, 2016 at 6:00 PM, Michael Shuler wrote:
On 01/06/2016 10:55 AM, Michael Shuler wrote:
> On 01/06/2016 01:47 AM, Wills Feng wrote:
>> Looks like opscenter doesn't support cassandra 3.0?
>
> This is correct. OpsCenter does not support Cassandra >= 3.0.

It took me a minute to find the correct document:

http://docs.datastax.com/en/upgrade/doc/upgrade/opscenter/opscCompatibility.html

According to this version table, OpsCenter does not officially support
Cassandra > 2.1.

--
Michael



