Re: Warning about copying and pasting from datastax configuration page: weird characters in config

2014-02-11 Thread Robert Coli
On Tue, Feb 11, 2014 at 3:23 PM, Andy Losey  wrote:

> I think the issue you're running into is the same reason not to compose a
> script or config file in notepad. Without using a true text editor (vi), or
> when relying on copy and paste, you're very likely to introduce non-visible
> formatting characters, as you see with your paragraph symbols. Trusting code
> right off a webpage isn't going to always be WYSIWYG.
>

I think that technical docs should generally be assumed to be safely
cut-and-pasteable; problems encountered when cut and pasting them should,
IMO, be fixed.

There was a related issue previously where the DataStax docs CMS was
turning "--" into em dashes. It's likely that they are not aware of the
situations (mobile, etc.) in which edge cases like this occur. They are,
however, very helpful people and very responsive when contacted. I have
added them (docs a t datastax d o t com) to the bcc: on this thread for
their information.

Docs guys - for context, read here :
http://www.mail-archive.com/user@cassandra.apache.org/msg34631.html

=Rob


Re: Warning about copying and pasting from datastax configuration page: weird characters in config

2014-02-11 Thread Michael Shuler

On 02/11/2014 05:23 PM, Andy Losey wrote:

I think the issue you're running into is the same reason not to
compose a script or config file in notepad. Without using a true text
editor (vi), or when relying on copy and paste, you're very likely to
introduce non-visible formatting characters, as you see with your
paragraph symbols. Trusting code right off a webpage isn't going to
always be WYSIWYG.


Granted, I was using chrome and vi on a linux laptop and did have a 
couple different odd pastes after copying from the mobile page - no 
strange inserted characters, just incomplete text from what I actually 
highlighted.  I don't have any issues from the standard page.


--
Michael


Re: Warning about copying and pasting from datastax configuration page: weird characters in config

2014-02-11 Thread Andy Losey
I think the issue you're running into is the same reason not to compose a
script or config file in notepad. Without using a true text editor (vi), or
when relying on copy and paste, you're very likely to introduce non-visible
formatting characters, as you see with your paragraph symbols. Trusting code
right off a webpage isn't going to always be WYSIWYG.

Sent from my 21st century device.

> On Feb 11, 2014, at 6:01 PM, Donald Smith  
> wrote:
> 
> The same problem happens with the non-mobile page: 
> http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installRecommendSettings.html
> 
> I used the mobile page because someone embedded that link in a wiki page I 
> was referring to.  (I'll change the link.)
> 
> Don
> 
> -----Original Message-----
> From: Michael Shuler [mailto:mshu...@pbandjelly.org] On Behalf Of Michael 
> Shuler
> Sent: Tuesday, February 11, 2014 2:58 PM
> To: user@cassandra.apache.org
> Subject: Re: Warning about copying and pasting from datastax configuration 
> page: weird characters in config
> 
>> On 02/11/2014 04:50 PM, Donald Smith wrote:
>> In
>> http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installRecommendSettings.html
>>  it says:
> 
> Just curious.. why are you using the mobile site on a desktop, instead of the 
> main page? [0]
> 
> --
> Michael
> 
> [0]
> http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installRecommendSettings.html


RE: Warning about copying and pasting from datastax configuration page: weird characters in config

2014-02-11 Thread Donald Smith
The same problem happens with the non-mobile page: 
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installRecommendSettings.html

I used the mobile page because someone embedded that link in a wiki page I was 
referring to.  (I'll change the link.)

 Don

-----Original Message-----
From: Michael Shuler [mailto:mshu...@pbandjelly.org] On Behalf Of Michael Shuler
Sent: Tuesday, February 11, 2014 2:58 PM
To: user@cassandra.apache.org
Subject: Re: Warning about copying and pasting from datastax configuration 
page: weird characters in config

On 02/11/2014 04:50 PM, Donald Smith wrote:
> In
> http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installRecommendSettings.html
>   it says:

Just curious.. why are you using the mobile site on a desktop, instead of the 
main page? [0]

--
Michael

[0]
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installRecommendSettings.html


Re: Warning about copying and pasting from datastax configuration page: weird characters in config

2014-02-11 Thread Michael Shuler

On 02/11/2014 04:50 PM, Donald Smith wrote:

In
http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installRecommendSettings.html
  it says:


Just curious.. why are you using the mobile site on a desktop, instead 
of the main page? [0]


--
Michael

[0] 
http://www.datastax.com/documentation/cassandra/2.0/webhelp/index.html#cassandra/install/installRecommendSettings.html


Warning about copying and pasting from datastax configuration page: weird characters in config

2014-02-11 Thread Donald Smith
In 
http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installRecommendSettings.html
  it says:

Packaged installs: Ensure that the following settings are included in the 
/etc/security/limits.d/cassandra.conf file:
cassandra - memlock unlimited
cassandra - nofile 10
cassandra - nproc 32768
cassandra - as unlimited

But when I copy and paste those four lines to Linux, periods are inserted in the
first two lines, so it looks like this:
cassandra - memlock.unlimited
cassandra - nofile.10
cassandra - nproc 32768
cassandra - as unlimited

This happens in both Firefox and Chrome, and it happened for my coworker too
(though for him the spaces after “memlock” and “nofile” were deleted). If I
paste to Windows it doesn’t happen.

Using firebug, I found the HTML source:

cassandra‌·-‌·memlock unlimited‌¶cassandra‌·-‌·nofile 10‌¶cassandra‌·-‌·nproc‌·32768‌¶cassandra‌·-‌·as‌·unlimited‌¶

The HTML on that page 
http://www.datastax.com/documentation/cassandra/2.0/mobile/cassandra/install/installRecommendSettings.html
 seems fragile.



According to http://www.w3.org/TR/html4/sgml/entities.html:







There were other spurious characters included in some config I pasted from
there, and that caused headaches. Specifically, earlier I saw:

cassandraâ- memlock unlimited
cassandraâ- nofile 10
cassandra - nproc 32768
cassandra - as unlimited

(with the weird â chars added).
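
One way to catch this kind of problem before it reaches a config file is to scan the pasted text for anything outside printable ASCII. The following is only a minimal sketch: the function name `find_suspicious_chars` and the sample string are made up for illustration, and it assumes the stray characters are things like zero-width or other non-ASCII code points, as the HTML source above suggests.

```python
import unicodedata


def find_suspicious_chars(text):
    """Return (line_no, column, codepoint, name) for every character that is
    not printable ASCII or a tab. Line and column numbers are 1-based."""
    findings = []
    for line_no, line in enumerate(text.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch == "\t" or " " <= ch <= "~":
                continue  # ordinary printable ASCII or a tab
            name = unicodedata.name(ch, "UNKNOWN")
            findings.append((line_no, col, f"U+{ord(ch):04X}", name))
    return findings


# Hypothetical sample: a zero-width non-joiner hidden after "cassandra".
sample = "cassandra\u200c - memlock unlimited\ncassandra - nproc 32768\n"
for finding in find_suspicious_chars(sample):
    print(finding)
```

Running a check like this on text pasted from a web page, before saving it under /etc/security/limits.d/, would flag invisible characters that an editor might not display.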

Don


Donald A. Smith | Senior Software Engineer
P: 425.201.3900 x 3866
C: (206) 819-5965
F: (646) 443-2333
dona...@audiencescience.com


Re: Clarification on how multi-DC replication works

2014-02-11 Thread graham sanderson
Slightly off topic, but does anyone know off the top of their head what happens
if data is written at LOCAL_QUORUM to a multi-data-center setup faster
than the inter-data-center link can handle? Something has to block, throw an
exception, die, or grow without bound (memory, threads, on-disk hints, etc.)
somewhere along the line ;-)

I haven’t found any good info on this by searching the web, and I have not studied
the code in detail, as we have not yet set up a multi-DC cluster. (Note we’re
using 2.0.5.)

We do not intend to do this in practice, but it might happen in some short
bursts. Obviously we can test this once we have such a setup, but any info
would help us plan how to handle it and/or throttle at either the Cassandra config
level or the app level.

On Feb 11, 2014, at 12:33 PM, Mullen, Robert  wrote:

> The picture shows a sample request, which is why the coordinator points to 
> two specific nodes.  What I was trying to convey that the coordinator node 
> would ensure that 2 of the 3 nodes were written to before reporting success 
> to the client.
> This is my point. ANY 2 of 3. Your picture shows specific 2 of 3.
> 
> True that, I know that and I'm not debating that.  I am showing a single 
> request sequence in that picture, and during a single request it will 
> actually be a specific 2 of the 3 nodes.   
> 
> 
> 
> I found the article here, it says that the non-blocking writes to the 2nd 
> data center are asynchronous.  Is this blog post incorrect as well?
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
> 
> Why is it incorrect? Everything is asynchronous, both local and remote. The 
> coordinator simply waits for response from local nodes. But it doesn't make 
> it synchronous, because it waits for response from ANY 2 nodes. 
> 
> I wasn't saying it was incorrect, I was just looking for clarification if you 
> thought that that blog post was misleading as well, as I've been sending 
> people to that page for info on multi dc replication.  If it was erroneous 
> then I would have stopped sending them there.   I was thinking that the 
> response was synchronous more from the client's point of view, meaning that 
> the app can't proceed until those specific operations were completed and a 
> response was returned from cassandra.  
> 
> Thanks for the help in clarifying all of this, it is very much appreciated.
> Regards,
> Rob
> 
> 
> On Tue, Feb 11, 2014 at 11:25 AM, Andrey Ilinykh  wrote:
> 
> 
> 
> On Tue, Feb 11, 2014 at 10:14 AM, Mullen, Robert  
> wrote:
> Thanks for the feedback.
> 
> The picture shows a sample request, which is why the coordinator points to 
> two specific nodes.  What I was trying to convey that the coordinator node 
> would ensure that 2 of the 3 nodes were written to before reporting success 
> to the client.
> This is my point. ANY 2 of 3. Your picture shows specific 2 of 3.
> 
>  
> 
> I found the article here, it says that the non-blocking writes to the 2nd 
> data center are asynchronous.  Is this blog post incorrect as well?
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
> 
> Why is it incorrect? Everything is asynchronous, both local and remote. The 
> coordinator simply waits for response from local nodes. But it doesn't make 
> it synchronous, because it waits for response from ANY 2 nodes. 
> 





Re: Clarification on how multi-DC replication works

2014-02-11 Thread Mullen, Robert
>
> The picture shows a sample request, which is why the coordinator points to
> two specific nodes.  What I was trying to convey that the coordinator node
> would ensure that 2 of the 3 nodes were written to before reporting success
> to the client.
>
This is my point. ANY 2 of 3. Your picture shows specific 2 of 3.

True that, I know that and I'm not debating that.  I am showing a single
request sequence in that picture, and during a single request it will
actually be a specific 2 of the 3 nodes.



> I found the article here, it says that the non-blocking writes to the 2nd
> data center are asynchronous.  Is this blog post incorrect as well?
>
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>

Why is it incorrect? Everything is asynchronous, both local and remote. The
coordinator simply waits for response from local nodes. But it doesn't make
it synchronous, because it waits for response from ANY 2 nodes.

I wasn't saying it was incorrect; I was just asking whether you thought
that blog post was misleading as well, as I've been sending people to that
page for info on multi-DC replication.  If it were erroneous I would have
stopped sending them there.  I was thinking of the response as synchronous
more from the client's point of view, meaning that the app can't proceed
until those specific operations have completed and a response has been
returned from Cassandra.

Thanks for the help in clarifying all of this, it is very much appreciated.
Regards,
Rob


On Tue, Feb 11, 2014 at 11:25 AM, Andrey Ilinykh  wrote:

>
>
>
> On Tue, Feb 11, 2014 at 10:14 AM, Mullen, Robert <
> robert.mul...@pearson.com> wrote:
>
>> Thanks for the feedback.
>>
>> The picture shows a sample request, which is why the coordinator points
>> to two specific nodes.  What I was trying to convey that the coordinator
>> node would ensure that 2 of the 3 nodes were written to before reporting
>> success to the client.
>>
> This is my point. ANY 2 of 3. Your picture shows specific 2 of 3.
>
>
>
>>
>> I found the article here, it says that the non-blocking writes to the 2nd
>> data center are asynchronous.  Is this blog post incorrect as well?
>>
>> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>>
>
> Why is it incorrect? Everything is asynchronous, both local and remote.
> The coordinator simply waits for response from local nodes. But it doesn't
> make it synchronous, because it waits for response from ANY 2 nodes.
>


Re: Clarification on how multi-DC replication works

2014-02-11 Thread Andrey Ilinykh
On Tue, Feb 11, 2014 at 10:14 AM, Mullen, Robert
wrote:

> Thanks for the feedback.
>
> The picture shows a sample request, which is why the coordinator points to
> two specific nodes.  What I was trying to convey that the coordinator node
> would ensure that 2 of the 3 nodes were written to before reporting success
> to the client.
>
This is my point. ANY 2 of 3. Your picture shows specific 2 of 3.



>
> I found the article here, it says that the non-blocking writes to the 2nd
> data center are asynchronous.  Is this blog post incorrect as well?
>
> http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers
>

Why is it incorrect? Everything is asynchronous, both local and remote. The
coordinator simply waits for response from local nodes. But it doesn't make
it synchronous, because it waits for response from ANY 2 nodes.


Re: Clarification on how multi-DC replication works

2014-02-11 Thread Mullen, Robert
Thanks for the feedback.

The picture shows a sample request, which is why the coordinator points to
two specific nodes.  What I was trying to convey was that the coordinator
node would ensure that 2 of the 3 nodes were written to before reporting
success to the client.

I found the article here, it says that the non-blocking writes to the 2nd
data center are asynchronous.  Is this blog post incorrect as well?
http://www.datastax.com/dev/blog/deploying-cassandra-across-multiple-data-centers

I'd like to get clarification on how this works and hope to clear up some
of the misinformation about multi-DC replication that is out there.  I like
a lot of the features of cassandra and enjoy working with it, but the
amount of conflicting information out on the web is a little disconcerting
sometimes.

thanks,
Rob

On Tue, Feb 11, 2014 at 11:02 AM, Andrey Ilinykh  wrote:

> 1. reply part is missing.
> 2. It is confusing a little bit. I would not use term "synchronous".
> Everything is asynchronous here. Coordinator writes data to all local nodes
> and waits for  response from ANY two of them (in case of quorum). In your
> picture it looks like the coordinator first makes decision what nodes
> should reply. It is not correct.
>
>
> On Tue, Feb 11, 2014 at 9:36 AM, Mullen, Robert wrote:
>
>> So is that picture incorrect, or just incomplete missing the piece on how
>> the nodes reply to the coordinator node.
>>
>>
>> On Tue, Feb 11, 2014 at 9:38 AM, sankalp kohli wrote:
>>
>>> @Mullen,
>>> I think your diagram does not answer the question on responses.
>>> @Sameer
>>> All nodes in DC2 will replay back to the co-ordinator in DC1. So if you
>>> have replication of DC1:3,DC2:3. A co-ordinator node will get 6 responses
>>> back if it is not in the replica set.
>>> Hope that answers your question.
>>>
>>>
>>> On Tue, Feb 11, 2014 at 8:16 AM, Mullen, Robert <
>>> robert.mul...@pearson.com> wrote:
>>>
 I had the same question a while back and put together this picture to
 help me understand the flow of data for multi region deployments. Hope that
 it helps.


 On Mon, Feb 10, 2014 at 7:52 PM, Sameer Farooqui <
 sam...@blueplastic.com> wrote:

> Hi,
>
> I was hoping someone could clarify a point about multi-DC replication.
>
> Let's say I have 2 data centers configured with replication factor = 3
> in each DC.
>
> My client app is sitting in DC 1 and is able to intelligently pick a
> coordinator that will also be a replica partner.
>
> So the client app sends a write with consistency for DC1 = Q and
> consistency for DC2 = Q to a coordinator node in DC1.
>
> That coordinator in DC1 forwards the write to 2 other nodes in DC1 and
> a coordinator in DC2.
>
> Is it correct that all 3 nodes in DC2 will respond back to the
> original coordinator in DC1? Or will the DC2 nodes respond back to the DC2
> coordinator?
>
> Let's say one of the replica nodes in DC2 is down. Who will hold the
> hint for that node? The original coordinator in DC1 or the coordinator in
> DC2?
>


>>>
>>
>


Re: Clarification on how multi-DC replication works

2014-02-11 Thread Andrey Ilinykh
1. The reply part is missing.
2. It is a little confusing. I would not use the term "synchronous".
Everything is asynchronous here. The coordinator writes data to all local
nodes and waits for a response from ANY two of them (in the case of quorum).
In your picture it looks like the coordinator first decides which nodes
should reply. That is not correct.
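
To illustrate the "ANY two" point, here is a small simulation. It is not Cassandra code, just a sketch under stated assumptions: the replica names and delays are invented, `replica_write` stands in for a network round trip, and the quorum size is taken as floor(RF / 2) + 1. The coordinator sends to all replicas at once and returns as soon as any quorum of them has answered.

```python
import asyncio


def quorum(replication_factor):
    """Quorum size for a given replication factor: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


async def replica_write(name, delay):
    """Pretend to write on one replica; the delay models network and disk time."""
    await asyncio.sleep(delay)
    return name


async def coordinator_write(replica_delays):
    """Send the write to every replica and return the names of the first
    quorum of replicas to acknowledge, whichever ones they happen to be."""
    needed = quorum(len(replica_delays))
    tasks = [
        asyncio.create_task(replica_write(name, delay))
        for name, delay in replica_delays.items()
    ]
    acked = []
    for next_done in asyncio.as_completed(tasks):
        acked.append(await next_done)
        if len(acked) == needed:
            break  # enough acknowledgements to report success
    # The remaining replicas still receive the write; let them finish.
    await asyncio.gather(*tasks)
    return acked


# Hypothetical delays: node2 and node3 happen to answer first this time.
print(asyncio.run(coordinator_write({"node1": 0.05, "node2": 0.01, "node3": 0.03})))
```

Changing the delays changes which two replicas satisfy the quorum, which is why a diagram that fixes two particular nodes in advance can mislead.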


On Tue, Feb 11, 2014 at 9:36 AM, Mullen, Robert
wrote:

> So is that picture incorrect, or just incomplete missing the piece on how
> the nodes reply to the coordinator node.
>
>
> On Tue, Feb 11, 2014 at 9:38 AM, sankalp kohli wrote:
>
>> @Mullen,
>> I think your diagram does not answer the question on responses.
>> @Sameer
>> All nodes in DC2 will replay back to the co-ordinator in DC1. So if you
>> have replication of DC1:3,DC2:3. A co-ordinator node will get 6 responses
>> back if it is not in the replica set.
>> Hope that answers your question.
>>
>>
>> On Tue, Feb 11, 2014 at 8:16 AM, Mullen, Robert <
>> robert.mul...@pearson.com> wrote:
>>
>>> I had the same question a while back and put together this picture to
>>> help me understand the flow of data for multi region deployments. Hope that
>>> it helps.
>>>
>>>
>>> On Mon, Feb 10, 2014 at 7:52 PM, Sameer Farooqui wrote:
>>>
 Hi,

 I was hoping someone could clarify a point about multi-DC replication.

 Let's say I have 2 data centers configured with replication factor = 3
 in each DC.

 My client app is sitting in DC 1 and is able to intelligently pick a
 coordinator that will also be a replica partner.

 So the client app sends a write with consistency for DC1 = Q and
 consistency for DC2 = Q to a coordinator node in DC1.

 That coordinator in DC1 forwards the write to 2 other nodes in DC1 and
 a coordinator in DC2.

 Is it correct that all 3 nodes in DC2 will respond back to the original
 coordinator in DC1? Or will the DC2 nodes respond back to the DC2
 coordinator?

 Let's say one of the replica nodes in DC2 is down. Who will hold the
 hint for that node? The original coordinator in DC1 or the coordinator in
 DC2?

>>>
>>>
>>
>


Re: Clarification on how multi-DC replication works

2014-02-11 Thread Mullen, Robert
So is that picture incorrect, or just incomplete, missing the piece on how
the nodes reply to the coordinator node?


On Tue, Feb 11, 2014 at 9:38 AM, sankalp kohli wrote:

> @Mullen,
> I think your diagram does not answer the question on responses.
> @Sameer
> All nodes in DC2 will replay back to the co-ordinator in DC1. So if you
> have replication of DC1:3,DC2:3. A co-ordinator node will get 6 responses
> back if it is not in the replica set.
> Hope that answers your question.
>
>
> On Tue, Feb 11, 2014 at 8:16 AM, Mullen, Robert wrote:
>
>> I had the same question a while back and put together this picture to
>> help me understand the flow of data for multi region deployments. Hope that
>> it helps.
>>
>>
>> On Mon, Feb 10, 2014 at 7:52 PM, Sameer Farooqui 
>> wrote:
>>
>>> Hi,
>>>
>>> I was hoping someone could clarify a point about multi-DC replication.
>>>
>>> Let's say I have 2 data centers configured with replication factor = 3
>>> in each DC.
>>>
>>> My client app is sitting in DC 1 and is able to intelligently pick a
>>> coordinator that will also be a replica partner.
>>>
>>> So the client app sends a write with consistency for DC1 = Q and
>>> consistency for DC2 = Q to a coordinator node in DC1.
>>>
>>> That coordinator in DC1 forwards the write to 2 other nodes in DC1 and a
>>> coordinator in DC2.
>>>
>>> Is it correct that all 3 nodes in DC2 will respond back to the original
>>> coordinator in DC1? Or will the DC2 nodes respond back to the DC2
>>> coordinator?
>>>
>>> Let's say one of the replica nodes in DC2 is down. Who will hold the
>>> hint for that node? The original coordinator in DC1 or the coordinator in
>>> DC2?
>>>
>>
>>
>


Re: 1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete

2014-02-11 Thread Michael Shuler

On 02/11/2014 10:34 AM, sankalp kohli wrote:

If you don't have a schema, you are probably hitting this
https://issues.apache.org/jira/browse/CASSANDRA-6685


Looks like #6685 was committed to the cassandra-1.2 branch, yesterday.

SNAPSHOT artifacts can be grabbed for the latest build of each branch, 
if anyone's watching for something to be committed.  I just finished 
setting this up in jenkins, yesterday.


  http://cassci.datastax.com/job/cassandra-1.2/lastSuccessfulBuild/
  http://cassci.datastax.com/job/cassandra-2.0/lastSuccessfulBuild/
  http://cassci.datastax.com/job/trunk/lastSuccessfulBuild/



--
Kind regards,
Michael


Re: Worse perf after Row Caching version 1.2.5:

2014-02-11 Thread Jonathan Lacefield
Hello,

  Please paste the output of cfhistograms for these tables.  Also, what
does your environment look like: number of nodes, disk drive configs,
memory, C* version, etc.?

Thanks,

Jonathan

Jonathan Lacefield
Solutions Architect, DataStax
(404) 822 3487






On Tue, Feb 11, 2014 at 10:26 AM, PARASHAR, BHASKARJYA JAY
wrote:

>  Hi,
>
>
>
> I have two tables and I enabled row caching for both of them using CQL.
> These two CF's are very small with one about 300 rows and other < 2000
> rows. The rows themselves are small.
>
> Cassandra heap: 8gb.
>
> a.   alter table TABLE_X with caching = 'rows_only';
>
> b.  alter table TABLE_Y with caching = 'rows_only';
>
> I also changed row_cache_size_in_mb: 1024 in the Cassandra.yaml file.
>
> After extensive testing, it seems the performance of Table_X degraded from
> 600ms to 750ms and Table_Y gained about 10 ms (from 188ms to 177 ms).
>
> More Info
>
> Table X is always queried with "Select * from Table_X";  Cfstats in
> Table_X shows Read Latency: NaN ms. I assumed that since we select all the
> rows, the entire table would be cached.
>
> Table_Y has a secondary index and is queried on that index.
>
>
>
>
>
> Would appreciate any input why the performance is worse and how to enable
> row caching for these two tables.
>
>
>
> Thanks
>
> Jay
>
>
>
>
>


Re: Clarification on how multi-DC replication works

2014-02-11 Thread sankalp kohli
@Mullen,
I think your diagram does not answer the question on responses.
@Sameer
All nodes in DC2 will reply back to the coordinator in DC1. So if you
have replication of DC1:3,DC2:3, a coordinator node will get 6 responses
back if it is not in the replica set.
Hope that answers your question.
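
The arithmetic behind the "6 responses" figure can be written out as a short sketch. The helper name `write_expectations` is made up for illustration, and the per-DC quorum of floor(RF / 2) + 1 is an assumption matching the "consistency = Q in each DC" scenario from the original question.

```python
def quorum(replication_factor):
    """Quorum size for a given replication factor: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def write_expectations(replication):
    """Given {datacenter: replication_factor}, return the total number of
    replica responses the coordinator can receive and the quorum per DC."""
    total_responses = sum(replication.values())
    per_dc_quorum = {dc: quorum(rf) for dc, rf in replication.items()}
    return total_responses, per_dc_quorum


total, per_dc = write_expectations({"DC1": 3, "DC2": 3})
print(total)   # 6
print(per_dc)  # {'DC1': 2, 'DC2': 2}
```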


On Tue, Feb 11, 2014 at 8:16 AM, Mullen, Robert
wrote:

> I had the same question a while back and put together this picture to help
> me understand the flow of data for multi region deployments. Hope that it
> helps.
>
>
> On Mon, Feb 10, 2014 at 7:52 PM, Sameer Farooqui 
> wrote:
>
>> Hi,
>>
>> I was hoping someone could clarify a point about multi-DC replication.
>>
>> Let's say I have 2 data centers configured with replication factor = 3 in
>> each DC.
>>
>> My client app is sitting in DC 1 and is able to intelligently pick a
>> coordinator that will also be a replica partner.
>>
>> So the client app sends a write with consistency for DC1 = Q and
>> consistency for DC2 = Q to a coordinator node in DC1.
>>
>> That coordinator in DC1 forwards the write to 2 other nodes in DC1 and a
>> coordinator in DC2.
>>
>> Is it correct that all 3 nodes in DC2 will respond back to the original
>> coordinator in DC1? Or will the DC2 nodes respond back to the DC2
>> coordinator?
>>
>> Let's say one of the replica nodes in DC2 is down. Who will hold the hint
>> for that node? The original coordinator in DC1 or the coordinator in DC2?
>>
>
>


Re: 1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete

2014-02-11 Thread sankalp kohli
If you don't have a schema, you are probably hitting this
https://issues.apache.org/jira/browse/CASSANDRA-6685


On Tue, Feb 11, 2014 at 8:22 AM, John Pyeatt wrote:

> I am trying to bring up a 6 node cluster in AWS. 3 seed nodes and 3
> non-seed nodes. One of each in each availability zone with 1.2.15 and my
> non-seed nodes never join the cluster. If I run 1.2.14 everything works
> fine. We are not using vnodes and all of the initial_token values are
> assigned based on the Murmur3 calculations.
>
> This isn't a data migration from a previous version. It is a completely
> clean cluster which I am starting from scratch.
>
> The seed nodes come up and join the cluster just fine. But none of my
> non-seed nodes are joining the cluster. In the logs I am seeing the
> following from one of my non-seed nodes. Note the repeats of the last lines
> that never go away.
>
>  INFO 15:58:54,729 Handshaking version with /10.0.12.13
>  INFO 15:58:55,724 Handshaking version with /10.0.32.126
>  INFO 15:58:56,726 Handshaking version with /10.0.22.230
> INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
>  INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
>  INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
>  INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
>  INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
>  INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
>  INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
>  INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
>  INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
>  INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
>  INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio
> is 12.87932647333957 (just-counted was 12.87932647333957).  calculation
> took 19ms for 38 columns
>  INFO 15:58:57,679 Handshaking version with /10.0.22.206
>  INFO 15:58:57,726 Handshaking version with /10.0.22.230
>  INFO 15:58:58,728 Handshaking version with /10.0.12.13
>  INFO 15:58:59,730 Handshaking version with /10.0.12.103
>  INFO 15:59:06,090 Handshaking version with /10.0.32.126
>
>
>
>
>
>
>
>  INFO 15:59:23,932 JOINING: waiting for schema information to complete
>  INFO 15:59:24,932 JOINING: waiting for schema information to complete
>  INFO 15:59:25,933 JOINING: waiting for schema information to complete
>  INFO 15:59:26,933 JOINING: waiting for schema information to complete
>  INFO 15:59:27,934 JOINING: waiting for schema information to complete
>  INFO 15:59:28,934 JOINING: waiting for schema information to complete
>  INFO 15:59:29,935 JOINING: waiting for schema information to complete
>  INFO 15:59:30,935 JOINING: waiting for schema information to complete
>
> So I suspect it is some sort of bootstrapping issue. I checked the
> CHANGES.txt and noticed this for 1.2.15.
> *Move handling of migration event source to solve bootstrap race
> (CASSANDRA-6648)*
> I looked at 6648 and there seems, based on some of the comments that there
> is a lack of confidence in this problem.
>
> Has anyone else seen this problem?
> --
> John Pyeatt
> Singlewire Software, LLC
> www.singlewire.com
> --
> 608.661.1184
> john.pye...@singlewire.com
>


1.2.15 non-seed nodes never join cluster. JOINING: waiting for schema information to complete

2014-02-11 Thread John Pyeatt
I am trying to bring up a 6-node cluster in AWS on 1.2.15: 3 seed nodes and
3 non-seed nodes, one of each in each availability zone. My non-seed nodes
never join the cluster. If I run 1.2.14 everything works fine. We are not
using vnodes and all of the initial_token values are assigned based on the
Murmur3 calculations.

This isn't a data migration from a previous version. It is a completely
clean cluster which I am starting from scratch.

The seed nodes come up and join the cluster just fine. But none of my
non-seed nodes are joining the cluster. In the logs I am seeing the
following from one of my non-seed nodes. Note the repeats of the last lines
that never go away.

 INFO 15:58:54,729 Handshaking version with /10.0.12.13
 INFO 15:58:55,724 Handshaking version with /10.0.32.126
 INFO 15:58:56,726 Handshaking version with /10.0.22.230
INFO 15:58:56,929 Node /10.0.32.126 is now part of the cluster
 INFO 15:58:56,930 InetAddress /10.0.32.126 is now UP
 INFO 15:58:56,957 Node /10.0.12.103 is now part of the cluster
 INFO 15:58:56,960 InetAddress /10.0.12.103 is now UP
 INFO 15:58:56,967 Node /10.0.22.206 is now part of the cluster
 INFO 15:58:56,968 InetAddress /10.0.22.206 is now UP
 INFO 15:58:56,975 Node /10.0.12.13 is now part of the cluster
 INFO 15:58:56,976 InetAddress /10.0.12.13 is now UP
 INFO 15:58:56,984 Node /10.0.22.230 is now part of the cluster
 INFO 15:58:56,984 InetAddress /10.0.22.230 is now UP
 INFO 15:58:57,010 CFS(Keyspace='system', ColumnFamily='peers') liveRatio is 12.87932647333957 (just-counted was 12.87932647333957).  calculation took 19ms for 38 columns
 INFO 15:58:57,679 Handshaking version with /10.0.22.206
 INFO 15:58:57,726 Handshaking version with /10.0.22.230
 INFO 15:58:58,728 Handshaking version with /10.0.12.13
 INFO 15:58:59,730 Handshaking version with /10.0.12.103
 INFO 15:59:06,090 Handshaking version with /10.0.32.126







 INFO 15:59:23,932 JOINING: waiting for schema information to complete
 INFO 15:59:24,932 JOINING: waiting for schema information to complete
 INFO 15:59:25,933 JOINING: waiting for schema information to complete
 INFO 15:59:26,933 JOINING: waiting for schema information to complete
 INFO 15:59:27,934 JOINING: waiting for schema information to complete
 INFO 15:59:28,934 JOINING: waiting for schema information to complete
 INFO 15:59:29,935 JOINING: waiting for schema information to complete
 INFO 15:59:30,935 JOINING: waiting for schema information to complete

So I suspect it is some sort of bootstrapping issue. I checked the
CHANGES.txt and noticed this for 1.2.15.
*Move handling of migration event source to solve bootstrap race
(CASSANDRA-6648)*
I looked at 6648 and, based on some of the comments, there seems to be a
lack of confidence in the fix.

Has anyone else seen this problem?
-- 
John Pyeatt
Singlewire Software, LLC
www.singlewire.com
--
608.661.1184
john.pye...@singlewire.com


Worse perf after Row Caching version 1.2.5:

2014-02-11 Thread PARASHAR, BHASKARJYA JAY
Hi,

I have two tables and I enabled row caching for both of them using CQL. These 
two CF's are very small with one about 300 rows and other < 2000 rows. The rows 
themselves are small.
Cassandra heap: 8gb.
a.   alter table TABLE_X with caching = 'rows_only';
b.  alter table TABLE_Y with caching = 'rows_only';
I also changed row_cache_size_in_mb: 1024 in the Cassandra.yaml file.
After extensive testing, it seems the performance of Table_X degraded from 
600ms to 750ms and Table_Y gained about 10 ms (from 188ms to 177 ms).
More Info
Table X is always queried with "Select * from Table_X";  Cfstats in Table_X 
shows Read Latency: NaN ms. I assumed that since we select all the rows, the 
entire table would be cached.
Table_Y has a secondary index and is queried on that index.


Would appreciate any input on why the performance is worse and how to enable 
row caching for these two tables.

Thanks
Jay




RE: Adding a node to cluster keeping 100% data replicated on all nodes

2014-02-11 Thread Brust, Corwin [Hollander]
I’m wondering if write consistency “ALL” on the KeySpace will help, here.   I 
too will listen eagerly for the ordained answer.

From: _ _ [mailto:a...@abv.bg]
Sent: Monday, February 10, 2014 5:02 AM
To: user@cassandra.apache.org
Subject: Re: Adding a node to cluster keeping 100% data replicated on all nodes

> Hi,
>
> Our environment will consist of clusters of no more than 2 to 4 nodes each
> (all located in the same DC). We want to ensure that every node in the
> cluster will own 100% of the data. The procedure for adding (or removing) a
> node will be automated, so we want to ensure we're taking the right steps.
> Let's say we have node 'A' up and running and want to add another node 'B'
> to make a cluster. Node A's configuration will be:
> seed: "IP of A"
> listen_address: "IP of A"
> num_tokens: 256
> rpc_address: 0.0.0.0
> The keyspace uses SimpleStrategy with RF: 1.
>
> To add node 'B' to the cluster we do the following:
> 1. Stop cassandra on B.
> 2. Update cassandra.yaml - change seed to point to "IP of A"
> 3. Update cassandra-topology.properties - add node A's IP to it and make it
>    the default one.
> 4. rm -rf /var/lib/cassandra/*
> 5. Start cassandra on B.
> 6. Wait until nodetool status reports that node B is up.
> 7. Update the RF of the keyspace to 2.
> 8. Run nodetool repair on B and wait for it to finish.
>
> Can we update the RF on A before starting Cassandra on B in order to skip
> steps 7 and 8?
>
>
> Now that the data is in sync on both nodes we want to make node B a seed
> node.
> 9. Update the seed property on A and B to include the IP of node B.
> 10. Restart cassandra on both nodes.
>
> If adding more nodes to the cluster the steps will be the same except that
> the seed property will contain all existing nodes in the cluster.
>
> So are these steps everything we need to do?
> Is there anything more we need to do?
> Is there an easier way to do what we want or are all the steps above
> mandatory?
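
Since the procedure above is meant to be automated, here is a rough Python sketch of the two values it has to compute on each run: the seed list, and the share of data each node holds for a given RF. The function names and IP addresses are hypothetical, and the ownership figure is a simplification that assumes SimpleStrategy in a single DC with evenly distributed tokens.

```python
def seeds_value(existing_node_ips):
    """Comma-separated seed list built from the nodes already in the cluster."""
    ordered = []
    for ip in existing_node_ips:
        if ip not in ordered:  # drop duplicates, keep first-seen order
            ordered.append(ip)
    return ",".join(ordered)


def average_ownership(replication_factor, node_count):
    """Average fraction of the data held per node (simplified model).

    Assumes SimpleStrategy, one DC, and an even token distribution.
    """
    if replication_factor < 1 or node_count < 1:
        raise ValueError("replication_factor and node_count must be >= 1")
    return min(replication_factor, node_count) / node_count


# Hypothetical addresses standing in for "IP of A" and "IP of B".
print(seeds_value(["10.0.0.1", "10.0.0.2"]))  # prints "10.0.0.1,10.0.0.2"
print(average_ownership(1, 2))                # prints 0.5
print(average_ownership(2, 2))                # prints 1.0
```

Under this model every node holds 100% of the data only once RF equals the number of nodes, which is why the RF has to follow the node count in the steps above.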

Good day,

That's something I'm looking for too. Unfortunately till now I haven't found 
the right way to achieve it. Nodetool repair takes lots of time to execute.






PRIVILEGE AND CONFIDENTIALITY NOTICE
The information in this electronic mail (and any associated attachments) is 
intended for the named recipient(s) only and may contain privileged and 
confidential information. If you have received this message in error, you are 
hereby notified that any use, disclosure, copying or alteration of this message 
is strictly prohibited. If you are not the intended recipient(s), please 
contact the sender by reply email and destroy all copies of the original 
message. Thank you.







RE: Recommended OS

2014-02-11 Thread Brust, Corwin [Hollander]
This /is/ our first cluster.  We've upgraded one-over-one since 2.0.2 (our 
initially deployed version), doing rolling updates across the rings.

No especial resource tuning save turning off SELinux and the usual (disabling 
swap, separate disk for commit logs, data and the OS).

From: Keith Wright [mailto:kwri...@nanigans.com]
Sent: Monday, February 10, 2014 4:35 PM
To: user@cassandra.apache.org
Cc: Don Jackson; Dave Carroll
Subject: RE: Recommended OS


Is this your first cluster?  Have you run older versions of Cassandra?  Any 
specific resource tuning?

Thanks all.  We are unable to bootstrap nodes and are considering creating a 
fresh cluster in hopes this is somehow data related.
On Feb 10, 2014 5:33 PM, "Brust, Corwin [Hollander]" 
<corwin.br...@hollanderparts.com> wrote:
We're running C* 2.0.5 under CentOS 6.5 and have not noticed anything like you 
describe.  We have just a couple of pre-production rings (Dev and Test) meaning 
nothing we have has received particularly intense utilization.

Corwin

From: Keith Wright [mailto:kwri...@nanigans.com]
Sent: Monday, February 10, 2014 2:09 PM
To: user@cassandra.apache.org
Cc: Don Jackson; Dave Carroll
Subject: Re: Recommended OS

We are running on CentOS 6.4 but an upgrade to 6.5 caused packets to back up on 
the net queue, causing HUGE load spikes and cluster meltdown.  Ultimately we 
reverted.  Have others seen this?  Are others running CentOS 6.4/6.5?

Thanks

From: , Joshua <joshua_sho...@cable.comcast.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 10, 2014 at 1:56 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: Don Jackson <djack...@nanigans.com>, Dave Carroll <dcarr...@nanigans.com>
Subject: Re: Recommended OS

What issues are you running into with CentOS 6.4/5?  I'm running 1.2.8 on 
CentOS 6.3 and Java 1.7.0-25, and about to test with 1.7.latest.
--
Josh Sholes

From: Keith Wright <kwri...@nanigans.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Monday, February 10, 2014 at 1:50 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Cc: Don Jackson <djack...@nanigans.com>, Dave Carroll <dcarr...@nanigans.com>
Subject: Recommended OS

Hi all,

I was wondering what operating systems and versions people are running with 
success in production environments?  We are using C* 1.2.13 and have had issues 
using CentOS 6.4/6.5.  Are others using that OS?  What would people recommend?  
What about Java 6 vs 7 (specific versions?!)?

Thanks!!!











Re: CQL3 Custom Functions

2014-02-11 Thread Sylvain Lebresne
On Mon, Feb 10, 2014 at 7:16 PM, Drew Kutcharian  wrote:

> Hey Guys,
>
> How can I define custom CQL3 functions (similar to dateOf, now, etc)?
>

You can't; there is currently no way to define custom functions.

--
Sylvain