Re: Advice in upgrade plan from 1.2.18 to 2.2.8

2016-12-22 Thread Edward Capriolo
Also before you get started. Make sure:
1) no one attempts to change schema during the process
2) no one attempts to run a repair
3) no one attempts to join a node
4) no one attempts to remove/move nodes from the cluster

Each of these operations triggers repair sessions and streams data, which does
not work in a mixed-version cluster.

On Thu, Dec 22, 2016 at 3:07 PM, Aiman Parvaiz  wrote:

> Thanks Alain. This was extremely helpful, really grateful.
>
> Aiman
>
> On Dec 22, 2016, at 5:00 AM, Alain RODRIGUEZ  wrote:
>
> Hi,
>
> Here are some thoughts:
>
> running 1.2.18. I plan to upgrade them to 2.2.latest
>>
>
> Going one major release at a time is probably the safest way to go, indeed.
>
>
>>1. Install 2.0.latest on one node at a time, start and wait for it to
>>join the ring.
>>2. Run upgradesstables on this node.
>>3. Repeat Step 1,2 on each node installing cassandra2.0 in a rolling
>>manner and running upgradesstables in parallel. (Please let me know if
>>running upgradesstables in parallel is not right here. My cluster is not
>>under much load really)
>>
>>
> I would:
>
> - Upgrade one node, check for cluster health (monitoring, logs, nodetool
> commands), paying special attention to the 2.0 node.
> - If everything is ok, then go for more nodes, if using distinct racks I
> would go per rack; sequentially, node per node, all the nodes from
> DC1-rack1, then DC1-rack2, then DC1-rack3. Then move to the next DC if
> everything is fine.
> - Start the 'upgradesstables' when the cluster is completely and
> successfully running with the new version (2.0.17). It is perfectly fine to
> run this in parallel as the last part of the upgrade. As you guessed, it is
> good to keep monitoring the cluster load.
>
> 4. Now I will have both my DCs running 2.0.latest.
>
>
> Without really having any strong argument, I would let it run for "some
> time" like this, hours at least, maybe days. In any case, you will probably
> have some work to prepare before the next upgrade, so you will have time to
> check how the cluster is doing.
>
> 6. Do I need to run upgradesstables here again after the node has started
>> and joined? (I think yes, but seek advice.
>> https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdCassandra.html)
>
>
> Yes, every time you run a major upgrade. In any case, nodetool upgradesstables
> will skip any sstables that do not need to be upgraded (as long as you don't
> add the option to force it), so it is probably better to run it when in
> doubt.
>
>
> As additional information, I would prepare, for each upgrade:
>
>
>- The new Cassandra configuration (cassandra.yaml and
>cassandra-env.sh mainly, but also other configuration files)
>
>To do that, I usually merge the current file in use (your configuration
>on C* 1.2.18) with the default file from GitHub for the new Cassandra
>version (e.g.
>https://github.com/apache/cassandra/tree/cassandra-2.0.17/conf).
>
>This allows you to
>   - Acknowledge and consider the new and removed configuration
>   settings
>   - Keep comments and default values in the configuration files up to
>   date
>   - Be fully exhaustive, and learn as you parse the files
>
>   - Make sure clients will still work with the new version (see the
>doc, do the tests)
>- Cassandra metrics changed in the latest versions, so you might have to
>rework your dashboards. Anticipating the dashboard creation for new
>versions would prevent you from losing metrics when you need them the
> most.
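The configuration merge described above can be sketched as a quick diff of top-level setting names between the file currently in use and the new default file. This is an illustrative sketch only; the YAML fragments below are made-up examples, not real defaults:

```python
# Compare top-level setting names between two cassandra.yaml versions,
# so new and removed settings are acknowledged before the upgrade.
# The YAML fragments here are illustrative, not real Cassandra defaults.
def setting_names(yaml_text):
    names = set()
    for line in yaml_text.splitlines():
        # Top-level settings start in column 0 and are not comments.
        if line and not line[0].isspace() and not line.startswith("#") and ":" in line:
            names.add(line.split(":", 1)[0].strip())
    return names

current = """\
cluster_name: Test
index_interval: 128
listen_address: 10.0.0.1
"""
new_default = """\
cluster_name: Test Cluster
min_index_interval: 128
listen_address: localhost
"""

removed = setting_names(current) - setting_names(new_default)
added = setting_names(new_default) - setting_names(current)
print("removed:", sorted(removed))
print("added:", sorted(added))
```

A real pass would of course keep the comments and defaults too (for example with a three-way merge tool), which is what preserves the "keep comments up to date" benefit mentioned above.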
>
>
> Finally, keep in mind that you should not perform any streaming while
> running multiple versions, and not until 'nodetool upgradesstables' is
> completely done. That means you should not add, remove, replace, move or
> repair a node. I would also limit schema changes as much as possible while
> running multiple versions, as this has caused trouble in the past.
>
> During an upgrade, nothing other than the normal service load and the
> upgrade itself should be happening. We always try to keep this
> time window as short as possible.
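A dry-run sketch of the rolling order described above (one node first, then rack by rack, with upgradesstables only once every node runs the new version). The rack and node names are hypothetical, and a real run would execute these commands over ssh rather than printing them:

```python
# Dry-run plan of the rolling upgrade order: sequential within each rack,
# rack by rack, then upgradesstables in parallel across all nodes.
# Node names are hypothetical placeholders.
racks = {
    "DC1-rack1": ["node1", "node2"],
    "DC1-rack2": ["node3", "node4"],
    "DC1-rack3": ["node5", "node6"],
}

def plan_upgrade(racks):
    steps = []
    for rack, nodes in racks.items():
        for node in nodes:  # sequential, one node at a time
            steps.append(f"{node}: nodetool drain")
            steps.append(f"{node}: stop cassandra, install 2.0.17, start")
            steps.append(f"{node}: check logs / nodetool status")
    # Only once every node runs the new version:
    for nodes in racks.values():
        for node in nodes:  # safe to run in parallel
            steps.append(f"{node}: nodetool upgradesstables")
    return steps

for step in plan_upgrade(racks):
    print(step)
```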
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2016-12-21 20:36 GMT+01:00 Aiman Parvaiz :
>
>> Hi everyone,
>> I have 2 C* DCs with 12 nodes in each running 1.2.18. I plan to upgrade
>> them to 2.2.latest and wanted to run by you experts my plan.
>>
>>
>>1. Install 2.0.latest on one node at a time, start and wait for it to
>>join the ring.
>>2. Run upgradesstables on this node.
>>3. Repeat Step 1,2 on each node installing cassandra2.0 in a rolling
>>manner and running upgradesstables in parallel. (Please let me know if
>>running upgradesstables in parallel is not right here. My cluster is not
>>under much load really)
>>4. Now I will have both my DCs running 2.0.latest.
>>5. Install cassandra 2.1.late

Re: Advice in upgrade plan from 1.2.18 to 2.2.8

2016-12-22 Thread Aiman Parvaiz
Thanks Alain. This was extremely helpful, really grateful.

Aiman
On Dec 22, 2016, at 5:00 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

Hi,

Here are some thoughts:

running 1.2.18. I plan to upgrade them to 2.2.latest

Going one major release at a time is probably the safest way to go, indeed.


  1.  Install 2.0.latest on one node at a time, start and wait for it to join 
the ring.
  2.  Run upgradesstables on this node.
  3.  Repeat Step 1,2 on each node installing cassandra2.0 in a rolling manner 
and running upgradesstables in parallel. (Please let me know if running 
upgradesstables in parallel is not right here. My cluster is not under much 
load really)

I would:

- Upgrade one node, check for cluster health (monitoring, logs, nodetool 
commands), paying special attention to the 2.0 node.
- If everything is ok, then go for more nodes, if using distinct racks I would 
go per rack; sequentially, node per node, all the nodes from DC1-rack1, then 
DC1-rack2, then DC1-rack3. Then move to the next DC if everything is fine.
- Start the 'upgradesstables' when the cluster is completely and successfully 
running with the new version (2.0.17). It is perfectly fine to run this in 
parallel as the last part of the upgrade. As you guessed, it is good to keep 
monitoring the cluster load.

4. Now I will have both my DCs running 2.0.latest.

Without really having any strong argument, I would let it run for "some time" 
like this, hours at least, maybe days. In any case, you will probably have some 
work to prepare before the next upgrade, so you will have time to check how the 
cluster is doing.

6. Do I need to run upgradesstables here again after the node has started and 
joined? (I think yes, but seek advice. 
https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdCassandra.html)

Yes, every time you run a major upgrade. In any case, nodetool upgradesstables 
will skip any sstables that do not need to be upgraded (as long as you don't 
add the option to force it), so it is probably better to run it when in doubt.


As additional information, I would prepare, for each upgrade:


  *   The new Cassandra configuration (cassandra.yaml and cassandra-env.sh 
mainly, but also other configuration files)

To do that, I usually merge the current file in use (your configuration on C* 
1.2.18) with the default file from GitHub for the new Cassandra version (e.g. 
https://github.com/apache/cassandra/tree/cassandra-2.0.17/conf).

This allows you to
 *   Acknowledge and consider the new and removed configuration settings
 *   Keep comments and default values in the configuration files up to date
 *   Be fully exhaustive, and learn as you parse the files

  *   Make sure clients will still work with the new version (see the doc, do 
the tests)
  *   Cassandra metrics changed in the latest versions, so you might have to 
rework your dashboards. Anticipating the dashboard creation for new versions 
would prevent you from losing metrics when you need them the most.

Finally, keep in mind that you should not perform any streaming while running 
multiple versions, and not until 'nodetool upgradesstables' is completely 
done. That means you should not add, remove, replace, move or repair a node. 
I would also limit schema changes as much as possible while running multiple 
versions, as this has caused trouble in the past.

During an upgrade, nothing other than the normal service load and the upgrade 
itself should be happening. We always try to keep this time window as 
short as possible.

C*heers,
---
Alain Rodriguez - @arodream - 
al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-12-21 20:36 GMT+01:00 Aiman Parvaiz <ai...@steelhouse.com>:
Hi everyone,
I have 2 C* DCs with 12 nodes in each running 1.2.18. I plan to upgrade them to 
2.2.latest and wanted to run by you experts my plan.


  1.  Install 2.0.latest on one node at a time, start and wait for it to join 
the ring.
  2.  Run upgradesstables on this node.
  3.  Repeat Step 1,2 on each node installing cassandra2.0 in a rolling manner 
and running upgradesstables in parallel. (Please let me know if running 
upgradesstables in parallel is not right here. My cluster is not under much 
load really)
  4.  Now I will have both my DCs running 2.0.latest.
  5.  Install cassandra 2.1.latest on one node at a time (same as above)
  6.  Do I need to run upgradesstables here again after the node has started 
and joined? (I think yes, but seek advice. 
https://docs.datastax.com/en/latest-upgrade/upgrade/cassandra/upgrdCassandra.html)
  7.  Following the above pattern, I would install cassandra2.1 in a rolling 
manner across 2 DCs (depending on response to 6 I might or might not run 
upgradesstables)
  8.  At this point both DCs would have 2.1.latest and again in rolling manner 
I install 2.2.8.

My

Re: Cassandra cluster performance

2016-12-22 Thread Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)
Yes, there is definitely something wrong but I’m struggling to figure out what 
exactly. To answer your questions.

-  There are no errors in the client or in Cassandra.

-  I tried manual inserts and there are no errors either. I turned tracing on, 
so I can see that the data is distributed across different partitions. Even 
nodetool status shows ownership split 48.4% to 51.6%.
Regards,
Branislav

From: Ben Slater 
Reply-To: "user@cassandra.apache.org" 
Date: Wednesday, December 21, 2016 at 6:20 PM
To: "user@cassandra.apache.org" 
Subject: Re: Cassandra cluster performance

Given you’re using replication factor 1 (so each piece of data is only going to 
get written to one node) something definitely seems wrong. Some questions/ideas:
- are there any errors in the Cassandra logs or are you seeing any errors at 
the client?
- is your test data distributed across your partition key or is it possible all 
your test data is going to a single partition?
- have you tried manually running a few inserts to see if you get any errors?

Cheers
Ben


On Thu, 22 Dec 2016 at 11:48 Branislav Janosik -T (bjanosik - AAP3 INC at 
Cisco) <bjano...@cisco.com> wrote:
Hi,

- Consistency level is set to ONE
-  Keyspace definition:

"CREATE KEYSPACE  IF NOT EXISTS  onem2m " +
"WITH replication = " +
"{ 'class' : 'SimpleStrategy', 'replication_factor' : 1}";



- yes, the client is on separate VM

- In our project we use the Cassandra driver API version 3.0.2, but the 
database (cluster) is version 3.9

- for 2node cluster:

 first VM: 25 GB RAM, 16 CPUs

 second VM: 16 GB RAM, 16 CPUs




From: Ben Slater <ben.sla...@instaclustr.com>
Reply-To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Date: Wednesday, December 21, 2016 at 2:32 PM
To: "user@cassandra.apache.org" <user@cassandra.apache.org>
Subject: Re: Cassandra cluster performance

You would expect some drop when moving from a single node to multiple nodes, 
but on the face of it that feels extreme to me (although I’ve never personally 
tested the difference). Some questions that might help provide an answer:
- what consistency level are you using for the test?
- what is your keyspace definition (replication factor most importantly)?
- where are you running your test client (is it a separate box to cassandra)?
- what C* version?
- what are specs (CPU, RAM) of the test servers?

Cheers
Ben

On Thu, 22 Dec 2016 at 09:26 Branislav Janosik -T (bjanosik - AAP3 INC at 
Cisco) <bjano...@cisco.com> wrote:
Hi all,

I’m working on a project where we have a Java benchmark test for measuring 
performance when using a Cassandra database. Create operations on a single-node 
Cassandra cluster run at about 15K operations per second. The problem we have 
is that when I set up a cluster with 2 or more nodes (each on a separate 
virtual machine and server), performance drops to 1K ops/sec. I follow the 
official instructions on how to set up a multinode cluster – the only things I 
change in cassandra.yaml are: set seeds to the IP address of one node, set the 
listen and rpc addresses to the IP address of the node, and change the endpoint 
snitch to GossipingPropertyFileSnitch. The replication factor is set to 1 for 
the 2-node cluster. I use only one datacenter. The cluster seems to be doing 
fine (I can see the nodes communicating), and so is the CPU and RAM usage on 
the machines.

Does anybody have any ideas? Any help would be very appreciated.

Thanks!



Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-22 Thread Kant Kodali
I would agree with Eric on his statement below. In fact, I was trying
to say the same thing.

"I don't really have any opinions on Oracle per se, but Cassandra is a
Free Software project and I would prefer that we not depend on
commercial software, (and that's kind of what we have here, an
implicit dependency)."

On Thu, Dec 22, 2016 at 3:09 AM, Brice Dutheil 
wrote:

> Pretty much a non-story, it seems like.
>
> Clickbait imho. Search ‘The Register’ in this wikipedia page
> 
>
> @Ben Manes
>
> Agreed, OpenJDK and the Oracle JDK are now pretty close, but there are still
> some differences in the VM code and in third-party dependencies like security
> libraries. Maybe that’s fine for some production environments, but maybe not
> for everyone.
>
> Another thing: while the OpenJDK source is available to all, I don’t
> think all OpenJDK builds have been certified with the TCK. For example the
> Zulu OpenJDK is, as Azul have access to the TCK and certify
>  the builds. As another example, the OpenJDK
> build installed on RHEL is certified
> . Canonical is probably
> running TCK compliance tests as well on their OpenJDK 8, since they are listed
> on the signatories
> 
> but I’m not sure, as I couldn’t find evidence of this; on that signatories list
> there’s also an individual – Emmanuel Bourg – who is related to Debian
>  (linkedin
> ), but again I’m not sure the TCK is
> passed for each build.
>
> Bad OpenJDK intermediary builds, i.e. ones without TCK compliance tests, are a
> reality
> 
> .
>
> While the situation has improved over the past months, I’d still double-check
> before using any OpenJDK build.
> ​
>
> -- Brice
>
> On Wed, Dec 21, 2016 at 5:08 PM, Voytek Jarnot 
> wrote:
>
>> Reading that article the only conclusion I can reach (unless I'm
>> misreading) is that all the stuff that was never free is still not free -
>> the change is that Oracle may actually be interested in the fact that some
>> are using non-free products for free.
>>
>> Pretty much a non-story, it seems like.
>>
>> On Tue, Dec 20, 2016 at 11:55 PM, Kant Kodali  wrote:
>>
>>> Looking at this http://www.theregister.co.uk/2016/12/16/oracle_targets_
>>> java_users_non_compliance/?mt=1481919461669 I don't know why Cassandra
>>> recommends the Oracle JVM.
>>>
>>> The JVM is a great piece of software, but I would like to stay away from
>>> Oracle as much as possible. Oracle is just horrible in the way they are
>>> dealing with Java in general.
>>>
>>>
>>>
>>
>


Re: Openstack and Cassandra

2016-12-22 Thread Shalom Sagges
Thanks for the info Aaron!

I will test it in the hope that there will be no issues. If none occur,
this could actually be a good idea and would save a lot of resources.

Have a great day!


Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections



On Thu, Dec 22, 2016 at 6:27 PM, Aaron Ploetz  wrote:

> Shalom,
>
> We (Target) have been challenged by our management team to leverage
> OpenStack whenever possible, and that includes Cassandra.  I was against it
> at first, but we have done some stress testing with it and had application
> teams try it out.  So far, there haven't been any issues.
>
> A good use case for Cassandra on OpenStack, is to support an
> internal-facing application that needs to scale for disk footprint, or to
> spin-up a quick dev environment.  When building clusters to support those
> solutions, we haven't had any problems due to simply deploying on
> OpenStack.  Our largest Cassandra cluster on OpenStack is currently around
> 30 nodes.  OpenStack is a good solution for that particular use case as we
> can easily add/remove nodes to accommodate the dynamic disk usage
> requirements.
>
> However, when query latency is a primary concern, I do still recommend
> that we use one of our external cloud providers.
>
> Hope that helps,
>
> Aaron
>
> On Thu, Dec 22, 2016 at 9:51 AM, Shalom Sagges 
> wrote:
>
>> Thanks Vladimir!
>>
>> I guess I'll just have to deploy and continue from there.
>>
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>> We Create Meaningful Connections
>>
>>
>> On Thu, Dec 22, 2016 at 5:20 PM, Vladimir Yudovin 
>> wrote:
>>
>>> Hi Shalom,
>>>
>>> I don't see any reason why it wouldn't work,  but obviously, any
>>> resource sharing affects performance. You can expect less degradation with
>>> SSD disks, I guess.
>>>
>>>
>>> Best regards, Vladimir Yudovin,
>>> *Winguzone  - Cloud Cassandra Hosting*
>>>
>>>
>>>  On Wed, 21 Dec 2016 13:31:22 -0500 *Shalom Sagges* wrote 
>>>
>>> Hi Everyone,
>>>
>>> I am looking into the option of deploying a Cassandra cluster on
>>> Openstack nodes instead of physical nodes due to resource management
>>> considerations.
>>>
>>> Does anyone have any insights regarding this?
>>> Can this combination work properly?
>>> Since the disks (HDDs) are part of one physical machine that divides
>>> its capacity among various instances (not only Cassandra), will this affect
>>> performance, especially when the commitlog directory will probably reside
>>> with the data directory?
>>>
>>> I'm at a loss here and don't have any answers for that matter.
>>>
>>> Can anyone assist please?
>>>
>>> Thanks!
>>>
>>>
>>>
>>>
>>> Shalom Sagges
>>> DBA
>>> T: +972-74-700-4035
>>> We Create Meaningful Connections
>>>
>>>
>>>
>>>
>>> This message may contain confidential and/or privileged information.
>>> If you are not the addressee or authorized to receive this on behalf of
>>> the addressee you must not use, copy, disclose or take action based on this
>>> message or any information herein.
>>> If you have received this message in error, please advise the sender
>>> immediately by reply email and delete this message. Thank you.
>>>
>>>
>>>
>>
>>
>
>



Re: Openstack and Cassandra

2016-12-22 Thread Aaron Ploetz
Shalom,

We (Target) have been challenged by our management team to leverage
OpenStack whenever possible, and that includes Cassandra.  I was against it
at first, but we have done some stress testing with it and had application
teams try it out.  So far, there haven't been any issues.

A good use case for Cassandra on OpenStack, is to support an
internal-facing application that needs to scale for disk footprint, or to
spin-up a quick dev environment.  When building clusters to support those
solutions, we haven't had any problems due to simply deploying on
OpenStack.  Our largest Cassandra cluster on OpenStack is currently around
30 nodes.  OpenStack is a good solution for that particular use case as we
can easily add/remove nodes to accommodate the dynamic disk usage
requirements.

However, when query latency is a primary concern, I do still recommend that
we use one of our external cloud providers.

Hope that helps,

Aaron

On Thu, Dec 22, 2016 at 9:51 AM, Shalom Sagges 
wrote:

> Thanks Vladimir!
>
> I guess I'll just have to deploy and continue from there.
>
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> We Create Meaningful Connections
>
>
> On Thu, Dec 22, 2016 at 5:20 PM, Vladimir Yudovin 
> wrote:
>
>> Hi Shalom,
>>
>> I don't see any reason why it wouldn't work,  but obviously, any resource
>> sharing affects performance. You can expect less degradation with SSD
>> disks, I guess.
>>
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone  - Cloud Cassandra Hosting*
>>
>>
>>  On Wed, 21 Dec 2016 13:31:22 -0500 *Shalom Sagges* wrote 
>>
>> Hi Everyone,
>>
>> I am looking into the option of deploying a Cassandra cluster on
>> Openstack nodes instead of physical nodes due to resource management
>> considerations.
>>
>> Does anyone have any insights regarding this?
>> Can this combination work properly?
>> Since the disks (HDDs) are part of one physical machine that divides its
>> capacity among various instances (not only Cassandra), will this affect
>> performance, especially when the commitlog directory will probably reside
>> with the data directory?
>>
>> I'm at a loss here and don't have any answers for that matter.
>>
>> Can anyone assist please?
>>
>> Thanks!
>>
>>
>>
>>
>> Shalom Sagges
>> DBA
>> T: +972-74-700-4035
>> We Create Meaningful Connections
>>
>>
>>
>>
>>
>>
>>
>
>


Re: Openstack and Cassandra

2016-12-22 Thread Shalom Sagges
Thanks Vladimir!

I guess I'll just have to deploy and continue from there.




Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections



On Thu, Dec 22, 2016 at 5:20 PM, Vladimir Yudovin 
wrote:

> Hi Shalom,
>
> I don't see any reason why it wouldn't work,  but obviously, any resource
> sharing affects performance. You can expect less degradation with SSD
> disks, I guess.
>
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
>  On Wed, 21 Dec 2016 13:31:22 -0500 *Shalom Sagges* wrote 
>
> Hi Everyone,
>
> I am looking into the option of deploying a Cassandra cluster on Openstack
> nodes instead of physical nodes due to resource management considerations.
>
> Does anyone have any insights regarding this?
> Can this combination work properly?
> Since the disks (HDDs) are part of one physical machine that divides its
> capacity among various instances (not only Cassandra), will this affect
> performance, especially when the commitlog directory will probably reside
> with the data directory?
>
> I'm at a loss here and don't have any answers for that matter.
>
> Can anyone assist please?
>
> Thanks!
>
>
>
>
> Shalom Sagges
> DBA
> T: +972-74-700-4035
> We Create Meaningful Connections
>
>
>
>
>
>
>



Re: Not timing out some queries (Java driver)

2016-12-22 Thread Ali Akhtar
The replication factor is the default - I haven't changed it. Would
tweaking it help?

On Thu, Dec 22, 2016 at 8:41 PM, Ali Akhtar  wrote:

> Vladimir,
>
> I'm receiving a batch of messages which are out of order, and I need to
> process those messages in order.
>
> My solution is to write them to a cassandra table first, where they'll be
> ordered by their timestamp.
>
> Then read them back from that table, knowing that they'll be ordered.
>
> But for this to work, I need the data to be available immediately after I
> write it. For this, I think I need consistency = ALL.
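A hedged sketch of how the "write, then immediately read back" step can be made robust without raising any global timeout: retry the statement at the application level until it is acknowledged. The `run_query` callable below is a stub standing in for a real `session.execute` call at consistency ALL, so the example is self-contained:

```python
import time

def execute_with_retries(run_query, max_attempts=5, base_delay=0.1):
    """Retry a query until it succeeds, backing off exponentially.
    run_query is any zero-argument callable that raises TimeoutError
    when not all replicas acknowledged in time."""
    for attempt in range(max_attempts):
        try:
            return run_query()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise
            time.sleep(base_delay * (2 ** attempt))  # 0.1s, 0.2s, 0.4s, ...

# Stub standing in for session.execute(statement) with CL=ALL:
calls = {"n": 0}
def flaky_insert():
    calls["n"] += 1
    if calls["n"] < 3:  # first two attempts "time out"
        raise TimeoutError("replica did not respond")
    return "applied"

print(execute_with_retries(flaky_insert))  # succeeds on the third attempt
```

For what it's worth, the Java driver also allows per-statement overrides (for example, setting the consistency level on an individual statement), so none of this has to be configured globally.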
>
> On Thu, Dec 22, 2016 at 8:29 PM, Vladimir Yudovin 
> wrote:
>
>> What is replication factor? Why not use CONSISTENCY QUORUM? It's faster
>> and safe enough.
>>
>> Best regards, Vladimir Yudovin,
>> *Winguzone  - Cloud Cassandra Hosting*
>>
>>
>>  On Thu, 22 Dec 2016 10:14:14 -0500 *Ali Akhtar* wrote 
>>
>> Is it possible to provide these options per query rather than set them
>> globally?
>>
>> On Thu, Dec 22, 2016 at 7:15 AM, Voytek Jarnot 
>> wrote:
>>
>> cassandra.yaml has various timeouts such as read_request_timeout,
>> range_request_timeout, write_request_timeout, etc.  The driver does as well
>> (via Cluster -> Configuration -> SocketOptions -> setReadTimeoutMillis).
>>
>> Not sure if you can (or would want to) set them to "forever", but it's a
>> starting point.
>>
>> On Wed, Dec 21, 2016 at 7:10 PM, Ali Akhtar  wrote:
>>
>> I have some queries which need to be processed in a consistent manner.
>> I'm setting the consistency level = ALL option on these queries.
>>
>> However, I've noticed that sometimes these queries fail because of a
>> timeout (2 seconds).
>>
>> In my use case, for certain queries, I want them to never time out and
>> block until they have been acknowledged by all nodes.
>>
>> Is that possible thru the Datastax Java driver, or another way?
>>
>>
>>
>


Re: Not timing out some queries (Java driver)

2016-12-22 Thread Ali Akhtar
Vladimir,

I'm receiving a batch of messages which are out of order, and I need to
process those messages in order.

My solution is to write them to a cassandra table first, where they'll be
ordered by their timestamp.

Then read them back from that table, knowing that they'll be ordered.

But for this to work, I need the data to be available immediately after I
write it. For this, I think I need consistency = ALL.

On Thu, Dec 22, 2016 at 8:29 PM, Vladimir Yudovin 
wrote:

> What is replication factor? Why not use CONSISTENCY QUORUM? It's faster
> and safe enough.
>
> Best regards, Vladimir Yudovin,
> *Winguzone  - Cloud Cassandra Hosting*
>
>
>  On Thu, 22 Dec 2016 10:14:14 -0500 *Ali Akhtar* wrote 
>
> Is it possible to provide these options per query rather than set them
> globally?
>
> On Thu, Dec 22, 2016 at 7:15 AM, Voytek Jarnot 
> wrote:
>
> cassandra.yaml has various timeouts such as read_request_timeout,
> range_request_timeout, write_request_timeout, etc.  The driver does as well
> (via Cluster -> Configuration -> SocketOptions -> setReadTimeoutMillis).
>
> Not sure if you can (or would want to) set them to "forever", but it's a
> starting point.
>
> On Wed, Dec 21, 2016 at 7:10 PM, Ali Akhtar  wrote:
>
> I have some queries which need to be processed in a consistent manner. I'm
> setting the consistency level = ALL option on these queries.
>
> However, I've noticed that sometimes these queries fail because of a
> timeout (2 seconds).
>
> In my use case, for certain queries, I want them to never time out and
> block until they have been acknowledged by all nodes.
>
> Is that possible thru the Datastax Java driver, or another way?
>
>
>


Re: Not timing out some queries (Java driver)

2016-12-22 Thread Vladimir Yudovin
What is replication factor? Why not use CONSISTENCY QUORUM? It's faster and 
safe enough.
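To illustrate why QUORUM is considered safe: a quorum is floor(RF/2) + 1 replicas, so any write quorum and any read quorum must overlap in at least one replica, and a quorum read always sees the latest quorum-acknowledged write. A small sketch:

```python
# Quorum overlap: with W write acks and R read responses out of RF
# replicas, W + R > RF guarantees the read set intersects the write set,
# so at least one replica in every read holds the latest write.
def quorum(rf):
    return rf // 2 + 1

for rf in (1, 3, 5):
    w = r = quorum(rf)
    print(f"RF={rf}: quorum={w}, overlap guaranteed: {w + r > rf}")
```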



Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






 On Thu, 22 Dec 2016 10:14:14 -0500 Ali Akhtar wrote 




Is it possible to provide these options per query rather than set them globally?



On Thu, Dec 22, 2016 at 7:15 AM, Voytek Jarnot  
wrote:

cassandra.yaml has various timeouts such as read_request_timeout, 
range_request_timeout, write_request_timeout, etc.  The driver does as well 
(via Cluster -> Configuration -> SocketOptions -> 
setReadTimeoutMillis).



Not sure if you can (or would want to) set them to "forever", but it's a 
starting point.




On Wed, Dec 21, 2016 at 7:10 PM, Ali Akhtar  wrote:

I have some queries which need to be processed in a consistent manner. I'm 
setting the consistency level = ALL option on these queries.



However, I've noticed that sometimes these queries fail because of a timeout (2 
seconds).



In my use case, for certain queries, I want them to never time out and block 
until they have been acknowledged by all nodes.



Is that possible thru the Datastax Java driver, or another way?















Re: Openstack and Cassandra

2016-12-22 Thread Vladimir Yudovin
Hi Shalom,



I don't see any reason why it wouldn't work,  but obviously, any resource 
sharing affects performance. You can expect less degradation with SSD disks, I 
guess.





Best regards, Vladimir Yudovin, 

Winguzone - Cloud Cassandra Hosting






 On Wed, 21 Dec 2016 13:31:22 -0500 Shalom Sagges wrote 




Hi Everyone, 



I am looking into the option of deploying a Cassandra cluster on Openstack 
nodes instead of physical nodes due to resource management considerations. 



Does anyone have any insights regarding this?

Can this combination work properly? 

Since the disks (HDDs) are part of one physical machine that divides its 
capacity among various instances (not only Cassandra), will this affect 
performance, especially when the commitlog directory will probably reside with 
the data directory?



I'm at a loss here and don't have any answers for that matter. 



Can anyone assist please? 



Thanks!





 


 
Shalom Sagges
DBA
T: +972-74-700-4035
We Create Meaningful Connections

















Re: Not timing out some queries (Java driver)

2016-12-22 Thread Ali Akhtar
Is it possible to provide these options per query rather than set them
globally?

On Thu, Dec 22, 2016 at 7:15 AM, Voytek Jarnot 
wrote:

> cassandra.yaml has various timeouts such as read_request_timeout,
> range_request_timeout, write_request_timeout, etc.  The driver does as well
> (via Cluster -> Configuration -> SocketOptions -> setReadTimeoutMillis).
>
> Not sure if you can (or would want to) set them to "forever", but it's a
> starting point.
>
> On Wed, Dec 21, 2016 at 7:10 PM, Ali Akhtar  wrote:
>
>> I have some queries which need to be processed in a consistent manner.
>> I'm setting the consistency level = ALL option on these queries.
>>
>> However, I've noticed that sometimes these queries fail because of a
>> timeout (2 seconds).
>>
>> In my use case, for certain queries, I want them to never time out and
>> block until they have been acknowledged by all nodes.
>>
>> Is that possible through the Datastax Java driver, or another way?
>>
>
>
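On the per-query question: with the 3.x DataStax Java driver, both the consistency level and the driver-side read timeout can be set on an individual Statement rather than globally. A minimal sketch, assuming a 3.x driver; the contact point, keyspace/table and query are placeholders:

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.ConsistencyLevel;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PerQueryOptions {
    public static void main(String[] args) {
        // Contact point is a placeholder; adjust for your cluster.
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            Statement stmt = new SimpleStatement("SELECT * FROM ks.tbl WHERE id = ?", 1)
                    // Per-query consistency, instead of the session default:
                    .setConsistencyLevel(ConsistencyLevel.ALL)
                    // Per-statement override of the driver's SocketOptions read
                    // timeout; 0 should disable the client-side timeout for this
                    // statement only (3.x driver).
                    .setReadTimeoutMillis(0);
            session.execute(stmt);
        }
    }
}
```

Note this only controls the client side: the coordinator still enforces the server-side read_request_timeout_in_ms / write_request_timeout_in_ms from cassandra.yaml, so a query can never truly block forever unless those are raised as well.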


Re: Advice in upgrade plan from 1.2.18 to 2.2.8

2016-12-22 Thread Alain RODRIGUEZ
Hi,

Here are some thoughts:

running 1.2.18. I plan to upgrade them to 2.2.latest
>

Going one major release at a time is probably the safest way to go indeed.


>1. Install 2.0.latest on one node at a time, start and wait for it to
>join the ring.
>2. Run upgradesstables on this node.
>3. Repeat Step 1,2 on each node installing cassandra2.0 in a rolling
>manner and running upgradesstables in parallel. (Please let me know if
>running upgradesstables in parallel is not right here. My cluster is not
>under much load really)
>
>
I would:

- Upgrade one node, check for cluster health (monitoring, logs, nodetool
commands), paying special attention to the 2.0 node.
- If everything is OK, then go for more nodes. If using distinct racks, I
would go rack by rack: sequentially, node by node, all the nodes from
DC1-rack1, then DC1-rack2, then DC1-rack3. Then move to the next DC if
everything is fine.
- Start the 'upgradesstables' when the cluster is completely and
successfully running with the new version (2.0.17). It is perfectly fine to
run this in parallel as the last part of the upgrade. As you guessed, it is
good to keep monitoring the cluster load.
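As a sketch, the per-node part of that sequence might look like this (the service commands, package installation step and ordering details are assumptions to adapt to your platform):

```shell
# On ONE node at a time; verify cluster health before moving to the next.
nodetool drain                 # flush memtables, stop accepting new writes
sudo service cassandra stop
# ... install the 2.0.17 binaries/packages for your platform here ...
sudo service cassandra start
nodetool version               # confirm the node restarted on the new version
nodetool status                # every node should be Up/Normal before continuing
# Only once the WHOLE cluster is successfully on 2.0.17:
nodetool upgradesstables       # fine to run on several nodes in parallel
```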

4. Now I will have both my DCs running 2.0.latest.


Without really having any strong argument, I would let it run for "some
time" like this, hours at least, maybe days. In any case, you will probably
have some work to prepare before the next upgrade, so you will have time to
check how the cluster is doing.

6. Do I need to run upgradesstables here again after the node has started
> and joined? (I think yes, but seek advice. https://docs.datastax.
> com/en/latest-upgrade/upgrade/cassandra/upgrdCassandra.html)


Yes, every time you run a major upgrade. Anyway, nodetool upgradesstables
will skip any sstables that do not need to be upgraded (as long as you don't
add the option to force it), so it is probably better to run it whenever in
doubt.


As additional information, I would prepare, for each upgrade:


   - The new Cassandra configuration (cassandra.yaml and cassandra-env.sh
   mainly, but also other configuration files)

   To do that, I usually merge the current file in use (your configuration
   on C* 1.2.18) with the Cassandra version file from GitHub for the new
   version (i.e.
   https://github.com/apache/cassandra/tree/cassandra-2.0.17/conf).

   This allows you to
  - Acknowledge and consider the new and removed configuration settings
  - Keep comments and default values in the configuration files up to
  date
  - Be fully exhaustive, and learn as you parse the files

  - Make sure clients will still work with the new version (see the
   doc, do the tests)
   - Cassandra metrics changed in the latest versions; you might have to
   rework your dashboards. Anticipating the dashboard creation for new
   versions would prevent you from losing metrics when you need them the most.
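The configuration-merge step can start from a plain diff against the stock file for the target version, e.g. (the local path is an assumption):

```shell
# Fetch the default cassandra.yaml shipped with the target version and diff
# it against the file currently in use, to surface new, removed and changed
# settings before merging them by hand.
curl -s -o cassandra-2.0.17-default.yaml \
  https://raw.githubusercontent.com/apache/cassandra/cassandra-2.0.17/conf/cassandra.yaml
diff -u /etc/cassandra/cassandra.yaml cassandra-2.0.17-default.yaml
```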


Finally, keep in mind that you should not perform any streaming while
running multiple versions and as long as 'nodetool upgradesstables' is not
completely done. Meaning you should not add, remove, replace, move or
repair a node. Also, I would limit schema changes as much as possible while
running multiple versions, as this has caused trouble in the past.

During an upgrade, almost nothing other than the normal load due to the
service and the upgrade itself should happen. We always try to keep this
time window as short as possible.

C*heers,
---
Alain Rodriguez - @arodream - al...@thelastpickle.com
France

The Last Pickle - Apache Cassandra Consulting
http://www.thelastpickle.com

2016-12-21 20:36 GMT+01:00 Aiman Parvaiz :

> Hi everyone,
> I have 2 C* DCs with 12 nodes in each running 1.2.18. I plan to upgrade
> them to 2.2.latest and wanted to run by you experts my plan.
>
>
>1. Install 2.0.latest on one node at a time, start and wait for it to
>join the ring.
>2. Run upgradesstables on this node.
>3. Repeat Step 1,2 on each node installing cassandra2.0 in a rolling
>manner and running upgradesstables in parallel. (Please let me know if
>running upgradesstables in parallel is not right here. My cluster is not
>under much load really)
>4. Now I will have both my DCs running 2.0.latest.
>5. Install cassandra 2.1.latest on one node at a time (same as above)
>6. Do I need to run upgradesstables here again after the node has
>started and joined? (I think yes, but seek advice.
>https://docs.datastax.com/en/latest-upgrade/upgrade/
>cassandra/upgrdCassandra.html
>
> 
>)
>7. Following the above pattern, I would install cassandra2.1 in a
>rolling manner across 2 DCs (depending on response to 6 I might or might
>not run upgradesstables)
>8. At this point both DCs would have 2.1.latest and again in rolling
>manner I install 2.2.8.
>
>
> My assumption is that while this up

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-22 Thread Brice Dutheil
Pretty much a non-story, it seems like.

Clickbait imho. Search ‘The Register’ in this Wikipedia page.


@Ben Manes

Agreed, OpenJDK and Oracle JDK are now pretty close, but there are still
some differences in the VM code and third-party dependencies like security
libraries. Maybe that's fine for some productions, but maybe not for
everyone.

Also, another thing: while OpenJDK source is available to all, I don't think
all OpenJDK builds have been certified with the TCK. For example the Zulu
OpenJDK is, as Azul has access to the TCK and certifies the builds. Another
example: the OpenJDK build installed on RHEL is certified. Canonical is
probably running TCK compliance tests as well on their OpenJDK 8, since they
are listed on the signatories, but I'm not sure, as I couldn't find evidence
of this; on the signatories list again there's an individual – Emmanuel
Bourg – who is related to Debian, but again I'm not sure the TCK is passed
for each build.

Bad OpenJDK intermediary builds, i.e. builds without TCK compliance tests,
are a reality.

While the situation has improved over the past months, I'll still double-check
before using any OpenJDK build.
​

-- Brice

On Wed, Dec 21, 2016 at 5:08 PM, Voytek Jarnot 
wrote:

> Reading that article the only conclusion I can reach (unless I'm
> misreading) is that all the stuff that was never free is still not free -
> the change is that Oracle may actually be interested in the fact that some
> are using non-free products for free.
>
> Pretty much a non-story, it seems like.
>
> On Tue, Dec 20, 2016 at 11:55 PM, Kant Kodali  wrote:
>
>> Looking at this http://www.theregister.co.uk/2016/12/16/oracle_targets_
>> java_users_non_compliance/?mt=1481919461669 I don't know why Cassandra
>> recommends Oracle JVM?
>>
>> JVM is a great piece of software but I would like to stay away from
>> Oracle as much as possible. Oracle is just horrible the way they are
>> dealing with Java in General.
>>
>>
>>
>