Re: Copy from CSV on OS X problem with varint values <= -2^63

2017-04-07 Thread Brice Dutheil
@Boris, what formula did you use on Homebrew, and what is the git version of
this formula?

Anyway, the current Cassandra formula is here:
https://github.com/Homebrew/homebrew-core/blob/master/Formula/cassandra.rb

I am not a Homebrew developer, and the formula does a lot of fancy stuff, yet I
see a resource for the Python driver that seems to be 3.8.0.
At first sight, there is nothing about driver 3.10. Could it be a bad (brew) bottle?

— Brice

On Fri, Apr 7, 2017 at 1:00 AM, Boris Babic  wrote:

Stefania
>
> Downloading and simply running from folder without homebrew interference
> it now looks like the driver matches what you say in the last email.
> I will try writing variants again to confirm it works.
>
> cqlsh --debug
> Using CQL driver:  apache-cassandra-3.10/bin/../lib/cassandra-driver-internal-
> only-3.7.0.post0-2481531.zip/cassandra-driver-3.7.0.post0-
> 2481531/cassandra/__init__.py'>
> Using connect timeout: 5 seconds
> Using 'utf-8' encoding
> Using ssl: False
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
> Use HELP for help.
>
>
>
> On Apr 6, 2017, at 11:58 AM, Stefania Alborghetti  datastax.com> wrote:
>
> It doesn't look like the embedded driver, it should come from a zip file
> labeled with version 3.7.0.post0-2481531 for cassandra 3.10:
>
> Using CQL driver:  cassandra/bin/../lib/cassandra-driver-internal-
> only-3.7.0.post0-2481531.zip/cassandra-driver-3.7.0.post0-
> 2481531/cassandra/__init__.py'>
>
> Sorry, I should have posted this example in my previous email, rather than
> an example based on the non-embedded driver.
>
> I don't know who to contact regarding homebrew installation, but you
> could download the Cassandra package, unzip it, and run cqlsh and Cassandra
> from that directory?
>
>
> On Thu, Apr 6, 2017 at 4:59 AM, Boris Babic  wrote:
> Stefania
>
> This is the output of my --debug, I never touched CQLSH_NO_BUNDLED and did
> not know about it.
> As you can see I have used homebrew to install Cassandra and looks like
> its the embedded version as it sits under the Cassandra folder ?
>
> cqlsh --debug
> Using CQL driver:  3.10_1/libexec/vendor/lib/python2.7/site-packages/cassandra/__init__.pyc'>
> Using connect timeout: 5 seconds
> Using 'utf-8' encoding
> Using ssl: False
> Connected to Test Cluster at 127.0.0.1:9042.
> [cqlsh 5.0.1 | Cassandra 3.10 | CQL spec 3.4.4 | Native protocol v4]
> Use HELP for help.
>
>
> On Apr 5, 2017, at 12:07 PM, Stefania Alborghetti  datastax.com> wrote:
>
> You are welcome.
>
> I traced the problem to a commit of the Python driver that shipped in
> version 3.8 of the driver. It is fixed in 3.8.1. More details
> on CASSANDRA-13408. I don't think it's related to the OS.
>
> Since Cassandra 3.10 ships with an older version of the driver embedded in
> a zip file in the lib folder, and this version is not affected,
> I'm guessing that either the embedded version does not work on OS X, or you
> are manually using a different version of the driver by
> setting CQLSH_NO_BUNDLED (which is why I could reproduce it on my laptop).
>
> You can run cqlsh with --debug to see the version of the driver that cqlsh
> is using, for example:
>
> cqlsh --debug
> Using CQL driver:  dist-packages/cassandra_driver-3.8.1-py2.7-linux-x86_
> 64.egg/cassandra/__init__.pyc'>
>
> Can you confirm if you were overriding the Python driver by setting
> CQLSH_NO_BUNDLED and the version of the driver?
>
>
>
> On Tue, Apr 4, 2017 at 6:12 PM, Boris Babic  wrote:
> Thanks Stefania, going from memory don't think I noticed this on windows
> but haven't got a machine handy to test it on at the moment.
>
> On Apr 4, 2017, at 19:44, Stefania Alborghetti  datastax.com> wrote:
>
> I've reproduced the same problem on Linux, and I've opened
> CASSANDRA-13408. As a workaround, disable prepared statements and it
> will work (WITH HEADER = TRUE AND PREPAREDSTATEMENTS = False).
>
> On Tue, Apr 4, 2017 at 5:02 PM, Boris Babic  wrote:
>
> On Apr 4, 2017, at 7:00 PM, Boris Babic  wrote:
>
> Hi
>
> I’m testing the write of various datatypes on OS X for fun, running
> cassandra 3.10 on a single laptop instance. From what I can see,
> varint should map to java.math.BigInteger and have no problems with
> Long.MIN_VALUE, -9223372036854775808, but I can’t see what I’m doing wrong.
>
> cqlsh: 5.0.1
> cassandra 3.10
> osx el capitan.
>
> data.csv:
>
> id,varint
> -2147483648,-9223372036854775808
> 2147483647,9223372036854775807
>
> COPY mykeyspace.data (id,varint) FROM 'data.csv' WITH HEADER=true;
>
>   Failed to make batch statement: Received an argument of invalid type
> for column "varint". Expected: ,
> Got: ; (descriptor 'bit_length' requires a 'int' object 

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2017-02-13 Thread Brice Dutheil
The Android battle is another thing that I wouldn't consider for the Oracle JDK
/ OpenJDK question.
While I do like what Google did from a technical point of view, Google may
have overstepped fair use (or not – I don't know). Anyway, Sun didn't like
what Google did; they probably considered going to court at the time.




-- Brice

On Mon, Feb 13, 2017 at 10:20 AM, kurt greaves  wrote:

> are people actually trying to imply that Google is less evil than oracle?
> what is this shill fest
>
>
> On 12 Feb. 2017 8:24 am, "Kant Kodali"  wrote:
>
> Saw this one today...
>
> https://news.ycombinator.com/item?id=13624062
>
> On Tue, Jan 3, 2017 at 6:27 AM, Eric Evans 
> wrote:
>
>> On Mon, Jan 2, 2017 at 2:26 PM, Edward Capriolo 
>> wrote:
>> > Lets be clear:
>> > What I am saying is avoiding being loose with the word "free"
>> >
>> > https://en.wikipedia.org/wiki/Free_software_license
>> >
>> > Many things with the JVM are free too. Most importantly it is free to
>> use.
>> >
>> > https://www.java.com/en/download/faq/distribution.xml
>> >
>> > As it relates to this conversation: I am not aware of anyone running
>> > Cassandra that has modified upstream JVM to make Cassandra run
>> > better/differently *. Thus the license around the Oracle JVM is roughly
>> > meaningless to the user/developer of cassandra.
>> >
>> > * The only group I know that took an action to modify upstream was
>> Acunu.
>> > They had released a modified Linux Kernel with a modified Apache
>> Cassandra.
>> > http://cloudtweaks.com/2011/02/data-storage-startup-acunu-ra
>> ises-3-6-million-to-launch-its-first-product/.
>> > That product no longer exists.
>> >
>> > "I don't how to read any of this.  It sounds like you're saying that a
>> > JVM is something that cannot be produced as a Free Software project,"
>> >
>> > What I am saying is something like the JVM "could" be produced as a
>> "free
>> > software project". However, the argument that I was making is that the
>> > popular viable languages/(including vms or runtime to use them) today
>> > including Java, C#, Go, Swift are developed by the largest tech
>> companies in
>> > the world, and as such I do believe a platform would be viable.
>> Specifically
>> > I believe without Oracle driving Java OpenJDK would not be viable.
>> >
>> > There are two specific reasons.
>> > 1) I do not see large costly multi-year initiatives like G1 happening
>> > 2) Without guidance/leadership that sun/oracle I do not see new features
>> > that change the language like lambda's and try multi-catch happening in
>> a
>> > sane way.
>> >
>> > I expanded upon #2 be discussing my experience with standards like c++
>> 11,
>> > 14,17 and attempting to take compiling working lambda code on linux GCC
>> to
>> > microsoft visual studio and having it not compile. In my opinion, Java
>> only
>> > wins because as a platform it is very portable as both source and binary
>> > code. Without leadership on that front I believe that over time the
>> language
>> > would suffer.
>>
>> I realize that you're trying to be pragmatic about all of this, but
>> what I don't think you realize, is that so am I.
>>
>> Java could change hands at any time (it has once already), or Oracle
>> leadership could decide to go in a different direction.  Imagine for
>> example that they relicensed it to exclude use by orientation or
>> religion, Cassandra would implicitly carry these restrictions as well.
>> Imagine that they decided to provide a back-door to the NSA, Cassandra
>> would then also contain such a back-door.  These might sound
>> hypothetical, but there is plenty of precedent here.
>>
>> OpenJDK benefits from the same resources and leadership from Oracle
>> that you value, but is licensed and distributed in a way that
>> safeguards us from a day when Oracle becomes less benevolent, (if that
>> were to happen, some other giant company could assume the mantle of
>> leadership).
>>
>> All I'm really suggesting is that we at least soften our requirement
>> on the Oracle JVM, and perhaps perform some test runs in CI against
>> OpenJDK.  Actively discouraging people from using the Free Software
>> alternative here, one that is working well for many, isn't the
>> behavior I'd normally expect from a Free Software project.
>>
>> --
>> Eric Evans
>> john.eric.ev...@gmail.com
>>
>
>
>


Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-26 Thread Brice Dutheil
A note on this video from the respected James Gosling: it is from
2010, when Oracle was new to the Java stewardship ecosystem. The company
has come a long way since. I'm not saying everything is perfect, but I doubt
that a product such as the JVM would be as good without a company's guidance.

The module system is interesting and is a good thing regardless of the Oracle
features. Having AWT classes on a server always annoyed me, for IoT as
well. I'm really excited about Java 9.


-- Brice

On Mon, Dec 26, 2016 at 3:55 PM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

>
>
> On Sat, Dec 24, 2016 at 5:58 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> @Edward Agreed JVM is awesome and it is a work of many smart people and
>> this is obvious if one looks into the JDK code. But given Oracle history of
>> business practices and other decisions it is a bit hard to convince oneself
>> that everything is going to be OK and that they actually care about open
>> source. Even the module system that they are trying to come up with is
>> something that motivated by the problem they have faced internally.
>>
>> To reiterate again just watch this video https://www.youtube.com/
>> watch?v=9ei-rbULWoA
>>
>> My statements are not solely based on this video but I certainly would
>> give good weight for James Gosling.
>>
>> I tend to think that Oracle has not closed Java because they know that
>> cant get money from users because these days not many people are willing to
>> pay even for distributed databases so I don't think anyone would pay for
>> programming language. In short, Let me end by saying Oracle just has lot of
>> self interest but I really hope that I am wrong since I am a big fan of JVM.
>>
>>
>>
>>
>>
>> On Fri, Dec 23, 2016 at 7:15 PM, Edward Capriolo <edlinuxg...@gmail.com>
>> wrote:
>>
>>>
>>> On Fri, Dec 23, 2016 at 6:01 AM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Java 9 Module system looks really interesting. I would be very curious
>>>> to see how Cassandra would leverage that.
>>>>
>>>> On Thu, Dec 22, 2016 at 9:09 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>
>>>>> I would agree with Eric with his following statement. In fact, I was
>>>>> trying to say the same thing.
>>>>>
>>>>> "I don't really have any opinions on Oracle per say, but Cassandra is
>>>>> a
>>>>> Free Software project and I would prefer that we not depend on
>>>>> commercial software, (and that's kind of what we have here, an
>>>>> implicit dependency)."
>>>>>
>>>>> On Thu, Dec 22, 2016 at 3:09 AM, Brice Dutheil <
>>>>> brice.duth...@gmail.com> wrote:
>>>>>
>>>>>> Pretty much a non-story, it seems like.
>>>>>>
>>>>>> Clickbait imho. Search ‘The Register’ in this wikipedia page
>>>>>> <https://en.wikipedia.org/wiki/Wikipedia:Potentially_unreliable_sources#News_media>
>>>>>>
>>>>>> @Ben Manes
>>>>>>
>>>>>> Agreed, OpenJDK and Oracle JDK are now pretty close, but there is
>>>>>> still some differences in the VM code and third party dependencies like
>>>>>> security libraries. Maybe that’s fine for some productions, but maybe not
>>>>>> for everyone.
>>>>>>
>>>>>> Also another thing, while OpenJDK source is available to all, I don’t
>>>>>> think all OpenJDK builds have been certified with the TCK. For example 
>>>>>> the
>>>>>> Zulu OpenJDK is, as Azul have access to the TCK and certifies
>>>>>> <https://www.azul.com/products/zulu/> the builds. Another example
>>>>>> OpenJDK build installed on RHEL is certified
>>>>>> <https://access.redhat.com/articles/1299013>. Canonical probably is
>>>>>> running TCK comliance tests as well on thei OpenJDK 8 since they are 
>>>>>> listed
>>>>>> on the signatories
>>>>>> <http://openjdk.java.net/groups/conformance/JckAccess/jck-access.html>
>>>>>> but not sure as I couldn’t find evidence on this; on this signatories 
>>>>>> list
>>>>>> again there’s an individual – Emmanuel Bourg – who is related to
>>>>>> Debian <https://lists.debian.org/debian-java/2015/01/msg00015.html> (
>>>>>> linkedin <

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-22 Thread Brice Dutheil
Pretty much a non-story, it seems like.

Clickbait imho. Search ‘The Register’ in this wikipedia page


@Ben Manes

Agreed, OpenJDK and Oracle JDK are now pretty close, but there is still
some differences in the VM code and third party dependencies like security
libraries. Maybe that’s fine for some productions, but maybe not for
everyone.

Another thing: while the OpenJDK source is available to all, I don’t think
all OpenJDK builds have been certified with the TCK. For example the Zulu
OpenJDK is, as Azul have access to the TCK and certify the builds. Another
example: the OpenJDK build installed on RHEL is certified. Canonical is
probably running TCK compliance tests as well on their OpenJDK 8, since they
are listed on the signatories, but I'm not sure as I couldn’t find evidence
of this; on this signatories list again there’s an individual – Emmanuel
Bourg – who is related to Debian (linkedin), but again I'm not sure the TCK
is passed for each build.

Bad OpenJDK intermediary builds, i.e. without TCK compliance tests, are a
reality.

While the situation has improved over the past months, I’ll still double-check
before using any OpenJDK builds.
​

-- Brice

On Wed, Dec 21, 2016 at 5:08 PM, Voytek Jarnot 
wrote:

> Reading that article the only conclusion I can reach (unless I'm
> misreading) is that all the stuff that was never free is still not free -
> the change is that Oracle may actually be interested in the fact that some
> are using non-free products for free.
>
> Pretty much a non-story, it seems like.
>
> On Tue, Dec 20, 2016 at 11:55 PM, Kant Kodali  wrote:
>
>> Looking at this http://www.theregister.co.uk/2016/12/16/oracle_targets_
>> java_users_non_compliance/?mt=1481919461669 I don't know why Cassandra
>> recommends Oracle JVM?
>>
>> JVM is a great piece of software but I would like to stay away from
>> Oracle as much as possible. Oracle is just horrible the way they are
>> dealing with Java in General.
>>
>>
>>
>


Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-21 Thread Brice Dutheil
Let's not debate opinions on the Oracle stewardship here; we certainly have
different views that come from different experiences.

Let's discuss facts instead :)

-- Brice

On Wed, Dec 21, 2016 at 11:34 AM, Kant Kodali <k...@peernova.com> wrote:

> yeah well I don't think Oracle is treating Java the way Google is treating
> Go and I am not a big fan of Go mainly because I understand the JVM is far
> more robust than anything that is out there.
>
> "Oracle just doesn't understand open source" These are the words from
> James Gosling himself
>
> I do think its better to stay away from Oracle as we never know when they
> would switch open source to closed source. Given their history of practices
> their statements are not credible.
>
> I am pretty sure the community would take care of OpenJDK.
>
>
>
>
>
> On Wed, Dec 21, 2016 at 2:04 AM, Brice Dutheil <brice.duth...@gmail.com>
> wrote:
>
>> The problem described in this article is different than what you have on
>> your servers and I’ll add this article should be reaad with caution, as The
>> Register is known for sensationalism. The article itself has no substantial
>> proof or enough details. In my opinion this article is clickbait.
>>
>> Anyway there’s several point to think of instead of just swicthing to
>> OpenJDK :
>>
>>-
>>
>>There is technical differences between Oracle JDK and openjdk. Where
>>there’s licensing issues some libraries are closed source in Hotspot like
>>font, rasterizer or cryptography and OpenJDK use open source alternatives
>>which leads to different bugs or performance. I believe they also have
>>minor differences in the hotspot code to plug in stuff like Java Mission
>>Control or Flight Recorder or hotpost specific options.
>>Also I believe that Oracle JDK is more tested or more up to date than
>>OpenJDK.
>>
>>So while OpenJDK is functionnaly the same as Oracle JDK it may not
>>have the same performance or the same bugs or the same security fixes.
>>(Unless are your ready to test that with your production servers and your
>>production data).
>>
>>I don’t know if datastax have released the details of their
>>configuration when they test Cassandra.
>>-
>>
>>There’s also a question of support. OpeJDK is for the community.
>>Oracle can offer support but maybe only for Oracle JDK.
>>
>>Twitter uses OpenJDK, but they have their own JVM support team. Not
>>sure everyone can afford that.
>>
>> As a side note I’ll add that Oracle is paying talented engineers to work
>> on the JVM to make it great.
>>
>> Cheers,
>> ​
>>
>> -- Brice
>>
>> On Wed, Dec 21, 2016 at 6:55 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> Looking at this http://www.theregister.co.uk/2016/12/16/oracle_targets_
>>> java_users_non_compliance/?mt=1481919461669 I don't know why Cassandra
>>> recommends Oracle JVM?
>>>
>>> JVM is a great piece of software but I would like to stay away from
>>> Oracle as much as possible. Oracle is just horrible the way they are
>>> dealing with Java in General.
>>>
>>>
>>>
>>
>


Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-21 Thread Brice Dutheil
The problem described in this article is different from what you have on
your servers, and I’ll add this article should be read with caution, as The
Register is known for sensationalism. The article itself has no substantial
proof or enough details. In my opinion this article is clickbait.

Anyway there are several points to think of instead of just switching to
OpenJDK:

   -

   There are technical differences between Oracle JDK and OpenJDK. Where
   there are licensing issues, some libraries are closed source in Hotspot, like
   font, rasterizer or cryptography, and OpenJDK uses open source alternatives,
   which leads to different bugs or performance. I believe they also have
   minor differences in the hotspot code to plug in stuff like Java Mission
   Control or Flight Recorder or hotspot-specific options.
   Also I believe that the Oracle JDK is more tested or more up to date than
   OpenJDK.

   So while OpenJDK is functionally the same as the Oracle JDK, it may not have
   the same performance or the same bugs or the same security fixes (unless
   you are ready to test that with your production servers and your
   production data).

   I don’t know if DataStax have released the details of their
   configuration when they test Cassandra.
   -

   There’s also a question of support. OpenJDK is for the community. Oracle
   can offer support, but maybe only for the Oracle JDK.

   Twitter uses OpenJDK, but they have their own JVM support team. Not sure
   everyone can afford that.

As a side note I’ll add that Oracle is paying talented engineers to work on
the JVM to make it great.

Cheers,
​

-- Brice

On Wed, Dec 21, 2016 at 6:55 AM, Kant Kodali  wrote:

> Looking at this http://www.theregister.co.uk/2016/12/16/oracle_
> targets_java_users_non_compliance/?mt=1481919461669 I don't know why
> Cassandra recommends Oracle JVM?
>
> JVM is a great piece of software but I would like to stay away from Oracle
> as much as possible. Oracle is just horrible the way they are dealing with
> Java in General.
>
>
>


Re: Data migration from Oracle to Cassandra

2016-11-21 Thread Brice Dutheil
Hi Shashidhar,

I have done something like that at a reasonably high scale, migrating a few
billion Oracle records to Cassandra.

Basically the process we used is: the app performs the writes in Cassandra
for new or updated records, and a batch backfeeds the old data into
Cassandra.
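
For illustration only, a minimal dual-write sketch with JDBC and the DataStax
Java driver (this is not our actual migration code; the JDBC URL, keyspace,
table and column names are all made up):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;

public class DualWriteSketch {
    public static void main(String[] args) throws Exception {
        try (Connection oracle = DriverManager.getConnection(
                     "jdbc:oracle:thin:@//oracle-host:1521/SERVICE", "user", "password");
             Cluster cluster = Cluster.builder().addContactPoint("cassandra-host").build();
             Session cassandra = cluster.connect("mykeyspace")) {

            String customerId = "42";
            String name = "Alice";

            // 1. Keep writing to the system of record (Oracle) as before.
            try (PreparedStatement ps = oracle.prepareStatement(
                    "UPDATE customers SET name = ? WHERE id = ?")) {
                ps.setString(1, name);
                ps.setString(2, customerId);
                ps.executeUpdate();
            }

            // 2. Mirror the new/updated record into Cassandra (an idempotent upsert);
            //    a separate batch job backfeeds the historical records the same way.
            cassandra.execute("INSERT INTO customers (id, name) VALUES (?, ?)",
                              customerId, name);
        }
    }
}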

We wrote everything ourselves at that time. Those migration tools are not
public as they were specific to the business domain. However, it is
noteworthy that we spent quite a bit of time tweaking them (regarding
concurrency, the volume of writes, the production data, etc.).

Duy Hai and I gave a presentation at the Cassandra Summit in 2014;
however, due to the short 30-minute format, we didn't speak about the
migration tooling itself.

https://speakerdeck.com/bric3/billion-records-from-sql-to-cassandra-lessons-learned
https://youtu.be/U6oa3Tsdtp4





-- Brice

On Thu, Nov 17, 2016 at 3:58 PM, Chidambaran Subramanian 
wrote:

> More curious than answering the question. Would it be possible to even
> design something generic here? Would it not depend on the schema?
>
> On Thu, Nov 17, 2016 at 8:21 PM, Shashidhar Rao <
> raoshashidhar...@gmail.com> wrote:
>
>> Hi,
>>
>> Has anyone done data migration from Oracle to Cassandra taking care of
>> Change data capture.
>>
>> Kindly share the experience about the tools used. Golden Gate, IBM CDC or
>> any tools.
>>
>> Recommendation of any Open Source tools would be highly useful. I need to
>> constantly capture the commits from Oracle to Cassandra.
>>
>>
>> Regards
>> shashi
>>
>
>


Re: cql-maven-plugin

2016-10-07 Thread Brice Dutheil
At this moment no, as this is a Maven plugin, but extracting such code would be
relatively trivial.
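
For what it's worth, a minimal sketch of what that extraction could look like
with the plain DataStax Java driver (this is not the plugin's code; the file
path is made up and the statement splitting is naive):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;

import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;

public class CqlFileRunner {
    public static void main(String[] args) throws Exception {
        // Read the whole CQL file, e.g. a schema file used by tests.
        String cql = new String(
                Files.readAllBytes(Paths.get("src/test/resources/schema.cql")),
                StandardCharsets.UTF_8);

        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect()) {
            // Naive split on ';', good enough for simple schema files
            // without semicolons inside string literals.
            for (String statement : cql.split(";")) {
                if (!statement.trim().isEmpty()) {
                    session.execute(statement.trim());
                }
            }
        }
    }
}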

-- Brice

On Fri, Oct 7, 2016 at 1:24 PM, Ali Akhtar <ali.rac...@gmail.com> wrote:

> Is there a way to call this programatically such as from unit tests, to
> create keyspace / table schema from a cql file?
>
> On Fri, Oct 7, 2016 at 2:40 PM, Brice Dutheil <brice.duth...@gmail.com>
> wrote:
>
>> Hi there,
>>
>> I’d like to share a very simple project around handling CQL files with
>> maven. We were using the cassandra-maven-plugin before, but with
>> limitations on the authentication and the use of thrift protocol. I was
>> tempted to write a replacement focused only the execution of CQL
>> statements, in the same way that sql-maven-plugin is.
>>
>> I didn’t port the cassandra lifecycle tasks as they could be handled with
>> other tool. e.g. docker
>>
>> It’s available on maven at the following coordinate
>> com.github.bric3.maven:cql-maven-plugin:0.4, and the code is available
>> on Github https://github.com/bric3/cql-maven-plugin
>>
>> I would definitely like feedback on this. It’s probably not bug free, but
>> our team uses this plugin with several projects, that are each built
>> several times a day.
>> Currently the code is lacking integration tests, that’s probably the area
>> where it can be improved the most.
>>
>> Cheers,
>> — Brice
>> ​
>>
>
>


cql-maven-plugin

2016-10-07 Thread Brice Dutheil
Hi there,

I’d like to share a very simple project around handling CQL files with
Maven. We were using the cassandra-maven-plugin before, but with
limitations on the authentication and the use of the thrift protocol. I was
tempted to write a replacement focused only on the execution of CQL
statements, in the same way that sql-maven-plugin is.

I didn’t port the cassandra lifecycle tasks as they can be handled with
other tools, e.g. docker.

It’s available on Maven at the following coordinates:
com.github.bric3.maven:cql-maven-plugin:0.4, and the code is available on
GitHub: https://github.com/bric3/cql-maven-plugin

I would definitely like feedback on this. It’s probably not bug free, but
our team uses this plugin in several projects that are each built
several times a day.
Currently the code is lacking integration tests; that’s probably the area
where it can be improved the most.

Cheers,
— Brice
​


Re: [ANNOUNCEMENT] Website update

2016-09-12 Thread Brice Dutheil
Really nice update!

There are still some TODOs ;)
http://cassandra.apache.org/doc/latest/architecture/storage_engine.html
http://cassandra.apache.org/doc/latest/architecture/guarantees.html
http://cassandra.apache.org/doc/latest/operating/read_repair.html
...



-- Brice

On Mon, Sep 12, 2016 at 6:38 AM, Ashish Disawal <
ashish.disa...@evivehealth.com> wrote:

> Website looks great.
> Good job guys.
>
> --
> Ashish Disawal
>
> On Mon, Sep 12, 2016 at 3:00 AM, Jens Rantil  wrote:
>
>> Nice! The website also feels snappier!
>>
>>
>> On Friday, July 29, 2016, Sylvain Lebresne  wrote:
>>
>>> Wanted to let everyone know that if you go to the Cassandra website
>>> (cassandra.apache.org), you'll notice that there has been some change.
>>> Outside
>>> of a face lift, the main change is a much improved documentation section
>>> (http://cassandra.apache.org/doc/). As indicated, that documentation is
>>> a
>>> work-in-progress and still has a few missing section. That documentation
>>> is
>>> maintained in-tree and contributions (through JIRA as any other
>>> contribution)
>>> is more than welcome.
>>>
>>> Best,
>>> On behalf of the Apache Cassandra developers.
>>>
>>
>>
>> --
>> Jens Rantil
>> Backend engineer
>> Tink AB
>>
>> Email: jens.ran...@tink.se
>> Phone: +46 708 84 18 32
>> Web: www.tink.se
>>
>> Facebook  Linkedin
>> 
>>  Twitter 
>>
>>
>


Re: NTP Synchronization Setup Changes

2016-04-01 Thread Brice Dutheil
Hi, another tip: make sure the OS doesn't come with pre-configured NTP
synchronisation services. We had a proper NTP setup, but we missed a service
that came with CentOS that synced to a low-stratum NTP server.
-- Brice




On Thu, Mar 31, 2016 at 10:00 AM -0700, "Eric Evans"  
wrote:










On Wed, Mar 30, 2016 at 8:07 PM, Mukil Kesavan
 wrote:
> Are there any issues if this causes a huge time correction on the cassandra
> cluster? I know that NTP gradually corrects the time on all the servers. I
> just wanted to understand if there were any corner cases that will cause us
> to lose data/schema updates when this happens. In particular, we seem to be
> having some issues around missing secondary indices at the moment (not all
> but some).

As a thought experiment, imagine every scenario where it matters to
have one write occur after another (an update followed by a delete is
a good example).  Now imagine having your clock yanked backward to
correct for drift between the first such operation and the second.

I would strongly recommend you come up with a stable NTP setup.


-- 
Eric Evans
eev...@wikimedia.org







Re: Cassandra Java Driver

2015-12-26 Thread Brice Dutheil
Not yet. The latest DSE (4.8.3) is shipped with a patched version of
Cassandra 2.1.
You can find this information on their website.

4.8 release notes:
https://docs.datastax.com/en/datastax_enterprise/4.8/datastax_enterprise/RNdse.html
From this page you can navigate and unroll the menu *Product
Guide* > *Datastax Enterprise*; it should contain the DSE versions.

And there are always other sources, like the blog.

Cassandra 3.x should be shipped with DSE 5.x early next year.


HTH
-- Brice

On Sat, Dec 26, 2015 at 3:46 AM, Noorul Islam Kamal Malmiyoda <
noo...@noorul.com> wrote:

> Is DSE shipping with 3.x ?
>
> Thanks and Regards
> Noorul
>
> On Fri, Dec 25, 2015 at 9:07 PM, Alexandre Dutra
>  wrote:
> > Hi Jean,
> >
> > You should use 3.0.0-beta1.
> >
> > TL;DR
> >
> > DataStax Java driver series 2.2.x has been discontinued in favor of
> series
> > 3.x; we explained why in this mail to the Java driver mailing list. We do
> > not advise users to use this series.
> >
> > So the most recent driver version compatible with all versions of
> Cassandra,
> > including 2.2 and 3.x, is now 3.0.0-beta1, although 3.0.0-rc1 will be
> > released very soon.
> >
> > In spite of its "beta" label, version 3.0.0-beta1 has been thoroughly
> tested
> > against all versions of Cassandra and is definitely production-ready...
> as
> > long as the Cassandra version in use is also production-ready. Note
> however
> > that Cassandra 2.2 and 3.0 are quite recent and most companies AFAICT do
> not
> > consider them yet as production-ready.
> >
> > Hope that helps,
> >
> > Alexandre
> >
> >
> > On Tue, Dec 22, 2015 at 4:40 PM Jean Tremblay
> >  wrote:
> >>
> >> Hi,
> >> Which Java Driver is suited for Cassandra 2.2.x. ?
> >> I see datastax 3.0.0 beta1 and datastax 2.2.0 rc3...
> >> Are they suited for production?
> >> Is there anything better?
> >> Thanks for your comments and replies?
> >> Jean
> >
> > --
> > Alexandre Dutra
> > Driver & Tools Engineer @ DataStax
>


Re: Oracle TIMESTAMP(9) equivalent in Cassandra

2015-10-29 Thread Brice Dutheil
Additionally, if the time UUID is generated client side, make sure the boxes
that will perform the write have a correct NTP/PTP configuration.
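
A minimal sketch of the client-side generation with the utility class
mentioned further down in this thread (com.datastax.driver.core.utils.UUIDs);
the keyspace, table and column names are made up:

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.utils.UUIDs;

import java.util.UUID;

public class ClientSideTimeUuid {
    public static void main(String[] args) {
        // Generated on the client from its (NTP/PTP-synced) clock,
        // with the driver's 100ns-resolution time-based UUID generator.
        UUID eventId = UUIDs.timeBased();

        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("mykeyspace")) {
            // Unlike now() on the server side, retrying this insert with the
            // same eventId keeps the operation idempotent.
            session.execute("INSERT INTO events (id, payload) VALUES (?, ?)",
                            eventId, "some payload");
        }
    }
}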

@Jonathan Haddad

Keep in mind that in a distributed environment you probably have so much
variance that nanosecond precision is pointless. Even google notes that in
the paper, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
[http://research.google.com/pubs/pub36356.html]

I agree with your statement about variance. Though I'd just like to mention
that Dapper is about *tracing* queries/code; more generally it’s about the
execution overhead of tracing, which is a bit different from just
*timestamping*.
​

-- Brice

On Thu, Oct 29, 2015 at 2:45 PM, Clint Martin <
clintlmar...@coolfiretechnologies.com> wrote:

> Generating the time uuid on the server side via the now() function also
> makes the operation non idempotent. This may not be a huge problem for your
> application but it is something to keep in mind.
>
> Clint
> On Oct 29, 2015 9:01 AM, "Kai Wang"  wrote:
>
>> If you want the timestamp to be generated on the C* side, you need to
>> sync clocks among nodes to the nanosecond precision first. That alone might
>> be hard or impossible already. I think the safe bet is to generate the
>> timestamp on the client side. But depending on your data volume, if data
>> comes from multiple clients you still need to sync clocks among them.
>>
>>
>> On Thu, Oct 29, 2015 at 7:57 AM,  wrote:
>>
>>> Hi Doan,
>>>
>>>
>>>
>>> Is the timeBased() method available in Java driver similar to now() function
>>> in cqlsh. Does both provide identical results.
>>>
>>>
>>>
>>> Also, the preference is to generate values during record insertion from
>>> database side, rather than client side. Something similar to SYSTIMESTAMP
>>> in Oracle.
>>>
>>>
>>>
>>> Regards, Chandra Sekar KR
>>>
>>> *From:* DuyHai Doan [mailto:doanduy...@gmail.com]
>>> *Sent:* 29/10/2015 5:13 PM
>>> *To:* user@cassandra.apache.org
>>> *Subject:* Re: Oracle TIMESTAMP(9) equivalent in Cassandra
>>>
>>>
>>>
>>> You can use TimeUUID data type and provide the value yourself from
>>> client side.
>>>
>>>
>>>
>>> The Java driver offers an utility class
>>> com.datastax.driver.core.utils.UUIDs and the method timeBased() to generate
>>> the TimeUUID.
>>>
>>>
>>>
>>>  The precision is only guaranteed up to 100 nano seconds. So you can
>>> have possibly 10k distincts values for 1 millsec. For your requirement of
>>> 20k per sec, it should be enough.
>>>
>>>
>>>
>>> On Thu, Oct 29, 2015 at 12:10 PM,  wrote:
>>>
>>> Hi,
>>>
>>>
>>>
>>> Oracle Timestamp data type supports fractional seconds (upto 9 digits, 6
>>> is default). What is the Cassandra equivalent data type for Oracle
>>> TimeStamp nanosecond precision.
>>>
>>>
>>>
>>> This is required for determining the order of insertion of record where
>>> the number of records inserted per sec is close to 20K. Is TIMEUUID an
>>> alternate functionality which can determine the order of record insertion
>>> in Cassandra ?
>>>
>>>
>>>
>>> Regards, Chandra Sekar KR
>>>
>>> The information contained in this electronic message and any attachments
>>> to this message are intended for the exclusive use of the addressee(s) and
>>> may contain proprietary, confidential or privileged information. If you are
>>> not the intended recipient, you should not disseminate, distribute or copy
>>> this e-mail. Please notify the sender immediately and destroy all copies of
>>> this message and any attachments. WARNING: Computer viruses can be
>>> transmitted via email. The recipient should check this email and any
>>> attachments for the presence of viruses. The company accepts no liability
>>> for any damage caused by any virus transmitted by this email.
>>> www.wipro.com
>>>
>>>
>>> The information contained in this electronic message and any attachments
>>> to this message are intended for the exclusive use of the addressee(s) and
>>> may contain proprietary, confidential or privileged information. If you are
>>> not the intended recipient, you should not disseminate, distribute or copy
>>> this e-mail. Please notify the sender immediately and destroy all copies of
>>> this message and any attachments. WARNING: Computer viruses can be
>>> transmitted via email. The recipient should check this email and any
>>> attachments for the presence of viruses. The company accepts no liability
>>> for any damage caused by any virus transmitted by this email.
>>> www.wipro.com
>>>
>>
>>


Re: Oracle TIMESTAMP(9) equivalent in Cassandra

2015-10-29 Thread Brice Dutheil
My point is about the difficulty in having perfect clocks in a distributed
system. If nanosecond precision isn’t happening at Google scale, it’s
unlikely to be happening anywhere. The fact that dapper was written in the
context of tracing is irrelevant.

I agree with you: yes, precise time at the nano scale is hard.
However, while the context of *tracing* is indeed irrelevant, the notion of
*measuring* time is not the same problem at all: the paper is about
measuring things that span across different software/hardware, while the
problem here is the order of writes (as mentioned in the original question).

Anyway, I wouldn’t even trust nanoTime to generate timestamps at the
*nanoscale*. Let’s look at java.lang.System.nanoTime(): the javadoc
<http://hg.openjdk.java.net/jdk8/jdk8/jdk/file/687fd7c7986d/src/share/classes/java/lang/System.java>
says this call gives nanosecond precision, however the resolution is only
guaranteed to be at least as good as the millisecond; indeed, depending on the
OS or hardware the *accuracy* may not be the same. On Linux for example the
code may be using an internal counter
<http://hg.openjdk.java.net/jdk8/jdk8/hotspot/file/87ee5ee27509/src/os/linux/vm/os_linux.cpp#l1453>,
and even if it doesn’t, I’m not even sure that
Linux::clock_gettime(CLOCK_MONOTONIC, ...) will even be *consistent* across
different hardware threads (since each core may run at a different speed for
several reasons like physical differences, power management, etc…). This
implies that Java threads may issue writes that may not be ordered at
nanosecond precision.
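
A small sketch to illustrate the resolution point: sampling System.nanoTime()
in a loop shows the effective granularity on a given JVM/OS, and in any case
nanoTime() is a monotonic elapsed-time source, not a wall clock usable for
ordering writes across machines:

public class NanoTimeGranularity {
    public static void main(String[] args) {
        long previous = System.nanoTime();
        long smallestStep = Long.MAX_VALUE;

        // Sample the clock in a tight loop and record the smallest
        // observed increment, i.e. the effective resolution here.
        for (int i = 0; i < 10_000_000; i++) {
            long now = System.nanoTime();
            if (now > previous) {
                smallestStep = Math.min(smallestStep, now - previous);
                previous = now;
            }
        }
        System.out.println("Smallest observed nanoTime step: " + smallestStep + " ns");
    }
}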

Multicore processors are already a distributed system, so yes, working with
accurate nanosecond precision in a distributed system with network
latencies is incredibly hard, if not impossible.

There’s still the possibility of using a single-threaded writer (but there
are still other issues).

— Brice

On Thu, Oct 29, 2015 at 8:02 PM, Jonathan Haddad <j...@jonhaddad.com> wrote:

My point is about the difficulty in having perfect clocks in a distributed
> system. If nanosecond precision isn't happening at Google scale, it's
> unlikely to be happening anywhere. The fact that dapper was written in the
> context of tracing is irrelevant.
> On Thu, Oct 29, 2015 at 7:27 PM Brice Dutheil <brice.duth...@gmail.com>
> wrote:
>
>> Additionally if the time uuid is generated client side, make sure the
>> boxes that will perform the write hava correct ntp/ptp configuration.
>>
>> @John Haddad
>>
>> Keep in mind that in a distributed environment you probably have so much
>> variance that nanosecond precision is pointless. Even google notes that in
>> the paper, Dapper, a Large-Scale Distributed Systems Tracing Infrastructure
>> [http://research.google.com/pubs/pub36356.html]
>>
>> I agree with your statement about variance. Though I just like to mention
>> Dapper is about *tracing* query/code, more generally it’s about about
>> the execution overhead of tracing, which is a bit different that just
>> *timestamping*.
>> ​
>>
>> -- Brice
>>
>> On Thu, Oct 29, 2015 at 2:45 PM, Clint Martin <
>> clintlmar...@coolfiretechnologies.com> wrote:
>>
>>> Generating the time uuid on the server side via the now() function also
>>> makes the operation non idempotent. This may not be a huge problem for your
>>> application but it is something to keep in mind.
>>>
>>> Clint
>>> On Oct 29, 2015 9:01 AM, "Kai Wang" <dep...@gmail.com> wrote:
>>>
>>>> If you want the timestamp to be generated on the C* side, you need to
>>>> sync clocks among nodes to the nanosecond precision first. That alone might
>>>> be hard or impossible already. I think the safe bet is to generate the
>>>> timestamp on the client side. But depending on your data volume, if data
>>>> comes from multiple clients you still need to sync clocks among them.
>>>>
>>>>
>>>> On Thu, Oct 29, 2015 at 7:57 AM, <chandrasekar@wipro.com> wrote:
>>>>
>>>>> Hi Doan,
>>>>>
>>>>>
>>>>>
>>>>> Is the timeBased() method available in Java driver similar to now() 
>>>>> function
>>>>> in cqlsh. Does both provide identical results.
>>>>>
>>>>>
>>>>>
>>>>> Also, the preference is to generate values during record insertion
>>>>> from database side, rather than client side. Something similar to
>>>>> SYSTIMESTAMP in Oracle.
>>>>>
>>>>>
>>>>>
>>>>> Regards, Chandra Sekar KR
>>>>>
>>>>> *From:* DuyHai Doan [mailto:

Re: Cassandra query degradation with high frequency updated tables.

2015-10-10 Thread Brice Dutheil
What do you mean by *And since this is a test, this is just running
on a single node*? What is the hardware spec?

Also, from the schema in CASSANDRA-10502
 there are a lot of
maps; what is the size of these maps? I’ve seen Cassandra having trouble
with large collections (2.0 / 2.1).
​

-- Brice

On Sat, Oct 10, 2015 at 12:34 AM, Nazario Parsacala 
wrote:

> I will send the jstat output later.
>
> I have created the ticket:
>
> https://issues.apache.org/jira/browse/CASSANDRA-10502
>
>
>
>
>
> On Oct 9, 2015, at 5:20 PM, Jonathan Haddad  wrote:
>
> I'd be curious to see GC logs.
>
> jstat -gccause 
>
> On Fri, Oct 9, 2015 at 2:16 PM Tyler Hobbs  wrote:
>
>> Hmm, it seems off to me that the merge step is taking 1 to 2 seconds,
>> especially when there are only ~500 cells from one sstable and the
>> memtables.  Can you open a ticket (
>> https://issues.apache.org/jira/browse/CASSANDRA) with your schema,
>> details on your data layout, and these traces?
>>
>> On Fri, Oct 9, 2015 at 3:47 PM, Nazario Parsacala 
>> wrote:
>>
>>>
>>>
>>> So the trace is varying a lot. And does not seem to correlate with the
>>> data return from the client ? Maybe datastax java  driver related. ..? (not
>>> likely).. Just checkout the results.
>>>
>>>
>>> Below is the one that I took when from the client (java application)
>>> perspective it was returning data in  about 1100 ms.
>>>
>>>
>>>
>>> *racing session: *566477c0-6ebc-11e5-9493-9131aba66d63
>>>
>>>  *activity*
>>>
>>> |
>>> *timestamp*  | *source*| *source_elapsed*
>>>
>>> --++---+
>>>
>>>
>>>   *Execute CQL3 query* | 
>>> *2015-10-09
>>> 15:31:28.70* | *172.31.17.129* |  *0*
>>>  *Parsing select * from processinfometric_profile where
>>> profilecontext='GENERIC' and id=‘1' and month='Oct' and day='' and hour=''
>>> and minute=''; [SharedPool-Worker-1]* | *2015-10-09 15:31:28.701000* |
>>> *172.31.17.129* |*101*
>>>
>>>
>>> *Preparing statement [SharedPool-Worker-1]* | 
>>> *2015-10-09
>>> 15:31:28.701000* | *172.31.17.129* |*334*
>>>
>>>   *Executing
>>> single-partition query on processinfometric_profile [SharedPool-Worker-3]*
>>>  | *2015-10-09 15:31:28.701000* | *172.31.17.129* |*692*
>>>
>>>
>>>   *Acquiring sstable references [SharedPool-Worker-3]* | 
>>> *2015-10-09
>>> 15:31:28.701000* | *172.31.17.129* |*713*
>>>
>>>
>>> *Merging memtable tombstones [SharedPool-Worker-3]* | 
>>> *2015-10-09
>>> 15:31:28.701000* | *172.31.17.129* |*726*
>>>
>>>
>>>   *Key cache hit for sstable 209 [SharedPool-Worker-3]* | 
>>> *2015-10-09
>>> 15:31:28.704000* | *172.31.17.129* |   *3143*
>>>
>>> 
>>> *Seeking
>>> to partition beginning in data file [SharedPool-Worker-3]* | *2015-10-09
>>> 15:31:28.704000* | *172.31.17.129* |   *3169*
>>>
>>>
>>>   *Key cache hit for sstable 208 [SharedPool-Worker-3]* | 
>>> *2015-10-09
>>> 15:31:28.704000* | *172.31.17.129* |   *3691*
>>>
>>> 
>>> *Seeking
>>> to partition beginning in data file [SharedPool-Worker-3]* | *2015-10-09
>>> 15:31:28.704000* | *172.31.17.129* |   *3713*
>>>
>>>   *Skipped 0/2
>>> non-slice-intersecting sstables, included 0 due to tombstones
>>> [SharedPool-Worker-3]* | *2015-10-09 15:31:28.704000* | *172.31.17.129* |
>>> *3807*
>>>
>>> 
>>> *Merging
>>> data from memtables and 2 sstables [SharedPool-Worker-3]* | *2015-10-09
>>> 15:31:28.704000* | *172.31.17.129* |   *3818*
>>>
>>>
>>> *Read 462 live and 0 tombstone cells [SharedPool-Worker-3]* | 
>>> *2015-10-09
>>> 15:31:29.611000* | *172.31.17.129* | *910723*
>>>
>>>
>>> *Request complete* | 
>>> *2015-10-09
>>> 15:31:29.649251* | *172.31.17.129* | *949251*
>>>
>>>
>>>
>>>
>>> Below when this is around 1400 ms . But the trace data seems to look
>>> faster ..?
>>>
>>>
>>>
>>> *racing session: *7c591550-6ebf-11e5-9493-9131aba66d63
>>>
>>>  *activity*
>>>
>>>   |
>>> 

Re: Realtime data and (C)AP

2015-10-09 Thread Brice Dutheil
On Fri, Oct 9, 2015 at 2:27 AM, Steve Robenalt 
wrote:

In general, if you write at QUORUM and read at ONE (or LOCAL variants
> thereof if you have multiple data centers), your apps will work well
> despite the theoretical consistency issues.

Nit-picky comment: if consistency is something important, then reading at
QUORUM is important too. If the read is at ONE, then the read operation *may* not
see an important update. The safest option is QUORUM for both write and read. Then,
depending on the business or feature, the consistency may be tuned.
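
With the DataStax Java driver this is a per-statement (or per-cluster default)
setting; a minimal sketch, with a made-up keyspace and users table:

import com.datastax.driver.core.*;

public class QuorumReadWrite {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("mykeyspace")) {

            // Write at QUORUM...
            Statement write = new SimpleStatement(
                    "UPDATE users SET password = ? WHERE userid = ?", "s3cret", "jdoe")
                    .setConsistencyLevel(ConsistencyLevel.QUORUM);
            session.execute(write);

            // ...and read at QUORUM, so the read replicas always overlap the
            // write replicas (R + W > replication factor).
            Statement read = new SimpleStatement(
                    "SELECT password FROM users WHERE userid = ?", "jdoe")
                    .setConsistencyLevel(ConsistencyLevel.QUORUM);
            System.out.println(session.execute(read).one().getString("password"));
        }
    }
}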

— Brice
​


BATCH consistency level VS. individual statement consistency level

2015-07-06 Thread Brice Dutheil
Hi,

I’m not sure how the consistency level is applied on a batch statement. I didn’t
find detailed information on datastax.com (1)
http://docs.datastax.com/en/cql/3.0/cql/cql_reference/batch_r.html
regarding that.

   - It is possible to set a CL on individual statements.
   - It is possible to set a CL on the whole batch.
   - If CL is not set then the default CL is used (2)
   http://docs.datastax.com/en/cql/3.0/cql/cql_reference/consistency_r.html
   which is ONE (3)
   
http://docs.datastax.com/en/cassandra/1.2/cassandra/dml/dml_config_consistency_c.html?scroll=concept_ds_umf_5xx_zj__configuring-client-consistency-levels

I understand that if the CL is only set on the batch, then this CL will be
applied to the individual statements.

But how are consistency levels applied in these cases?

   1.

   when consistency levels differ between statements and the batch:

BEGIN BATCH USING CONSISTENCY QUORUM
  INSERT INTO users (...) VALUES (...) USING CONSISTENCY LOCAL_QUORUM
  UPDATE users SET password = '...' WHERE userID = '...'  USING
CONSISTENCY LOCAL_ONE
  ...
APPLY BATCH;

   2.

   when the consistency level is set on statements but not on the batch (will
   it be overridden by the default consistency level, i.e. ONE?)

BEGIN BATCH
  INSERT INTO users (...) VALUES (...) USING CONSISTENCY LOCAL_QUORUM
  UPDATE users SET password = '...' WHERE userID = '...'  USING
CONSISTENCY LOCAL_ONE
  ...
APPLY BATCH;
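
For comparison, with the DataStax Java driver (2.x/3.x) the consistency level
is set on the whole BatchStatement, since the native protocol batch message
only carries a single consistency level; a minimal sketch with a made-up users
table (this says nothing about the cqlsh syntax above):

import com.datastax.driver.core.*;

public class BatchConsistency {
    public static void main(String[] args) {
        try (Cluster cluster = Cluster.builder().addContactPoint("127.0.0.1").build();
             Session session = cluster.connect("mykeyspace")) {

            BatchStatement batch = new BatchStatement();
            batch.add(new SimpleStatement(
                    "INSERT INTO users (userid, password) VALUES (?, ?)", "jdoe", "s3cret"));
            batch.add(new SimpleStatement(
                    "UPDATE users SET password = ? WHERE userid = ?", "n3w-s3cret", "jdoe"));

            // The consistency level applies to the batch as a whole.
            batch.setConsistencyLevel(ConsistencyLevel.QUORUM);
            session.execute(batch);
        }
    }
}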


Cheers,
— Brice
​


Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-22 Thread Brice Dutheil
Reads are mostly limited by IO, so I’d set concurrent_read to something
related to your disks; we have set it to 64 (but we have SSDs).
Writes are mostly limited by CPU, so the number of cores matters; we set
concurrent_write to 48 and 128 (depending on the CPU of the nodes).

Careful with LCS: it is not recommended for write-heavy workloads. LCS is
good to optimize reads, in that it avoids reading many SSTables.
​

-- Brice

On Wed, Apr 22, 2015 at 6:53 AM, Anishek Agarwal anis...@gmail.com wrote:

 Thanks Brice for the input,

 I am confused as to how to calculate the value of concurrent_read,
 following is what i found recommended on sites and in configuration docs.

 concurrent_read : some places its 16 X number of drives or 4 X number of
 cores
 which of the above should i pick ?  i have 40 core cpu with 3 disks(non
 ssd) one used for commitlog and other two for data directories, I am having
 3 nodes in my cluster.

 I think there are tools out there that allow the max write speed to disk,
 i am going to run them too to find out the write throughput i can get to
 see that i am not trying to overachieve something, currently we are stuck
 at 35MBps

 @Sebastian
 the concurrent_compactors is at default value of 32 for us and i think
 that should be fine.
 Since we had lot of cores i thought it would be better to use 
 multithreaded_compaction
 but i think i will try one set with it turned off again.

 Question is still,

 how do i find what write load should i aim for per node such that it is
 able to compact data while inserting, is it just try and error ? or there
 is a certain QPS i can target for per node ?

 Our business case is
 -- new client comes and create a new keyspace for him, initially there
 will be lots of new keys ( i think size tired might work better here)
 -- as time progresses we are going to update the existing keys very
 frequently ( i think LCS will work better here -- we are going with this
 strategy for long term benefit)




 On Wed, Apr 22, 2015 at 4:17 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Yes I was referring referring to multithreaded_compaction, but just
 because we didn’t get bitten by this setting just doesn’t mean it’s right,
 and the jira is a clear indication of that ;)

 @Anishek that reminds me of these settings to look at as well:

- concurrent_write and concurrent_read both need to be adapted to
your actual hardware though.

  Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD’s.

 Yes that is typically the case, SSDs are more and more commons but so are
 multi-core CPUs and the trend to multiple cores is not going to stop ; just
 look at the next Intel *flagship* : Knights Landing
 http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
 = *72 cores*.

 Nowadays it is not rare to have boxes with multicore CPU, either way if
 they are not used because of some IO bottleneck there’s no reason to be
 licensed for that, and if IO is not an issue the CPUs are most probably
 next in line. While node is much more about a combination of that plus much
 more added value like the linear scaling of Cassandra. And I’m not even
 listing the other nifty integration that DSE ships in.

 But on this matter I believe we shouldn’t hijack the original thread
 purpose.

 — Brice

 On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
 [sebastian.este...@datastax.com](mailto:sebastian.este...@datastax.com)
 http://mailto:%5bsebastian.este...@datastax.com%5D(mailto:sebastian.este...@datastax.com)
 wrote:

 I want to draw a distinction between a) multithreaded compaction (the
 jira I just pointed to) and b) concurrent_compactors. I'm not clear on
 which one you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can
 change for some workloads with SSD's.


 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology

Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-22 Thread Brice Dutheil
Another reason for memtables to be kept in memory is wide rows.
Maybe someone can chime in and confirm or not, but I believe wide rows (in
the thrift sense) need to be synced entirely across nodes. So from the numbers
you gave, a node can send ~100 MB over the network for a single row. With
compaction and other stuff, it may be an issue, as these objects can stay
long enough in the heap to survive a collection.

Think about the row cache too: with wide rows, Cassandra will hold the rows a
bit longer to serialize the data into the off-heap row cache (in
2.0.x, not sure about other versions). See this page:
http://docs.datastax.com/en/cassandra/2.0/cassandra/operations/ops_configuring_caches_c.html




-- Brice

On Wed, Apr 22, 2015 at 2:47 PM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 Any other suggestions on the JVM Tuning and Cassandra config we did to
 solve the promotion failures during gc?

 I would appreciate if someone can try to answer our queries mentioned in
 initial mail?

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
   *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Wed, 22 Apr, 2015 at 6:12 pm

 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 Thanks Brice for all the comments..

 We analyzed gc logs and heap dump before tuning JVM n gc. With new JVM
 config I specified we were able to remove promotion failures seen with
 default config. With Heap dump I got an idea that memetables and compaction
 are biggest culprits.

 CAASSANDRA-6142 talks about multithreaded_compaction but we are using
 concurrent_compactors. I think they are different. On nodes with many cores
 it is usually recommend to run core/2 concurrent compactors. I dont think
 10 or 12 would  make big difference.

 For now, we have kept compaction throughput to 24 as we already have
 scenarios which create heap pressure due to heavy read write load. Yes we
 can think of increasing it on SSD.

 We have already enabled trickle fsync.

 Justification behind increasing MaxTenuringThreshold ,young gen size and
 creating large survivor space is to gc most memtables in Yong gen itself.
 For making sure that memtables are smaller and not kept too long in heap
 ,we have reduced total_memtable_space_in_mb to 1g from heap size/4 which is
 default. We flush a memtable to disk approx every 15 sec and our minor
 collection runs evry 3-7 secs.So its highly probable that most memtables
 will be collected in young gen. Idea is that most short lived and middle
 life time objects should not reach old gen otherwise CMC old gen
 collections would be very frequent,more expensive as they may not collect
 memtables and fragmentation would be higher.

 I think wide rows less than 100mb should nt be prob. Cassandra infact
 provides very good wide rows format suitable for time series and other
 scenarios. The problem is that when my in_memory_compaction_in_mb limit is
 125 mb why Cassandra is printing compacting large rows when row is less
 than 100mb.



 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
   *From*:Brice Dutheil brice.duth...@gmail.com
 *Date*:Wed, 22 Apr, 2015 at 3:52 am
 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 Hi, I cannot really answer your question as some rock solid truth.

 When we had problems, we did mainly two things

- Analyzed the GC logs (with censum from jClarity, this tool IS really
awesome, it’s good investment even better if the production is running
other java applications)
- Heap dumped cassandra when there was a GC, this helped in narrowing
down the actual issue

 I don’t know precisely how to answer, but :

- concurrent_compactors could be lowered to 10, it seems from another
thread here that it can be harmful, see
https://issues.apache.org/jira/browse/CASSANDRA-6142
- memtable_flush_writers we set it to 2
- compaction_throughput_mb_per_sec could probably be increased, on
SSDs that should help
- trickle_fsync don’t forget this one too if you’re on SSDs

 Touching JVM heap parameters can be hazardous, increasing heap may seem
 like a nice thing, but it can increase GC time in the worst case scenario.

 Also increasing the MaxTenuringThreshold is probably wrong too, as you
 probably know it means objects will be copied from Eden to Survivor 0/1 and
 to the other Survivor on the next collection until that threshold is
 reached, then it will be copied in Old generation. That means that’s being
 applied to Memtables, so it *may* mean several copies to be done on each
 GCs, and memtables are not small objects that could take a little while for
 an *available* system. Another fact to take account for is that upon each
 collection the active survivor S0/S1 has to be big enough for the memtable
 to fit there, and there’s other objects too.

 So I would

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
I’m not sure I get everything about the storm stuff, but my understanding of
LCS is that the compaction count may increase the more one updates data (that’s
why I was wondering about duplicate primary keys).

Another option is that the code is sending too many write requests/s to the
cassandra cluster. I don’t know how many nodes you have, but the fewer nodes
there are, the more compactions.
Also I’d look at the CPU / load; maybe the config is too *restrictive*,
look at the following properties in the cassandra.yaml:

   - compaction_throughput_mb_per_sec: by default the value is 16, you may
   want to increase it, but be careful on mechanical drives; if already on SSDs,
   IO is rarely the issue. We have 64 (with SSDs).
   - multithreaded_compaction: by default it is false, we enabled it.

Compaction threads are niced, so it shouldn’t be much of an issue for serving
production r/w requests. But you never know; always keep an eye on IO and
CPU.

— Brice

On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com wrote:

sorry i take that back we will modify different keys across threads not the
 same key, our storm topology is going to use field grouping to get updates
 for same keys to same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Bruice : I dont think so as i am giving each thread a specific key range
 with no overlaps this does not seem to be the case now. However we will
 have to test where we have to modify the same key across threads -- do u
 think that will cause a problem ? As far as i have read LCS is recommended
 for such cases. should i just switch back to SizeTiredCompactionStrategy.


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Could it that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you will
 be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt
 show any detail about moving from L0 -L1 any specific arguments i should
 try with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a
 big L0 - L1 compaction going on that blocks other compactions from 
 starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver
 to a cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 
 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with
 each thread having non over lapping keys.

 I see that the number of pending tasks via nodetool
 compactionstats keeps increasing and looks like from nodetool cfstats
 test.test_bits has SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek










Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Could it be that the app is inserting _duplicate_ keys ?

-- Brice

On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives you
 sstable level information

 and, it is also likely that since you get so many L0 sstables, you will be
 doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt show
 any detail about moving from L0 -L1 any specific arguments i should try
 with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a big
 L0 - L1 compaction going on that blocks other compactions from starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 Hello,

 I am inserting about 100 million entries via datastax-java driver to a
 cassandra cluster of 3 nodes.

 Table structure is as

 create keyspace test with replication = {'class':
 'NetworkTopologyStrategy', 'DC' : 3};

 CREATE TABLE test_bits(id bigint primary key , some_bits text) with
 gc_grace_seconds=0 and compaction = {'class': 'LeveledCompactionStrategy'}
 and compression={'sstable_compression' : ''};

 have 75 threads that are inserting data into the above table with each
 thread having non over lapping keys.

 I see that the number of pending tasks via nodetool compactionstats
 keeps increasing and looks like from nodetool cfstats test.test_bits has
 SSTTable levels as [154/4, 8, 0, 0, 0, 0, 0, 0, 0],

 Why is compaction not kicking in ?

 thanks
 anishek








Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-21 Thread Brice Dutheil
Hi, I cannot really answer your question with some rock-solid truth.

When we had problems, we did mainly two things

   - Analyzed the GC logs (with censum from jClarity; this tool IS really
   awesome, and a good investment, even better if production is running
   other Java applications)
   - Took a heap dump of Cassandra when there was a GC; this helped in
   narrowing down the actual issue

I don’t know precisely how to answer, but :

   - concurrent_compactors could be lowered to 10; it seems from another
   thread here that it can be harmful, see
   https://issues.apache.org/jira/browse/CASSANDRA-6142
   - memtable_flush_writers: we set it to 2
   - compaction_throughput_mb_per_sec could probably be increased; on SSDs
   that should help
   - trickle_fsync: don’t forget this one either if you’re on SSDs (see the
   sketch below)
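
For illustration only, these knobs all live in cassandra.yaml; the values
below just echo the ones mentioned in this thread (plus the stock
trickle_fsync interval) and are not recommendations:

# cassandra.yaml excerpt -- illustrative values, adapt to your own hardware
concurrent_compactors: 10
memtable_flush_writers: 2
compaction_throughput_mb_per_sec: 64   # on SSDs; keep it closer to 16 on spinning disks
trickle_fsync: true                    # worthwhile on SSDs
trickle_fsync_interval_in_kb: 10240    # the stock default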

Touching JVM heap parameters can be hazardous: increasing the heap may seem
like a nice thing, but it can also increase GC pause times in the worst case.

Also, increasing the MaxTenuringThreshold is probably wrong too. As you
probably know, it means objects are copied from Eden to Survivor 0/1, then
back and forth between the survivor spaces on each young collection until
that threshold is reached, and only then promoted to the old generation.
Applied to Memtables, that *may* mean several copies on each GC, and
memtables are not small objects, so this can take a while for a system that
is supposed to stay *available*. Another fact to take into account is that
upon each collection the active survivor (S0 or S1) has to be big enough for
the memtables to fit there, alongside all the other surviving objects.
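
As a rough, back-of-the-envelope illustration using the settings Anuj
mentioned (HEAP_NEWSIZE=3G, -XX:SurvivorRatio=2):

young generation = Eden + S0 + S1 = 3 GB
SurvivorRatio=2  means Eden = 2 x one survivor, so 3 GB = 4 x one survivor
one survivor     = 3 GB / 4 = 768 MB

With MaxTenuringThreshold raised to 20, every live memtable bouncing between
survivors has to fit in those 768 MB alongside everything else that survives
a young collection, otherwise objects get promoted early anyway.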

So I would rather work on the real cause than on the GC. One thing caught
my attention:

Though still getting logs saying “compacting large row”.

Could it be that the model is based on wide rows ? That could be a problem
for several reasons, not limited to compactions. If that is so, I’d advise
revising the data model.

-- Brice

On Tue, Apr 21, 2015 at 7:53 PM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 Thanks Brice!!

 We are using Red Hat Linux 6.4..24 cores...64Gb Ram..SSDs in RAID5..CPU
 are not overloaded even in peak load..I dont think IO is an issue as iostat
 shows await < 17 at all times..util attribute in iostat usually increases from 0
 to 100..and comes back immediately..I'm not an expert on analyzing IO but
 things look ok..We are using STCS..and not using Logged batches..We are
 making around 12k writes/sec in 5 cf (one with 4 sec index) and 2300
 reads/sec on each node of 3 node cluster. 2 CFs have wide rows with max
 data of around 100mb per row.   We have further reduced
 in_memory_compaction_limit_in_mb to 125.Though still getting logs saying
 compacting large row.

 We are planning to upgrade to 2.0.14 as 2.1 is not yet production ready.

 I would appreciate if you could answer the queries posted in initial mail.

 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
   *From*:Brice Dutheil brice.duth...@gmail.com
 *Date*:Tue, 21 Apr, 2015 at 10:22 pm

 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 This is an intricate matter; I cannot say for sure which parameters are
 good and which are wrong, too many things changed at once.

 However there are many things to consider:

- What is your OS ?
- Do your nodes have SSDs or mechanical drives ? How many cores do you
have ?
- Is it the CPUs or IOs that are overloaded ?
- What is the write request/s per node and cluster wide ?
- What is the compaction strategy of the tables you are writing into ?
- Are you using LOGGED BATCH statements?

 With heavy writes, it is *NOT* recommended to use LOGGED BATCH statements.

 In our 2.0.14 cluster we have experienced node unavailability due to long
 Full GC pauses. We discovered bogus legacy data: a single outlier was so
 wrong that it updated the same CQL rows hundreds of thousands of times with
 duplicate data. Given that the tables we were writing to were configured to
 use LCS, this resulted in keeping Memtables in memory long enough to promote
 them to the old generation (the MaxTenuringThreshold default is 1).
 Handling this data proved to be the thing to fix; with default GC settings
 the cluster (10 nodes) handles 39 write requests/s.

 Note Memtables are allocated on heap with 2.0.x. With 2.1.x they will be
 allocated off-heap.

 -- Brice

 On Tue, Apr 21, 2015 at 5:12 PM, Anuj Wadehra anujw_2...@yahoo.co.in
 wrote:

 Any suggestions or comments on this one??

 Thanks
 Anuj Wadhera

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
 *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Mon, 20 Apr, 2015 at 11:51 pm
 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 Small correction: we are making writes in 5 cf an reading frm one at high
 speeds.



 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src

Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Yes, I was referring to multithreaded_compaction, but just because we didn’t
get bitten by this setting doesn’t mean it’s right, and the JIRA is a clear
indication of that ;)

@Anishek that reminds me of these settings to look at as well:

   - concurrent_writes and concurrent_reads: both need to be adapted to your
   actual hardware though (a sketch follows below).
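
A minimal cassandra.yaml sketch using the rules of thumb from the stock
configuration comments (assumed starting points only, to be tuned against
your actual drives and cores):

# cassandra.yaml excerpt -- rule-of-thumb starting points, not tuned values
concurrent_reads: 32     # roughly 16 x number of data drives
concurrent_writes: 128   # roughly 8 x number of CPU cores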

 Cassandra is, more often than not, disk constrained though this can change
for some workloads with SSD’s.

Yes, that is typically the case; SSDs are more and more common, but so are
multi-core CPUs, and the trend towards more cores is not going to stop ; just
look at the next Intel *flagship*, Knights Landing
http://www.anandtech.com/show/8217/intels-knights-landing-coprocessor-detailed
= *72 cores*.

Nowadays it is not rare to have boxes with multi-core CPUs. Either way, if
the cores are not used because of some IO bottleneck there’s no reason to be
licensed for them, and if IO is not an issue the CPUs are most probably next
in line. A node, though, is about much more than a combination of cores; it
comes with added value like the linear scaling of Cassandra. And I’m not even
listing the other nifty integrations that DSE ships.

But on this matter I believe we shouldn’t hijack the original thread
purpose.

— Brice

On Wed, Apr 22, 2015 at 12:13 AM, Sebastian Estevez
[sebastian.este...@datastax.com](mailto:sebastian.este...@datastax.com)
http://mailto:[sebastian.este...@datastax.com](mailto:sebastian.este...@datastax.com)
wrote:

I want to draw a distinction between a) multithreaded compaction (the jira
 I just pointed to) and b) concurrent_compactors. I'm not clear on which one
 you are recommending at this stage.

 a) Multithreaded compaction is what I warned against in my last note. b)
 Concurrent compactors is the number of separate compaction tasks (on
 different tables) that can run simultaneously. You can crank this up
 without much risk though the old default of num cores was too aggressive
 (CASSANDRA-7139). 2 seems to be the sweet-spot.

 Cassandra is, more often than not, disk constrained though this can change
 for some workloads with SSD's.


 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Tue, Apr 21, 2015 at 5:46 PM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 Oh, thank you Sebastian for this input and the ticket reference !
 We did notice an increase in CPU usage, but kept the concurrent
 compaction low enough for our usage, by default it takes the number of
 cores. We did use a number up to 30% of our available cores. But under
 heavy load clearly CPU is the bottleneck and we have 2 CPU with 8 hyper
 threaded cores per node.

 On a related topic: I’m a bit concerned by the DataStax communication;
 usually people talk about IO as being the weak spot, but in our case it’s
 more about CPU. Fortunately Moore’s law doesn’t really apply vertically
 anymore; now we have multi-core processors *and* the trend is going that
 way. Yet the DataStax terms feel a bit *antiquated* and maybe a bit too
 Oracle-y : http://www.datastax.com/enterprise-terms
 Node licensing is more appropriate for this century.

 -- Brice

 On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
 sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45

Re: Handle Write Heavy Loads in Cassandra 2.0.3

2015-04-21 Thread Brice Dutheil
This is an intricate matter; I cannot say for sure which parameters are good
and which are wrong, too many things changed at once.

However there are many things to consider:

   - What is your OS ?
   - Do your nodes have SSDs or mechanical drives ? How many cores do you
   have ?
   - Is it the CPUs or IOs that are overloaded ?
   - What is the write request/s per node and cluster wide ?
   - What is the compaction strategy of the tables you are writing into ?
   - Are you using LOGGED BATCH statements?

With heavy writes, it is *NOT* recommended to use LOGGED BATCH statements.
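
For illustration (my_table is a hypothetical table), the syntactic difference
is just the UNLOGGED keyword, which skips the batchlog write:

BEGIN UNLOGGED BATCH
  INSERT INTO my_table (id, seq, payload) VALUES (42, 1, 'a');
  INSERT INTO my_table (id, seq, payload) VALUES (42, 2, 'b');
APPLY BATCH;

Even then, an unlogged batch mostly pays off when all statements target the
same partition; otherwise individual, ideally asynchronous, writes are
usually the better option.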

In our 2.0.14 cluster we have experienced node unavailability due to long
Full GC pauses. We discovered bogus legacy data: a single outlier was so
wrong that it updated the same CQL rows hundreds of thousands of times with
duplicate data. Given that the tables we were writing to were configured to
use LCS, this resulted in keeping Memtables in memory long enough to promote
them to the old generation (the MaxTenuringThreshold default is 1).
Handling this data proved to be the thing to fix; with default GC settings
the cluster (10 nodes) handles 39 write requests/s.

Note Memtables are allocated on heap with 2.0.x. With 2.1.x they will be
allocated off-heap.

-- Brice

On Tue, Apr 21, 2015 at 5:12 PM, Anuj Wadehra anujw_2...@yahoo.co.in
wrote:

 Any suggestions or comments on this one??

 Thanks
 Anuj Wadhera

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
   *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Mon, 20 Apr, 2015 at 11:51 pm
 *Subject*:Re: Handle Write Heavy Loads in Cassandra 2.0.3

 Small correction: we are making writes in 5 cf an reading frm one at high
 speeds.


 Thanks
 Anuj Wadehra

 Sent from Yahoo Mail on Android
 https://overview.mail.yahoo.com/mobile/?.src=Android
 --
 *From*:Anuj Wadehra anujw_2...@yahoo.co.in
 *Date*:Mon, 20 Apr, 2015 at 7:53 pm
 *Subject*:Handle Write Heavy Loads in Cassandra 2.0.3

 Hi,

 Recently, we discovered that  millions of mutations were getting dropped
 on our cluster. Eventually, we solved this problem by increasing the value
 of memtable_flush_writers from 1 to 3. We usually write 3 CFs
 simultaneously and one of them has 4 Secondary Indexes.

 New changes also include:
 concurrent_compactors: 12 (earlier it was default)
 compaction_throughput_mb_per_sec: 32(earlier it was default)
 in_memory_compaction_limit_in_mb: 400 (earlier it was default 64)
 memtable_flush_writers: 3 (earlier 1)

 After, making above changes, our write heavy workload scenarios started
 giving promotion failed exceptions in  gc logs.

 We have done JVM tuning and Cassandra config changes to solve this:

 MAX_HEAP_SIZE=12G (Increased Heap from 8G to reduce fragmentation)
 HEAP_NEWSIZE=3G

 JVM_OPTS=$JVM_OPTS -XX:SurvivorRatio=2 (We observed that even at
 SurvivorRatio=4, our survivor space was getting 100% utilized under heavy
 write load and we thought that minor collections were directly promoting
 objects to Tenured generation)

 JVM_OPTS=$JVM_OPTS -XX:MaxTenuringThreshold=20 (Lots of objects were
 moving from Eden to Tenured on each minor collection..may be related to
 medium life objects related to Memtables and compactions as suggested by
 heapdump)

 JVM_OPTS=$JVM_OPTS -XX:ConcGCThreads=20
 JVM_OPTS=$JVM_OPTS -XX:+UnlockDiagnosticVMOptions
 JVM_OPTS=$JVM_OPTS -XX:+UseGCTaskAffinity
 JVM_OPTS=$JVM_OPTS -XX:+BindGCTaskThreadsToCPUs
 JVM_OPTS=$JVM_OPTS -XX:ParGCCardsPerStrideChunk=32768
 JVM_OPTS=$JVM_OPTS -XX:+CMSScavengeBeforeRemark
 JVM_OPTS=$JVM_OPTS -XX:CMSMaxAbortablePrecleanTime=3
 JVM_OPTS=$JVM_OPTS -XX:CMSWaitDuration=2000 //though it's default value
 JVM_OPTS=$JVM_OPTS -XX:+CMSEdenChunksRecordAlways
 JVM_OPTS=$JVM_OPTS -XX:+CMSParallelInitialMarkEnabled
 JVM_OPTS=$JVM_OPTS -XX:-UseBiasedLocking
 JVM_OPTS=$JVM_OPTS -XX:CMSInitiatingOccupancyFraction=70 (to avoid
 concurrent failures we reduced value)

 Cassandra config:
 compaction_throughput_mb_per_sec: 24
 memtable_total_space_in_mb: 1000 (to make memtable flush frequent.default
 is 1/4 heap which creates more long lived objects)

 Questions:
 1. Why increasing memtable_flush_writers and
 in_memory_compaction_limit_in_mb caused promotion failures in JVM? Does
 more memtable_flush_writers mean more memtables in memory?

 2. Still, objects are getting promoted at high speed to Tenured space. CMS
 is running on Old gen every 4-5 minutes  under heavy write load. Around
 750+ minor collections of upto 300ms happened in 45 mins. Do you see any
 problems with new JVM tuning and Cassandra config? Is the justification
 given against those changes sounds logical? Any suggestions?
 3. What is the best practice for reducing heap fragmentation/promotion
 failure when allocation and promotion rates are high?

 Thanks
 Anuj







Re: LCS Strategy, compaction pending tasks keep increasing

2015-04-21 Thread Brice Dutheil
Oh, thank you Sebastian for this input and the ticket reference !
We did notice an increase in CPU usage, but kept concurrent compaction low
enough for our usage; by default it takes the number of cores, and we used
a number up to 30% of our available cores. But under heavy load CPU is
clearly the bottleneck, and we have 2 CPUs with 8 hyper-threaded cores per
node.

On a related topic: I’m a bit concerned by the DataStax communication;
usually people talk about IO as being the weak spot, but in our case it’s
more about CPU. Fortunately Moore’s law doesn’t really apply vertically
anymore; now we have multi-core processors *and* the trend is going that way.
Yet the DataStax terms feel a bit *antiquated* and maybe a bit too Oracle-y :
http://www.datastax.com/enterprise-terms
Node licensing is more appropriate for this century.

-- Brice

On Tue, Apr 21, 2015 at 11:19 PM, Sebastian Estevez 
sebastian.este...@datastax.com wrote:

 Do not enable multithreaded compaction. Overhead usually outweighs any
 benefit. It's removed in 2.1 because it harms more than helps:

 https://issues.apache.org/jira/browse/CASSANDRA-6142

 All the best,


 [image: datastax_logo.png] http://www.datastax.com/

 Sebastián Estévez

 Solutions Architect | 954 905 8615 | sebastian.este...@datastax.com

 [image: linkedin.png] https://www.linkedin.com/company/datastax [image:
 facebook.png] https://www.facebook.com/datastax [image: twitter.png]
 https://twitter.com/datastax [image: g+.png]
 https://plus.google.com/+Datastax/about
 http://feeds.feedburner.com/datastax

 http://cassandrasummit-datastax.com/

 DataStax is the fastest, most scalable distributed database technology,
 delivering Apache Cassandra to the world’s most innovative enterprises.
 Datastax is built to be agile, always-on, and predictably scalable to any
 size. With more than 500 customers in 45 countries, DataStax is the
 database technology and transactional backbone of choice for the worlds
 most innovative companies such as Netflix, Adobe, Intuit, and eBay.

 On Tue, Apr 21, 2015 at 9:06 AM, Brice Dutheil brice.duth...@gmail.com
 wrote:

 I’m not sure I get everything about the Storm stuff, but my understanding of
 LCS is that the compaction count may increase the more one updates data
 (that’s why I was wondering about duplicate primary keys).

 Another option is that the code is sending too many write requests/s to
 the Cassandra cluster. I don’t know how many nodes you have, but the fewer
 nodes there are, the more compactions each of them has to do.
 Also I’d look at the CPU / load; maybe the config is too *restrictive*.
 Look at the following properties in the cassandra.yaml:

- compaction_throughput_mb_per_sec: the default value is 16 and you
may want to increase it, but be careful on mechanical drives; on SSDs
IO is rarely the issue, we have 64 (with SSDs)
- multithreaded_compaction: false by default, we enabled it.

 Compaction threads are niced, so they shouldn’t be much of an issue for
 serving production r/w requests. But you never know, always keep an eye on
 IO and CPU.

 — Brice

 On Tue, Apr 21, 2015 at 2:48 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 sorry i take that back we will modify different keys across threads not
 the same key, our storm topology is going to use field grouping to get
 updates for same keys to same set of bolts.

 On Tue, Apr 21, 2015 at 6:17 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Bruice : I dont think so as i am giving each thread a specific key
 range with no overlaps this does not seem to be the case now. However we
 will have to test where we have to modify the same key across threads -- do
 u think that will cause a problem ? As far as i have read LCS is
 recommended for such cases. Should I just switch back to
 SizeTieredCompactionStrategy?


 On Tue, Apr 21, 2015 at 6:13 PM, Brice Dutheil brice.duth...@gmail.com
  wrote:

 Could it be that the app is inserting _duplicate_ keys ?

 -- Brice

 On Tue, Apr 21, 2015 at 1:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 nope, but you can correlate I guess, tools/bin/sstablemetadata gives
 you sstable level information

 and, it is also likely that since you get so many L0 sstables, you
 will be doing size tiered compaction in L0 for a while.

 On Tue, Apr 21, 2015 at 1:40 PM, Anishek Agarwal anis...@gmail.com
 wrote:

 @Marcus I did look and that is where i got the above but it doesnt
 show any detail about moving from L0 -L1 any specific arguments i should
 try with ?

 On Tue, Apr 21, 2015 at 4:52 PM, Marcus Eriksson krum...@gmail.com
 wrote:

 you need to look at nodetool compactionstats - there is probably a
 big L0 - L1 compaction going on that blocks other compactions from 
 starting

 On Tue, Apr 21, 2015 at 1:06 PM, Anishek Agarwal anis...@gmail.com
  wrote:

 the some_bits column has about 14-15 bytes of data per key.

 On Tue, Apr 21, 2015 at 4:34 PM, Anishek Agarwal 
 anis...@gmail.com wrote:

 Hello,

 I am inserting about 100 million entries via datastax

Re: Frequent timeout issues

2015-04-01 Thread Brice Dutheil
And the keyspace? What is the replication factor.

Also how are the inserts done?

On Wednesday, April 1, 2015, Amlan Roy amlan@cleartrip.com wrote:

 Write consistency level is ONE.

 This is the describe output for one of the tables.

 CREATE TABLE event_data (
   event text,
   week text,
   bucket int,
   date timestamp,
   unique text,
   adt int,
   age list<int>,
   arrival list<timestamp>,
   bank text,
   bf double,
   cabin text,
   card text,
   carrier list<text>,
   cb double,
   channel text,
   chd int,
   company text,
   cookie text,
   coupon list<text>,
   depart list<timestamp>,
   dest list<text>,
   device text,
   dis double,
   domain text,
   duration bigint,
   emi int,
   expressway boolean,
   flight list<text>,
   freq_flyer list<text>,
   host text,
   host_ip text,
   inf int,
   instance text,
   insurance text,
   intl boolean,
   itinerary text,
   journey text,
   meal_pref list<text>,
   mkp double,
   name list<text>,
   origin list<text>,
   pax_type list<text>,
   payment text,
   pref_carrier list<text>,
   referrer text,
   result_cnt int,
   search text,
   src text,
   src_ip text,
   stops int,
   supplier list<text>,
   tags list<text>,
   total double,
   trip text,
   user text,
   user_agent text,
   PRIMARY KEY ((event, week, bucket), date, unique)
 ) WITH CLUSTERING ORDER BY (date DESC, unique ASC) AND
   bloom_filter_fp_chance=0.01 AND
   caching='KEYS_ONLY' AND
   comment='' AND
   dclocal_read_repair_chance=0.10 AND
   gc_grace_seconds=864000 AND
   index_interval=128 AND
   read_repair_chance=0.00 AND
   replicate_on_write='true' AND
   populate_io_cache_on_flush='false' AND
   default_time_to_live=0 AND
   speculative_retry='99.0PERCENTILE' AND
   memtable_flush_period_in_ms=0 AND
   compaction={'class': 'SizeTieredCompactionStrategy'} AND
   compression={'sstable_compression': 'LZ4Compressor'};


 On 01-Apr-2015, at 8:00 pm, Eric R Medley emed...@xylocore.com
 javascript:_e(%7B%7D,'cvml','emed...@xylocore.com'); wrote:

 Also, can you provide the table details and the consistency level you are
 using?

 Regards,

 Eric R Medley

 On Apr 1, 2015, at 9:13 AM, Eric R Medley emed...@xylocore.com
 javascript:_e(%7B%7D,'cvml','emed...@xylocore.com'); wrote:

 Amlan,

 Can you provide information on how much data is being written? Are any of
 the columns really large? Are any writes succeeding or are all timing out?

 Regards,

 Eric R Medley

 On Apr 1, 2015, at 9:03 AM, Amlan Roy amlan@cleartrip.com
 javascript:_e(%7B%7D,'cvml','amlan@cleartrip.com'); wrote:

 Hi,

 I am new to Cassandra. I have setup a cluster with Cassandra 2.0.13. I am
 writing the same data in HBase and Cassandra and find that the writes are
 extremely slow in Cassandra, and I am frequently seeing the exception “Cassandra
 timeout during write query at consistency ONE”. The cluster size for both
 HBase and Cassandra is the same.

 Looks like something is wrong with my cluster setup. What can be the
 possible issue? Data and commit logs are written into two separate disks.

 Regards,
 Amlan






-- 
Brice


Re: Delayed events processing / queue (anti-)pattern

2015-03-27 Thread Brice Dutheil
Would it help here to not actually issue a delete statement but instead use
date based compaction and a dynamically calculated ttl that is some safe
distance in the future from your key?

I’m not sure about this part, *date based compaction*; do you mean
DateTieredCompactionStrategy ?

Anyway, we achieved something like that without this strategy, with a TTL +
date-in-partition-key based approach. The thing to watch, however, is the
size of the partition (one should avoid overly long partitions, i.e. Thrift
wide rows), so the date increment must be adjusted accordingly.
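
A minimal sketch of that pattern (table and column names are made up for the
example):

-- delayed events bucketed by day, so each partition stays bounded
CREATE TABLE delayed_events (
  day text,             -- the date increment to tune, e.g. '2015-03-27'
  fire_at timestamp,
  event_id uuid,
  payload text,
  PRIMARY KEY (day, fire_at, event_id)
);

-- each write carries a TTL placed a safe distance after the event is due
INSERT INTO delayed_events (day, fire_at, event_id, payload)
VALUES ('2015-03-27', '2015-03-27 14:00:00', uuid(), '...')
USING TTL 259200;  -- 3 days

Reading a period back is then a single-partition query, e.g. SELECT * FROM
delayed_events WHERE day = '2015-03-27';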

-- Brice

On Thu, Mar 26, 2015 at 5:23 PM, Robin Verlangen ro...@us2.nl wrote:

 Interesting thought, that should work indeed, I'll evaluate both options
 and provide an update here once I have results.

 Best regards,

 Robin Verlangen
 *Chief Data Architect*

 W http://www.robinverlangen.nl
 E ro...@us2.nl

 http://goo.gl/Lt7BC
 *What is CloudPelican? http://goo.gl/HkB3D*

 Disclaimer: The information contained in this message and attachments is
 intended solely for the attention and use of the named addressee and may be
 confidential. If you are not the intended recipient, you are reminded that
 the information remains the property of the sender. You must not use,
 disclose, distribute, copy, print or rely on this e-mail. If you have
 received this message in error, please contact the sender immediately and
 irrevocably delete this message and any copies.

 On Thu, Mar 26, 2015 at 7:09 AM, Thunder Stumpges 
 thunder.stump...@gmail.com wrote:

 Would it help here to not actually issue a delete statement but instead
 use date based compaction and a dynamically calculated ttl that is some
 safe distance in the future from your key?

 Just a thought.
 -Thunder
  On Mar 25, 2015 11:07 AM, Robert Coli rc...@eventbrite.com wrote:

 On Wed, Mar 25, 2015 at 12:45 AM, Robin Verlangen ro...@us2.nl wrote:

 @Robert: can you elaborate a bit more on the not ideal parts? In my
 case I will be throwing away the rows (thus the points in time that are
 now in the past), which will create tombstones which are compacted away.


 Not ideal is what I mean... Cassandra has immutable data files, use
 cases which do DELETE pay an obvious penalty. Some percentage of tombstones
 will exist continuously, and you have to store them and seek past them.

 =Rob






Re: CQL 3.x Update ...USING TIMESTAMP...

2015-03-13 Thread Brice Dutheil
I agree with Tyler: in the normal run of a live application I would not
recommend the use of the timestamp, and would use other ways to *version*
*inserts*. Otherwise you may fall into the *upsert* pitfalls that Tyler
mentions.

However, I find there’s a legitimate use of the USING TIMESTAMP trick when
migrating data from another datastore.

The trick is, at some point, to let the application start writing to
Cassandra *without* any timestamp set on the statements. ⇐ for fresh
data
Then start a migration batch that uses a write time with an older date
(i.e. when there’s *no* possible *collision* with other data). ⇐ for older
data

*This trick has been used in prod with billions of records.*
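
A minimal sketch of both phases (users is a hypothetical table; USING
TIMESTAMP takes microseconds since the epoch):

-- fresh data: the live application writes without an explicit timestamp
INSERT INTO users (id, email) VALUES (42, 'alice@example.org');

-- migration batch: force an older write time so any fresh write always wins
INSERT INTO users (id, email) VALUES (42, 'alice.legacy@example.org')
USING TIMESTAMP 1388534400000000;  -- 2014-01-01 00:00:00 UTC, before the cutover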

-- Brice

On Thu, Mar 12, 2015 at 10:42 PM, Eric Stevens migh...@gmail.com wrote:

 Ok, but if you're using a system of time that isn't server clock oriented
 (Sachin's document revision ID, and my fixed and necessarily consistent
 base timestamp [B's always know their parent A's exact recorded
 timestamp]), isn't the principle of using timestamps to force a particular
 update out of several to win still sound?

  as using the clocks is only valid if clocks are perfectly sync'ed,
 which they are not

 Clock skew is a problem which doesn't seem to be a factor in either use
 case given that both have a consistent external source of truth for
 timestamp.

 On Thu, Mar 12, 2015 at 12:58 PM, Jonathan Haddad j...@jonhaddad.com
 wrote:

 In most datacenters you're going to see significant variance in your
 server times.  Likely  20ms between servers in the same rack.  Even
 google, using atomic clocks, has 1-7ms variance.  [1]

 I would +1 Tyler's advice here, as using the clocks is only valid if
 clocks are perfectly sync'ed, which they are not, and likely never will be
 in our lifetime.

 [1] http://queue.acm.org/detail.cfm?id=2745385


 On Thu, Mar 12, 2015 at 7:04 AM Eric Stevens migh...@gmail.com wrote:

  It's possible, but you'll end up with problems when attempting to
 overwrite or delete entries

 I'm wondering if you can elucidate on that a little bit, do you just
 mean that it's easy to forget to always set your timestamp correctly, and
 if you goof it up, it makes it difficult to recover from (i.e. you issue a
 delete with system timestamp instead of document version, and that's way
 larger than your document version would ever be, so you can never write
 that document again)?  Or is there some bug in write timestamps that can
 cause the wrong entry to win the write contention?

 We're looking at doing something similar to keep a live max value column
 in a given table, our setup is as follows:

 CREATE TABLE a (
   id whatever,
   time timestamp,
   max_b_foo int,
   PRIMARY KEY (id)
 );
 CREATE TABLE b (
   b_id whatever,
   a_id whatever,
   a_timestamp timestamp,
   foo int,
   PRIMARY KEY (a_id, b_id)
 );

 The idea being that there's a one-to-many relationship between *a* and
 *b*.  We want *a* to know what the maximum value is in *b* for field
 *foo* so we can avoid reading *all* *b* when we want to resolve *a*.
 You can see that we can't just use *b*'s clustering key to resolve that
 with LIMIT 1; also this is for DSE Solr, which wouldn't be able to query a
 by max b.foo anyway.  So when we write to *b*, we also write to *a*
 with something like

 UPDATE a USING TIMESTAMP ${b.a_timestamp.toMicros + b.foo} SET max_b_foo
 = ${b.foo} WHERE id = ${b.a_id}

 Assuming that we don't run afoul of related antipatterns such as
 repeatedly overwriting the same value indefinitely, this strikes me as
 sound if unorthodox practice, as long as conflict resolution in Cassandra
 isn't broken in some subtle way.  We also designed this to be safe from
 getting write timestamps greatly out of sync with clock time so that
 non-timestamped operations (especially delete) if done accidentally will
 still have a reasonable chance of having the expected results.

 So while it may not be the intended use case for write timestamps, and
 there are definitely gotchas if you are not careful or misunderstand the
 consequences, as far as I can see the logic behind it is sound but does
 rely on correct conflict resolution in Cassandra.  I'm curious if I'm
 missing or misunderstanding something important.

 On Wed, Mar 11, 2015 at 4:11 PM, Tyler Hobbs ty...@datastax.com wrote:

 Don't use the version as your timestamp.  It's possible, but you'll end
 up with problems when attempting to overwrite or delete entries.

 Instead, make the version part of the primary key:

 CREATE TABLE document_store (document_id bigint, version int, document
 text, PRIMARY KEY (document_id, version)) WITH CLUSTERING ORDER BY (version
 desc)

 That way you don't have to worry about overwriting higher versions with
 a lower one, and to read the latest version, you only have to do:

 SELECT * FROM document_store WHERE document_id = ? LIMIT 1;

 Another option is to use lightweight transactions (i.e. UPDATE ... SET
 docuement = ?, version = ? WHERE document_id = 

Re: Better option to load data to cassandra

2014-11-12 Thread Brice Dutheil
On our project we wrote ourselves a custom batch job to load the data into
Cassandra the way we wanted.

-- Brice

On Tue, Nov 11, 2014 at 2:33 PM, srinivas rao pinnakasrin...@gmail.com
wrote:

 hi Alexey,

 i tried with sqoop, and data stax copy command. any other options we can
 use.

 I have one more question: I have a table with a composite key as the row
 key, so how can I handle that with sqoop or the copy command while exporting?

 Thanks
 Srinivas


 On Tue, Nov 11, 2014 at 6:28 PM, Plotnik, Alexey aplot...@rhonda.ru
 wrote:

  What have you tried?

 -- Original Message --
 From: srinivas rao pinnakasrin...@gmail.com
 To: Cassandra Users user@cassandra.apache.org
 Sent: 11.11.2014 22:51:54
 Subject: Better option to load data to cassandra


  Hi Team,

 Please suggest me the better options to load data from NoSql to Cassndra.



 Thanks
 Srini





Re: paging through an entire table in chunks?

2014-09-29 Thread Brice Dutheil
You may want to use the async feature
http://www.datastax.com/documentation/developer/java-driver/1.0/java-driver/asynchronous_t.html
of the Java driver. In order to manage the complexity of issuing several
queries, I used RxJava; it improves readability and asynchronicity in a
very elegant way (much more than Futures). However, you may need to write
some glue code to bridge Rx and the Java driver, but it’s worth it.

— Brice

On Sun, Sep 28, 2014 at 12:57 AM, Kevin Burton bur...@spinn3r.com wrote:

Agreed… but I’d like to parallelize it… Eventually I’ll just have too much
 data to do it on one server… plus, I need suspend/resume and this way if
 I’m doing like 10MB at a time I’ll be able to suspend / resume as well as
 track progress.

 On Sat, Sep 27, 2014 at 2:52 PM, DuyHai Doan doanduy...@gmail.com wrote:

 Use the java driver and paging feature:
 http://www.datastax.com/drivers/java/2.1/com/datastax/driver/core/Statement.html#setFetchSize(int)

 1) Do you SELECT * FROM without any selection
 2) Set fetchSize to a sensitive value
 3) Execute the query and get an iterator from the ResultSet
 4) Iterate



 On Sat, Sep 27, 2014 at 11:42 PM, Kevin Burton bur...@spinn3r.com
 wrote:

 I need a way to do a full table scan across all of our data.

 Can’t I just use token() for this?

 This way I could split up our entire keyspace into say 1024 chunks, and
 then have one activemq task work with range 0, then range 1, etc… that way
 I can easily just map() my whole table.

 and since it’s token() I should (generally) read a contiguous range from
 a given table.

 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com





 --

 Founder/CEO Spinn3r.com
 Location: *San Francisco, CA*
 blog: http://burtonator.wordpress.com
 … or check out my Google+ profile
 https://plus.google.com/102718274791889610666/posts
 http://spinn3r.com



Re: using dynamic cell names in CQL 3

2014-09-26 Thread Brice Dutheil
I’m not sure I understand correctly “for example column.name would be
event_name (temperature)”; what I gather, however, is that you have multiple
events that may or may not have certain properties. In your example I
believe you mean you want a CF for events with a type event_name that
contains a column temperature ?!

You can model it like that :

CREATE TABLE events (
  name text,
  metric text,
  value text,
  PRIMARY KEY (name, metric)
)

Where

   - name is the row key, for each kind (or name) of event
   - metric is the column name, aka the clustering key

For example when inserting

INSERT INTO events (name, metric, value) VALUES ('captor',
'temperature', '25 ºC');
INSERT INTO events (name, metric, value) VALUES ('captor', 'wind', '5 km/h');
INSERT INTO events (name, metric, value) VALUES ('captor',
'atmosphere', '1013 millibars');

INSERT INTO events (name, metric, value) VALUES ('cpu', 'temperature', '70 ºC');
INSERT INTO events (name, metric, value) VALUES ('cpu', 'frequency',
'1015,7 MHz');

You will have something like this :

            | temperature | atmosphere     | wind   | frequency
  captor    | 25 ºC       | 1013 millibars | 5 km/h |
  cpu       | 70 ºC       |                |        | 1015,7 MHz

CQLSH represents each clustering key as a row, which is not how the column
family is stored.
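
Reading back is then straightforward with that primary key, for example:

SELECT value FROM events WHERE name = 'captor' AND metric = 'temperature';
SELECT metric, value FROM events WHERE name = 'captor';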

The model I give is just an example, as you may want to model differently
according to your use cases. Time is probably part of them, and will
probably be in the clustering key too.

Note that if you create wide rows and you have *a lot* of data, you may
want to bucket the CF per time period (month / week / day / etc).

HTH

— Brice

On Thu, Sep 25, 2014 at 3:13 PM, shahab shahab.mok...@gmail.com wrote:

Thanks,
 It seems that I was not clear in my question, I would like to store values
 in the column name, for example column.name would be event_name
 (temperature) and column-content would be the respective value (e.g.
 40.5) . And I need to know how the schema should look like in CQL 3

 best,
 /Shahab


 On Wed, Sep 24, 2014 at 1:49 PM, DuyHai Doan doanduy...@gmail.com wrote:

 Dynamic thing in Thrift ≈ clustering columns in CQL

 Can you give more details about your data model ?

 On Wed, Sep 24, 2014 at 1:11 PM, shahab shahab.mok...@gmail.com wrote:

 Hi,

 I  would like to define schema for a table where the column (cell) names
 are defined dynamically. Apparently there is a way to do this in Thrift (
 http://www.datastax.com/dev/blog/does-cql-support-dynamic-columns-wide-rows
 )

 but i couldn't find how i can do the same using CQL?

 Any resource/example that I can look at ?


 best,
 /Shahab





Re: Repair taking long time

2014-09-26 Thread Brice Dutheil
Unfortunately DSE 4.5.0 is still on 2.0.x

-- Brice

On Fri, Sep 26, 2014 at 7:40 PM, Jonathan Haddad j...@jonhaddad.com wrote:

 Are you using Cassandra 2.0 + vnodes? If so, repair takes forever.
 This problem is addressed in 2.1.

 On Fri, Sep 26, 2014 at 9:52 AM, Gene Robichaux
 gene.robich...@match.com wrote:
  I am fairly new to Cassandra. We have a 9 node cluster, 5 in one DC and
 4 in
  another.
 
 
 
  Running a repair on a large column family seems to be moving much slower
  than I expect.
 
 
 
  Looking at nodetool compaction stats it indicates the Validation phase is
  running that the total bytes is 4.5T (4505336278756).
 
 
 
  This is a very large CF. The process has been running for 2.5 hours and
 has
  processed 71G (71950433062). That rate is about 28.4 GB per hour. At this
  rate it will take 158 hours, just shy of 1 week.
 
 
 
  Is this reasonable? This is my first large repair and I am wondering if
 this
  is normal for a CF of this size. Seems like a long time to me.
 
 
 
  Is it possible to tune this process to speed it up? Is there something
 in my
  configuration that could be causing this slow performance? I am running
  HDDs, not SSDs in a JBOD configuration.
 
 
 
 
 
 
 
  Gene Robichaux
 
  Manager, Database Operations
 
  Match.com
 
  8300 Douglas Avenue I Suite 800 I Dallas, TX  75225
 
  Phone: 214-576-3273
 
 



 --
 Jon Haddad
 http://www.rustyrazorblade.com
 twitter: rustyrazorblade



Re: Help with approach to remove RDBMS schema from code to move to C*?

2014-09-20 Thread Brice Dutheil
I’m fairly new to Cassandra, but here’s my input.

Think of your column families as a projection of how the application needs
the data; thinking with CQRS in mind helps. With more CFs this may require
more space, as the same data may be written differently in different column
families for different usages. For that reason you have to think about disk
usage, considering the growth of the data and the space Cassandra needs to
perform compaction and other housekeeping.

Also on the modeling front, pay attention to ever-growing wide rows:
updating or deleting columns in such a row may add too many tombstones
(tombstone_failure_threshold defaults to 100 000), which may cause Cassandra
to abort queries on those rows (until compaction runs) because it has to
load the partition in memory to produce the actual data.
This is especially important for time series. We had to rework our model to
bucket by period to avoid such cases. However, this requires some work in
the business code to query such a column family.

Avoid secondary indexes; modeling per usage, as above, usually removes the
need for them.

Cheers,
— Brice

On Sat, Sep 20, 2014 at 6:55 AM, Jack Krupansky j...@basetechnology.com
wrote:

  Start by asking how you intend to query the data. That should drive the
 data model.

 Is there existing app client code or an app layer that is already using
 the current schema, or are you intending to rewrite that as well.

 FWIW, you could place the numeric columns in a numeric map collection, and
 the string columns in a string map collection, but... it’s best to first
 step back and look at the big picture of what the data actually looks like
 as well as how you want to query it.

 -- Jack Krupansky

  *From:* Les Hartzman lhartz...@gmail.com
 *Sent:* Friday, September 19, 2014 5:46 PM
 *To:* user@cassandra.apache.org
 *Subject:* Help with approach to remove RDBMS schema from code to move to
 C*?

  My company is using an RDBMS for storing time-series data. This
 application was developed before Cassandra and NoSQL. I'd like to move to
 C*, but ...

 The application supports data coming from multiple models of devices.
 Because there is enough variability in the data, the main table to hold the
 device data only has some core columns defined. The other columns are
 non-specific; a set of columns for numeric and a set for character. So for
 these non-specific columns, their use is defined in the code. The use of
 column 'numeric_1' might hold a millisecond time for one device and a fault
 code for another device. This appears to have been done to keep from
 modifying the schema whenever a new device was introduced. And they rolled
 their own db interface to support this mess.

 Now, we could just use C* like an RDBMS - defining CFs to mimic the
 tables. But this just pushes a bad design from one platform to another.

 Clearly there needs to be a code re-write. But what suggestions does
 anyone have on how to make this shift to C*?

 Would you just layout all of the columns represented by the different
 devices, naming them as they are used, and having jagged rows? Or is there
 some other way to approach this?

 Of course, the data miners already have scripts/methods for accessing the
 data from the RDBMS now in the user-unfriendly form it's in now. This would
 have to be addressed as well, but until I know how to store it, mining it
 gets ahead of things.

 Thanks.

 Les

