Re: Is there a plan for Feature like this in C* ?

2018-07-02 Thread Kant Kodali

Hi Justin,

Thanks. It looks like a very early-stage feature, with no Kafka integration
yet, I suppose.

Thanks!

On Mon, Jul 2, 2018 at 6:24 PM, Justin Cameron 
wrote:

> yes, take a look at http://cassandra.apache.org/doc/latest/operating/cdc.html
>
> On Tue, 3 Jul 2018 at 01:20 Kant Kodali  wrote:
>
>> https://www.cockroachlabs.com/docs/v2.1/change-data-capture.html
>>
> --
>
>
> *Justin Cameron*Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>
>
> This email has been sent on behalf of Instaclustr Pty. Limited (Australia)
> and Instaclustr Inc (USA).
>
> This email and any attachments may contain confidential and legally
> privileged information.  If you are not the intended recipient, do not copy
> or disclose its content, but please reply to this email immediately and
> highlight the error to the sender and then immediately delete the message.
>

Is there a plan for Feature like this in C* ?

2018-07-02 Thread Kant Kodali

https://www.cockroachlabs.com/docs/v2.1/change-data-capture.html

UDF related question

2018-04-03 Thread Kant Kodali

Hi All,

I was reading the article below and was wondering how one manages to
block all I/O calls, given that Java has no bytecode instruction for I/O.
Instead, all I/O calls in Java go through the *invokevirtual* bytecode
instruction, but that same instruction can just as well call a C function
that merely adds two numbers. So how can one block all I/O calls?

Thanks!

https://www.datastax.com/dev/blog/user-defined-aggregations-with-spark-in-dse-5

Re: Cassandra/Spark failing to process large table

2018-03-03 Thread Kant Kodali

The fact that cqlsh itself gives different results tells me that this has
nothing to do with Spark. Moreover, the Spark results are monotonically
increasing, which seems more consistent than cqlsh, so I believe
Spark can be taken out of the equation.

Now, while you are running these queries, is another process or
thread writing at the same time? If yes, then your results are
fine; if not, you may want to try nodetool flush first and then run
these iterations again.
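To make the observation about the counts concrete, here is a minimal sketch that checks whether each series of counts from the quoted message below ever decreases, and how far apart its values are. The class and method names (`CountConsistency`, `isNonDecreasing`, `spread`) are made up for illustration; only the six numbers come from the thread.

```java
public class CountConsistency {

    /** Returns true if no value is smaller than the one before it. */
    static boolean isNonDecreasing(long[] counts) {
        for (int i = 1; i < counts.length; i++) {
            if (counts[i] < counts[i - 1]) {
                return false;
            }
        }
        return true;
    }

    /** Returns the difference between the largest and smallest value. */
    static long spread(long[] counts) {
        long min = Long.MAX_VALUE;
        long max = Long.MIN_VALUE;
        for (long c : counts) {
            min = Math.min(min, c);
            max = Math.max(max, c);
        }
        return max - min;
    }

    public static void main(String[] args) {
        // Counts copied from the quoted message.
        long[] spark = {402273852L, 402273884L, 402274209L};
        long[] cqlsh = {402273598L, 402273499L, 402273515L};

        System.out.println("Spark non-decreasing: " + isNonDecreasing(spark) + ", spread: " + spread(spark));
        System.out.println("cqlsh non-decreasing: " + isNonDecreasing(cqlsh) + ", spread: " + spread(cqlsh));
    }
}
```

A series that only grows is what one would expect if something is still writing to the table, whereas a series that goes up and down needs a different explanation. That is why the question about concurrent writers matters here.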

Thanks!


On Fri, Mar 2, 2018 at 11:17 PM, Faraz Mateen  wrote:

> Hi everyone,
>
> I am trying to use spark to process a large cassandra table (~402 million
> entries and 84 columns) but I am getting inconsistent results. Initially
> the requirement was to copy some columns from this table to another table.
> After copying the data, I noticed that some entries in the new table were
> missing. To verify that I took count of the large source table but I am
> getting different values each time. I tried the queries on a smaller table
> (~7 million records) and the results were fine.
>
> Initially, I attempted to take count using pyspark. Here is my pyspark
> script:
>
> from pyspark.sql import SparkSession
>
> spark = SparkSession.builder.appName("Datacopy App").getOrCreate()
> df = (spark.read.format("org.apache.spark.sql.cassandra")
>       .options(table=sourcetable, keyspace=sourcekeyspace)
>       .load().cache())
> df.createOrReplaceTempView("data")
> query = "select count(1) from data"
> vgDF = spark.sql(query)
> vgDF.show(10)
>
> Spark submit command is as follows:
>
> ~/spark-2.1.0-bin-hadoop2.7/bin/spark-submit \
>   --master spark://10.128.0.18:7077 \
>   --packages datastax:spark-cassandra-connector:2.0.1-s_2.11 \
>   --conf spark.cassandra.connection.host="10.128.1.1,10.128.1.2,10.128.1.3" \
>   --conf "spark.storage.memoryFraction=1" \
>   --conf spark.local.dir=/media/db/ \
>   --executor-memory 10G --num-executors=6 --executor-cores=2 \
>   --total-executor-cores 18 pyspark_script.py
>
> The above spark submit process takes ~90 minutes to complete. I ran it
> three times and here are the counts I got:
>
> Spark iteration 1:  402273852
> Spark iteration 2:  402273884
> Spark iteration 3:  402274209
>
> Spark does not show any error or exception during the entire process. I
> ran the same query in cqlsh thrice and got different results again:
>
> Cqlsh iteration 1:   402273598
> Cqlsh iteration 2:   402273499
> Cqlsh iteration 3:   402273515
>
> I am unable to find out why I am getting different outcomes from the same
> query. Cassandra system logs (*/var/log/cassandra/system.log*) has shown
> the following error message just once:
>
> ERROR [SSTableBatchOpen:3] 2018-02-27 09:48:23,592 CassandraDaemon.java:226 - Exception in thread Thread[SSTableBatchOpen:3,5,main]
> java.lang.AssertionError: Stats component is missing for sstable /media/db/datakeyspace/sensordata1-acfa7880acba11e782fd9bf3ae460699/mc-58617-big
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:460) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.io.sstable.format.SSTableReader.open(SSTableReader.java:375) ~[apache-cassandra-3.9.jar:3.9]
>     at org.apache.cassandra.io.sstable.format.SSTableReader$4.run(SSTableReader.java:536) ~[apache-cassandra-3.9.jar:3.9]
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) ~[na:1.8.0_131]
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266) ~[na:1.8.0_131]
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) ~[na:1.8.0_131]
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_131]
>     at java.lang.Thread.run(Thread.java:748) [na:1.8.0_131]
>
> *Versions:*
>
> - Cassandra 3.9
> - Spark 2.1.0
> - Datastax's spark-cassandra-connector 2.0.1
> - Scala version 2.11
>
> *Cluster:*
>
> - Spark setup with 3 workers and 1 master node.
> - 3 worker nodes also have a cassandra cluster installed.
> - Each worker node has 8 CPU cores and 40 GB RAM.
>
> Any help will be greatly appreciated.
>
> Thanks,
> Faraz
>

Re: Jon Haddad on Diagnosing Performance Problems in Production

2018-02-27 Thread Kant Kodali

+1 That was a nice talk! I don't know why I haven't come across that video
before!

On Tue, Feb 27, 2018 at 9:12 AM, Jonathan Haddad  wrote:

> There isn't a ton from that talk I'd consider "wrong" at this point, but
> some of it is a little stale.  I always start off looking at system
> metrics.  For a very thorough discussion on the matter check out Brendan
> Gregg's USE [1] method.  I did a blog post on my own about the talk [2]
> that has screenshots and might be helpful.  Generally speaking know your OS
> and the tools to examine each component.  Learn how to interpret the
> numbers you see, there's more information than a human can process in a
> lifetime but understanding some fundamentals of throughput vs latency &
> error rates and how to find out each of those metrics for cpu / memory /
> network / disk is a good start.
>
> More recently I did a talk at Data Day Texas, I posted the slides on
> Slideshare [3].  The focus there was more on perf tuning and less on
> performance troubleshooting, but I guess it's a matter of perspective which
> point you're at.  The tools have changed a little (Prometheus instead of
> Graphite), and there are some new perf tuning tips like examining your read
> ahead and compression settings, generating flame graphs and using tools
> like YourKit and Java Flight Recorder, and the easiest win of all time,
> disabling dynamic snitch if your hardware is fast and you want sub ms
> p99s.  Turn up counter cache if you use counters (it still gets hit on the
> write path), and row cache is way more effective than people give it credit
> for under the right workloads.
>
> I've got a blog post in the works on JVM tuning, but for now I reference
> CASSANDRA-8150 [4] and Blake Eggleston's blog post [5] from back in our
> days at a small startup.
>
> Lastly, I'm doing a performance tuning series on our blog at The Last
> Pickle, with the first being on Flame Graphs [6].  I've got about 6 posts
> in the pipeline, just need to find time to get to them.
>
> Hope this helps,
> Jon
>
> [1] http://www.brendangregg.com/usemethod.html
> [2] http://rustyrazorblade.com/post/2014/2014-09-18-diagnosing-production/
> [3] https://www.slideshare.net/JonHaddad/performance-tuning-86995333
> [4] https://issues.apache.org/jira/browse/CASSANDRA-8150
> [5] http://blakeeggleston.com/cassandra-tuning-the-jvm-for-read-heavy-workloads.html
> [6] http://thelastpickle.com/blog/2018/01/16/cassandra-flame-graphs.html
>
>
>
> On Tue, Feb 27, 2018 at 8:56 AM Michael Shuler 
> wrote:
>
>> On 02/27/2018 10:20 AM, Nicolas Guyomar wrote:
>> > Was Jon's blog post
>> > https://academy.datastax.com/planet-cassandra/blog/cassandra-summit-recap-diagnosing-problems-in-production
>> > relocated somewhere?
>>
>> https://web.archive.org/web/20160322011022/planetcassandra.org/blog/cassandra-summit-recap-diagnosing-problems-in-production
>>
>> --
>> Kind regards,
>> Michael
>>
>> -
>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>
>>

Re: How to Parse raw CQL text?

2018-02-26 Thread Kant Kodali

Wouldn't it make sense to expose the parser at some point?
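For readers following the workaround described in the quoted replies below (a helper that grabs a private field out of an object by name using reflection), here is a minimal sketch of what such a helper might look like. The names `PrivateFieldAccess`, `readPrivateField` and `Example` are hypothetical and exist only for this illustration; this is not the code from either poster's tool.

```java
import java.lang.reflect.Field;

public class PrivateFieldAccess {

    /**
     * Reads a field by name, searching the object's class and then its
     * superclasses. Reflection failures are wrapped in an unchecked exception.
     */
    static Object readPrivateField(Object target, String fieldName) {
        Class<?> type = target.getClass();
        while (type != null) {
            try {
                Field field = type.getDeclaredField(fieldName);
                field.setAccessible(true); // bypass the private modifier
                return field.get(target);
            } catch (NoSuchFieldException e) {
                type = type.getSuperclass(); // keep looking further up
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("Cannot read field " + fieldName, e);
            }
        }
        throw new IllegalArgumentException("No field named " + fieldName);
    }

    /** Hypothetical class with a private field, standing in for a parser result. */
    static class Example {
        private final String keyspace = "test_keyspace";
    }

    public static void main(String[] args) {
        System.out.println(readPrivateField(new Example(), "keyspace"));
    }
}
```

As noted in the thread, code like this depends on internal field names, so it can break whenever those internals change. That is part of the argument for exposing the parser properly.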

On Mon, Feb 26, 2018 at 9:47 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> I took a similar approach and it worked fine. I was able to build a tool
> that parsed production query logs.
>
> I used a helper method that would just grab a private field out of an
> object by name using reflection.
>
> Ariel
>
> On Sun, Feb 25, 2018, at 11:58 PM, Jonathan Haddad wrote:
>
> I had to do something similar recently.  Take a look at
> org.apache.cassandra.cql3.QueryProcessor.parseStatement().  I've got some
> sample code here [1] as well as a blog post [2] that explains how to access
> the private variables, since there's no access provided.  It wasn't really
> designed to be used as a library, so YMMV with future changes.
>
> [1] https://github.com/rustyrazorblade/rustyrazorblade-examples/blob/master/privatevaraccess/src/main/kotlin/com/rustyrazorblade/privatevaraccess/CreateTableParser.kt
> [2] http://rustyrazorblade.com/post/2018/2018-02-25-accessing-private-variables-in-jvm/
>
> On Mon, Feb 5, 2018 at 2:27 PM Kant Kodali <k...@peernova.com> wrote:
>
> I just did some trial and error. Looks like this would work
>
> public class Test {
>
>     public static void main(String[] args) throws Exception {
>
>         String stmt = "create table if not exists test_keyspace.my_table "
>                 + "(field1 text, field2 int, field3 set, field4 map<ascii, text>, "
>                 + "primary key (field1) );";
>         ANTLRStringStream stringStream = new ANTLRStringStream(stmt);
>         CqlLexer cqlLexer = new CqlLexer(stringStream);
>         CommonTokenStream token = new CommonTokenStream(cqlLexer);
>         CqlParser parser = new CqlParser(token);
>
>         ParsedStatement query = parser.cqlStatement();
>
>         if (query.getClass().getDeclaringClass() == CreateTableStatement.class) {
>             CreateTableStatement.RawStatement cts = (CreateTableStatement.RawStatement) query;
>
>             CFMetaData
>                     .compile(stmt, cts.keyspace())
>                     .getColumnMetadata()
>                     .values()
>                     .stream()
>                     .forEach(cd -> System.out.println(cd));
>         }
>     }
> }
>
>
> On Mon, Feb 5, 2018 at 2:13 PM, Kant Kodali <k...@peernova.com> wrote:
>
> Hi Anant,
>
> I just have a CQL create table statement as a string, and I want to extract
> all the parts: tableName, KeySpaceName, regular Columns, partitionKey,
> ClusteringKey, Clustering Order and so on. That's really it!
>
> Thanks!
>
> On Mon, Feb 5, 2018 at 1:50 PM, Rahul Singh <rahul.xavier.si...@gmail.com>
> wrote:
>
> I think I understand what you are trying to do … but what is your goal?
> What do you mean “use it for different” queries… Maybe you want to do an
> event and have an event processor? Seems like you are trying to basically
> bypass that pattern and parse a query and split it into several actions?
>
> Did you look into this unit test folder?
>
> https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/CQLTester.java
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Feb 5, 2018, 4:06 PM -0500, Kant Kodali <k...@peernova.com>, wrote:
>
> Hi All,
>
> I have a need where I get a raw CQL create table statement as a String and
> I need to parse the keyspace, tablename, columns and so on..so I can use it
> for various queries and send it to C*. I used the example below from this
> link <https://github.com/tacoo/cassandra-antlr-sample>. I get the
> following error.  And I thought maybe someone in this mailing list will be
> more familiar with internals.
>
> Exception in thread "main" org.apache.cassandra.exceptions.ConfigurationException: Keyspace test_keyspace doesn't exist
>     at org.apache.cassandra.cql3.statements.CreateTableStatement$RawStatement.prepare(CreateTableStatement.java:200)
>     at com.hello.world.Test.main(Test.java:23)
>
>
> Here is my code.
>
> package com.hello.world;
>
> import org.antlr.runtime.ANTLRStringStream;
> import org.antlr.runtime.CommonTokenStream;
> import org.apache.cassandra.cql3.CqlLexer;
> import org.apache.cassandra.cql3.CqlParser;
> import org.apache.cassandra.cql3.statements.CreateTableStatement;
> import org.apache.cassandra.cql3.statements.ParsedStatement;
>
> public class Test {
>
>     public static void main(String[] args) throws Exception {
>         String stmt = "create table if not exists test_keyspace.my_table

Re: How to Parse raw CQL text?

2018-02-05 Thread Kant Kodali

I just did some trial and error. Looks like this would work

public class Test {

    public static void main(String[] args) throws Exception {

        String stmt = "create table if not exists test_keyspace.my_table "
                + "(field1 text, field2 int, field3 set, field4 map<ascii, text>, "
                + "primary key (field1) );";
        ANTLRStringStream stringStream = new ANTLRStringStream(stmt);
        CqlLexer cqlLexer = new CqlLexer(stringStream);
        CommonTokenStream token = new CommonTokenStream(cqlLexer);
        CqlParser parser = new CqlParser(token);
        ParsedStatement query = parser.cqlStatement();
        if (query.getClass().getDeclaringClass() == CreateTableStatement.class) {
            CreateTableStatement.RawStatement cts = (CreateTableStatement.RawStatement) query;
            CFMetaData
                    .compile(stmt, cts.keyspace())
                    .getColumnMetadata()
                    .values()
                    .stream()
                    .forEach(cd -> System.out.println(cd));
        }
    }
}
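The `getDeclaringClass()` comparison in the snippet above works because `RawStatement` is a nested class of `CreateTableStatement`, as the `CreateTableStatement$RawStatement` frame in the stack trace also suggests. Here is a small self-contained sketch of that behaviour. The classes below are hypothetical stand-ins that only mirror the nesting; they are not the Cassandra classes.

```java
public class DeclaringClassDemo {

    // Hypothetical stand-ins that mirror only the nesting used in the thread.
    static class ParsedStatement {}

    static class CreateTableStatement {
        static class RawStatement extends ParsedStatement {}
    }

    static class OtherStatement extends ParsedStatement {}

    /** True if the runtime class of stmt is declared directly inside CreateTableStatement. */
    static boolean isCreateTable(ParsedStatement stmt) {
        return stmt.getClass().getDeclaringClass() == CreateTableStatement.class;
    }

    public static void main(String[] args) {
        // RawStatement is declared inside CreateTableStatement.
        System.out.println(isCreateTable(new CreateTableStatement.RawStatement())); // true
        // OtherStatement is declared inside DeclaringClassDemo instead.
        System.out.println(isCreateTable(new OtherStatement())); // false
    }
}
```

An `instanceof CreateTableStatement.RawStatement` check would probably read more naturally and would also accept subclasses. The comparison above is kept because it is what the posted code uses.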


On Mon, Feb 5, 2018 at 2:13 PM, Kant Kodali <k...@peernova.com> wrote:

> Hi Anant,
>
> I just have a CQL create table statement as a string, and I want to extract
> all the parts: tableName, KeySpaceName, regular Columns, partitionKey,
> ClusteringKey, Clustering Order and so on. That's really it!
>
> Thanks!
>
> On Mon, Feb 5, 2018 at 1:50 PM, Rahul Singh <rahul.xavier.si...@gmail.com>
> wrote:
>
>> I think I understand what you are trying to do … but what is your goal?
>> What do you mean “use it for different” queries… Maybe you want to do an
>> event and have an event processor? Seems like you are trying to basically
>> bypass that pattern and parse a query and split it into several actions?
>>
>> Did you look into this unit test folder?
>>
>> https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/CQLTester.java
>>
>> --
>> Rahul Singh
>> rahul.si...@anant.us
>>
>> Anant Corporation
>>
>> On Feb 5, 2018, 4:06 PM -0500, Kant Kodali <k...@peernova.com>, wrote:
>>
>> Hi All,
>>
>> I have a need where I get a raw CQL create table statement as a String
>> and I need to parse the keyspace, tablename, columns and so on..so I can
>> use it for various queries and send it to C*. I used the example below
>> from this link <https://github.com/tacoo/cassandra-antlr-sample>. I get
>> the following error.  And I thought maybe someone in this mailing list will
>> be more familiar with internals.
>>
>> Exception in thread "main" org.apache.cassandra.exceptions.ConfigurationException: Keyspace test_keyspace doesn't exist
>>     at org.apache.cassandra.cql3.statements.CreateTableStatement$RawStatement.prepare(CreateTableStatement.java:200)
>>     at com.hello.world.Test.main(Test.java:23)
>>
>>
>> Here is my code.
>>
>> package com.hello.world;
>>
>> import org.antlr.runtime.ANTLRStringStream;
>> import org.antlr.runtime.CommonTokenStream;
>> import org.apache.cassandra.cql3.CqlLexer;
>> import org.apache.cassandra.cql3.CqlParser;
>> import org.apache.cassandra.cql3.statements.CreateTableStatement;
>> import org.apache.cassandra.cql3.statements.ParsedStatement;
>>
>> public class Test {
>>
>>     public static void main(String[] args) throws Exception {
>>         String stmt = "create table if not exists test_keyspace.my_table "
>>                 + "(field1 text, field2 int, field3 set, field4 map<ascii, text>, "
>>                 + "primary key (field1) );";
>>         ANTLRStringStream stringStream = new ANTLRStringStream(stmt);
>>         CqlLexer cqlLexer = new CqlLexer(stringStream);
>>         CommonTokenStream token = new CommonTokenStream(cqlLexer);
>>         CqlParser parser = new CqlParser(token);
>>         ParsedStatement query = parser.query();
>>         if (query.getClass().getDeclaringClass() == CreateTableStatement.class) {
>>             CreateTableStatement.RawStatement cts = (CreateTableStatement.RawStatement) query;
>>             System.out.println(cts.keyspace());
>>             System.out.println(cts.columnFamily());
>>             ParsedStatement.Prepared prepared = cts.prepare();
>>             CreateTableStatement cts2 = (CreateTableStatement) prepared.statement;
>>             cts2.getCFMetaData()
>>                     .getColumnMetadata()
>>                     .values()
>>                     .stream()
>>                     .forEach(cd -> System.out.println(cd));
>>         }
>>     }
>> }
>>
>> Thanks!
>>
>>
>

Re: How to Parse raw CQL text?

2018-02-05 Thread Kant Kodali

Hi Anant,

I just have a CQL create table statement as a string, and I want to extract
all the parts: tableName, KeySpaceName, regular Columns, partitionKey,
ClusteringKey, Clustering Order and so on. That's really it!

Thanks!

On Mon, Feb 5, 2018 at 1:50 PM, Rahul Singh <rahul.xavier.si...@gmail.com>
wrote:

> I think I understand what you are trying to do … but what is your goal?
> What do you mean “use it for different” queries… Maybe you want to do an
> event and have an event processor? Seems like you are trying to basically
> bypass that pattern and parse a query and split it into several actions?
>
> Did you look into this unit test folder?
>
> https://github.com/apache/cassandra/blob/trunk/test/unit/org/apache/cassandra/cql3/CQLTester.java
>
> --
> Rahul Singh
> rahul.si...@anant.us
>
> Anant Corporation
>
> On Feb 5, 2018, 4:06 PM -0500, Kant Kodali <k...@peernova.com>, wrote:
>
> Hi All,
>
> I have a need where I get a raw CQL create table statement as a String and
> I need to parse the keyspace, tablename, columns and so on..so I can use it
> for various queries and send it to C*. I used the example below from this
> link <https://github.com/tacoo/cassandra-antlr-sample>. I get the
> following error.  And I thought maybe someone in this mailing list will be
> more familiar with internals.
>
> Exception in thread "main" org.apache.cassandra.exceptions.ConfigurationException: Keyspace test_keyspace doesn't exist
>     at org.apache.cassandra.cql3.statements.CreateTableStatement$RawStatement.prepare(CreateTableStatement.java:200)
>     at com.hello.world.Test.main(Test.java:23)
>
>
> Here is my code.
>
> package com.hello.world;
>
> import org.antlr.runtime.ANTLRStringStream;
> import org.antlr.runtime.CommonTokenStream;
> import org.apache.cassandra.cql3.CqlLexer;
> import org.apache.cassandra.cql3.CqlParser;
> import org.apache.cassandra.cql3.statements.CreateTableStatement;
> import org.apache.cassandra.cql3.statements.ParsedStatement;
>
> public class Test {
>
>     public static void main(String[] args) throws Exception {
>         String stmt = "create table if not exists test_keyspace.my_table "
>                 + "(field1 text, field2 int, field3 set, field4 map<ascii, text>, "
>                 + "primary key (field1) );";
>         ANTLRStringStream stringStream = new ANTLRStringStream(stmt);
>         CqlLexer cqlLexer = new CqlLexer(stringStream);
>         CommonTokenStream token = new CommonTokenStream(cqlLexer);
>         CqlParser parser = new CqlParser(token);
>         ParsedStatement query = parser.query();
>         if (query.getClass().getDeclaringClass() == CreateTableStatement.class) {
>             CreateTableStatement.RawStatement cts = (CreateTableStatement.RawStatement) query;
>             System.out.println(cts.keyspace());
>             System.out.println(cts.columnFamily());
>             ParsedStatement.Prepared prepared = cts.prepare();
>             CreateTableStatement cts2 = (CreateTableStatement) prepared.statement;
>             cts2.getCFMetaData()
>                     .getColumnMetadata()
>                     .values()
>                     .stream()
>                     .forEach(cd -> System.out.println(cd));
>         }
>     }
> }
>
> Thanks!
>
>

How to Parse raw CQL text?

2018-02-05 Thread Kant Kodali

Hi All,

I have a need where I get a raw CQL create table statement as a String and
I need to parse the keyspace, tablename, columns and so on, so I can use it
for various queries and send it to C*. I used the example below from this
link <https://github.com/tacoo/cassandra-antlr-sample>. I get the following
error, and I thought maybe someone on this mailing list would be more
familiar with the internals.

Exception in thread "main" org.apache.cassandra.exceptions.ConfigurationException: Keyspace test_keyspace doesn't exist
    at org.apache.cassandra.cql3.statements.CreateTableStatement$RawStatement.prepare(CreateTableStatement.java:200)
    at com.hello.world.Test.main(Test.java:23)


Here is my code.

package com.hello.world;

import org.antlr.runtime.ANTLRStringStream;
import org.antlr.runtime.CommonTokenStream;
import org.apache.cassandra.cql3.CqlLexer;
import org.apache.cassandra.cql3.CqlParser;
import org.apache.cassandra.cql3.statements.CreateTableStatement;
import org.apache.cassandra.cql3.statements.ParsedStatement;

public class Test {

    public static void main(String[] args) throws Exception {
        String stmt = "create table if not exists test_keyspace.my_table "
                + "(field1 text, field2 int, field3 set, field4 map<ascii, text>, "
                + "primary key (field1) );";
        ANTLRStringStream stringStream = new ANTLRStringStream(stmt);
        CqlLexer cqlLexer = new CqlLexer(stringStream);
        CommonTokenStream token = new CommonTokenStream(cqlLexer);
        CqlParser parser = new CqlParser(token);
        ParsedStatement query = parser.query();
        if (query.getClass().getDeclaringClass() == CreateTableStatement.class) {
            CreateTableStatement.RawStatement cts = (CreateTableStatement.RawStatement) query;
            System.out.println(cts.keyspace());
            System.out.println(cts.columnFamily());
            ParsedStatement.Prepared prepared = cts.prepare();
            CreateTableStatement cts2 = (CreateTableStatement) prepared.statement;
            cts2.getCFMetaData()
                    .getColumnMetadata()
                    .values()
                    .stream()
                    .forEach(cd -> System.out.println(cd));
        }
    }
}

Thanks!

Re: unable to start cassandra 3.11.1

2018-02-02 Thread Kant Kodali

When you say latest Java runtime, do you mean it works with Java 9 as well?

On Fri, Feb 2, 2018 at 5:02 AM, Sam Tunnicliffe <s...@beobal.com> wrote:

> I've actually just committed the fix for this to the 3.11 and trunk
> branches, so if you desperately need a compatible build you can make a build
> from those branches.
> As I mentioned on the JIRA, I expect we'll move to a release vote very
> soon, so hopefully should have a 3.11.2 release with this fix shortly.
>
>
> On 2 February 2018 at 12:19, Marcus Haarmann <marcus.haarm...@midoco.de>
> wrote:
>
>> you can try to checkout https://github.com/beobal/cassandra/tree/14173-3.11
>> and compile yourself a compatible version (unreleased), in case you are
>> bound to the latest java runtime for any reason.
>>
>> Marcus Haarmann
>>
>> --
>> *From: *"Kant Kodali" <k...@peernova.com>
>> *To: *"user" <user@cassandra.apache.org>
>> *Sent: *Thursday, 1 February 2018 23:45:06
>> *Subject: *Re: unable to start cassandra 3.11.1
>>
>> OK, I saw the ticket. Looks like this Java version, "1.8.0_162", won't work!
>>
>> On Thu, Feb 1, 2018 at 2:43 PM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> Hi Justin,
>>> I am using
>>>
>>> java version "1.8.0_162"
>>>
>>> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
>>>
>>>
>>> Thanks!
>>>
>>> On Thu, Feb 1, 2018 at 2:40 PM, Justin Cameron <jus...@instaclustr.com>
>>> wrote:
>>>
>>>> Unfortunately C* 3.11.1 is incompatible with the latest version of
>>>> Java. You'll need to either downgrade to Java 1.8.0.151-5 or wait for C*
>>>> 3.11.2 (see https://issues.apache.org/jira/browse/CASSANDRA-14173 for
>>>> details)
>>>>
>>>> On Fri, 2 Feb 2018 at 09:35 Kant Kodali <k...@peernova.com> wrote:
>>>>
>>>>> Hi All,
>>>>> I am unable to start cassandra 3.11.1. Below is the stack trace.
>>>>>
>>>>> Exception (java.lang.AbstractMethodError) encountered during startup: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>>>> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150)
>>>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135)
>>>>>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405)
>>>>>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104)
>>>>>     at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143)
>>>>>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188)
>>>>>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600)
>>>>>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
>>>>> ERROR 22:33:49 Exception encountered during startup
>>>>> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
>>>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
>>>>>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
>>>>>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.1.jar:3.11

Re: unable to start cassandra 3.11.1

2018-02-01 Thread Kant Kodali

OK, I saw the ticket. Looks like this Java version, "1.8.0_162", won't work!
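For anyone wanting to check a machine before starting a node, below is a rough sketch that pulls the update number out of a legacy Java 8 version string such as "1.8.0_162". Everything here is illustrative: the class and method names are made up, and the threshold of update 161 is my assumption based on the ticket referenced in the quoted reply. The thread itself only establishes that 1.8.0_162 fails and that 1.8.0_151 is the suggested fallback.

```java
public class JavaUpdateCheck {

    // Assumption: update 161 is taken as the first affected Java 8 build.
    static final int FIRST_AFFECTED_UPDATE = 161;

    /** Returns the update number of a string like "1.8.0_162", or -1 if there is none. */
    static int updateNumber(String version) {
        int idx = version.indexOf('_');
        if (idx < 0) {
            return -1;
        }
        String tail = version.substring(idx + 1);
        int end = 0;
        while (end < tail.length() && Character.isDigit(tail.charAt(end))) {
            end++;
        }
        return end == 0 ? -1 : Integer.parseInt(tail.substring(0, end));
    }

    /** True for Java 8 versions at or above the assumed first affected update. */
    static boolean likelyAffected(String version) {
        return version.startsWith("1.8.") && updateNumber(version) >= FIRST_AFFECTED_UPDATE;
    }

    public static void main(String[] args) {
        String version = System.getProperty("java.version");
        System.out.println(version + " -> likely affected: " + likelyAffected(version));
    }
}
```

This only inspects the version string and says nothing about whether a given Cassandra build already contains the fix, so treat it as a quick sanity check rather than a compatibility test.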

On Thu, Feb 1, 2018 at 2:43 PM, Kant Kodali <k...@peernova.com> wrote:

> Hi Justin,
>
> I am using
>
> java version "1.8.0_162"
>
> Java(TM) SE Runtime Environment (build 1.8.0_162-b12)
>
>
> Thanks!
>
> On Thu, Feb 1, 2018 at 2:40 PM, Justin Cameron <jus...@instaclustr.com>
> wrote:
>
>> Unfortunately C* 3.11.1 is incompatible with the latest version of Java.
>> You'll need to either downgrade to Java 1.8.0.151-5 or wait for C* 3.11.2
>> (see https://issues.apache.org/jira/browse/CASSANDRA-14173 for details)
>>
>> On Fri, 2 Feb 2018 at 09:35 Kant Kodali <k...@peernova.com> wrote:
>>
>>> Hi All,
>>>
>>> I am unable to start cassandra 3.11.1. Below is the stack trace.
>>>
>>> Exception (java.lang.AbstractMethodError) encountered during startup: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150)
>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135)
>>>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405)
>>>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104)
>>>     at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143)
>>>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188)
>>>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600)
>>>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
>>> ERROR 22:33:49 Exception encountered during startup
>>> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
>>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
>>>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
>>>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>>     at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143) [apache-cassandra-3.11.1.jar:3.11.1]
>>>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188) [apache-cassandra-3.11.1.jar:3.11.1]
>>>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) [apache-cassandra-3.11.1.jar:3.11.1]
>>>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) [apache-cassandra-3.11.1.jar:3.11.1]
>>>
>>> --
>>
>>
>> *Justin Cameron*Senior Software Engineer
>>
>>
>> <https://www.instaclustr.com/>
>>
>
>

Re: unable to start cassandra 3.11.1

2018-02-01 Thread Kant Kodali

Hi Justin,

I am using

java version "1.8.0_162"

Java(TM) SE Runtime Environment (build 1.8.0_162-b12)


Thanks!

On Thu, Feb 1, 2018 at 2:40 PM, Justin Cameron <jus...@instaclustr.com>
wrote:

> Unfortunately C* 3.11.1 is incompatible with the latest version of Java.
> You'll need to either downgrade to Java 1.8.0.151-5 or wait for C* 3.11.2
> (see https://issues.apache.org/jira/browse/CASSANDRA-14173 for details)
>
> On Fri, 2 Feb 2018 at 09:35 Kant Kodali <k...@peernova.com> wrote:
>
>> Hi All,
>>
>> I am unable to start cassandra 3.11.1. Below is the stack trace.
>>
>> Exception (java.lang.AbstractMethodError) encountered during startup: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150)
>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135)
>>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405)
>>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104)
>>     at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143)
>>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188)
>>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600)
>>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
>> ERROR 22:33:49 Exception encountered during startup
>> java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
>>     at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
>>     at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
>>     at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.1.jar:3.11.1]
>>     at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143) [apache-cassandra-3.11.1.jar:3.11.1]
>>     at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188) [apache-cassandra-3.11.1.jar:3.11.1]
>>     at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) [apache-cassandra-3.11.1.jar:3.11.1]
>>     at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) [apache-cassandra-3.11.1.jar:3.11.1]
>>
>> --
>
>
> *Justin Cameron*Senior Software Engineer
>
>
> <https://www.instaclustr.com/>
>
>
>
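
A minimal sketch, in Python, of the version check implied by the reply above. The threshold of update 151 is taken only from the advice in this thread, and the helper names are hypothetical; the exact Java update in which the JMX change landed should be checked against CASSANDRA-14173 rather than assumed from this example.

```python
import re

# Highest Java 8 update suggested in this thread for Cassandra 3.11.1 (assumption).
MAX_SAFE_UPDATE_FOR_3_11_1 = 151


def parse_java8_update(version_line):
    """Return the update number from a line such as: java version "1.8.0_162"."""
    match = re.search(r'"1\.8\.0_(\d+)', version_line)
    if match is None:
        raise ValueError("not a Java 8 version string: %r" % version_line)
    return int(match.group(1))


def is_java8_update_compatible(version_line):
    """True if the reported Java 8 update is at or below the threshold."""
    return parse_java8_update(version_line) <= MAX_SAFE_UPDATE_FOR_3_11_1


print(is_java8_update_compatible('java version "1.8.0_162"'))
print(is_java8_update_compatible('java version "1.8.0_151"'))
```

The first call uses the version string reported in this thread, the second the update recommended in the reply.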

unable to start cassandra 3.11.1

2018-02-01 Thread Kant Kodali

Hi All,

I am unable to start cassandra 3.11.1. Below is the stack trace.

Exception (java.lang.AbstractMethodError) encountered during startup: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
    at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150)
    at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135)
    at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405)
    at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104)
    at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143)
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188)
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600)
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689)
ERROR 22:33:49 Exception encountered during startup
java.lang.AbstractMethodError: org.apache.cassandra.utils.JMXServerUtils$Exporter.exportObject(Ljava/rmi/Remote;ILjava/rmi/server/RMIClientSocketFactory;Ljava/rmi/server/RMIServerSocketFactory;Lsun/misc/ObjectInputFilter;)Ljava/rmi/Remote;
    at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:150) ~[na:1.8.0_162]
    at javax.management.remote.rmi.RMIJRMPServerImpl.export(RMIJRMPServerImpl.java:135) ~[na:1.8.0_162]
    at javax.management.remote.rmi.RMIConnectorServer.start(RMIConnectorServer.java:405) ~[na:1.8.0_162]
    at org.apache.cassandra.utils.JMXServerUtils.createJMXServer(JMXServerUtils.java:104) ~[apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.service.CassandraDaemon.maybeInitJmx(CassandraDaemon.java:143) [apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.service.CassandraDaemon.setup(CassandraDaemon.java:188) [apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.service.CassandraDaemon.activate(CassandraDaemon.java:600) [apache-cassandra-3.11.1.jar:3.11.1]
    at org.apache.cassandra.service.CassandraDaemon.main(CassandraDaemon.java:689) [apache-cassandra-3.11.1.jar:3.11.1]

Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-28 Thread Kant Kodali

Thanks a lot for that explanation, Jeff! Is there a JIRA ticket that covers
incorporating LWT in the scenarios you mentioned?

Sent from my iPhone

> On Jan 27, 2018, at 1:41 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> 
> Originally we would make tables based on keyspace name / table name pairs, 
> which was fine unless you dropped a table and recreated it, which could 
> happen while one node was offline / split network / gc pause. The recreation 
> scenario could allow data to be resurrected after a drop. 
> 
> So we augmented that (years and years ago) to have a uuid identifier for the 
> table, so now we can differentiate between table creations - if you drop a 
> table and recreate it, the new table has a different id.
> 
> However, if you issue a create table on two instances at the same time, 
> neither thinks the table exists, each generates their own cfid, two ids get 
> created. Schema eventually gets stored inside Cassandra, so last write wins, 
> and the first ID seen gets stomped by the second. The race typically 
> manifests as one instance throwing errors about cfid not found, or a data 
> directory that doesn’t match the cfid in the schema (so a restart creates an 
> empty data directory), or similar situations like that.
> 
> The actual plumbing to use strong consistency (actually do paxos or some 
> other election to make sure exactly one id wins) is planned, likely for 4.0, 
> but doesn’t exist in any released version now
> 
> So again, don’t programmatically create tables if there’s a race possible, it 
> may work fine most of the time, but there’s a risk of ugly failure.
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Jan 27, 2018, at 1:23 PM, Kant Kodali <k...@peernova.com> wrote:
>> 
>> May I know why? 
>> 
>> Sent from my iPhone
>> 
>>> On Jan 27, 2018, at 12:36 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>> 
>>> Yes it causes issues
>>> 
>>> 
>>> -- 
>>> Jeff Jirsa
>>> 
>>> 
>>>> On Jan 27, 2018, at 12:17 PM, Kant Kodali <k...@peernova.com> wrote:
>>>> 
>>>> Schema changes I assume you guys are talking about different create table 
>>>> or alter table statements. What if multiple threads issue same exact 
>>>> create table if not exists statement? Will that cause issues?
>>>> 
>>>> Sent from my iPhone
>>>> 
>>>>> On Jan 27, 2018, at 11:41 AM, Carlos Rolo <r...@pythian.com> wrote:
>>>>> 
>>>>> Don't do that. Worst case you might get different schemas in flight and 
>>>>> no agreement on your cluster.  If you are already doing that, check 
>>>>> "nodetool describecluster" after you do that.
>>>>> 
>>>>> Like Jeff said, it is likely to cause problems.
>>>>> 
>>>>> Regards,
>>>>> 
>>>>> Carlos Juzarte Rolo
>>>>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>>>>  
>>>>> Pythian - Love your data
>>>>> 
>>>>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
>>>>> linkedin.com/in/carlosjuzarterolo 
>>>>> Mobile: +351 918 918 100 
>>>>> www.pythian.com
>>>>> 
>>>>>> On Sat, Jan 27, 2018 at 7:25 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>>>>> It’s not LWT. Don’t do programmatic schema changes that can race, it’s 
>>>>>> likely to cause problems
>>>>>> 
>>>>>> 
>>>>>> --
>>>>>> Jeff Jirsa
>>>>>> 
>>>>>> 
>>>>>> > On Jan 27, 2018, at 10:19 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>>> >
>>>>>> > Hi All,
>>>>>> >
>>>>>> > What happens if multiple processes send create table if not exist 
>>>>>> > statement to cassandra? will there be any data corruption or any other 
>>>>>> > issues if I send "create table if not exist" request often?
>>>>>> >
>>>>>> > I dont see any entry in system.paxos table so is it fair to say "IF 
>>>>>> > NOT EXISTS" doesn't automatically imply LWT?
>>>>>> >
>>>>>> > Thanks!
>>>>>> 
>>>>>> -
>>>>>> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
>>>>>> For additional commands, e-mail: user-h...@cassandra.apache.org
>>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> 
>>>>> 
>>>>> 
>>>>>
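
The race Jeff describes above can be sketched with a toy model in Python. This is not Cassandra code: the Node class, the propagate helper, and the use of uuid4 for table ids are all illustrative assumptions. It only shows how two purely local "does the table exist?" checks can each mint an id, and how last-write-wins then leaves one node with a data directory that no longer matches the schema.

```python
import uuid


class Node:
    """Toy model of one node's local view of the schema (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.schema = {}     # table name -> table id as this node last saw it
        self.data_dirs = {}  # table name -> table id used for the on-disk directory

    def create_table_if_not_exists(self, table):
        # The existence check is purely local; no Paxos round is involved.
        if table in self.schema:
            return None
        table_id = uuid.uuid4()
        self.schema[table] = table_id
        self.data_dirs[table] = table_id
        return table_id


def propagate(nodes, table, table_id):
    # Last write wins: every node's schema ends up pointing at this id.
    for node in nodes:
        node.schema[table] = table_id


a, b = Node("a"), Node("b")
id_a = a.create_table_if_not_exists("hello")
id_b = b.create_table_if_not_exists("hello")  # races with node a
propagate([a, b], "hello", id_a)
propagate([a, b], "hello", id_b)              # arrives last, stomps id_a

for node in (a, b):
    matches = node.data_dirs["hello"] == node.schema["hello"]
    print(node.name, "data directory matches schema:", matches)
```

In this model node a is left with a directory created under an id that the schema no longer references, which corresponds to the "cfid not found" style of failure mentioned above.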

Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Kant Kodali

May I know why? 

Sent from my iPhone

> On Jan 27, 2018, at 12:36 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> 
> Yes it causes issues
> 
> 
> -- 
> Jeff Jirsa
> 
> 
>> On Jan 27, 2018, at 12:17 PM, Kant Kodali <k...@peernova.com> wrote:
>> 
>> Schema changes I assume you guys are talking about different create table or 
>> alter table statements. What if multiple threads issue same exact create 
>> table if not exists statement? Will that cause issues?
>> 
>> Sent from my iPhone
>> 
>>> On Jan 27, 2018, at 11:41 AM, Carlos Rolo <r...@pythian.com> wrote:
>>> 
>>> Don't do that. Worst case you might get different schemas in flight and no 
>>> agreement on your cluster.  If you are already doing that, check "nodetool 
>>> describecluster" after you do that.
>>> 
>>> Like Jeff said, it is likely to cause problems.
>>> 
>>> Regards,
>>> 
>>> Carlos Juzarte Rolo
>>> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>>>  
>>> Pythian - Love your data
>>> 
>>> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
>>> linkedin.com/in/carlosjuzarterolo 
>>> Mobile: +351 918 918 100 
>>> www.pythian.com
>>> 
>>>> On Sat, Jan 27, 2018 at 7:25 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>>>> It’s not LWT. Don’t do programmatic schema changes that can race, it’s 
>>>> likely to cause problems
>>>> 
>>>> 
>>>> --
>>>> Jeff Jirsa
>>>> 
>>>> 
>>>> > On Jan 27, 2018, at 10:19 AM, Kant Kodali <k...@peernova.com> wrote:
>>>> >
>>>> > Hi All,
>>>> >
>>>> > What happens if multiple processes send create table if not exist 
>>>> > statement to cassandra? will there be any data corruption or any other 
>>>> > issues if I send "create table if not exist" request often?
>>>> >
>>>> > I dont see any entry in system.paxos table so is it fair to say "IF NOT 
>>>> > EXISTS" doesn't automatically imply LWT?
>>>> >
>>>> > Thanks!
>>>> 
>>>> 
>>> 
>>> 
>>> --
>>> 
>>> 
>>> 
>>>

Re: What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Kant Kodali

By schema changes, I assume you guys are talking about different create table
or alter table statements. What if multiple threads issue the exact same
"create table if not exists" statement? Will that cause issues?

Sent from my iPhone

> On Jan 27, 2018, at 11:41 AM, Carlos Rolo <r...@pythian.com> wrote:
> 
> Don't do that. Worst case you might get different schemas in flight and no 
> agreement on your cluster.  If you are already doing that, check "nodetool 
> describecluster" after you do that.
> 
> Like Jeff said, it is likely to cause problems.
> 
> Regards,
> 
> Carlos Juzarte Rolo
> Cassandra Consultant / Datastax Certified Architect / Cassandra MVP
>  
> Pythian - Love your data
> 
> rolo@pythian | Twitter: @cjrolo | Skype: cjr2k3 | Linkedin: 
> linkedin.com/in/carlosjuzarterolo 
> Mobile: +351 918 918 100 
> www.pythian.com
> 
>> On Sat, Jan 27, 2018 at 7:25 PM, Jeff Jirsa <jji...@gmail.com> wrote:
>> It’s not LWT. Don’t do programmatic schema changes that can race, it’s 
>> likely to cause problems
>> 
>> 
>> --
>> Jeff Jirsa
>> 
>> 
>> > On Jan 27, 2018, at 10:19 AM, Kant Kodali <k...@peernova.com> wrote:
>> >
>> > Hi All,
>> >
>> > What happens if multiple processes send create table if not exist 
>> > statement to cassandra? will there be any data corruption or any other 
>> > issues if I send "create table if not exist" request often?
>> >
>> > I dont see any entry in system.paxos table so is it fair to say "IF NOT 
>> > EXISTS" doesn't automatically imply LWT?
>> >
>> > Thanks!
>> 
>> 
> 
> 
> --
> 
> 
> 
>

What happens if multiple processes send create table if not exist statement to cassandra?

2018-01-27 Thread Kant Kodali

Hi All,

What happens if multiple processes send a "create table if not exists"
statement to Cassandra? Will there be any data corruption or any other issues
if I send a "create table if not exists" request often?

I don't see any entry in the system.paxos table, so is it fair to say "IF NOT
EXISTS" doesn't automatically imply LWT?

Thanks!

Re: Weird error (unable to start cassandra)

2017-09-11 Thread Kant Kodali

I had to run "brew upgrade jemalloc" to fix this issue.

On Mon, Sep 11, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:

> Hi All,
>
> I am trying to start cassandra 3.11 on Mac OS Sierra 10.12.6. when invoke
> cassandra binary I get the following error
>
> java(2981,0x7fffedb763c0) malloc: *** malloc_zone_unregister() failed for
> 0x7fffedb6c000
>
> I have xcode version 8.3.3 installed (latest). Any clue ?
>
> Thanks!
>

Weird error (unable to start cassandra)

2017-09-11 Thread Kant Kodali

Hi All,

I am trying to start Cassandra 3.11 on macOS Sierra 10.12.6. When I invoke the
cassandra binary, I get the following error:

java(2981,0x7fffedb763c0) malloc: *** malloc_zone_unregister() failed for
0x7fffedb6c000

I have Xcode version 8.3.3 installed (latest). Any clue?

Thanks!

Re: Cassandra & Spark

2017-06-08 Thread Kant Kodali

If you use containers like Docker, Plan A can work provided you do the
resource and capacity planning. I tend to think that Plan B is more
standard and easier, although you may want to wait for a second opinion
from others.

Caution: data locality only makes sense if disk throughput is
significantly higher than network throughput (not everyone has that
scenario).


On Thu, Jun 8, 2017 at 1:25 AM, 한 승호  wrote:

> Hello,
>
>
>
> I am Seung-ho and I work as a Data Engineer in Korea. I need some advice.
>
>
>
> My company has recently started considering replacing an RDBMS-based system
> with Cassandra and Hadoop.
>
> The purpose of this system is to analyze Cassandra and HDFS data with
> Spark.
>
>
>
> It seems many user cases put emphasis on data locality, for instance, both
> Cassandra and Spark executor should be on the same node.
>
>
>
> The thing is, my company's data analyst team wants to analyze
> heterogeneous data source, Cassandra and HDFS, using Spark.
>
> So, I wonder what would be the best practices of using Cassandra and
> Hadoop in such case.
>
>
>
> Plan A: Both HDFS and Cassandra with NodeManager(Spark Executor) on the
> same node
>
>
>
> Plan B: Cassandra + Node Manager / HDFS + NodeManager in each node
> separately but the same cluster
>
>
>
>
>
> Which would be better or correct, or would be a better way?
>
>
>
> I appreciate your advice in advance :)
>
>
>
> Best Regards,
>
> Seung-Ho Han
>
>
>
>
>
> Sent from Mail for Windows 10
>
>
>

Is there a C* summit this year?

2017-05-19 Thread Kant Kodali

Hi All,

I was wondering if there is going to be a C* summit this year. If so, when
can we expect it?

Thanks!

Re: Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Kant Kodali

Thanks a lot guys!

On Tue, May 9, 2017 at 7:32 AM, Alexander Dejanovski <a...@thelastpickle.com> wrote:

> Hi Kant,
>
> Unless you provide the full partition key, I see no way for Cassandra to
> avoid doing a full table scan.
> In order to know on which specific nodes to search (and in which sstables,
> etc.) it needs to have a token. The token is a hash of the whole
> partition key.
> For a specific value of column "a" and different values of column "b" you
> always end up with different tokens that have no guarantee of being stored
> on the same node.
> After that, bloom filters, partition indexes, etc. require the full
> token too, so a full scan is further necessary on each node to get the
> data.
>
> TL;DR : no way to avoid a full cluster scan unless you provide the full
> partition key in your where clause.
>
> Cheers,
>
> On Tue, May 9, 2017 at 4:24 PM Jon Haddad <jonathan.had...@gmail.com>
> wrote:
>
>> Nope, I didn’t comment on that query.   I specifically answered your
>> question about "select * from hello where a='foo' allow filtering;”
>>
>> The query you’ve listed here looks like it would also do a full table
>> scan (again, I don’t see how it would be avoided).
>>
>> I recommend firing up a 3 node cluster using CCM, creating a key space
>> with RF=1, and seeing what it does.
>>
>> On May 9, 2017, at 9:12 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>> Hi,
>>
>> Are you saying The following query select max(b) from hello where a='a1'
>> allow filtering; doesn't result in a table scan? I got the result for
>> this query and yes I just tried tracing it and looks like it is indeed
>> doing a table scan on ReadStage-2 although I am not sure if I am
>> interpreting it right? Finally is there anyway to prevent table scan while
>> providing the partial partition key and get the max b ?
>>
>> 
>> 
>>
>>
>> On Tue, May 9, 2017 at 6:33 AM, Jon Haddad <jonathan.had...@gmail.com>
>> wrote:
>>
>>> I don’t see any way it wouldn’t.  Have you tried tracing it?
>>>
>>> > On May 9, 2017, at 8:32 AM, Kant Kodali <k...@peernova.com> wrote:
>>> >
>>> > Hi All,
>>> >
>>> > It looks like Cassandra 3.10 has partial partition key search but does
>>> it result in a table scan? for example I can have the following
>>> >
>>> > create table hello(
>>> > a text,
>>> > b int,
>>> > c text,
>>> > d text,
>>> > primary key((a,b), c)
>>> > );
>>> >
>>> > Now I can do select * from hello where a='foo' allow filtering;// This
>>> works in 3.10 but I wonder if this query results in table scan and if so is
>>> there any way to limit such that I get max b?
>>> >
>>> > Thanks!
>>>
>>>
>> --
> -
> Alexander Dejanovski
> France
> @alexanderdeja
>
> Consultant
> Apache Cassandra Consulting
> http://www.thelastpickle.com
>
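
To illustrate Alexander's point about tokens, here is a small Python sketch. It is only a model under stated assumptions: MD5 stands in for the real partitioner hash, the three node names are made up, and placement by modulo is a simplification of token ranges. What it does show is that the token is computed from the whole (a, b) pair, so knowing only a='a1' gives no way to compute where any matching row lives.

```python
import hashlib

NODES = ["node1", "node2", "node3"]  # hypothetical cluster


def token(partition_key):
    """Hash the full partition key tuple. MD5 is a stand-in, not Cassandra's partitioner."""
    raw = "\x00".join(str(part) for part in partition_key).encode("utf-8")
    return int.from_bytes(hashlib.md5(raw).digest()[:8], "big")


def owner(partition_key):
    # Simplified placement: real clusters use token ranges, not a modulo.
    return NODES[token(partition_key) % len(NODES)]


# Same value of "a", different values of "b": each pair hashes independently.
for b in range(1, 7):
    print(("a1", b), "->", owner(("a1", b)))
```

Because every (a, b) combination hashes independently, a query that supplies only "a" has to ask every node, which matches the full cluster scan described above.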

Cassandra 3.10 has partial partition key search but does it result in a table scan?

2017-05-09 Thread Kant Kodali

Hi All,

It looks like Cassandra 3.10 supports partial partition key search, but does
it result in a table scan? For example, I can have the following:

create table hello(
a text,
b int,
c text,
d text,
primary key((a,b), c)
);

Now I can do

select * from hello where a='foo' allow filtering;

This works in 3.10, but I wonder whether this query results in a table scan,
and if so, is there any way to limit it so that I get max b?

Thanks!

is static column thread safe or only counter column is thread safe?

2017-05-08 Thread Kant Kodali

Hi All,

Is a static column thread safe, or is only a counter column thread safe?

Thanks!

Re: scylladb

2017-03-12 Thread Kant Kodali

I don't think the ScyllaDB guys started this conversation in the first place
to suggest or promote a "drop-in replacement". It was something brought up by
one of the Cassandra users, and the ScyllaDB guys just clarified it. They
were gracious enough to share the internals in detail.

Honestly, I find it weird when I see questions about whether a topic
belongs on a mailing list or not, especially in this case. If one doesn't
like it, one can simply not follow the thread. I am not sure what the
harm is here.



On Sun, Mar 12, 2017 at 2:29 PM, James Carman <ja...@carmanconsulting.com>
wrote:

> Well, looking back, it appears this thread is from 2015, so apparently
> everyone is okay with it.
>
> Promoting a value-add product that makes using Cassandra easier/more
> efficient/etc would be cool, but coming to the Cassandra mailing list to
> promote a "drop-in replacement" (use us, not Cassandra) isn't cool, IMHO.
>
>
> On Sun, Mar 12, 2017 at 5:04 PM Kant Kodali <k...@peernova.com> wrote:
>
> yes.
>
> On Sun, Mar 12, 2017 at 2:01 PM, James Carman <ja...@carmanconsulting.com>
> wrote:
>
> Does all of this Scylla talk really even belong on the Cassandra user
> mailing list in the first place?
>
>
>
>
> On Sun, Mar 12, 2017 at 4:07 PM Jeff Jirsa <jji...@apache.org> wrote:
>
>
>
> On 2017-03-11 22:33 (-0700), Dor Laor <d...@scylladb.com> wrote:
> > On Sat, Mar 11, 2017 at 10:02 PM, Jeff Jirsa <jji...@gmail.com> wrote:
> > > On 2017-03-10 09:57 (-0800), Rakesh Kumar wrote:
> > > > Cassandra vs Scylla is a valid comparison because they both are
> > > > compatible. Scylla is a drop-in replacement for Cassandra.
> > >
> > > No, they aren't, and no, it isn't
> > >
> >
> > Jeff is angry with us for some reason. I don't know why, it's natural
> that
> > when  a new opponent there are objections and the proof lies on us.
>
> I'm not angry. When I'm angry I send emails with paragraphs of expletives.
> It doesn't happen very often.
>
> This is an open source ASF project, it's not about fighting for market
> share against startups who find it necessary to inflate their level of
> compatibility to sell support contracts, it's about providing software that
> people can use (with a license that makes it easy to use). I don't work for
> a company that makes money selling Cassandra based solutions and you're not
> an opponent.
>
> >
> > Scylla IS a drop in replacement for C*. We support the same CQL (from
> > version 1.7 it's cql 3.3.1, protocol v4), the same SStable format (based
> on
> > 2.1.8).
>
> Scylla doesn't even run on all of the supported operating systems, let
> alone have feature parity or network level compatibility (which you'd
> probably need if you REALLY want to be drop-in 
> stop-one-cassandra-node-swap-binaries-start-it-up
> compatible, which is what your site used to claim, but obviously isn't
> supported). You support a subset of one query language and can read and
> write one sstable format. You do it with great supporting tech and a great
> engineering team, but you're not compatible, and if I were your cofounder
> I'd ask you to focus on the tech strengths and not your drop-in
> compatibility, so engineers who care about facts don't grow to resent your
> public lies.
>
> I've used a lot of databases in my life, but I don't know that I've ever
> had someone call me angry because I pointed out that database A wasn't
> compatible with database B, but I guess I'll chalk it up to 2017 and the
> year of fake news / alternative facts.
>
> Hugs and kisses,
> - Jeff
>
>
>

Re: scylladb

2017-03-12 Thread Kant Kodali

yes.

On Sun, Mar 12, 2017 at 2:01 PM, James Carman 
wrote:

> Does all of this Scylla talk really even belong on the Cassandra user
> mailing list in the first place?
>
>
>
>
> On Sun, Mar 12, 2017 at 4:07 PM Jeff Jirsa  wrote:
>
>
>
> On 2017-03-11 22:33 (-0700), Dor Laor  wrote:
> > On Sat, Mar 11, 2017 at 10:02 PM, Jeff Jirsa  wrote:
> > > On 2017-03-10 09:57 (-0800), Rakesh Kumar wrote:
> > > > Cassandra vs Scylla is a valid comparison because they both are
> > > > compatible. Scylla is a drop-in replacement for Cassandra.
> > >
> > > No, they aren't, and no, it isn't
> > >
> >
> > Jeff is angry with us for some reason. I don't know why, it's natural
> that
> > when  a new opponent there are objections and the proof lies on us.
>
> I'm not angry. When I'm angry I send emails with paragraphs of expletives.
> It doesn't happen very often.
>
> This is an open source ASF project, it's not about fighting for market
> share against startups who find it necessary to inflate their level of
> compatibility to sell support contracts, it's about providing software that
> people can use (with a license that makes it easy to use). I don't work for
> a company that makes money selling Cassandra based solutions and you're not
> an opponent.
>
> >
> > Scylla IS a drop in replacement for C*. We support the same CQL (from
> > version 1.7 it's cql 3.3.1, protocol v4), the same SStable format (based
> on
> > 2.1.8).
>
> Scylla doesn't even run on all of the supported operating systems, let
> alone have feature parity or network level compatibility (which you'd
> probably need if you REALLY want to be drop-in 
> stop-one-cassandra-node-swap-binaries-start-it-up
> compatible, which is what your site used to claim, but obviously isn't
> supported). You support a subset of one query language and can read and
> write one sstable format. You do it with great supporting tech and a great
> engineering team, but you're not compatible, and if I were your cofounder
> I'd ask you to focus on the tech strengths and not your drop-in
> compatibility, so engineers who care about facts don't grow to resent your
> public lies.
>
> I've used a lot of databases in my life, but I don't know that I've ever
> had someone call me angry because I pointed out that database A wasn't
> compatible with database B, but I guess I'll chalk it up to 2017 and the
> year of fake news / alternative facts.
>
> Hugs and kisses,
> - Jeff
>
>

Re: scylladb

2017-03-12 Thread Kant Kodali

One more thing. Pretty much every database that is written in C++ or Java
uses native kernel threads for non-blocking I/O as well. They didn't use
Seastar or Quasar, but anyway I am going to read up on Seastar and see what
it really does.

On Sun, Mar 12, 2017 at 3:48 AM, Kant Kodali <k...@peernova.com> wrote:

>
> If you have thread-per-core and N (logical) cores, and have M tasks
>> running concurrently where M > N, then you need a scheduler to decide which
>> of those M tasks gets to run on those N kernel threads.  Whether those M
>> tasks are user-level threads, or callbacks, or a mix of the two is
>> immaterial.  In such cases a scheduler always exists, even if it is a
>> simple FIFO queue.
>>
>
>
>> yes ofcourse scheduler is needed. But what you said is immaterial is
>> where I see the devil or say our conflict of arguments really are. Let the
>> kernel thread per core deal with callbacks rather than having to build a
>> user-level thread library and its scheduling mechanisms and the mapping
>> between them. This sounds more of an overhead in general but may work in a
>> specific case.
>>
>
>

Re: scylladb

2017-03-12 Thread Kant Kodali

Sorry, I made some typos; here is a better version.

@Avi

"User-level scheduling is great for high performance I/O intensive
applications like databases and file systems." This is generally a claim
made by people who want to use user-level threads, but I have rarely seen
any significant performance gain. Since you are claiming that you do, it
would be great if you could quantify that. The other day I saw a benchmark
of a Golang server, which supports user-level threads/green threads
natively, and it was able to handle 10K concurrent requests. Even Nginx,
which is written in C and uses kernel threads, can handle that many with
non-blocking I/O. We all know concurrency is not parallelism.

One may have to pay for it in any of the following ways.

*Duplication of the schedulers*
M:N requires two schedulers that do basically the same work, one at user
level and one in the kernel. This is undesirable. It requires frequent
communication between kernel and user space to transfer scheduling
information.

Duplication takes more space in both the Dcache and the Icache for
scheduling than a single scheduler does. It is highly undesirable if cache
misses are caused by the schedulers rather than the application, because an
L2 cache miss could be more expensive than a kernel thread switch. Then the
additional scheduler might become a troublemaker! In this case, saving
kernel traps does not justify a user-level scheduler, which is even more
true as processors provide faster and faster kernel trap execution.

*Thread local data maintenance*
M:N has to maintain thread-specific data that the kernel already provides
for kernel threads, such as the TLS data and the error number. Providing
the same feature for user threads is not straightforward because, for
example, the error number is returned on system call failure and is
supported by the kernel. User-level support degrades system performance and
increases system complexity.

*System info oblivious*
The kernel scheduler is close to the underlying platform and architecture,
and it can take advantage of their features. This is difficult for a user
thread library because it is a layer at user level. User threads are
second-order entities in the system. If a kernel thread uses a GDT slot for
TLS data, a user thread perhaps can only use an LDT slot for TLS data. With
more and more support for threading/scheduling available from new
processors (Hyperthreading, NUMA, many-core), the second-order nature
seriously limits the ability of M:N threading.


Re: scylladb

2017-03-12 Thread Kant Kodali

> If you have thread-per-core and N (logical) cores, and have M tasks
> running concurrently where M > N, then you need a scheduler to decide which
> of those M tasks gets to run on those N kernel threads.  Whether those M
> tasks are user-level threads, or callbacks, or a mix of the two is
> immaterial.  In such cases a scheduler always exists, even if it is a
> simple FIFO queue.
>


> Yes, of course a scheduler is needed. But what you called immaterial is
> where I see the devil, or rather where our arguments really conflict. Let
> the kernel thread per core deal with callbacks rather than having to build
> a user-level thread library, its scheduling mechanisms, and the mapping
> between them. This sounds like more overhead in general, but it may work in
> a specific case.
>
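
[Editor's sketch] The "simple FIFO queue" scheduler mentioned above can be illustrated in a few lines. This is only a minimal sketch of the idea, not Seastar's or Scylla's implementation: the names `task` and `run_fifo` are hypothetical, Python generators stand in for user-level tasks, and the single calling thread stands in for one per-core kernel thread.

```python
from collections import deque


def task(name, steps, log):
    """A cooperative task: does one slice of work, then yields control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # hand control back to the scheduler


def run_fifo(tasks):
    """Multiplex many tasks onto the calling thread using a FIFO run queue."""
    queue = deque(tasks)
    while queue:
        current = queue.popleft()
        try:
            next(current)          # run the task until its next yield
        except StopIteration:
            continue               # task finished, drop it
        queue.append(current)      # not finished, go to the back of the queue


log = []
run_fifo([task("a", 2, log), task("b", 1, log), task("c", 2, log)])
print(log)  # ['a:0', 'b:0', 'c:0', 'a:1', 'c:1']
```

Whether the queued items are generators, callbacks, or user-level threads, the scheduling decision (which of the M runnable items goes next on this thread) is the same kind of decision.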

Re: scylladb

2017-03-12 Thread Kant Kodali

@Avi

I don't disagree with the thread-per-core design, and in fact I said it is a
reasonable/good choice. But I am having a hard time seeing how user-level
scheduling can make a significant difference, even in the non-blocking I/O
case. My question really is: if you already have TPC, why do you need
user-level scheduling? And if the answer is to switch between user-level
tasks, then I am simply trying to say "concurrency is not parallelism" (just
because one is able to switch between user-level threads doesn't mean they
are running in parallel underneath). Why not simply schedule those on
kernel threads running on those cores and have a callback mechanism? Why
would one need to deal with user-level scheduling overhead and all the
problems that come with it? To me this just sounds like a difference in
design paradigm that doesn't seem to add much to the performance.

Seastar sounds very similar to Quasar, and I am not seeing great benefits
from it.




On Sun, Mar 12, 2017 at 1:48 AM, Avi Kivity <a...@scylladb.com> wrote:

> We already quantified it, the result is Scylla. Now, Scylla's performance
> is only in part due to the threading model, so I can't give you a number
> that quantifies how much just this aspect of the design is worth.  Removing
> it (or adding it to Cassandra) is a multi-man-year effort that I can't
> justify for this conversation.
>
>
> If you want to continue to use kernel threads for your applications, by all
> means continue to do so.  They're the right choice for all but the most I/O
> intensive applications.  But for these I/O intensive applications
> thread-per-core is the right choice, regardless of the points you raise.
>
>
> I encourage you to study the seastar code base [1] and documentation [2]
> to see how we handled those problems.  I'll also comment a bit below.
>
>
> [1] https://github.com/scylladb/seastar
>
> [2] http://www.seastar-project.org/
>
> On 03/12/2017 11:07 AM, Kant Kodali wrote:
>
> @Avi
>
> "User-level scheduling is great for high performance I/O intensive
> applications like databases and file systems." This is generally a claim
> made by people who want to use user-level threads, but I have rarely seen
> any significant performance gain. Since you are claiming that you do, it
> would be great if you could quantify that. The other day I saw a benchmark of
> a Golang server, which supports user-level threads/green threads natively,
> and it was able to handle 10K concurrent requests. Even Nginx, which is
> written in C and uses kernel threads, can handle that many with
> non-blocking I/O. We all know concurrency is not parallelism.
>
> You may have to pay for something which could be any of the following.
>
> *Duplication of the schedulers*
> M:N requires two schedulers which basically do the same work, one at user
> level and one in the kernel. This is undesirable. It requires frequent
> communication between kernel and user space to transfer scheduling
> information.
>
> Duplication takes more space in both Dcache and Icache for scheduling than
> a single scheduler. It is highly undesirable if cache misses are caused by
> the schedulers rather than the application, because an L2 cache miss could
> be more expensive than a kernel thread switch. Then the additional scheduler
> might become a trouble maker! In this case, saving kernel trappings does not
> justify a user-level scheduler, which is even more true when processors are
> providing faster and faster kernel trapping execution.
>
>
>
> That's not a problem, at least in my experience. The kernel scheduler
> needs to schedule only one thread, and that very infrequently. It is
> completely out of any hot path.
>
>
> *Thread local data maintenance*
> M:N has to maintain thread specific data, which are already provided by
> kernel for kernel thread, such as the TLS data, error number. To provide
> the same feature for user threads is not straightforward, because, for
> example, the error number is returned for system call failure and supported
> by kernel. User-level support degrades system performance and increases
> system complexity.
>
>
> This is also not a problem, we capture error codes in exceptions
> immediately after a system call and so we don't need to rely on TLS for
> errno.
>
>
> *System info oblivious*
> Kernel scheduler is close to underlying platform and architecture. It can
> take advantage of their features. This is difficult for user thread library
> because it's a layer at user level. User threads are second-order entities
> in the system. If a kernel thread uses a GDT slot for TLS data, a user
> thread perhaps can only use an LDT slot for TLS data. With increasingly
> more supports available from the new processors for t

Re: scylladb

2017-03-12 Thread Kant Kodali

On Sun, Mar 12, 2017 at 12:23 AM, Avi Kivity <a...@scylladb.com> wrote:

>
>
> On 03/12/2017 12:19 AM, Kant Kodali wrote:
>
> My response is inline.
>
> On Sat, Mar 11, 2017 at 1:43 PM, Avi Kivity <a...@scylladb.com> wrote:
>
>> There are several issues at play here.
>>
>> First, a database runs a large number of concurrent operations, each of
>> which only consumes a small amount of CPU. The high concurrency is needed to
>> hide latency: disk latency, or the latency of contacting a remote node.
>>
>
> *Ok so you are talking about hiding I/O latency.  If all these I/O are
> non-blocking system calls then a thread per core and callback mechanism
> should suffice isn't it?*
>
>
>
> Scylla uses a mix of user-level threads and callbacks. Most of the code
> uses callbacks (fronted by a future/promise API). SSTable writers
> (memtable flush, compaction) use a user-level thread (internally
> implemented using callbacks).  The important bit is multiplexing many
> concurrent operations onto a single kernel thread.
>
>
> This means that the scheduler will need to switch contexts very often. A
>> kernel thread scheduler knows very little about the application, so it has
>> to switch a lot of context.  A user level scheduler is tightly bound to the
>> application, so it can perform the switching faster.
>>
>
> *Sure, but this applies in the other direction as well. A user-level scheduler
> has no idea about the kernel-level scheduler either. There is literally no
> coordination between the kernel-level scheduler and the user-level scheduler
> in Linux or any major OS. It may be possible with OSes that support scheduler
> activations (LWPs) and an upcall mechanism. *
>
>
> There is no need for coordination, because the kernel scheduler has no
> scheduling decisions to make.  With one thread per core, bound to its core,
> the kernel scheduler can't make the wrong decision because it has just one
> choice.
>
>
> *Even then it is hard to say if it is all worth it (the research shows
> performance may not outweigh the complexity). Golang's problem is exactly
> this: if one creates 1000 goroutines/green threads, each of them
> making a blocking system call, then it would create 1000 kernel threads
> underneath because it has no way to know that the kernel thread is blocked
> (no upcall). *
>
>
> All of the significant system calls we issue are through the main thread,
> either asynchronous or non-blocking.
>
> *And in the non-blocking case I still don't see a significant performance
> gain when compared to a few kernel threads with a callback mechanism.*
>
>
> We do.
>
>
> *  If you are saying user-level scheduling is the future (perhaps I would
> just let the researchers argue about it), as of today that is not the case,
> or else languages would have had it natively instead of relying on
> third-party frameworks or libraries. *
>
>
> User-level scheduling is great for high performance I/O intensive
> applications like databases and file systems.  It's not a general solution,
> and it involves a lot of effort to set up the infrastructure. However, for
> our use case, it was worth it.
>

*Even with I/O intensive applications it is very much debatable. The
numbers I have seen aren't convincing at all. *

>
>
>
>
>> There are also implications on the concurrency primitives in use (locks
>> etc.) -- they will be much faster for the user-level scheduler, because
>> they cooperate with the scheduler.  For example, no atomic
>> read-modify-write instructions need to be executed.
>>
>
>
>  Second, how many (kernel) threads should you run?
> * This is a question one will always have. If there are 10K user-level
> threads that map to only one kernel thread then they cannot exploit
> parallelism. So there is no right answer, but a thread per core is a
> reasonable/good choice. *
>
>
> Only if you can multiplex many operations on top of each of those
> threads.  Otherwise, the CPUs end up underutilized.
>

*Yes, that's exactly my point regarding your question "how many (kernel)
threads should you run?", so I will repeat myself here. This is a question
one will always have: even those who prefer user-level thread scheduling
still need to know how many kernel threads to map to, so they end up with
the same question of how many kernel threads to create. If there are 10K
user-level threads that map to only one kernel thread then they cannot
exploit parallelism. So there is no right answer, but a thread per core is a
reasonable/good choice. *


>
>
>
>
>> If you run too few threads, then you will not be able to saturate the CPU
>> resources.  This is a common problem with Cassandra -- it'

Re: scylladb

2017-03-12 Thread Kant Kodali

@Avi

"User-level scheduling is great for high performance I/O intensive
applications like databases and file systems." This is generally a claim
made by people who want to use user-level threads, but I have rarely seen
any significant performance gain. Since you are claiming that you do, it
would be great if you could quantify that. The other day I saw a benchmark of
a Golang server, which supports user-level threads/green threads natively,
and it was able to handle 10K concurrent requests. Even Nginx, which is
written in C and uses kernel threads, can handle that many with
non-blocking I/O. We all know concurrency is not parallelism.

You may have to pay for something which could be any of the following.

*Duplication of the schedulers*
M:N requires two schedulers which basically do the same work, one at user
level and one in the kernel. This is undesirable. It requires frequent
communication between kernel and user space to transfer scheduling
information.

Duplication takes more space in both Dcache and Icache for scheduling than
a single scheduler. It is highly undesirable if cache misses are caused by
the schedulers rather than the application, because an L2 cache miss could
be more expensive than a kernel thread switch. Then the additional scheduler
might become a trouble maker! In this case, saving kernel trappings does not
justify a user-level scheduler, which is even more true when processors are
providing faster and faster kernel trapping execution.

*Thread local data maintenance*
M:N has to maintain thread specific data, which are already provided by
kernel for kernel thread, such as the TLS data, error number. To provide
the same feature for user threads is not straightforward, because, for
example, the error number is returned for system call failure and supported
by kernel. User-level support degrades system performance and increases
system complexity.

*System info oblivious*
Kernel scheduler is close to underlying platform and architecture. It can
take advantage of their features. This is difficult for user thread library
because it's a layer at user level. User threads are second-order entities
in the system. If a kernel thread uses a GDT slot for TLS data, a user
thread perhaps can only use an LDT slot for TLS data. With increasingly
more supports available from the new processors for threading/scheduling
(Hyperthreading, NUMA, many-core), the second order nature seriously limits
the ability of M:N threading.

On Sun, Mar 12, 2017 at 1:05 AM, Avi Kivity <a...@scylladb.com> wrote:

> btw, for an example of how user-level tasks can be scheduled in a way that
> cannot be done with kernel threads, see this pair of blog posts:
>
>
>   http://www.scylladb.com/2016/04/14/io-scheduler-1/
>
>   http://www.scylladb.com/2016/04/29/io-scheduler-2/
>
>
> There's simply no way to get this kind of control when you rely on the
> kernel for scheduling and page cache management.  As a result you have to
> overprovision your node and then you mostly underutilize it.
>
> On 03/12/2017 10:23 AM, Avi Kivity wrote:
>
>
>
> On 03/12/2017 12:19 AM, Kant Kodali wrote:
>
> My response is inline.
>
> On Sat, Mar 11, 2017 at 1:43 PM, Avi Kivity <a...@scylladb.com> wrote:
>
>> There are several issues at play here.
>>
>> First, a database runs a large number of concurrent operations, each of
>> which only consumes a small amount of CPU. The high concurrency is needed to
>> hide latency: disk latency, or the latency of contacting a remote node.
>>
>
> *Ok so you are talking about hiding I/O latency.  If all these I/O are
> non-blocking system calls then a thread per core and callback mechanism
> should suffice isn't it?*
>
>
>
> Scylla uses a mix of user-level threads and callbacks. Most of the code
> uses callbacks (fronted by a future/promise API). SSTable writers
> (memtable flush, compaction) use a user-level thread (internally
> implemented using callbacks).  The important bit is multiplexing many
> concurrent operations onto a single kernel thread.
>
>
> This means that the scheduler will need to switch contexts very often. A
>> kernel thread scheduler knows very little about the application, so it has
>> to switch a lot of context.  A user level scheduler is tightly bound to the
>> application, so it can perform the switching faster.
>>
>
> *Sure, but this applies in the other direction as well. A user-level scheduler
> has no idea about the kernel-level scheduler either. There is literally no
> coordination between the kernel-level scheduler and the user-level scheduler
> in Linux or any major OS. It may be possible with OSes that support scheduler
> activations (LWPs) and an upcall mechanism. *
>
>
> There is no need for coordination, because the kernel scheduler has no
> scheduling decisions to make.  With one thre

Re: scylladb

2017-03-11 Thread Kant Kodali

My response is inline.

On Sat, Mar 11, 2017 at 1:43 PM, Avi Kivity <a...@scylladb.com> wrote:

> There are several issues at play here.
>
> First, a database runs a large number of concurrent operations, each of
> which only consumes a small amount of CPU. The high concurrency is needed to
> hide latency: disk latency, or the latency of contacting a remote node.
>

*Ok so you are talking about hiding I/O latency.  If all these I/O are
non-blocking system calls then a thread per core and callback mechanism
should suffice isn't it?*
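
[Editor's sketch] The arrangement being asked about here, many non-blocking operations driven by callbacks/futures on one kernel thread, can be sketched with Python's standard `asyncio` event loop. This is an illustration under stated assumptions, not Scylla or Seastar code: `fake_io` is a hypothetical stand-in for a non-blocking disk or network call, and the 10 ms sleep is an arbitrary placeholder for I/O latency.

```python
import asyncio
import threading


async def fake_io(op_id, thread_ids):
    """Hypothetical operation: waits on 'I/O' without blocking the thread."""
    await asyncio.sleep(0.01)                  # stands in for disk/network latency
    thread_ids.add(threading.get_ident())      # record which kernel thread ran us
    return op_id


async def main(n):
    thread_ids = set()
    # Launch n operations concurrently; gather() returns results in call order.
    results = await asyncio.gather(*(fake_io(i, thread_ids) for i in range(n)))
    return results, thread_ids


results, thread_ids = asyncio.run(main(1000))
print(len(results), "operations completed on", len(thread_ids), "kernel thread(s)")
```

All 1000 operations overlap their waits while running on a single kernel thread, which is concurrency without parallelism. Parallelism would come from running one such loop per core.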

> This means that the scheduler will need to switch contexts very often. A
> kernel thread scheduler knows very little about the application, so it has
> to switch a lot of context.  A user level scheduler is tightly bound to the
> application, so it can perform the switching faster.
>

*Sure, but this applies in the other direction as well. A user-level scheduler
has no idea about the kernel-level scheduler either. There is literally no
coordination between the kernel-level scheduler and the user-level scheduler
in Linux or any major OS. It may be possible with OSes that support scheduler
activations (LWPs) and an upcall mechanism. Even then it is hard to say if it
is all worth it (the research shows performance may not outweigh the
complexity). Golang's problem is exactly this: if one creates 1000
goroutines/green threads, each of them making a blocking system call,
then it would create 1000 kernel threads underneath because it has no way
to know that the kernel thread is blocked (no upcall). And in the non-blocking
case I still don't see a significant performance gain when compared to a few
kernel threads with a callback mechanism. If you are saying user-level
scheduling is the future (perhaps I would just let the researchers argue
about it), as of today that is not the case, or else languages would have had
it natively instead of relying on third-party frameworks or libraries. *

> There are also implications on the concurrency primitives in use (locks
> etc.) -- they will be much faster for the user-level scheduler, because
> they cooperate with the scheduler.  For example, no atomic
> read-modify-write instructions need to be executed.
>

> Second, how many (kernel) threads should you run?

*This is a question one will always have. If there are 10K user-level threads
that map to only one kernel thread then they cannot exploit parallelism. So
there is no right answer, but a thread per core is a reasonable/good choice. *
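
[Editor's sketch] As a rough illustration of the "one thread per core" choice, the snippet below sizes a set of shards to the core count and routes each key to exactly one shard, so a shard's data would only ever be touched by its own thread. This is a hypothetical sketch: `shard_for`, `put`, and `get` are made-up names, CRC32 is an arbitrary hash chosen for determinism, and nothing here is claimed to match how Scylla or Cassandra actually partitions data.

```python
import os
import zlib

# One shard per logical core; fall back to 1 if the count is unavailable.
n_shards = os.cpu_count() or 1
shards = [dict() for _ in range(n_shards)]


def shard_for(key: bytes, n: int) -> int:
    """Deterministically map a key to one of n shards."""
    return zlib.crc32(key) % n


def put(key: bytes, value) -> None:
    shards[shard_for(key, n_shards)][key] = value


def get(key: bytes):
    return shards[shard_for(key, n_shards)].get(key)


put(b"user:42", "kant")
print(get(b"user:42"), "lives on shard", shard_for(b"user:42", n_shards))
```

With this layout the number of kernel threads is fixed by the hardware, and the open question becomes how many operations each of those threads multiplexes.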

> If you run too few threads, then you will not be able to saturate the CPU
> resources.  This is a common problem with Cassandra -- it's very hard to
> get it to consume all of the CPU power on even a moderately large machine.
> On the other hand, if you have too many threads, you will see latency rise
> very quickly, because kernel scheduling granularity is on the order of
> milliseconds.  User-level scheduling, because it leaves control in the hand
> of the application, allows you to both saturate the CPU and maintain low
> latency.
>

*For my workload, and probably others I have seen, Cassandra has always
been CPU bound.*

>
> There are other factors, like NUMA-friendliness, but in the end it all
> boils down to efficiency and control.
>
> None of this is new btw, it's pretty common in the storage world.
>
> Avi
>
>
> On 03/11/2017 11:18 PM, Kant Kodali wrote:
>
> Here is the Java version http://docs.paralleluniverse.co/quasar/ but I
> still don't see how user-level scheduling can be beneficial (this is a
> well-debated problem). How can this add to the performance? Or rather, why
> is user-level scheduling necessary given the thread-per-core design and the
> callback mechanism?
>
> On Sat, Mar 11, 2017 at 12:51 PM, Avi Kivity <a...@scylladb.com> wrote:
>
>> Scylla uses the Seastar framework, which provides for both user-level
>> thread scheduling and simple run-to-completion tasks.
>>
>> Huge pages are limited to 2MB (and 1GB, but these aren't available as
>> transparent hugepages).
>>
>>
>> On 03/11/2017 10:26 PM, Kant Kodali wrote:
>>
>> @Dor
>>
>> 1) You guys have a CPU scheduler? You mean a user-level thread scheduler
>> that maps user-level threads to kernel-level threads? I thought C++ by
>> default creates native kernel threads, but sure, nothing will stop someone
>> from creating a user-level scheduling library, if that's what you are
>> talking about.
>> 2) How can one create THP of size 1KB? According to this post
>> <https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-transhuge.html>
>> it looks like the valid values are 2MB and 1GB.
>>
>> Thanks,
>> kant
>>
>> On Sat, Mar 11, 2017 at 11:41 AM, Avi Kivity <a...@scylladb.com> wrote:
>

Re: scylladb

2017-03-11 Thread Kant Kodali

Here is the Java version http://docs.paralleluniverse.co/quasar/ but I
still don't see how user-level scheduling can be beneficial (this is a
well-debated problem). How can this add to the performance? Or rather, why
is user-level scheduling necessary given the thread-per-core design and the
callback mechanism?

On Sat, Mar 11, 2017 at 12:51 PM, Avi Kivity <a...@scylladb.com> wrote:

> Scylla uses the Seastar framework, which provides for both user-level
> thread scheduling and simple run-to-completion tasks.
>
> Huge pages are limited to 2MB (and 1GB, but these aren't available as
> transparent hugepages).
>
>
> On 03/11/2017 10:26 PM, Kant Kodali wrote:
>
> @Dor
>
> 1) You guys have a CPU scheduler? You mean a user-level thread scheduler
> that maps user-level threads to kernel-level threads? I thought C++ by
> default creates native kernel threads, but sure, nothing will stop someone
> from creating a user-level scheduling library, if that's what you are
> talking about.
> 2) How can one create THP of size 1KB? According to this post
> <https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-transhuge.html>
> it looks like the valid values are 2MB and 1GB.
>
> Thanks,
> kant
>
> On Sat, Mar 11, 2017 at 11:41 AM, Avi Kivity <a...@scylladb.com> wrote:
>
>> Agreed, I'd recommend to treat benchmarks as a rough guide to see where
>> there is potential, and follow through with your own tests.
>>
>> On 03/11/2017 09:37 PM, Edward Capriolo wrote:
>>
>>
>> Benchmarks are great for FUDly blog posts. Real world work loads matter
>> more. Every NoSQL vendor wins their benchmarks.
>>
>>
>>
>
>
>
>

Re: scylladb

2017-03-11 Thread Kant Kodali

@Dor

1) You guys have a CPU scheduler? You mean a user-level thread scheduler that
maps user-level threads to kernel-level threads? I thought C++ by default
creates native kernel threads, but sure, nothing will stop someone from
creating a user-level scheduling library, if that's what you are talking about.
2) How can one create THP of size 1KB? According to this post
<https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Performance_Tuning_Guide/s-memory-transhuge.html>
it looks like the valid values are 2MB and 1GB.

Thanks,
kant

On Sat, Mar 11, 2017 at 11:41 AM, Avi Kivity  wrote:

> Agreed, I'd recommend to treat benchmarks as a rough guide to see where
> there is potential, and follow through with your own tests.
>
> On 03/11/2017 09:37 PM, Edward Capriolo wrote:
>
>
> Benchmarks are great for FUDly blog posts. Real world work loads matter
> more. Every NoSQL vendor wins their benchmarks.
>
>
>

Re: scylladb

2017-03-10 Thread Kant Kodali

>> with the configs used. Having plain ops per second and 99p latency is
>> blackbox.
>>
>>
>>
>> Regards,
>>
>> Bhuvan
>>
>>
>>
>> On Fri, Mar 10, 2017 at 12:47 PM, Avi Kivity <a...@scylladb.com> wrote:
>>
>> ScyllaDB engineer here.
>>
>> C++ is really an enabling technology here. It is directly responsible for
>> a small fraction of the gain by executing faster than Java.  But it is
>> indirectly responsible for the gain by allowing us direct control over
>> memory and threading.  Just as an example, Scylla starts by taking over
>> almost all of the machine's memory, and dynamically assigning it to
>> memtables, cache, and working memory needed to handle requests in flight.
>> Memory is statically partitioned across cores, allowing us to exploit NUMA
>> fully.  You can't do these things in Java.
>>
>> I would say the major contributors to Scylla performance are:
>>  - thread-per-core design
>>  - replacement of the page cache with a row cache
>>  - careful attention to many small details, each contributing a little,
>> but with a large overall impact
>>
>> While I'm here I can say that performance is not the only goal here, it
>> is stable and predictable performance over varying loads and during
>> maintenance operations like repair, without any special tuning.  We measure
>> the amount of CPU and I/O spent on foreground (user) and background
>> (maintenance) tasks and divide them fairly.  This work is not complete but
>> already makes operating Scylla a lot simpler.
>>
>>
>>
>> On 03/10/2017 01:42 AM, Kant Kodali wrote:
>>
>> I don't think ScyllaDB's performance is because of C++. The design decisions
>> in ScyllaDB are indeed different from Cassandra's, such as getting rid of
>> SEDA and moving to TPC and so on.
>>
>>
>>
>> If someone thinks it is because of C++, then just show the benchmarks that
>> prove it is indeed C++ which gave the 10X performance boost ScyllaDB
>> claims, instead of just stating it.
>>
>>
>>
>>
>>
>> On Thu, Mar 9, 2017 at 3:22 PM, Richard L. Burton III <mrbur...@gmail.com>
>> wrote:
>>
>> They spend an enormous amount of time focusing on performance. You can
>> expect them to continue on with their optimization and keep crushing it.
>>
>>
>>
>> P.S., I don't work for ScyllaDB.
>>
>>
>>
>> On Thu, Mar 9, 2017 at 6:02 PM, Rakesh Kumar <rakeshkumar...@outlook.com>
>> wrote:
>>
>> In all of their presentation they keep harping on the fact that scylladb
>> is written in C++ and does not carry the overhead of Java.  Still the
>> difference looks staggering.
>>
>> From: daemeon reiydelle <daeme...@gmail.com>
>> Sent: Thursday, March 9, 2017 14:21
>> To: user@cassandra.apache.org
>> Subject: Re: scylladb
>>
>> The comparison is fair, and conservative. Did substantial performance
>> comparisons for two clients, both results returned throughputs that were
>> faster than the published comparisons (15x as I recall). At that time the
>> client preferred to utilize a Cass COTS solution and use a caching solution
>> for OLA compliance.
>>
>>
>> ...
>>
>> Daemeon C.M. Reiydelle
>> USA (+1) 415.501.0198
>> London (+44) (0) 20 8144 9872
>>
>> On Thu, Mar 9, 2017 at 11:04 AM, Robin Verlangen <ro...@us2.nl> wrote:
>> I was wondering how people feel about the comparison that's made here
>> between Cassandra and ScyllaDB:
>> <http://www.scylladb.com/technology/ycsb-cassandra-scylla/#results-of-3-scylla-nodes-vs-30-cassandra-nodes>
>>
>> They are claiming a 10x improvement, is that a fair comparison or maybe a
>> somewhat coloured view of a (micro)benchmark in a specific setup? Any
>> pros/cons known?
>>
>> Best regards,
>>
>> Robin Verlangen
>> Chief Data Architect
>>
>> Disclaimer: The information contained in this message and attachments is
>> intended solely for the attention and use of the named addressee and may be
>> confidential. If you are not the intended recipient, you are reminded that
>> the information remains the property of the sender. You must not use,
>> disclose, distribute, copy, print or rely on this e-mail. If y

Re: scylladb

2017-03-09 Thread Kant Kodali

I don't think ScyllaDB's performance is because of C++. The design decisions
in ScyllaDB are indeed different from Cassandra's, such as getting rid of
SEDA and moving to TPC and so on.

If someone thinks it is because of C++, then just show the benchmarks that
prove it is indeed C++ which gave the 10X performance boost ScyllaDB
claims, instead of just stating it.


On Thu, Mar 9, 2017 at 3:22 PM, Richard L. Burton III 
wrote:

> They spend an enormous amount of time focusing on performance. You can
> expect them to continue on with their optimization and keep crushing it.
>
> P.S., I don't work for ScyllaDB.
>
> On Thu, Mar 9, 2017 at 6:02 PM, Rakesh Kumar 
> wrote:
>
>> In all of their presentation they keep harping on the fact that scylladb
>> is written in C++ and does not carry the overhead of Java.  Still the
>> difference looks staggering.
>> 
>> From: daemeon reiydelle 
>> Sent: Thursday, March 9, 2017 14:21
>> To: user@cassandra.apache.org
>> Subject: Re: scylladb
>>
>> The comparison is fair, and conservative. Did substantial performance
>> comparisons for two clients, both results returned throughputs that were
>> faster than the published comparisons (15x as I recall). At that time the
>> client preferred to utilize a Cass COTS solution and use a caching solution
>> for OLA compliance.
>>
>>
>> ...
>>
>> Daemeon C.M. Reiydelle
>> USA (+1) 415.501.0198
>> London (+44) (0) 20 8144 9872
>>
>> On Thu, Mar 9, 2017 at 11:04 AM, Robin Verlangen <ro...@us2.nl> wrote:
>> I was wondering how people feel about the comparison that's made here
>> between Cassandra and ScyllaDB:
>> <http://www.scylladb.com/technology/ycsb-cassandra-scylla/#results-of-3-scylla-nodes-vs-30-cassandra-nodes>
>>
>> They are claiming a 10x improvement, is that a fair comparison or maybe a
>> somewhat coloured view of a (micro)benchmark in a specific setup? Any
>> pros/cons known?
>>
>> Best regards,
>>
>> Robin Verlangen
>> Chief Data Architect
>>
>> Disclaimer: The information contained in this message and attachments is
>> intended solely for the attention and use of the named addressee and may be
>> confidential. If you are not the intended recipient, you are reminded that
>> the information remains the property of the sender. You must not use,
>> disclose, distribute, copy, print or rely on this e-mail. If you have
>> received this message in error, please contact the sender immediately and
>> irrevocably delete this message and any copies.
>>
>> On Wed, Dec 16, 2015 at 11:52 AM, Carlos Rolo <r...@pythian.com> wrote:
>> No rain at all! But I almost had it running last weekend, but stopped
>> short of installing it. Let's see if this one is for real!
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Linkedin:
>> linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
>> www.pythian.com
>>
>> On Wed, Dec 16, 2015 at 12:38 AM, Dani Traphagen <
>> dani.trapha...@datastax.com> wrote:
>> You'll be the first Carlos.
>>
>> [Inline image 1]
>>
>> Had any rain lately? Curious how this went, if so.
>>
>> On Thu, Nov 12, 2015 at 4:36 AM, Jack Krupansky wrote:
>> I just did a Twitter search on scylladb and did not see any tweets about
>> actual use, so far.
>>
>>
>> -- Jack Krupansky
>>
>> On Wed, Nov 11, 2015 at 10:54 AM, Carlos Alonso wrote:
>> Any update about this?
>>
>> @Carlos Rolo, did you try it? Thoughts?
>>
>> Carlos Alonso | Software Engineer | @calonso
>>
>> On 5 November 2015 at 14:07, Carlos Rolo wrote:
>> Something to do on an expected rainy weekend. Thanks for the information.
>>
>> Regards,
>>
>> Carlos Juzarte Rolo
>> Cassandra Consultant
>>
>> Pythian - Love your data
>>
>> rolo@pythian | Twitter: @cjrolo | Linkedin:
>> linkedin.com/in/carlosjuzarterolo
>> Mobile: +351 91 891 81 00 | Tel: +1 613 565 8696 x1649
>> www.pythian.com
>>
>> On Thu, Nov 5, 2015 at 12:07 PM, Dani Traphagen <
>> dani.trapha...@datastax.com> wrote:
>> As of two days ago, they say they've got it @cjrolo.
>>
>> https://github.com/scylladb/scylla/wiki/RELEASE-Scylla-0.11-Beta
>>
>>
>> On Thursday, November 5, 2015, Carlos Rolo wrote:
>> I will not try until multi-DC is implemented. More than a month has
>> passed since

Re: Attached profiled data but need help understanding it

2017-03-06 Thread Kant Kodali

Hi Romain,

We may be able to achieve what we need without LWT, but that would require
a bunch of changes on the application side, possibly introducing caching
layers and designing a solution around them. But for now, we are constrained
to use LWTs for another month or so. All said, I would still like to see
the discouraged features such as LWTs, secondary indexes, and triggers get
better over time, since that would really benefit users.

Agreed, high park/unpark is a sign of excessive context switching, but any
ideas why this is happening? Yes, today we will be experimenting with
c3.2xlarge to see what the numbers look like, then slowly scale up from
there.

How do I make sure the ixgbevf driver is installed? Don't m4.xlarge or
c3.2xlarge already have it? When I googled "ixgbevf driver" it told me it is
an Ethernet driver... I thought all instances run on Ethernet by default on
AWS. Can you please give more context on this?

Thanks,
kant

On Fri, Mar 3, 2017 at 4:42 AM, Romain Hardouin  wrote:

> Also, I should have mentioned that it would be a good idea to spawn your
> three benchmark instances in the same AZ, then try with one instance on
> each AZ to see how network latency affects your LWT rate. The lower latency
> is achievable with three instances on the same placement group of course
> but it's kinda dangerous for production.
>
>
>

Re: Attached profiled data but need help understanding it

2017-02-28 Thread Kant Kodali

Hi Romain,

I am using Cassandra version 3.0.9 and here is the generated report
<http://fastthread.io/my-thread-report.jsp?p=c2hhcmVkLzIwMTcvMDMvMS8tLWpzdGFja19kdW1wLm91dC0tMi0yNC00OA==>
(graphical view) of my thread dump as well! Just sending this over in case
it helps.

Thanks,
kant

On Tue, Feb 28, 2017 at 7:51 PM, Kant Kodali <k...@peernova.com> wrote:

> Hi Romain,
>
> Thanks again. My response are inline.
>
> kant
>
> On Tue, Feb 28, 2017 at 10:04 AM, Romain Hardouin <romainh...@yahoo.fr>
> wrote:
>
>> > we are currently using 3.0.9.  should we use 3.8 or 3.10
>>
>> No, don't use 3.X in production unless you really need a major feature.
>> I would advise to stick to 3.0.X (i.e. 3.0.11 now).
>> You can backport CASSANDRA-11966 easily but of course you have to deploy
>> from source as a prerequisite.
>>
>
>   * By backporting you mean I should cherry pick CASSANDRA-11966 commit
> and compile from source?*
>
>>
>> > I haven't done any tuning yet.
>>
>> So it's a good news because maybe there is room for improvement
>>
>> > Can I change this on a running instance? If so, how? or does it require
>> a downtime?
>>
>> You can throttle compaction at runtime with "nodetool
>> setcompactionthroughput". Be sure to read all nodetool commmands, some of
>> them are really useful for a day to day tuning/management.
>>
>> If GC is fine, then check other things -> "[...] different pool sizes for
>> NTR, concurrent reads and writes, compaction executors, etc. Also check if
>> you can improve network latency (e.g. VF or ENA on AWS)."
>>
>> Regarding thread pools, some of them can be resized at runtime via JMX.
>>
>> > 5000 is the target.
>>
>> Right now you reached 1500. Is it per node or for the cluster?
>> We don't know your setup so it's hard to say it's doable. Can you provide
>> more details? VM, physical nodes, #nodes, etc.
>> Generally speaking LWT should be seldom used. AFAIK you won't achieve
>> 10,000 writes/s per node.
>>
>> Maybe someone on the list already made some tuning for heavy LWT workload?
>>
>
> *1500 total cluster.  *
>
> *I have a 8 node cassandra cluster. Each node is AWS m4.xlarge
> instance (so 4 vCPU, 16GB, 1Gbit network=125MB/s)*
>
>
>
> *I have 1 node (m4.xlarge) for my application which just inserts a
> bunch of data and each insert is an LWT I tested the network throughput
> of the node.  I can get up 98 MB/s.*
>
> *Now, when I start my application. I see that Cassandra nodes Receive
> rate/ throughput is about 4MB/s (yes it is in Mega Bytes. I checked this by
> running sudo iftop -B). The Disk I/O is also same and the Cassandra process
> CPU usage is about 360% (the max is 400% since it is a 4 core machine). The
> application node transmission throughput is about 6MB/s. so even with 4MB/s
> receive throughput at Cassandra node the CPU is almost maxed out. I am not
> sure what this says about Cassandra? But, what I can tell is that Network
> is way underutilized and that 8 nodes are unnecessary so we plan to bring
> it down to 4 nodes except each node this time will have 8 cores. All said,
> I am still not sure how to scale up from 1500 writes/sec? *
>
>
>>
>> Best,
>>
>> Romain
>>
>>
>

Re: Attached profiled data but need help understanding it

2017-02-28 Thread Kant Kodali

Hi Romain,

Thanks again. My responses are inline.

kant

On Tue, Feb 28, 2017 at 10:04 AM, Romain Hardouin 
wrote:

> > we are currently using 3.0.9.  should we use 3.8 or 3.10
>
> No, don't use 3.X in production unless you really need a major feature.
> I would advise to stick to 3.0.X (i.e. 3.0.11 now).
> You can backport CASSANDRA-11966 easily but of course you have to deploy
> from source as a prerequisite.
>

*By backporting, do you mean I should cherry-pick the CASSANDRA-11966 commit
and compile from source?*

>
> > I haven't done any tuning yet.
>
> So it's good news, because maybe there is room for improvement.
>
> > Can I change this on a running instance? If so, how? or does it require
> a downtime?
>
> You can throttle compaction at runtime with "nodetool
> setcompactionthroughput". Be sure to read up on all the nodetool commands;
> some of them are really useful for day-to-day tuning/management.
>
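
A minimal sketch of scripting that nodetool call, in case it helps anyone
reading along. It assumes nodetool is on the PATH of the node being tuned; the
helper names and the 16 MB/s value are made up for illustration. The change
applies to the running node, so no restart should be needed.

```python
import shutil
import subprocess


def compaction_throughput_cmd(mb_per_sec):
    """Build the nodetool command line that sets the compaction throttle.

    The value is in MB/s; 0 disables throttling.
    """
    if mb_per_sec < 0:
        raise ValueError("throughput must be >= 0")
    return ["nodetool", "setcompactionthroughput", str(mb_per_sec)]


def apply_compaction_throughput(mb_per_sec):
    """Run the command against the local node (hypothetical helper)."""
    if shutil.which("nodetool") is None:
        raise RuntimeError("nodetool not found on PATH")
    subprocess.run(compaction_throughput_cmd(mb_per_sec), check=True)


if __name__ == "__main__":
    # Only prints the command; call apply_compaction_throughput() to run it.
    print(" ".join(compaction_throughput_cmd(16)))
```

"nodetool getcompactionthroughput" reports the current value, which is worth
noting down before changing anything.
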
> If GC is fine, then check other things -> "[...] different pool sizes for
> NTR, concurrent reads and writes, compaction executors, etc. Also check if
> you can improve network latency (e.g. VF or ENA on AWS)."
>
> Regarding thread pools, some of them can be resized at runtime via JMX.
>
> > 5000 is the target.
>
> Right now you reached 1500. Is it per node or for the cluster?
> We don't know your setup so it's hard to say it's doable. Can you provide
> more details? VM, physical nodes, #nodes, etc.
> Generally speaking LWT should be seldom used. AFAIK you won't achieve
> 10,000 writes/s per node.
>
> Maybe someone on the list already made some tuning for heavy LWT workload?
>

*1500 total for the cluster.*

*I have an 8-node Cassandra cluster. Each node is an AWS m4.xlarge instance
(so 4 vCPU, 16GB, 1Gbit network = 125MB/s).*

*I have 1 node (m4.xlarge) for my application, which just inserts a
bunch of data, and each insert is an LWT. I tested the network throughput
of the node; I can get up to 98 MB/s.*

*Now, when I start my application, I see that the Cassandra nodes' receive
rate/throughput is about 4MB/s (yes, it is in megabytes; I checked this by
running sudo iftop -B). The disk I/O is about the same, and the Cassandra
process CPU usage is about 360% (the max is 400% since it is a 4-core
machine). The application node's transmit throughput is about 6MB/s. So even
with 4MB/s receive throughput at a Cassandra node, the CPU is almost maxed
out. I am not sure what this says about Cassandra. What I can tell is that
the network is way underutilized and that 8 nodes are unnecessary, so we plan
to bring it down to 4 nodes, except each node this time will have 8 cores.
All said, I am still not sure how to scale up from 1500 writes/sec.*
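
For what it's worth, the figures above can be put side by side with a few
lines of arithmetic. This is only a back-of-the-envelope sketch: it takes the
numbers quoted in this thread at face value, treats 1 KB as 1000 bytes, and
the variable names are made up.

```python
# Figures quoted in this thread (1 KB treated as 1000 bytes for simplicity).
WRITES_PER_SEC = 1500          # LWT writes/s, whole cluster
WRITE_SIZE_BYTES = 1000        # about 1 KB per write
NODES = 8
LINK_MB_PER_SEC = 125          # 1 Gbit/s
RECEIVE_MB_PER_SEC = 4         # observed with iftop -B on a Cassandra node
CPU_USED_PCT, CPU_MAX_PCT = 360, 400
TARGET_WRITES_PER_SEC = 5000

payload_mb_per_sec = WRITES_PER_SEC * WRITE_SIZE_BYTES / 1e6
network_utilization = RECEIVE_MB_PER_SEC / LINK_MB_PER_SEC
cpu_utilization = CPU_USED_PCT / CPU_MAX_PCT
writes_per_node = WRITES_PER_SEC / NODES
scale_factor = TARGET_WRITES_PER_SEC / WRITES_PER_SEC

print(f"client payload:      {payload_mb_per_sec:.1f} MB/s")
print(f"network utilization: {network_utilization:.1%}")
print(f"CPU utilization:     {cpu_utilization:.0%}")
print(f"writes/s per node:   {writes_per_node:.1f}")
print(f"needed speed-up:     {scale_factor:.2f}x")
```

On these numbers the link is only a few percent utilized while the CPU is at
about 90%, which matches the observation that the network is not the
bottleneck. An LWT also involves several coordinator-to-replica round trips,
so more bytes on the wire than the raw 1 KB payload are to be expected.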

>
> Best,
>
> Romain
>
>

Re: Attached profiled data but need help understanding it

2017-02-27 Thread Kant Kodali

Hi! My answers are inline.

On Mon, Feb 27, 2017 at 11:48 AM, Kant Kodali <k...@peernova.com> wrote:

>
>
> On Mon, Feb 27, 2017 at 10:30 AM, Romain Hardouin <romainh...@yahoo.fr>
> wrote:
>
>> Hi,
>>
>> Regarding shared pool workers see CASSANDRA-11966. You may have to
>> backport it depending on your Cassandra version.
>>
>
> *we are currently using 3.0.9.  should we use 3.8 or 3.10?*
>
>>
>> Did you try to lower compaction throughput to see if it helps? Be sure to
>> keep an eye on pending compactions, SSTables count and SSTable per read of
>> course.
>>
>
>*I haven't done any tuning yet. Can I change this on a running
> instance? If so, how? or does it require a downtime?*
>
>>
>> "alloc" is the memory allocation rate. You can see that compactions are
>> GC intensive.
>>
>
>> You won't be able to achieve impressive writes/s with LWT. But maybe
>> there is room for improvement. Try GC tuning, different pool sizes for NTR,
>> concurrent reads and writes, compaction executors, etc. Also check if you
>> can improve network latency (e.g. VF or ENA on AWS).
>>
>
>* GC seems to be fine because when I checked GC is about 0.25%. Total
> GC time is about 6minutes since the node is up and running for about 50
> hours*.
>
>>
>> What LWT rate would you want to achieve?  *5000 is the target.*
>>
>> Best,
>>
>> Romain
>>
>>
>>
>> On Monday, 27 February 2017 at 12:48, Kant Kodali <k...@peernova.com> wrote:
>>
>>
>> Also Attached is a flamed graph generated from a thread dump.
>>
>> On Mon, Feb 27, 2017 at 2:32 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>> Hi,
>>
>> Attached are the stats of my Cassandra node running on a 4-core CPU. I am
>> using sjk-plus tool for the first time so what are the things I should
>> watched out for in my attached screenshot? I can see the CPU is almost
>> maxed out but should I say that is because of compaction or
>> shared-worker-pool threads (which btw, I dont know what they are doing
>> perhaps I need to take threadump)? Also what is alloc for each thread?
>>
>> I have a insert heavy workload (almost like an ingest running against
>> cassandra cluster) and in my case all writes are LWT.
>>
>> The current throughput is 1500 writes/sec where each write is about 1KB.
>> How can I tune something for a higher throughput? Any pointers or
>> suggestions would help.
>>
>> Thanks much,
>> kant
>>
>>
>>
>>
>>
>

Re: Attached profiled data but need help understanding it

2017-02-27 Thread Kant Kodali

On Mon, Feb 27, 2017 at 10:30 AM, Romain Hardouin <romainh...@yahoo.fr>
wrote:

> Hi,
>
> Regarding shared pool workers see CASSANDRA-11966. You may have to
> backport it depending on your Cassandra version.
>

*We are currently using 3.0.9. Should we use 3.8 or 3.10?*

>
> Did you try to lower compaction throughput to see if it helps? Be sure to
> keep an eye on pending compactions, SSTables count and SSTable per read of
> course.
>

*I haven't done any tuning yet. Can I change this on a running instance?
If so, how? Or does it require downtime?*

>
> "alloc" is the memory allocation rate. You can see that compactions are GC
> intensive.
>

> You won't be able to achieve impressive writes/s with LWT. But maybe there
> is room for improvement. Try GC tuning, different pool sizes for NTR,
> concurrent reads and writes, compaction executors, etc. Also check if you
> can improve network latency (e.g. VF or ENA on AWS).
>

*GC seems to be fine: when I checked, GC was about 0.25%. Total GC time is
about 6 minutes, and the node has been up and running for about 50 hours.*
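
A quick sanity check of that GC share, using only the two figures above
(about 6 minutes of GC over about 50 hours of uptime). The variable names are
made up; the small gap to the quoted 0.25% is presumably rounding in the
inputs.

```python
gc_minutes = 6        # total GC time reported above
uptime_hours = 50     # node uptime reported above

gc_fraction = gc_minutes / (uptime_hours * 60)
print(f"GC share of uptime: {gc_fraction:.2%}")  # prints 0.20%
```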

>
> What LWT rate would you want to achieve?  *5000 is the target.*
>
> Best,
>
> Romain
>
>
>
> On Monday, 27 February 2017 at 12:48, Kant Kodali <k...@peernova.com> wrote:
>
>
> Also Attached is a flamed graph generated from a thread dump.
>
> On Mon, Feb 27, 2017 at 2:32 AM, Kant Kodali <k...@peernova.com> wrote:
>
> Hi,
>
> Attached are the stats of my Cassandra node running on a 4-core CPU. I am
> using sjk-plus tool for the first time so what are the things I should
> watched out for in my attached screenshot? I can see the CPU is almost
> maxed out but should I say that is because of compaction or
> shared-worker-pool threads (which btw, I dont know what they are doing
> perhaps I need to take threadump)? Also what is alloc for each thread?
>
> I have a insert heavy workload (almost like an ingest running against
> cassandra cluster) and in my case all writes are LWT.
>
> The current throughput is 1500 writes/sec where each write is about 1KB.
> How can I tune something for a higher throughput? Any pointers or
> suggestions would help.
>
> Thanks much,
> kant
>
>
>
>
>

Re: Attached profiled data but need help understanding it

2017-02-27 Thread Kant Kodali

Also attached is a flame graph generated from a thread dump.

On Mon, Feb 27, 2017 at 2:32 AM, Kant Kodali <k...@peernova.com> wrote:

> Hi,
>
> Attached are the stats of my Cassandra node running on a 4-core CPU. I am
> using the sjk-plus tool for the first time, so what are the things I should
> watch out for in my attached screenshot? I can see the CPU is almost
> maxed out, but should I say that is because of compaction or the
> shared-worker-pool threads (which, btw, I don't know what they are doing;
> perhaps I need to take a thread dump)? Also, what is alloc for each thread?
>
> I have an insert-heavy workload (almost like an ingest running against a
> Cassandra cluster), and in my case all writes are LWTs.
>
> The current throughput is 1500 writes/sec, where each write is about 1KB.
> How can I tune for higher throughput? Any pointers or
> suggestions would help.
>
> Thanks much,
> kant
>

Re: How does cassandra achieve Linearizability?

2017-02-26 Thread Kant Kodali

Is there a way to apply the commits from this
https://github.com/bdeggleston/cassandra/tree/CASSANDRA-6246-trunk branch
to the Apache Cassandra 3.10 branch? I thought I could just merge these two
branches, but it looks like there are several trunks, so I am confused about
which trunk I would be merging into.
I want to merge it just to try it on my local machine.

Thanks!

On Wed, Feb 22, 2017 at 8:04 PM, Michael Shuler <mich...@pbandjelly.org>
wrote:

> I updated the fix version on CASSANDRA-6246 to 4.x. The 3.11.x edit was
> a bulk move when removing the cassandra-3.X branch and the 3.x Jira
> version. There are likely other new feature tickets that should really
> say 4.x.
>
> --
> Kind regards,
> Michael
>
> On 02/22/2017 07:28 PM, Kant Kodali wrote:
> > I hope that patch is reviewed as quickly as possible. We use LWT's
> > heavily and we are getting a throughput of 600 writes/sec and each write
> > is 1KB in our case.
> >
> >
> >
> >
> >
> > On Wed, Feb 22, 2017 at 7:48 AM, Edward Capriolo <edlinuxg...@gmail.com> wrote:
> >
> >
> >
> > On Wed, Feb 22, 2017 at 9:47 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:
> >
> > Hi,
> >
> > No it's not going to be in 3.11.x. The earliest release it could
> > make it into is 4.0.
> >
> > Ariel
> >
> > On Wed, Feb 22, 2017, at 03:34 AM, Kant Kodali wrote:
> >> Hi Ariel,
> >>
> >> Can we really expect the fix in 3.11.x as the
> >> ticket https://issues.apache.org/jira/browse/CASSANDRA-6246
> >> <https://issues.apache.org/jira/browse/CASSANDRA-6246?jql=text%20~%20%22epaxos%22> says?
> >>
> >> Thanks,
> >> kant
> >>
> >> On Thu, Feb 16, 2017 at 2:12 PM, Ariel Weisberg <ar...@weisberg.ws> wrote:
> >>
> >> Hi,
> >>
> >> That would work and would help a lot with the dueling
> >> proposer issue.
> >>
> >> A lot of the leader election stuff is designed to reduce
> >> the number of roundtrips and not just address the dueling
> >> proposer issue. Those will have downtime because it's
> >> there for correctness. Just adding an affinity for a
> >> specific proposer is probably a free lunch.
> >>
> >> I don't think you can group keys because the Paxos
> >> proposals are per partition which is why we get linear
> >> scale out for Paxos. I don't believe it's linearizable
> >> across multiple partitions. You can use the clustering key
> >> and deterministically pick one of the live replicas for
> >> that clustering key. Sort the list of replicas by IP, hash
> >> the clustering key, use the hash as an index into the list
> >> of replicas.
> >>
> >> Batching is of limited usefulness because we only use
> >> Paxos for CAS I think? So in a batch by definition all but
> >> one will fail the CAS. This is something where a
> >> distinguished coordinator could help by failing the rest
> >> of the contending requests more inexpensively than it
> >> currently does.
> >>
> >>
> >> Ariel
> >>
> >> On Thu, Feb 16, 2017, at 04:55 PM, Edward Capriolo wrote:
> >>>
> >>>
> >>> On Thu, Feb 16, 2017 at 4:33 PM, Ariel Weisberg <ar...@weisberg.ws> wrote:
> >>>
> >>> Hi,
> >>>
> >>> Classic Paxos doesn't have a leader. There are
> >>> variants on the original Lamport approach that will
> >>> elect a leader (or some other variation like Mencius)
> >>> to improve throughput, latency, and performance under
> >>> contention. Cassandra implements the approach from
> >>> the beginning of "Paxos Made Simple"
> >>> (https://goo.gl/SrP0Wb) with no additional
> >>> op

Re: How does cassandra achieve Linearizability?

2017-02-22 Thread Kant Kodali

I hope that patch is reviewed as quickly as possible. We use LWTs heavily,
and we are getting a throughput of 600 writes/sec, where each write is 1KB in
our case.

On Wed, Feb 22, 2017 at 7:48 AM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

>
>
> On Wed, Feb 22, 2017 at 9:47 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
>> Hi,
>>
>> No it's not going to be in 3.11.x. The earliest release it could make it
>> into is 4.0.
>>
>> Ariel
>>
>> On Wed, Feb 22, 2017, at 03:34 AM, Kant Kodali wrote:
>>
>> Hi Ariel,
>>
>> Can we really expect the fix in 3.11.x as the ticket
>> https://issues.apache.org/jira/browse/CASSANDRA-6246
>> <https://issues.apache.org/jira/browse/CASSANDRA-6246?jql=text%20~%20%22epaxos%22>
>>  says?
>>
>> Thanks,
>> kant
>>
>> On Thu, Feb 16, 2017 at 2:12 PM, Ariel Weisberg <ar...@weisberg.ws>
>> wrote:
>>
>>
>> Hi,
>>
>> That would work and would help a lot with the dueling proposer issue.
>>
>> A lot of the leader election stuff is designed to reduce the number of
>> roundtrips and not just address the dueling proposer issue. Those will have
>> downtime because it's there for correctness. Just adding an affinity for a
>> specific proposer is probably a free lunch.
>>
>> I don't think you can group keys because the Paxos proposals are per
>> partition which is why we get linear scale out for Paxos. I don't believe
>> it's linearizable across multiple partitions. You can use the clustering
>> key and deterministically pick one of the live replicas for that clustering
>> key. Sort the list of replicas by IP, hash the clustering key, use the hash
>> as an index into the list of replicas.
>>
>> Batching is of limited usefulness because we only use Paxos for CAS I
>> think? So in a batch by definition all but one will fail the CAS. This is
>> something where a distinguished coordinator could help by failing the rest
>> of the contending requests more inexpensively than it currently does.
>>
>>
>> Ariel
>>
>> On Thu, Feb 16, 2017, at 04:55 PM, Edward Capriolo wrote:
>>
>>
>>
>> On Thu, Feb 16, 2017 at 4:33 PM, Ariel Weisberg <ar...@weisberg.ws>
>> wrote:
>>
>>
>> Hi,
>>
>> Classic Paxos doesn't have a leader. There are variants on the original
>> Lamport approach that will elect a leader (or some other variation like
>> Mencius) to improve throughput, latency, and performance under contention.
>> Cassandra implements the approach from the beginning of "Paxos Made Simple"
>> (https://goo.gl/SrP0Wb) with no additional optimizations that I am aware
>> of. There is no distinguished proposer (leader).
>>
>> That paper does go on to discuss electing a distinguished proposer, but
>> that was never done for C*. I believe it's not considered a good fit for C*
>> philosophically.
>>
>> Ariel
>>
>> On Thu, Feb 16, 2017, at 04:20 PM, Kant Kodali wrote:
>>
>> @Ariel Weisberg EPaxos looks very interesting as it looks like it doesn't
>> need any designated leader for C* but I am assuming the paxos that is
>> implemented today for LWT's requires Leader election and If so, don't we
>> need to have an odd number of nodes or racks or DC's to satisfy N = 2F + 1
>> constraint to tolerate F failures ? I understand it is not needed when not
>> using LWT's since Cassandra is a master-less system.
>>
>> On Fri, Feb 10, 2017 at 10:25 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>> Thanks Ariel! Yes I knew there are so many variations and optimizations
>> of Paxos. I just wanted to see if we had any plans on improving the
>> existing Paxos implementation and it is great to see the work is under
>> progress! I am going to follow that ticket and read up the references
>> pointed in it
>>
>>
>> On Fri, Feb 10, 2017 at 8:33 AM, Ariel Weisberg <ar...@weisberg.ws>
>> wrote:
>>
>>
>> Hi,
>>
>> Cassandra's implementation of Paxos doesn't implement many optimizations
>> that would drastically improve throughput and latency. You need consensus,
>> but it doesn't have to be exorbitantly expensive and fall over under any
>> kind of contention.
>>
>> For instance you could implement EPaxos
>> https://issues.apache.org/jira/browse/CASSANDRA-6246
>> <https://issues.apache.org/jira/browse/CASSANDRA-6246?jql=text%20~%20%22epaxos%22>,
>> batch multiple operations into the same Paxos round, have an affinity for a
>> specific proposer for

Re: How does cassandra achieve Linearizability?

2017-02-22 Thread Kant Kodali

Hi Ariel,

Can we really expect the fix in 3.11.x, as the ticket
https://issues.apache.org/jira/browse/CASSANDRA-6246
<https://issues.apache.org/jira/browse/CASSANDRA-6246?jql=text%20~%20%22epaxos%22>
says?

Thanks,
kant

On Thu, Feb 16, 2017 at 2:12 PM, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> That would work and would help a lot with the dueling proposer issue.
>
> A lot of the leader election stuff is designed to reduce the number of
> roundtrips and not just address the dueling proposer issue. Those will have
> downtime because it's there for correctness. Just adding an affinity for a
> specific proposer is probably a free lunch.
>
> I don't think you can group keys because the Paxos proposals are per
> partition which is why we get linear scale out for Paxos. I don't believe
> it's linearizable across multiple partitions. You can use the clustering
> key and deterministically pick one of the live replicas for that clustering
> key. Sort the list of replicas by IP, hash the clustering key, use the hash
> as an index into the list of replicas.
>
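
The selection rule described just above (sort the replicas by IP, hash the
clustering key, index into the list) can be sketched as follows. This is an
illustration of the idea only, not anything Cassandra ships: the function
name is made up, SHA-256 is an arbitrary choice of stable hash, and the
addresses are assumed to be all IPv4 or all IPv6.

```python
import hashlib
import ipaddress


def pick_proposer(live_replicas, clustering_key):
    """Deterministically pick one live replica for a clustering key.

    Every coordinator that sees the same set of live replicas picks the
    same one, regardless of the order in which it learned about them.
    """
    if not live_replicas:
        raise ValueError("no live replicas")
    # Sort numerically by address so that "10.0.0.10" sorts after "10.0.0.9".
    ordered = sorted(live_replicas, key=ipaddress.ip_address)
    # Use a stable hash; Python's built-in hash() is salted per process.
    digest = hashlib.sha256(clustering_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(ordered)
    return ordered[index]


if __name__ == "__main__":
    replicas = ["10.0.0.3", "10.0.0.1", "10.0.0.2"]
    print(pick_proposer(replicas, "user:42"))
```

One caveat of this simple scheme: when a replica goes down, the list shrinks
and most keys are remapped to a different proposer until it comes back.
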
> Batching is of limited usefulness because we only use Paxos for CAS I
> think? So in a batch by definition all but one will fail the CAS. This is
> something where a distinguished coordinator could help by failing the rest
> of the contending requests more inexpensively than it currently does.
>
> Ariel
> On Thu, Feb 16, 2017, at 04:55 PM, Edward Capriolo wrote:
>
>
>
> On Thu, Feb 16, 2017 at 4:33 PM, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
>
> Hi,
>
> Classic Paxos doesn't have a leader. There are variants on the original
> Lamport approach that will elect a leader (or some other variation like
> Mencius) to improve throughput, latency, and performance under contention.
> Cassandra implements the approach from the beginning of "Paxos Made Simple"
> (https://goo.gl/SrP0Wb) with no additional optimizations that I am aware
> of. There is no distinguished proposer (leader).
>
> That paper does go on to discuss electing a distinguished proposer, but
> that was never done for C*. I believe it's not considered a good fit for C*
> philosophically.
>
> Ariel
>
> On Thu, Feb 16, 2017, at 04:20 PM, Kant Kodali wrote:
>
> @Ariel Weisberg EPaxos looks very interesting as it looks like it doesn't
> need any designated leader for C* but I am assuming the paxos that is
> implemented today for LWT's requires Leader election and If so, don't we
> need to have an odd number of nodes or racks or DC's to satisfy N = 2F + 1
> constraint to tolerate F failures ? I understand it is not needed when not
> using LWT's since Cassandra is a master-less system.
>
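
On the N = 2F + 1 point, a small sketch of the majority arithmetic may help.
It assumes N is the number of replicas taking part in a Paxos round for one
partition (i.e. the replication factor), not the number of nodes in the
cluster; the function names are made up.

```python
def majority(n):
    """Smallest number of replicas that forms a majority of n."""
    return n // 2 + 1


def tolerated_failures(n):
    """How many replicas can be down while a majority is still reachable."""
    return n - majority(n)


if __name__ == "__main__":
    for rf in range(1, 7):
        print(f"replicas={rf} majority={majority(rf)} "
              f"tolerated failures={tolerated_failures(rf)}")
```

An even replica count still works; it just tolerates no more failures than
the odd count below it (4 replicas tolerate 1 failure, the same as 3).
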
> On Fri, Feb 10, 2017 at 10:25 AM, Kant Kodali <k...@peernova.com> wrote:
>
> Thanks Ariel! Yes I knew there are so many variations and optimizations of
> Paxos. I just wanted to see if we had any plans on improving the existing
> Paxos implementation and it is great to see the work is under progress! I
> am going to follow that ticket and read up the references pointed in it
>
>
> On Fri, Feb 10, 2017 at 8:33 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
>
> Hi,
>
> Cassandra's implementation of Paxos doesn't implement many optimizations
> that would drastically improve throughput and latency. You need consensus,
> but it doesn't have to be exorbitantly expensive and fall over under any
> kind of contention.
>
> For instance you could implement EPaxos
> https://issues.apache.org/jira/browse/CASSANDRA-6246
> <https://issues.apache.org/jira/browse/CASSANDRA-6246?jql=text%20~%20%22epaxos%22>,
> batch multiple operations into the same Paxos round, have an affinity for a
> specific proposer for a specific partition, implement asynchronous commit,
> use a more efficient implementation of the Paxos log, and maybe other
> things.
>
>
> Ariel
>
>
>
> On Fri, Feb 10, 2017, at 05:31 AM, Benjamin Roth wrote:
>
> Hi Kant,
>
> If you read the published papers about Paxos, you will most probably
> recognize that there is no way to "do it better". This is a conceptual
> thing due to the nature of distributed systems + the CAP theorem.
> If you want A+P in the triangle, then C is very expensive. CS is made for
> A+P mostly with tunable C. In ACID databases this is a completely different
> thing as they are mostly either not partition tolerant, not highly
> available or not scalable (in a distributed manner, not speaking of
> "monolithic super servers").
>
> There is no free lunch ...
>
>
> 2017-02-10 11:09 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
> "That’s the safety blanket everyone wants but is extremely expensive,
> especially i

Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali

It looks like there is ordering within one client (based on timestamps), and
it looks like this *order is preserved across all replicas*. However, the
benefits of async given the ordering restriction are still slightly blurry
to me.


On Tue, Feb 21, 2017 at 2:35 AM, Kant Kodali <k...@peernova.com> wrote:

> Agreed with multiple clients one cannot guarantee the order however with
> multiple clients the client side timestamps will overlap as well. so even
> in the case of LWT's and multiple clients the order is not guaranteed
> right. By multiple clients I mean multiple C* sessions on the driver side.
> if multiple LWT's have same time stamps from different clients I would
> assume one of the LWT's from one client can be overwritten by the other LWT
> from another client with same timestamp.
>
>
>
>
>
> On Tue, Feb 21, 2017 at 1:52 AM, Benjamin Roth <benjamin.r...@jaumo.com>
> wrote:
>
>> For eventual consistency, it does not matter if it is sync or async. LWW
>> always works as long as clocks are synchronized.
>> Thats a design pattern of CS or EC databases in general. Every write has
>> a timestamp and no matter at what time it arrives, the last write will win
>> even if a "sooner" write arrives late due to network latency oder a
>> unavailable server that receives a hint after 1 hour.
>> Doing replication sync will kill all the benefits you have from CS's
>> design:
>> - low latency
>> - partition tolerance
>> - high availability
>>
>> Doing sync replication would also not guarantee a state as another client
>> could "interfer" with your write. So you still have no "linearizability".
>> Only LWT does this.
>> You cannot rely on orders in CS. No matter how replication works. You
>> only can rely "eventually" on it but there is never a point in time you can
>> tell 100% your system is completely consistent.
>>
>> Maybe what you could do if you are talking of "orders" and that pointer
>> thing you mentioned earlier: Try sth similar like MVs do.
>> Create a trigger, operate on your local dataset, read the order based on
>> PK (locally) and update "the pointer" on every write (also locally). If you
>> then store your pointer with the last known timestamp of your base data,
>> you also have a LWW on your pointer so also the last pointer wins when
>> reading with > CL_ONE.
>> But that will probably harm your write performance.
>>
>> 2017-02-21 10:36 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>
>>> @Benjamin I am more looking for how C* replication works underneath.
>>> There are few things here that I would need some clarification.
>>>
>>> 1. Does C* uses sync replication or async replication? If it is async
>>> replication how can one get performance especially when there is an
>>> ordering constraint among requests to comply with LWW.  Also below is a
>>> statement from C* website so how can one choose between sync or async
>>> replication? any configuration parameter that needs to be passed in?
>>>
>>> "Choose between synchronous or asynchronous replication for each update.
>>> "
>>>
>>> http://cassandra.apache.org/
>>>
>>> 2. Is it Guaranteed that C* coordinator writes data in the same order to
>>> all the replicas (either sync or async)?
>>>
>>> Thanks,
>>> kant
>>>
>>> On Tue, Feb 21, 2017 at 1:23 AM, Benjamin Roth <benjamin.r...@jaumo.com>
>>> wrote:
>>>
>>>> To me that sounds like a completely different design pattern and a
>>>> different use case.
>>>> CS was not designed to guarantee order. It was build to be linear
>>>> scalable, highly concurrent and eventual consistent.
>>>> To me it sounds like a ACID DB better serves what you are asking for.
>>>>
>>>> 2017-02-21 10:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>>>
>>>>> Agreed that async performs better than sync in general but the catch
>>>>> here to me is the "order".
>>>>>
>>>>> The whole point of async is to do out of order processing by which I
>>>>> mean say if a request 1 comes in at time t1 and a request 2 comes in at
>>>>> time t2 where t1 < t2 and say now that t1 is taking longer to process than
>>>>> t2 in which case request 2 should get a response first and subsequently a
>>>>> response for request 1. This is where I would imagine all the benefits of
>>>>>

Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali

Agreed, with multiple clients one cannot guarantee the order. However, with
multiple clients the client-side timestamps will overlap as well, so even
in the case of LWTs and multiple clients the order is not guaranteed,
right? By multiple clients I mean multiple C* sessions on the driver side.
If multiple LWTs from different clients have the same timestamp, I would
assume the LWT from one client can be overwritten by the LWT from another
client with the same timestamp.
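
To make the timestamp-tie scenario concrete, here is a toy sketch of
last-write-wins reconciliation for plain (non-LWT) writes. It is a model, not
Cassandra code: the Cell class and function names are made up, and the
tie-break rule (the larger value wins) is an assumption chosen only so that
the outcome is deterministic. Whether LWTs behave the same way is exactly the
open question above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Cell:
    value: bytes
    timestamp: int  # e.g. microseconds since epoch, supplied by the client


def reconcile(a, b):
    """Return the winning cell under last-write-wins."""
    if a.timestamp != b.timestamp:
        return a if a.timestamp > b.timestamp else b
    # Equal timestamps: break the tie on the value so that every replica
    # converges on the same winner (assumed rule, for illustration).
    return a if a.value >= b.value else b


def merge(cells):
    """Fold reconcile() over cells in whatever order they arrived."""
    winner = cells[0]
    for cell in cells[1:]:
        winner = reconcile(winner, cell)
    return winner


if __name__ == "__main__":
    w1 = Cell(b"from-client-1", 100)
    w2 = Cell(b"from-client-2", 100)  # same timestamp as w1
    old = Cell(b"old", 50)
    # Arrival order differs per replica, yet both pick the same winner.
    print(merge([w1, w2, old]) == merge([old, w2, w1]))
```

In this model one of the two same-timestamp writes is silently dropped, but
every replica drops the same one.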





On Tue, Feb 21, 2017 at 1:52 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> For eventual consistency, it does not matter if it is sync or async. LWW
> always works as long as clocks are synchronized.
> That's a design pattern of CS or EC databases in general. Every write has a
> timestamp, and no matter at what time it arrives, the last write will win,
> even if an "earlier" write arrives late due to network latency or an
> unavailable server that receives a hint after 1 hour.
> Doing replication sync will kill all the benefits you have from CS's
> design:
> - low latency
> - partition tolerance
> - high availability
>
> Doing sync replication would also not guarantee a state as another client
> could "interfere" with your write. So you still have no "linearizability".
> Only LWT does this.
> You cannot rely on orders in CS. No matter how replication works. You only
> can rely "eventually" on it but there is never a point in time you can tell
> 100% your system is completely consistent.
>
> Maybe what you could do if you are talking of "orders" and that pointer
> thing you mentioned earlier: try something similar to what MVs do.
> Create a trigger, operate on your local dataset, read the order based on
> PK (locally) and update "the pointer" on every write (also locally). If you
> then store your pointer with the last known timestamp of your base data,
> you also have a LWW on your pointer so also the last pointer wins when
> reading with > CL_ONE.
> But that will probably harm your write performance.
>
> 2017-02-21 10:36 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
>> @Benjamin I am more looking for how C* replication works underneath.
>> There are few things here that I would need some clarification.
>>
>> 1. Does C* uses sync replication or async replication? If it is async
>> replication how can one get performance especially when there is an
>> ordering constraint among requests to comply with LWW.  Also below is a
>> statement from C* website so how can one choose between sync or async
>> replication? any configuration parameter that needs to be passed in?
>>
>> "Choose between synchronous or asynchronous replication for each update."
>>
>> http://cassandra.apache.org/
>>
>> 2. Is it Guaranteed that C* coordinator writes data in the same order to
>> all the replicas (either sync or async)?
>>
>> Thanks,
>> kant
>>
>> On Tue, Feb 21, 2017 at 1:23 AM, Benjamin Roth <benjamin.r...@jaumo.com>
>> wrote:
>>
>>> To me that sounds like a completely different design pattern and a
>>> different use case.
>>> CS was not designed to guarantee order. It was build to be linear
>>> scalable, highly concurrent and eventual consistent.
>>> To me it sounds like a ACID DB better serves what you are asking for.
>>>
>>> 2017-02-21 10:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>>
>>>> Agreed that async performs better than sync in general, but the catch
>>>> here, to me, is the "order".
>>>>
>>>> The whole point of async is out-of-order processing. Say request 1 comes
>>>> in at time t1 and request 2 comes in at time t2, where t1 < t2, and
>>>> request 1 takes longer to process than request 2. In that case request 2
>>>> should get a response first, followed by a response for request 1. This
>>>> is where I would imagine all the benefits of async come in. But the moment
>>>> you introduce order, by saying that for Last Write Wins all the async
>>>> requests must be processed in order, I would imagine all the benefits of
>>>> async are lost.
>>>>
>>>> Let's see if anyone can comment about how it works inside C*.
>>>>
>>>> Thanks!
>>>>
>>>>
>>>>
>>>> On Mon, Feb 20, 2017 at 10:54 PM, Dor Laor <d...@scylladb.com> wrote:
>>>>
>>>>> Could be. Let's stay tuned to see if someone else picks it up.
>>>>> Anyway, if it's synchronous, you'll have a large penalty for l

Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali

@Benjamin I am looking more for how C* replication works underneath. There
are a few things here that I would need some clarification on.

1. Does C* use sync replication or async replication? If it is async
replication, how can one get performance, especially when there is an
ordering constraint among requests to comply with LWW? Also, below is a
statement from the C* website, so how can one choose between sync and async
replication? Is there a configuration parameter that needs to be passed in?

"Choose between synchronous or asynchronous replication for each update."

http://cassandra.apache.org/

2. Is it guaranteed that the C* coordinator writes data in the same order to
all the replicas (either sync or async)?

Thanks,
kant

On Tue, Feb 21, 2017 at 1:23 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> To me that sounds like a completely different design pattern and a
> different use case.
> CS was not designed to guarantee order. It was built to be linearly
> scalable, highly concurrent and eventually consistent.
> To me it sounds like an ACID DB better serves what you are asking for.
>
> 2017-02-21 10:17 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
>> Agreed that async performs better than sync in general, but the catch here,
>> to me, is the "order".
>>
>> The whole point of async is out-of-order processing. Say request 1 comes in
>> at time t1 and request 2 comes in at time t2, where t1 < t2, and request 1
>> takes longer to process than request 2. In that case request 2 should get a
>> response first, followed by a response for request 1. This is where I would
>> imagine all the benefits of async come in. But the moment you introduce
>> order, by saying that for Last Write Wins all the async requests must be
>> processed in order, I would imagine all the benefits of async are lost.
>>
>> Let's see if anyone can comment about how it works inside C*.
>>
>> Thanks!
>>
>>
>>
>> On Mon, Feb 20, 2017 at 10:54 PM, Dor Laor <d...@scylladb.com> wrote:
>>
>>> Could be. Let's stay tuned to see if someone else picks it up.
>>> Anyway, if it's synchronous, you'll have a large penalty for latency.
>>>
>>> On Mon, Feb 20, 2017 at 10:11 PM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Thanks again for the response! If they mean it between client and
>>>> server, I am not sure why they would use the word "replication" in the
>>>> statement below, since there is no replication between client and server
>>>> (coordinator).
>>>>
>>>> "Choose between synchronous or asynchronous replication for each update."
>>>>
>>>> Sent from my iPhone
>>>>
>>>> On Feb 20, 2017, at 5:30 PM, Dor Laor <d...@scylladb.com> wrote:
>>>>
>>>> I think they mean the client to server and not among the servers
>>>>
>>>> On Mon, Feb 20, 2017 at 5:28 PM, Kant Kodali <k...@peernova.com> wrote:
>>>>
>>>>> Also here is a statement from C* website
>>>>>
>>>>> "Choose between synchronous or asynchronous replication for each
>>>>> update."
>>>>>
>>>>> http://cassandra.apache.org/
>>>>>
>>>>> Looks like we can choose either sync or async, then?
>>>>>
>>>>> On Mon, Feb 20, 2017 at 5:25 PM, Kant Kodali <k...@peernova.com>
>>>>> wrote:
>>>>>
>>>>>> Hi Dor,
>>>>>>
>>>>>> Great response! My comments are inline.
>>>>>>
>>>>>> Thanks a lot,
>>>>>> kant
>>>>>>
>>>>>>
>>>>>> On Mon, Feb 20, 2017 at 4:41 PM, Dor Laor <d...@scylladb.com> wrote:
>>>>>>
>>>>>>> I sent this answer but it bounced off the user@apache.
>>>>>>> Here is the email anyway:
>>>>>>>
>>>>>>> -- Forwarded message --
>>>>>>> From: Dor Laor <d...@scylladb.com>
>>>>>>> Date: Mon, Feb 20, 2017 at 4:37 PM
>>>>>>> Subject: Re: Does C* coordinator writes to replicas in same order or
>>>>>>> different order?
>>>>>>> To: d...@cassandra.apache.org
>>>>>>> Cc: user@cassandra.apache.org
>>>>>>>
>>>>>>>
>>>>>>> + The C* coordinator send async write

Re: Does C* coordinator writes to replicas in same order or different order?

2017-02-21 Thread Kant Kodali

Agreed that async performs better than sync in general, but the catch here,
to me, is the "order".

The whole point of async is out-of-order processing. Say request 1 comes in
at time t1 and request 2 comes in at time t2, where t1 < t2, and request 1
takes longer to process than request 2. In that case request 2 should get a
response first, followed by a response for request 1. This is where I would
imagine all the benefits of async come in. But the moment you introduce
order, by saying that for Last Write Wins all the async requests must be
processed in order, I would imagine all the benefits of async are lost.
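
A minimal sketch of the concern above, assuming (as Dor's reply quoted below puts it) that the timestamp sets the order and a replica simply keeps the write with the highest timestamp. `Replica`, `apply` and `final_state` are hypothetical names for illustration, not Cassandra internals, and the tie-break on equal timestamps is an assumption made only to keep the sketch deterministic. It suggests that replicas receiving the same writes in different arrival orders can still end up with the same value, so ordering by timestamp does not by itself require ordered delivery:

```python
from dataclasses import dataclass
from itertools import permutations


@dataclass
class Replica:
    """Hypothetical replica holding one cell; not Cassandra's storage engine."""
    timestamp: int = -1
    value: bytes = b""

    def apply(self, timestamp: int, value: bytes) -> None:
        # Last write wins: keep the incoming write only if it is newer.
        # On equal timestamps, compare values so the outcome stays
        # deterministic (an assumed tie-break rule for this sketch).
        if (timestamp, value) > (self.timestamp, self.value):
            self.timestamp, self.value = timestamp, value


def final_state(writes):
    # Apply the writes in the given arrival order and report the result.
    replica = Replica()
    for ts, val in writes:
        replica.apply(ts, val)
    return replica.timestamp, replica.value


# Three writes to the same cell: t1 < t2, plus a late retry of request 1.
writes = [(1, b"request-1"), (2, b"request-2"), (1, b"request-1")]

# Collect the final state for every possible arrival order.
states = {final_state(order) for order in permutations(writes)}
```

Here `states` ends up holding a single entry, the write with the highest timestamp, whichever order the writes arrive in. This says nothing about what a reader sees in between, which is where consistency levels come in.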

Let's see if anyone can comment about how it works inside C*.

Thanks!



On Mon, Feb 20, 2017 at 10:54 PM, Dor Laor <d...@scylladb.com> wrote:

> Could be. Let's stay tuned to see if someone else picks it up.
> Anyway, if it's synchronous, you'll have a large penalty for latency.
>
> On Mon, Feb 20, 2017 at 10:11 PM, Kant Kodali <k...@peernova.com> wrote:
>
>> Thanks again for the response! If they mean it between client and server,
>> I am not sure why they would use the word "replication" in the statement
>> below, since there is no replication between client and server (coordinator).
>>
>> "Choose between synchronous or asynchronous replication for each update."
>>
>> Sent from my iPhone
>>
>> On Feb 20, 2017, at 5:30 PM, Dor Laor <d...@scylladb.com> wrote:
>>
>> I think they mean the client to server and not among the servers
>>
>> On Mon, Feb 20, 2017 at 5:28 PM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> Also here is a statement from C* website
>>>
>>> "Choose between synchronous or asynchronous replication for each update."
>>>
>>> http://cassandra.apache.org/
>>>
>>> Looks like we can choose either sync or async, then?
>>>
>>> On Mon, Feb 20, 2017 at 5:25 PM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Hi Dor,
>>>>
>>>> Great response! My comments are inline.
>>>>
>>>> Thanks a lot,
>>>> kant
>>>>
>>>>
>>>> On Mon, Feb 20, 2017 at 4:41 PM, Dor Laor <d...@scylladb.com> wrote:
>>>>
>>>>> I sent this answer but it bounced off the user@apache.
>>>>> Here is the email anyway:
>>>>>
>>>>> -- Forwarded message --
>>>>> From: Dor Laor <d...@scylladb.com>
>>>>> Date: Mon, Feb 20, 2017 at 4:37 PM
>>>>> Subject: Re: Does C* coordinator writes to replicas in same order or
>>>>> different order?
>>>>> To: d...@cassandra.apache.org
>>>>> Cc: user@cassandra.apache.org
>>>>>
>>>>>
>>>>> + The C* coordinator sends async write requests to the replicas.
>>>>>   This is very important since it allows it to return a low latency
>>>>>   reply to the client once the CL is reached. You wouldn't want
>>>>>   to serialize the replicas one after the other.
>>>>>
>>>>
>>>> *So the coordinator won't wait until a CL is reached before it
>>>> processes another request?*
>>>>
>>>>>
>>>>> + The client <-> server sync/async isn't related to the coordinator
>>>>>   in this case.
>>>>>
>>>>> + In the case of concurrent writes (always the case...), the timestamp
>>>>>   sets the order. Note that it's possible to work with client timestamps
>>>>>   or server timestamps. The client ones are usually the best choice.
>>>>>
>>>>
>>>> *In theory, when we say concurrent writes, they should have the same
>>>> timestamp, right? What I am really looking for is this: if I send write
>>>> requests concurrently for record 1 and record 2, are they guaranteed to be
>>>> inserted in the same order across replicas? (Whatever order the coordinator
>>>> may choose is fine, but I want the same order across all replicas, and with
>>>> async replication I am not sure how that is possible. For example, if a
>>>> request arrives with timestamp t1 and another request arrives with
>>>> timestamp t2 where t1 < t2, what if, with async replication, one replica
>>>> chooses to execute t2 first and then t1 simply because t1 is slow, while
>>>> another replica chooses to execute t1 first and then t2? How would that
>>>> work?)*
>>>>
>>>>>
>>>>> Note that in C* each node can be a coordinator (one per request), and
>>>>> that is the desired case in order to load balance the incoming requests.
>>>>> Once again, timestamps determine the order among the requests.
>>>>>
>>>>> Cheers,
>>>>> Dor
>>>>>
>>>>> On Mon, Feb 20, 2017 at 4:12 PM, Kant Kodali <k...@peernova.com>
>>>>> wrote:
>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> When the C* coordinator writes to replicas, does it write in the same
>>>>>> order or a different order? In other words, does the replication happen
>>>>>> synchronously or asynchronously? Also, does this depend on whether the
>>>>>> client is sync or async? What happens in the case of concurrent writes
>>>>>> to a coordinator?
>>>>>>
>>>>>> Thanks,
>>>>>> kant
>>>>>>
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Does C* coordinator writes to replicas in same order or different order?

2017-02-20 Thread Kant Kodali

Hi,

When the C* coordinator writes to replicas, does it write in the same order or
a different order? In other words, does the replication happen synchronously
or asynchronously? Also, does this depend on whether the client is sync or
async? What happens in the case of concurrent writes to a coordinator?

Thanks,
kant

Are Cassandra Triggers Thread Safe? ("Tough questions perhaps!")

2017-02-20 Thread Kant Kodali

Hi,

1. Are Cassandra triggers thread safe? What happens if two writes invoke
the trigger and the trigger tries to modify the same row in a partition?
2. Has anyone used them successfully in production? If so, any issues? (I am
using the latest version of C*, 3.10.)
3. I have partitions of about 10K rows, and each row in a partition needs
to have a pointer to the previous row (the pointer in this case is a hash).
Using triggers here would greatly simplify our application logic.
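
The "pointer to the previous row" in point 3 can be pictured as a hash chain. A minimal sketch, assuming SHA-256 and an in-memory list standing in for one partition; `row_hash`, `append_row` and `verify_chain` are hypothetical helpers, not part of any Cassandra trigger API. It also hints at why question 1 matters: each new row has to read the current tail before it can be written.

```python
import hashlib

GENESIS = b"\x00" * 32  # placeholder "previous hash" for the first row


def row_hash(prev_hash: bytes, payload: bytes) -> bytes:
    # A row's hash covers its payload and the hash of the row before it.
    return hashlib.sha256(prev_hash + payload).digest()


def append_row(partition: list, payload: bytes) -> None:
    # Read the current tail, then link the new row to it. Two writers doing
    # this concurrently could both link to the same tail and fork the chain.
    prev_hash = partition[-1]["hash"] if partition else GENESIS
    partition.append({
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": row_hash(prev_hash, payload),
    })


def verify_chain(partition: list) -> bool:
    # Walk the rows in order and check every back-pointer and hash.
    prev_hash = GENESIS
    for row in partition:
        if row["prev_hash"] != prev_hash:
            return False
        if row["hash"] != row_hash(prev_hash, row["payload"]):
            return False
        prev_hash = row["hash"]
    return True


partition = []
for payload in (b"order-1", b"order-2", b"order-3"):
    append_row(partition, payload)
```

Calling `verify_chain(partition)` after the loop should succeed, and changing any earlier payload should make it fail, which is the property the pointer is meant to provide.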

It will be a huge help if I can get answers to this.

Thanks,
kant

question on uda/udf

2017-02-18 Thread Kant Kodali

Hi All,

Goal: I want to create a check_duplicate UDA on a blob column.

Context: I have a partition of 10 million rows with a size of 10GB (I know
this is bad). I want to check if there are duplicates in a blob column in
this partition. The blob column can be at most 256 bytes.

Question: Can I create a state map? Since a blob is represented as
a ByteBuffer, I suspect this won't work because of the .equals method, which
would just compare the references. So can I convert the blob into a hex
string first? If so, how should I do that?
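
A minimal sketch of the state-map idea, written in Python rather than as an actual UDA: the state maps each blob's hex string to a count, and a final step reports whether any count exceeds one. `state_func`, `final_func` and `check_duplicate` are hypothetical names that only mirror the state-function and final-function roles of an aggregate. As a side note, `java.nio.ByteBuffer.equals` is documented to compare the remaining bytes rather than references, so the hex conversion may not be strictly needed; and with 10 million rows the state itself would be large, which is worth weighing.

```python
def state_func(state: dict, blob: bytes) -> dict:
    # Key the state by the blob's hex string so equality is by content.
    key = blob.hex()
    state[key] = state.get(key, 0) + 1
    return state


def final_func(state: dict) -> bool:
    # True when at least one blob value was seen more than once.
    return any(count > 1 for count in state.values())


def check_duplicate(blobs) -> bool:
    # Fold every blob into the state, then evaluate the final function.
    state = {}
    for blob in blobs:
        state = state_func(state, blob)
    return final_func(state)
```

For example, `check_duplicate([b"\x01\x02", b"\x03", b"\x01\x02"])` reports a duplicate, while a list of distinct blobs does not.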

Thanks,
kant

Re: is there a query to find out the largest partition in a table?

2017-02-18 Thread Kant Kodali

*I did the following. Now I wonder if this is one node or multiple nodes?
Does this value really tell me I have a large partition?*

nodetool cfhistograms test hello // This reports the max partition size is
10GB

nodetool tablestats test.hello // This also reports Compacted partition
maximum bytes: 10299432635


Percentile  SSTables  Write Latency  Read Latency  Partition Size  Cell Count
                           (micros)      (micros)         (bytes)
50%             0.00          20.50         51.01       155469300      654949
75%             0.00          24.60         88.15      4139110981    17436917
95%             6.00          29.52     155469.30     10299432635    43388628
98%             6.00          42.51     668489.53     10299432635    43388628
99%             6.00          61.21     802187.44     10299432635    43388628
Min             0.00           5.72          9.89             125           5
Max             6.00      668489.53    8582860.53     10299432635    43388628
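
A small sketch for reading the figure above, assuming the `nodetool tablestats` output contains the "Compacted partition maximum bytes" line exactly as quoted in this message. `max_partition_bytes` and `to_gib` are hypothetical helpers. Note that nodetool reports statistics for the node it is run against, so the value would need to be collected from each node to get a cluster-wide maximum.

```python
import re

# The line as reported earlier in this message.
SAMPLE = "Compacted partition maximum bytes: 10299432635"


def max_partition_bytes(tablestats_output: str) -> int:
    # Find the "Compacted partition maximum bytes" line and return its value.
    match = re.search(r"Compacted partition maximum bytes:\s*(\d+)", tablestats_output)
    if match is None:
        raise ValueError("no 'Compacted partition maximum bytes' line found")
    return int(match.group(1))


def to_gib(num_bytes: int) -> float:
    # Convert a byte count to gibibytes (2**30 bytes).
    return num_bytes / 2**30


largest = max_partition_bytes(SAMPLE)
print(f"{largest} bytes is about {to_gib(largest):.2f} GiB")
```

Running the same parse over the output from every node and taking the largest result would be one way to answer the "one node or multiple nodes" question.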

On Sat, Feb 18, 2017 at 12:28 AM, Kant Kodali <k...@peernova.com> wrote:

> is there a query to find out the largest partition in a table? Does the
> query below give me the largest partition?
>
> select max(mean_partition_size) from size_estimates ;
>
> Thanks,
> Kant
>

is there a query to find out the largest partition in a table?

2017-02-18 Thread Kant Kodali

Is there a query to find out the largest partition in a table? Does the
query below give me the largest partition?

select max(mean_partition_size) from size_estimates ;

Thanks,
Kant

Re: How does cassandra achieve Linearizability?

2017-02-16 Thread Kant Kodali

@Ariel Weisberg EPaxos looks very interesting, as it looks like it doesn't
need any designated leader for C*. But I am assuming the Paxos that is
implemented today for LWTs requires leader election. If so, don't we
need to have an odd number of nodes, racks or DCs to satisfy the N = 2F + 1
constraint to tolerate F failures? I understand it is not needed when not
using LWTs, since Cassandra is a masterless system.
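
The N = 2F + 1 relationship can be checked with a few lines of arithmetic. A sketch assuming a simple majority quorum of floor(N/2) + 1 replicas; `quorum` and `tolerated_failures` are illustrative names. It suggests an even replica count is not forbidden, it just tolerates no more failures than the odd count directly below it.

```python
def quorum(n: int) -> int:
    # Majority quorum: more than half of the replicas.
    return n // 2 + 1


def tolerated_failures(n: int) -> int:
    # Replicas that can be down while a majority is still reachable.
    return n - quorum(n)


# Map each replica count to (quorum size, failures tolerated).
table = {n: (quorum(n), tolerated_failures(n)) for n in range(1, 8)}
```

With this, 3 and 4 replicas both tolerate one failure, while 5 replicas tolerate two.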

On Fri, Feb 10, 2017 at 10:25 AM, Kant Kodali <k...@peernova.com> wrote:

> Thanks Ariel! Yes, I knew there are many variations and optimizations of
> Paxos. I just wanted to see if we had any plans for improving the existing
> Paxos implementation, and it is great to see the work is in progress! I
> am going to follow that ticket and read up on the references it points to.
>
>
> On Fri, Feb 10, 2017 at 8:33 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:
>
>> Hi,
>>
>> Cassandra's implementation of Paxos doesn't implement many optimizations
>> that would drastically improve throughput and latency. You need consensus,
>> but it doesn't have to be exorbitantly expensive and fall over under any
>> kind of contention.
>>
>> For instance you could implement EPaxos
>> (https://issues.apache.org/jira/browse/CASSANDRA-6246),
>> batch multiple operations into the same Paxos round, have an affinity for a
>> specific proposer for a specific partition, implement asynchronous commit,
>> use a more efficient implementation of the Paxos log, and maybe other
>> things.
>>
>> Ariel
>>
>>
>> On Fri, Feb 10, 2017, at 05:31 AM, Benjamin Roth wrote:
>>
>> Hi Kant,
>>
>> If you read the published papers about Paxos, you will most probably
>> recognize that there is no way to "do it better". This is a conceptual
>> thing due to the nature of distributed systems + the CAP theorem.
>> If you want A+P in the triangle, then C is very expensive. CS is made for
>> A+P mostly with tunable C. In ACID databases this is a completely different
>> thing as they are mostly either not partition tolerant, not highly
>> available or not scalable (in a distributed manner, not speaking of
>> "monolithic super servers").
>>
>> There is no free lunch ...
>>
>>
>> 2017-02-10 11:09 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>
>> "That’s the safety blanket everyone wants but is extremely expensive,
>> especially in Cassandra."
>>
>> yes LWT's are expensive. Are there any plans to make this better?
>>
>> On Fri, Feb 10, 2017 at 12:17 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>> Hi Jon,
>>
>> Thanks a lot for your response. I am well aware that LWW != LWT, but I
>> was talking more in terms of LWW with respect to LWTs, which I believe
>> you answered. So thanks much!
>>
>>
>> kant
>>
>>
>> On Thu, Feb 9, 2017 at 6:01 PM, Jon Haddad <jonathan.had...@gmail.com>
>> wrote:
>>
>> LWT != Last Write Wins.  They are totally different.
>>
>> LWTs give you (assuming you also read at SERIAL) “atomic consistency”,
>> meaning you are able to perform operations atomically and in isolation.
>> That’s the safety blanket everyone wants but is extremely expensive,
>> especially in Cassandra.  The lightweight part, btw, may be a little
>> optimistic, especially if a key is under contention.  With regard to the
>> “last write” part you’re asking about - w/ LWT Cassandra provides the
>> timestamp and manages it as part of the ballot, and it always is
>> increasing.  See 
>> org.apache.cassandra.service.ClientState#getTimestampForPaxos.
>> From the code:
>>
>>  * Returns a timestamp suitable for paxos given the timestamp of the last
>> known commit (or in progress update).
>>  * Paxos ensures that the timestamp it uses for commits respects the
>> serial order of those commits. It does so
>>  * by having each replica reject any proposal whose timestamp is not
>> strictly greater than the last proposal it
>>  * accepted. So in practice, which timestamp we use for a given proposal
>> doesn't affect correctness but it does
>>  * affect the chance of making progress (if we pick a timestamp lower
>> than what has been proposed before, our
>>  * new proposal will just get rejected).
>>
>> Effectively paxos removes the ability to use custom timestamps and
>> addresses clock variance by rejecting ballots with timestamps less than
>> what was last seen.  You can learn more by reading

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2017-02-11 Thread Kant Kodali

Saw this one today...

https://news.ycombinator.com/item?id=13624062

On Tue, Jan 3, 2017 at 6:27 AM, Eric Evans 
wrote:

> On Mon, Jan 2, 2017 at 2:26 PM, Edward Capriolo 
> wrote:
> > Let's be clear:
> > What I am saying is to avoid being loose with the word "free"
> >
> > https://en.wikipedia.org/wiki/Free_software_license
> >
> > Many things with the JVM are free too. Most importantly it is free to use.
> >
> > https://www.java.com/en/download/faq/distribution.xml
> >
> > As it relates to this conversation: I am not aware of anyone running
> > Cassandra that has modified upstream JVM to make Cassandra run
> > better/differently *. Thus the license around the Oracle JVM is roughly
> > meaningless to the user/developer of cassandra.
> >
> > * The only group I know that took an action to modify upstream was Acunu.
> > They had released a modified Linux Kernel with a modified Apache Cassandra.
> > http://cloudtweaks.com/2011/02/data-storage-startup-acunu-raises-3-6-million-to-launch-its-first-product/.
> > That product no longer exists.
> >
> > "I don't know how to read any of this.  It sounds like you're saying that a
> > JVM is something that cannot be produced as a Free Software project,"
> >
> > What I am saying is something like the JVM "could" be produced as a "free
> > software project". However, the argument that I was making is that the
> > popular viable languages/(including vms or runtime to use them) today
> > including Java, C#, Go, Swift are developed by the largest tech companies in
> > the world, and as such I do believe a platform would be viable. Specifically
> > I believe without Oracle driving Java OpenJDK would not be viable.
> >
> > There are two specific reasons.
> > 1) I do not see large costly multi-year initiatives like G1 happening
> > 2) Without the guidance/leadership that Sun/Oracle provide, I do not see new
> > features that change the language, like lambdas and try multi-catch,
> > happening in a sane way.
> >
> > I expanded upon #2 by discussing my experience with standards like C++ 11,
> > 14, 17 and attempting to take compiling, working lambda code on Linux GCC to
> > Microsoft Visual Studio and having it not compile. In my opinion, Java only
> > wins because as a platform it is very portable as both source and binary
> > code. Without leadership on that front I believe that over time the language
> > would suffer.
>
> I realize that you're trying to be pragmatic about all of this, but
> what I don't think you realize, is that so am I.
>
> Java could change hands at any time (it has once already), or Oracle
> leadership could decide to go in a different direction.  Imagine for
> example that they relicensed it to exclude use by orientation or
> religion, Cassandra would implicitly carry these restrictions as well.
> Imagine that they decided to provide a back-door to the NSA, Cassandra
> would then also contain such a back-door.  These might sound
> hypothetical, but there is plenty of precedent here.
>
> OpenJDK benefits from the same resources and leadership from Oracle
> that you value, but is licensed and distributed in a way that
> safeguards us from a day when Oracle becomes less benevolent, (if that
> were to happen, some other giant company could assume the mantle of
> leadership).
>
> All I'm really suggesting is that we at least soften our requirement
> on the Oracle JVM, and perhaps perform some test runs in CI against
> OpenJDK.  Actively discouraging people from using the Free Software
> alternative here, one that is working well for many, isn't the
> behavior I'd normally expect from a Free Software project.
>
> --
> Eric Evans
> john.eric.ev...@gmail.com
>

Re: How does cassandra achieve Linearizability?

2017-02-10 Thread Kant Kodali

Thanks Ariel! Yes, I knew there are many variations and optimizations of
Paxos. I just wanted to see if we had any plans for improving the existing
Paxos implementation, and it is great to see the work is in progress! I
am going to follow that ticket and read up on the references it points to.


On Fri, Feb 10, 2017 at 8:33 AM, Ariel Weisberg <ar...@weisberg.ws> wrote:

> Hi,
>
> Cassandra's implementation of Paxos doesn't implement many optimizations
> that would drastically improve throughput and latency. You need consensus,
> but it doesn't have to be exorbitantly expensive and fall over under any
> kind of contention.
>
> For instance you could implement EPaxos
> (https://issues.apache.org/jira/browse/CASSANDRA-6246),
> batch multiple operations into the same Paxos round, have an affinity for a
> specific proposer for a specific partition, implement asynchronous commit,
> use a more efficient implementation of the Paxos log, and maybe other
> things.
>
> Ariel
>
>
> On Fri, Feb 10, 2017, at 05:31 AM, Benjamin Roth wrote:
>
> Hi Kant,
>
> If you read the published papers about Paxos, you will most probably
> recognize that there is no way to "do it better". This is a conceptual
> thing due to the nature of distributed systems + the CAP theorem.
> If you want A+P in the triangle, then C is very expensive. CS is made for
> A+P mostly with tunable C. In ACID databases this is a completely different
> thing as they are mostly either not partition tolerant, not highly
> available or not scalable (in a distributed manner, not speaking of
> "monolithic super servers").
>
> There is no free lunch ...
>
>
> 2017-02-10 11:09 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
> "That’s the safety blanket everyone wants but is extremely expensive,
> especially in Cassandra."
>
> yes LWT's are expensive. Are there any plans to make this better?
>
> On Fri, Feb 10, 2017 at 12:17 AM, Kant Kodali <k...@peernova.com> wrote:
>
> Hi Jon,
>
> Thanks a lot for your response. I am well aware that LWW != LWT, but I
> was talking more in terms of LWW with respect to LWTs, which I believe
> you answered. So thanks much!
>
>
> kant
>
>
> On Thu, Feb 9, 2017 at 6:01 PM, Jon Haddad <jonathan.had...@gmail.com>
> wrote:
>
> LWT != Last Write Wins.  They are totally different.
>
> LWTs give you (assuming you also read at SERIAL) “atomic consistency”,
> meaning you are able to perform operations atomically and in isolation.
> That’s the safety blanket everyone wants but is extremely expensive,
> especially in Cassandra.  The lightweight part, btw, may be a little
> optimistic, especially if a key is under contention.  With regard to the
> “last write” part you’re asking about - w/ LWT Cassandra provides the
> timestamp and manages it as part of the ballot, and it always is
> increasing.  See 
> org.apache.cassandra.service.ClientState#getTimestampForPaxos.
> From the code:
>
>  * Returns a timestamp suitable for paxos given the timestamp of the last
> known commit (or in progress update).
>  * Paxos ensures that the timestamp it uses for commits respects the
> serial order of those commits. It does so
>  * by having each replica reject any proposal whose timestamp is not
> strictly greater than the last proposal it
>  * accepted. So in practice, which timestamp we use for a given proposal
> doesn't affect correctness but it does
>  * affect the chance of making progress (if we pick a timestamp lower than
> what has been proposed before, our
>  * new proposal will just get rejected).
>
> Effectively paxos removes the ability to use custom timestamps and
> addresses clock variance by rejecting ballots with timestamps less than
> what was last seen.  You can learn more by reading through the other
> comments and code in that file.
>
> Last write wins is a free for all that guarantees you *nothing* except the
> timestamp is used as a tiebreaker.  Here we acknowledge things like the
> speed of light as being a real problem that isn’t going away anytime soon.
> This problem is sometimes addressed with event sourcing rather than
> mutating in place.
>
> Hope this helps.
>
>
> Jon
>
>
>
>
> On Feb 9, 2017, at 5:21 PM, Kant Kodali <k...@peernova.com> wrote:
>
> @Justin I read this article
> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0. And it clearly says
> Linearizable consistency can be achieved with LWT's.  so should I assume
> the Linearizability in the context of the above article is possible with
> LWT's and synchronization of clocks thro

Re: How does cassandra achieve Linearizability?

2017-02-10 Thread Kant Kodali

"That’s the safety blanket everyone wants but is extremely expensive,
especially in Cassandra."

yes LWT's are expensive. Are there any plans to make this better?

On Fri, Feb 10, 2017 at 12:17 AM, Kant Kodali <k...@peernova.com> wrote:

> Hi Jon,
>
> Thanks a lot for your response. I am well aware that LWW != LWT, but I
> was talking more in terms of LWW with respect to LWTs, which I believe
> you answered. So thanks much!
>
> kant
>
> On Thu, Feb 9, 2017 at 6:01 PM, Jon Haddad <jonathan.had...@gmail.com>
> wrote:
>
>> LWT != Last Write Wins.  They are totally different.
>>
>> LWTs give you (assuming you also read at SERIAL) “atomic consistency”,
>> meaning you are able to perform operations atomically and in isolation.
>> That’s the safety blanket everyone wants but is extremely expensive,
>> especially in Cassandra.  The lightweight part, btw, may be a little
>> optimistic, especially if a key is under contention.  With regard to the
>> “last write” part you’re asking about - w/ LWT Cassandra provides the
>> timestamp and manages it as part of the ballot, and it always is
>> increasing.  See 
>> org.apache.cassandra.service.ClientState#getTimestampForPaxos.
>> From the code:
>>
>>  * Returns a timestamp suitable for paxos given the timestamp of the last
>> known commit (or in progress update).
>>  * Paxos ensures that the timestamp it uses for commits respects the
>> serial order of those commits. It does so
>>  * by having each replica reject any proposal whose timestamp is not
>> strictly greater than the last proposal it
>>  * accepted. So in practice, which timestamp we use for a given proposal
>> doesn't affect correctness but it does
>>  * affect the chance of making progress (if we pick a timestamp lower
>> than what has been proposed before, our
>>  * new proposal will just get rejected).
>>
>> Effectively paxos removes the ability to use custom timestamps and
>> addresses clock variance by rejecting ballots with timestamps less than
>> what was last seen.  You can learn more by reading through the other
>> comments and code in that file.
>>
>> Last write wins is a free for all that guarantees you *nothing* except
>> the timestamp is used as a tiebreaker.  Here we acknowledge things like the
>> speed of light as being a real problem that isn’t going away anytime soon.
>> This problem is sometimes addressed with event sourcing rather than
>> mutating in place.
>>
>> Hope this helps.
>>
>> Jon
>>
>>
>> On Feb 9, 2017, at 5:21 PM, Kant Kodali <k...@peernova.com> wrote:
>>
>> @Justin I read this article:
>> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0
>> It clearly says linearizable consistency can be achieved with LWTs. So
>> should I assume the linearizability in the context of the above article is
>> possible with LWTs and synchronization of clocks through ntpd? Because LWTs
>> also follow Last Write Wins, don't they? Another question: do most
>> production clusters set up ntpd? If so, how long does it take to sync?
>> Any idea?
>>
>> @Michael Shuler Are you referring to something like TrueTime, as in
>> https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf?
>> Actually, I have never heard of setting up GPS modules and how that can be
>> helpful. Let me research that, but good point.
>>
>> On Thu, Feb 9, 2017 at 5:09 PM, Michael Shuler <mich...@pbandjelly.org>
>> wrote:
>>
>>> If you require the best precision you can get, setting up a pair of
>>> stratum 1 ntpd masters in each data center location with GPS modules
>>> is not terribly complex. Low latency and jitter on servers you manage.
>>> 140ms is a long way away network-wise, and I would suggest that was a
>>> poor choice of upstream (probably stratum 2 or 3) source.
>>>
>>> As Jonathan mentioned, there's no guarantee from Cassandra, but if you
>>> need as close as you can get, you'll probably need to do it yourself.
>>>
>>> (I run several stratum 2 ntpd servers for pool.ntp.org)
>>>
>>> --
>>> Kind regards,
>>> Michael
>>>
>>> On 02/09/2017 06:47 PM, Kant Kodali wrote:
>>> > Hi Justin,
>>> >
>>> > There are a bunch of issues w.r.t. synchronization of clocks when we
>>> > used ntpd. Also the time it took to sync the clocks was approx 140ms
>>> > (don't quote me on it though because it is reported by our dev

Re: How does cassandra achieve Linearizability?

2017-02-10 Thread Kant Kodali

Hi Jon,

Thanks a lot for your response. I am well aware that LWW != LWT, but I
was talking more in terms of LWW with respect to LWTs, which I believe
you answered. So thanks much!

kant

On Thu, Feb 9, 2017 at 6:01 PM, Jon Haddad <jonathan.had...@gmail.com>
wrote:

> LWT != Last Write Wins.  They are totally different.
>
> LWTs give you (assuming you also read at SERIAL) “atomic consistency”,
> meaning you are able to perform operations atomically and in isolation.
> That’s the safety blanket everyone wants but is extremely expensive,
> especially in Cassandra.  The lightweight part, btw, may be a little
> optimistic, especially if a key is under contention.  With regard to the
> “last write” part you’re asking about - w/ LWT Cassandra provides the
> timestamp and manages it as part of the ballot, and it always is
> increasing.  See 
> org.apache.cassandra.service.ClientState#getTimestampForPaxos.
> From the code:
>
>  * Returns a timestamp suitable for paxos given the timestamp of the last
> known commit (or in progress update).
>  * Paxos ensures that the timestamp it uses for commits respects the
> serial order of those commits. It does so
>  * by having each replica reject any proposal whose timestamp is not
> strictly greater than the last proposal it
>  * accepted. So in practice, which timestamp we use for a given proposal
> doesn't affect correctness but it does
>  * affect the chance of making progress (if we pick a timestamp lower than
> what has been proposed before, our
>  * new proposal will just get rejected).
>
> Effectively paxos removes the ability to use custom timestamps and
> addresses clock variance by rejecting ballots with timestamps less than
> what was last seen.  You can learn more by reading through the other
> comments and code in that file.
>
> Last write wins is a free for all that guarantees you *nothing* except the
> timestamp is used as a tiebreaker.  Here we acknowledge things like the
> speed of light as being a real problem that isn’t going away anytime soon.
> This problem is sometimes addressed with event sourcing rather than
> mutating in place.
>
> Hope this helps.
>
> Jon
>
>
> On Feb 9, 2017, at 5:21 PM, Kant Kodali <k...@peernova.com> wrote:
>
> @Justin I read this article
> http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0.
> And it clearly says
> Linearizable consistency can be achieved with LWT's.  so should I assume
> the Linearizability in the context of the above article is possible with
> LWT's and synchronization of clocks through ntpd ? because LWT's also
> follow Last Write Wins. isn't it? Also another question does most of the
> production clusters do setup ntpd? If so what is the time it takes to sync?
> any idea
>
> @Micheal Schuler Are you referring to  something like true time as in
> https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf?
> Actually I never heard of setting up GPS
> modules and how that can be helpful. Let me research on that but good point.
>
> On Thu, Feb 9, 2017 at 5:09 PM, Michael Shuler <mich...@pbandjelly.org>
> wrote:
>
>> If you require the best precision you can get, setting up a pair of
>> stratum 1 ntpd masters in each data center location with a GPS modules
>> is not terribly complex. Low latency and jitter on servers you manage.
>> 140ms is a long way away network-wise, and I would suggest that was a
>> poor choice of upstream (probably stratum 2 or 3) source.
>>
>> As Jonathan mentioned, there's no guarantee from Cassandra, but if you
>> need as close as you can get, you'll probably need to do it yourself.
>>
>> (I run several stratum 2 ntpd servers for pool.ntp.org)
>>
>> --
>> Kind regards,
>> Michael
>>
>> On 02/09/2017 06:47 PM, Kant Kodali wrote:
>> > Hi Justin,
>> >
>> > There are bunch of issues w.r.t to synchronization of clocks when we
>> > used ntpd. Also the time it took to sync the clocks was approx 140ms
>> > (don't quote me on it though because it is reported by our devops :)
>> >
>> > we have multiple clients (for example bunch of micro services are
>> > reading from Cassandra) I am not sure how one can achieve
>> > Linearizability by setting timestamps on the clients ? since there is no
>> > total ordering across multiple clients.
>> >
>> > Thanks!
>> >
>> >
>> > On Thu, Feb 9, 2017 at 4:16 PM, Justin Cameron <jus...@instaclustr.com
>> > <mailto:jus...@instaclustr.com>> wrote:
>> >
>> > Hi Kant,
>> >
>> > Clock synchronization is important -

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Kant Kodali

@Justin I read this article
http://www.datastax.com/dev/blog/lightweight-transactions-in-cassandra-2-0.
It clearly says linearizable consistency can be achieved with LWTs. So
should I assume that linearizability, in the context of that article, is
possible with LWTs plus synchronization of clocks through ntpd? Because
LWTs also follow Last Write Wins, don't they? Another question: do most
production clusters set up ntpd? If so, how long does it take to sync? Any
idea?

@Michael Shuler Are you referring to something like TrueTime as in
https://static.googleusercontent.com/media/research.google.com/en//archive/spanner-osdi2012.pdf?
Actually, I had never heard of setting up GPS modules or how that can be
helpful. Let me research that, but good point.

On Thu, Feb 9, 2017 at 5:09 PM, Michael Shuler <mich...@pbandjelly.org>
wrote:

> If you require the best precision you can get, setting up a pair of
> stratum 1 ntpd masters in each data center location with a GPS modules
> is not terribly complex. Low latency and jitter on servers you manage.
> 140ms is a long way away network-wise, and I would suggest that was a
> poor choice of upstream (probably stratum 2 or 3) source.
>
> As Jonathan mentioned, there's no guarantee from Cassandra, but if you
> need as close as you can get, you'll probably need to do it yourself.
>
> (I run several stratum 2 ntpd servers for pool.ntp.org)
>
> --
> Kind regards,
> Michael
>
> On 02/09/2017 06:47 PM, Kant Kodali wrote:
> > Hi Justin,
> >
> > There are bunch of issues w.r.t to synchronization of clocks when we
> > used ntpd. Also the time it took to sync the clocks was approx 140ms
> > (don't quote me on it though because it is reported by our devops :)
> >
> > we have multiple clients (for example bunch of micro services are
> > reading from Cassandra) I am not sure how one can achieve
> > Linearizability by setting timestamps on the clients ? since there is no
> > total ordering across multiple clients.
> >
> > Thanks!
> >
> >
> > On Thu, Feb 9, 2017 at 4:16 PM, Justin Cameron <jus...@instaclustr.com
> > <mailto:jus...@instaclustr.com>> wrote:
> >
> > Hi Kant,
> >
> > Clock synchronization is important - you should ensure that ntpd is
> > properly configured on all nodes. If your particular use case is
> > especially sensitive to out-of-order mutations it is possible to set
> > timestamps on the client side using the
> > drivers.
> > https://docs.datastax.com/en/developer/java-driver/3.1/manual/query_timestamps/
> >
> > We use our own NTP cluster to reduce clock drift as much as
> > possible, but public NTP servers are good enough for most
> > uses.
> > https://www.instaclustr.com/blog/2015/11/05/apache-cassandra-synchronization/
> >
> > Cheers,
> > Justin
> >
> > On Thu, 9 Feb 2017 at 16:09 Kant Kodali <k...@peernova.com
> > <mailto:k...@peernova.com>> wrote:
> >
> > How does Cassandra achieve Linearizability with “Last write
> > wins” (conflict resolution methods based on time-of-day clocks) ?
> >
> > Relying on synchronized clocks are almost certainly
> > non-linearizable, because clock timestamps cannot be guaranteed
> > to be consistent with actual event ordering due to clock skew.
> > isn't it?
> >
> > Thanks!
> >
> > --
> >
> > Justin Cameron
> >
> > Senior Software Engineer | Instaclustr
> >
>
>

Re: How does cassandra achieve Linearizability?

2017-02-09 Thread Kant Kodali

Hi Justin,

There are a bunch of issues w.r.t. synchronization of clocks when we used
ntpd. Also, the time it took to sync the clocks was approx. 140ms (don't
quote me on it though, because it was reported by our devops :)

We have multiple clients (for example, a bunch of microservices reading
from Cassandra). I am not sure how one can achieve linearizability by
setting timestamps on the clients, since there is no total ordering across
multiple clients.

Thanks!


On Thu, Feb 9, 2017 at 4:16 PM, Justin Cameron <jus...@instaclustr.com>
wrote:

> Hi Kant,
>
> Clock synchronization is important - you should ensure that ntpd is
> properly configured on all nodes. If your particular use case is especially
> sensitive to out-of-order mutations it is possible to set timestamps on the
> client side using the drivers.
> https://docs.datastax.com/en/developer/java-driver/3.1/manual/query_timestamps/
>
> We use our own NTP cluster to reduce clock drift as much as possible, but
> public NTP servers are good enough for most uses.
> https://www.instaclustr.com/blog/2015/11/05/apache-cassandra-synchronization/
>
> Cheers,
> Justin
>
> On Thu, 9 Feb 2017 at 16:09 Kant Kodali <k...@peernova.com> wrote:
>
>> How does Cassandra achieve Linearizability with “Last write wins”
>> (conflict resolution methods based on time-of-day clocks) ?
>>
>> Relying on synchronized clocks are almost certainly non-linearizable,
>> because clock timestamps cannot be guaranteed to be consistent with actual
>> event ordering due to clock skew. isn't it?
>>
>> Thanks!
>>
> --
>
> Justin Cameron
>
> Senior Software Engineer | Instaclustr
>
>

How does cassandra achieve Linearizability?

2017-02-09 Thread Kant Kodali

How does Cassandra achieve linearizability with “Last write wins” (conflict
resolution based on time-of-day clocks)?

Relying on synchronized clocks is almost certainly non-linearizable,
because clock timestamps cannot be guaranteed to be consistent with actual
event ordering due to clock skew. Isn't it?

Thanks!
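
To make the concern concrete, here is a minimal sketch of a last-write-wins register, assuming the only conflict rule is "highest timestamp wins". The class name is hypothetical and the tie handling is simplified; it is not how Cassandra stores cells, only an illustration of the skew problem.

```java
// Toy last-write-wins register: the write with the highest timestamp wins,
// regardless of the order in which the writes really happened.
// LwwRegisterSketch is a hypothetical name, not a Cassandra class.
final class LwwRegisterSketch {
    private String value = null;
    private long timestampMicros = Long.MIN_VALUE;

    void write(String newValue, long writeTimestampMicros) {
        // Strictly greater wins; equal timestamps are ignored in this sketch.
        if (writeTimestampMicros > timestampMicros) {
            value = newValue;
            timestampMicros = writeTimestampMicros;
        }
    }

    String read() {
        return value;
    }
}
```

Suppose client A's clock runs 5 ms fast. A writes "a" first and stamps it 5000; client B writes "b" two milliseconds later in real time but stamps it 2000. A subsequent read returns "a", so the write that actually happened last is lost, which is the non-linearizable outcome described above.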

If reading from materialized view with a consistency level of quorum am I guaranteed to have the most recent view?

2017-02-09 Thread Kant Kodali

If I read from a materialized view with a consistency level of QUORUM, am I
guaranteed to see the most recent view? In other words, is the w + r > n
contract maintained for MVs as well, for both reads and writes?

Thanks!
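
For reference, the w + r > n arithmetic for a single table can be written out as below. This is just the overlap calculation with a hypothetical helper class; it does not say whether a view replica has already applied the update derived from a base-table write, which is the part the question is really about.

```java
// Quorum overlap arithmetic for a single table with n replicas.
// QuorumMathSketch is a hypothetical helper, not part of any driver.
final class QuorumMathSketch {
    // Majority of n replicas.
    static int quorum(int n) {
        return n / 2 + 1;
    }

    // True when every read set must intersect every write set.
    static boolean overlaps(int w, int r, int n) {
        return w + r > n;
    }
}
```

With n = 3, QUORUM is 2 for both reads and writes, and 2 + 2 > 3 holds; with w = r = 1 it does not.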

Re: AW: Why does CockroachDB github website say Cassandra has noAvailability on datacenter failure?

2017-02-07 Thread Kant Kodali

https://github.com/cockroachdb/cockroach/commit/f46a547827d3439b57baa5c3a11f8f9ad2d8b153

On Tue, Feb 7, 2017 at 3:20 PM, Kant Kodali <k...@peernova.com> wrote:

> LOL They took down that image finally!! But I would still keep an eye on
> what kind of fake benchmarks they might come up with.
>
> On Tue, Feb 7, 2017 at 7:11 AM, Amit Trivedi <tria...@gmail.com> wrote:
>
>> It indeed is a marketing gimmick. By clubbing Cassandra with likes of
>> HBase that favors consistency over availability, points under cons section
>> are all true, just not true when applied to any one database from that
>> group.
>>
>> Thanks and Regards
>>
>> Amit Trivedi
>>
>> On Feb 7, 2017, 7:32 AM -0500, j.kes...@enercast.de, wrote:
>>
>> Deeper inside there is a diagram:
>>
>> https://raw.githubusercontent.com/cockroachdb/cockroach/master/docs/media/sql-nosql-newsql.png
>>
>> They compare to NoSQL along with Riak, HBase and Cassandra.
>>
>> Of course you CAN have a Cassandra cluster which is not fully available
>> with loss of a dc nor consistent.
>>
>> Marketing
>>
>> Sent from my Windows 10 Phone
>>
>> *From:* DuyHai Doan <doanduy...@gmail.com>
>> *Sent:* Tuesday, 7 February 2017 11:53
>> *To:* d...@cassandra.apache.org
>> *Cc:* user@cassandra.apache.org
>> *Subject:* Re: Why does CockroachDB github website say Cassandra has
>> noAvailability on datacenter failure?
>>
>> The link you posted doesn't say anything about Cassandra
>>
>> On 7 Feb 2017 11:41, "Kant Kodali" <k...@peernova.com> wrote:
>>
>> Why does CockroachDB github website say Cassandra has no Availability on
>> datacenter failure?
>>
>> https://github.com/cockroachdb/cockroach
>>
>>
>>
>>
>

Re: Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali

lol. But seriously, are they even allowed to say something that is not true
about another product?

On Tue, Feb 7, 2017 at 4:05 AM, kurt greaves  wrote:

> Marketing never lies. Ever
>

Why does CockroachDB github website say Cassandra has no Availability on datacenter failure?

2017-02-07 Thread Kant Kodali

Why does the CockroachDB GitHub page say Cassandra has no availability on
datacenter failure?

https://github.com/cockroachdb/cockroach

Re: quick question

2017-02-01 Thread Kant Kodali

Adding dev only for this thread.

On Wed, Feb 1, 2017 at 4:39 AM, Kant Kodali <k...@peernova.com> wrote:

> What is the difference between accepting a value and committing a value?
>
>
>
> On Wed, Feb 1, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Hi,
>>
>> Thanks for the response. I finished watching this video but I still got
>> few questions.
>>
>> 1) The speaker seems to suggest that there are different consistency
>> levels being used in different phases of paxos protocol. If so, what is
>> right consistency level to set on these phases?
>>
>> 2) Right now, we just set consistency level as QUORUM at the global level
>> and I dont think we ever change it so in this case what would be the
>> consistency levels being used in different phases.
>>
>> 3) The fact that one should think about reading before the commit phase
>> or after the commit phase (but not any other phase) sounds like there is
>> something special about commit phase and what is that? when I set the
>> QUORUM level consistency at global level Does the commit phase happen right
>> after accept  phase or no? or irrespective of the consistency level when
>> does the commit phase happen anyway? and what happens during the commit
>> phase?
>>
>>
>> Thanks,
>> kant
>>
>>
>> On Wed, Feb 1, 2017 at 3:30 AM, Alain RODRIGUEZ <arodr...@gmail.com>
>> wrote:
>>
>>> Hi,
>>>
>>> I believe that this talk from Christopher Batey at the Cassandra Summit
>>> 2016 might answer most of your questions around LWT:
>>> https://www.youtube.com/watch?v=wcxQM3ZN20c
>>>
>>> He explains a lot of stuff including consistency considerations. My
>>> understanding is that the quorum read can only see the data written using
>>> LWT after the commit phase. A SERIAL Read would see it (video, around
>>> 23:40).
>>>
>>> Here are the slides as well:
>>> http://fr.slideshare.net/DataStax/light-weight-transactions-under-stress-christopher-batey-the-last-pickle-cassandra-summit-2016
>>>
>>> Let us know if you still have questions after watching this (about 35
>>> minutes).
>>>
>>> C*heers,
>>> ---
>>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>>> France
>>>
>>> The Last Pickle - Apache Cassandra Consulting
>>> http://www.thelastpickle.com
>>>
>>> 2017-02-01 10:57 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>>
>>>> When you initiate a LWT(write) and do a QUORUM read is there a chance
>>>> that one might not see the LWT write ? If so, can someone explain a bit
>>>> more?
>>>>
>>>> Thanks!
>>>>
>>>
>>>
>>
>

Re: quick question

2017-02-01 Thread Kant Kodali

What is the difference between accepting a value and committing a value?
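
One way to picture the distinction, going by the summary quoted below (a QUORUM read only sees LWT data after the commit phase, while a SERIAL read would see it), is the toy model here. Every name in it is hypothetical, and it collapses the real protocol: quorums, ballots and failures are left out, so treat it as a sketch of the idea rather than of Cassandra's code.

```java
import java.util.ArrayList;
import java.util.List;

// Toy model of one partition during an LWT. All names are hypothetical.
final class LwtPhasesSketch {
    // What one replica holds: Paxos state (accepted) and normal data (committed).
    static final class Replica {
        String accepted;   // recorded by the accept phase, not yet visible to normal reads
        String committed;  // applied by the commit phase, visible to normal reads
    }

    final List<Replica> replicas = new ArrayList<>();

    LwtPhasesSketch(int n) {
        for (int i = 0; i < n; i++) {
            replicas.add(new Replica());
        }
    }

    // Accept phase: replicas record the proposed value in their Paxos state.
    void accept(String value) {
        for (Replica r : replicas) {
            r.accepted = value;
        }
    }

    // Commit phase: the accepted value is applied to the normal data.
    void commit() {
        for (Replica r : replicas) {
            if (r.accepted != null) {
                r.committed = r.accepted;
                r.accepted = null;
            }
        }
    }

    // A normal (for example QUORUM) read only looks at committed data.
    String normalRead() {
        return replicas.get(0).committed;
    }

    // A SERIAL read first finishes any in-progress proposal, then reads.
    String serialRead() {
        commit();
        return normalRead();
    }
}
```

In this model, a value that has been accepted but not committed is invisible to `normalRead()` and becomes visible once `commit()` or a `serialRead()` has run.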



On Wed, Feb 1, 2017 at 4:25 AM, Kant Kodali <k...@peernova.com> wrote:

> Hi,
>
> Thanks for the response. I finished watching this video but I still got
> few questions.
>
> 1) The speaker seems to suggest that there are different consistency
> levels being used in different phases of paxos protocol. If so, what is
> right consistency level to set on these phases?
>
> 2) Right now, we just set consistency level as QUORUM at the global level
> and I dont think we ever change it so in this case what would be the
> consistency levels being used in different phases.
>
> 3) The fact that one should think about reading before the commit phase or
> after the commit phase (but not any other phase) sounds like there is
> something special about commit phase and what is that? when I set the
> QUORUM level consistency at global level Does the commit phase happen right
> after accept  phase or no? or irrespective of the consistency level when
> does the commit phase happen anyway? and what happens during the commit
> phase?
>
>
> Thanks,
> kant
>
>
> On Wed, Feb 1, 2017 at 3:30 AM, Alain RODRIGUEZ <arodr...@gmail.com>
> wrote:
>
>> Hi,
>>
>> I believe that this talk from Christopher Batey at the Cassandra Summit
>> 2016 might answer most of your questions around LWT:
>> https://www.youtube.com/watch?v=wcxQM3ZN20c
>>
>> He explains a lot of stuff including consistency considerations. My
>> understanding is that the quorum read can only see the data written using
>> LWT after the commit phase. A SERIAL Read would see it (video, around
>> 23:40).
>>
>> Here are the slides as well:
>> http://fr.slideshare.net/DataStax/light-weight-transactions-under-stress-christopher-batey-the-last-pickle-cassandra-summit-2016
>>
>> Let us know if you still have questions after watching this (about 35
>> minutes).
>>
>> C*heers,
>> ---
>> Alain Rodriguez - @arodream - al...@thelastpickle.com
>> France
>>
>> The Last Pickle - Apache Cassandra Consulting
>> http://www.thelastpickle.com
>>
>> 2017-02-01 10:57 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>
>>> When you initiate a LWT(write) and do a QUORUM read is there a chance
>>> that one might not see the LWT write ? If so, can someone explain a bit
>>> more?
>>>
>>> Thanks!
>>>
>>
>>
>

Re: quick question

2017-02-01 Thread Kant Kodali

Hi,

Thanks for the response. I finished watching this video but I still have a
few questions.

1) The speaker seems to suggest that different consistency levels are used
in different phases of the Paxos protocol. If so, what is the right
consistency level to set for these phases?

2) Right now, we just set the consistency level to QUORUM at the global
level and I don't think we ever change it. In this case, what consistency
levels would be used in the different phases?

3) The fact that one should think about reading before or after the commit
phase (but not any other phase) suggests there is something special about
the commit phase. What is it? When I set QUORUM consistency at the global
level, does the commit phase happen right after the accept phase or not?
Or, irrespective of the consistency level, when does the commit phase
happen, and what happens during it?

Thanks,
kant

On Wed, Feb 1, 2017 at 3:30 AM, Alain RODRIGUEZ <arodr...@gmail.com> wrote:

> Hi,
>
> I believe that this talk from Christopher Batey at the Cassandra Summit
> 2016 might answer most of your questions around LWT:
> https://www.youtube.com/watch?v=wcxQM3ZN20c
>
> He explains a lot of stuff including consistency considerations. My
> understanding is that the quorum read can only see the data written using
> LWT after the commit phase. A SERIAL Read would see it (video, around
> 23:40).
>
> Here are the slides as well:
> http://fr.slideshare.net/DataStax/light-weight-transactions-under-stress-christopher-batey-the-last-pickle-cassandra-summit-2016
>
> Let us know if you still have questions after watching this (about 35
> minutes).
>
> C*heers,
> ---
> Alain Rodriguez - @arodream - al...@thelastpickle.com
> France
>
> The Last Pickle - Apache Cassandra Consulting
> http://www.thelastpickle.com
>
> 2017-02-01 10:57 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
>> When you initiate a LWT(write) and do a QUORUM read is there a chance
>> that one might not see the LWT write ? If so, can someone explain a bit
>> more?
>>
>> Thanks!
>>
>
>

quick question

2017-02-01 Thread Kant Kodali

When you initiate an LWT (write) and then do a QUORUM read, is there a chance
that the read might not see the LWT write? If so, can someone explain a bit
more?

Thanks!

question on multi DC setup and LWT's

2017-01-23 Thread Kant Kodali

Hi Guys,

Let's say I have 2 DCs with a 3-node cluster in each DC and one replica in
each DC. I would like to maintain strong consistency and high
availability, so:

1) First of all, how do I even set up one replica in each DC?
2) What should my read and write consistency levels be when I am using LWT?
3) What is the difference between QUORUM and SERIAL when using LWT for
both reads and writes?

Thanks!
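
For question 1, per-datacenter replica counts are normally expressed with NetworkTopologyStrategy in the keyspace definition. The sketch below builds such a statement as a Java string, in the same style as other schema snippets on this list. The keyspace name and the datacenter names "DC1" and "DC2" are placeholders; the real names must match whatever the cluster's snitch reports. The helper class itself is hypothetical.

```java
import java.util.Map;

// Builds a CREATE KEYSPACE statement with a per-datacenter replica count.
// The class name, keyspace name and datacenter names are placeholders.
final class MultiDcKeyspaceSketch {
    static String createKeyspace(String keyspace, Map<String, Integer> replicasPerDc) {
        StringBuilder cql = new StringBuilder();
        cql.append("CREATE KEYSPACE IF NOT EXISTS ").append(keyspace)
           .append(" WITH replication = {'class': 'NetworkTopologyStrategy'");
        for (Map.Entry<String, Integer> e : replicasPerDc.entrySet()) {
            cql.append(", '").append(e.getKey()).append("': ").append(e.getValue());
        }
        return cql.append("};").toString();
    }

    // Total replication factor across all datacenters.
    static int totalReplicas(Map<String, Integer> replicasPerDc) {
        int sum = 0;
        for (int rf : replicasPerDc.values()) {
            sum += rf;
        }
        return sum;
    }

    // Replicas needed for QUORUM: a majority of the total.
    static int quorum(Map<String, Integer> replicasPerDc) {
        return totalReplicas(replicasPerDc) / 2 + 1;
    }
}
```

One thing the arithmetic makes visible: with one replica in each of two DCs the total is 2 and a quorum is also 2, so any operation that needs a majority of all replicas needs both DCs to be reachable. That is worth weighing against the high-availability goal before settling on this layout.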

Re: Cassandra cluster performance

2017-01-06 Thread Kant Kodali

Yeah, you should use async writes. Also, you cannot neglect data size, so
you might want to let us know what your data size is.
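
A rough sketch of what "async writes" means for a benchmark client: issue many writes without waiting for each one, while capping how many are in flight. The `fakeWrite` method below is a stand-in for an asynchronous driver call and is purely hypothetical, as are the class name, the thread pool size and the in-flight limit.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of issuing writes asynchronously with a cap on in-flight requests.
// fakeWrite stands in for an async driver call; it is not a real driver API.
final class AsyncWriteSketch {
    static final AtomicInteger completed = new AtomicInteger();

    static CompletableFuture<Void> fakeWrite(ExecutorService pool, int row) {
        return CompletableFuture.runAsync(() -> completed.incrementAndGet(), pool);
    }

    static int writeAll(int rows, int maxInFlight) {
        completed.set(0);
        ExecutorService pool = Executors.newFixedThreadPool(4);
        Semaphore inFlight = new Semaphore(maxInFlight);
        List<CompletableFuture<Void>> futures = new ArrayList<>();
        try {
            for (int row = 0; row < rows; row++) {
                // Block the producer when too many writes are pending.
                inFlight.acquireUninterruptibly();
                futures.add(fakeWrite(pool, row)
                        .whenComplete((ok, err) -> inFlight.release()));
            }
            // Wait for every outstanding write before reporting the count.
            CompletableFuture.allOf(futures.toArray(new CompletableFuture[0])).join();
        } finally {
            pool.shutdown();
        }
        return completed.get();
    }
}
```

With synchronous writes the client spends most of its time waiting on one round trip at a time, so adding nodes (and network hops) lowers throughput; keeping several requests in flight lets the cluster work in parallel.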



On Thu, Jan 5, 2017 at 2:57 PM, kurt Greaves  wrote:

> you should try switching to async writes and then perform the test. sync
> writes won't make much difference from a single node but multiple nodes
> there should be a massive difference.
>
> On 4 Jan 2017 10:05, "Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)"
>  wrote:
>
>> Hi,
>>
>>
>>
>> Our column family definition is
>>
>> "CREATE TABLE onem2m.cse(" +
>> "name TEXT PRIMARY KEY," +
>> "resourceId TEXT," +
>> ")";
>>
>> "CREATE TABLE IF NOT EXISTS onem2m.AeIdToResourceIdMapping(" +
>> "cseBaseCseId TEXT," +
>> "aeId TEXT," +
>> "resourceId TEXT," +
>> "PRIMARY KEY ((cseBaseCseId), aeId)" +
>> ")";
>>
>> "CREATE TABLE IF NOT EXISTS onem2m.Resources_" + i + "(" +
>> "CONTENT_INSTANCE_OldestId TEXT," +
>> "CONTENT_INSTANCE_LatestId TEXT," +
>> "SUBSCRIPTION_OldestId TEXT," +
>> "SUBSCRIPTION_LatestId TEXT," +
>> "resourceId TEXT PRIMARY KEY," +
>> "resourceType TEXT," +
>> "resourceName TEXT," +
>> "jsonContent TEXT," +
>> "parentId TEXT," +
>> ")";
>>
>> "CREATE TABLE IF NOT EXISTS onem2m.Children_" + i + "(" +
>> "parentResourceId TEXT," +
>> "childName TEXT," +
>> "childResourceId TEXT," +
>> "nextId TEXT," +
>> "prevId TEXT," +
>> "PRIMARY KEY ((parentResourceId), childName)" +
>> ")";
>>
>>
>>
>>
>>
>>
>>
>> *From: *Abhishek Kumar Maheshwari 
>> *Date: *Sunday, December 25, 2016 at 8:54 PM
>> *To: *"Branislav Janosik -T (bjanosik - AAP3 INC at Cisco)" <
>> bjano...@cisco.com>
>> *Cc: *"user@cassandra.apache.org" 
>> *Subject: *RE: Cassandra cluster performance
>>
>>
>>
>> Hi Branislav,
>>
>>
>>
>>
>>
>> What is your column family definition?
>>
>>
>>
>>
>>
>> *Thanks & Regards,*
>> *Abhishek Kumar Maheshwari*
>>
>>
>>
>> *From:* Branislav Janosik -T (bjanosik - AAP3 INC at Cisco) [mailto:
>> bjano...@cisco.com]
>> *Sent:* Thursday, December 22, 2016 6:18 AM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Cassandra cluster performance
>>
>>
>>
>> Hi,
>>
>>
>>
>> - Consistency level is set to ONE
>>
>> - Keyspace definition:
>>
>> "CREATE KEYSPACE IF NOT EXISTS onem2m " +
>> "WITH replication = " +
>> "{ 'class' : 'SimpleStrategy', 'replication_factor' : 1}";
>>
>>
>>
>> - yes, the client is on separate VM
>>
>> - In our project we use Cassandra API version 3.0.2 but the database 
>> (cluster) is version 3.9
>>
>> - for 2node cluster:
>>
>>  first VM: 25 GB RAM, 16 CPUs
>>
>>  second VM: 16 GB RAM, 16 CPUs
>>
>>
>>
>>
>>
>>
>>
>> *From: *Ben Slater 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Wednesday, December 21, 2016 at 2:32 PM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *Re: Cassandra cluster performance
>>
>>
>>
>> You would expect some drop when moving from a single node to multiple
>> nodes, but on the face of it that feels extreme to me (although I’ve never
>> personally tested the difference). Some questions that might help provide
>> an answer:
>>
>> - what consistency level are you using for the test?
>>
>> - what is your keyspace definition (replication factor most importantly)?
>>
>> - where are you running your test client (is it a separate box to
>> cassandra)?
>>
>> - what C* version?
>>
>> - what are specs (CPU, RAM) of the test servers?
>>
>>
>>
>> Cheers
>>
>> Ben
>>
>>
>>
>> On Thu, 22 Dec 2016 at 09:26 Branislav Janosik -T (bjanosik - AAP3 INC at
>> Cisco)  wrote:
>>
>> Hi all,
>>
>>
>>
>> I’m working on a project and we have Java benchmark test for testing the
>> performance when using Cassandra database. Create operation on a single
>> node Cassandra cluster is about 15K operations per second. Problem we have
>> is when I set up cluster with 2 or more nodes (each of them are on separate
>> virtual machines and servers), the performance goes down to 1K ops/sec. I
>> follow the official instructions on how to set up a multinode cluster – the
>> only things I change in Cassandra.yaml file are: change seeds to IP address
>> of one node, change listen and rpc address to IP address of the node and
>> finally change endpoint snitch to GossipingPropertyFileSnitch. The
>>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2017-01-02 Thread Kant Kodali

The fact that Oracle would even come up with something like this, "Oracle's
position was that Google should have to license code from them.", is just
messed up. These kinds of business practices are exactly the reason to
stay away. Of course every company is there to make money. But look at
Google or FB and see how much open source contribution they have
made. Oracle doesn't come anywhere close to that.

On Mon, Jan 2, 2017 at 8:08 PM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

>
>
> On Mon, Jan 2, 2017 at 8:30 PM, Kant Kodali <k...@peernova.com> wrote:
>
>> This is a subjective question and of course it would turn into
>> opinionated answers and I think we should welcome that (Nothing wrong in
>> debating a topic). we have many such debates as SE's such as programming
>> language comparisons, Architectural debates, Framework/Library debates and
>> so on. people who don't like this conversation can simply refrain from
>> following this thread right. I don't know why they choose to Jump in if
>> they dont like a topic
>>
>> Sun is a great company no doubt! I don't know if Oracle is. Things like
>> this
>> https://www.extremetech.com/mobile/220136-google-plans-to-remove-oracles-java-apis-from-android-n
>> is what pisses me about
>> Oracle which gives an impression that they are not up for open source. It
>> would be awesome to see JVM running on more and more devices (not less) so
>> Google taking away Oracle Java API's from Android is a big failure from
>> Oracle.
>>
>> JVM is a great piece of Software and by far there isn't anything yet that
>> comes close. And there are great people who worked at SUN at that time.
>> open the JDK source code and read it. you will encounter some great ideas
>> and Algorithms.
>>
>>
>>
>>
>>
>> On Mon, Jan 2, 2017 at 1:04 PM, Edward Capriolo <edlinuxg...@gmail.com>
>> wrote:
>>
>>>
>>> On Mon, Jan 2, 2017 at 3:51 PM, Benjamin Roth <benjamin.r...@jaumo.com>
>>> wrote:
>>>
>>>> Does this discussion really make sense any more? To me it seems it
>>>> turned opinionated and religious. From my point of view anything that has
>>>> to be said was said.
>>>>
>>>> Am 02.01.2017 21:27 schrieb "Edward Capriolo" <edlinuxg...@gmail.com>:
>>>>
>>>>>
>>>>>
>>>>> On Mon, Jan 2, 2017 at 11:56 AM, Eric Evans <john.eric.ev...@gmail.com
>>>>> > wrote:
>>>>>
>>>>>> On Fri, Dec 23, 2016 at 9:15 PM, Edward Capriolo <
>>>>>> edlinuxg...@gmail.com> wrote:
>>>>>> > "I don't really have any opinions on Oracle per say, but Cassandra
>>>>>> is a
>>>>>> > Free Software project and I would prefer that we not depend on
>>>>>> > commercial software, (and that's kind of what we have here, an
>>>>>> > implicit dependency)."
>>>>>> >
>>>>>> > We are a bit loose here with terms "free" and "commercial". The
>>>>>> oracle JVM
>>>>>> > is open source, it is free to use and the trademark is owned by a
>>>>>> company.
>>>>>>
>>>>>> Are we?  There are many definitions for the word "free", only one of
>>>>>> which means "without cost"; I assumed it was obvious that I was
>>>>>> talking about licensing terms (and of course the implications of that
>>>>>> licensing).
>>>>>>
>>>>>> Cassandra is Free Software by virtue of the fact that it is Apache
>>>>>> Licensed.  You are Free (as in Freedom) to modify and redistribute it.
>>>>>>
>>>>>> The Oracle JVM ships with a commercial license.  It is free only in
>>>>>> the sense that you are not required to pay anything to use it, (i.e.
>>>>>> you are not Free to do much of anything other than use it to run Java
>>>>>> software).
>>>>>>
>>>>>> > That is not much different then using a tool for cassandra like a
>>>>>> driver
>>>>>> > hosted on github but made my a company.
>>>>>>
>>>>>> It is very different IME.  Cassandra requires a JVM to function, this
>>>>>> is a hard dependency.  A driver is merely a means to make use of it.
>>>>>>
>>>>>> > The thing about a JVM is t

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2017-01-02 Thread Kant Kodali

This is a subjective question, so of course it would turn into opinionated
answers, and I think we should welcome that (nothing wrong with debating a
topic). We have many such debates as SEs: programming language
comparisons, architectural debates, framework/library debates and so on.
People who don't like this conversation can simply refrain from following
this thread, right? I don't know why they choose to jump in if they don't
like a topic.

Sun is a great company, no doubt! I don't know if Oracle is. Things like
this
https://www.extremetech.com/mobile/220136-google-plans-to-remove-oracles-java-apis-from-android-n
are what piss me off about Oracle, and they give the impression that Oracle
is not up for open source. It would be awesome to see the JVM running on
more and more devices (not fewer), so Google taking Oracle's Java APIs out
of Android is a big failure on Oracle's part.

The JVM is a great piece of software and so far there isn't anything that
comes close. And there were great people working at Sun at that time.
Open the JDK source code and read it; you will encounter some great ideas
and algorithms.

On Mon, Jan 2, 2017 at 1:04 PM, Edward Capriolo 
wrote:

>
> On Mon, Jan 2, 2017 at 3:51 PM, Benjamin Roth 
> wrote:
>
>> Does this discussion really make sense any more? To me it seems it turned
>> opinionated and religious. From my point of view anything that has to be
>> said was said.
>>
>> Am 02.01.2017 21:27 schrieb "Edward Capriolo" :
>>
>>>
>>>
>>> On Mon, Jan 2, 2017 at 11:56 AM, Eric Evans 
>>> wrote:
>>>
>>>> On Fri, Dec 23, 2016 at 9:15 PM, Edward Capriolo
>>>> wrote:
>>>> > "I don't really have any opinions on Oracle per say, but Cassandra is
>>>> a
>>>> > Free Software project and I would prefer that we not depend on
>>>> > commercial software, (and that's kind of what we have here, an
>>>> > implicit dependency)."
>>>> >
>>>> > We are a bit loose here with terms "free" and "commercial". The
>>>> oracle JVM
>>>> > is open source, it is free to use and the trademark is owned by a
>>>> company.
>>>>
>>>> Are we?  There are many definitions for the word "free", only one of
>>>> which means "without cost"; I assumed it was obvious that I was
>>>> talking about licensing terms (and of course the implications of that
>>>> licensing).
>>>>
>>>> Cassandra is Free Software by virtue of the fact that it is Apache
>>>> Licensed.  You are Free (as in Freedom) to modify and redistribute it.
>>>>
>>>> The Oracle JVM ships with a commercial license.  It is free only in
>>>> the sense that you are not required to pay anything to use it, (i.e.
>>>> you are not Free to do much of anything other than use it to run Java
>>>> software).
>>>>
>>>> > That is not much different then using a tool for cassandra like a
>>>> driver
>>>> > hosted on github but made my a company.
>>>>
>>>> It is very different IME.  Cassandra requires a JVM to function, this
>>>> is a hard dependency.  A driver is merely a means to make use of it.
>>>>
>>>> > The thing about a JVM is that like a kernel you want really smart
>>>> dedicated
>>>> > people working on it. Oracle has moved the JVM forward since taking
>>>> over
>>>> > sun. You can not just manage a JVM like say the freebsd port of x
>>>> maintained
>>>> > by 3 part time dudes that all get paid to do something else.
>>>>
>>>> I don't how to read any of this.  It sounds like you're saying that a
>>>> JVM is something that cannot be produced as a Free Software project,
>>>> or maybe that you just really like Oracle, I'm honestly not sure.  It
>>>> doesn't seem relevant though, because there is in fact a Free Software
>>>> JVM (and in addition to some mere mortals, the fine people at Oracle
>>>> do contribute to it).
>>>>
>>>> --
>>>> Eric Evans
>>>> john.eric.ev...@gmail.com
>>>>
>>>
>>> Are we?  There are many definitions for the word "free", only one of
>>> which means "without cost"; I assumed it was obvious that I was
>>> talking about licensing terms (and of course the implications of that
>>> licensing).
>>>
>>> Let's be clear:
>>> What I am saying is that we should avoid being loose with the word "free".
>>>
>>> https://en.wikipedia.org/wiki/Free_software_license
>>>
>>> Many things with the JVM are free too. Most importantly it is free to
>>> use.
>>>
>>> https://www.java.com/en/download/faq/distribution.xml
>>>
>>> As it relates to this conversation: I am not aware of anyone running
>>> Cassandra that has modified upstream JVM to make Cassandra run
>>> better/differently *. Thus the license around the Oracle JVM is roughly
>>> meaningless to the user/developer of cassandra.
>>>
>>> * The only group I know that took an action to modify upstream was
>>> Acunu. They had released a modified Linux Kernel with a modified Apache
>>> Cassandra. http://cloudtweaks.com/2011/02/data-storage-start
>>> up-acunu-raises-3-6-million-to-launch-its-first-product/. That product

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-26 Thread Kant Kodali

The observations that James Gosling made aren't just relevant to the year
2010; he was describing Oracle's DNA. He clearly laid out how the upper
management in that company works, and it works the same way today as it
has for decades.
If you know someone's character, you can predict what he or she will do.
And in that video Gosling more or less described the character of Oracle!

I don't mean to say the JVM shouldn't be in the hands of a large entity, but
rather that if it were in the hands of a company like Google or Microsoft
or, say, DataStax, I would have been happier :)

Above all, I love the JVM and the work of the many smart people behind it. I
do hope Java 9 takes off really well with the module system where
containerized deployments


Google and Android (j++--)??? I don't even know about the existence of
j++--. Any links? I tried a quick Google search but couldn't find anything.

On Mon, Dec 26, 2016 at 8:08 AM, Brice Dutheil <brice.duth...@gmail.com>
wrote:

> A note on this video from the respected James Gosling: it is from 2010,
> when Oracle was new to the stewardship of the Java ecosystem. The company
> has come a long way since. I'm not saying everything is perfect, but I
> doubt that a product such as the JVM would be as good without a company's
> guidance.
>
> The module system is interesting and is a good thing regardless of the
> Oracle features. Having AWT classes on a server always annoyed me, and for
> IoT as well. I'm really excited about Java 9.
>
>
> -- Brice
>
> On Mon, Dec 26, 2016 at 3:55 PM, Edward Capriolo <edlinuxg...@gmail.com>
> wrote:
>
>>
>>
>> On Sat, Dec 24, 2016 at 5:58 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> @Edward Agreed, the JVM is awesome and is the work of many smart people;
>>> this is obvious if one looks into the JDK code. But given Oracle's
>>> history of business practices and other decisions, it is a bit hard to
>>> convince oneself that everything is going to be OK and that they
>>> actually care about open source. Even the module system that they are
>>> trying to come up with is motivated by a problem they have faced
>>> internally.
>>>
>>> To reiterate, just watch this video:
>>> https://www.youtube.com/watch?v=9ei-rbULWoA
>>>
>>> My statements are not solely based on this video, but I certainly give
>>> good weight to James Gosling.
>>>
>>> I tend to think that Oracle has not closed Java because they know that
>>> they can't get money from users: these days not many people are willing
>>> to pay even for distributed databases, so I don't think anyone would pay
>>> for a programming language. In short, let me end by saying Oracle just
>>> has a lot of self-interest, but I really hope that I am wrong since I am
>>> a big fan of the JVM.
>>>
>>>
>>>
>>>
>>>
>>> On Fri, Dec 23, 2016 at 7:15 PM, Edward Capriolo <edlinuxg...@gmail.com>
>>> wrote:
>>>
>>>>
>>>> On Fri, Dec 23, 2016 at 6:01 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>
>>>>> Java 9 Module system looks really interesting. I would be very curious
>>>>> to see how Cassandra would leverage that.
>>>>>
>>>>> On Thu, Dec 22, 2016 at 9:09 AM, Kant Kodali <k...@peernova.com>
>>>>> wrote:
>>>>>
>>>>>> I would agree with Eric with his following statement. In fact, I was
>>>>>> trying to say the same thing.
>>>>>>
>>>>>> "I don't really have any opinions on Oracle per say, but Cassandra
>>>>>> is a
>>>>>> Free Software project and I would prefer that we not depend on
>>>>>> commercial software, (and that's kind of what we have here, an
>>>>>> implicit dependency)."
>>>>>>
>>>>>> On Thu, Dec 22, 2016 at 3:09 AM, Brice Dutheil <
>>>>>> brice.duth...@gmail.com> wrote:
>>>>>>
>>>>>>> Pretty much a non-story, it seems like.
>>>>>>>
>>>>>>> Clickbait imho. Search ‘The Register’ in this wikipedia page
>>>>>>> <https://en.wikipedia.org/wiki/Wikipedia:Potentially_unreliable_sources#News_media>
>>>>>>>
>>>>>>> @Ben Manes
>>>>>>>
>>>>>>> Agreed, OpenJDK and Oracle JDK are now pretty close, but there is
>>>>>>> still some differences in the VM code and third party dependencies like
>>>>>>> security libraries. Maybe that’s fine

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-24 Thread Kant Kodali

@Edward Agreed, the JVM is awesome and is the work of many smart people;
this is obvious if one looks into the JDK code. But given Oracle's history of
business practices and other decisions, it is a bit hard to convince oneself
that everything is going to be OK and that they actually care about open
source. Even the module system that they are trying to come up with is
motivated by a problem they have faced internally.

To reiterate, just watch this video:
https://www.youtube.com/watch?v=9ei-rbULWoA

My statements are not solely based on this video, but I certainly give
good weight to James Gosling.

I tend to think that Oracle has not closed Java because they know that they
can't get money from users: these days not many people are willing to pay
even for distributed databases, so I don't think anyone would pay for a
programming language. In short, let me end by saying Oracle just has a lot of
self-interest, but I really hope that I am wrong since I am a big fan of the
JVM.





On Fri, Dec 23, 2016 at 7:15 PM, Edward Capriolo <edlinuxg...@gmail.com>
wrote:

>
> On Fri, Dec 23, 2016 at 6:01 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Java 9 Module system looks really interesting. I would be very curious to
>> see how Cassandra would leverage that.
>>
>> On Thu, Dec 22, 2016 at 9:09 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> I would agree with Eric with his following statement. In fact, I was
>>> trying to say the same thing.
>>>
>>> "I don't really have any opinions on Oracle per say, but Cassandra is a
>>> Free Software project and I would prefer that we not depend on
>>> commercial software, (and that's kind of what we have here, an
>>> implicit dependency)."
>>>
>>> On Thu, Dec 22, 2016 at 3:09 AM, Brice Dutheil <brice.duth...@gmail.com>
>>> wrote:
>>>
>>>> Pretty much a non-story, it seems like.
>>>>
>>>> Clickbait imho. Search ‘The Register’ in this wikipedia page
>>>> <https://en.wikipedia.org/wiki/Wikipedia:Potentially_unreliable_sources#News_media>
>>>>
>>>> @Ben Manes
>>>>
>>>> Agreed, OpenJDK and Oracle JDK are now pretty close, but there is still
>>>> some differences in the VM code and third party dependencies like security
>>>> libraries. Maybe that’s fine for some productions, but maybe not for
>>>> everyone.
>>>>
>>>> Also another thing, while OpenJDK source is available to all, I don’t
>>>> think all OpenJDK builds have been certified with the TCK. For example the
>>>> Zulu OpenJDK is, as Azul have access to the TCK and certifies
>>>> <https://www.azul.com/products/zulu/> the builds. Another example
>>>> OpenJDK build installed on RHEL is certified
>>>> <https://access.redhat.com/articles/1299013>. Canonical probably is
>>>> running TCK compliance tests as well on their OpenJDK 8 since they are listed
>>>> on the signatories
>>>> <http://openjdk.java.net/groups/conformance/JckAccess/jck-access.html>
>>>> but not sure as I couldn’t find evidence on this; on this signatories list
>>>> again there’s an individual – Emmanuel Bourg – who is related to Debian
>>>> <https://lists.debian.org/debian-java/2015/01/msg00015.html> (linkedin
>>>> <https://www.linkedin.com/in/ebourg>), but not sure again the TCK is
>>>> passed for each build.
>>>>
>>>> Bad OpenJDK intermediary builds, i.e without TCK compliance tests, is a
>>>> reality
>>>> <https://github.com/docker-library/openjdk/commit/00a9c5c080f2a5fd1510bc0716db7afe06cbd017>
>>>> .
>>>>
>>>> While the situation has enhanced over the past months I’ll still double
>>>> check before using any OpenJDK builds.
>>>> 
>>>>
>>>> -- Brice
>>>>
>>>> On Wed, Dec 21, 2016 at 5:08 PM, Voytek Jarnot <voytek.jar...@gmail.com
>>>> > wrote:
>>>>
>>>>> Reading that article the only conclusion I can reach (unless I'm
>>>>> misreading) is that all the stuff that was never free is still not free -
>>>>> the change is that Oracle may actually be interested in the fact that some
>>>>> are using non-free products for free.
>>>>>
>>>>> Pretty much a non-story, it seems like.
>>>>>
>>>>> On Tue, Dec 20, 2016 at 11:55 PM, Kant Kodali <k...@peernova.com>
>>>>> wrote:
>>>>>
>>>>>>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-23 Thread Kant Kodali

Java 9 Module system looks really interesting. I would be very curious to
see how Cassandra would leverage that.

On Thu, Dec 22, 2016 at 9:09 AM, Kant Kodali <k...@peernova.com> wrote:

> I would agree with Eric with his following statement. In fact, I was
> trying to say the same thing.
>
> "I don't really have any opinions on Oracle per say, but Cassandra is a
> Free Software project and I would prefer that we not depend on
> commercial software, (and that's kind of what we have here, an
> implicit dependency)."
>
> On Thu, Dec 22, 2016 at 3:09 AM, Brice Dutheil <brice.duth...@gmail.com>
> wrote:
>
>> Pretty much a non-story, it seems like.
>>
>> Clickbait imho. Search ‘The Register’ in this wikipedia page
>> <https://en.wikipedia.org/wiki/Wikipedia:Potentially_unreliable_sources#News_media>
>>
>> @Ben Manes
>>
>> Agreed, OpenJDK and Oracle JDK are now pretty close, but there is still
>> some differences in the VM code and third party dependencies like security
>> libraries. Maybe that’s fine for some productions, but maybe not for
>> everyone.
>>
>> Also another thing, while OpenJDK source is available to all, I don’t
>> think all OpenJDK builds have been certified with the TCK. For example the
>> Zulu OpenJDK is, as Azul have access to the TCK and certifies
>> <https://www.azul.com/products/zulu/> the builds. Another example
>> OpenJDK build installed on RHEL is certified
>> <https://access.redhat.com/articles/1299013>. Canonical probably is
>> running TCK compliance tests as well on their OpenJDK 8 since they are listed
>> on the signatories
>> <http://openjdk.java.net/groups/conformance/JckAccess/jck-access.html>
>> but not sure as I couldn’t find evidence on this; on this signatories list
>> again there’s an individual – Emmanuel Bourg – who is related to Debian
>> <https://lists.debian.org/debian-java/2015/01/msg00015.html> (linkedin
>> <https://www.linkedin.com/in/ebourg>), but not sure again the TCK is
>> passed for each build.
>>
>> Bad OpenJDK intermediary builds, i.e without TCK compliance tests, is a
>> reality
>> <https://github.com/docker-library/openjdk/commit/00a9c5c080f2a5fd1510bc0716db7afe06cbd017>
>> .
>>
>> While the situation has enhanced over the past months I’ll still double
>> check before using any OpenJDK builds.
>> 
>>
>> -- Brice
>>
>> On Wed, Dec 21, 2016 at 5:08 PM, Voytek Jarnot <voytek.jar...@gmail.com>
>> wrote:
>>
>>> Reading that article the only conclusion I can reach (unless I'm
>>> misreading) is that all the stuff that was never free is still not free -
>>> the change is that Oracle may actually be interested in the fact that some
>>> are using non-free products for free.
>>>
>>> Pretty much a non-story, it seems like.
>>>
>>> On Tue, Dec 20, 2016 at 11:55 PM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Looking at this
>>>> http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
>>>> I don't know why Cassandra recommends Oracle JVM?
>>>>
>>>> JVM is a great piece of software but I would like to stay away from
>>>> Oracle as much as possible. Oracle is just horrible the way they are
>>>> dealing with Java in General.
>>>>
>>>>
>>>>
>>>
>>
>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-22 Thread Kant Kodali

I agree with Eric's statement below. In fact, I was trying to say the same
thing.

"I don't really have any opinions on Oracle per say, but Cassandra is a
Free Software project and I would prefer that we not depend on
commercial software, (and that's kind of what we have here, an
implicit dependency)."

On Thu, Dec 22, 2016 at 3:09 AM, Brice Dutheil <brice.duth...@gmail.com>
wrote:

> Pretty much a non-story, it seems like.
>
> Clickbait imho. Search ‘The Register’ in this wikipedia page
> <https://en.wikipedia.org/wiki/Wikipedia:Potentially_unreliable_sources#News_media>
>
> @Ben Manes
>
> Agreed, OpenJDK and Oracle JDK are now pretty close, but there are still
> some differences in the VM code and third-party dependencies like security
> libraries. Maybe that’s fine for some production environments, but maybe
> not for everyone.
>
> Also another thing: while the OpenJDK source is available to all, I don’t
> think all OpenJDK builds have been certified with the TCK. For example the
> Zulu OpenJDK is, as Azul has access to the TCK and certifies
> <https://www.azul.com/products/zulu/> the builds. Another example: the
> OpenJDK build installed on RHEL is certified
> <https://access.redhat.com/articles/1299013>. Canonical is probably
> running TCK compliance tests as well on their OpenJDK 8, since they are
> listed among the signatories
> <http://openjdk.java.net/groups/conformance/JckAccess/jck-access.html>,
> but I am not sure, as I couldn’t find evidence of this. On this signatories
> list there’s also an individual – Emmanuel Bourg – who is related to Debian
> <https://lists.debian.org/debian-java/2015/01/msg00015.html> (linkedin
> <https://www.linkedin.com/in/ebourg>), but again I am not sure the TCK is
> passed for each build.
>
> Bad OpenJDK intermediary builds, i.e. without TCK compliance tests, are a
> reality
> <https://github.com/docker-library/openjdk/commit/00a9c5c080f2a5fd1510bc0716db7afe06cbd017>.
>
> While the situation has improved over the past months, I’ll still double
> check before using any OpenJDK builds.
> 
>
> -- Brice
>
> On Wed, Dec 21, 2016 at 5:08 PM, Voytek Jarnot <voytek.jar...@gmail.com>
> wrote:
>
>> Reading that article the only conclusion I can reach (unless I'm
>> misreading) is that all the stuff that was never free is still not free -
>> the change is that Oracle may actually be interested in the fact that some
>> are using non-free products for free.
>>
>> Pretty much a non-story, it seems like.
>>
>> On Tue, Dec 20, 2016 at 11:55 PM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> Looking at this
>>> http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
>>> I don't know why Cassandra recommends Oracle JVM?
>>>
>>> JVM is a great piece of software but I would like to stay away from
>>> Oracle as much as possible. Oracle is just horrible the way they are
>>> dealing with Java in General.
>>>
>>>
>>>
>>
>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-21 Thread Kant Kodali

https://www.youtube.com/watch?v=9ei-rbULWoA

On Wed, Dec 21, 2016 at 2:59 AM, Kant Kodali <k...@peernova.com> wrote:

> https://www.elastic.co/guide/en/elasticsearch/guide/current/_java_virtual_machine.html
>
> On Wed, Dec 21, 2016 at 2:58 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> The fact is Oracle is horrible :)
>>
>>
>> On Wed, Dec 21, 2016 at 2:54 AM, Brice Dutheil <brice.duth...@gmail.com>
>> wrote:
>>
>>> Let's not debate opinion on the Oracle stewardship here, we certainly
>>> have different views that come from different experiences.
>>>
>>> Let's discuss facts instead :)
>>>
>>> -- Brice
>>>
>>> On Wed, Dec 21, 2016 at 11:34 AM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> yeah well I don't think Oracle is treating Java the way Google is
>>>> treating Go and I am not a big fan of Go mainly because I understand the
>>>> JVM is far more robust than anything that is out there.
>>>>
>>>> "Oracle just doesn't understand open source" These are the words from
>>>> James Gosling himself
>>>>
>>>> I do think its better to stay away from Oracle as we never know when
>>>> they would switch open source to closed source. Given their history of
>>>> practices their statements are not credible.
>>>>
>>>> I am pretty sure the community would take care of OpenJDK.
>>>>
>>>>
>>>>
>>>>
>>>>
>>>> On Wed, Dec 21, 2016 at 2:04 AM, Brice Dutheil <brice.duth...@gmail.com
>>>> > wrote:
>>>>
>>>>> The problem described in this article is different from what you have
>>>>> on your servers, and I’ll add that this article should be read with
>>>>> caution, as The Register is known for sensationalism. The article itself
>>>>> has no substantial proof or enough details. In my opinion this article
>>>>> is clickbait.
>>>>>
>>>>> Anyway, there are several points to think about instead of just
>>>>> switching to OpenJDK:
>>>>>
>>>>>    - There are technical differences between Oracle JDK and OpenJDK.
>>>>>      Where there are licensing issues, some libraries are closed source
>>>>>      in HotSpot, like font, rasterizer or cryptography, and OpenJDK uses
>>>>>      open source alternatives, which leads to different bugs or
>>>>>      performance. I believe they also have minor differences in the
>>>>>      HotSpot code to plug in stuff like Java Mission Control or Flight
>>>>>      Recorder or HotSpot-specific options.
>>>>>      Also I believe that Oracle JDK is more tested or more up to date
>>>>>      than OpenJDK.
>>>>>
>>>>>      So while OpenJDK is functionally the same as Oracle JDK, it may not
>>>>>      have the same performance or the same bugs or the same security
>>>>>      fixes. (Unless you are ready to test that with your production
>>>>>      servers and your production data.)
>>>>>
>>>>>      I don’t know if DataStax has released the details of their
>>>>>      configuration when they test Cassandra.
>>>>>
>>>>>    - There’s also a question of support. OpenJDK is for the community.
>>>>>      Oracle can offer support but maybe only for Oracle JDK.
>>>>>
>>>>>      Twitter uses OpenJDK, but they have their own JVM support team.
>>>>>      Not sure everyone can afford that.
>>>>>
>>>>> As a side note I’ll add that Oracle is paying talented engineers to
>>>>> work on the JVM to make it great.
>>>>>
>>>>> Cheers,
>>>>> 
>>>>>
>>>>> -- Brice
>>>>>
>>>>> On Wed, Dec 21, 2016 at 6:55 AM, Kant Kodali <k...@peernova.com>
>>>>> wrote:
>>>>>
>>>>>> Looking at this
>>>>>> http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
>>>>>> I don't know why Cassandra recommends Oracle JVM?
>>>>>>
>>>>>> JVM is a great piece of software but I would like to stay away from
>>>>>> Oracle as much as possible. Oracle is just horrible the way they are
>>>>>> dealing with Java in General.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-21 Thread Kant Kodali

https://www.elastic.co/guide/en/elasticsearch/guide/current/_java_virtual_machine.html

On Wed, Dec 21, 2016 at 2:58 AM, Kant Kodali <k...@peernova.com> wrote:

> The fact is Oracle is horrible :)
>
>
> On Wed, Dec 21, 2016 at 2:54 AM, Brice Dutheil <brice.duth...@gmail.com>
> wrote:
>
>> Let's not debate opinion on the Oracle stewardship here, we certainly
>> have different views that come from different experiences.
>>
>> Let's discuss facts instead :)
>>
>> -- Brice
>>
>> On Wed, Dec 21, 2016 at 11:34 AM, Kant Kodali <k...@peernova.com> wrote:
>>
>>> yeah well I don't think Oracle is treating Java the way Google is
>>> treating Go and I am not a big fan of Go mainly because I understand the
>>> JVM is far more robust than anything that is out there.
>>>
>>> "Oracle just doesn't understand open source" These are the words from
>>> James Gosling himself
>>>
>>> I do think its better to stay away from Oracle as we never know when
>>> they would switch open source to closed source. Given their history of
>>> practices their statements are not credible.
>>>
>>> I am pretty sure the community would take care of OpenJDK.
>>>
>>>
>>>
>>>
>>>
>>> On Wed, Dec 21, 2016 at 2:04 AM, Brice Dutheil <brice.duth...@gmail.com>
>>> wrote:
>>>
>>>> The problem described in this article is different from what you have
>>>> on your servers, and I’ll add that this article should be read with
>>>> caution, as The Register is known for sensationalism. The article itself
>>>> has no substantial proof or enough details. In my opinion this article
>>>> is clickbait.
>>>>
>>>> Anyway, there are several points to think about instead of just
>>>> switching to OpenJDK:
>>>>
>>>>    - There are technical differences between Oracle JDK and OpenJDK.
>>>>      Where there are licensing issues, some libraries are closed source
>>>>      in HotSpot, like font, rasterizer or cryptography, and OpenJDK uses
>>>>      open source alternatives, which leads to different bugs or
>>>>      performance. I believe they also have minor differences in the
>>>>      HotSpot code to plug in stuff like Java Mission Control or Flight
>>>>      Recorder or HotSpot-specific options.
>>>>      Also I believe that Oracle JDK is more tested or more up to date
>>>>      than OpenJDK.
>>>>
>>>>      So while OpenJDK is functionally the same as Oracle JDK, it may not
>>>>      have the same performance or the same bugs or the same security
>>>>      fixes. (Unless you are ready to test that with your production
>>>>      servers and your production data.)
>>>>
>>>>      I don’t know if DataStax has released the details of their
>>>>      configuration when they test Cassandra.
>>>>
>>>>    - There’s also a question of support. OpenJDK is for the community.
>>>>      Oracle can offer support but maybe only for Oracle JDK.
>>>>
>>>>      Twitter uses OpenJDK, but they have their own JVM support team.
>>>>      Not sure everyone can afford that.
>>>>
>>>> As a side note I’ll add that Oracle is paying talented engineers to
>>>> work on the JVM to make it great.
>>>>
>>>> Cheers,
>>>> 
>>>>
>>>> -- Brice
>>>>
>>>> On Wed, Dec 21, 2016 at 6:55 AM, Kant Kodali <k...@peernova.com> wrote:
>>>>
>>>>> Looking at this
>>>>> http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
>>>>> I don't know why Cassandra recommends Oracle JVM?
>>>>>
>>>>> JVM is a great piece of software but I would like to stay away from
>>>>> Oracle as much as possible. Oracle is just horrible the way they are
>>>>> dealing with Java in General.
>>>>>
>>>>>
>>>>>
>>>>
>>>
>>
>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-21 Thread Kant Kodali

The fact is Oracle is horrible :)


On Wed, Dec 21, 2016 at 2:54 AM, Brice Dutheil <brice.duth...@gmail.com>
wrote:

> Let's not debate opinion on the Oracle stewardship here, we certainly have
> different views that come from different experiences.
>
> Let's discuss facts instead :)
>
> -- Brice
>
> On Wed, Dec 21, 2016 at 11:34 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> yeah well I don't think Oracle is treating Java the way Google is
>> treating Go and I am not a big fan of Go mainly because I understand the
>> JVM is far more robust than anything that is out there.
>>
>> "Oracle just doesn't understand open source" These are the words from
>> James Gosling himself
>>
>> I do think its better to stay away from Oracle as we never know when they
>> would switch open source to closed source. Given their history of practices
>> their statements are not credible.
>>
>> I am pretty sure the community would take care of OpenJDK.
>>
>>
>>
>>
>>
>> On Wed, Dec 21, 2016 at 2:04 AM, Brice Dutheil <brice.duth...@gmail.com>
>> wrote:
>>
>>> The problem described in this article is different from what you have
>>> on your servers, and I’ll add that this article should be read with
>>> caution, as The Register is known for sensationalism. The article itself
>>> has no substantial proof or enough details. In my opinion this article
>>> is clickbait.
>>>
>>> Anyway, there are several points to think about instead of just
>>> switching to OpenJDK:
>>>
>>>    - There are technical differences between Oracle JDK and OpenJDK.
>>>      Where there are licensing issues, some libraries are closed source
>>>      in HotSpot, like font, rasterizer or cryptography, and OpenJDK uses
>>>      open source alternatives, which leads to different bugs or
>>>      performance. I believe they also have minor differences in the
>>>      HotSpot code to plug in stuff like Java Mission Control or Flight
>>>      Recorder or HotSpot-specific options.
>>>      Also I believe that Oracle JDK is more tested or more up to date
>>>      than OpenJDK.
>>>
>>>      So while OpenJDK is functionally the same as Oracle JDK, it may not
>>>      have the same performance or the same bugs or the same security
>>>      fixes. (Unless you are ready to test that with your production
>>>      servers and your production data.)
>>>
>>>      I don’t know if DataStax has released the details of their
>>>      configuration when they test Cassandra.
>>>
>>>    - There’s also a question of support. OpenJDK is for the community.
>>>      Oracle can offer support but maybe only for Oracle JDK.
>>>
>>>      Twitter uses OpenJDK, but they have their own JVM support team.
>>>      Not sure everyone can afford that.
>>>
>>> As a side note I’ll add that Oracle is paying talented engineers to
>>> work on the JVM to make it great.
>>>
>>> Cheers,
>>> 
>>>
>>> -- Brice
>>>
>>> On Wed, Dec 21, 2016 at 6:55 AM, Kant Kodali <k...@peernova.com> wrote:
>>>
>>>> Looking at this
>>>> http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
>>>> I don't know why Cassandra recommends Oracle JVM?
>>>>
>>>> JVM is a great piece of software but I would like to stay away from
>>>> Oracle as much as possible. Oracle is just horrible the way they are
>>>> dealing with Java in General.
>>>>
>>>>
>>>>
>>>
>>
>

Re: Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-21 Thread Kant Kodali

Yeah, well, I don't think Oracle is treating Java the way Google is treating
Go, and I am not a big fan of Go, mainly because I understand the JVM is far
more robust than anything that is out there.

"Oracle just doesn't understand open source." These are the words of James
Gosling himself.

I do think it's better to stay away from Oracle, as we never know when they
would switch from open source to closed source. Given their history of
practices, their statements are not credible.

I am pretty sure the community would take care of OpenJDK.





On Wed, Dec 21, 2016 at 2:04 AM, Brice Dutheil <brice.duth...@gmail.com>
wrote:

> The problem described in this article is different from what you have
> on your servers, and I’ll add that this article should be read with
> caution, as The Register is known for sensationalism. The article itself
> has no substantial proof or enough details. In my opinion this article
> is clickbait.
>
> Anyway, there are several points to think about instead of just
> switching to OpenJDK:
>
>    - There are technical differences between Oracle JDK and OpenJDK.
>      Where there are licensing issues, some libraries are closed source
>      in HotSpot, like font, rasterizer or cryptography, and OpenJDK uses
>      open source alternatives, which leads to different bugs or
>      performance. I believe they also have minor differences in the
>      HotSpot code to plug in stuff like Java Mission Control or Flight
>      Recorder or HotSpot-specific options.
>      Also I believe that Oracle JDK is more tested or more up to date
>      than OpenJDK.
>
>      So while OpenJDK is functionally the same as Oracle JDK, it may not
>      have the same performance or the same bugs or the same security
>      fixes. (Unless you are ready to test that with your production
>      servers and your production data.)
>
>      I don’t know if DataStax has released the details of their
>      configuration when they test Cassandra.
>
>    - There’s also a question of support. OpenJDK is for the community.
>      Oracle can offer support but maybe only for Oracle JDK.
>
>      Twitter uses OpenJDK, but they have their own JVM support team.
>      Not sure everyone can afford that.
>
> As a side note I’ll add that Oracle is paying talented engineers to
> work on the JVM to make it great.
>
> Cheers,
> 
>
> -- Brice
>
> On Wed, Dec 21, 2016 at 6:55 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> Looking at this
>> http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
>> I don't know why Cassandra recommends Oracle JVM?
>>
>> JVM is a great piece of software but I would like to stay away from
>> Oracle as much as possible. Oracle is just horrible the way they are
>> dealing with Java in General.
>>
>>
>>
>

Why does Cassandra recommends Oracle JVM instead of OpenJDK?

2016-12-20 Thread Kant Kodali

Looking at this
http://www.theregister.co.uk/2016/12/16/oracle_targets_java_users_non_compliance/?mt=1481919461669
I don't know why Cassandra recommends Oracle JVM?

JVM is a great piece of software but I would like to stay away from Oracle
as much as possible. Oracle is just horrible the way they are dealing with
Java in General.

Re: quick questions

2016-12-18 Thread Kant Kodali

you got it! that's what I was looking for from that part of my question.


thanks!!

On Sun, Dec 18, 2016 at 2:08 PM, DE VITO Dominique <
dominique.dev...@thalesgroup.com> wrote:

> > I keep hearing that the minimum number of Cassandra nodes required to
> > achieve Quorum consensus is 4 I wonder why not 3? In fact, many container
> > deployments by default seem to deploy 4 nodes. Can anyone shine some
> > light on this?
>
> I think it may be due to the following (note: I am assuming, here, a
> “vnode” cluster):
>
> a) When using 3 nodes and QUORUM, the cluster can tolerate the loss
> of a node, but in that case each of the remaining nodes will have a +50%
> workload.
>
> b) When using 4 nodes, in case of the same loss, each of the remaining
> nodes will have (approximately) a +33% workload.
>
> Option (a) will impact the cluster's stability more than (b).
>
>
>
> Dominique
>
>
>
>
>
>
> *From:* Kant Kodali [mailto:k...@peernova.com]
> *Sent:* Saturday, 17 December 2016 22:21
> *To:* user@cassandra.apache.org
> *Subject:* quick questions
>
>
>
> I keep hearing that the minimum number of Cassandra nodes required to
> achieve Quorum consensus is 4 I wonder why not 3? In fact, many container
> deployments by default seem to deploy 4 nodes. Can anyone shine some light
> on this?
>
>
>
> What happens if I have 3 nodes and replication factor of 3 and consistency
> level: quorum? I should be able to achieve quorum level consensus right.
>
>
>
> If Total node = 3, RF=2 and consistency level = Quorum. Then I understand
> the quorum level consensus is not possible because the number of replica
> nodes here are 2.
>
> This also brings up another question does number of replica nodes always
> have to be an odd number to achieve quorum level consensus? If so, what
> happens when a replica node goes down ? it would still serve the requests
> but the quorum level consensus is not possible?
>
>
>
> Thanks
>
> kant
>
>
>
>
>
>
>
>
>
>
>
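
The +50% and +33% figures quoted above follow from simple arithmetic, which
can be sketched in Python. This is only an illustration, not Cassandra code:
the function name extra_load_per_survivor is made up for this example, and it
assumes the total workload stays constant and is spread evenly over the
surviving nodes (roughly what a balanced vnode cluster aims for).

```python
def extra_load_per_survivor(nodes: int, failed: int = 1) -> float:
    """Fractional extra load on each surviving node after `failed` nodes go down.

    Assumption (not from Cassandra itself): load was evenly balanced before
    the failure and is redistributed evenly over the survivors afterwards.
    """
    survivors = nodes - failed
    if failed < 0 or survivors < 1:
        raise ValueError("need at least one surviving node")
    # Each node carried 1/nodes of the load and now carries 1/survivors,
    # so the relative increase is (1/survivors) / (1/nodes) - 1.
    return failed / survivors


if __name__ == "__main__":
    for n in (3, 4, 5, 6):
        print(f"{n} nodes, 1 down: +{extra_load_per_survivor(n):.0%} per remaining node")
```

Under these assumptions a 3-node cluster puts half as much load again on each
survivor, while a 4-node cluster adds about a third, which matches options (a)
and (b) in the quoted reply.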

Re: quick questions

2016-12-17 Thread Kant Kodali

Thanks! got it!

On Sat, Dec 17, 2016 at 5:02 PM, Max C <mc_cassan...@core43.com> wrote:

> As Matija mentioned, quorum is RF / 2 + 1:
>
> RF=1, Quorum = 1
> RF=2, Quorum = 2
> RF=3, Quorum = 2
> RF=4, Quorum = 3
> RF=5, Quorum = 3
> RF=6, Quorum = 4
> RF=7, Quorum = 4
>
> So no, you don’t have to have an odd RF to achieve a quorum, as you see
> above.  Most people use RF=3 with a minimum of 3 nodes, though.  For RF=3,
> 2 of the 3 nodes need to be up in order to satisfy a quorum read/write.
>
> If you can’t achieve a quorum and you’re trying to read/write with quorum
> consistency then the read/write operation will fail.  You could still do
> reads/writes with CL=ONE, though (provided that at least 1 of the replicas
> was up).
>
> - Max
>
> > On Dec 17, 2016, at 1:21 pm, Kant Kodali <k...@peernova.com> wrote:
> >
> > I keep hearing that the minimum number of Cassandra nodes required to
> achieve Quorum consensus is 4 I wonder why not 3? In fact, many container
> deployments by default seem to deploy 4 nodes. Can anyone shine some light
> on this?
> >
> > What happens if I have 3 nodes and replication factor of 3 and
> consistency level: quorum? I should be able to achieve quorum level
> consensus right.
> >
> > If Total node = 3, RF=2 and consistency level = Quorum. Then I
> understand the quorum level consensus is not possible because the number of
> replica nodes here are 2.
> > This also brings up another question does number of replica nodes always
> have to be an odd number to achieve quorum level consensus? If so, what
> happens when a replica node goes down ? it would still serve the requests
> but the quorum level consensus is not possible?
> >
> > Thanks
> > kant
> >
> >
> >
> >
> >
>
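The quorum table quoted above follows directly from the formula RF / 2 + 1 with integer division. A minimal sketch (function names are illustrative, not a Cassandra API):

```python
def quorum(rf: int) -> int:
    """Replicas that must respond for a QUORUM read or write."""
    return rf // 2 + 1


def tolerated_failures(rf: int) -> int:
    """Replicas that can be down while QUORUM still succeeds."""
    return rf - quorum(rf)


for rf in range(1, 8):
    print(f"RF={rf}  quorum={quorum(rf)}  can lose {tolerated_failures(rf)} replica(s)")
```

Note that RF=2 does allow quorum operations, but only while both replicas are up; RF=3 is the smallest value that tolerates one replica being down.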

Re: quick questions

2016-12-17 Thread Kant Kodali

@Matija I think you either did not understand my question or I failed to
explain it clearly.

On Sat, Dec 17, 2016 at 4:46 PM, Matija Gobec <matija0...@gmail.com> wrote:

> QUORUM is by documentation:
>
> quorum = (sum_of_replication_factors / 2) + 1
>
> It's not a fixed value (such as 4).
>
> On Sat, Dec 17, 2016 at 10:21 PM, Kant Kodali <k...@peernova.com> wrote:
>
>> I keep hearing that the minimum number of Cassandra nodes required to
>> achieve Quorum consensus is 4 I wonder why not 3? In fact, many container
>> deployments by default seem to deploy 4 nodes. Can anyone shine some light
>> on this?
>>
>> What happens if I have 3 nodes and replication factor of 3 and
>> consistency level: quorum? I should be able to achieve quorum level
>> consensus right.
>>
>> If Total node = 3, RF=2 and consistency level = Quorum. Then I understand
>> the quorum level consensus is not possible because the number of replica
>> nodes here are 2.
>> This also brings up another question does number of replica nodes always
>> have to be an odd number to achieve quorum level consensus? If so, what
>> happens when a replica node goes down ? it would still serve the requests
>> but the quorum level consensus is not possible?
>>
>> Thanks
>> kant
>>
>>
>>
>>
>>
>>
>

quick questions

2016-12-17 Thread Kant Kodali

I keep hearing that the minimum number of Cassandra nodes required to
achieve quorum consensus is 4. I wonder why not 3? In fact, many container
deployments seem to deploy 4 nodes by default. Can anyone shed some light
on this?

What happens if I have 3 nodes, a replication factor of 3, and consistency
level QUORUM? I should be able to achieve quorum-level consensus, right?

If total nodes = 3, RF = 2, and consistency level = QUORUM, then I understand
that quorum-level consensus is not possible because there are only 2 replica
nodes.
This also brings up another question: does the number of replica nodes always
have to be odd to achieve quorum-level consensus? If so, what happens when a
replica node goes down? Would it still serve requests even though quorum-level
consensus is not possible?

Thanks
kant

Are Materialized views persisted on disk?

2016-12-13 Thread Kant Kodali

Are materialized views persisted on disk? Sorry for the naive question.

Re: What is the size of each Virtual Node token range?

2016-11-28 Thread Kant Kodali

Sorry, by row I meant partition key. You did answer part of my question,
thanks!

On Mon, Nov 28, 2016 at 8:39 AM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> A token does not identify a row. A token is a hash value of the partition
> key and the hash can have 2^64 different values. A collision is a normal
> thing in a hash table and it just means that different rows with the same
> token simply go to the same (v-)node, just like if they were different but
> in the same token range.
> You could even compare this to the typical implementation of a hash table
> in C, Java, Perl, whatever. A hashtable is a kind of a sparse array with
> the hash key as index and a linked list (or more complex implementations)
> as value where a list of all entries with the same hash values are stored.
> This simply makes it fast to find an entry by key without looping through
> all the list entries and comparing them with a key you are looking for.
>
> This statement is probably more accurate:
> There can be no more than 2^64 nodes in a cluster, because beyond that two
> nodes would share exactly the same token, which does not really make sense.
>
> 2016-11-28 17:28 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
>>
>> 1) What is the size of each Virtual Node token range?
>> 2) Are all Vnode token ranges in one server are of the same size?
>> 3) If these token ranges are predefined then isn't it implying that the
>> maximum total number of rows in a server is also predefined?
>>
>> maximum total number of rows in a server = num_tokens_in _vnode_1 +
>> num_tokens_in _vnode_2 + num_tokens_in _vnode_3 + +
>> num_tokens_in _vnode_256
>>
>
>
>
> --
> Benjamin Roth
> Prokurist
>
> Jaumo GmbH · www.jaumo.com
> Wehrstraße 46 · 73035 Göppingen · Germany
> Phone +49 7161 304880-6 · Fax +49 7161 304880-1
> AG Ulm · HRB 731058 · Managing Director: Jens Kammerer
>
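The point made in the reply above, that a token is a hash of the partition key and many keys can land in the same token range, can be sketched as follows. This is an illustration only: Cassandra's default partitioner uses Murmur3, while this sketch substitutes a truncated MD5 digest from the standard library, and the node names and ring tokens are hypothetical.

```python
import bisect
import hashlib


def token_for(partition_key: str) -> int:
    """Map a partition key to a signed 64-bit token (stand-in hash, not Murmur3)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big", signed=True)


class Ring:
    """Sorted ring of tokens; each token is owned by a node (vnodes = several tokens per node)."""

    def __init__(self, owners: dict):
        self._owners = dict(owners)
        self._tokens = sorted(self._owners)

    def owner(self, token: int) -> str:
        # A ring token T owns the range (previous_token, T], so the first
        # ring token >= the key's token wins; past the last one we wrap around.
        i = bisect.bisect_left(self._tokens, token)
        return self._owners[self._tokens[i % len(self._tokens)]]


# Hypothetical 3-node ring where node-a holds two vnode tokens.
ring = Ring({
    -6 * 10**18: "node-a",
    -2 * 10**18: "node-b",
    1 * 10**18: "node-a",
    5 * 10**18: "node-c",
})

for key in ("user:1", "user:2", "user:3"):
    t = token_for(key)
    print(key, t, ring.owner(t))
```

The number of partitions a range can hold is not capped by the range size: any number of keys may hash into the same range, or even to the same token.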

What is the size of each Virtual Node token range?

2016-11-28 Thread Kant Kodali

1) What is the size of each Virtual Node token range?
2) Are all vnode token ranges on one server the same size?
3) If these token ranges are predefined, doesn't that imply that the
maximum total number of rows on a server is also predefined?

maximum total number of rows on a server = num_tokens_in_vnode_1 +
num_tokens_in_vnode_2 + num_tokens_in_vnode_3 + ... +
num_tokens_in_vnode_256

Re: Java GC pauses, reality check

2016-11-28 Thread Kant Kodali

@Harikrishnan Pillai

Thanks for that. What about the following?


"We are using g1GC in most clusters with *26GB heap* and extra threads
given to parallel and old gen collection. Those clusters 99% is also under
5 ms and doing good".

*So with G1GC, not C4 (Zing's garbage collector), you are able to get under
5 ms?*

*What timeouts are you referring to?*

On Mon, Nov 28, 2016 at 7:39 AM, Harikrishnan Pillai <
hpil...@walmartlabs.com> wrote:

> Hi @Kant Kodali,
>
> 11 /11 , 11 nodes in DC1 and 11 nodes in DC2.
>
>
> ------
> *From:* Kant Kodali <k...@peernova.com>
> *Sent:* Monday, November 28, 2016 6:56 AM
>
> *To:* user@cassandra.apache.org
> *Subject:* Re: Java GC pauses, reality check
>
> Hi Hari,
>
> I am a little bit confused.
>
> What you mean 11/11 ?
>
> "We are using g1GC in most clusters with *26GB heap* and extra threads
> given to parallel and old gen collection. Those clusters 99% is also under
> 5 ms and doing good". So with G1GC you are able get under 5ms not the C4
> (Zing's Garbage Collector?)
>
> What timeouts are you referring to here?
>
> Thanks,
> kant
>
> On Sun, Nov 27, 2016 at 9:57 PM, Harikrishnan Pillai <
> hpil...@walmartlabs.com> wrote:
>
>> Hi @Kant Kodali,
>>
>> We have multiple clusters running zing .
>>
>> One cluster has 11/11 and another one also has 11/11.(190 GB mem,6TB hard
>> disk and 16 Physical core machines)
>>
>> The average read size is around 200 KB and it can go up to 6 MB.
>>
>> We are using g1GC in most clusters with *26GB heap* and extra threads
>> given to parallel and old gen collection. Those clusters 99% is also under
>> 5 ms and doing good. We used Zing to remove all timeouts . If application
>> is not having that requirement G1GC is good.
>>
>> With G1GC I have seen average 200-300 ms min pauses every 4 minutes and
>> 600 ms pauses every 6 hours and 99% latency is under 5-10 ms for most of
>> the clusters having 10- 100 KB of read data.
>>
>> Regards
>>
>> Hari
>> --
>> *From:* Kant Kodali <k...@peernova.com>
>> *Sent:* Saturday, November 26, 2016 8:39:01 PM
>> *To:* user@cassandra.apache.org
>> *Subject:* Re: Java GC pauses, reality check
>>
>> @Harikrishnan Pillai: How many nodes you guys are running? and what is an
>> approximate read size and an approximate write size?
>>
>> On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai <
>> hpil...@walmartlabs.com> wrote:
>>
>>> We are running Azul Zing in prod with 1 million reads/s and 100 K
>>> writes/s. We have never had a major GC above 10 ms.
>>>
>>> Sent from my iPhone
>>>
>>> > On Nov 25, 2016, at 3:49 PM, Martin Schröder <mar...@oneiros.de>
>>> wrote:
>>> >
>>> > 2016-11-25 23:38 GMT+01:00 Kant Kodali <k...@peernova.com>:
>>> >> I would also restate the following sentence "java GC pauses are pretty much
>>> >> a fact of life" to "Any GC based system pauses are pretty much a fact of
>>> >> life".
>>> >>
>>> >> I would be more than happy to see if someone can counter prove.
>>> >
>>> > Azul disagrees.
>>> > https://www.azul.com/products/zing/pgc/
>>> >
>>> > Best
>>> >   Martin
>>>
>>
>>
>

Re: Java GC pauses, reality check

2016-11-28 Thread Kant Kodali

Hi Hari,

I am a little bit confused.

What do you mean by 11/11?

"We are using g1GC in most clusters with *26GB heap* and extra threads
given to parallel and old gen collection. Those clusters 99% is also under
5 ms and doing good". So with G1GC, not C4 (Zing's garbage collector), you
are able to get under 5 ms?

What timeouts are you referring to here?

Thanks,
kant

On Sun, Nov 27, 2016 at 9:57 PM, Harikrishnan Pillai <
hpil...@walmartlabs.com> wrote:

> Hi @Kant Kodali,
>
> We have multiple clusters running zing .
>
> One cluster has 11/11 and another one also has 11/11.(190 GB mem,6TB hard
> disk and 16 Physical core machines)
>
> The average read size is around 200 KB and it can go up to 6 MB.
>
> We are using g1GC in most clusters with *26GB heap* and extra threads
> given to parallel and old gen collection. Those clusters 99% is also under
> 5 ms and doing good. We used Zing to remove all timeouts . If application
> is not having that requirement G1GC is good.
>
> With G1GC I have seen average 200-300 ms min pauses every 4 minutes and
> 600 ms pauses every 6 hours and 99% latency is under 5-10 ms for most of
> the clusters having 10- 100 KB of read data.
>
> Regards
>
> Hari
> --
> *From:* Kant Kodali <k...@peernova.com>
> *Sent:* Saturday, November 26, 2016 8:39:01 PM
> *To:* user@cassandra.apache.org
> *Subject:* Re: Java GC pauses, reality check
>
> @Harikrishnan Pillai: How many nodes you guys are running? and what is an
> approximate read size and an approximate write size?
>
> On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai <
> hpil...@walmartlabs.com> wrote:
>
>> We are running Azul Zing in prod with 1 million reads/s and 100 K
>> writes/s. We have never had a major GC above 10 ms.
>>
>> Sent from my iPhone
>>
>> > On Nov 25, 2016, at 3:49 PM, Martin Schröder <mar...@oneiros.de> wrote:
>> >
>> > 2016-11-25 23:38 GMT+01:00 Kant Kodali <k...@peernova.com>:
>> >> I would also restate the following sentence "java GC pauses are pretty much
>> >> a fact of life" to "Any GC based system pauses are pretty much a fact of
>> >> life".
>> >>
>> >> I would be more than happy to see if someone can counter prove.
>> >
>> > Azul disagrees.
>> > https://www.azul.com/products/zing/pgc/
>> >
>> > Best
>> >   Martin
>>
>
>

Re: Java GC pauses, reality check

2016-11-27 Thread Kant Kodali

Yes, I am well aware of ScyllaDB. It may be well written in C++, but the
performance gain they are claiming has very little to do with moving from
Java to C++. They made major design changes, such as moving away from SEDA
(staged event-driven architecture) to TPC (thread per core) and so on.
Moreover, I would say it still needs to mature. A lot of users have
complained that they cannot reproduce the benchmarks that are posted online,
and I keep seeing comments stating that you need to use specific hardware
and specific tuning mechanisms and so on (I don't mean to say that what
ScyllaDB is claiming is wrong; I certainly haven't verified it, but I do
know for a fact that a lot of people are having trouble reaching those
benchmarks).

SEDA to TPC is a very big change. Let's see how long it would take for
Apache C*

https://issues.apache.org/jira/browse/CASSANDRA-10989




On Sat, Nov 26, 2016 at 11:45 PM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> You are of course right. There is no solution and no language that is a
> perfect match for every situation, and every solution and language has its
> own pros, cons, pitfalls and drawbacks.
> Actually, the article you posted points out some aspects of ARC that I
> wasn't aware of yet.
> Nevertheless, GC is an issue for Cassandra, otherwise this thread would
> not exist, right? But we have to deal with it and get the best out of it.
>
> Another option, besides optimizing your GC: you could check if
> http://www.scylladb.com/ is an option for you.
> They rewrote CS from scratch. The goal is to be completely compatible
> with CS but to be much, much faster. Check their benchmarks and their
> architecture.
> I really do not want to disparage the work of all the Cassandra
> developers - they did a great job - but what I have seen there looked very
> interesting and promising! By the way, it's written in C++.
>
>
> 2016-11-27 7:06 GMT+01:00 Kant Kodali <k...@peernova.com>:
>
>> Automatic Reference counting sounds like college level idea that we all
>> have been hearing for since GC is born! There seem to be bunch of cons of
>> ARC as explained here
>>
>> https://www.quora.com/Why-doesnt-Apple-Swift-adopt-the-memory-management-method-of-garbage-collection-like-in-Java
>>
>> Maintaining C and C++ APPS are never a pain? How about versioning and
>> static time libraries? There is work there too. so its all pros and cons
>>
>> "gc is a pain in the ass". How about seg faults? they aren't any lesser
>> pain :)
>>
>> Not only Cassandra that runs on JVM. Majority of Apache projects do run
>> on JVM for a reason.
>>
>> Bottom line. My point here is there are pros and cons of every language.
>> It doesn't make much sense to target one language.
>>
>>
>>
>>
>>
>>
>> On Sat, Nov 26, 2016 at 9:31 PM, Benjamin Roth <benjamin.r...@jaumo.com>
>> wrote:
>>
>>> ARC means automatic reference counting, which is done at compile time.
>>> E.g. Objective-C and Swift use this technique. There are absolutely no GCs.
>>> It's a completely different memory management technique.
>>>
>>> Why don't I like Java on the server side? Because GC is a pain in the ass.
>>> I have been in this business for over 15 years, and running/maintaining
>>> apps that are built in C or C++ has never been such a pain.
>>>
>>> On the other hand, Java is easier for developers to handle. And coding
>>> plain C is also a pain.
>>>
>>> That's why I said it's a philosophical discussion.
>>> Anyway, Cassandra runs on Java, so we have to deal with it.
>>>
>>> On 27.11.2016 05:28, "Kant Kodali" <k...@peernova.com> wrote:
>>>
>>>> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
>>>> completely I mean no GC pauses whatsoever.
>>>>
>>>> When you say Java is NOT the First choice for Server Applications you
>>>> are generalizing it too much I would say since many of them fall under that
>>>> category. Either way the statement you made is purely subjective.
>>>>
>>>> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth <benjamin.r...@jaumo.com
>>>> > wrote:
>>>>
>>>>> Lol. The counterproof is to use another memory model like ARC. That's
>>>>> why I personally think Java is NOT the first choice for server
>>>>> applications. But that's a philosophical discussion.
>>>>>
>>>>> On 25.11.2016 23:38, "Kant Kodali" <k...@peernova.com> wrote:
>>>>>
>>>>>> +1 Chris Lohfink response
>>>>>>
>>>>>> I would

Re: Java GC pauses, reality check

2016-11-26 Thread Kant Kodali

Automatic reference counting sounds like a college-level idea that we have
all been hearing about since GC was born! There seem to be a bunch of cons
to ARC, as explained here:

https://www.quora.com/Why-doesnt-Apple-Swift-adopt-the-memory-management-method-of-garbage-collection-like-in-Java

Maintaining C and C++ apps is never a pain? How about versioning and
statically linked libraries? There is work there too, so it's all pros and
cons.

"gc is a pain in the ass". How about segfaults? They aren't any less of a
pain :)

Cassandra is not the only project that runs on the JVM. The majority of
Apache projects run on the JVM for a reason.

Bottom line: my point here is that there are pros and cons to every
language. It doesn't make much sense to target one language.






On Sat, Nov 26, 2016 at 9:31 PM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> ARC means automatic reference counting, which is done at compile time. E.g.
> Objective-C and Swift use this technique. There are absolutely no GCs. It's
> a completely different memory management technique.
>
> Why don't I like Java on the server side? Because GC is a pain in the ass.
> I have been in this business for over 15 years, and running/maintaining
> apps that are built in C or C++ has never been such a pain.
>
> On the other hand, Java is easier for developers to handle. And coding
> plain C is also a pain.
>
> That's why I said it's a philosophical discussion.
> Anyway, Cassandra runs on Java, so we have to deal with it.
>
> On 27.11.2016 05:28, "Kant Kodali" <k...@peernova.com> wrote:
>
>> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
>> completely I mean no GC pauses whatsoever.
>>
>> When you say Java is NOT the First choice for Server Applications you
>> are generalizing it too much I would say since many of them fall under that
>> category. Either way the statement you made is purely subjective.
>>
>> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth <benjamin.r...@jaumo.com>
>> wrote:
>>
>>> Lol. The counterproof is to use another memory model like ARC. That's
>>> why I personally think Java is NOT the first choice for server
>>> applications. But that's a philosophical discussion.
>>>
>>> On 25.11.2016 23:38, "Kant Kodali" <k...@peernova.com> wrote:
>>>
>>>> +1 Chris Lohfink response
>>>>
>>>> I would also restate the following sentence "java GC pauses are pretty
>>>> much a fact of life" to "Any GC based system pauses are pretty much a
>>>> fact of life".
>>>>
>>>> I would be more than happy to see if someone can counter prove.
>>>>
>>>>
>>>>
>>>> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink <clohfin...@gmail.com>
>>>> wrote:
>>>>
>>>>> No tuning will eliminate gcs.
>>>>>
>>>>> 20-30 seconds is horrific and out of the ordinary. Most likely
>>>>> implementing antipatterns and/or poorly configured. Sub 1s is realistic but
>>>>> with some workloads still may require some tuning to maintain. Some
>>>>> workloads are very unfriendly to GCs though (ie heavy tombstones, very wide
>>>>> partitions).
>>>>>
>>>>> Chris
>>>>>
>>>>> On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed <sahmed1...@gmail.com> wrote:
>>>>>
>>>>>> Hello!
>>>>>>
>>>>>> From what I understand java GC pauses are pretty much a fact of life,
>>>>>> but you can tune the jvm to reduce the likelihood of the frequency and
>>>>>> length of GC pauses.
>>>>>>
>>>>>> When using Cassandra, how frequent or long have these pauses known to
>>>>>> be?  Even with tuning, is it safe to assume they cannot be eliminated?
>>>>>>
>>>>>> Would a 20-30 second pause be something out of the ordinary?
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>
>>>>>
>>>>
>>

Re: Java GC pauses, reality check

2016-11-26 Thread Kant Kodali

@Harikrishnan Pillai: How many nodes are you guys running? And what are the
approximate read size and write size?

On Fri, Nov 25, 2016 at 7:32 PM, Harikrishnan Pillai <
hpil...@walmartlabs.com> wrote:

> We are running Azul Zing in prod with 1 million reads/s and 100 K writes/s.
> We have never had a major GC above 10 ms.
>
> Sent from my iPhone
>
> > On Nov 25, 2016, at 3:49 PM, Martin Schröder <mar...@oneiros.de> wrote:
> >
> > 2016-11-25 23:38 GMT+01:00 Kant Kodali <k...@peernova.com>:
> >> I would also restate the following sentence "java GC pauses are pretty much
> >> a fact of life" to "Any GC based system pauses are pretty much a fact of
> >> life".
> >>
> >> I would be more than happy to see if someone can counter prove.
> >
> > Azul disagrees.
> > https://www.azul.com/products/zing/pgc/
> >
> > Best
> >   Martin
>

Re: Java GC pauses, reality check

2016-11-26 Thread Kant Kodali

Good to know about Zing! I will have to take a look.

On Sat, Nov 26, 2016 at 8:27 PM, Kant Kodali <k...@peernova.com> wrote:

> Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
> completely I mean no GC pauses whatsoever.
>
> When you say Java is NOT the First choice for Server Applications you are
> generalizing it too much I would say since many of them fall under that
> category. Either way the statement you made is purely subjective.
>
> On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth <benjamin.r...@jaumo.com>
> wrote:
>
>> Lol. The counterproof is to use another memory model like ARC. That's why
>> I personally think Java is NOT the first choice for server applications.
>> But that's a philosophical discussion.
>>
>> On 25.11.2016 23:38, "Kant Kodali" <k...@peernova.com> wrote:
>>
>>> +1 Chris Lohfink response
>>>
>>> I would also restate the following sentence "java GC pauses are pretty
>>> much a fact of life" to "Any GC based system pauses are pretty much a
>>> fact of life".
>>>
>>> I would be more than happy to see if someone can counter prove.
>>>
>>>
>>>
>>> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink <clohfin...@gmail.com>
>>> wrote:
>>>
>>>> No tuning will eliminate gcs.
>>>>
>>>> 20-30 seconds is horrific and out of the ordinary. Most likely
>>>> implementing antipatterns and/or poorly configured. Sub 1s is realistic but
>>>> with some workloads still may require some tuning to maintain. Some
>>>> workloads are very unfriendly to GCs though (ie heavy tombstones, very wide
>>>> partitions).
>>>>
>>>> Chris
>>>>
>>>> On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed <sahmed1...@gmail.com> wrote:
>>>>
>>>>> Hello!
>>>>>
>>>>> From what I understand java GC pauses are pretty much a fact of life,
>>>>> but you can tune the jvm to reduce the likelihood of the frequency and
>>>>> length of GC pauses.
>>>>>
>>>>> When using Cassandra, how frequent or long have these pauses known to
>>>>> be?  Even with tuning, is it safe to assume they cannot be eliminated?
>>>>>
>>>>> Would a 20-30 second pause be something out of the ordinary?
>>>>>
>>>>> Thanks.
>>>>>
>>>>
>>>>
>>>
>

Re: Java GC pauses, reality check

2016-11-26 Thread Kant Kodali

Benjamin Roth: How do you know Arc eliminates GC pauses completely? By
completely I mean no GC pauses whatsoever.

When you say Java is NOT the First choice for Server Applications you are
generalizing it too much I would say since many of them fall under that
category. Either way the statement you made is purely subjective.

On Fri, Nov 25, 2016 at 2:41 PM, Benjamin Roth <benjamin.r...@jaumo.com>
wrote:

> Lol. The counterproof is to use another memory model like ARC. That's why
> I personally think Java is NOT the first choice for server applications.
> But that's a philosophical discussion.
>
> On 25.11.2016 23:38, "Kant Kodali" <k...@peernova.com> wrote:
>
>> +1 Chris Lohfink response
>>
>> I would also restate the following sentence "java GC pauses are pretty
>> much a fact of life" to "Any GC based system pauses are pretty much a
>> fact of life".
>>
>> I would be more than happy to see if someone can counter prove.
>>
>>
>>
>> On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink <clohfin...@gmail.com>
>> wrote:
>>
>>> No tuning will eliminate gcs.
>>>
>>> 20-30 seconds is horrific and out of the ordinary. Most likely
>>> implementing antipatterns and/or poorly configured. Sub 1s is realistic but
>>> with some workloads still may require some tuning to maintain. Some
>>> workloads are very unfriendly to GCs though (ie heavy tombstones, very wide
>>> partitions).
>>>
>>> Chris
>>>
>>> On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed <sahmed1...@gmail.com> wrote:
>>>
>>>> Hello!
>>>>
>>>> From what I understand java GC pauses are pretty much a fact of life,
>>>> but you can tune the jvm to reduce the likelihood of the frequency and
>>>> length of GC pauses.
>>>>
>>>> When using Cassandra, how frequent or long have these pauses known to
>>>> be?  Even with tuning, is it safe to assume they cannot be eliminated?
>>>>
>>>> Would a 20-30 second pause be something out of the ordinary?
>>>>
>>>> Thanks.
>>>>
>>>
>>>
>>

Re: Java GC pauses, reality check

2016-11-25 Thread Kant Kodali

+1 Chris Lohfink response

I would also restate the following sentence "java GC pauses are pretty much
a fact of life" to "Any GC based system pauses are pretty much a fact of
life".

I would be more than happy to see if someone can counter prove.



On Fri, Nov 25, 2016 at 1:41 PM, Chris Lohfink  wrote:

> No tuning will eliminate gcs.
>
> 20-30 seconds is horrific and out of the ordinary. Most likely
> implementing antipatterns and/or poorly configured. Sub 1s is realistic but
> with some workloads still may require some tuning to maintain. Some
> workloads are very unfriendly to GCs though (ie heavy tombstones, very wide
> partitions).
>
> Chris
>
> On Fri, Nov 25, 2016 at 3:25 PM, S Ahmed  wrote:
>
>> Hello!
>>
>> From what I understand java GC pauses are pretty much a fact of life, but
>> you can tune the jvm to reduce the likelihood of the frequency and length
>> of GC pauses.
>>
>> When using Cassandra, how frequent or long have these pauses known to
>> be?  Even with tuning, is it safe to assume they cannot be eliminated?
>>
>> Would a 20-30 second pause be something out of the ordinary?
>>
>> Thanks.
>>
>
>

Re: Is there a way to do Read and Set at Cassandra level?

2016-11-05 Thread Kant Kodali

But then don't I need to evict for every batch of writes? I thought a cache
would make sense when reads/writes > 1, per se. What do you think?

On Sat, Nov 5, 2016 at 3:33 AM, DuyHai Doan <doanduy...@gmail.com> wrote:

> "I have a requirement where I need to know last value that is written
> successfully so I could read that value and do some computation and include
> it in the subsequent write"
>
> Maybe keeping the last written value in a distributed cache is cheaper
> than doing a read before write in Cassandra ?
>
> On Sat, Nov 5, 2016 at 11:24 AM, Kant Kodali <k...@peernova.com> wrote:
>
>> I have a requirement where I need to know last value that is written
>> successfully so I could read that value and do some computation and include
>> it in the subsequent write. For now we are doing read before write which
>> significantly degrades the performance. Light weight transactions are more
>> of a compare and set than a Read and Set. The very first thing I tried is
>> to see if I can eliminate this need by the application but looks like it is
>> a strong requirement for us so I am wondering if there is any way I can
>> optimize that? I know batching could help in the sense I can do one read
>> for every batch so that the writes in the batch doesn't take a read
>> performance hit but I wonder if there is any clever ideas or tricks I can
>> do?
>>
>
>
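The suggestion quoted above, keeping the last written value in a cache instead of reading before every write, might look roughly like this. It is a sketch under strong assumptions: a single writer per key, an in-process dict standing in for the distributed cache, and a fake store standing in for Cassandra. All class and method names are hypothetical.

```python
class FakeStore:
    """Stand-in for the database; counts reads so the saving is visible."""

    def __init__(self):
        self.rows = {}
        self.reads = 0

    def read(self, key):
        self.reads += 1
        return self.rows.get(key)

    def write(self, key, value):
        self.rows[key] = value


class LastValueWriter:
    """Remembers the last successfully written value per key."""

    def __init__(self, store):
        self._store = store
        self._last = {}

    def write_next(self, key, compute):
        if key not in self._last:
            # Cold start only: one read to learn the current value.
            self._last[key] = self._store.read(key)
        new_value = compute(self._last[key])
        self._store.write(key, new_value)
        # Update the cache only after the write returned without raising.
        self._last[key] = new_value
        return new_value


store = FakeStore()
writer = LastValueWriter(store)
for _ in range(5):
    writer.write_next("sensor-1", lambda prev: (prev or 0) + 1)
print(store.rows["sensor-1"], "value after 5 writes;", store.reads, "read(s) issued")
```

With more than one writer per key the cached value can go stale, in which case a shared cache or a lightweight transaction would still be needed to detect conflicts.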
