Re: Scalability of Gossip protocol

2016-08-28 Thread SmartCat - Scott Hirleman
I'd search through some of the VLDB papers that have come out in the last
few years. C* can scale to 100+ nodes more easily than any other technology
I'm aware of; scalability is one of the key factors driving C* adoption.

On Sun, Aug 28, 2016 at 11:28 AM, jean paul wrote:

> Hi, thank you so much for the help.
> Please, is there a scientific study that evaluates the *scalability* of
> Cassandra? Best.
>
> 2016-08-16 20:15 GMT+01:00 Jeff Jirsa:
>
>> Jason Brown has an interesting set of tickets:
>>
>> https://issues.apache.org/jira/browse/CASSANDRA-12345
>>
>> One of the sub-tickets there is
>> https://issues.apache.org/jira/browse/CASSANDRA-12347
>>
>> That ticket links to a relevant paper on the subject (and an alternative
>> to the existing approach):
>> http://www.gsd.inesc-id.pt/~jleitao/pdf/srds10-mario.pdf
>>
>> *From: *jean paul 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Tuesday, August 16, 2016 at 12:07 PM
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *Scalability of Gossip protocol
>>
>>
>>
>> Hi all;
>>
>> Please, is there a scientific study that evaluates the scalability of
>> the Gossip protocol?
>>
>> Thank you so much for help
>>
>> Kind regards.
>>
>>
>>
>
>


-- 
*Scott Hirleman*
*Head of US Marketing and Sales*
www.smartcat.io
https://github.com/smartcat-labs 




Re: Scalability of Gossip protocol

2016-08-28 Thread jean paul
Hi, thank you so much for the help.
Please, is there a scientific study that evaluates the *scalability* of
Cassandra? Best.



Re: Read Repairs and CL

2016-08-28 Thread Ben Slater
In case anyone else is interested - we figured this out. When C* decides it
needs to do a repair based on a digest mismatch from the initial reads for
the consistency level, it does actually try to do a read at CL=ALL in order
to get the most up-to-date data to use for the repair.

This led to an interesting issue in our case where we had one node in an
RF3 cluster down for maintenance (to correct data that became corrupted due
to a severe write overload) and started getting occasional “timeout during
read query at consistency LOCAL_QUORUM” failures. We believe this was due to
the case where data for a read was only available on one of the two up
replicas, which then triggered an attempt to repair and a failed read at
CL=ALL. It seems that CASSANDRA-7947 (a while ago) changed the behaviour so
that C* reports a failure at the originally requested level even when it was
actually the attempted repair read at CL=ALL which could not read
sufficient replicas - a bit confusing (although I can also see how getting
CL=ALL errors when you thought you were reading at QUORUM or ONE would be
confusing).
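
The failure mode described above can be sketched as a toy model in Python.
This is not Cassandra's code: Replica, ReadTimeout and read are hypothetical
names, comparing (value, timestamp) pairs stands in for comparing digests,
and the rule that the repair read must reach every replica is taken from the
description in this message, not from the source tree.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Replica:
    """Hypothetical stand-in for one replica's copy of a single row."""
    name: str
    value: Optional[str]  # None: this replica never received the write
    timestamp: int        # write timestamp; higher means newer
    up: bool = True


class ReadTimeout(Exception):
    """Raised when too few replicas answer."""


def read(replicas: List[Replica], required: int, cl_name: str) -> Optional[str]:
    """Toy coordinator read: `required` replicas must agree, else repair."""
    live = [r for r in replicas if r.up]
    if len(live) < required:
        raise ReadTimeout(f"timeout during read query at consistency {cl_name}")

    contacted = live[:required]
    # Comparing (value, timestamp) pairs stands in for comparing digests.
    if len({(r.value, r.timestamp) for r in contacted}) == 1:
        return contacted[0].value

    # Digest mismatch: per the message, the repair read goes to ALL replicas.
    if len(live) < len(replicas):
        # The failure is reported at the level the client asked for, not ALL.
        raise ReadTimeout(f"timeout during read query at consistency {cl_name}")

    newest = max(replicas, key=lambda r: r.timestamp)
    for r in replicas:  # write the newest version back to every replica
        r.value, r.timestamp = newest.value, newest.timestamp
    return newest.value
```

With RF3, one replica down, and the row present on only one of the two live
replicas, read(replicas, 2, "LOCAL_QUORUM") raises ReadTimeout in this model
even though two replicas answered the initial read.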

Cheers
Ben

On Sun, 28 Aug 2016 at 10:52 Kurt Greaves wrote:

> Looking at the wiki for the read path (
> http://wiki.apache.org/cassandra/ReadPathForUsers), in the bottom diagram
> for reading with a read repair, it states the following when "reading from
> all replica nodes" after there is a hash mismatch:
>
>> If hashes do not match, do conflict resolution. First step is to read all
>> data from all replica nodes excluding the fastest replica (since CL=ALL)
>
> In the bottom left of the diagram it also states:
>
>> In this example:
>> RF>=2
>> CL=ALL
>
> The (since CL=ALL) implies that the CL for the read during the read repair
> is based off the CL of the query. However I don't think that makes sense at
> other CLs. Anyway, I just want to clarify what CL the read for the read
> repair occurs at for cases where the overall query CL is not ALL.
>
> Thanks,
> Kurt.
>
> --
> Kurt Greaves
> k...@instaclustr.com
> www.instaclustr.com
>
-- 

Ben Slater
Chief Product Officer
Instaclustr: Cassandra + Spark - Managed | Consulting | Support
+61 437 929 798


Re: Need help with simple schema for time-series

2016-08-28 Thread Noorul Islam K M

http://kairosdb.github.io/

Regards,
Noorul

Peter Figliozzi writes:

> I have data from many sensors as time-series:
>
>    - Sensor name
>    - Date
>    - Time
>    - Value
>
> I want to query windows of both date and time.  For example, 8am - 9am from
> Aug. 1st to Aug 10th.
>
> Here's what I did:
>
> CREATE TABLE mykeyspace.mytable (
>     sensorname text,
>     date date,
>     time time,
>     data MAP,
>     PRIMARY KEY (sensorname, date, time)
> );
>
>
> However, when we query this, Cassandra restricts us to an "equal" relation
> for the date, if we are to select a window of time.  So with that schema,
> I'd have to query once for each date.
>
>
> What's the right way to do this??  ("Right" defined as extracting a window
> of date and of time in one query.)
>
>
> Thank you,
>
>
> Pete
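
The one-query-per-date workaround mentioned in the question can be sketched
in Python as below. This is only an illustration of that workaround, not an
answer from the thread: per_date_queries is a hypothetical helper, the year
2016 in the usage note is an assumption, and values are interpolated into
the CQL text only for readability (real code would bind them through a
driver's prepared statements).

```python
from datetime import date, timedelta
from typing import List


def per_date_queries(sensor: str, first_day: date, last_day: date,
                     start_time: str, end_time: str) -> List[str]:
    """Build one CQL SELECT per calendar day, each with a time-of-day range."""
    queries = []
    day = first_day
    while day <= last_day:
        queries.append(
            "SELECT * FROM mykeyspace.mytable "
            f"WHERE sensorname = '{sensor}' AND date = '{day.isoformat()}' "
            f"AND time >= '{start_time}' AND time < '{end_time}';"
        )
        day += timedelta(days=1)  # move on to the next date in the window
    return queries
```

For the example window (8am - 9am from Aug. 1st to Aug. 10th), calling
per_date_queries("sensor1", date(2016, 8, 1), date(2016, 8, 10), "08:00:00",
"09:00:00") yields ten statements, one per date, each using an "equal"
relation on date and a range on time.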