Re: Evolving the client protocol

2018-04-29 Thread Avi Kivity
"Bullied"? Neither me nor anyone else made any demands or threats. I 
proposed cooperation, and acknowledged up front, in my first email, that 
cooperation might not be wanted by Cassandra.





On 2018-04-28 20:50, Jeff Jirsa wrote:


You're a committer Mick, if you think it belongs in the database, write the
patches and get them reviewed.  Until then, the project isn't going to be
bullied into changing the protocol without an implementation.

- Jeff




-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-28 Thread mck
Jeff,
 the use of the terms "business concerns" and "technical merit" may be an 
unfortunate choice I made, they can be construed in a narrower way than I 
intended and i apologise Jeff for that confusion. Our community and how we look 
after it does matter.

 The hope I raised was only that we deal with the proposals objectively, 
Sylvain's last paragraph was a good example of that and the best thing I've 
read in the thread so far. 


> Don't get me wrong, protocol-impacting changes/additions are very much
> welcome if reasonable for Cassandra, and both CASSANDRA-14311 and
> CASSANDRA-2848 are
> certainly worthy. Both the definition of done of those ticket certainly
> include the server implementation imo


regards,
Mick

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-28 Thread Jeff Jirsa
On Sat, Apr 28, 2018 at 4:49 AM, mck  wrote:


> We should, as open source contributors, put business concerns to the side
> and welcome opportunities to work across company and product lines.
>


I resent the fact that you're calling this a business concern. This isn't a
business concern, and as a committer and ASF member you should be able to
discern the difference.

Sylvain said:

> The native protocol is the protocol of the Apache Cassandra project and
was
> never meant to be a standard protocol.

and

> Don't get me wrong, protocol-impacting changes/additions are very much
> welcome if reasonable for Cassandra, and both CASSANDRA-14311 and
CASSANDRA-2848 are
> certainly worthy. Both the definition of done of those ticket certainly
> include the server implementation imo,

I said:

> So again: we have a Cassandra native protocol, and we have a process for
> changing it, and that process is contributor agnostic. Anyone who wants a
> change can submit a patch, and it'll get reviewed, and maybe if it's a
good
> idea, it'll get committed, but the chances of a review leading to a commit
> without an implementation is nearly zero.

The only reason business names came into it is that someone drew a false
equivalence between two businesses. They're not equivalent, and the lack of
equivalence likely explains why this thread keeps bouncing around -
Datastax would have written a patch and contributed it to the project, and
Scylla didn't. But again, the lack of protocol changes so far ISN'T because
the project somehow favors one company more than the other (it doesn't),
the protocol changes havent happened because nobody's submitted a patch.

You're a committer Mick, if you think it belongs in the database, write the
patches and get them reviewed.  Until then, the project isn't going to be
bullied into changing the protocol without an implementation.

- Jeff


Re: Evolving the client protocol

2018-04-28 Thread Josh McKenzie
Mick - reference Scylla's recent blog post where Dor speaks directly
about the majority of their users migrating there from the Apache
Cassandra ecosystem. This isn't about business concerns being first,
this is about community concerns being first.

On Sat, Apr 28, 2018 at 7:49 AM, mck  wrote:
>
>> Let met just say that as an observer to this conversation -- and someone
>> who believes that compatibility, extensibility, and frankly competition
>> bring out the best in products -- I'm fairly surprised and disappointed
>> with the apparent hostility many community members have shown toward a
>> sincere attempt by another open source product to find common ground here.
>
>
> I agree with you Eric. It's all understandable, but it did drop my spirit a 
> bit.
>
> I'd like to say thank you to both Avi and Dor for reaching out and making an 
> honest attempt to collaborate.
> We should, as open source contributors, put business concerns to the side and 
> welcome opportunities to work across company and product lines.
>
> Grudges around undermining DataStax's efforts of course can't be ignored, 
> they certainly carry weight, so airing them hopefully evolves us. And the 
> clash of licenses is unfortunate, but a project is free to chose any license 
> they want, the *GPLs are as valid as any other, and prejudices shouldn't be 
> made about the intention behind choosing one over another.
>
> Repeating Nate's main point,  focusing first on CASSANDRA-14311 and 
> CASSANDRA-2848 can help establish the offered goodwill. Hopefully from there 
> if there's a rejection of protocol changes it's clearly out there that it's 
> for technical reasons rather than ill will or commercial interests.
>
> Mick
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-28 Thread mck

> Let met just say that as an observer to this conversation -- and someone
> who believes that compatibility, extensibility, and frankly competition
> bring out the best in products -- I'm fairly surprised and disappointed
> with the apparent hostility many community members have shown toward a
> sincere attempt by another open source product to find common ground here.


I agree with you Eric. It's all understandable, but it did drop my spirit a bit.

I'd like to say thank you to both Avi and Dor for reaching out and making an 
honest attempt to collaborate.
We should, as open source contributors, put business concerns to the side and 
welcome opportunities to work across company and product lines. 

Grudges around undermining DataStax's efforts of course can't be ignored, they 
certainly carry weight, so airing them hopefully evolves us. And the clash of 
licenses is unfortunate, but a project is free to chose any license they want, 
the *GPLs are as valid as any other, and prejudices shouldn't be made about the 
intention behind choosing one over another. 

Repeating Nate's main point,  focusing first on CASSANDRA-14311 and 
CASSANDRA-2848 can help establish the offered goodwill. Hopefully from there if 
there's a rejection of protocol changes it's clearly out there that it's for 
technical reasons rather than ill will or commercial interests.

Mick


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-24 Thread Jeff Jirsa
They aren't even remotely similar, they're VERY different. Here's a few
starting points:

1) Most of Datastax's work for the first 5, 6, 8 years of existence focused
on driving users to cassandra from other DBs (see all of the "Cassandra
Summits" that eventually created trademark friction) ; Scylla's marketing
is squarely Scylla v  Cassandra. Ultimately they're both companies out to
make money, but one has a history of driving users to Cassandra, and the
other is trying to siphon users away from Cassandra.
2) Datastax may not be actively contributing as much as they used to, but
some ridiculous number of engineering hours got paid out of their budget -
maybe 80% of total lines of code? Maybe higher (though it's decreasing day
by day). By contrast, Scylla has exactly zero meaningful concrete code
contributions to the project, uses a license that makes even sharing
concepts prohibitive, only has a handful or so JIRAs opened (which is
better than zero), but has effectively no goodwill in the eyes of many of
the longer-term community members (in large part because of #1, and also
because of the way they positioned their talk-turned-product announcement
at the competitor-funded 2016 summit).
3) Datastax apparently respects the project enough that they'd NEVER come
in and ask for a protocol spec change without providing a reference
implementation.
4) To that end, native protocol changes aren't something anyone is anxious
to shove in without good reason. Even with a reference implementation, and
a REALLY GOOD REASON (namely data correctness / protection from
corruption), https://issues.apache.org/jira/browse/CASSANDRA-13304 has been
sitting patch available for OVER A YEAR.

So again: we have a Cassandra native protocol, and we have a process for
changing it, and that process is contributor agnostic.  Anyone who wants a
change can submit a patch, and it'll get reviewed, and maybe if it's a good
idea, it'll get committed, but the chances of a review leading to a commit
without an implementation is nearly zero.

Would be happy to see this thread die now. There's nothing new coming out
of it.

- Jeff


On Tue, Apr 24, 2018 at 8:30 AM, Eric Stevens  wrote:

> Let met just say that as an observer to this conversation -- and someone
> who believes that compatibility, extensibility, and frankly competition
> bring out the best in products -- I'm fairly surprised and disappointed
> with the apparent hostility many community members have shown toward a
> sincere attempt by another open source product to find common ground here.
>
> Yes, Scylla has a competing OSS project (albeit under a different
> license).  They also have a business built around it.  It's hard for me to
> see that as dramatically different than the DataStax relationship to this
> community.  Though I would love to be shown why.
>


Re: Evolving the client protocol

2018-04-24 Thread Dor Laor
The main point is that we decided to take a strategic decision to invest in
the client
side. We always wanted to get to the state but for natural reasons, it took
us a while.
The client side changes aren't just about a small feature here and there or
stop at
thread per core. Think about the changes that will come in a 3-5 year scope.

Avi had a great idea about changing the underline TCP to UDP. It removes
head-of-the-line
blocking, removes limitations of number of sockets and since clients
restrasmit on timeouts,
it will improve performance a lot.
Another change is in the CDC domain.

Some other idea that comes to my mind is to use IDL and automatic generate
bindings
to different languages, to improve reuse an d standardization Scylla
automatically
generated its internal RPC code from an IDL and modern implementations
should take
this path, especially with polyglot of languages. Believe me, it sounds
more and more compeling
to me as an easier path.




On Tue, Apr 24, 2018 at 9:26 AM, Avi Kivity  wrote:

>
>
> On 2018-04-24 04:18, Nate McCall wrote:
>
>> Folks,
>> Before this goes much further, let's take a step back for a second.
>>
>> I am hearing the following: Folks are fine with CASSANDRA-14311 and
>> CASSANDRA-2848 *BUT* they don't make much sense from the project's
>> perspective without a reference implementation. I think the shard
>> concept is too abstract for the project right now, so we should
>> probably set that one aside.
>>
>> Dor and Avi, I appreciate you both engaging directly on this. Where
>> can we find common ground on this?
>>
>>
> I started with three options:
>
> 1. Scylla (or other protocol implementers) contribute spec changes, and
> each implementer implements them on their own
>
> This was rejected.
>
> 2. Scylla defines and implements spec changes on its own, and when
> Cassandra implements similar changes, it will retroactively apply the
> Scylla change if it makes technical sense
>
> IOW, no gratuitous divergence, but no hard commitment either.
>
> I received no feedback on this.
>
> 3. No cooperation.
>
> This is the fall-back option which I would like to avoid if possible. It's
> main advantage is that it avoids long email threads and flamewars.
>
> There was also a suggestion made in this thread:
>
> 4. Scylla defines spec changes and also implements them for Cassandra
>
> That works for some changes but not all (for example, thread-per-core
> awareness, or changes that require significant effort). I would like to
> find a way that works for all of the changes that we want to make.
>
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Evolving the client protocol

2018-04-24 Thread Avi Kivity



On 2018-04-24 04:18, Nate McCall wrote:

Folks,
Before this goes much further, let's take a step back for a second.

I am hearing the following: Folks are fine with CASSANDRA-14311 and
CASSANDRA-2848 *BUT* they don't make much sense from the project's
perspective without a reference implementation. I think the shard
concept is too abstract for the project right now, so we should
probably set that one aside.

Dor and Avi, I appreciate you both engaging directly on this. Where
can we find common ground on this?



I started with three options:

1. Scylla (or other protocol implementers) contribute spec changes, and 
each implementer implements them on their own


This was rejected.

2. Scylla defines and implements spec changes on its own, and when 
Cassandra implements similar changes, it will retroactively apply the 
Scylla change if it makes technical sense


IOW, no gratuitous divergence, but no hard commitment either.

I received no feedback on this.

3. No cooperation.

This is the fall-back option which I would like to avoid if possible. 
It's main advantage is that it avoids long email threads and flamewars.


There was also a suggestion made in this thread:

4. Scylla defines spec changes and also implements them for Cassandra

That works for some changes but not all (for example, thread-per-core 
awareness, or changes that require significant effort). I would like to 
find a way that works for all of the changes that we want to make.



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-24 Thread Russell Bateman

Eric,

You have to understand the poisonous GPL. It's very different from 
Apache licensing in the sense that, roughly speaking, you're welcome to 
contribute to Scylla, but legally barred from distributing it with or 
inside any product you base on it unless your product source code is 
also open or you contract with Scylla DB. The objections raised by some 
in this thread are based on the inequality of contribution in the two models


On 04/24/2018 09:30 AM, Eric Stevens wrote:

Let met just say that as an observer to this conversation -- and someone
who believes that compatibility, extensibility, and frankly competition
bring out the best in products -- I'm fairly surprised and disappointed
with the apparent hostility many community members have shown toward a
sincere attempt by another open source product to find common ground here.

Yes, Scylla has a competing OSS project (albeit under a different
license).  They also have a business built around it.  It's hard for me to
see that as dramatically different than the DataStax relationship to this
community.  Though I would love to be shown why.





Re: Evolving the client protocol

2018-04-24 Thread Jonathan Haddad
DataStax invested millions of dollars into Cassandra, tens of thousands of
man hours, hosted hundreds of events and has been a major factor in the
success of the project.

ScyllaDB wants us to change the C* protocol in order to improve features in
a competing database which contributes nothing back to the Cassandra
community.

Seems a little different to me.

On Tue, Apr 24, 2018 at 8:30 AM Eric Stevens  wrote:

> Let met just say that as an observer to this conversation -- and someone
> who believes that compatibility, extensibility, and frankly competition
> bring out the best in products -- I'm fairly surprised and disappointed
> with the apparent hostility many community members have shown toward a
> sincere attempt by another open source product to find common ground here.
>
> Yes, Scylla has a competing OSS project (albeit under a different
> license).  They also have a business built around it.  It's hard for me to
> see that as dramatically different than the DataStax relationship to this
> community.  Though I would love to be shown why.
>


Re: Evolving the client protocol

2018-04-24 Thread Eric Stevens
Let met just say that as an observer to this conversation -- and someone
who believes that compatibility, extensibility, and frankly competition
bring out the best in products -- I'm fairly surprised and disappointed
with the apparent hostility many community members have shown toward a
sincere attempt by another open source product to find common ground here.

Yes, Scylla has a competing OSS project (albeit under a different
license).  They also have a business built around it.  It's hard for me to
see that as dramatically different than the DataStax relationship to this
community.  Though I would love to be shown why.


Re: Evolving the client protocol

2018-04-24 Thread Avi Kivity



On 2018-04-23 17:59, Ben Bromhead wrote:


>> This doesn't work without additional changes, for RF>1. The
token ring could place two replicas of the same token range on the
same physical server, even though those are two separate cores of
the same server. You could add another element to the hierarchy
(cluster -> datacenter -> rack -> node -> core/shard), but that
generates unneeded range movements when a node is added.
> I have seen rack awareness used/abused to solve this.
>

But then you lose real rack awareness. It's fine for a quick hack,
but
not a long-term solution.

(it also creates a lot more tokens, something nobody needs)


I'm having trouble understanding how you loose "real" rack awareness, 
as these shards are in the same rack anyway, because the address and 
port are on the same server in the same rack. So it behaves as 
expected. Could you explain a situation where the shards on a single 
server would be in different racks (or fault domains)?


You're right - it continues to work.



If you wanted to support a situation where you have a single rack per 
DC for simple deployments, extending NetworkTopologyStrategy to behave 
the way it did before 
https://issues.apache.org/jira/browse/CASSANDRA-7544 with respect to 
treating InetAddresses as servers rather than the address and port 
would be simple. Both this implementation in Apache Cassandra and the 
respective load balancing classes in the drivers are explicitly 
designed to be pluggable so that would be an easier integration point 
for you.


I'm not sure how it creates more tokens? If a server normally owns 256 
tokens, each shard on a different port would just advertise ownership 
of 256/# of cores (e.g. 4 tokens if you had 64 cores).


Having just 4 tokens results in imbalance. CASSANDRA-7032 mitigates it, 
but only for one replication factor, and doesn't work for decommission.


(and if you have 60 lcores then you get between 4 and 5 tokens per 
lcore, which is a 20% imbalance right there)




> Regards,
> Ariel
>
>> On Apr 22, 2018, at 8:26 AM, Avi Kivity > wrote:
>>
>>
>>
>>> On 2018-04-19 21:15, Ben Bromhead wrote:
>>> Re #3:
>>>
>>> Yup I was thinking each shard/port would appear as a discrete
server to the
>>> client.
>> This doesn't work without additional changes, for RF>1. The
token ring could place two replicas of the same token range on the
same physical server, even though those are two separate cores of
the same server. You could add another element to the hierarchy
(cluster -> datacenter -> rack -> node -> core/shard), but that
generates unneeded range movements when a node is added.
>>
>>> If the per port suggestion is unacceptable due to hardware
requirements,
>>> remembering that Cassandra is built with the concept scaling
*commodity*
>>> hardware horizontally, you'll have to spend your time and
energy convincing
>>> the community to support a protocol feature it has no
(current) use for or
>>> find another interim solution.
>> Those servers are commodity servers (not x86, but still
commodity). In any case 60+ logical cores are common now (hello
AWS i3.16xlarge or even i3.metal), and we can only expect logical
core count to continue to increase (there are 48-core ARM
processors now).
>>
>>> Another way, would be to build support and consensus around a
clear
>>> technical need in the Apache Cassandra project as it stands today.
>>>
>>> One way to build community support might be to contribute an
Apache
>>> licensed thread per core implementation in Java that matches
the protocol
>>> change and shard concept you are looking for ;P
>> I doubt I'll survive the egregious top-posting that is going on
in this list.
>>
>>>
 On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg
> wrote:

 Hi,

 So at technical level I don't understand this yet.

 So you have a database consisting of single threaded shards
and a socket
 for accept that is generating TCP connections and in advance
you don't know
 which connection is going to send messages to which shard.

 What is the mechanism by which you get the packets for a
given TCP
 connection delivered to a specific core? I know that a given
TCP connection
 will normally have all of its packets delivered to the same
queue from the
 NIC because the tuple of source address + port and
destination address +
 port is typically hashed to pick one of the queues the NIC
presents. I
 might have the contents of the tuple slightly wrong, but it
always includes
 a component you don't get to control.

Re: Evolving the client protocol

2018-04-24 Thread Avi Kivity

I have not asked this list to do any work on the drivers.


If Cassandra agrees to Scylla protocol changes (either proactively or 
retroactively) then the benefit to Cassandra is that if the drivers are 
changed (by the driver maintainers or by Scylla developers) then 
Cassandra developers need not do additional work to update the drivers. 
So there is less work for you, in the future, if those features are of 
interest to you.



On 2018-04-24 02:13, Jonathan Haddad wrote:

 From where I stand it looks like you've got only two options for any
feature that involves updating the protocol:

1. Don't built the feature
2. Built it in Cassanda & scylladb, update the drivers accordingly

I don't think you have a third option, which is built it only in ScyllaDB,
because that means you have to fork *all* the drivers and make it work,
then maintain them.  Your business model appears to be built on not doing
any of the driver work yourself, and you certainly aren't giving back to
the open source community via a permissive license on ScyllaDB itself, so
I'm a bit lost here.

To me it looks like you're asking a bunch of volunteers that work on
Cassandra to accommodate you.  What exactly do we get out of this
relationship?  What incentive do I or anyone else have to spend time
helping you instead of working on something that interests me?

Jon


On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead  wrote:


This doesn't work without additional changes, for RF>1. The token ring

could place two replicas of the same token range on the same physical
server, even though those are two separate cores of the same server. You
could add another element to the hierarchy (cluster -> datacenter -> rack
-> node -> core/shard), but that generates unneeded range movements when

a

node is added.

I have seen rack awareness used/abused to solve this.


But then you lose real rack awareness. It's fine for a quick hack, but
not a long-term solution.

(it also creates a lot more tokens, something nobody needs)


I'm having trouble understanding how you loose "real" rack awareness, as
these shards are in the same rack anyway, because the address and port are
on the same server in the same rack. So it behaves as expected. Could you
explain a situation where the shards on a single server would be in
different racks (or fault domains)?

If you wanted to support a situation where you have a single rack per DC
for simple deployments, extending NetworkTopologyStrategy to behave the way
it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
respect to treating InetAddresses as servers rather than the address and
port would be simple. Both this implementation in Apache Cassandra and the
respective load balancing classes in the drivers are explicitly designed to
be pluggable so that would be an easier integration point for you.

I'm not sure how it creates more tokens? If a server normally owns 256
tokens, each shard on a different port would just advertise ownership of
256/# of cores (e.g. 4 tokens if you had 64 cores).



Regards,
Ariel


On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:




On 2018-04-19 21:15, Ben Bromhead wrote:
Re #3:

Yup I was thinking each shard/port would appear as a discrete server

to the

client.

This doesn't work without additional changes, for RF>1. The token ring

could place two replicas of the same token range on the same physical
server, even though those are two separate cores of the same server. You
could add another element to the hierarchy (cluster -> datacenter -> rack
-> node -> core/shard), but that generates unneeded range movements when

a

node is added.

If the per port suggestion is unacceptable due to hardware

requirements,

remembering that Cassandra is built with the concept scaling

*commodity*

hardware horizontally, you'll have to spend your time and energy

convincing

the community to support a protocol feature it has no (current) use

for or

find another interim solution.

Those servers are commodity servers (not x86, but still commodity). In

any case 60+ logical cores are common now (hello AWS i3.16xlarge or even
i3.metal), and we can only expect logical core count to continue to
increase (there are 48-core ARM processors now).

Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the

protocol

change and shard concept you are looking for ;P

I doubt I'll survive the egregious top-posting that is going on in

this

list.

On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg 

wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a

socket

for accept that is generating TCP connections and in advance you

don't know

which connection is going 

Re: Evolving the client protocol

2018-04-24 Thread Dor Laor
On Mon, Apr 23, 2018 at 9:13 PM, San Luoji  wrote:

> Dor,
>
> Setting the Thread Per Core code aside, will your developers commit to
> contribute back both https://issues.apache.org/jira/browse/CASSANDRA-2848
> and https://issues.apache.org/jira/browse/CASSANDRA-14311?
>
> Looks like CASSANDRA-2848 has stalled even though some respectable work was
> done, and CASSANDRA-14311 hasn't been started yet. Some material
> contributions from your team on these two areas will be appreciated.
>

Avi was the one who opened 14311 so you can see we already did the (small)
job
of contributing a design plus discussing it. We do want to enhance the
protocol, contribute
to the driver, implement it in Scylla, all but the C* server
implementation. I think that on some
cases, if the server part is trivial, we'll be able to do it too, it
depends on the feature and its
complexity. There is a chicken and the egg issue here since the driver
maintainer will likely
to refuse the changes if not blessed with the server side which can be an
uphill battle.

Honestly, I'm trying to think whether we can contribute the C* server side
too. While these
two features aren't big, developing a database as we know is complex and
not only the code
needs to meet all edge cases, we need to pass unit tests and dtests, etc.
The code changes aren't big but aren't tiny too:
https://issues.apache.org/jira/secure/attachment/12824672/2848-trunk-v2.txt

Probably we were a bit naive to ask to include spec changes w/o c* code
changes. I think that
in some cases we can contribute code, some cases wouldn't be a match
(thread per core,..) and in
some others, it will makes to change the spec, looking at the above patch,
the spec change is trivial
and having a reference implementation in Scylla plus client driver will
motivate C* devs to
finally code it (really, per request timeouts are basic, I just answered to
a question about
select count timeout ..).
It would have been nice to have a more official way to extend the protocol,
not through the tracing flag
but maybe we're ahead of ourselves, let us pick a starting point and let a
developer dive in to one of these
first. Avi is welcome to over rule me ;)


> On Mon, Apr 23, 2018 at 6:17 PM, Dor Laor  wrote:
>
> > On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli 
> > wrote:
> >
> > > Is one of the “abuse” of Apache license is ScyllaDB which is using
> > > Cassandra but not contributing back?
> > >
> >
> > It's not that we have a private version of Cassandra and we don't release
> > all of it or some of it back..
> >
> > We didn't contribute because we have a different server base. We always
> > contribute where it makes sense.
> > I'll be happy to have several beers or emails about the cons and pros of
> > open source licensing but I don't think
> > this is the case. The discussion is about whether the community wish to
> > accept our contributions, we initiated it,
> > didn't we?
> >
> > Let's be practical, I think it's not reasonable to commit C* protocol
> > changes that the community doesn't intend
> > to implement in C* in the short term (thread-per-core like), it's not
> > reasonable to expect Scylla to contribute
> > such a huge effort to the C* server. It is reasonable to collaborate
> around
> > protocol enhancements that are acceptable,
> > even without coding and make sure the protocol is enhanceable in a way
> that
> > forward compatible.
> >
> >
> > Happy to be proved wrong as I am not a lawyer and don’t understand
> various
> > > licenses ..
> > >
> > > > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> > > >
> > > >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad  >
> > > wrote:
> > > >>
> > > >> From where I stand it looks like you've got only two options for any
> > > >> feature that involves updating the protocol:
> > > >>
> > > >> 1. Don't built the feature
> > > >> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> > > >>
> > > >> I don't think you have a third option, which is built it only in
> > > ScyllaDB,
> > > >> because that means you have to fork *all* the drivers and make it
> > work,
> > > >> then maintain them.  Your business model appears to be built on not
> > > doing
> > > >> any of the driver work yourself, and you certainly aren't giving
> back
> > to
> > > >> the open source community via a permissive license on ScyllaDB
> itself,
> > > so
> > > >> I'm a bit lost here.
> > > >>
> > > >
> > > > It's totally not about business model.
> > > > Scylla itself is 99% open source with AGPL license that prevents
> abuse
> > > and
> > > > forces to be committed back to the project. We also have our core
> > engine
> > > > (seastar) licensed
> > > > as Apache since it needs to be integrated with  the core application.
> > > > Recently one of our community members even created a new Seastar
> based,
> > > C++
> > > > driver.
> > > >
> > > > Scylla chose to be compatible 

Re: Evolving the client protocol

2018-04-23 Thread San Luoji
Dor,

Setting the Thread Per Core code aside, will your developers commit to
contribute back both https://issues.apache.org/jira/browse/CASSANDRA-2848
and https://issues.apache.org/jira/browse/CASSANDRA-14311?

Looks like CASSANDRA-2848 has stalled even though some respectable work was
done, and CASSANDRA-14311 hasn't been started yet. Some material
contributions from your team on these two areas will be appreciated.

On Mon, Apr 23, 2018 at 6:17 PM, Dor Laor  wrote:

> On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli 
> wrote:
>
> > Is one of the “abuse” of Apache license is ScyllaDB which is using
> > Cassandra but not contributing back?
> >
>
> It's not that we have a private version of Cassandra and we don't release
> all of it or some of it back..
>
> We didn't contribute because we have a different server base. We always
> contribute where it makes sense.
> I'll be happy to have several beers or emails about the cons and pros of
> open source licensing but I don't think
> this is the case. The discussion is about whether the community wish to
> accept our contributions, we initiated it,
> didn't we?
>
> Let's be practical, I think it's not reasonable to commit C* protocol
> changes that the community doesn't intend
> to implement in C* in the short term (thread-per-core like), it's not
> reasonable to expect Scylla to contribute
> such a huge effort to the C* server. It is reasonable to collaborate around
> protocol enhancements that are acceptable,
> even without coding and make sure the protocol is enhanceable in a way that
> forward compatible.
>
>
> Happy to be proved wrong as I am not a lawyer and don’t understand various
> > licenses ..
> >
> > > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> > >
> > >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad 
> > wrote:
> > >>
> > >> From where I stand it looks like you've got only two options for any
> > >> feature that involves updating the protocol:
> > >>
> > >> 1. Don't built the feature
> > >> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> > >>
> > >> I don't think you have a third option, which is built it only in
> > ScyllaDB,
> > >> because that means you have to fork *all* the drivers and make it
> work,
> > >> then maintain them.  Your business model appears to be built on not
> > doing
> > >> any of the driver work yourself, and you certainly aren't giving back
> to
> > >> the open source community via a permissive license on ScyllaDB itself,
> > so
> > >> I'm a bit lost here.
> > >>
> > >
> > > It's totally not about business model.
> > > Scylla itself is 99% open source with AGPL license that prevents abuse
> > and
> > > forces to be committed back to the project. We also have our core
> engine
> > > (seastar) licensed
> > > as Apache since it needs to be integrated with  the core application.
> > > Recently one of our community members even created a new Seastar based,
> > C++
> > > driver.
> > >
> > > Scylla chose to be compatible with the drivers in order to leverage the
> > > existing infrastructure
> > > and (let's be frank) in order to allow smooth migration.
> > > We would have loved to contribute more to the drivers but up to
> recently
> > we:
> > > 1. Were busy on top of our heads with the server
> > > 2. Happy w/ the existing drivers
> > > 3. Developed extensions - GoCQLX - our own contribution
> > >
> > > Finally we can contribute back to the same driver project, we want to
> do
> > it
> > > the right way,
> > > without forking and without duplicated efforts.
> > >
> > > Many times, having a private fork is way easier than proper open source
> > > work so from
> > > a pure business perspective, we don't select the shortest path.
> > >
> > >
> > >>
> > >> To me it looks like you're asking a bunch of volunteers that work on
> > >> Cassandra to accommodate you.  What exactly do we get out of this
> > >> relationship?  What incentive do I or anyone else have to spend time
> > >> helping you instead of working on something that interests me?
> > >>
> > >
> > > Jon, this is certainty not the case.
> > > We genuinely wish to make true *open source* work on:
> > > a. Cassandra drivers
> > > b. Client protocol
> > > c. Scylla server side.
> > > d. Cassandra community related work: mailing list, Jira, design
> > >
> > > But not
> > > e. Cassandra server side
> > >
> > > While I wouldn't mind doing the Cassandra server work, we don't have
> the
> > > resources or
> > > the expertise. The Cassandra _developer_ community is welcome to decide
> > > whether
> > > we get to contribute a/b/c/d. Avi has enumerated the options of
> > > cooperation, passive cooperation
> > > and zero cooperation (below).
> > >
> > > 1. The protocol change is developed using the Cassandra process in a
> JIRA
> > > ticket, culminating in a patch to doc/native_protocol*.spec when
> > consensus
> > > is achieved.
> > > 2. The protocol change is developed outside the 

Re: Evolving the client protocol

2018-04-23 Thread Jeff Jirsa
Respectfully, there’s pretty much already apparent consensus among those with a 
vote (unless I missed some dissenting opinion while I was on vacation).

Its been expressed multiple times by committers and members of the PMC that 
it’s Cassandra native protocol, it belongs in the protocol when it’s 
implemented. I haven’t seen ANY committers or members of the PMC make an 
argument that we should alter the spec without a matching implementation. 

Unless a committer wants to make an argument that we should change the spec 
without changing the implementation, this conversation can end. 

The spec is what the server implements. Anything we don’t implement can use the 
arbitrary payload from the zipkin tracing ticket or fork.

-- 
Jeff Jirsa


> On Apr 23, 2018, at 6:18 PM, Nate McCall  wrote:
> 
> Folks,
> Before this goes much further, let's take a step back for a second.
> 
> I am hearing the following: Folks are fine with CASSANDRA-14311 and
> CASSANDRA-2848 *BUT* they don't make much sense from the project's
> perspective without a reference implementation. I think the shard
> concept is too abstract for the project right now, so we should
> probably set that one aside.
> 
> Dor and Avi, I appreciate you both engaging directly on this. Where
> can we find common ground on this?
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-23 Thread Josh McKenzie
Apologies Nate - didn't realize I'd overlapped with you stepping in and
trying to bring us all back to reason.

I'll take my leave of the conversation at this point. :)

On Mon, Apr 23, 2018 at 9:30 PM, Josh McKenzie  wrote:

> > Datastax, Apple, Instaclstr,
> > thelastpickle and everyone else
> > receive different benefits
> You have mentioned a variety of vendors who received benefits while making
> major contributions back to the project. Comparing Scylla's relationship to
> the Cassandra ecosystem to this list is a false equivalency, and honestly
> one of the sillier things I've seen on this mailing list.
>
> > The C* ecosystem can either shrink or expand. We offer to expand it.
> Your company has not established a precedent for this, whatsoever, since
> its inception. Forgive those of us that don't take you at face value with
> this claim.
>
> On Mon, Apr 23, 2018, 8:54 PM Dor Laor  wrote:
>
>> On Mon, Apr 23, 2018 at 5:28 PM, Josh McKenzie 
>> wrote:
>>
>> > > it's not
>> > > reasonable to expect Scylla to contribute
>> > > such a huge effort to the C* server
>> >
>> > But it's reasonable that a major portion of Scylla's business model is
>> > profiting off those huge efforts other companies have made?
>> >
>> > Seems a little hypocritical to me.
>> >
>>
>> We're an open source based vendor, it's not a secret.
>> Last I checked, all participates on the thread should get business
>> benefits
>> and we all
>> got benefits from following the Dynamo/BigTable path.
>> We never zig-zaged and have very consistent open source approach.
>>
>> We're all here to make some type of profit.
>> Datastax, Apple, Instaclstr, thelastpickle and everyone else receive
>> different benefits,
>> from PR benefits, commercial benefits, service credibility, expertise
>> benefits, personal
>> carrier benefits and fun too.
>>
>> The C* ecosystem can either shrink or expand. We offer to expand it.
>>
>>
>>
>>
>> >
>> > On Mon, Apr 23, 2018, 8:18 PM Dor Laor  wrote:
>> >
>> > > On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli <
>> kohlisank...@gmail.com>
>> > > wrote:
>> > >
>> > > > Is one of the “abuse” of Apache license is ScyllaDB which is using
>> > > > Cassandra but not contributing back?
>> > > >
>> > >
>> > > It's not that we have a private version of Cassandra and we don't
>> release
>> > > all of it or some of it back..
>> > >
>> > > We didn't contribute because we have a different server base. We
>> always
>> > > contribute where it makes sense.
>> > > I'll be happy to have several beers or emails about the cons and pros
>> of
>> > > open source licensing but I don't think
>> > > this is the case. The discussion is about whether the community wish
>> to
>> > > accept our contributions, we initiated it,
>> > > didn't we?
>> > >
>> > > Let's be practical, I think it's not reasonable to commit C* protocol
>> > > changes that the community doesn't intend
>> > > to implement in C* in the short term (thread-per-core like), it's not
>> > > reasonable to expect Scylla to contribute
>> > > such a huge effort to the C* server. It is reasonable to collaborate
>> > around
>> > > protocol enhancements that are acceptable,
>> > > even without coding and make sure the protocol is enhanceable in a way
>> > that
>> > > forward compatible.
>> > >
>> > >
>> > > Happy to be proved wrong as I am not a lawyer and don’t understand
>> > various
>> > > > licenses ..
>> > > >
>> > > > > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
>> > > > >
>> > > > >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad <
>> j...@jonhaddad.com
>> > >
>> > > > wrote:
>> > > > >>
>> > > > >> From where I stand it looks like you've got only two options for
>> any
>> > > > >> feature that involves updating the protocol:
>> > > > >>
>> > > > >> 1. Don't built the feature
>> > > > >> 2. Built it in Cassanda & scylladb, update the drivers
>> accordingly
>> > > > >>
>> > > > >> I don't think you have a third option, which is built it only in
>> > > > ScyllaDB,
>> > > > >> because that means you have to fork *all* the drivers and make it
>> > > work,
>> > > > >> then maintain them.  Your business model appears to be built on
>> not
>> > > > doing
>> > > > >> any of the driver work yourself, and you certainly aren't giving
>> > back
>> > > to
>> > > > >> the open source community via a permissive license on ScyllaDB
>> > itself,
>> > > > so
>> > > > >> I'm a bit lost here.
>> > > > >>
>> > > > >
>> > > > > It's totally not about business model.
>> > > > > Scylla itself is 99% open source with AGPL license that prevents
>> > abuse
>> > > > and
>> > > > > forces to be committed back to the project. We also have our core
>> > > engine
>> > > > > (seastar) licensed
>> > > > > as Apache since it needs to be integrated with  the core
>> application.
>> > > > > Recently one of our community members even created a new Seastar
>> > based,
>> > > > C++
>> > > > > driver.
>> > > > >
>> 

Re: Evolving the client protocol

2018-04-23 Thread Josh McKenzie
> Datastax, Apple, Instaclstr,
> thelastpickle and everyone else
> receive different benefits
You have mentioned a variety of vendors who received benefits while making
major contributions back to the project. Comparing Scylla's relationship to
the Cassandra ecosystem to this list is a false equivalency, and honestly
one of the sillier things I've seen on this mailing list.

> The C* ecosystem can either shrink or expand. We offer to expand it.
Your company has not established a precedent for this, whatsoever, since
its inception. Forgive those of us that don't take you at face value with
this claim.

On Mon, Apr 23, 2018, 8:54 PM Dor Laor  wrote:

> On Mon, Apr 23, 2018 at 5:28 PM, Josh McKenzie 
> wrote:
>
> > > it's not
> > > reasonable to expect Scylla to contribute
> > > such a huge effort to the C* server
> >
> > But it's reasonable that a major portion of Scylla's business model is
> > profiting off those huge efforts other companies have made?
> >
> > Seems a little hypocritical to me.
> >
>
> We're an open source based vendor, it's not a secret.
> Last I checked, all participates on the thread should get business benefits
> and we all
> got benefits from following the Dynamo/BigTable path.
> We never zig-zaged and have very consistent open source approach.
>
> We're all here to make some type of profit.
> Datastax, Apple, Instaclstr, thelastpickle and everyone else receive
> different benefits,
> from PR benefits, commercial benefits, service credibility, expertise
> benefits, personal
> carrier benefits and fun too.
>
> The C* ecosystem can either shrink or expand. We offer to expand it.
>
>
>
>
> >
> > On Mon, Apr 23, 2018, 8:18 PM Dor Laor  wrote:
> >
> > > On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli  >
> > > wrote:
> > >
> > > > Is one of the “abuse” of Apache license is ScyllaDB which is using
> > > > Cassandra but not contributing back?
> > > >
> > >
> > > It's not that we have a private version of Cassandra and we don't
> release
> > > all of it or some of it back..
> > >
> > > We didn't contribute because we have a different server base. We always
> > > contribute where it makes sense.
> > > I'll be happy to have several beers or emails about the cons and pros
> of
> > > open source licensing but I don't think
> > > this is the case. The discussion is about whether the community wish to
> > > accept our contributions, we initiated it,
> > > didn't we?
> > >
> > > Let's be practical, I think it's not reasonable to commit C* protocol
> > > changes that the community doesn't intend
> > > to implement in C* in the short term (thread-per-core like), it's not
> > > reasonable to expect Scylla to contribute
> > > such a huge effort to the C* server. It is reasonable to collaborate
> > around
> > > protocol enhancements that are acceptable,
> > > even without coding and make sure the protocol is enhanceable in a way
> > that
> > > forward compatible.
> > >
> > >
> > > Happy to be proved wrong as I am not a lawyer and don’t understand
> > various
> > > > licenses ..
> > > >
> > > > > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> > > > >
> > > > >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad <
> j...@jonhaddad.com
> > >
> > > > wrote:
> > > > >>
> > > > >> From where I stand it looks like you've got only two options for
> any
> > > > >> feature that involves updating the protocol:
> > > > >>
> > > > >> 1. Don't built the feature
> > > > >> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> > > > >>
> > > > >> I don't think you have a third option, which is built it only in
> > > > ScyllaDB,
> > > > >> because that means you have to fork *all* the drivers and make it
> > > work,
> > > > >> then maintain them.  Your business model appears to be built on
> not
> > > > doing
> > > > >> any of the driver work yourself, and you certainly aren't giving
> > back
> > > to
> > > > >> the open source community via a permissive license on ScyllaDB
> > itself,
> > > > so
> > > > >> I'm a bit lost here.
> > > > >>
> > > > >
> > > > > It's totally not about business model.
> > > > > Scylla itself is 99% open source with AGPL license that prevents
> > abuse
> > > > and
> > > > > forces to be committed back to the project. We also have our core
> > > engine
> > > > > (seastar) licensed
> > > > > as Apache since it needs to be integrated with  the core
> application.
> > > > > Recently one of our community members even created a new Seastar
> > based,
> > > > C++
> > > > > driver.
> > > > >
> > > > > Scylla chose to be compatible with the drivers in order to leverage
> > the
> > > > > existing infrastructure
> > > > > and (let's be frank) in order to allow smooth migration.
> > > > > We would have loved to contribute more to the drivers but up to
> > > recently
> > > > we:
> > > > > 1. Were busy on top of our heads with the server
> > > > > 2. Happy w/ the existing drivers
> > > > > 3. 

Re: Evolving the client protocol

2018-04-23 Thread Nate McCall
Folks,
Before this goes much further, let's take a step back for a second.

I am hearing the following: Folks are fine with CASSANDRA-14311 and
CASSANDRA-2848 *BUT* they don't make much sense from the project's
perspective without a reference implementation. I think the shard
concept is too abstract for the project right now, so we should
probably set that one aside.

Dor and Avi, I appreciate you both engaging directly on this. Where
can we find common ground on this?

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-23 Thread Dor Laor
On Mon, Apr 23, 2018 at 5:28 PM, Josh McKenzie  wrote:

> > it's not
> > reasonable to expect Scylla to contribute
> > such a huge effort to the C* server
>
> But it's reasonable that a major portion of Scylla's business model is
> profiting off those huge efforts other companies have made?
>
> Seems a little hypocritical to me.
>

We're an open source based vendor, it's not a secret.
Last I checked, all participates on the thread should get business benefits
and we all
got benefits from following the Dynamo/BigTable path.
We never zig-zaged and have very consistent open source approach.

We're all here to make some type of profit.
Datastax, Apple, Instaclstr, thelastpickle and everyone else receive
different benefits,
from PR benefits, commercial benefits, service credibility, expertise
benefits, personal
carrier benefits and fun too.

The C* ecosystem can either shrink or expand. We offer to expand it.




>
> On Mon, Apr 23, 2018, 8:18 PM Dor Laor  wrote:
>
> > On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli 
> > wrote:
> >
> > > Is one of the “abuse” of Apache license is ScyllaDB which is using
> > > Cassandra but not contributing back?
> > >
> >
> > It's not that we have a private version of Cassandra and we don't release
> > all of it or some of it back..
> >
> > We didn't contribute because we have a different server base. We always
> > contribute where it makes sense.
> > I'll be happy to have several beers or emails about the cons and pros of
> > open source licensing but I don't think
> > this is the case. The discussion is about whether the community wish to
> > accept our contributions, we initiated it,
> > didn't we?
> >
> > Let's be practical, I think it's not reasonable to commit C* protocol
> > changes that the community doesn't intend
> > to implement in C* in the short term (thread-per-core like), it's not
> > reasonable to expect Scylla to contribute
> > such a huge effort to the C* server. It is reasonable to collaborate
> around
> > protocol enhancements that are acceptable,
> > even without coding and make sure the protocol is enhanceable in a way
> that
> > forward compatible.
> >
> >
> > Happy to be proved wrong as I am not a lawyer and don’t understand
> various
> > > licenses ..
> > >
> > > > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> > > >
> > > >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad  >
> > > wrote:
> > > >>
> > > >> From where I stand it looks like you've got only two options for any
> > > >> feature that involves updating the protocol:
> > > >>
> > > >> 1. Don't built the feature
> > > >> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> > > >>
> > > >> I don't think you have a third option, which is built it only in
> > > ScyllaDB,
> > > >> because that means you have to fork *all* the drivers and make it
> > work,
> > > >> then maintain them.  Your business model appears to be built on not
> > > doing
> > > >> any of the driver work yourself, and you certainly aren't giving
> back
> > to
> > > >> the open source community via a permissive license on ScyllaDB
> itself,
> > > so
> > > >> I'm a bit lost here.
> > > >>
> > > >
> > > > It's totally not about business model.
> > > > Scylla itself is 99% open source with AGPL license that prevents
> abuse
> > > and
> > > > forces to be committed back to the project. We also have our core
> > engine
> > > > (seastar) licensed
> > > > as Apache since it needs to be integrated with  the core application.
> > > > Recently one of our community members even created a new Seastar
> based,
> > > C++
> > > > driver.
> > > >
> > > > Scylla chose to be compatible with the drivers in order to leverage
> the
> > > > existing infrastructure
> > > > and (let's be frank) in order to allow smooth migration.
> > > > We would have loved to contribute more to the drivers but up to
> > recently
> > > we:
> > > > 1. Were busy on top of our heads with the server
> > > > 2. Happy w/ the existing drivers
> > > > 3. Developed extensions - GoCQLX - our own contribution
> > > >
> > > > Finally we can contribute back to the same driver project, we want to
> > do
> > > it
> > > > the right way,
> > > > without forking and without duplicated efforts.
> > > >
> > > > Many times, having a private fork is way easier than proper open
> source
> > > > work so from
> > > > a pure business perspective, we don't select the shortest path.
> > > >
> > > >
> > > >>
> > > >> To me it looks like you're asking a bunch of volunteers that work on
> > > >> Cassandra to accommodate you.  What exactly do we get out of this
> > > >> relationship?  What incentive do I or anyone else have to spend time
> > > >> helping you instead of working on something that interests me?
> > > >>
> > > >
> > > > Jon, this is certainty not the case.
> > > > We genuinely wish to make true *open source* work on:
> > > > a. Cassandra drivers
> > > > b. Client 

Re: Evolving the client protocol

2018-04-23 Thread Sankalp Kohli
If you are so concerned about forking protocol, why did you fork the server? 
Please pick a side and not what is suitable for your Business. 




On Apr 23, 2018, at 17:28, Josh McKenzie  wrote:

>> it's not
>> reasonable to expect Scylla to contribute
>> such a huge effort to the C* server
> 
> But it's reasonable that a major portion of Scylla's business model is
> profiting off those huge efforts other companies have made?
> 
> Seems a little hypocritical to me.
> 
>> On Mon, Apr 23, 2018, 8:18 PM Dor Laor  wrote:
>> 
>> On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli 
>> wrote:
>> 
>>> Is one of the “abuse” of Apache license is ScyllaDB which is using
>>> Cassandra but not contributing back?
>>> 
>> 
>> It's not that we have a private version of Cassandra and we don't release
>> all of it or some of it back..
>> 
>> We didn't contribute because we have a different server base. We always
>> contribute where it makes sense.
>> I'll be happy to have several beers or emails about the cons and pros of
>> open source licensing but I don't think
>> this is the case. The discussion is about whether the community wish to
>> accept our contributions, we initiated it,
>> didn't we?
>> 
>> Let's be practical, I think it's not reasonable to commit C* protocol
>> changes that the community doesn't intend
>> to implement in C* in the short term (thread-per-core like), it's not
>> reasonable to expect Scylla to contribute
>> such a huge effort to the C* server. It is reasonable to collaborate around
>> protocol enhancements that are acceptable,
>> even without coding and make sure the protocol is enhanceable in a way that
>> forward compatible.
>> 
>> 
>> Happy to be proved wrong as I am not a lawyer and don’t understand various
>>> licenses ..
>>> 
> On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> 
> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad 
>>> wrote:
> 
> From where I stand it looks like you've got only two options for any
> feature that involves updating the protocol:
> 
> 1. Don't built the feature
> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> 
> I don't think you have a third option, which is built it only in
>>> ScyllaDB,
> because that means you have to fork *all* the drivers and make it
>> work,
> then maintain them.  Your business model appears to be built on not
>>> doing
> any of the driver work yourself, and you certainly aren't giving back
>> to
> the open source community via a permissive license on ScyllaDB itself,
>>> so
> I'm a bit lost here.
> 
 
 It's totally not about business model.
 Scylla itself is 99% open source with AGPL license that prevents abuse
>>> and
 forces to be committed back to the project. We also have our core
>> engine
 (seastar) licensed
 as Apache since it needs to be integrated with  the core application.
 Recently one of our community members even created a new Seastar based,
>>> C++
 driver.
 
 Scylla chose to be compatible with the drivers in order to leverage the
 existing infrastructure
 and (let's be frank) in order to allow smooth migration.
 We would have loved to contribute more to the drivers but up to
>> recently
>>> we:
 1. Were busy on top of our heads with the server
 2. Happy w/ the existing drivers
 3. Developed extensions - GoCQLX - our own contribution
 
 Finally we can contribute back to the same driver project, we want to
>> do
>>> it
 the right way,
 without forking and without duplicated efforts.
 
 Many times, having a private fork is way easier than proper open source
 work so from
 a pure business perspective, we don't select the shortest path.
 
 
> 
> To me it looks like you're asking a bunch of volunteers that work on
> Cassandra to accommodate you.  What exactly do we get out of this
> relationship?  What incentive do I or anyone else have to spend time
> helping you instead of working on something that interests me?
> 
 
 Jon, this is certainty not the case.
 We genuinely wish to make true *open source* work on:
 a. Cassandra drivers
 b. Client protocol
 c. Scylla server side.
 d. Cassandra community related work: mailing list, Jira, design
 
 But not
 e. Cassandra server side
 
 While I wouldn't mind doing the Cassandra server work, we don't have
>> the
 resources or
 the expertise. The Cassandra _developer_ community is welcome to decide
 whether
 we get to contribute a/b/c/d. Avi has enumerated the options of
 cooperation, passive cooperation
 and zero cooperation (below).
 
 1. The protocol change is developed using the Cassandra process in a
>> JIRA
 ticket, culminating in a patch to doc/native_protocol*.spec when
>>> consensus
 

Re: Evolving the client protocol

2018-04-23 Thread Josh McKenzie
> it's not
> reasonable to expect Scylla to contribute
> such a huge effort to the C* server

But it's reasonable that a major portion of Scylla's business model is
profiting off those huge efforts other companies have made?

Seems a little hypocritical to me.

On Mon, Apr 23, 2018, 8:18 PM Dor Laor  wrote:

> On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli 
> wrote:
>
> > Is one of the “abuse” of Apache license is ScyllaDB which is using
> > Cassandra but not contributing back?
> >
>
> It's not that we have a private version of Cassandra and we don't release
> all of it or some of it back..
>
> We didn't contribute because we have a different server base. We always
> contribute where it makes sense.
> I'll be happy to have several beers or emails about the cons and pros of
> open source licensing but I don't think
> this is the case. The discussion is about whether the community wish to
> accept our contributions, we initiated it,
> didn't we?
>
> Let's be practical, I think it's not reasonable to commit C* protocol
> changes that the community doesn't intend
> to implement in C* in the short term (thread-per-core like), it's not
> reasonable to expect Scylla to contribute
> such a huge effort to the C* server. It is reasonable to collaborate around
> protocol enhancements that are acceptable,
> even without coding and make sure the protocol is enhanceable in a way that
> forward compatible.
>
>
> Happy to be proved wrong as I am not a lawyer and don’t understand various
> > licenses ..
> >
> > > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> > >
> > >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad 
> > wrote:
> > >>
> > >> From where I stand it looks like you've got only two options for any
> > >> feature that involves updating the protocol:
> > >>
> > >> 1. Don't built the feature
> > >> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> > >>
> > >> I don't think you have a third option, which is built it only in
> > ScyllaDB,
> > >> because that means you have to fork *all* the drivers and make it
> work,
> > >> then maintain them.  Your business model appears to be built on not
> > doing
> > >> any of the driver work yourself, and you certainly aren't giving back
> to
> > >> the open source community via a permissive license on ScyllaDB itself,
> > so
> > >> I'm a bit lost here.
> > >>
> > >
> > > It's totally not about business model.
> > > Scylla itself is 99% open source with AGPL license that prevents abuse
> > and
> > > forces to be committed back to the project. We also have our core
> engine
> > > (seastar) licensed
> > > as Apache since it needs to be integrated with  the core application.
> > > Recently one of our community members even created a new Seastar based,
> > C++
> > > driver.
> > >
> > > Scylla chose to be compatible with the drivers in order to leverage the
> > > existing infrastructure
> > > and (let's be frank) in order to allow smooth migration.
> > > We would have loved to contribute more to the drivers but up to
> recently
> > we:
> > > 1. Were busy on top of our heads with the server
> > > 2. Happy w/ the existing drivers
> > > 3. Developed extensions - GoCQLX - our own contribution
> > >
> > > Finally we can contribute back to the same driver project, we want to
> do
> > it
> > > the right way,
> > > without forking and without duplicated efforts.
> > >
> > > Many times, having a private fork is way easier than proper open source
> > > work so from
> > > a pure business perspective, we don't select the shortest path.
> > >
> > >
> > >>
> > >> To me it looks like you're asking a bunch of volunteers that work on
> > >> Cassandra to accommodate you.  What exactly do we get out of this
> > >> relationship?  What incentive do I or anyone else have to spend time
> > >> helping you instead of working on something that interests me?
> > >>
> > >
> > > Jon, this is certainty not the case.
> > > We genuinely wish to make true *open source* work on:
> > > a. Cassandra drivers
> > > b. Client protocol
> > > c. Scylla server side.
> > > d. Cassandra community related work: mailing list, Jira, design
> > >
> > > But not
> > > e. Cassandra server side
> > >
> > > While I wouldn't mind doing the Cassandra server work, we don't have
> the
> > > resources or
> > > the expertise. The Cassandra _developer_ community is welcome to decide
> > > whether
> > > we get to contribute a/b/c/d. Avi has enumerated the options of
> > > cooperation, passive cooperation
> > > and zero cooperation (below).
> > >
> > > 1. The protocol change is developed using the Cassandra process in a
> JIRA
> > > ticket, culminating in a patch to doc/native_protocol*.spec when
> > consensus
> > > is achieved.
> > > 2. The protocol change is developed outside the Cassandra process.
> > > 3. No cooperation.
> > >
> > > Look, I can understand the hostility and suspicious, however, from the
> C*
> > > project POV, it makes no

Re: Evolving the client protocol

2018-04-23 Thread Dor Laor
On Mon, Apr 23, 2018 at 5:03 PM, Sankalp Kohli 
wrote:

> Is one of the “abuse” of Apache license is ScyllaDB which is using
> Cassandra but not contributing back?
>

It's not that we have a private version of Cassandra and we don't release
all of it or some of it back..

We didn't contribute because we have a different server base. We always
contribute where it makes sense.
I'll be happy to have several beers or emails about the cons and pros of
open source licensing but I don't think
this is the case. The discussion is about whether the community wish to
accept our contributions, we initiated it,
didn't we?

Let's be practical, I think it's not reasonable to commit C* protocol
changes that the community doesn't intend
to implement in C* in the short term (thread-per-core like), it's not
reasonable to expect Scylla to contribute
such a huge effort to the C* server. It is reasonable to collaborate around
protocol enhancements that are acceptable,
even without coding and make sure the protocol is enhanceable in a way that
forward compatible.


Happy to be proved wrong as I am not a lawyer and don’t understand various
> licenses ..
>
> > On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> >
> >> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad 
> wrote:
> >>
> >> From where I stand it looks like you've got only two options for any
> >> feature that involves updating the protocol:
> >>
> >> 1. Don't built the feature
> >> 2. Built it in Cassanda & scylladb, update the drivers accordingly
> >>
> >> I don't think you have a third option, which is built it only in
> ScyllaDB,
> >> because that means you have to fork *all* the drivers and make it work,
> >> then maintain them.  Your business model appears to be built on not
> doing
> >> any of the driver work yourself, and you certainly aren't giving back to
> >> the open source community via a permissive license on ScyllaDB itself,
> so
> >> I'm a bit lost here.
> >>
> >
> > It's totally not about business model.
> > Scylla itself is 99% open source with AGPL license that prevents abuse
> and
> > forces to be committed back to the project. We also have our core engine
> > (seastar) licensed
> > as Apache since it needs to be integrated with  the core application.
> > Recently one of our community members even created a new Seastar based,
> C++
> > driver.
> >
> > Scylla chose to be compatible with the drivers in order to leverage the
> > existing infrastructure
> > and (let's be frank) in order to allow smooth migration.
> > We would have loved to contribute more to the drivers but up to recently
> we:
> > 1. Were busy on top of our heads with the server
> > 2. Happy w/ the existing drivers
> > 3. Developed extensions - GoCQLX - our own contribution
> >
> > Finally we can contribute back to the same driver project, we want to do
> it
> > the right way,
> > without forking and without duplicated efforts.
> >
> > Many times, having a private fork is way easier than proper open source
> > work so from
> > a pure business perspective, we don't select the shortest path.
> >
> >
> >>
> >> To me it looks like you're asking a bunch of volunteers that work on
> >> Cassandra to accommodate you.  What exactly do we get out of this
> >> relationship?  What incentive do I or anyone else have to spend time
> >> helping you instead of working on something that interests me?
> >>
> >
> > Jon, this is certainty not the case.
> > We genuinely wish to make true *open source* work on:
> > a. Cassandra drivers
> > b. Client protocol
> > c. Scylla server side.
> > d. Cassandra community related work: mailing list, Jira, design
> >
> > But not
> > e. Cassandra server side
> >
> > While I wouldn't mind doing the Cassandra server work, we don't have the
> > resources or
> > the expertise. The Cassandra _developer_ community is welcome to decide
> > whether
> > we get to contribute a/b/c/d. Avi has enumerated the options of
> > cooperation, passive cooperation
> > and zero cooperation (below).
> >
> > 1. The protocol change is developed using the Cassandra process in a JIRA
> > ticket, culminating in a patch to doc/native_protocol*.spec when
> consensus
> > is achieved.
> > 2. The protocol change is developed outside the Cassandra process.
> > 3. No cooperation.
> >
> > Look, I can understand the hostility and suspicious, however, from the C*
> > project POV, it makes no
> > sense to ignore, otherwise we'll fork the drivers and you won't get
> > anything back. There is another
> > at least one vendor today with their server fork and driver fork and it
> > makes sense to keep the protocol
> > unified in an extensible way and to discuss new features _together_.
> >
> >
> >
> >>
> >> Jon
> >>
> >>
> >> On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead 
> wrote:
> >>
> >> This doesn't work without additional changes, for RF>1. The token
> >> ring
>  could place two replicas of the same token range on the same 

Re: Evolving the client protocol

2018-04-23 Thread Sankalp Kohli
Is one of the “abuse” of Apache license is ScyllaDB which is using Cassandra 
but not contributing back?
Happy to be proved wrong as I am not a lawyer and don’t understand various 
licenses ..

> On Apr 23, 2018, at 16:55, Dor Laor  wrote:
> 
>> On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad  wrote:
>> 
>> From where I stand it looks like you've got only two options for any
>> feature that involves updating the protocol:
>> 
>> 1. Don't built the feature
>> 2. Built it in Cassanda & scylladb, update the drivers accordingly
>> 
>> I don't think you have a third option, which is built it only in ScyllaDB,
>> because that means you have to fork *all* the drivers and make it work,
>> then maintain them.  Your business model appears to be built on not doing
>> any of the driver work yourself, and you certainly aren't giving back to
>> the open source community via a permissive license on ScyllaDB itself, so
>> I'm a bit lost here.
>> 
> 
> It's totally not about business model.
> Scylla itself is 99% open source with AGPL license that prevents abuse and
> forces to be committed back to the project. We also have our core engine
> (seastar) licensed
> as Apache since it needs to be integrated with  the core application.
> Recently one of our community members even created a new Seastar based, C++
> driver.
> 
> Scylla chose to be compatible with the drivers in order to leverage the
> existing infrastructure
> and (let's be frank) in order to allow smooth migration.
> We would have loved to contribute more to the drivers but up to recently we:
> 1. Were busy on top of our heads with the server
> 2. Happy w/ the existing drivers
> 3. Developed extensions - GoCQLX - our own contribution
> 
> Finally we can contribute back to the same driver project, we want to do it
> the right way,
> without forking and without duplicated efforts.
> 
> Many times, having a private fork is way easier than proper open source
> work so from
> a pure business perspective, we don't select the shortest path.
> 
> 
>> 
>> To me it looks like you're asking a bunch of volunteers that work on
>> Cassandra to accommodate you.  What exactly do we get out of this
>> relationship?  What incentive do I or anyone else have to spend time
>> helping you instead of working on something that interests me?
>> 
> 
> Jon, this is certainty not the case.
> We genuinely wish to make true *open source* work on:
> a. Cassandra drivers
> b. Client protocol
> c. Scylla server side.
> d. Cassandra community related work: mailing list, Jira, design
> 
> But not
> e. Cassandra server side
> 
> While I wouldn't mind doing the Cassandra server work, we don't have the
> resources or
> the expertise. The Cassandra _developer_ community is welcome to decide
> whether
> we get to contribute a/b/c/d. Avi has enumerated the options of
> cooperation, passive cooperation
> and zero cooperation (below).
> 
> 1. The protocol change is developed using the Cassandra process in a JIRA
> ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> is achieved.
> 2. The protocol change is developed outside the Cassandra process.
> 3. No cooperation.
> 
> Look, I can understand the hostility and suspicious, however, from the C*
> project POV, it makes no
> sense to ignore, otherwise we'll fork the drivers and you won't get
> anything back. There is another
> at least one vendor today with their server fork and driver fork and it
> makes sense to keep the protocol
> unified in an extensible way and to discuss new features _together_.
> 
> 
> 
>> 
>> Jon
>> 
>> 
>> On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead  wrote:
>> 
>> This doesn't work without additional changes, for RF>1. The token
>> ring
 could place two replicas of the same token range on the same physical
 server, even though those are two separate cores of the same server.
>> You
 could add another element to the hierarchy (cluster -> datacenter ->
>> rack
 -> node -> core/shard), but that generates unneeded range movements
>> when
>>> a
 node is added.
> I have seen rack awareness used/abused to solve this.
> 
 
 But then you lose real rack awareness. It's fine for a quick hack, but
 not a long-term solution.
 
 (it also creates a lot more tokens, something nobody needs)
 
>>> 
>>> I'm having trouble understanding how you loose "real" rack awareness, as
>>> these shards are in the same rack anyway, because the address and port
>> are
>>> on the same server in the same rack. So it behaves as expected. Could you
>>> explain a situation where the shards on a single server would be in
>>> different racks (or fault domains)?
>>> 
>>> If you wanted to support a situation where you have a single rack per DC
>>> for simple deployments, extending NetworkTopologyStrategy to behave the
>> way
>>> it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
>>> respect to treating 

Re: Evolving the client protocol

2018-04-23 Thread Dor Laor
On Mon, Apr 23, 2018 at 4:13 PM, Jonathan Haddad  wrote:

> From where I stand it looks like you've got only two options for any
> feature that involves updating the protocol:
>
> 1. Don't built the feature
> 2. Built it in Cassanda & scylladb, update the drivers accordingly
>
> I don't think you have a third option, which is built it only in ScyllaDB,
> because that means you have to fork *all* the drivers and make it work,
> then maintain them.  Your business model appears to be built on not doing
> any of the driver work yourself, and you certainly aren't giving back to
> the open source community via a permissive license on ScyllaDB itself, so
> I'm a bit lost here.
>

It's totally not about business model.
Scylla itself is 99% open source with AGPL license that prevents abuse and
forces to be committed back to the project. We also have our core engine
(seastar) licensed
as Apache since it needs to be integrated with  the core application.
Recently one of our community members even created a new Seastar based, C++
driver.

Scylla chose to be compatible with the drivers in order to leverage the
existing infrastructure
and (let's be frank) in order to allow smooth migration.
We would have loved to contribute more to the drivers but up to recently we:
1. Were busy on top of our heads with the server
2. Happy w/ the existing drivers
3. Developed extensions - GoCQLX - our own contribution

Finally we can contribute back to the same driver project, we want to do it
the right way,
without forking and without duplicated efforts.

Many times, having a private fork is way easier than proper open source
work so from
a pure business perspective, we don't select the shortest path.


>
> To me it looks like you're asking a bunch of volunteers that work on
> Cassandra to accommodate you.  What exactly do we get out of this
> relationship?  What incentive do I or anyone else have to spend time
> helping you instead of working on something that interests me?
>

Jon, this is certainty not the case.
We genuinely wish to make true *open source* work on:
a. Cassandra drivers
b. Client protocol
c. Scylla server side.
d. Cassandra community related work: mailing list, Jira, design

But not
e. Cassandra server side

While I wouldn't mind doing the Cassandra server work, we don't have the
resources or
the expertise. The Cassandra _developer_ community is welcome to decide
whether
we get to contribute a/b/c/d. Avi has enumerated the options of
cooperation, passive cooperation
and zero cooperation (below).

1. The protocol change is developed using the Cassandra process in a JIRA
ticket, culminating in a patch to doc/native_protocol*.spec when consensus
is achieved.
2. The protocol change is developed outside the Cassandra process.
3. No cooperation.

Look, I can understand the hostility and suspicious, however, from the C*
project POV, it makes no
sense to ignore, otherwise we'll fork the drivers and you won't get
anything back. There is another
at least one vendor today with their server fork and driver fork and it
makes sense to keep the protocol
unified in an extensible way and to discuss new features _together_.



>
> Jon
>
>
> On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead  wrote:
>
> > > >> This doesn't work without additional changes, for RF>1. The token
> ring
> > > could place two replicas of the same token range on the same physical
> > > server, even though those are two separate cores of the same server.
> You
> > > could add another element to the hierarchy (cluster -> datacenter ->
> rack
> > > -> node -> core/shard), but that generates unneeded range movements
> when
> > a
> > > node is added.
> > > > I have seen rack awareness used/abused to solve this.
> > > >
> > >
> > > But then you lose real rack awareness. It's fine for a quick hack, but
> > > not a long-term solution.
> > >
> > > (it also creates a lot more tokens, something nobody needs)
> > >
> >
> > I'm having trouble understanding how you loose "real" rack awareness, as
> > these shards are in the same rack anyway, because the address and port
> are
> > on the same server in the same rack. So it behaves as expected. Could you
> > explain a situation where the shards on a single server would be in
> > different racks (or fault domains)?
> >
> > If you wanted to support a situation where you have a single rack per DC
> > for simple deployments, extending NetworkTopologyStrategy to behave the
> way
> > it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
> > respect to treating InetAddresses as servers rather than the address and
> > port would be simple. Both this implementation in Apache Cassandra and
> the
> > respective load balancing classes in the drivers are explicitly designed
> to
> > be pluggable so that would be an easier integration point for you.
> >
> > I'm not sure how it creates more tokens? If a server normally owns 256
> > tokens, each shard on a different port would just 

Re: Evolving the client protocol

2018-04-23 Thread Jonathan Haddad
>From where I stand it looks like you've got only two options for any
feature that involves updating the protocol:

1. Don't built the feature
2. Built it in Cassanda & scylladb, update the drivers accordingly

I don't think you have a third option, which is built it only in ScyllaDB,
because that means you have to fork *all* the drivers and make it work,
then maintain them.  Your business model appears to be built on not doing
any of the driver work yourself, and you certainly aren't giving back to
the open source community via a permissive license on ScyllaDB itself, so
I'm a bit lost here.

To me it looks like you're asking a bunch of volunteers that work on
Cassandra to accommodate you.  What exactly do we get out of this
relationship?  What incentive do I or anyone else have to spend time
helping you instead of working on something that interests me?

Jon


On Mon, Apr 23, 2018 at 7:59 AM Ben Bromhead  wrote:

> > >> This doesn't work without additional changes, for RF>1. The token ring
> > could place two replicas of the same token range on the same physical
> > server, even though those are two separate cores of the same server. You
> > could add another element to the hierarchy (cluster -> datacenter -> rack
> > -> node -> core/shard), but that generates unneeded range movements when
> a
> > node is added.
> > > I have seen rack awareness used/abused to solve this.
> > >
> >
> > But then you lose real rack awareness. It's fine for a quick hack, but
> > not a long-term solution.
> >
> > (it also creates a lot more tokens, something nobody needs)
> >
>
> I'm having trouble understanding how you loose "real" rack awareness, as
> these shards are in the same rack anyway, because the address and port are
> on the same server in the same rack. So it behaves as expected. Could you
> explain a situation where the shards on a single server would be in
> different racks (or fault domains)?
>
> If you wanted to support a situation where you have a single rack per DC
> for simple deployments, extending NetworkTopologyStrategy to behave the way
> it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
> respect to treating InetAddresses as servers rather than the address and
> port would be simple. Both this implementation in Apache Cassandra and the
> respective load balancing classes in the drivers are explicitly designed to
> be pluggable so that would be an easier integration point for you.
>
> I'm not sure how it creates more tokens? If a server normally owns 256
> tokens, each shard on a different port would just advertise ownership of
> 256/# of cores (e.g. 4 tokens if you had 64 cores).
>
>
> >
> > > Regards,
> > > Ariel
> > >
> > >> On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:
> > >>
> > >>
> > >>
> > >>> On 2018-04-19 21:15, Ben Bromhead wrote:
> > >>> Re #3:
> > >>>
> > >>> Yup I was thinking each shard/port would appear as a discrete server
> > to the
> > >>> client.
> > >> This doesn't work without additional changes, for RF>1. The token ring
> > could place two replicas of the same token range on the same physical
> > server, even though those are two separate cores of the same server. You
> > could add another element to the hierarchy (cluster -> datacenter -> rack
> > -> node -> core/shard), but that generates unneeded range movements when
> a
> > node is added.
> > >>
> > >>> If the per port suggestion is unacceptable due to hardware
> > requirements,
> > >>> remembering that Cassandra is built with the concept scaling
> > *commodity*
> > >>> hardware horizontally, you'll have to spend your time and energy
> > convincing
> > >>> the community to support a protocol feature it has no (current) use
> > for or
> > >>> find another interim solution.
> > >> Those servers are commodity servers (not x86, but still commodity). In
> > any case 60+ logical cores are common now (hello AWS i3.16xlarge or even
> > i3.metal), and we can only expect logical core count to continue to
> > increase (there are 48-core ARM processors now).
> > >>
> > >>> Another way, would be to build support and consensus around a clear
> > >>> technical need in the Apache Cassandra project as it stands today.
> > >>>
> > >>> One way to build community support might be to contribute an Apache
> > >>> licensed thread per core implementation in Java that matches the
> > protocol
> > >>> change and shard concept you are looking for ;P
> > >> I doubt I'll survive the egregious top-posting that is going on in
> this
> > list.
> > >>
> > >>>
> >  On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg 
> > wrote:
> > 
> >  Hi,
> > 
> >  So at technical level I don't understand this yet.
> > 
> >  So you have a database consisting of single threaded shards and a
> > socket
> >  for accept that is generating TCP connections and in advance you
> > don't know
> >  which connection is going to send messages to which shard.
> > 
> > 

Re: Evolving the client protocol

2018-04-23 Thread Ben Bromhead
> >> This doesn't work without additional changes, for RF>1. The token ring
> could place two replicas of the same token range on the same physical
> server, even though those are two separate cores of the same server. You
> could add another element to the hierarchy (cluster -> datacenter -> rack
> -> node -> core/shard), but that generates unneeded range movements when a
> node is added.
> > I have seen rack awareness used/abused to solve this.
> >
>
> But then you lose real rack awareness. It's fine for a quick hack, but
> not a long-term solution.
>
> (it also creates a lot more tokens, something nobody needs)
>

I'm having trouble understanding how you loose "real" rack awareness, as
these shards are in the same rack anyway, because the address and port are
on the same server in the same rack. So it behaves as expected. Could you
explain a situation where the shards on a single server would be in
different racks (or fault domains)?

If you wanted to support a situation where you have a single rack per DC
for simple deployments, extending NetworkTopologyStrategy to behave the way
it did before https://issues.apache.org/jira/browse/CASSANDRA-7544 with
respect to treating InetAddresses as servers rather than the address and
port would be simple. Both this implementation in Apache Cassandra and the
respective load balancing classes in the drivers are explicitly designed to
be pluggable so that would be an easier integration point for you.

I'm not sure how it creates more tokens? If a server normally owns 256
tokens, each shard on a different port would just advertise ownership of
256/# of cores (e.g. 4 tokens if you had 64 cores).


>
> > Regards,
> > Ariel
> >
> >> On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:
> >>
> >>
> >>
> >>> On 2018-04-19 21:15, Ben Bromhead wrote:
> >>> Re #3:
> >>>
> >>> Yup I was thinking each shard/port would appear as a discrete server
> to the
> >>> client.
> >> This doesn't work without additional changes, for RF>1. The token ring
> could place two replicas of the same token range on the same physical
> server, even though those are two separate cores of the same server. You
> could add another element to the hierarchy (cluster -> datacenter -> rack
> -> node -> core/shard), but that generates unneeded range movements when a
> node is added.
> >>
> >>> If the per port suggestion is unacceptable due to hardware
> requirements,
> >>> remembering that Cassandra is built with the concept scaling
> *commodity*
> >>> hardware horizontally, you'll have to spend your time and energy
> convincing
> >>> the community to support a protocol feature it has no (current) use
> for or
> >>> find another interim solution.
> >> Those servers are commodity servers (not x86, but still commodity). In
> any case 60+ logical cores are common now (hello AWS i3.16xlarge or even
> i3.metal), and we can only expect logical core count to continue to
> increase (there are 48-core ARM processors now).
> >>
> >>> Another way, would be to build support and consensus around a clear
> >>> technical need in the Apache Cassandra project as it stands today.
> >>>
> >>> One way to build community support might be to contribute an Apache
> >>> licensed thread per core implementation in Java that matches the
> protocol
> >>> change and shard concept you are looking for ;P
> >> I doubt I'll survive the egregious top-posting that is going on in this
> list.
> >>
> >>>
>  On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg 
> wrote:
> 
>  Hi,
> 
>  So at technical level I don't understand this yet.
> 
>  So you have a database consisting of single threaded shards and a
> socket
>  for accept that is generating TCP connections and in advance you
> don't know
>  which connection is going to send messages to which shard.
> 
>  What is the mechanism by which you get the packets for a given TCP
>  connection delivered to a specific core? I know that a given TCP
> connection
>  will normally have all of its packets delivered to the same queue
> from the
>  NIC because the tuple of source address + port and destination
> address +
>  port is typically hashed to pick one of the queues the NIC presents. I
>  might have the contents of the tuple slightly wrong, but it always
> includes
>  a component you don't get to control.
> 
>  Since it's hashing how do you manipulate which queue packets for a TCP
>  connection go to and how is it made worse by having an accept socket
> per
>  shard?
> 
>  You also mention 160 ports as bad, but it doesn't sound like a big
> number
>  resource wise. Is it an operational headache?
> 
>  RE tokens distributed amongst shards. The way that would work right
> now is
>  that each port number appears to be a discrete instance of the
> server. So
>  you could have shards be actual shards that are simply colocated on
> the
>  same box, run in the 

Re: Evolving the client protocol

2018-04-23 Thread Avi Kivity



On 2018-04-22 23:35, Josh McKenzie wrote:

The drivers are not part of Cassandra, so what "the server" is for drivers is 
up to their maintainer.

I'm pretty sure the driver communities don't spend a lot of time
worrying about their Scylla compatibility. That's your cross to bear.


To clarify, I wasn't asking this list for help with the client drivers. 
The purpose of this thread was to see if we can find a way to avoid 
forking the protocol.



On Sun, Apr 22, 2018 at 11:00 AM, Ariel Weisberg  wrote:

Hi,


This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.

I have seen rack awareness used/abused to solve this.

Regards,
Ariel


On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:




On 2018-04-19 21:15, Ben Bromhead wrote:
Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.

This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.


If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.

Those servers are commodity servers (not x86, but still commodity). In any case 
60+ logical cores are common now (hello AWS i3.16xlarge or even i3.metal), and 
we can only expect logical core count to continue to increase (there are 
48-core ARM processors now).


Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P

I doubt I'll survive the egregious top-posting that is going on in this list.




On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket
for accept that is generating TCP connections and in advance you don't know
which connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP
connection delivered to a specific core? I know that a given TCP connection
will normally have all of its packets delivered to the same queue from the
NIC because the tuple of source address + port and destination address +
port is typically hashed to pick one of the queues the NIC presents. I
might have the contents of the tuple slightly wrong, but it always includes
a component you don't get to control.

Since it's hashing how do you manipulate which queue packets for a TCP
connection go to and how is it made worse by having an accept socket per
shard?

You also mention 160 ports as bad, but it doesn't sound like a big number
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is
that each port number appears to be a discrete instance of the server. So
you could have shards be actual shards that are simply colocated on the
same box, run in the same process, and share resources. I know this pushes
more of the complexity into the server vs the driver as the server expects
all shards to share some client visible like system tables and certain
identifiers.

Ariel

On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit will with the NICs ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)


On 2018-04-19 19:46, Ben Bromhead wrote:
WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to

Re: Evolving the client protocol

2018-04-23 Thread Avi Kivity



On 2018-04-22 18:00, Ariel Weisberg wrote:

Hi,


This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.

I have seen rack awareness used/abused to solve this.



But then you lose real rack awareness. It's fine for a quick hack, but 
not a long-term solution.


(it also creates a lot more tokens, something nobody needs)


Regards,
Ariel


On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:




On 2018-04-19 21:15, Ben Bromhead wrote:
Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.

This doesn't work without additional changes, for RF>1. The token ring could place two 
replicas of the same token range on the same physical server, even though those are two 
separate cores of the same server. You could add another element to the hierarchy (cluster 
-> datacenter -> rack -> node -> core/shard), but that generates unneeded range 
movements when a node is added.


If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.

Those servers are commodity servers (not x86, but still commodity). In any case 
60+ logical cores are common now (hello AWS i3.16xlarge or even i3.metal), and 
we can only expect logical core count to continue to increase (there are 
48-core ARM processors now).


Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P

I doubt I'll survive the egregious top-posting that is going on in this list.




On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket
for accept that is generating TCP connections and in advance you don't know
which connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP
connection delivered to a specific core? I know that a given TCP connection
will normally have all of its packets delivered to the same queue from the
NIC because the tuple of source address + port and destination address +
port is typically hashed to pick one of the queues the NIC presents. I
might have the contents of the tuple slightly wrong, but it always includes
a component you don't get to control.

Since it's hashing how do you manipulate which queue packets for a TCP
connection go to and how is it made worse by having an accept socket per
shard?

You also mention 160 ports as bad, but it doesn't sound like a big number
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is
that each port number appears to be a discrete instance of the server. So
you could have shards be actual shards that are simply colocated on the
same box, run in the same process, and share resources. I know this pushes
more of the complexity into the server vs the driver as the server expects
all shards to share some client visible like system tables and certain
identifiers.

Ariel

On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit will with the NICs ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)


On 2018-04-19 19:46, Ben Bromhead wrote:
WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so their might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg 

wrote:


Re: Evolving the client protocol

2018-04-22 Thread Jeff Jirsa



On Apr 20, 2018, at 5:03 AM, Sylvain Lebresne  wrote:

>> 
>> 
>> Those were just given as examples. Each would be discussed on its own,
>> assuming we are able to find a way to cooperate.
>> 
>> 
>> These are relatively simple and it wouldn't be hard for use to patch
>> Cassandra. But I want to find a way to make more complicated protocol
>> changes where it wouldn't be realistic for us to modify Cassandra.
>> 
> 
> That's where I'm confused with what you are truly asking.
> 
> The native protocol is the protocol of the Apache Cassandra project and was
> never meant to be a standard protocol. If the ask is to move towards more
> of handling the protocol as a standard that would evolve independently of
> whether Cassandra implements it (would the project commit to implement it
> eventually?), then let's be clear on what the concrete suggestion is and
> have this discussion (but to be upfront, the short version of my personal
> opinion is that this would likely be a big distraction with relatively low
> merits for the project, so I'm very unconvinced).
> 
> But if that's not the ask, what is it exactly? That we agree to commit
> changes
> to the protocol spec before we have actually implemented them? If so, I just
> don't get it. The downsides are clear (we risk the feature is either never
> implemeted due to lack of contributions/loss of interest, or that the
> protocol
> changes committed are not fully suitable to the final implementation) but
> what
> benefit to the project can that ever have?

Agree with everything here 

> 
> Don't get me wrong, protocol-impacting changes/additions are very much
> welcome
> if reasonable for Cassandra, and both CASSANDRA-14311 and CASSANDRA-2848 are
> certainly worthy. Both the definition of done of those ticket certainly
> include the server implementation imo,

Also agree here - any changes to protocol on the Apache Cassandra side have to 
come with the implementation, otherwise you should consider using the optional 
arbitrary k/v map that zipkin tracing leverages for arbitrary payloads.


> not just changing the protocol spec
> file. As for the shard notion, it makes no sense for Cassandra at this point
> in time, so unless an additional contribution makes it so that it start to
> make
> sense, I'm not sure why we'd add anything related to it to the protocol.
> 
> --
> Sylvain
> 
> 
> 
>> 
>>> RE #3,
>>> 
>>> It's hard to be +1 on this because we don't benefit by boxing ourselves
>> in by defining a spec we haven't implemented, tested, and decided we are
>> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
>> what if Cassandra decides to go a different direction in some way?
>> 
>> Such a proposal would include negotiation about the sharding algorithm
>> used to prevent Cassandra being boxed in. Of course it's impossible to
>> guarantee that a new idea won't come up that requires more changes.
>> 
>>> I don't think there is much discussion to be had without an example of
>> the the changes to the CQL specification to look at, but even then if it
>> looks risky I am not likely to be in favor of it.
>>> 
>>> Regards,
>>> Ariel
>>> 
 On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
 
 On 2018/04/19 07:19:27, kurt greaves  wrote:
>> 1. The protocol change is developed using the Cassandra process in
>>a JIRA ticket, culminating in a patch to
>>doc/native_protocol*.spec when consensus is achieved.
> I don't think forking would be desirable (for anyone) so this seems
> the most reasonable to me. For 1 and 2 it certainly makes sense but
> can't say I know enough about sharding to comment on 3 - seems to me
> like it could be locking in a design before anyone truly knows what
> sharding in C* looks like. But hopefully I'm wrong and there are
> devs out there that have already thought that through.
 Thanks. That is our view and is great to hear.
 
 About our proposal number 3: In my view, good protocol designs are
 future proof and flexible. We certainly don't want to propose a design
 that works just for Scylla, but would support reasonable
 implementations regardless of how they may look like.
 
> Do we have driver authors who wish to support both projects?
> 
> Surely, but I imagine it would be a minority. ​
> 
 -
 To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
 additional commands, e-mail: dev-h...@cassandra.apache.org
 
>>> -
>>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>>> For additional commands, e-mail: dev-h...@cassandra.apache.org
>>> 
>> 
>> 
>> -
>> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
>> For additional 

Re: Evolving the client protocol

2018-04-22 Thread Josh McKenzie
> The drivers are not part of Cassandra, so what "the server" is for drivers is 
> up to their maintainer.

I'm pretty sure the driver communities don't spend a lot of time
worrying about their Scylla compatibility. That's your cross to bear.

On Sun, Apr 22, 2018 at 11:00 AM, Ariel Weisberg  wrote:
> Hi,
>
>> This doesn't work without additional changes, for RF>1. The token ring could 
>> place two replicas of the same token range on the same physical server, even 
>> though those are two separate cores of the same server. You could add 
>> another element to the hierarchy (cluster -> datacenter -> rack -> node -> 
>> core/shard), but that generates unneeded range movements when a node is 
>> added.
>
> I have seen rack awareness used/abused to solve this.
>
> Regards,
> Ariel
>
>> On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:
>>
>>
>>
>>> On 2018-04-19 21:15, Ben Bromhead wrote:
>>> Re #3:
>>>
>>> Yup I was thinking each shard/port would appear as a discrete server to the
>>> client.
>>
>> This doesn't work without additional changes, for RF>1. The token ring could 
>> place two replicas of the same token range on the same physical server, even 
>> though those are two separate cores of the same server. You could add 
>> another element to the hierarchy (cluster -> datacenter -> rack -> node -> 
>> core/shard), but that generates unneeded range movements when a node is 
>> added.
>>
>>> If the per port suggestion is unacceptable due to hardware requirements,
>>> remembering that Cassandra is built with the concept scaling *commodity*
>>> hardware horizontally, you'll have to spend your time and energy convincing
>>> the community to support a protocol feature it has no (current) use for or
>>> find another interim solution.
>>
>> Those servers are commodity servers (not x86, but still commodity). In any 
>> case 60+ logical cores are common now (hello AWS i3.16xlarge or even 
>> i3.metal), and we can only expect logical core count to continue to increase 
>> (there are 48-core ARM processors now).
>>
>>>
>>> Another way, would be to build support and consensus around a clear
>>> technical need in the Apache Cassandra project as it stands today.
>>>
>>> One way to build community support might be to contribute an Apache
>>> licensed thread per core implementation in Java that matches the protocol
>>> change and shard concept you are looking for ;P
>>
>> I doubt I'll survive the egregious top-posting that is going on in this list.
>>
>>>
>>>
 On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:

 Hi,

 So at technical level I don't understand this yet.

 So you have a database consisting of single threaded shards and a socket
 for accept that is generating TCP connections and in advance you don't know
 which connection is going to send messages to which shard.

 What is the mechanism by which you get the packets for a given TCP
 connection delivered to a specific core? I know that a given TCP connection
 will normally have all of its packets delivered to the same queue from the
 NIC because the tuple of source address + port and destination address +
 port is typically hashed to pick one of the queues the NIC presents. I
 might have the contents of the tuple slightly wrong, but it always includes
 a component you don't get to control.

 Since it's hashing how do you manipulate which queue packets for a TCP
 connection go to and how is it made worse by having an accept socket per
 shard?

 You also mention 160 ports as bad, but it doesn't sound like a big number
 resource wise. Is it an operational headache?

 RE tokens distributed amongst shards. The way that would work right now is
 that each port number appears to be a discrete instance of the server. So
 you could have shards be actual shards that are simply colocated on the
 same box, run in the same process, and share resources. I know this pushes
 more of the complexity into the server vs the driver as the server expects
 all shards to share some client visible like system tables and certain
 identifiers.

 Ariel
> On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> Port-per-shard is likely the easiest option but it's too ugly to
> contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
> IIRC), it will be just horrible to have 160 open ports.
>
>
> It also doesn't fit will with the NICs ability to automatically
> distribute packets among cores using multiple queues, so the kernel
> would have to shuffle those packets around. Much better to have those
> packets delivered directly to the core that will service them.
>
>
> (also, some protocol changes are needed so the driver knows how tokens
> are distributed among shards)
>
>> On 2018-04-19 19:46, Ben Bromhead wrote:
>> 

Re: Evolving the client protocol

2018-04-22 Thread Ariel Weisberg
Hi,

> This doesn't work without additional changes, for RF>1. The token ring could 
> place two replicas of the same token range on the same physical server, even 
> though those are two separate cores of the same server. You could add another 
> element to the hierarchy (cluster -> datacenter -> rack -> node -> 
> core/shard), but that generates unneeded range movements when a node is added.

I have seen rack awareness used/abused to solve this.

Regards,
Ariel

> On Apr 22, 2018, at 8:26 AM, Avi Kivity  wrote:
> 
> 
> 
>> On 2018-04-19 21:15, Ben Bromhead wrote:
>> Re #3:
>> 
>> Yup I was thinking each shard/port would appear as a discrete server to the
>> client.
> 
> This doesn't work without additional changes, for RF>1. The token ring could 
> place two replicas of the same token range on the same physical server, even 
> though those are two separate cores of the same server. You could add another 
> element to the hierarchy (cluster -> datacenter -> rack -> node -> 
> core/shard), but that generates unneeded range movements when a node is added.
> 
>> If the per port suggestion is unacceptable due to hardware requirements,
>> remembering that Cassandra is built with the concept scaling *commodity*
>> hardware horizontally, you'll have to spend your time and energy convincing
>> the community to support a protocol feature it has no (current) use for or
>> find another interim solution.
> 
> Those servers are commodity servers (not x86, but still commodity). In any 
> case 60+ logical cores are common now (hello AWS i3.16xlarge or even 
> i3.metal), and we can only expect logical core count to continue to increase 
> (there are 48-core ARM processors now).
> 
>> 
>> Another way, would be to build support and consensus around a clear
>> technical need in the Apache Cassandra project as it stands today.
>> 
>> One way to build community support might be to contribute an Apache
>> licensed thread per core implementation in Java that matches the protocol
>> change and shard concept you are looking for ;P
> 
> I doubt I'll survive the egregious top-posting that is going on in this list.
> 
>> 
>> 
>>> On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:
>>> 
>>> Hi,
>>> 
>>> So at technical level I don't understand this yet.
>>> 
>>> So you have a database consisting of single threaded shards and a socket
>>> for accept that is generating TCP connections and in advance you don't know
>>> which connection is going to send messages to which shard.
>>> 
>>> What is the mechanism by which you get the packets for a given TCP
>>> connection delivered to a specific core? I know that a given TCP connection
>>> will normally have all of its packets delivered to the same queue from the
>>> NIC because the tuple of source address + port and destination address +
>>> port is typically hashed to pick one of the queues the NIC presents. I
>>> might have the contents of the tuple slightly wrong, but it always includes
>>> a component you don't get to control.
>>> 
>>> Since it's hashing how do you manipulate which queue packets for a TCP
>>> connection go to and how is it made worse by having an accept socket per
>>> shard?
>>> 
>>> You also mention 160 ports as bad, but it doesn't sound like a big number
>>> resource wise. Is it an operational headache?
>>> 
>>> RE tokens distributed amongst shards. The way that would work right now is
>>> that each port number appears to be a discrete instance of the server. So
>>> you could have shards be actual shards that are simply colocated on the
>>> same box, run in the same process, and share resources. I know this pushes
>>> more of the complexity into the server vs the driver as the server expects
>>> all shards to share some client visible like system tables and certain
>>> identifiers.
>>> 
>>> Ariel
 On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
 Port-per-shard is likely the easiest option but it's too ugly to
 contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
 IIRC), it will be just horrible to have 160 open ports.
 
 
 It also doesn't fit will with the NICs ability to automatically
 distribute packets among cores using multiple queues, so the kernel
 would have to shuffle those packets around. Much better to have those
 packets delivered directly to the core that will service them.
 
 
 (also, some protocol changes are needed so the driver knows how tokens
 are distributed among shards)
 
> On 2018-04-19 19:46, Ben Bromhead wrote:
> WRT to #3
> To fit in the existing protocol, could you have each shard listen on a
> different port? Drivers are likely going to support this due to
> https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
> familiar with the ticket so their might be something I'm missing but it
> sounds like a potential 

Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-19 21:15, Ben Bromhead wrote:

Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.


This doesn't work without additional changes, for RF>1. The token ring 
could place two replicas of the same token range on the same physical 
server, even though those are two separate cores of the same server. You 
could add another element to the hierarchy (cluster -> datacenter -> 
rack -> node -> core/shard), but that generates unneeded range movements 
when a node is added.



If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.


Those servers are commodity servers (not x86, but still commodity). In 
any case 60+ logical cores are common now (hello AWS i3.16xlarge or even 
i3.metal), and we can only expect logical core count to continue to 
increase (there are 48-core ARM processors now).




Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P


I doubt I'll survive the egregious top-posting that is going on in this 
list.





On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:


Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket
for accept that is generating TCP connections and in advance you don't know
which connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP
connection delivered to a specific core? I know that a given TCP connection
will normally have all of its packets delivered to the same queue from the
NIC because the tuple of source address + port and destination address +
port is typically hashed to pick one of the queues the NIC presents. I
might have the contents of the tuple slightly wrong, but it always includes
a component you don't get to control.

Since it's hashing how do you manipulate which queue packets for a TCP
connection go to and how is it made worse by having an accept socket per
shard?

You also mention 160 ports as bad, but it doesn't sound like a big number
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is
that each port number appears to be a discrete instance of the server. So
you could have shards be actual shards that are simply colocated on the
same box, run in the same process, and share resources. I know this pushes
more of the complexity into the server vs the driver as the server expects
all shards to share some client visible like system tables and certain
identifiers.

Ariel
On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:

Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit will with the NICs ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)

On 2018-04-19 19:46, Ben Bromhead wrote:

WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so their might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg 

wrote:

Hi,

I think that updating the protocol spec to Cassandra puts the onus on

the

party changing the protocol specification to have an implementation

of the

spec in Cassandra as well as the Java and Python driver (those are

both

used in the Cassandra repo). Until it's implemented in Cassandra we

haven't

fully evaluated the specification change. There is no substitute for

trying

to make it work.

There are also realities to consider as to what the maintainers of the
drivers are willing to commit.

RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range

scans.

In JIRA Jeremiah made the point that you can still do this from the

client

by breaking up the token ranges, but it's a leaky 

Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity
You're right in principle, but in practice we haven't seen problems with 
the term.



On 2018-04-19 20:31, Michael Shuler wrote:

This is purely my own opinion, but I find the use of the term 'shard'
quite unfortunate in the context of a distributed database. The
historical usage of the term has been the notion of data partitions that
reside on separate database servers. There is a learning curve with
distributed databases, and I can foresee the use of the term adding
additional confusion for new users. Not a fan.




-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-20 12:03, Sylvain Lebresne wrote:


Those were just given as examples. Each would be discussed on its own,
assuming we are able to find a way to cooperate.


These are relatively simple and it wouldn't be hard for use to patch
Cassandra. But I want to find a way to make more complicated protocol
changes where it wouldn't be realistic for us to modify Cassandra.


That's where I'm confused with what you are truly asking.

The native protocol is the protocol of the Apache Cassandra project and was
never meant to be a standard protocol. If the ask is to move towards more
of handling the protocol as a standard that would evolve independently of
whether Cassandra implements it (would the project commit to implement it
eventually?), then let's be clear on what the concrete suggestion is and
have this discussion (but to be upfront, the short version of my personal
opinion is that this would likely be a big distraction with relatively low
merits for the project, so I'm very unconvinced).


I proposed several ways to cooperate. Yes, my "mode 1" essentially makes 
the protocol a standard.


For better or for worse, there are now at least 4 server-side 
implementations of the protocol, 5 if you count dse as a separate 
implementation. So it is de-facto a standard.




But if that's not the ask, what is it exactly? That we agree to commit
changes
to the protocol spec before we have actually implemented them? If so, I just
don't get it. The downsides are clear (we risk the feature is either never
implemeted due to lack of contributions/loss of interest, or that the
protocol
changes committed are not fully suitable to the final implementation) but
what
benefit to the project can that ever have?


If another implementation defines a protocol change, and drivers are 
patched to implement that change, then when Cassandra implements that 
change it gets those driver changes for free. Provided of course that 
the protocol change has a technical match with the implementation.




Don't get me wrong, protocol-impacting changes/additions are very much
welcome
if reasonable for Cassandra, and both CASSANDRA-14311 and CASSANDRA-2848 are
certainly worthy. Both the definition of done of those ticket certainly
include the server implementation imo, not just changing the protocol spec
file. As for the shard notion, it makes no sense for Cassandra at this point
in time, so unless an additional contribution makes it so that it start to
make
sense, I'm not sure why we'd add anything related to it to the protocol.

--
Sylvain




RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves

in by defining a spec we haven't implemented, tested, and decided we are
satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
what if Cassandra decides to go a different direction in some way?

Such a proposal would include negotiation about the sharding algorithm
used to prevent Cassandra being boxed in. Of course it's impossible to
guarantee that a new idea won't come up that requires more changes.


I don't think there is much discussion to be had without an example of

the the changes to the CQL specification to look at, but even then if it
looks risky I am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:

On 2018/04/19 07:19:27, kurt greaves  wrote:

1. The protocol change is developed using the Cassandra process in
 a JIRA ticket, culminating in a patch to
 doc/native_protocol*.spec when consensus is achieved.

I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of how they may look like.


Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority. ​


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
additional commands, e-mail: dev-h...@cassandra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org






Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-19 20:43, Ariel Weisberg wrote:

Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket for 
accept that is generating TCP connections and in advance you don't know which 
connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP connection 
delivered to a specific core? I know that a given TCP connection will normally 
have all of its packets delivered to the same queue from the NIC because the 
tuple of source address + port and destination address + port is typically 
hashed to pick one of the queues the NIC presents. I might have the contents of 
the tuple slightly wrong, but it always includes a component you don't get to 
control.


Right, that's how it's done. The component you typically don't get to 
control is the client-side local port, but you can bind to a local port 
if you want.



Since it's hashing how do you manipulate which queue packets for a TCP 
connection go to and how is it made worse by having an accept socket per shard?


It's not made worse, it's just not made better.

There are three ways at least to get multiqueue to work with 
thread-per-core without software movement of packets, none of them pretty:


1. The client tells the server which shard to connect to. The server 
uses "Flow Director" [1] or an equivalent to bypass the hash and bind 
the connection to a particular queue. This is problematic since you need 
to bypass the tcp stack, and since there are a limited number of entries 
in the flow director table.
2. The client asks the server which shard it happened to connect to. 
This requires the client to open many connections in order to reach all 
shards, and then close any excess connections (did I mention it wasn't 
pretty?).
3. The server communicates the hash function to the client, or perhaps 
suggests local ports for the client to use in order to reach a shard. 
This can be problematic if the server doesn't know the hash function 
(can happen in some virtualized environments, or with new NICs, or with 
limited knowledge of the hardware topology). See similar approach in [2].


[1] 
https://www.intel.com/content/dam/www/public/us/en/documents/white-papers/intel-ethernet-flow-director.pdf
[2] 
https://github.com/scylladb/seastar/blob/0b8b851b432a1d04522a80d9830e07449d71caa2/net/tcp.hh#L790




You also mention 160 ports as bad, but it doesn't sound like a big number 
resource wise. Is it an operational headache?


Port 9042 + N can easily conflict with another statically allocated port 
on the server. I guess you can listen on ephemeral ports, but then if 
you firewall them, you need to adjust the firewall rules.


In any case it doesn't solve the problem of directing a connection's 
packets to a specific queue.




RE tokens distributed amongst shards. The way that would work right now is that 
each port number appears to be a discrete instance of the server. So you could 
have shards be actual shards that are simply colocated on the same box, run in 
the same process, and share resources. I know this pushes more of the 
complexity into the server vs the driver as the server expects all shards to 
share some client visible like system tables and certain identifiers.


This has its own problems, I'll address them in the other sub-thread (or 
using our term, other continuation).




Ariel
On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:

Port-per-shard is likely the easiest option but it's too ugly to
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
IIRC), it will be just horrible to have 160 open ports.


It also doesn't fit will with the NICs ability to automatically
distribute packets among cores using multiple queues, so the kernel
would have to shuffle those packets around. Much better to have those
packets delivered directly to the core that will service them.


(also, some protocol changes are needed so the driver knows how tokens
are distributed among shards)

On 2018-04-19 19:46, Ben Bromhead wrote:

WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so their might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  wrote:


Hi,

I think that updating the protocol spec to Cassandra puts the onus on the
party changing the protocol specification to have an implementation of the
spec in Cassandra as well as the Java and Python driver (those are both
used in the Cassandra repo). Until it's implemented in Cassandra we haven't
fully evaluated the specification change. There is no substitute for trying
to 

Re: Evolving the client protocol

2018-04-22 Thread Avi Kivity



On 2018-04-19 20:33, Ariel Weisberg wrote:

Hi,


That basically means a fork in the protocol (perhaps a temporary fork if
we go for mode 2 where Cassandra retroactively adopts our protocol
changes, if they fit will).

Implementing a protocol change may be easy for some simple changes, but
in the general case, it is not realistic to expect it.
Can you elaborate? No one is forcing driver maintainers to update their
drivers to support new features, either for Cassandra or Scylla, but
there should be no reason for them to reject a contribution adding that
support.

I think it's unrealistic to expect the next version of the protocol spec to 
include functionality that is not supported by either  the server or drivers 
once a version of the server or driver supporting that protocol version is  
released. Putting something in the spec is making a hard commitment for the 
driver and server without also specifying who will do the work.

So yes a temporary fork is fine, but then you run into things like "we" don't 
like the spec change and find we want to change it again. For us it's fine because we 
never committed to supporting the fork either way. For the driver maintainers it's fine 
because they probably never accepted the spec change either and didn't update the 
drivers. This is because the maintainers aren't going to accept changes that are 
incompatible with what the Cassandra server implements.

So if you have a temporary fork of the spec you might also be committing to a 
temporary fork of the drivers as well as the headaches that come with the final 
version of the spec not matching your fork. We would do what we can to avoid 
that by having the conversation around the protocol design up front.

What I am largely getting at is that I think Apache Cassandra and its drivers 
can only truly commit to a spec where there is a released implementation in the 
server and drivers.


The drivers are not part of Cassandra, so what "the server" is for 
drivers is up to their maintainer.



  Up until that point the spec is subject to change. We are less likely to 
change it if there is an implementation because we have already done the work 
and dug up most of the issues.

For sharding this is thorny and I think Ben makes a really good suggestion RE 
leveraging CASSANDRA-7544.  For paging state and timeouts I think it's likely 
we could stick to what we work out spec wise and we are happy to have the 
discussion and learn from ScyllaDB de-risking protocol changes, but if no one 
commits to doing the work you might find we release the next protocol version 
without the tentative spec changes.


So I think my proposed mode 1 (where the protocol, but not the server) 
is updated in cassandra.git is rejected. Let's discuss the two remaining 
options:


mode 2: cassandra.git reserves the prefix "SCYLLA" for the 
OPTIONS/SUPPORTED message, and, when it comes to implement a protocol 
extensions it will consider Scylla extensions and incorporate them into 
cassandra.git if they are found to be technically acceptable (but may of 
course extend the protocol in a different way if there is a technical 
reason)


mode 3: cassandra.git ignores Scylla


For Cassandra, the advantage of mode 2 is that if driver maintainers add 
support for the change (on their own or by merging changes authored by 
Scylla developers), then Cassandra developers get driver support with 
less effort.




Ariel
On Thu, Apr 19, 2018, at 12:53 PM, Avi Kivity wrote:


On 2018-04-19 19:10, Ariel Weisberg wrote:

Hi,

I think that updating the protocol spec to Cassandra puts the onus on the party 
changing the protocol specification to have an implementation of the spec in 
Cassandra as well as the Java and Python driver (those are both used in the 
Cassandra repo). Until it's implemented in Cassandra we haven't fully evaluated 
the specification change. There is no substitute for trying to make it work.

That basically means a fork in the protocol (perhaps a temporary fork if
we go for mode 2 where Cassandra retroactively adopts our protocol
changes, if they fit will).

Implementing a protocol change may be easy for some simple changes, but
in the general case, it is not realistic to expect it.


There are also realities to consider as to what the maintainers of the drivers 
are willing to commit.

Can you elaborate? No one is forcing driver maintainers to update their
drivers to support new features, either for Cassandra or Scylla, but
there should be no reason for them to reject a contribution adding that
support.

If you refer to a potential politically-motivated rejection by the
DataStax-maintained drivers, then those drivers should and will be
forked. That's not true open source. However, I'm not assuming that will
happen.


RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client by 
breaking up the token ranges, but it's a leaky abstraction to 

Re: Evolving the client protocol

2018-04-20 Thread Jeremiah D Jordan
The protocol does already support optional/custom payloads to do such things.  
IIRC the zipkin tracing implementation 
https://github.com/thelastpickle/cassandra-zipkin-tracing for example uses this 
to pass the zipkin id to the server.

> On Apr 20, 2018, at 1:02 PM, Max C.  wrote:
> 
> For things like #3, would it be a better idea to propose a generic 
> enhancement for “optional vendor extensions” to the protocol?  These 
> extensions would be negotiated during connection formation and then the 
> driver could (optionally) implement these additional features.  These 
> extensions would be documented separately by the vendor, and the driver’s 
> default behavior would be to ignore any extensions it doesn’t understand.
> 
> With that sort of feature, the Scylla folks (CosmoDB too??) could add 
> extensions to the protocol without forking the protocol spec, (potentially) 
> without forking the drivers, and without laying down a C* roadmap that the C* 
> project hasn’t agreed to.  Someday down the line, if C* implements a given 
> capability, then the corresponding “vendor extension” could be incorporated 
> into the main protocol spec… or not.
> 
> Lots and lots of protocols implement this type of technique — SMTP, IMAP, 
> PNG, Sieve, DHCP, etc.   Maybe this a better way to go?
> 
> - Max
> 
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
> 


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-20 Thread Max C.
For things like #3, would it be a better idea to propose a generic enhancement 
for “optional vendor extensions” to the protocol?  These extensions would be 
negotiated during connection formation and then the driver could (optionally) 
implement these additional features.  These extensions would be documented 
separately by the vendor, and the driver’s default behavior would be to ignore 
any extensions it doesn’t understand.

With that sort of feature, the Scylla folks (CosmoDB too??) could add 
extensions to the protocol without forking the protocol spec, (potentially) 
without forking the drivers, and without laying down a C* roadmap that the C* 
project hasn’t agreed to.  Someday down the line, if C* implements a given 
capability, then the corresponding “vendor extension” could be incorporated 
into the main protocol spec… or not.

Lots and lots of protocols implement this type of technique — SMTP, IMAP, PNG, 
Sieve, DHCP, etc.   Maybe this a better way to go?

- Max

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-20 Thread Sylvain Lebresne
>
>
> Those were just given as examples. Each would be discussed on its own,
> assuming we are able to find a way to cooperate.
>
>
> These are relatively simple and it wouldn't be hard for use to patch
> Cassandra. But I want to find a way to make more complicated protocol
> changes where it wouldn't be realistic for us to modify Cassandra.
>

That's where I'm confused with what you are truly asking.

The native protocol is the protocol of the Apache Cassandra project and was
never meant to be a standard protocol. If the ask is to move towards more
of handling the protocol as a standard that would evolve independently of
whether Cassandra implements it (would the project commit to implement it
eventually?), then let's be clear on what the concrete suggestion is and
have this discussion (but to be upfront, the short version of my personal
opinion is that this would likely be a big distraction with relatively low
merits for the project, so I'm very unconvinced).

But if that's not the ask, what is it exactly? That we agree to commit
changes
to the protocol spec before we have actually implemented them? If so, I just
don't get it. The downsides are clear (we risk the feature is either never
implemeted due to lack of contributions/loss of interest, or that the
protocol
changes committed are not fully suitable to the final implementation) but
what
benefit to the project can that ever have?

Don't get me wrong, protocol-impacting changes/additions are very much
welcome
if reasonable for Cassandra, and both CASSANDRA-14311 and CASSANDRA-2848 are
certainly worthy. Both the definition of done of those ticket certainly
include the server implementation imo, not just changing the protocol spec
file. As for the shard notion, it makes no sense for Cassandra at this point
in time, so unless an additional contribution makes it so that it start to
make
sense, I'm not sure why we'd add anything related to it to the protocol.

--
Sylvain



>
> > RE #3,
> >
> > It's hard to be +1 on this because we don't benefit by boxing ourselves
> in by defining a spec we haven't implemented, tested, and decided we are
> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
> what if Cassandra decides to go a different direction in some way?
>
> Such a proposal would include negotiation about the sharding algorithm
> used to prevent Cassandra being boxed in. Of course it's impossible to
> guarantee that a new idea won't come up that requires more changes.
>
> > I don't think there is much discussion to be had without an example of
> the the changes to the CQL specification to look at, but even then if it
> looks risky I am not likely to be in favor of it.
> >
> > Regards,
> > Ariel
> >
> > On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
> >>
> >> On 2018/04/19 07:19:27, kurt greaves  wrote:
>  1. The protocol change is developed using the Cassandra process in
>  a JIRA ticket, culminating in a patch to
>  doc/native_protocol*.spec when consensus is achieved.
> >>> I don't think forking would be desirable (for anyone) so this seems
> >>> the most reasonable to me. For 1 and 2 it certainly makes sense but
> >>> can't say I know enough about sharding to comment on 3 - seems to me
> >>> like it could be locking in a design before anyone truly knows what
> >>> sharding in C* looks like. But hopefully I'm wrong and there are
> >>> devs out there that have already thought that through.
> >> Thanks. That is our view and is great to hear.
> >>
> >> About our proposal number 3: In my view, good protocol designs are
> >> future proof and flexible. We certainly don't want to propose a design
> >> that works just for Scylla, but would support reasonable
> >> implementations regardless of how they may look like.
> >>
> >>> Do we have driver authors who wish to support both projects?
> >>>
> >>> Surely, but I imagine it would be a minority. ​
> >>>
> >> -
> >> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
> >> additional commands, e-mail: dev-h...@cassandra.apache.org
> >>
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Re: Evolving the client protocol

2018-04-19 Thread Rahul Singh
Sounds interesting. Could 80% of what we gain with a “shard” approach be 
achieved via Mesos to wrap a stateful service? Technically it’s “Sharding” the 
whole machine and better utilizing resources.

--
Rahul Singh
rahul.si...@anant.us

Anant Corporation

On Apr 19, 2018, 1:23 PM -0500, sankalp kohli , wrote:
> If you donate Thread per core to C*, I am sure someone can help you review
> it and get it committed.
>
> On Thu, Apr 19, 2018 at 11:15 AM, Ben Bromhead  wrote:
>
> > Re #3:
> >
> > Yup I was thinking each shard/port would appear as a discrete server to the
> > client.
> >
> > If the per port suggestion is unacceptable due to hardware requirements,
> > remembering that Cassandra is built with the concept scaling *commodity*
> > hardware horizontally, you'll have to spend your time and energy convincing
> > the community to support a protocol feature it has no (current) use for or
> > find another interim solution.
> >
> > Another way, would be to build support and consensus around a clear
> > technical need in the Apache Cassandra project as it stands today.
> >
> > One way to build community support might be to contribute an Apache
> > licensed thread per core implementation in Java that matches the protocol
> > change and shard concept you are looking for ;P
> >
> >
> > On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:
> >
> > > Hi,
> > >
> > > So at technical level I don't understand this yet.
> > >
> > > So you have a database consisting of single threaded shards and a socket
> > > for accept that is generating TCP connections and in advance you don't
> > know
> > > which connection is going to send messages to which shard.
> > >
> > > What is the mechanism by which you get the packets for a given TCP
> > > connection delivered to a specific core? I know that a given TCP
> > connection
> > > will normally have all of its packets delivered to the same queue from
> > the
> > > NIC because the tuple of source address + port and destination address +
> > > port is typically hashed to pick one of the queues the NIC presents. I
> > > might have the contents of the tuple slightly wrong, but it always
> > includes
> > > a component you don't get to control.
> > >
> > > Since it's hashing how do you manipulate which queue packets for a TCP
> > > connection go to and how is it made worse by having an accept socket per
> > > shard?
> > >
> > > You also mention 160 ports as bad, but it doesn't sound like a big number
> > > resource wise. Is it an operational headache?
> > >
> > > RE tokens distributed amongst shards. The way that would work right now
> > is
> > > that each port number appears to be a discrete instance of the server. So
> > > you could have shards be actual shards that are simply colocated on the
> > > same box, run in the same process, and share resources. I know this
> > pushes
> > > more of the complexity into the server vs the driver as the server
> > expects
> > > all shards to share some client visible like system tables and certain
> > > identifiers.
> > >
> > > Ariel
> > > On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> > > > Port-per-shard is likely the easiest option but it's too ugly to
> > > > contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
> > > > IIRC), it will be just horrible to have 160 open ports.
> > > >
> > > >
> > > > It also doesn't fit will with the NICs ability to automatically
> > > > distribute packets among cores using multiple queues, so the kernel
> > > > would have to shuffle those packets around. Much better to have those
> > > > packets delivered directly to the core that will service them.
> > > >
> > > >
> > > > (also, some protocol changes are needed so the driver knows how tokens
> > > > are distributed among shards)
> > > >
> > > > On 2018-04-19 19:46, Ben Bromhead wrote:
> > > > > WRT to #3
> > > > > To fit in the existing protocol, could you have each shard listen on
> > a
> > > > > different port? Drivers are likely going to support this due to
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> > > > > https://issues.apache.org/jira/browse/CASSANDRA-11596). I'm not
> > super
> > > > > familiar with the ticket so their might be something I'm missing but
> > it
> > > > > sounds like a potential approach.
> > > > >
> > > > > This would give you a path forward at least for the short term.
> > > > >
> > > > >
> > > > > On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  > > wrote:
> > > > >
> > > > > > Hi,
> > > > > >
> > > > > > I think that updating the protocol spec to Cassandra puts the onus
> > on
> > > the
> > > > > > party changing the protocol specification to have an implementation
> > > of the
> > > > > > spec in Cassandra as well as the Java and Python driver (those are
> > > both
> > > > > > used in the Cassandra repo). Until it's implemented in Cassandra we
> > > haven't
> > > > > > fully evaluated the 

Re: Evolving the client protocol

2018-04-19 Thread sankalp kohli
If you donate Thread per core to C*, I am sure someone can help you review
it and get it committed.

On Thu, Apr 19, 2018 at 11:15 AM, Ben Bromhead  wrote:

> Re #3:
>
> Yup I was thinking each shard/port would appear as a discrete server to the
> client.
>
> If the per port suggestion is unacceptable due to hardware requirements,
> remembering that Cassandra is built with the concept scaling *commodity*
> hardware horizontally, you'll have to spend your time and energy convincing
> the community to support a protocol feature it has no (current) use for or
> find another interim solution.
>
> Another way, would be to build support and consensus around a clear
> technical need in the Apache Cassandra project as it stands today.
>
> One way to build community support might be to contribute an Apache
> licensed thread per core implementation in Java that matches the protocol
> change and shard concept you are looking for ;P
>
>
> On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:
>
> > Hi,
> >
> > So at technical level I don't understand this yet.
> >
> > So you have a database consisting of single threaded shards and a socket
> > for accept that is generating TCP connections and in advance you don't
> know
> > which connection is going to send messages to which shard.
> >
> > What is the mechanism by which you get the packets for a given TCP
> > connection delivered to a specific core? I know that a given TCP
> connection
> > will normally have all of its packets delivered to the same queue from
> the
> > NIC because the tuple of source address + port and destination address +
> > port is typically hashed to pick one of the queues the NIC presents. I
> > might have the contents of the tuple slightly wrong, but it always
> includes
> > a component you don't get to control.
> >
> > Since it's hashing how do you manipulate which queue packets for a TCP
> > connection go to and how is it made worse by having an accept socket per
> > shard?
> >
> > You also mention 160 ports as bad, but it doesn't sound like a big number
> > resource wise. Is it an operational headache?
> >
> > RE tokens distributed amongst shards. The way that would work right now
> is
> > that each port number appears to be a discrete instance of the server. So
> > you could have shards be actual shards that are simply colocated on the
> > same box, run in the same process, and share resources. I know this
> pushes
> > more of the complexity into the server vs the driver as the server
> expects
> > all shards to share some client visible like system tables and certain
> > identifiers.
> >
> > Ariel
> > On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> > > Port-per-shard is likely the easiest option but it's too ugly to
> > > contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
> > > IIRC), it will be just horrible to have 160 open ports.
> > >
> > >
> > > It also doesn't fit will with the NICs ability to automatically
> > > distribute packets among cores using multiple queues, so the kernel
> > > would have to shuffle those packets around. Much better to have those
> > > packets delivered directly to the core that will service them.
> > >
> > >
> > > (also, some protocol changes are needed so the driver knows how tokens
> > > are distributed among shards)
> > >
> > > On 2018-04-19 19:46, Ben Bromhead wrote:
> > > > WRT to #3
> > > > To fit in the existing protocol, could you have each shard listen on
> a
> > > > different port? Drivers are likely going to support this due to
> > > > https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> > > > https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not
> super
> > > > familiar with the ticket so their might be something I'm missing but
> it
> > > > sounds like a potential approach.
> > > >
> > > > This would give you a path forward at least for the short term.
> > > >
> > > >
> > > > On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg 
> > wrote:
> > > >
> > > >> Hi,
> > > >>
> > > >> I think that updating the protocol spec to Cassandra puts the onus
> on
> > the
> > > >> party changing the protocol specification to have an implementation
> > of the
> > > >> spec in Cassandra as well as the Java and Python driver (those are
> > both
> > > >> used in the Cassandra repo). Until it's implemented in Cassandra we
> > haven't
> > > >> fully evaluated the specification change. There is no substitute for
> > trying
> > > >> to make it work.
> > > >>
> > > >> There are also realities to consider as to what the maintainers of
> the
> > > >> drivers are willing to commit.
> > > >>
> > > >> RE #1,
> > > >>
> > > >> I am +1 on the fact that we shouldn't require an extra hop for range
> > scans.
> > > >>
> > > >> In JIRA Jeremiah made the point that you can still do this from the
> > client
> > > >> by breaking up the token ranges, but it's a leaky abstraction to
> have
> > a
> > > >> paging interface that isn't a vanilla 

Re: Evolving the client protocol

2018-04-19 Thread Ben Bromhead
Re #3:

Yup I was thinking each shard/port would appear as a discrete server to the
client.

If the per port suggestion is unacceptable due to hardware requirements,
remembering that Cassandra is built with the concept scaling *commodity*
hardware horizontally, you'll have to spend your time and energy convincing
the community to support a protocol feature it has no (current) use for or
find another interim solution.

Another way, would be to build support and consensus around a clear
technical need in the Apache Cassandra project as it stands today.

One way to build community support might be to contribute an Apache
licensed thread per core implementation in Java that matches the protocol
change and shard concept you are looking for ;P


On Thu, Apr 19, 2018 at 1:43 PM Ariel Weisberg  wrote:

> Hi,
>
> So at technical level I don't understand this yet.
>
> So you have a database consisting of single threaded shards and a socket
> for accept that is generating TCP connections and in advance you don't know
> which connection is going to send messages to which shard.
>
> What is the mechanism by which you get the packets for a given TCP
> connection delivered to a specific core? I know that a given TCP connection
> will normally have all of its packets delivered to the same queue from the
> NIC because the tuple of source address + port and destination address +
> port is typically hashed to pick one of the queues the NIC presents. I
> might have the contents of the tuple slightly wrong, but it always includes
> a component you don't get to control.
>
> Since it's hashing how do you manipulate which queue packets for a TCP
> connection go to and how is it made worse by having an accept socket per
> shard?
>
> You also mention 160 ports as bad, but it doesn't sound like a big number
> resource wise. Is it an operational headache?
>
> RE tokens distributed amongst shards. The way that would work right now is
> that each port number appears to be a discrete instance of the server. So
> you could have shards be actual shards that are simply colocated on the
> same box, run in the same process, and share resources. I know this pushes
> more of the complexity into the server vs the driver as the server expects
> all shards to share some client visible like system tables and certain
> identifiers.
>
> Ariel
> On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> > Port-per-shard is likely the easiest option but it's too ugly to
> > contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t
> > IIRC), it will be just horrible to have 160 open ports.
> >
> >
> > It also doesn't fit will with the NICs ability to automatically
> > distribute packets among cores using multiple queues, so the kernel
> > would have to shuffle those packets around. Much better to have those
> > packets delivered directly to the core that will service them.
> >
> >
> > (also, some protocol changes are needed so the driver knows how tokens
> > are distributed among shards)
> >
> > On 2018-04-19 19:46, Ben Bromhead wrote:
> > > WRT to #3
> > > To fit in the existing protocol, could you have each shard listen on a
> > > different port? Drivers are likely going to support this due to
> > > https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> > > https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
> > > familiar with the ticket so their might be something I'm missing but it
> > > sounds like a potential approach.
> > >
> > > This would give you a path forward at least for the short term.
> > >
> > >
> > > On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg 
> wrote:
> > >
> > >> Hi,
> > >>
> > >> I think that updating the protocol spec to Cassandra puts the onus on
> the
> > >> party changing the protocol specification to have an implementation
> of the
> > >> spec in Cassandra as well as the Java and Python driver (those are
> both
> > >> used in the Cassandra repo). Until it's implemented in Cassandra we
> haven't
> > >> fully evaluated the specification change. There is no substitute for
> trying
> > >> to make it work.
> > >>
> > >> There are also realities to consider as to what the maintainers of the
> > >> drivers are willing to commit.
> > >>
> > >> RE #1,
> > >>
> > >> I am +1 on the fact that we shouldn't require an extra hop for range
> scans.
> > >>
> > >> In JIRA Jeremiah made the point that you can still do this from the
> client
> > >> by breaking up the token ranges, but it's a leaky abstraction to have
> a
> > >> paging interface that isn't a vanilla ResultSet interface. Serial vs.
> > >> parallel is kind of orthogonal as the driver can do either.
> > >>
> > >> I agree it looks like the current specification doesn't make what
> should
> > >> be simple as simple as it could be for driver implementers.
> > >>
> > >> RE #2,
> > >>
> > >> +1 on this change assuming an implementation in Cassandra and the
> Java and
> > >> Python drivers.
> > >>
> > >> RE #3,
> > >>

Re: Evolving the client protocol

2018-04-19 Thread Ariel Weisberg
Hi,

So at technical level I don't understand this yet.

So you have a database consisting of single threaded shards and a socket for 
accept that is generating TCP connections and in advance you don't know which 
connection is going to send messages to which shard.

What is the mechanism by which you get the packets for a given TCP connection 
delivered to a specific core? I know that a given TCP connection will normally 
have all of its packets delivered to the same queue from the NIC because the 
tuple of source address + port and destination address + port is typically 
hashed to pick one of the queues the NIC presents. I might have the contents of 
the tuple slightly wrong, but it always includes a component you don't get to 
control.

Since it's hashing how do you manipulate which queue packets for a TCP 
connection go to and how is it made worse by having an accept socket per shard? 

You also mention 160 ports as bad, but it doesn't sound like a big number 
resource wise. Is it an operational headache?

RE tokens distributed amongst shards. The way that would work right now is that 
each port number appears to be a discrete instance of the server. So you could 
have shards be actual shards that are simply colocated on the same box, run in 
the same process, and share resources. I know this pushes more of the 
complexity into the server vs the driver as the server expects all shards to 
share some client visible like system tables and certain identifiers.

Ariel
On Thu, Apr 19, 2018, at 12:59 PM, Avi Kivity wrote:
> Port-per-shard is likely the easiest option but it's too ugly to 
> contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t 
> IIRC), it will be just horrible to have 160 open ports.
> 
> 
> It also doesn't fit will with the NICs ability to automatically 
> distribute packets among cores using multiple queues, so the kernel 
> would have to shuffle those packets around. Much better to have those 
> packets delivered directly to the core that will service them.
> 
> 
> (also, some protocol changes are needed so the driver knows how tokens 
> are distributed among shards)
> 
> On 2018-04-19 19:46, Ben Bromhead wrote:
> > WRT to #3
> > To fit in the existing protocol, could you have each shard listen on a
> > different port? Drivers are likely going to support this due to
> > https://issues.apache.org/jira/browse/CASSANDRA-7544 (
> > https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
> > familiar with the ticket so their might be something I'm missing but it
> > sounds like a potential approach.
> >
> > This would give you a path forward at least for the short term.
> >
> >
> > On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  wrote:
> >
> >> Hi,
> >>
> >> I think that updating the protocol spec to Cassandra puts the onus on the
> >> party changing the protocol specification to have an implementation of the
> >> spec in Cassandra as well as the Java and Python driver (those are both
> >> used in the Cassandra repo). Until it's implemented in Cassandra we haven't
> >> fully evaluated the specification change. There is no substitute for trying
> >> to make it work.
> >>
> >> There are also realities to consider as to what the maintainers of the
> >> drivers are willing to commit.
> >>
> >> RE #1,
> >>
> >> I am +1 on the fact that we shouldn't require an extra hop for range scans.
> >>
> >> In JIRA Jeremiah made the point that you can still do this from the client
> >> by breaking up the token ranges, but it's a leaky abstraction to have a
> >> paging interface that isn't a vanilla ResultSet interface. Serial vs.
> >> parallel is kind of orthogonal as the driver can do either.
> >>
> >> I agree it looks like the current specification doesn't make what should
> >> be simple as simple as it could be for driver implementers.
> >>
> >> RE #2,
> >>
> >> +1 on this change assuming an implementation in Cassandra and the Java and
> >> Python drivers.
> >>
> >> RE #3,
> >>
> >> It's hard to be +1 on this because we don't benefit by boxing ourselves in
> >> by defining a spec we haven't implemented, tested, and decided we are
> >> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
> >> what if Cassandra decides to go a different direction in some way?
> >>
> >> I don't think there is much discussion to be had without an example of the
> >> the changes to the CQL specification to look at, but even then if it looks
> >> risky I am not likely to be in favor of it.
> >>
> >> Regards,
> >> Ariel
> >>
> >> On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
> >>>
> >>> On 2018/04/19 07:19:27, kurt greaves  wrote:
> > 1. The protocol change is developed using the Cassandra process in
> > a JIRA ticket, culminating in a patch to
> > doc/native_protocol*.spec when consensus is achieved.
>  I don't think forking would be desirable (for anyone) so this seems
>  the most 

Re: Evolving the client protocol

2018-04-19 Thread Ariel Weisberg
Hi,

>That basically means a fork in the protocol (perhaps a temporary fork if 
>we go for mode 2 where Cassandra retroactively adopts our protocol 
>changes, if they fit will).
>
>Implementing a protocol change may be easy for some simple changes, but 
>in the general case, it is not realistic to expect it.

> Can you elaborate? No one is forcing driver maintainers to update their 
> drivers to support new features, either for Cassandra or Scylla, but 
> there should be no reason for them to reject a contribution adding that 
> support.

I think it's unrealistic to expect the next version of the protocol spec to 
include functionality that is not supported by either  the server or drivers 
once a version of the server or driver supporting that protocol version is  
released. Putting something in the spec is making a hard commitment for the 
driver and server without also specifying who will do the work.

So yes a temporary fork is fine, but then you run into things like "we" don't 
like the spec change and find we want to change it again. For us it's fine 
because we never committed to supporting the fork either way. For the driver 
maintainers it's fine because they probably never accepted the spec change 
either and didn't update the drivers. This is because the maintainers aren't 
going to accept changes that are incompatible with what the Cassandra server 
implements.

So if you have a temporary fork of the spec you might also be committing to a 
temporary fork of the drivers as well as the headaches that come with the final 
version of the spec not matching your fork. We would do what we can to avoid 
that by having the conversation around the protocol design up front.

What I am largely getting at is that I think Apache Cassandra and its drivers 
can only truly commit to a spec where there is a released implementation in the 
server and drivers. Up until that point the spec is subject to change. We are 
less likely to change it if there is an implementation because we have already 
done the work and dug up most of the issues.

For sharding this is thorny and I think Ben makes a really good suggestion RE 
leveraging CASSANDRA-7544.  For paging state and timeouts I think it's likely 
we could stick to what we work out spec wise and we are happy to have the 
discussion and learn from ScyllaDB de-risking protocol changes, but if no one 
commits to doing the work you might find we release the next protocol version 
without the tentative spec changes.

Ariel
On Thu, Apr 19, 2018, at 12:53 PM, Avi Kivity wrote:
> 
> 
> On 2018-04-19 19:10, Ariel Weisberg wrote:
> > Hi,
> >
> > I think that updating the protocol spec to Cassandra puts the onus on the 
> > party changing the protocol specification to have an implementation of the 
> > spec in Cassandra as well as the Java and Python driver (those are both 
> > used in the Cassandra repo). Until it's implemented in Cassandra we haven't 
> > fully evaluated the specification change. There is no substitute for trying 
> > to make it work.
> 
> That basically means a fork in the protocol (perhaps a temporary fork if 
> we go for mode 2 where Cassandra retroactively adopts our protocol 
> changes, if they fit will).
> 
> Implementing a protocol change may be easy for some simple changes, but 
> in the general case, it is not realistic to expect it.
> 
> > There are also realities to consider as to what the maintainers of the 
> > drivers are willing to commit.
> 
> Can you elaborate? No one is forcing driver maintainers to update their 
> drivers to support new features, either for Cassandra or Scylla, but 
> there should be no reason for them to reject a contribution adding that 
> support.
> 
> If you refer to a potential politically-motivated rejection by the 
> DataStax-maintained drivers, then those drivers should and will be 
> forked. That's not true open source. However, I'm not assuming that will 
> happen.
> 
> >
> > RE #1,
> >
> > I am +1 on the fact that we shouldn't require an extra hop for range scans.
> >
> > In JIRA Jeremiah made the point that you can still do this from the client 
> > by breaking up the token ranges, but it's a leaky abstraction to have a 
> > paging interface that isn't a vanilla ResultSet interface. Serial vs. 
> > parallel is kind of orthogonal as the driver can do either.
> >
> > I agree it looks like the current specification doesn't make what should be 
> > simple as simple as it could be for driver implementers.
> >
> > RE #2,
> >
> > +1 on this change assuming an implementation in Cassandra and the Java and 
> > Python drivers.
> 
> Those were just given as examples. Each would be discussed on its own, 
> assuming we are able to find a way to cooperate.
> 
> 
> These are relatively simple and it wouldn't be hard for use to patch 
> Cassandra. But I want to find a way to make more complicated protocol 
> changes where it wouldn't be realistic for us to modify Cassandra.
> 
> > RE #3,
> >
> > It's hard to be +1 on this 

Re: Evolving the client protocol

2018-04-19 Thread Michael Shuler
This is purely my own opinion, but I find the use of the term 'shard'
quite unfortunate in the context of a distributed database. The
historical usage of the term has been the notion of data partitions that
reside on separate database servers. There is a learning curve with
distributed databases, and I can foresee the use of the term adding
additional confusion for new users. Not a fan.

-- 
Kind regards,
Michael

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-19 Thread Avi Kivity



On 2018-04-19 10:19, kurt greaves wrote:

1. The protocol change is developed using the Cassandra process in a JIRA
ticket, culminating in a patch to doc/native_protocol*.spec when consensus
is achieved.

I don't think forking would be desirable (for anyone) so this seems the
most reasonable to me. For 1 and 2 it certainly makes sense but can't say I
know enough about sharding to comment on 3 - seems to me like it could be
locking in a design before anyone truly knows what sharding in C* looks
like. But hopefully I'm wrong and there are devs out there that have
already thought that through.


Too bad you missed your flight or you'd have seen my NGCC presentation 
about all the mistakes we made when developing the sharding algorithm.




Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority.
​



Why is that?

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-19 Thread Avi Kivity



On 2018-04-19 19:10, Ariel Weisberg wrote:

Hi,

I think that updating the protocol spec to Cassandra puts the onus on the party 
changing the protocol specification to have an implementation of the spec in 
Cassandra as well as the Java and Python driver (those are both used in the 
Cassandra repo). Until it's implemented in Cassandra we haven't fully evaluated 
the specification change. There is no substitute for trying to make it work.


That basically means a fork in the protocol (perhaps a temporary fork if 
we go for mode 2 where Cassandra retroactively adopts our protocol 
changes, if they fit will).


Implementing a protocol change may be easy for some simple changes, but 
in the general case, it is not realistic to expect it.



There are also realities to consider as to what the maintainers of the drivers 
are willing to commit.


Can you elaborate? No one is forcing driver maintainers to update their 
drivers to support new features, either for Cassandra or Scylla, but 
there should be no reason for them to reject a contribution adding that 
support.


If you refer to a potential politically-motivated rejection by the 
DataStax-maintained drivers, then those drivers should and will be 
forked. That's not true open source. However, I'm not assuming that will 
happen.




RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client by 
breaking up the token ranges, but it's a leaky abstraction to have a paging 
interface that isn't a vanilla ResultSet interface. Serial vs. parallel is kind 
of orthogonal as the driver can do either.

I agree it looks like the current specification doesn't make what should be 
simple as simple as it could be for driver implementers.

RE #2,

+1 on this change assuming an implementation in Cassandra and the Java and 
Python drivers.


Those were just given as examples. Each would be discussed on its own, 
assuming we are able to find a way to cooperate.



These are relatively simple and it wouldn't be hard for use to patch 
Cassandra. But I want to find a way to make more complicated protocol 
changes where it wouldn't be realistic for us to modify Cassandra.



RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in by 
defining a spec we haven't implemented, tested, and decided we are satisfied 
with. Having it in ScyllaDB de-risks it to a certain extent, but what if 
Cassandra decides to go a different direction in some way?


Such a proposal would include negotiation about the sharding algorithm 
used to prevent Cassandra being boxed in. Of course it's impossible to 
guarantee that a new idea won't come up that requires more changes.



I don't think there is much discussion to be had without an example of the the 
changes to the CQL specification to look at, but even then if it looks risky I 
am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:


On 2018/04/19 07:19:27, kurt greaves  wrote:

1. The protocol change is developed using the Cassandra process in
a JIRA ticket, culminating in a patch to
doc/native_protocol*.spec when consensus is achieved.

I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of how they may look like.


Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority. ​


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
additional commands, e-mail: dev-h...@cassandra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org




-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-19 Thread Avi Kivity
Port-per-shard is likely the easiest option but it's too ugly to 
contemplate. We run on machines with 160 shards (IBM POWER 2s20c160t 
IIRC), it will be just horrible to have 160 open ports.



It also doesn't fit will with the NICs ability to automatically 
distribute packets among cores using multiple queues, so the kernel 
would have to shuffle those packets around. Much better to have those 
packets delivered directly to the core that will service them.



(also, some protocol changes are needed so the driver knows how tokens 
are distributed among shards)


On 2018-04-19 19:46, Ben Bromhead wrote:

WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so their might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  wrote:


Hi,

I think that updating the protocol spec to Cassandra puts the onus on the
party changing the protocol specification to have an implementation of the
spec in Cassandra as well as the Java and Python driver (those are both
used in the Cassandra repo). Until it's implemented in Cassandra we haven't
fully evaluated the specification change. There is no substitute for trying
to make it work.

There are also realities to consider as to what the maintainers of the
drivers are willing to commit.

RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client
by breaking up the token ranges, but it's a leaky abstraction to have a
paging interface that isn't a vanilla ResultSet interface. Serial vs.
parallel is kind of orthogonal as the driver can do either.

I agree it looks like the current specification doesn't make what should
be simple as simple as it could be for driver implementers.

RE #2,

+1 on this change assuming an implementation in Cassandra and the Java and
Python drivers.

RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in
by defining a spec we haven't implemented, tested, and decided we are
satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
what if Cassandra decides to go a different direction in some way?

I don't think there is much discussion to be had without an example of the
the changes to the CQL specification to look at, but even then if it looks
risky I am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:


On 2018/04/19 07:19:27, kurt greaves  wrote:

1. The protocol change is developed using the Cassandra process in
a JIRA ticket, culminating in a patch to
doc/native_protocol*.spec when consensus is achieved.

I don't think forking would be desirable (for anyone) so this seems
the most reasonable to me. For 1 and 2 it certainly makes sense but
can't say I know enough about sharding to comment on 3 - seems to me
like it could be locking in a design before anyone truly knows what
sharding in C* looks like. But hopefully I'm wrong and there are
devs out there that have already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are
future proof and flexible. We certainly don't want to propose a design
that works just for Scylla, but would support reasonable
implementations regardless of how they may look like.


Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority. ​


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
additional commands, e-mail: dev-h...@cassandra.apache.org


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org

--

Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer




-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-19 Thread Ben Bromhead
WRT to #3
To fit in the existing protocol, could you have each shard listen on a
different port? Drivers are likely going to support this due to
https://issues.apache.org/jira/browse/CASSANDRA-7544 (
https://issues.apache.org/jira/browse/CASSANDRA-11596).  I'm not super
familiar with the ticket so their might be something I'm missing but it
sounds like a potential approach.

This would give you a path forward at least for the short term.


On Thu, Apr 19, 2018 at 12:10 PM Ariel Weisberg  wrote:

> Hi,
>
> I think that updating the protocol spec to Cassandra puts the onus on the
> party changing the protocol specification to have an implementation of the
> spec in Cassandra as well as the Java and Python driver (those are both
> used in the Cassandra repo). Until it's implemented in Cassandra we haven't
> fully evaluated the specification change. There is no substitute for trying
> to make it work.
>
> There are also realities to consider as to what the maintainers of the
> drivers are willing to commit.
>
> RE #1,
>
> I am +1 on the fact that we shouldn't require an extra hop for range scans.
>
> In JIRA Jeremiah made the point that you can still do this from the client
> by breaking up the token ranges, but it's a leaky abstraction to have a
> paging interface that isn't a vanilla ResultSet interface. Serial vs.
> parallel is kind of orthogonal as the driver can do either.
>
> I agree it looks like the current specification doesn't make what should
> be simple as simple as it could be for driver implementers.
>
> RE #2,
>
> +1 on this change assuming an implementation in Cassandra and the Java and
> Python drivers.
>
> RE #3,
>
> It's hard to be +1 on this because we don't benefit by boxing ourselves in
> by defining a spec we haven't implemented, tested, and decided we are
> satisfied with. Having it in ScyllaDB de-risks it to a certain extent, but
> what if Cassandra decides to go a different direction in some way?
>
> I don't think there is much discussion to be had without an example of the
> the changes to the CQL specification to look at, but even then if it looks
> risky I am not likely to be in favor of it.
>
> Regards,
> Ariel
>
> On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
> >
> >
> > On 2018/04/19 07:19:27, kurt greaves  wrote:
> > > >
> > > > 1. The protocol change is developed using the Cassandra process in
> > > >a JIRA ticket, culminating in a patch to
> > > >doc/native_protocol*.spec when consensus is achieved.
> > >
> > > I don't think forking would be desirable (for anyone) so this seems
> > > the most reasonable to me. For 1 and 2 it certainly makes sense but
> > > can't say I know enough about sharding to comment on 3 - seems to me
> > > like it could be locking in a design before anyone truly knows what
> > > sharding in C* looks like. But hopefully I'm wrong and there are
> > > devs out there that have already thought that through.
> >
> > Thanks. That is our view and is great to hear.
> >
> > About our proposal number 3: In my view, good protocol designs are
> > future proof and flexible. We certainly don't want to propose a design
> > that works just for Scylla, but would support reasonable
> > implementations regardless of how they may look like.
> >
> > >
> > > Do we have driver authors who wish to support both projects?
> > >
> > > Surely, but I imagine it would be a minority. ​
> > >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
> > additional commands, e-mail: dev-h...@cassandra.apache.org
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
> --
Ben Bromhead
CTO | Instaclustr 
+1 650 284 9692
Reliability at Scale
Cassandra, Spark, Elasticsearch on AWS, Azure, GCP and Softlayer


Re: Evolving the client protocol

2018-04-19 Thread Ariel Weisberg
Hi,

I think that updating the protocol spec to Cassandra puts the onus on the party 
changing the protocol specification to have an implementation of the spec in 
Cassandra as well as the Java and Python driver (those are both used in the 
Cassandra repo). Until it's implemented in Cassandra we haven't fully evaluated 
the specification change. There is no substitute for trying to make it work.

There are also realities to consider as to what the maintainers of the drivers 
are willing to commit.

RE #1,

I am +1 on the fact that we shouldn't require an extra hop for range scans.

In JIRA Jeremiah made the point that you can still do this from the client by 
breaking up the token ranges, but it's a leaky abstraction to have a paging 
interface that isn't a vanilla ResultSet interface. Serial vs. parallel is kind 
of orthogonal as the driver can do either.

I agree it looks like the current specification doesn't make what should be 
simple as simple as it could be for driver implementers.

RE #2,

+1 on this change assuming an implementation in Cassandra and the Java and 
Python drivers.

RE #3,

It's hard to be +1 on this because we don't benefit by boxing ourselves in by 
defining a spec we haven't implemented, tested, and decided we are satisfied 
with. Having it in ScyllaDB de-risks it to a certain extent, but what if 
Cassandra decides to go a different direction in some way?

I don't think there is much discussion to be had without an example of the the 
changes to the CQL specification to look at, but even then if it looks risky I 
am not likely to be in favor of it.

Regards,
Ariel

On Thu, Apr 19, 2018, at 9:33 AM, glom...@scylladb.com wrote:
>
>
> On 2018/04/19 07:19:27, kurt greaves  wrote:
> > >
> > > 1. The protocol change is developed using the Cassandra process in
> > >a JIRA ticket, culminating in a patch to
> > >doc/native_protocol*.spec when consensus is achieved.
> >
> > I don't think forking would be desirable (for anyone) so this seems
> > the most reasonable to me. For 1 and 2 it certainly makes sense but
> > can't say I know enough about sharding to comment on 3 - seems to me
> > like it could be locking in a design before anyone truly knows what
> > sharding in C* looks like. But hopefully I'm wrong and there are
> > devs out there that have already thought that through.
>
> Thanks. That is our view and is great to hear.
>
> About our proposal number 3: In my view, good protocol designs are
> future proof and flexible. We certainly don't want to propose a design
> that works just for Scylla, but would support reasonable
> implementations regardless of how they may look like.
>
> >
> > Do we have driver authors who wish to support both projects?
> >
> > Surely, but I imagine it would be a minority. ​
> >
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org For
> additional commands, e-mail: dev-h...@cassandra.apache.org
>

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-19 Thread glommer


On 2018/04/19 07:19:27, kurt greaves  wrote: 
> >
> > 1. The protocol change is developed using the Cassandra process in a JIRA
> > ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> > is achieved.
> 
> I don't think forking would be desirable (for anyone) so this seems the
> most reasonable to me. For 1 and 2 it certainly makes sense but can't say I
> know enough about sharding to comment on 3 - seems to me like it could be
> locking in a design before anyone truly knows what sharding in C* looks
> like. But hopefully I'm wrong and there are devs out there that have
> already thought that through.

Thanks. That is our view and is great to hear.

About our proposal number 3: In my view, good protocol designs are future proof 
and flexible. We certainly don't want to propose a design that works just for 
Scylla, but would support reasonable implementations regardless of how they may 
look like.

> 
> Do we have driver authors who wish to support both projects?
> 
> Surely, but I imagine it would be a minority.
> ​
> 

-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org



Re: Evolving the client protocol

2018-04-19 Thread kurt greaves
>
> 1. The protocol change is developed using the Cassandra process in a JIRA
> ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> is achieved.

I don't think forking would be desirable (for anyone) so this seems the
most reasonable to me. For 1 and 2 it certainly makes sense but can't say I
know enough about sharding to comment on 3 - seems to me like it could be
locking in a design before anyone truly knows what sharding in C* looks
like. But hopefully I'm wrong and there are devs out there that have
already thought that through.

Do we have driver authors who wish to support both projects?

Surely, but I imagine it would be a minority.
​


Re: Evolving the client protocol

2018-04-18 Thread Jonathan Haddad
Avi, if this is something that matters to you, you're more than welcome to
submit a patch to both this project and to the different drivers.  Getting
the first 2 optimizations into 4.0 would be nice, I'm sure someone would be
happy to work with you on it.

The third, I can't see why we'd need that right now.  It's going to be more
of an uphill battle, since we currently would have no notion of a shard in
Cassandra.  If you want to work on Thread Per Core for Cassandra 5.0 it
seems like it would be a reasonable addition to the protocol.

On Wed, Apr 18, 2018 at 1:24 PM sankalp kohli 
wrote:

> Do we have driver authors who wish to support both projects?
>
> On Wed, Apr 18, 2018 at 9:25 AM, Jeff Jirsa  wrote:
>
> > Removed other lists (please don't cross post)
> >
> >
> >
> >
> >
> > On Wed, Apr 18, 2018 at 3:47 AM, Avi Kivity  wrote:
> >
> > > Hello Cassandra developers,
> > >
> > >
> > > We're starting to see client protocol limitations impact performance,
> and
> > > so we'd like to evolve the protocol to remove the limitations. In order
> > to
> > > avoid fragmenting the driver ecosystem and reduce work duplication for
> > > driver authors, we'd like to avoid forking the protocol. Since these
> > issues
> > > affect Cassandra, either now or in the future, I'd like to cooperate on
> > > protocol development.
> > >
> > >
> > > Some issues that we'd like to work on near-term are:
> > >
> > >
> > > 1. Token-aware range queries
> > >
> > >
> > > When the server returns a page in a range query, it will also return a
> > > token to continue on. In case that token is on a different node, the
> > client
> > > selects a new coordinator based on the token. This eliminates a network
> > hop
> > > for range queries.
> > >
> > >
> > > For the first page, the PREPARE message returns information allowing
> the
> > > client to compute where the first page is held, given the query
> > parameters.
> > > This is just information identifying how to compute the token, given
> the
> > > query parameters (non-range queries already do this).
> > >
> > >
> > > https://issues.apache.org/jira/browse/CASSANDRA-14311
> > >
> > >
> > > 2. Per-request timeouts
> > >
> > >
> > > Allow each request to have its own timeout. This allows the user to set
> > > short timeouts on business-critical queries that are invalid if not
> > served
> > > within a short time, long timeouts for scanning or indexed queries, and
> > > even longer timeouts for administrative tasks like TRUNCATE and DROP.
> > >
> > >
> > > https://issues.apache.org/jira/browse/CASSANDRA-2848
> > >
> > >
> > > 3. Shard-aware driver
> > >
> > >
> > > This admittedly is a burning issue for ScyllaDB, but not so much for
> > > Cassandra at this time.
> > >
> > >
> > > In the same way that drivers are token-aware, they can be shard-aware -
> > > know how many shards each node has, and the sharding algorithm. They
> can
> > > then open a connection per shard and send cql requests directly to the
> > > shard that will serve them, instead of requiring cross-core
> communication
> > > to happen on the server.
> > >
> > >
> > > https://issues.apache.org/jira/browse/CASSANDRA-10989
> > >
> > >
> > > I see three possible modes of cooperation:
> > >
> > >
> > > 1. The protocol change is developed using the Cassandra process in a
> JIRA
> > > ticket, culminating in a patch to doc/native_protocol*.spec when
> > consensus
> > > is achieved.
> > >
> > >
> > > The advantage to this mode is that Cassandra developers can verify that
> > > the change is easily implementable; when they are ready to implement
> the
> > > feature, drivers that were already adapted to support it will just
> work.
> > >
> > >
> > > 2. The protocol change is developed outside the Cassandra process.
> > >
> > >
> > > In this mode, we develop the change in a forked version of
> > > native_protocol*.spec; Cassandra can still retroactively merge that
> > change
> > > when (and if) it is implemented, but the ability to influence the
> change
> > > during development is reduced.
> > >
> > >
> > > If we agree on this, I'd like to allocate a prefix for feature names in
> > > the SUPPORTED message for our use.
> > >
> > >
> > > 3. No cooperation.
> > >
> > >
> > > This requires the least amount of effort from Cassandra developers
> (just
> > > enough to reach this point in this email), but will cause duplication
> of
> > > effort for driver authors who wish to support both projects, and may
> > cause
> > > Cassandra developers to redo work that we already did.
> > >
> > >
> > > Looking forward to your views.
> > >
> > >
> > > Avi
> > >
> > >
> > > -
> > > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > > For additional commands, e-mail: dev-h...@cassandra.apache.org
> > >
> > >
> >
>


Re: Evolving the client protocol

2018-04-18 Thread sankalp kohli
Do we have driver authors who wish to support both projects?

On Wed, Apr 18, 2018 at 9:25 AM, Jeff Jirsa  wrote:

> Removed other lists (please don't cross post)
>
>
>
>
>
> On Wed, Apr 18, 2018 at 3:47 AM, Avi Kivity  wrote:
>
> > Hello Cassandra developers,
> >
> >
> > We're starting to see client protocol limitations impact performance, and
> > so we'd like to evolve the protocol to remove the limitations. In order
> to
> > avoid fragmenting the driver ecosystem and reduce work duplication for
> > driver authors, we'd like to avoid forking the protocol. Since these
> issues
> > affect Cassandra, either now or in the future, I'd like to cooperate on
> > protocol development.
> >
> >
> > Some issues that we'd like to work on near-term are:
> >
> >
> > 1. Token-aware range queries
> >
> >
> > When the server returns a page in a range query, it will also return a
> > token to continue on. In case that token is on a different node, the
> client
> > selects a new coordinator based on the token. This eliminates a network
> hop
> > for range queries.
> >
> >
> > For the first page, the PREPARE message returns information allowing the
> > client to compute where the first page is held, given the query
> parameters.
> > This is just information identifying how to compute the token, given the
> > query parameters (non-range queries already do this).
> >
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-14311
> >
> >
> > 2. Per-request timeouts
> >
> >
> > Allow each request to have its own timeout. This allows the user to set
> > short timeouts on business-critical queries that are invalid if not
> served
> > within a short time, long timeouts for scanning or indexed queries, and
> > even longer timeouts for administrative tasks like TRUNCATE and DROP.
> >
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-2848
> >
> >
> > 3. Shard-aware driver
> >
> >
> > This admittedly is a burning issue for ScyllaDB, but not so much for
> > Cassandra at this time.
> >
> >
> > In the same way that drivers are token-aware, they can be shard-aware -
> > know how many shards each node has, and the sharding algorithm. They can
> > then open a connection per shard and send cql requests directly to the
> > shard that will serve them, instead of requiring cross-core communication
> > to happen on the server.
> >
> >
> > https://issues.apache.org/jira/browse/CASSANDRA-10989
> >
> >
> > I see three possible modes of cooperation:
> >
> >
> > 1. The protocol change is developed using the Cassandra process in a JIRA
> > ticket, culminating in a patch to doc/native_protocol*.spec when
> consensus
> > is achieved.
> >
> >
> > The advantage to this mode is that Cassandra developers can verify that
> > the change is easily implementable; when they are ready to implement the
> > feature, drivers that were already adapted to support it will just work.
> >
> >
> > 2. The protocol change is developed outside the Cassandra process.
> >
> >
> > In this mode, we develop the change in a forked version of
> > native_protocol*.spec; Cassandra can still retroactively merge that
> change
> > when (and if) it is implemented, but the ability to influence the change
> > during development is reduced.
> >
> >
> > If we agree on this, I'd like to allocate a prefix for feature names in
> > the SUPPORTED message for our use.
> >
> >
> > 3. No cooperation.
> >
> >
> > This requires the least amount of effort from Cassandra developers (just
> > enough to reach this point in this email), but will cause duplication of
> > effort for driver authors who wish to support both projects, and may
> cause
> > Cassandra developers to redo work that we already did.
> >
> >
> > Looking forward to your views.
> >
> >
> > Avi
> >
> >
> > -
> > To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> > For additional commands, e-mail: dev-h...@cassandra.apache.org
> >
> >
>


Re: Evolving the client protocol

2018-04-18 Thread Glauber Costa
BTW,

Folks from Cassandra apparently didn'tr eceive this message.

On Wed, Apr 18, 2018 at 6:47 AM, Avi Kivity  wrote:

> Hello Cassandra developers,
>
>
> We're starting to see client protocol limitations impact performance, and
> so we'd like to evolve the protocol to remove the limitations. In order to
> avoid fragmenting the driver ecosystem and reduce work duplication for
> driver authors, we'd like to avoid forking the protocol. Since these issues
> affect Cassandra, either now or in the future, I'd like to cooperate on
> protocol development.
>
>
> Some issues that we'd like to work on near-term are:
>
>
> 1. Token-aware range queries
>
>
> When the server returns a page in a range query, it will also return a
> token to continue on. In case that token is on a different node, the client
> selects a new coordinator based on the token. This eliminates a network hop
> for range queries.
>
>
> For the first page, the PREPARE message returns information allowing the
> client to compute where the first page is held, given the query parameters.
> This is just information identifying how to compute the token, given the
> query parameters (non-range queries already do this).
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-14311
>
>
> 2. Per-request timeouts
>
>
> Allow each request to have its own timeout. This allows the user to set
> short timeouts on business-critical queries that are invalid if not served
> within a short time, long timeouts for scanning or indexed queries, and
> even longer timeouts for administrative tasks like TRUNCATE and DROP.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-2848
>
>
> 3. Shard-aware driver
>
>
> This admittedly is a burning issue for ScyllaDB, but not so much for
> Cassandra at this time.
>
>
> In the same way that drivers are token-aware, they can be shard-aware -
> know how many shards each node has, and the sharding algorithm. They can
> then open a connection per shard and send cql requests directly to the
> shard that will serve them, instead of requiring cross-core communication
> to happen on the server.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-10989
>
>
> I see three possible modes of cooperation:
>
>
> 1. The protocol change is developed using the Cassandra process in a JIRA
> ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> is achieved.
>
>
> The advantage to this mode is that Cassandra developers can verify that
> the change is easily implementable; when they are ready to implement the
> feature, drivers that were already adapted to support it will just work.
>
>
> 2. The protocol change is developed outside the Cassandra process.
>
>
> In this mode, we develop the change in a forked version of
> native_protocol*.spec; Cassandra can still retroactively merge that change
> when (and if) it is implemented, but the ability to influence the change
> during development is reduced.
>
>
> If we agree on this, I'd like to allocate a prefix for feature names in
> the SUPPORTED message for our use.
>
>
> 3. No cooperation.
>
>
> This requires the least amount of effort from Cassandra developers (just
> enough to reach this point in this email), but will cause duplication of
> effort for driver authors who wish to support both projects, and may cause
> Cassandra developers to redo work that we already did.
>
>
> Looking forward to your views.
>
>
> Avi
>
> --
> You received this message because you are subscribed to the Google Groups
> "ScyllaDB development" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to scylladb-dev+unsubscr...@googlegroups.com.
> To post to this group, send email to scylladb-...@googlegroups.com.
> To view this discussion on the web visit https://groups.google.com/d/ms
> gid/scylladb-dev/de6e33eb-b438-8524-ac99-a299e9ba0e72%40scylladb.com.
> For more options, visit https://groups.google.com/d/optout.
>


Re: Evolving the client protocol

2018-04-18 Thread Jeff Jirsa
Removed other lists (please don't cross post)





On Wed, Apr 18, 2018 at 3:47 AM, Avi Kivity  wrote:

> Hello Cassandra developers,
>
>
> We're starting to see client protocol limitations impact performance, and
> so we'd like to evolve the protocol to remove the limitations. In order to
> avoid fragmenting the driver ecosystem and reduce work duplication for
> driver authors, we'd like to avoid forking the protocol. Since these issues
> affect Cassandra, either now or in the future, I'd like to cooperate on
> protocol development.
>
>
> Some issues that we'd like to work on near-term are:
>
>
> 1. Token-aware range queries
>
>
> When the server returns a page in a range query, it will also return a
> token to continue on. In case that token is on a different node, the client
> selects a new coordinator based on the token. This eliminates a network hop
> for range queries.
>
>
> For the first page, the PREPARE message returns information allowing the
> client to compute where the first page is held, given the query parameters.
> This is just information identifying how to compute the token, given the
> query parameters (non-range queries already do this).
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-14311
>
>
> 2. Per-request timeouts
>
>
> Allow each request to have its own timeout. This allows the user to set
> short timeouts on business-critical queries that are invalid if not served
> within a short time, long timeouts for scanning or indexed queries, and
> even longer timeouts for administrative tasks like TRUNCATE and DROP.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-2848
>
>
> 3. Shard-aware driver
>
>
> This admittedly is a burning issue for ScyllaDB, but not so much for
> Cassandra at this time.
>
>
> In the same way that drivers are token-aware, they can be shard-aware -
> know how many shards each node has, and the sharding algorithm. They can
> then open a connection per shard and send cql requests directly to the
> shard that will serve them, instead of requiring cross-core communication
> to happen on the server.
>
>
> https://issues.apache.org/jira/browse/CASSANDRA-10989
>
>
> I see three possible modes of cooperation:
>
>
> 1. The protocol change is developed using the Cassandra process in a JIRA
> ticket, culminating in a patch to doc/native_protocol*.spec when consensus
> is achieved.
>
>
> The advantage to this mode is that Cassandra developers can verify that
> the change is easily implementable; when they are ready to implement the
> feature, drivers that were already adapted to support it will just work.
>
>
> 2. The protocol change is developed outside the Cassandra process.
>
>
> In this mode, we develop the change in a forked version of
> native_protocol*.spec; Cassandra can still retroactively merge that change
> when (and if) it is implemented, but the ability to influence the change
> during development is reduced.
>
>
> If we agree on this, I'd like to allocate a prefix for feature names in
> the SUPPORTED message for our use.
>
>
> 3. No cooperation.
>
>
> This requires the least amount of effort from Cassandra developers (just
> enough to reach this point in this email), but will cause duplication of
> effort for driver authors who wish to support both projects, and may cause
> Cassandra developers to redo work that we already did.
>
>
> Looking forward to your views.
>
>
> Avi
>
>
> -
> To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: dev-h...@cassandra.apache.org
>
>


Evolving the client protocol

2018-04-18 Thread Avi Kivity

Hello Cassandra developers,


We're starting to see client protocol limitations impact performance, 
and so we'd like to evolve the protocol to remove the limitations. In 
order to avoid fragmenting the driver ecosystem and reduce work 
duplication for driver authors, we'd like to avoid forking the protocol. 
Since these issues affect Cassandra, either now or in the future, I'd 
like to cooperate on protocol development.



Some issues that we'd like to work on near-term are:


1. Token-aware range queries


When the server returns a page in a range query, it will also return a 
token to continue on. In case that token is on a different node, the 
client selects a new coordinator based on the token. This eliminates a 
network hop for range queries.



For the first page, the PREPARE message returns information allowing the 
client to compute where the first page is held, given the query 
parameters. This is just information identifying how to compute the 
token, given the query parameters (non-range queries already do this).



https://issues.apache.org/jira/browse/CASSANDRA-14311


2. Per-request timeouts


Allow each request to have its own timeout. This allows the user to set 
short timeouts on business-critical queries that are invalid if not 
served within a short time, long timeouts for scanning or indexed 
queries, and even longer timeouts for administrative tasks like TRUNCATE 
and DROP.



https://issues.apache.org/jira/browse/CASSANDRA-2848


3. Shard-aware driver


This admittedly is a burning issue for ScyllaDB, but not so much for 
Cassandra at this time.



In the same way that drivers are token-aware, they can be shard-aware - 
know how many shards each node has, and the sharding algorithm. They can 
then open a connection per shard and send cql requests directly to the 
shard that will serve them, instead of requiring cross-core 
communication to happen on the server.



https://issues.apache.org/jira/browse/CASSANDRA-10989


I see three possible modes of cooperation:


1. The protocol change is developed using the Cassandra process in a 
JIRA ticket, culminating in a patch to doc/native_protocol*.spec when 
consensus is achieved.



The advantage to this mode is that Cassandra developers can verify that 
the change is easily implementable; when they are ready to implement the 
feature, drivers that were already adapted to support it will just work.



2. The protocol change is developed outside the Cassandra process.


In this mode, we develop the change in a forked version of 
native_protocol*.spec; Cassandra can still retroactively merge that 
change when (and if) it is implemented, but the ability to influence the 
change during development is reduced.



If we agree on this, I'd like to allocate a prefix for feature names in 
the SUPPORTED message for our use.



3. No cooperation.


This requires the least amount of effort from Cassandra developers (just 
enough to reach this point in this email), but will cause duplication of 
effort for driver authors who wish to support both projects, and may 
cause Cassandra developers to redo work that we already did.



Looking forward to your views.


Avi


-
To unsubscribe, e-mail: dev-unsubscr...@cassandra.apache.org
For additional commands, e-mail: dev-h...@cassandra.apache.org