Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Regis Le Bretonnic
Hi Jeff

Today, we use proxy nodes with PHP, and we are migrating from PHP to Java.
Our farm of PHP servers is made up of 80 physical servers, each of them
running a local proxy node.

So our Cassandra cluster is made up of:
- DC1: 6 data nodes + 40 proxies
- DC2: 6 data nodes + 40 proxies

During our tests with Java, we still have our 80 proxy nodes.
We have defined 4 of them as contact points, and the same 4 are also listed
in a whitelist policy.


Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Regis Le Bretonnic
Hi Stefan,

Your analysis is exactly what happens!
What I can say is that we are migrating from PHP to Java, and the behaviour
of the PHP (or C++) driver is completely different.

* The topology of the Cassandra cluster returned by the contact point with
the PHP driver includes data nodes + proxy nodes.
* The topology of the Cassandra cluster returned by the contact point
with the Java driver includes only data nodes, and there is a check based
on the system.peers table.

If this helps, proxy nodes are part of gossip and participate in
handshakes. They can be listed with "nodetool gossipinfo" (but I can't find
a system table holding this information), but they are not listed by
"nodetool status" and are not registered in system.peers.

The challenge is probably to understand how the PHP driver manages to get
a full list that includes the proxy nodes.
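
The observation above (proxies show up in gossip but not in the ring) amounts to a set difference. A minimal Java sketch of that arithmetic follows; the class name, method name and addresses are hypothetical, and collecting the two input sets from "nodetool gossipinfo" and "nodetool status" is left out. The result is only a list of candidates, since other nodes that have not joined the ring would appear in it too.

```java
import java.util.Set;
import java.util.TreeSet;

/** Set arithmetic behind the observation above; gathering the inputs is left out. */
final class ProxyCandidatesSketch {

    /**
     * @param gossipEndpoints addresses seen in gossip (e.g. copied from "nodetool gossipinfo")
     * @param ringMembers     addresses shown by "nodetool status"
     * @return endpoints that gossip but are not ring members, sorted for stable output
     */
    static Set<String> proxyCandidates(Set<String> gossipEndpoints, Set<String> ringMembers) {
        Set<String> candidates = new TreeSet<>(gossipEndpoints);
        candidates.removeAll(ringMembers);
        return candidates;
    }

    public static void main(String[] args) {
        // Hypothetical addresses, for illustration only.
        Set<String> gossip = Set.of("10.0.0.1", "10.0.0.2", "10.0.1.1", "10.0.1.2");
        Set<String> ring = Set.of("10.0.0.1", "10.0.0.2");
        System.out.println(proxyCandidates(gossip, ring)); // [10.0.1.1, 10.0.1.2]
    }
}
```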


Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Miklosovic, Stefan via user
It will use the first contact point to connect to the database and, once
connected, it will read the peers table, which is empty. Contact points are
really just that: contact points. I do not think all of them will be used in
some round-robin fashion. They are there just so the driver can read the
peers table and then use the nodes listed there, not the contact points
themselves.

I think the same would be seen if you specified two contact points where the
first one is a non-existent IP address and the second one is a proxy. It
should connect to that proxy, which again reads the peers table as empty.
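
The behaviour described above can be modelled with a small, self-contained Java sketch. It is a toy model, not driver code: the class, the method and the `peersByNode` map (a stand-in for each node's peers table) are hypothetical, and the empty peers table simply mirrors what this message reports.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model of the discovery behaviour described above; not driver code. */
final class DiscoverySketch {

    /**
     * Returns the nodes a client would know about: the first reachable contact
     * point plus whatever that node lists in its peers table.
     *
     * @param contactPoints addresses given to the client, tried in order
     * @param reachable     addresses that accept connections
     * @param peersByNode   hypothetical stand-in for each node's peers table
     */
    static Set<String> discover(List<String> contactPoints,
                                Set<String> reachable,
                                Map<String, List<String>> peersByNode) {
        for (String contactPoint : contactPoints) {
            if (!reachable.contains(contactPoint)) {
                continue; // e.g. a non-existent IP: skip it and try the next one
            }
            Set<String> known = new LinkedHashSet<>();
            known.add(contactPoint); // the node the client actually connected to
            known.addAll(peersByNode.getOrDefault(contactPoint, List.<String>of()));
            return known; // the remaining contact points are never consulted
        }
        return Set.of(); // no contact point answered
    }

    public static void main(String[] args) {
        List<String> contactPoints = List.of("10.0.0.99", "proxy-1", "proxy-2");
        Set<String> reachable = Set.of("proxy-1", "proxy-2");
        // Peers table read through a proxy comes back empty, as reported above.
        Map<String, List<String>> peers =
                Map.of("proxy-1", List.<String>of(), "proxy-2", List.<String>of());
        System.out.println(discover(contactPoints, reachable, peers)); // [proxy-1]
    }
}
```

In this model the second proxy is never used even though it was passed as a contact point, which matches the two connections to a single proxy reported in the original post.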

I was involved in some investigation around this functionality and I hit
basically the same problem. My idea was to put these proxies into the peers
table, but that complicates things quite fast as they are not proper members
of the ring, by definition, since they do not hold data, etc.

I think this would need to be fixed in the driver: include all contact
points even if they are not found in peers. But if they are not part of the
ring, they can never "leave" the ring. I wonder if they are visible in
gossip etc.; I do not remember. Hence, how would you know that your proxy
went down?
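
One way to picture the idea above is a registry that always keeps the contact points and judges liveness from connection events instead of ring membership. The Java sketch below is purely illustrative: `PinnedNodeRegistry` and its methods are hypothetical names, not part of any driver, and it does not answer how a real driver should detect a failed proxy.

```java
import java.util.Collection;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Toy registry illustrating the idea above; it is not part of any driver. */
final class PinnedNodeRegistry {
    private final Set<String> pinned; // contact points, kept even if absent from peers
    private final Set<String> fromPeers = new LinkedHashSet<>();
    private final Set<String> down = new HashSet<>();

    PinnedNodeRegistry(Collection<String> contactPoints) {
        this.pinned = new LinkedHashSet<>(contactPoints);
    }

    /** Replaces the discovered nodes; pinned contact points are unaffected. */
    void refreshFromPeers(Collection<String> peers) {
        fromPeers.clear();
        fromPeers.addAll(peers);
    }

    /** With no ring membership to consult, liveness comes from connection events. */
    void onConnectionLost(String node) {
        down.add(node);
    }

    void onConnectionRestored(String node) {
        down.remove(node);
    }

    /** Pinned and discovered nodes, minus those currently marked down. */
    Set<String> usableNodes() {
        Set<String> usable = new LinkedHashSet<>(pinned);
        usable.addAll(fromPeers);
        usable.removeAll(down);
        return usable;
    }

    public static void main(String[] args) {
        PinnedNodeRegistry registry = new PinnedNodeRegistry(List.of("proxy-1", "proxy-2"));
        registry.refreshFromPeers(List.<String>of()); // empty peers table
        System.out.println(registry.usableNodes());   // [proxy-1, proxy-2]
        registry.onConnectionLost("proxy-2");
        System.out.println(registry.usableNodes());   // [proxy-1]
    }
}
```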


From: Jeff Jirsa 
Sent: Thursday, October 12, 2023 14:20
To: user@cassandra.apache.org
Subject: Re: java driver with cassandra proxies (option: 
-Dcassandra.join_ring=false)

Just to be clear:

- How many of the proxy nodes are you providing as contact points? One of them 
or all of them?

It sounds like you're saying you're passing all of them, and only one is 
connecting, and the driver is declining to connect to the rest because they're 
not in system.peers. I'm not surprised that the proxies aren't in system.peers, 
but I'd have also expected that if you pass all proxies in contact points, it'd 
connect to all of them, so I think you're appropriately surprised here.



On Thu, Oct 12, 2023 at 5:09 AM Regis Le Bretonnic 
<r.lebreton...@meetic-corp.com> wrote:
We have tested Stargate and were very disappointed...

Originally our architecture was PHP microservices (with FPM) + Cassandra 
proxies.
But we were blocked because the PHP driver is no longer supported.

We ran tests to keep PHP + Stargate but there were many issues, the main one 
(but not the only one) being that Stargate does not support the "ALLOW 
FILTERING" clause. I don't want to re-open this debate, which I already had 
with the Stargate maintainers...

We finally decided to move from PHP to Java, but we'd like to keep the 
Cassandra proxies, which are very useful.

Regards

On Thu, 12 Oct 2023 at 12:05, Erick Ramirez 
<erickramire...@apache.org> wrote:
Those nodes are not in the peers table(s) because you told them NOT to join the 
ring with `join_ring=false` so it is working by design.

I'm not really sure what you're trying to achieve but if you want to separate 
the coordinator functions from the storage then what you probably want is to 
deploy Stargate nodes<https://stargate.io/>. Stargate is a data API gateway 
that sits between the app instances and the Cassandra database. It decouples 
client request coordination from the storage aspects of C*. It also allows you 
to perform CRUD operations against C* using APIs -- REST, JSON, gRPC, GraphQL.

See the docs on Using the Stargate CQL 
API<https://stargate.io/docs/latest/develop/dev-with-cql.html> for details on 
how to set up Stargate nodes as coordinators for your C* database.

If you want to see it in action, you can try it free on Astra 
DB<https://astra.datastax.com/> (Cassandra-as-a-service). Cheers!


Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Jeff Jirsa
Just to be clear:

- How many of the proxy nodes are you providing as contact points? One of
them or all of them?

It sounds like you're saying you're passing all of them, and only one is
connecting, and the driver is declining to connect to the rest because
they're not in system.peers. I'm not surprised that the proxies aren't in
system.peers, but I'd have also expected that if you pass all proxies in
contact points, it'd connect to all of them, so I think you're
appropriately surprised here.




Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Regis Le Bretonnic
We have tested Stargate and were very disappointed...

Originally our architecture was PHP microservices (with FPM) + Cassandra
proxies.
But we were blocked because the PHP driver is no longer supported.

We ran tests to keep PHP + Stargate but there were many issues, the main
one (but not the only one) being that Stargate does not support the "ALLOW
FILTERING" clause. I don't want to re-open this debate, which I already had
with the Stargate maintainers...

We finally decided to move from PHP to Java, but we'd like to keep the
Cassandra proxies, which are very useful.

Regards


Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Erick Ramirez
Those nodes are not in the peers table(s) because you told them NOT to join
the ring with `join_ring=false` so it is working by design.

I'm not really sure what you're trying to achieve but if you want to
separate the coordinator functions from the storage then what you probably
want is to deploy Stargate nodes <https://stargate.io/>. Stargate is a data
API gateway that sits between the app instances and the Cassandra database.
It decouples client request coordination from the storage aspects of C*. It
also allows you to perform CRUD operations against C* using APIs -- REST,
JSON, gRPC, GraphQL.

See the docs on Using the Stargate CQL API
<https://stargate.io/docs/latest/develop/dev-with-cql.html> for details on
how to set up Stargate nodes as coordinators for your C* database.

If you want to see it in action, you can try it free on Astra DB
<https://astra.datastax.com/> (Cassandra-as-a-service). Cheers!


Re: java driver with cassandra proxies (option: -Dcassandra.join_ring=false)

2023-10-12 Thread Bowen Song via user
I'm not 100% sure, but it's worth trying to disable the token metadata, 
because the driver needs to read the "system.peers_v2" table to 
populate the token metadata.


On 11/10/2023 19:15, Regis Le Bretonnic wrote:

Hi (also posted on the dev mailing list, but I'm not sure I can post there),

We use the DataStax Cassandra Java driver v4.15.0 and we want to limit 
connections to Cassandra proxy nodes only (nodes with no data, started with 
the option -Dcassandra.join_ring=false).
For that:
  - we configured the driver to have only proxy hosts in the contact points 
(datastax-java-driver.basic.contact-points)
  - we added a custom configuration entry containing "whitelisted hosts" (same 
list as the contact points)
  - we implemented a custom NodeFilter class to limit the allowed nodes to the 
whitelisted ones
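
The whitelist check from the last step can be sketched in plain Java as a predicate over socket addresses. This is only an illustration under assumptions: the class and method names and the "host:port" entry format are hypothetical, and the wiring into the driver's node filter (which works on the driver's own node type) is not shown.

```java
import java.net.InetSocketAddress;
import java.util.Collection;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical whitelist check; wiring it into the driver is not shown. */
final class WhitelistFilterSketch {

    /** Builds a predicate that accepts only addresses listed as "host:port". */
    static Predicate<InetSocketAddress> allowOnly(Collection<String> whitelist) {
        Set<String> allowed = whitelist.stream()
                .map(entry -> entry.trim().toLowerCase(Locale.ROOT))
                .collect(Collectors.toSet());
        // getHostString() returns the host as given, without a reverse DNS lookup.
        return address -> allowed.contains(
                (address.getHostString() + ":" + address.getPort()).toLowerCase(Locale.ROOT));
    }

    public static void main(String[] args) {
        // Hypothetical proxy addresses, for illustration only.
        Predicate<InetSocketAddress> filter =
                allowOnly(List.of("10.0.1.1:9042", "10.0.1.2:9042"));
        System.out.println(filter.test(InetSocketAddress.createUnresolved("10.0.1.1", 9042))); // true
        System.out.println(filter.test(InetSocketAddress.createUnresolved("10.0.0.1", 9042))); // false
    }
}
```

A filter of this kind can only reject nodes the driver has already discovered. It cannot add the proxies that are missing from the peers table.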

If we look at the open TCP connections between the client and the Cassandra 
cluster, we see only 2:
  - one to one of the proxies listed in the contact points (coordinator 
connection)
  - another one to the same proxy (query connection)

We expected to have an open connection to each proxy listed in the contact 
points / whitelisted hosts.
We found that this is not the case because, during cluster discovery, the 
driver executes a query on the table "system.peers" or "system.peers_v2" 
(done in the DefaultTopologyMonitor class) and proxy nodes are not in this 
table.

Why are proxy nodes not listed in system.peers, and why does discovery check 
this table? Is it possible to bypass this check or to add these nodes to the 
"peers" table?
Is there a way to implement a custom version of the TopologyMonitor interface 
to bypass this mechanism?
Is there another way to do this?

Thanks in advance
Regards