Re: Async queries

2017-10-02 Thread Andy Tolbert
Hi Charu,

Since the driver uses Guava futures, you can use some of the methods in
Futures to add listeners, callbacks and transformers that are invoked
when the future completes, without blocking the calling thread the way
getUninterruptibly does.  For example, the following registers a
callback whose onSuccess or onFailure method is called based on the
outcome of the query.

Futures.addCallback(future, new FutureCallback<ResultSet>() {
    @Override
    public void onSuccess(ResultSet result) {
        // process result
    }

    @Override
    public void onFailure(Throwable t) {
        // log exception
    }
});

You can read more about using the driver's async features in the driver
documentation.

Since executeAsync does not block, you'll want to be careful not to
submit too many requests at a time, as this can degrade performance and
may explain what you are observing.  One simple (although somewhat crude)
way of handling this is to use a Semaphore with a fixed number of permits.
You would acquire a Semaphore permit before you execute a query, and then
release a permit in a callback on completion of the request.  This would
cause your calling thread to block whenever you run out of permits, and
then continue when a query completes and releases a permit.
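A minimal, runnable sketch of that throttling pattern in plain Java. A
CompletableFuture on a thread pool stands in for the driver's
ResultSetFuture, and the class and constant names are invented for
illustration; with the driver, the callback would be registered via
Futures.addCallback as above.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

public class ThrottledAsync {
    public static void main(String[] args) throws Exception {
        final int MAX_IN_FLIGHT = 8;            // fixed number of permits
        Semaphore permits = new Semaphore(MAX_IN_FLIGHT);
        AtomicInteger completed = new AtomicInteger();
        ExecutorService pool = Executors.newFixedThreadPool(4);

        for (int i = 0; i < 100; i++) {
            permits.acquire();                  // blocks once 8 "queries" are in flight
            // Stand-in for session.executeAsync(bind); with the driver this
            // would be a ResultSetFuture plus Futures.addCallback(...).
            CompletableFuture
                .runAsync(() -> { /* the "query" runs here */ }, pool)
                .whenComplete((result, error) -> {
                    completed.incrementAndGet();
                    permits.release();          // free a slot on success or failure
                });
        }
        pool.shutdown();
        pool.awaitTermination(10, TimeUnit.SECONDS);
        if (completed.get() != 100) throw new IllegalStateException("lost callbacks");
        System.out.println("completed " + completed.get() + " queries");
    }
}
```

Releasing the permit in the completion callback (not after submit) is
what caps the number of outstanding requests.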

The upcoming version (4.0) of the java driver uses
CompletionStage/CompletableFuture (java 8 futures), although we'll probably
provide a guava extension as well for those who still want to use
ListenableFuture.

Thanks,
Andy


On Mon, Oct 2, 2017 at 6:44 PM Charulata Sharma (charshar) <
chars...@cisco.com> wrote:

> Hi ,
>
>
>
>
>
> We are observing some performance issues when executing a large number of
> read/write queries.
>
> We use executeAsync query for most of our read and write requests and then
> future.getUninterruptibly() methods before returning to the client
> application.
>
>
>
>
>
> Code snippet is (in the bind portion we have some GSON object
> conversions):
>
>
>
>  List<ResultSetFuture> futures = new ArrayList<>();
>
>
>
>   BoundStatement bind ;
>
>   For(loop condition) {
>
> bind =PreparedStatement.bind(….) //The PreparedStatement is
> prepared outside the loop.
>
>resultSetFuture = SESSION.executeAsync(bind);
>
>
>  futures.add(resultSetFuture);
>
>   }
>
>
>
> for(ResultSetFuture future: futures){
>
>future.getUninterruptibly();
>
> }
>
>
>
>
>
> Reading through the documents, I found that although the queries are
> executed in an async fashion, future.getUninterruptibly() is a
> blocking call.
>
> I am trying to implement a callable future, but wanted to know from the
> community if there is any better way of doing this and if changing to
> callable future will help.
>
>
>
>
>
> Thanks,
>
> Charu
>


Re: Migrating a Limit/Offset Pagination and Sorting to Cassandra

2017-10-04 Thread Andy Tolbert
Hi Daniel,

To answer this question:

> How long is the paginationState from the driver current?

The paging state itself contains information about the position in data
where to proceed from, so you don't need to worry about it becoming
stale/invalid.  The only exception is if you upgrade your cluster and start
using a newer protocol version, at which point the paging state will likely
become invalid.  The java driver guide has an explanation of saving and
reusing the paging state that explains this.
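A rough plain-Java sketch of the save-and-reuse idea keyed by offset, as
discussed later in this thread. The opaque String token stands in for
the driver's PagingState and the List stands in for the table; all names
are invented for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PagingStateCache {
    private final List<String> rows;
    // paging state keyed by the offset at which the next page starts
    private final Map<Integer, String> stateByOffset = new HashMap<>();

    PagingStateCache(List<String> rows) { this.rows = rows; }

    List<String> page(int offset, int limit) {
        String token = stateByOffset.get(offset);
        // With a cached token we resume exactly where the last page ended;
        // a cold offset (no token) would force a scan-and-skip or an error.
        int start = (token != null) ? Integer.parseInt(token) : offset;
        int end = Math.min(start + limit, rows.size());
        List<String> page = new ArrayList<>(rows.subList(start, end));
        stateByOffset.put(end, String.valueOf(end));  // token for the next request
        return page;
    }

    public static void main(String[] args) {
        List<String> data = new ArrayList<>();
        for (int i = 0; i < 10; i++) data.add("row" + i);
        PagingStateCache cache = new PagingStateCache(data);
        List<String> p1 = cache.page(0, 4);   // first page, no token yet
        List<String> p2 = cache.page(4, 4);   // resumes from the cached token
        if (!p1.get(0).equals("row0") || !p2.get(0).equals("row4"))
            throw new IllegalStateException("unexpected pages");
        System.out.println(p1 + " then " + p2);
    }
}
```

In a real service the token would be the serialized PagingState (possibly
stored with a TTL), not an index into an in-memory list.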

Thanks,
Andy

On Wed, Oct 4, 2017 at 1:36 AM Greg Saylor  wrote:

> Without knowing other details, of course, have you considered using
> something like Elassandra?  That is a pretty tightly integrated Cassandra +
> Elastic Search solution.   You’d insert data into Cassandra like you do
> normally, then query it with Elastic Search.  Of course this would increase
> the size of your storage requirements.
>
> - Greg
>
>
> On Oct 3, 2017, at 11:10 PM, Daniel Hölbling-Inzko <
> daniel.hoelbling-in...@bitmovin.com> wrote:
>
> Thanks Kurt,
> I thought about that but one issue is that we are doing limit/offset not
> pages. So one customer can choose to page through the list in 10 Item
> increments, another might want to page through with 100 elements per page.
> So I can't have a clustering key that represents a page range.
>
> What I was thinking about doing was saving the paginationState in a
> separate table along with limit/offset info of the last query the
> paginationState originated from so I can use the last paginationState to
> continue the iteration from if the customer requests the next page with the
> same limit but a different offset.
> This breaks down if the customer does a cold offset=1000 request but
> that's something I can throw error messages for at, what I do need to
> support is a customer doing
> Request 1: offset=0 + limit=100
> Request 2: offset=100 + limit=100
> Request 3: offset=200 + limit=100
>
> So next question would be: How long is the paginationState from the driver
> current? I was thinking about inserting the paginationState with a TTL into
> another Cassandra table - not sure if that's smart though.
>
> greetings Daniel
>
> On Tue, 3 Oct 2017 at 12:20 kurt greaves  wrote:
>
>> I get the impression that you are paging through a single partition in
>> Cassandra? If so you should probably use bounds on clustering keys to get
>> your "next page". You could use LIMIT as well here but it's mostly
>> unnecessary. Probably just use the pagesize that you intend for the API.
>>
>> Yes you'll need a table for each sort order, which ties into how you
>> would use clustering keys for LIMIT/OFFSET. Essentially just do range
>> slices on the clustering keys for each table to get your "pages".
>>
>> Also I'm assuming there's a lot of data per partition if in-mem sorting
>> isn't an option, if this is true you will want to be wary of creating large
>> partitions and reading them all at once. Although this depends on your data
>> model and compaction strategy choices.
>>
>> On 3 October 2017 at 08:36, Daniel Hölbling-Inzko <
>> daniel.hoelbling-in...@bitmovin.com> wrote:
>>
>>> Hi,
>>> I am currently working on migrating a service that so far was MySQL
>>> based to Cassandra.
>>> Everything seems to work fine so far, but a few things in the old
>>> services API Spec is posing some interesting data modeling challenges:
>>>
>>> The old service was doing Limit/Offset pagination which is obviously
>>> something Cassandra can't really do. I understand how paginationState works
>>> - but so far I haven't figured out a good way to make Limit/Offset work on
>>> top of paginationState (as I need to be 100% backwards compatible).
>>> The only ways which I could think of to make Limit/Offset work would
>>> create scalability issues down the road.
>>>
>>> The old service allowed sorting by any field. If I understood correctly
>>> that would require a table for each sort order right? (In-Mem sorting is
>>> not an option unfortunately)
>>> In doing so, how can I make the Java Datastax mapper save to another
>>> table (I really don't want to be writing a Subclass of the Entity for each
>>> Table to add the @Table annotation.
>>>
>>> greetings Daniel
>>>
>>
>>
>


Re: Limit on number of connections to Cassandra

2017-09-08 Thread Andy Tolbert
Hello,

If I'm understanding the question correctly, as of C* 2.0.15 / 2.1.5 via
CASSANDRA-8086 you can limit the maximum number of connections allowed
to a C* node via native_transport_max_concurrent_connections in
cassandra.yaml.

As far as the java driver goes, newer versions (i.e. 2.1.10+, 3.0.1+)
behave in such a way that as long as the driver can maintain at least one
connection to a node, it considers that node up.  If it can no longer
maintain a connection, it will not send requests to that node and will
try reconnecting per the configured reconnection policy.  By default with
C* 2.1+ (protocol version V3+), the driver will only maintain one
connection per host in the local data center, although you can tweak this
using PoolingOptions.

Thanks,
Andy


On Fri, Sep 8, 2017 at 1:27 PM techpyaasa .  wrote:

> Hi
>
> Is there any limit on number of client connections to Cassandra just like
> MySQL etc., ?
>
> If YES, what is that & how can we set that?
>
> If NO , how will get to know that node has reached it's capacity serving
> client requests/over loaded?
>
> Using C*-2.1.17 , datastax java driver
>
>
> Thanks
> Techpyaasa
>


Re: Cassandra compatibility matrix

2017-09-12 Thread Andy Tolbert
Hi Dmitry,

We are currently working on updating our compatibility matrices for the
drivers, but any version of driver 3.0+ will work with Cassandra 3.x.

Thanks,
Andy

On Tue, Sep 12, 2017 at 12:30 PM Dmitry Buzolin 
wrote:

> Thank you Jon.
>
> Is there way to find what Datastax driver is compatible with Cassandra
> 3.11?
>
> http://docs.datastax.com/en/developer/driver-matrix/doc/javaDrivers.html#java-drivers
> For some reason they don’t print latest version of Cassandra.
>
>
> On Sep 7, 2017, at 1:15 PM, Jon Haddad  wrote:
>
> There aren’t any drivers maintained by the Cassandra project.
> Compatibility is up to each driver.  Usually a section is included in the
> README.  For instance, in the DataStax Java Driver:
> https://github.com/datastax/java-driver#compatibility
>
> Jon
>
> On Sep 7, 2017, at 9:39 AM, Dmitry Buzolin  wrote:
>
> Hello list!
>
> Where can I find C* compatibility matrix for example server server version
> is compatible with what client drivers? Thank you!
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>
>
>


Re: DataStax Java driver QueryBuilder: CREATE table?

2017-12-14 Thread Andy Tolbert
Hi Oliver,

SchemaBuilder enables building schema DDL statements like CREATE TABLE,
CREATE KEYSPACE and so on.  You can find some examples in the driver's
tests.
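As a minimal sketch with the 3.x driver's SchemaBuilder (the keyspace,
table and column names here are invented for illustration, and `session`
is assumed to be an existing Session):

```java
import com.datastax.driver.core.DataType;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.schemabuilder.Create;
import com.datastax.driver.core.schemabuilder.SchemaBuilder;

public class CreateTableExample {
    static void createUsersTable(Session session) {
        // Builds: CREATE TABLE IF NOT EXISTS my_ks.users (...)
        Create create = SchemaBuilder.createTable("my_ks", "users")
                .ifNotExists()
                .addPartitionKey("user_id", DataType.uuid())
                .addClusteringColumn("created_at", DataType.timestamp())
                .addColumn("name", DataType.text());
        session.execute(create);  // executed like any other statement
    }
}
```

The built statement is just a regular Statement, so it can also be
executed asynchronously.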

Thanks,
Andy

On Thu, Dec 14, 2017 at 5:16 PM Oliver Ruebenacker  wrote:

>
>  Hello,
>
>   I'm using the DataStax Java Driver, which has a QueryBuilder class to
> construct CQL statements. I can see how to build SELECT, INSERT, TRUNCATE
> etc statements, but I can't find how to build a CREATE statement. Am I
> missing something?
>
>   Thanks!
>
>  Best, Oliver
>
>
> --
> Oliver Ruebenacker
> Senior Software Engineer, Diabetes Portal
> , Broad Institute
> 
>
>


Re: code snippet for cqlsh COPY from

2017-10-25 Thread Andy Tolbert
Hi Suresh,

cqlsh COPY does batches intelligently by only grouping inserts targeting
the same partition in a batch.

As of version 3.6, C* will not emit the "batch size exceeded" errors if
all statements in a batch belong to the same partition (CASSANDRA-13467).

The docs (https://cassandra.apache.org/doc/latest/tools/cqlsh.html#copy-from)
are a good reference for how to use copy from.

https://www.datastax.com/dev/blog/new-features-in-cqlsh-copy is also a good
reference.

Here's an example from something I was working from locally:

cqlsh -e "COPY andy.table100b (pkey,skey,text1,text2,text3,text4,text5)
from 'csv/ordered/100b/*.csv' WITH header = true AND INGESTRATE=100 AND
NUMPROCESSES=32 AND MAXBATCHSIZE=100;" myhostname

Note you should probably still keep your batches relatively small even
with single-partition batches, depending on your dataset.  In my
particular case I was working with relatively small data (100-byte rows).
There are diminishing returns in terms of throughput as you increase your
batch size, but that will vary based on your data and environment.

Thanks,
Andy


On Wed, Oct 25, 2017 at 11:51 AM Suresh Babu Mallampati <
smallampat...@gmail.com> wrote:

> Hi All,
>
> Can someone provide me the code snippet for the cqlsh COPY from csv file.
>
> I just want to know how that COPY mechanism works compared to normal
> insert/commit to avoid the batch size exceeding the limit.
>
> Thanks,
> Suresh.
>


Re: Cassandra client drivers

2018-05-07 Thread Andy Tolbert
Hi Abdul,

If you are already at C* 3.1.0 and the driver you are using works, it's
almost certain to also work with 3.11.2, as there are no protocol changes
between these versions.  I would advise testing your application against
a 3.11.2 test environment first though, just to be safe ;).

Thanks,
Andy

On Mon, May 7, 2018 at 5:47 PM, Abdul Patel  wrote:

> Hi
>
> I am.planning for upgrade from 3.1.0 to 3.11.2 , just wanted to confirm if
> cleint drivers need to upgraded? Or it will work with 3.1.0 drivers?
>
>
>


Re: Client ID logging

2018-05-21 Thread Andy Tolbert
CASSANDRA-13665 adds a 'nodetool clientlist' command which I think would
be helpful in this circumstance.  That feature is targeted for C* 4.0,
however.

You could use something like lsof to see what active TCP connections
there are to the host servers running your C* cluster, to capture the IP
addresses of the clients connected to your cluster.

Thanks,
Andy

On Mon, May 21, 2018 at 1:42 PM, Hannu Kröger  wrote:

> Hmm, I think that by default not but you can create a hook to log that.
> Create a wrapper for PasswordAuthenticator class for example and use that.
> Or if you don’t use authentication you can create your own query handler.
>
> Hannu
>
> James Lovato  kirjoitti 21.5.2018 kello 21.37:
>
> Hi guys,
>
>
>
> Can standard OSS Cassandra 3 do logging of who connects to it?  We have a
> cluster in 3 DCs and our devs want to see if the client is crossing across
> DC (even though they have DCLOCAL set from their DS driver).
>
>
>
> Thanks,
> James
>
>


Re: How to get page id without transmitting data to client

2017-12-29 Thread Andy Tolbert
Hi Eunsu,

Unfortunately there is not really a way to do this that I'm aware of.  The
page id contains data indicating where to start reading the next set of
rows (such as partition and clustering information), and in order to get to
that position you have to actually read the data.

The driver does have an API for manually specifying the page id to use,
and we've documented some strategies for storing and reusing the page id
later, but not sure if that helps for your particular use case.

Thanks,
Andy

On Thu, Dec 28, 2017 at 9:11 PM, Eunsu Kim  wrote:

> Hello everybody,
>
> I am using the datastax Java driver (3.3.0).
>
> When query large amounts of data, we set the fetch size (1) and
> transmit the data to the browser on a page-by-page basis.
>
> I am wondering if I can get the page id without receiving the real rows
> from the cassandra to my server.
>
> I only need 100 in front of 100,000. But I want the next page to be
> 11th.
>
> If you have a good idea, please share it.
>
> Thank you.
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: why returned achievedConsistencyLevel is null

2018-08-25 Thread Andy Tolbert
Hi Vitaliy,

That method 
(https://docs.datastax.com/en/latest-java-driver-api/com/datastax/driver/core/ExecutionInfo.html#getAchievedConsistencyLevel--)
is a bit confusing as it will return null when your desired
consistency level is achieved:

> If the query returned without achieving the requested consistency level due 
> to the RetryPolicy, this return the biggest consistency level that has been 
> actually achieved by the query.
>
> Note that the default RetryPolicy (DefaultRetryPolicy) will never allow a 
> query to be successful without achieving the initially requested consistency 
> level and hence with that default policy, this method will always return 
> null. However, it might occasionally return a non-null with say, 
> DowngradingConsistencyRetryPolicy.

As long as you are using a RetryPolicy that doesn't downgrade
Consistency Level on retry, you can expect this method to always
return null.  I heavily discourage downgrading consistency levels on
retry, you can read the driver team's rationale about it here
(https://docs.datastax.com/en/developer/java-driver/3.5/upgrade_guide/#3-5-0).

> Is it possible to make DataStax driver throw an exception in case
> desired consistency level was not achieved during the insert?

This is actually the default behavior.  If consistency level cannot be
met within Cassandra's configured timeouts, or if not enough replicas
are available to service the consistency level from the start, C* will
raise ReadTimeout, WriteTimeout or Unavailable exceptions
respectively.  The driver can be configured to retry on those errors
per RetryPolicy, although there is some nuance when it comes to it not
retrying statements that are non-idempotent
(https://docs.datastax.com/en/developer/java-driver/3.5/manual/retries/#retries-and-idempotence).
If the driver is not configured to retry, it will raise the exception
to the user.

In summary, as long as you aren't using some form of downgrading
consistency retry policy, if you get a successfully completed request,
you can assume the consistency level you have configured was met for
your operations.
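As a small illustration of the idempotence point with the 3.x driver
(statement text, table and names are invented, and `session` is assumed
to be an existing Session):

```java
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class IdempotentExample {
    static void update(Session session, java.util.UUID userId) {
        // Idempotent writes are safe for the RetryPolicy to re-execute on
        // timeouts; non-idempotent statements are not retried by default.
        Statement stmt = new SimpleStatement(
                "UPDATE users SET name = ? WHERE user_id = ?", "alice", userId)
                .setIdempotent(true);
        session.execute(stmt);
    }
}
```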

Thanks,
Andy



On Fri, Aug 24, 2018 at 4:14 PM Vitaliy Semochkin  wrote:
>
> HI,
>
> While using DataStax driver
> session.execute("some insert
> query")getExecutionInfo().getAchievedConsistencyLevel()
> is already returned as null, despite data is stored. Why could it be?
>
> Is it possible to make DataStax driver throw an exception in case
> desired consistency level was not achieved during the insert?
>
> Regards,
> Vitaliy
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>




Re: which driver to use with cassandra 3

2018-07-20 Thread Andy Tolbert
Hi Vitaliy,

> Spring uses driver from datastax, though is it a reliable solution for
> a long term project, having in mind that datastax and cassandra
> parted?
>

I would definitely consider the DataStax drivers a reliable solution going
forward.  We remain very committed to supporting our C* drivers at
DataStax.  We are actively working to stay on top of supporting C* 4.0
features and we intend that the 3.x driver will support C* 4.0.

We are also hard at work on our next major release of the java driver
(4.0), for which a beta release is imminent.  4.0 has some pretty big
changes, so we intend to continue maintaining the 3.x line for quite a
while.

Thanks,
Andy

On Fri, Jul 20, 2018 at 9:54 AM, Vitaliy Semochkin 
wrote:

> Thank you very much Duy Hai Doan!
> I have relatively simple demands and since spring using datastax
> driver I can always get back to it,
> though  I would prefer to use spring in order to do bootstrapping and
> resource management for me.
> On Fri, Jul 20, 2018 at 4:51 PM DuyHai Doan  wrote:
> >
> > Spring data cassandra is so so ... It has less features (at last at the
> time I looked at it) than the default Java driver
> >
> > For driver, right now most of people are using Datastax's ones
> >
> > On Fri, Jul 20, 2018 at 3:36 PM, Vitaliy Semochkin 
> wrote:
> >>
> >> Hi,
> >>
> >> Which driver to use with cassandra 3
> >>
> >> the one that is provided by datastax, netflix or something else.
> >>
> >> Spring uses driver from datastax, though is it a reliable solution for
> >> a long term project, having in mind that datastax and cassandra
> >> parted?
> >>
> >> Regards,
> >> Vitaliy
> >>
> >> -
> >> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> >> For additional commands, e-mail: user-h...@cassandra.apache.org
> >>
> >
>
> -
> To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
> For additional commands, e-mail: user-h...@cassandra.apache.org
>
>


Re: Cassandra Driver Pagination

2018-04-24 Thread Andy Tolbert
Hi Ahmed,

The java driver docs do a good job explaining how the driver uses paging,
including providing a sequence diagram that describes the flow of the
process:
https://docs.datastax.com/en/developer/java-driver/3.5/manual/paging/

The driver requests X rows (5000 by default, controlled via
QueryOptions.setFetchSize) at a time.  When C* replies, it returns a
'paging state' id which
identifies where in the result set (partition and clustering key) to
continue retrieving the next set of rows.  When you continue iterating over
the result set in the java driver and hit the end of the current page, it
will send another request to C* using that paging state to get the next set
of rows.
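A minimal sketch of controlling the page size per statement with the 3.x
Java driver (the table name is invented and `session` is assumed to be an
existing Session):

```java
import com.datastax.driver.core.ResultSet;
import com.datastax.driver.core.Row;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.SimpleStatement;
import com.datastax.driver.core.Statement;

public class PagingExample {
    static void readAll(Session session) {
        // Ask for 100 rows per page instead of the default 5000.
        Statement stmt = new SimpleStatement("SELECT * FROM my_ks.events")
                .setFetchSize(100);
        ResultSet rs = session.execute(stmt);
        for (Row row : rs) {
            // When the current page is exhausted, the driver transparently
            // sends the paging state back to fetch the next 100 rows.
            System.out.println(row);
        }
    }
}
```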

Thanks,
Andy

On Tue, Apr 24, 2018 at 9:49 AM, Ahmed Eljami 
wrote:

> Hello,
>
> Can someone explain me how paging is implemented ?
>
> according to the doc of datastax, the goal  being to avoid loading much
> results in memory.
>
> Does it mean that the whole partition is not upload to heap memory?
>
>
> ​C* version: 2.1
>
> Java Driver version: 3.0
>
> ​Best regards​
>
>


Re: Cassandra Driver Pagination

2018-04-25 Thread Andy Tolbert
Hi Ahmed,

It does not; it only reads enough rows to satisfy the client's request.
Although, that may be a bit of an oversimplification as it has to scan
through sstable files, read indices, pass over tombstones and so on, but it
will stop reading new rows once it has read the number of rows the driver
has requested.   If you have a really wide partition or a lot of tombstones
in the partition, you may find your query performance being slow in general
when reading rows from that partition.

Thanks,
Andy

On Wed, Apr 25, 2018 at 3:08 AM, Ahmed Eljami <ahmed.elj...@gmail.com>
wrote:

> Hi Andy,
>
> Thanks.
>
> When the driver requests X rows, C* will load the whole partition (All
> rows) before reply to driver ?
>
> Thanls.
>
> 2018-04-24 18:11 GMT+02:00 Andy Tolbert <andrew.tolb...@datastax.com>:
>
>> Hi Ahmed,
>>
>> The java driver docs do a good job explaining how the driver uses paging,
>> including providing a sequence diagram that describes the flow of the
>> process:  https://docs.datastax.com/en/developer/java-
>> driver/3.5/manual/paging/
>>
>> The driver requests X rows (5000 by default, controlled via
>> QueryOptions.setFetchSize
>> <https://docs.datastax.com/en/drivers/java/3.5/com/datastax/driver/core/QueryOptions.html#setFetchSize-int->)
>> at a time.  When C* replies, it returns a 'paging state' id which
>> identifies where in the result set (partition and clustering key) to
>> continue retrieving the next set of rows.  When you continue iterating over
>> the result set in the java driver and hit the end of the current page, it
>> will send another request to C* using that paging state to get the next set
>> of rows.
>>
>> Thanks,
>> Andy
>>
>> On Tue, Apr 24, 2018 at 9:49 AM, Ahmed Eljami <ahmed.elj...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> Can someone explain me how paging is implemented ?
>>>
>>> according to the doc of datastax, the goal  being to avoid loading much
>>> results in memory.
>>>
>>> Does it mean that the whole partition is not upload to heap memory?
>>>
>>>
>>> ​C* version: 2.1
>>>
>>> Java Driver version: 3.0
>>>
>>> ​Best regards​
>>>
>>>
>>
>
>
> --
> Cordialement;
>
> Ahmed ELJAMI
>


Re: Nodejs connector high latency

2018-11-04 Thread Andy Tolbert
Hi Tarun,

There are a ton of factors that can impact query performance.

The cassandra native protocol supports multiple simultaneous requests per
connection.  Most drivers by default only create one connection to each C*
host in the local data center.  That being said, that shouldn't be a
problem, particularly if you are only executing 20 concurrent requests,
this is something both driver clients and C* handles well.  The driver does
do some write batching to reduce the amount of system calls, but I'm
reasonably confident this is not an issue.

It may be worth enabling client logging to see if that shines any
light.  You can also enable tracing on your requests by specifying
traceQuery as a query option, to see if the delay is caused by C*-side
processing.

Also keep in mind that all user code in node.js is handled in a single
thread.  If you have callbacks tied to your responses that do non-trivial
work, that can delay subsequent requests from being processed, which may
give impression that some queries are slow.

Thanks,
Andy

On Sun, Nov 4, 2018 at 8:59 AM Tarun Chabarwal 
wrote:

> Hi
>
> I used cassandra driver provided by datastax (3.5.0) library in nodejs.
> I've 5 nodes cluster. I'm writing to a table with quorum.
>
> I observed that there is some spike in write. In ~20 writes, 2-5 writes
> are taking longer(~200ms). I debugged one of the node process with strace
> and found that longer latencies are batched and they use same fd to connect
> to cassandra. This may be the multiplexing.
>
> Why it takes that long ?
> Where should I look to resolve it?
>
> Regards
> Tarun Chabarwal
>


Re: Java 10 for Cassandra 3.11.3

2018-09-03 Thread Andy Tolbert
Hi Jeronimo,

Until Cassandra 4.0, JDK 8 is required.  See CASSANDRA-9608 for more
details.

Thanks,
Andy

On Mon, Sep 3, 2018 at 8:45 AM Jeronimo de A. Barros <
jeronimo.bar...@gmail.com> wrote:

> Hi guys,
>
> I'd like to know which java version to use with Cassandra 3.11.3. Is Java
> 10 already supported ? Is it safe ?
>
> Thanks.
>


Re: Deployment

2019-01-12 Thread Andy Tolbert
Hi Amit,

> a) If queries are submitted to co-ordinator nodes (i assume this includes
> writes as well as reads) then:
>   -- is this the approach also followed for the initial data load?
>

Writes get sent to all replica nodes, and then the coordinator responds to
the client as soon as enough replicas have responded to achieve the
configured consistency level.


>   -- some select queries may not have restrictions on all the partition
> key columns, and Cassandra would reject such a query, but we we utilize
> ALLOW FILTER then the query will execute, since there is no way to
> determine which node to send the query to, it will be sent to all the nodes
> that could potentially have results. In such a case it would seem that the
> co-ordinator would gather the results from all the nodes and return it to
> the application.
>

Correct, if the data is on multiple ranges, the coordinator will make
queries to as many replicas as needed to cover those ranges and will then
gather those results.  Using tracing is a good way to get insights into
what replicas are involved in your queries.

> Application does not know which nodes may have the data, so it can not
> directly send the data to the right nodes. Even if application had the
> data, it may not be able to perform load balancing.
>

Most client drivers have a nice optimization called token-aware load
balancing (i.e. the DataStax Java Driver's TokenAwarePolicy), where if
the driver is able to infer which partition is being accessed, it will
prioritize coordinators that have that data.  This determination will
typically work if all parts of your partition key are bind parameters in
your statement.
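A sketch of enabling this explicitly with the 3.x Java driver (the
contact point is illustrative; recent 3.x versions already wrap the
default policy in TokenAwarePolicy out of the box):

```java
import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class TokenAwareExample {
    public static void main(String[] args) {
        // Wrap a child policy so statements with a known routing key are
        // sent to a replica that owns the partition.
        Cluster cluster = Cluster.builder()
                .addContactPoint("127.0.0.1")
                .withLoadBalancingPolicy(
                        new TokenAwarePolicy(
                                DCAwareRoundRobinPolicy.builder().build()))
                .build();
    }
}
```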

> Does the coordinator perform load balancing? I imagine it would have to ...
>

The coordinator utilizes a dynamic snitch to determine where to route
read queries.

Thanks,
Andy

On Sat, Jan 12, 2019 at 9:14 AM amit sehas  wrote:

> Thanks for your response, this leads to some further questions:
>
> a) If queries are submitted to co-ordinator nodes (i assume this includes
> writes as well as reads) then:
>   -- is this the approach also followed for the initial data load?
>   -- some select queries may not have restrictions on all the partition
> key columns, and Cassandra would reject such a query, but we we utilize
> ALLOW FILTER then the query will execute, since there is no way to
> determine which node to send the query to, it will be sent to all the nodes
> that could potentially have results. In such a case it would seem that the
> co-ordinator would gather the results from all the nodes and return it to
> the application.
>
> b) This seems as if this is a 3 tier architecture. Application sends query
> to coordinator. coordinator sends it to the right nodes.
> Application does not know which nodes may have the data, so it can not
> directly send the data to the right nodes. Even if application had the
> data, it may not be able to perform load balancing. Does the coordinator
> perform load balancing? I imagine it would have to ...
>
> thanks
>
> On Saturday, January 12, 2019, 3:32:53 AM PST, Rajesh Kishore <
> rajesh10si...@gmail.com> wrote:
>
>
> Application would send request to one of the node(called as coordinating
> node) & this coordinating node is aware of where your result
> lies(considering you have modelled your DB correctly, it should not result
> in scatter& gather kind of stuff) and thus delegate the query to respective
> node, so it does follow client server architecture & your assumption is
> correct.
> As per my knowledge , generally application should be unaware where your
> result lies & must not be tied to a specific node because it would have
> bigger implications when stuffs like re-balancing would occur. So, your
> application should be unaware where your data lies (in which node I meant),
> but obviously keeping application in same region as that of cassandra
> cluster would make sense, can't comment much on cloud deployment.
>
> Thanks,
> Rajesh
>
> On Sat, Jan 12, 2019 at 8:54 AM amit sehas 
> wrote:
>
> I am new to Cassandra, i am wondering how the Cassandra applications are
> deployed in the cloud. Does Cassandra have a client server architecture and
> the application is deployed as a 3rd tier that sends over queries to the
> clients, which then submit them to the Cassandra servers?  Or does the
> application submit the request directly to any of the Cassandra server
> which then decides where the query will be routed to, and then gathers the
> response and returns that to the application.
>
> Does the application accessing the data get deployed on the same nodes in
> the cloud as the Cassandra 

Re: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key ?

2019-06-20 Thread Andy Tolbert
One thing that strikes me is that the endpoint reported is '127.0.0.1'.  Is
it possible that you have rpc_address set to 127.0.0.1 on each of your
three nodes in cassandra.yaml?  The driver uses the system.peers table to
identify nodes in the cluster and associates them by rpc_address.  Can you
verify this by executing 'select peer, rpc_address from system.peers' to
see what is being reported as the rpc_address and let me know?

In any case, the driver should probably handle this better, I'll create a
driver ticket.

Thanks,
Andy

On Thu, Jun 20, 2019 at 10:03 AM Jeff Jirsa  wrote:

> There’s a reasonable chance this is a bug in the Datastax driver - may
> want to start there when debugging .
>
> It’s also just a warn, and the two entries with the same token are the
> same endpoint which doesn’t seem concerning to me, but I don’t know the
> Datastax driver that well
>
> On Jun 20, 2019, at 7:40 AM, Котельников Александр 
> wrote:
>
> It appears that no such warning is issued if I connected to Cassandra from
> a remote server, not locally.
>
>
>
> *From: *Котельников Александр 
> *Reply-To: *"user@cassandra.apache.org" 
> *Date: *Thursday, 20 June 2019 at 10:46
> *To: *"user@cassandra.apache.org" 
> *Subject: *Unexpected error while refreshing token map, keeping previous
> version (IllegalArgumentException: Multiple entries with same key ?
>
>
>
> Hey!
>
>
>
> I’ve  just configured a test 3-node Cassandra cluster and run very trivial
> java test against it.
>
>
>
> I see the following warning from java-driver on each CqlSession
> initialization:
>
>
>
> 13:54:13.913 [loader-admin-0] WARN  c.d.o.d.i.c.metadata.DefaultMetadata -
> [loader] Unexpected error while refreshing token map, keeping previous
> version (IllegalArgumentException: Multiple entries with same key:
> Murmur3Token(-1060405237057176857)=/127.0.0.1:9042 and
> Murmur3Token(-1060405237057176857)=/127.0.0.1:9042)
>
>
>
> What does It mean? Why?
>
>
>
> Cassandra 3.11.4, driver 4.0.1.
>
>
>
> nodetool status
>
> Datacenter: datacenter1
> =======================
> Status=Up/Down
> |/ State=Normal/Leaving/Joining/Moving
> --  Address       Load        Tokens  Owns (effective)  Host ID                               Rack
> UN  10.73.66.36   419.36 MiB  256     100.0%            fafa2737-9024-437b-9a59-c1c037bce244  rack1
> UN  10.73.66.100  336.47 MiB  256     100.0%            d5323ad0-f8cd-42d4-b34d-9afcd002ea47  rack1
> UN  10.73.67.196  336.4 MiB   256     100.0%            74dffe0c-32a4-4071-8b36-5ada5afa4a7d  rack1
>
>
>
> The issue persists if I reset the cluster; just the token changes its
> value.
>
> Alexander
>
>


Re: Unexpected error while refreshing token map, keeping previous version (IllegalArgumentException: Multiple entries with same key ?

2019-06-20 Thread Andy Tolbert
I just configured a 3-node cluster in this way and was able to reproduce
the warning message:

cqlsh> select peer, rpc_address from system.peers;

 peer  | rpc_address
---+-
 127.0.0.3 |   127.0.0.1
 127.0.0.2 |   127.0.0.1

(2 rows)

cqlsh> select rpc_address from system.local;

 rpc_address
-
   127.0.0.1

10:22:40.399 [s0-admin-0] WARN  c.d.o.d.i.c.metadata.DefaultMetadata - [s0]
Unexpected error while refreshing token map, keeping previous version
java.lang.IllegalArgumentException: Multiple entries with same key:
Murmur3Token(-100881582699237014)=/127.0.0.1:9042 and
Murmur3Token(-100881582699237014)=/127.0.0.1:9042
    at com.datastax.oss.driver.shaded.guava.common.collect.ImmutableMap.conflictException(ImmutableMap.java:215)
    at com.datastax.oss.driver.shaded.guava.common.collect.ImmutableMap.checkNoConflict(ImmutableMap.java:209)
    at com.datastax.oss.driver.shaded.guava.common.collect.RegularImmutableMap.checkNoConflictInKeyBucket(RegularImmutableMap.java:147)
    at com.datastax.oss.driver.shaded.guava.common.collect.RegularImmutableMap.fromEntryArray(RegularImmutableMap.java:110)
    at com.datastax.oss.driver.shaded.guava.common.collect.ImmutableMap$Builder.build(ImmutableMap.java:393)
    at com.datastax.oss.driver.internal.core.metadata.token.DefaultTokenMap.buildTokenToPrimaryAndRing(DefaultTokenMap.java:261)
    at com.datastax.oss.driver.internal.core.metadata.token.DefaultTokenMap.build(DefaultTokenMap.java:57)
    at com.datastax.oss.driver.internal.core.metadata.DefaultMetadata.rebuildTokenMap(DefaultMetadata.java:146)
    at com.datastax.oss.driver.internal.core.metadata.DefaultMetadata.withNodes(DefaultMetadata.java:104)
    at com.datastax.oss.driver.internal.core.metadata.InitialNodeListRefresh.compute(InitialNodeListRefresh.java:96)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager.apply(MetadataManager.java:475)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager$SingleThreaded.refreshNodes(MetadataManager.java:299)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager$SingleThreaded.access$1700(MetadataManager.java:265)
    at com.datastax.oss.driver.internal.core.metadata.MetadataManager.lambda$refreshNodes$0(MetadataManager.java:155)
    at java.util.concurrent.CompletableFuture.uniApply(CompletableFuture.java:602)
    at java.util.concurrent.CompletableFuture$UniApply.tryFire(CompletableFuture.java:577)
    at java.util.concurrent.CompletableFuture$Completion.run(CompletableFuture.java:442)
    at io.netty.channel.DefaultEventLoop.run(DefaultEventLoop.java:54)
    at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:905)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
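The trace bottoms out in the shaded Guava ImmutableMap builder, which rejects duplicate keys even when the mapped values are identical. The JDK's immutable maps behave the same way, so the failure mode can be sketched without the driver at all (the token and endpoint strings below are copied from the warning, purely for illustration):

```java
import java.util.Map;

public class DuplicateKeyDemo {
    // Reproduce the failure mode: java.util.Map.of, like Guava's
    // ImmutableMap.Builder in the trace above, throws IllegalArgumentException
    // on duplicate keys even when the mapped values are identical.
    static boolean duplicateKeyRejected() {
        try {
            Map.of("Murmur3Token(-100881582699237014)", "/127.0.0.1:9042",
                   "Murmur3Token(-100881582699237014)", "/127.0.0.1:9042");
            return false;
        } catch (IllegalArgumentException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(duplicateKeyRejected());
    }
}
```

So any two peers rows that map the same token to the same collapsed rpc_address are enough to make the token map build fail.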

Interestingly enough, version 3 of the driver only recognizes one node in
this scenario, whereas version 4 detects the three nodes separately.  It's
probably not a scenario that was given a lot of thought, since it is a
misconfiguration.  I'll think about how it should be handled and log tickets
in any case, as it would be nice to surface to the user more clearly that
something isn't right.

Can you please confirm, when you have a chance, that this is indeed a
configuration issue with rpc_address?  Just to make sure I'm not ignoring a
possible bug ;)
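Assuming it is a misconfiguration, the fix would be giving each node its own rpc_address in cassandra.yaml. A sketch, using the addresses from the nodetool output above (verify against your actual deployment):

```yaml
# cassandra.yaml — each node advertises its own address (illustrative values)
# on node 10.73.66.36:
rpc_address: 10.73.66.36
# on node 10.73.66.100:
rpc_address: 10.73.66.100
# on node 10.73.67.196:
rpc_address: 10.73.67.196
```

After changing rpc_address, each node needs a restart, and 'select peer, rpc_address from system.peers' should then report distinct addresses.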

Thanks,
Andy


On Thu, Jun 20, 2019 at 10:20 AM Andy Tolbert 
wrote:

> One thing that strikes me is that the endpoint reported is '127.0.0.1'.
> Is it possible that you have rpc_address set to 127.0.0.1 on each of your
> three nodes in cassandra.yaml?  The driver uses the system.peers table to
> identify nodes in the cluster and associates them by rpc_address.  Can you
> verify this by executing 'select peer, rpc_address from system.peers' to
> see what is being reported as the rpc_address and let me know?
>
> In any case, the driver should probably handle this better, I'll create a
> driver ticket.
>
> Thanks,
> Andy
>
> On Thu, Jun 20, 2019 at 10:03 AM Jeff Jirsa  wrote:
>
>> There’s a reasonable chance this is a bug in the Datastax driver - may
>> want to start there when debugging .
>>
>> It’s also just a warn, and the two entries with the same token are the
>> same endpoint which doesn’t seem concerning to me, but I don’t know the
>> Datastax driver that well
>>
>> On Jun 20, 2019, at 7:40 AM, Котельников Александр 
>> wrote:
>>
>> It appears that no such warning is issued if I connected to Cassandra
>> from a remote server, not locally.
>>
>>
>>
>> *From: *Котельников Александр 
>> *Reply-To: *"user@cassandra.apache.org" 
>> *Date: *Thursday, 20 June 2019 at 10:46
>> *To: *"user@cassandra.apache.org" 
>> *Subject: *Unexpected error while refreshing token map, keeping previous
>> version (IllegalArgumentException: Multiple entries with same ke