Ajay,

So it's the default driver behavior to pin requests to the first data
center it connects to (the DCAwareRoundRobinPolicy), but let me explain
why this is.

I think you're treating data centers in Cassandra as a unit of failure,
and while you can have, say, a rack fail, as you scale up and use rack
awareness it's rare to lose a whole "data center" in the sense you're
thinking of, so let's reset a bit:

   1. If I'm designing a multi-DC architecture, then because of latency I
   usually do not want my app servers connecting _across_ data centers.
   2. Since the common desire is not to have very-high-latency requests
   silently bleed out to remote data centers, the default behavior of the
   driver is to pin to the first data center it connects to. You can change
   this with a different LoadBalancingPolicy (
   http://docs.datastax.com/en/drivers/java/2.0/com/datastax/driver/core/policies/LoadBalancingPolicy.html
   ); see the sketch just after this list.
   3. However, I generally do NOT advise connecting an app server to nodes
   in another data center. Since Cassandra is a masterless architecture, you
   typically have issues that affect individual nodes rather than an entire
   data center, and if something does affect an entire data center (say the
   intra-DC link is down), it's going to affect your app server as well!
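
As a concrete illustration, here is a minimal sketch of pinning to a named
local data center (assuming Java driver 2.0; the addresses, class name, and
the "DC1" data center name are placeholders, so swap in your own):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.Session;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;
import com.datastax.driver.core.policies.TokenAwarePolicy;

public class PinnedToLocalDc {
    public static void main(String[] args) {
        // Name the local DC explicitly instead of relying on whichever
        // contact point the driver happens to reach first.
        Cluster cluster = Cluster.builder()
                .addContactPoints("10.0.0.1", "10.0.0.2")  // hypothetical addresses
                .withLoadBalancingPolicy(new TokenAwarePolicy(
                        new DCAwareRoundRobinPolicy("DC1")))  // assumed DC name
                .build();
        Session session = cluster.connect();
        // ... run queries; they are routed only to nodes in DC1 ...
        cluster.close();
    }
}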

So for new users, I typically just recommend pinning an app server to a DC
and doing your data-center-level switching further up the stack. You can
get more advanced and handle bleed-out later, but you have to think about
the latencies.
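
If you later do want some bleed-out, the same policy can include a limited
number of remote hosts as a fallback at the end of each query plan (again
just a sketch under the same assumptions; "DC1" and the host count of 2 are
placeholder values):

import com.datastax.driver.core.Cluster;
import com.datastax.driver.core.policies.DCAwareRoundRobinPolicy;

public class BleedOutSketch {
    public static void main(String[] args) {
        // Up to 2 hosts from each remote DC are appended to every query plan,
        // so requests can fall back if no local-DC ("DC1") hosts are reachable.
        // By default, remote hosts are still not used for LOCAL_* consistency
        // levels.
        Cluster cluster = Cluster.builder()
                .addContactPoints("10.0.0.1")  // hypothetical address
                .withLoadBalancingPolicy(new DCAwareRoundRobinPolicy("DC1", 2))
                .build();
        cluster.connect();
        cluster.close();
    }
}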

Final point: rely on repairs for your data consistency. Hints are great and
all, but repair is how you make sure you're in sync.

On Sun, Oct 25, 2015 at 3:10 AM, Ajay Garg <ajaygargn...@gmail.com> wrote:

> Some more observations ::
>
> a)
> CAS11 and CAS12 are down, CAS21 and CAS22 up.
> If I connect via the driver to the cluster using only CAS21 and CAS22 as
> contact-points, even then the exception occurs.
>
> b)
> CAS11 down, CAS12 up, CAS21 and CAS22 up.
> If I connect via the driver to the cluster using only CAS21 and CAS22 as
> contact-points, then connection goes fine.
>
> c)
> CAS11 up, CAS12 down, CAS21 and CAS22 up.
> If I connect via the driver to the cluster using only CAS21 and CAS22 as
> contact-points, then connection goes fine.
>
>
> Seems the java-driver is kinda always requiring either one of CAS11 or
> CAS12 to be up (although the expectation is that the driver must work fine
> if ANY of the 4 nodes is up).
>
>
> Thoughts, experts !? :)
>
>
>
> On Sat, Oct 24, 2015 at 9:40 PM, Ajay Garg <ajaygargn...@gmail.com> wrote:
>
>> Ideas please, on what I may be doing wrong?
>>
>> On Sat, Oct 24, 2015 at 5:48 PM, Ajay Garg <ajaygargn...@gmail.com>
>> wrote:
>>
>>> Hi All.
>>>
>>> I have been doing extensive testing, and replication works fine, even if
>>> any permutation of CAS11, CAS12, CAS21, CAS22 is downed and brought up.
>>> Syncing always takes place (obviously, as long as continuous-downtime-value
>>> does not exceed *max_hint_window_in_ms*).
>>>
>>>
>>> However, things behave weird when I try connecting via DataStax
>>> Java-Driver.
>>> I always add the nodes to the cluster in the order ::
>>>
>>>                          CAS11, CAS12, CAS21, CAS22
>>>
>>> during "cluster.connect" method.
>>>
>>>
>>> Now, following happens ::
>>>
>>> a)
>>> If CAS11 goes down, data is persisted fine (presumably first in CAS12,
>>> and later replicated to CAS21 and CAS22).
>>>
>>> b)
>>> If CAS11 and CAS12 go down, data is NOT persisted.
>>> Instead the following exceptions are observed in the Java-Driver ::
>>>
>>>
>>> ##################################################################################
>>> Exception in thread "main"
>>> com.datastax.driver.core.exceptions.NoHostAvailableException: All host(s)
>>> tried for query failed (no host was tried)
>>>     at
>>> com.datastax.driver.core.exceptions.NoHostAvailableException.copy(NoHostAvailableException.java:65)
>>>     at
>>> com.datastax.driver.core.DefaultResultSetFuture.extractCauseFromExecutionException(DefaultResultSetFuture.java:258)
>>>     at com.datastax.driver.core.Cluster.connect(Cluster.java:267)
>>>     at com.example.cassandra.SimpleClient.connect(SimpleClient.java:43)
>>>     at
>>> com.example.cassandra.SimpleClientTest.setUp(SimpleClientTest.java:50)
>>>     at
>>> com.example.cassandra.SimpleClientTest.main(SimpleClientTest.java:86)
>>> Caused by: com.datastax.driver.core.exceptions.NoHostAvailableException:
>>> All host(s) tried for query failed (no host was tried)
>>>     at
>>> com.datastax.driver.core.RequestHandler.sendRequest(RequestHandler.java:103)
>>>     at
>>> com.datastax.driver.core.SessionManager.execute(SessionManager.java:446)
>>>     at
>>> com.datastax.driver.core.SessionManager.executeQuery(SessionManager.java:482)
>>>     at
>>> com.datastax.driver.core.SessionManager.executeAsync(SessionManager.java:88)
>>>     at
>>> com.datastax.driver.core.AbstractSession.executeAsync(AbstractSession.java:60)
>>>     at com.datastax.driver.core.Cluster.connect(Cluster.java:260)
>>>     ... 3 more
>>>
>>> ###################################################################################
>>>
>>>
>>> I have already tried ::
>>>
>>> 1)
>>> Increasing driver-read-timeout from 12 seconds to 30 seconds.
>>>
>>> 2)
>>> Increasing driver-connect-timeout from 5 seconds to 30 seconds.
>>>
>>> 3)
>>> I have also confirmed that each of the 4 nodes is telnet-able over
>>> ports 9042 and 9160.
>>>
>>>
>>> Definitely seems to be some driver-issue, since
>>> data-persistence/replication works perfectly (with any permutation) if
>>> data-persistence is done via "cqlsh".
>>>
>>>
>>> Kindly provide some pointers.
>>> Ultimately, it is the Java-driver that will be used in production, so it
>>> is imperative that data-persistence/replication happens for any downing of
>>> any permutation of node(s).
>>>
>>>
>>> Thanks and Regards,
>>> Ajay
>>>
>>
>>
>>
>> --
>> Regards,
>> Ajay
>>
>
>
>
> --
> Regards,
> Ajay
>



-- 

Thanks,
Ryan Svihla
