Thanks Vladimir!

Is there any known issue in 3.0.10 where creating a CF with a large number of
columns, or creating a large number of CFs quickly one after another, causes
schema agreement failures?

What else can I try to support ~12000 CFs without hitting schema agreement
related issues? I can add more RAM and increase the heap size (even if I need
to spend time on GC tuning for such a large heap), but the issue I hit with
2400-column CFs starts after just a few keyspaces (fewer than 200 CFs). What
can I try to fix that?
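For context on what "schema agreement" means here: the driver compares the schema_version UUID reported by the coordinator (system.local) against the one reported by every peer (system.peers) and waits until they all match. A minimal illustrative sketch of that comparison (the function name and inputs are mine, not the driver's actual API):

```python
# Illustrative sketch of the schema-agreement check: agreement means every
# reachable node reports the same schema_version UUID. Unreachable peers
# (version None) are skipped, mirroring how drivers ignore down nodes.
def schema_in_agreement(local_version, peer_versions):
    """True when all reachable nodes report the local schema version."""
    return all(v == local_version for v in peer_versions if v is not None)

# Example: a cluster mid-migration disagrees until gossip settles.
local = "5a1c395e-b41f-11e5-9f22-ba0be0483c18"
print(schema_in_agreement(local, [local, local, local]))   # True
print(schema_in_agreement(local, [local, "other-uuid"]))   # False
```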

On Tue, Dec 20, 2016 at 12:53 AM, Vladimir Yudovin <vla...@winguzone.com>
wrote:

> >I want to dig deeper into what happens in C* at the time of CF creation
> It starts somewhere in the *MigrationManager.announceNewColumnFamily*
> function, I guess.
>
>
> >limitation of number of keyspaces which can be created.
> Actually it's a CF limitation, not a keyspace one.
>
>
> >if you can also point me to this 1 MB-per-CF overhead, it would be great.
> Look at
> http://www.mail-archive.com/user@cassandra.apache.org/msg46359.html,
> CASSANDRA-5935, and CASSANDRA-2252.
> In the source, look at the *SlabAllocator.REGION_SIZE* definition.
>
>
> Best regards, Vladimir Yudovin,
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>
>
> ---- On Mon, 19 Dec 2016 14:10:37 -0500 *Saumitra S
> <saumitra.srivast...@gmail.com <saumitra.srivast...@gmail.com>>* wrote
> ----
>
> Hi Vladimir,
>
> Thanks for the response.
>
> When I see *"com.datastax.driver.core.ControlConnection"* exceptions, I
> see that the keyspaces and CFs are still created. But when I create CFs
> with a large number of columns (2400 cols) quickly one after the other
> (with a 2-second gap between CREATE TABLE queries), I get schema agreement
> timeout errors (*com.datastax.driver.core.Cluster | Error while waiting
> for schema agreement*). This happens even with a clean slate (empty data
> directory), just after creating 4 keyspaces. The timeout is set to 30
> seconds. Please note that the CREATE TABLE queries are NOT fired in
> parallel; I wait for one query to complete (with schema agreement) before
> firing the next.
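The create-then-wait pattern described above can be sketched generically; `execute_ddl` and `check_agreement` below are hypothetical stand-ins for the actual driver calls (e.g. executing a statement, then something like the Java driver's `Metadata.checkSchemaAgreement()`):

```python
import time

def create_tables_serially(ddl_statements, execute_ddl, check_agreement,
                           timeout_s=30.0, poll_s=0.5):
    """Issue DDL one statement at a time, polling for schema agreement
    after each (up to timeout_s) before moving on to the next one."""
    for ddl in ddl_statements:
        execute_ddl(ddl)
        deadline = time.monotonic() + timeout_s
        while not check_agreement():
            if time.monotonic() > deadline:
                raise TimeoutError("no schema agreement after: " + ddl)
            time.sleep(poll_s)
```

With a 30-second timeout this fails fast on the first statement the cluster cannot converge on, which matches the errors reported above.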
>
> I want to dig deeper into what happens in C* at the time of CF creation,
> to understand more about the limit on the number of keyspaces that can be
> created. Can you please point me to the corresponding source code?
> Specifically, if you can also point me to this 1 MB-per-CF overhead, it
> would be great.
>
>
> Best Regards,
> Saumitra
>
> On Mon, Dec 19, 2016 at 11:41 PM, Vladimir Yudovin <vla...@winguzone.com>
> wrote:
>
>
> Hi,
>
> *Question*: Does C* read some schema/metadata when cqlsh connects, which
> causes the timeout with a large number of keyspaces?
>
> A lot ). cqlsh reads schemas, cluster topology, each node's tokens, etc.
> You can just capture TCP port 9042 (unless you use SSL) and view all the
> negotiation between cqlsh and the node.
>
>
> *Question*: Can a single C* cluster of 5 nodes (32 GB/8 CPU each) support
> up to 500 keyspaces, each having 25 CFs? What kind of issues can I expect?
>
> You have 500*25 = 12500 tables, which is a huge number. Each CF takes at
> least 1 MB of heap memory, so it needs over 12 GB of heap just to start.
> Test this on a one- or two-node cluster first.
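The arithmetic above, spelled out as a quick sanity check (the 1 MB-per-CF figure is the lower bound stated above, tied to the SlabAllocator region size):

```python
# Back-of-the-envelope heap estimate for 500 keyspaces x 25 CFs, assuming
# ~1 MB of heap per table (the SlabAllocator region size, a lower bound).
keyspaces = 500
tables_per_keyspace = 25
heap_per_table_mb = 1  # assumption: minimum slab overhead per CF

total_tables = keyspaces * tables_per_keyspace
heap_floor_gb = total_tables * heap_per_table_mb / 1024
print(total_tables)             # 12500
print(round(heap_floor_gb, 2))  # 12.21
```

Real usage will be higher, since this counts only the fixed per-table overhead and nothing for actual data in memtables.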
>
>
> *Question*: What is the effect of the below exception?
>
> Are the keyspaces created despite the exception, or not?
>
> Best regards, Vladimir Yudovin,
> *Winguzone <https://winguzone.com?from=list> - Cloud Cassandra Hosting*
>
>
> ---- On Mon, 19 Dec 2016 10:24:20 -0500 *Saumitra S
> <saumitra.srivast...@gmail.com <saumitra.srivast...@gmail.com>>* wrote
> ----
>
> Hi All,
>
> I have a 2-node cluster (32 GB RAM/8 CPU) running 3.0.10, and I created 50
> keyspaces in it. Each keyspace has 25 CFs. The column count in each CF
> ranges between 5 and 30.
>
> I am getting a few issues once the keyspace count reaches ~50.
>
> *Issue 1:*
>
> When I try to use cqlsh, I get timeout.
>
> *$ cqlsh `hostname -i`*
> *Connection error: ('Unable to connect to any servers', {'10.0.20.220':
> OperationTimedOut('errors=None, last_host=None',)})*
>
> If I increase the connect timeout, I am able to access the cluster through
> cqlsh:
>
> *$ cqlsh --connect-timeout 20 `hostname -i`   // this works fine*
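To avoid passing the flag on every invocation, the larger timeouts can also be set in cqlsh's config file. A sketch of that file follows; the option names are taken from cqlsh 5.x and should be treated as an assumption to verify against your version:

```ini
; ~/.cassandra/cqlshrc (option names assumed from cqlsh 5.x; verify locally)
[connection]
; connect timeout in seconds (the equivalent of --connect-timeout)
timeout = 20
; per-request timeout in seconds, raised here for slow schema fetches
request_timeout = 30
```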
>
> *Question: *Does C* read some schema/metadata when cqlsh connects, which
> causes the timeout with a large number of keyspaces?
>
>
> *Issue 2:*
>
> If I create keyspaces which have 3 large CFs (each having around 2500
> cols), then I start to see schema agreement timeouts in my logs. I have
> set the schema agreement timeout to 30 seconds in the driver.
>
> *2016-12-13 08:37:02.733 | gbd-std-01 | WARN | cluster2-worker-194 |
> com.datastax.driver.core.Cluster | Error while waiting for schema agreement*
>
> *Question:* Can a single C* cluster of 5 nodes (32 GB/8 CPU each) support
> up to 500 keyspaces, each having 25 CFs? What kind of issues can I expect?
>
>
> *Issue 3:*
>
> I am creating keyspaces and CFs through the DataStax driver. I see the
> following exception in my log after reaching *~50 keyspaces*.
>
> *Question: *What is the effect of the below exception?
>
> 2016-12-19 13:55:35.615 | gbd-std-01 | ERROR | cluster1-worker-147 | 
> *com.datastax.driver.core.ControlConnection
> | [Control connection] Unexpected error while refreshing schema*
> *java.util.concurrent.ExecutionException:
> com.datastax.driver.core.exceptions.OperationTimedOutException:
> [gbd-cass-20.ec2-east1.hidden.com/10.0.20.220
> <http://gbd-cass-20.ec2-east1.hidden.com/10.0.20.220>] Operation timed out*
>         at com.google.common.util.concurrent.AbstractFuture$
> Sync.getValue(AbstractFuture.java:299) ~[com.google.guava.guava-18.0.
> jar:na]
>         at com.google.common.util.concurrent.AbstractFuture$
> Sync.get(AbstractFuture.java:286) ~[com.google.guava.guava-18.0.jar:na]
>         at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
> ~[com.google.guava.guava-18.0.jar:na]
>         at com.datastax.driver.core.SchemaParser.get(SchemaParser.java:467)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.SchemaParser.access$400(SchemaParser.java:30)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.SchemaParser$V3SchemaParser.
> fetchSystemRows(SchemaParser.java:632) ~[com.datastax.cassandra.
> cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.SchemaParser.refresh(SchemaParser.java:56)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:341)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:306)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.Cluster$Manager$
> SchemaRefreshRequestDeliveryCallback$1.runMayThrow(Cluster.java:2570)
> [com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.ExceptionCatchingRunnable.run(
> ExceptionCatchingRunnable.java:32) [com.datastax.cassandra.
> cassandra-driver-core-3.0.0.jar:na]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_45]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [na:1.8.0_45]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_45]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_45]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
> Caused by: com.datastax.driver.core.exceptions.OperationTimedOutException:
> [gbd-cass-20.ec2-east1.hidden.com/10.0.20.220] Operation timed out
>         at com.datastax.driver.core.DefaultResultSetFuture.onTimeout(
> DefaultResultSetFuture.java:209) ~[com.datastax.cassandra.
> cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.Connection$ResponseHandler$1.run(Connection.java:1260)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> io.netty.util.HashedWheelTimer$HashedWheelTimeout.expire(HashedWheelTimer.java:581)
> ~[io.netty.netty-common-4.0.33.Final.jar:4.0.33.Final]
>         at io.netty.util.HashedWheelTimer$HashedWheelBucket.
> expireTimeouts(HashedWheelTimer.java:655) ~[io.netty.netty-common-4.0.
> 33.Final.jar:4.0.33.Final]
>         at 
> io.netty.util.HashedWheelTimer$Worker.run(HashedWheelTimer.java:367)
> ~[io.netty.netty-common-4.0.33.Final.jar:4.0.33.Final]
>         ... 1 common frames omitted
> 2016-12-19 13:55:39.885 | gbd-std-01 | ERROR | cluster2-worker-124 | 
> *com.datastax.driver.core.ControlConnection
> | [Control connection] Unexpected error while refreshing schema*
> *java.util.concurrent.ExecutionException:
> com.datastax.driver.core.exceptions.ReadTimeoutException: Cassandra timeout
> during read query at consistency ONE (1 responses were required but only 0
> replica responded)*
>         at com.google.common.util.concurrent.AbstractFuture$
> Sync.getValue(AbstractFuture.java:299) ~[com.google.guava.guava-18.0.
> jar:na]
>         at com.google.common.util.concurrent.AbstractFuture$
> Sync.get(AbstractFuture.java:286) ~[com.google.guava.guava-18.0.jar:na]
>         at 
> com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:116)
> ~[com.google.guava.guava-18.0.jar:na]
>         at com.datastax.driver.core.SchemaParser.get(SchemaParser.java:467)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.SchemaParser.access$400(SchemaParser.java:30)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.SchemaParser$V3SchemaParser.
> fetchSystemRows(SchemaParser.java:632) ~[com.datastax.cassandra.
> cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.SchemaParser.refresh(SchemaParser.java:56)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:341)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at 
> com.datastax.driver.core.ControlConnection.refreshSchema(ControlConnection.java:306)
> ~[com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.Cluster$Manager$
> SchemaRefreshRequestDeliveryCallback$1.runMayThrow(Cluster.java:2570)
> [com.datastax.cassandra.cassandra-driver-core-3.0.0.jar:na]
>         at com.datastax.driver.core.ExceptionCatchingRunnable.run(
> ExceptionCatchingRunnable.java:32) [com.datastax.cassandra.
> cassandra-driver-core-3.0.0.jar:na]
>         at 
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> [na:1.8.0_45]
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
> [na:1.8.0_45]
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
> [na:1.8.0_45]
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
> [na:1.8.0_45]
>         at java.lang.Thread.run(Thread.java:745) [na:1.8.0_45]
>
>
>
> Best Regards,
> Saumitra