No wonder the client is timing out. Even though C* supports up to 2B columns per 
partition, it is recommended not to have more than about 100K CQL rows in a partition.

It has been a long time since I used Astyanax, so I don't remember whether the 
AllRowsReader reads all CQL rows or storage rows. If it is reading all CQL 
rows, then it is essentially trying to read 800K * 200K rows. That will be 160B 
rows!
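To sanity-check the arithmetic above (plain Java, nothing Astyanax-specific; the 800K and 200K figures come from Ravi's earlier message):

```java
public class RowCountEstimate {
    public static void main(String[] args) {
        long partitions = 800_000L;           // number of Cassandra partitions
        long cqlRowsPerPartition = 200_000L;  // average CQL rows per partition
        long totalCqlRows = partitions * cqlRowsPerPartition;
        System.out.println(totalCqlRows);     // 160000000000, i.e. 160B CQL rows
    }
}
```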

Did you try “SELECT DISTINCT …” from cqlsh?

Mohammed

From: Ravi Agrawal [mailto:ragra...@clearpoolgroup.com]
Sent: Thursday, January 22, 2015 11:12 PM
To: user@cassandra.apache.org
Subject: RE: Retrieving all row keys of a CF

Each partition contains 200K CQL rows on average; the max is 3M.
800K is the number of Cassandra partitions.


From: Mohammed Guller [mailto:moham...@glassbeam.com]
Sent: Thursday, January 22, 2015 7:43 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: Retrieving all row keys of a CF

What is the average and max # of CQL rows in each partition? Is 800,000 the 
number of CQL rows or Cassandra partitions (storage engine rows)?

Another option you could try is a CQL statement to fetch all partition keys. 
You could first try this in the cqlsh:

“SELECT DISTINCT pk1, pk2…pkn FROM CF”

You will need to specify all the partition key columns if you are using a composite 
partition key.
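As a sketch of the idea (the table and column names below are made up for illustration, not from this thread), the DISTINCT statement can be assembled from the list of partition key columns:

```java
import java.util.Arrays;
import java.util.List;

public class DistinctKeysQuery {
    // Builds "SELECT DISTINCT pk1, pk2, ... FROM cf" for a (possibly composite)
    // partition key. DISTINCT in CQL only accepts partition key columns.
    static String distinctKeysCql(String cf, List<String> partitionKeyColumns) {
        return "SELECT DISTINCT " + String.join(", ", partitionKeyColumns)
                + " FROM " + cf;
    }

    public static void main(String[] args) {
        // Hypothetical table with composite partition key (event_id, bucket)
        System.out.println(distinctKeysCql("events",
                Arrays.asList("event_id", "bucket")));
        // -> SELECT DISTINCT event_id, bucket FROM events
    }
}
```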

Mohammed

From: Ravi Agrawal [mailto:ragra...@clearpoolgroup.com]
Sent: Thursday, January 22, 2015 1:57 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: Retrieving all row keys of a CF

Hi,
I increased the range and read timeouts first to 50 secs, then to 500 secs, and the 
Astyanax client timeouts to 60 and 550 secs respectively. I still get a timeout exception.
I see the logic with the .withCheckpointManager() code; is that the only way it 
could work?


From: Eric Stevens [mailto:migh...@gmail.com]
Sent: Saturday, January 17, 2015 9:55 AM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: Retrieving all row keys of a CF

If you're getting partial data back, then failing eventually, try setting 
.withCheckpointManager() - this will let you keep track of the token ranges 
you've successfully processed, and not attempt to reprocess them.  This will 
also let you set up tasks on bigger data sets that take hours or days to run, 
and reasonably safely interrupt it at any time without losing progress.

This is some *very* old code, but I dug this out of a git history.  We don't 
use Astyanax any longer, but maybe an example implementation will help you.  
This is Scala instead of Java, but hopefully you can get the gist.

https://gist.github.com/MightyE/83a79b74f3a69cfa3c4e

If you're timing out talking to your cluster, then I don't recommend using the 
cluster to track your checkpoints, but some other data store (maybe just a 
flatfile).  Again, this is just to give you a sense of what's involved.
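To make the flatfile idea concrete, here is a minimal, hypothetical sketch of a file-backed checkpoint store (it does not implement Astyanax's actual CheckpointManager interface; the class and method names are made up):

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;
import java.util.HashSet;
import java.util.Set;

// Stores one completed token range per line, so a restarted job
// can skip ranges that were already processed.
public class FileCheckpoints {
    private final Path file;
    private final Set<String> done = new HashSet<>();

    public FileCheckpoints(Path file) throws IOException {
        this.file = file;
        if (Files.exists(file)) {
            done.addAll(Files.readAllLines(file));  // reload prior progress
        }
    }

    public boolean isDone(String startToken, String endToken) {
        return done.contains(startToken + ":" + endToken);
    }

    // Append the completed range to the file immediately, so progress
    // survives an interrupted run.
    public void markDone(String startToken, String endToken) throws IOException {
        String range = startToken + ":" + endToken;
        if (done.add(range)) {
            Files.write(file, (range + "\n").getBytes(),
                    StandardOpenOption.CREATE, StandardOpenOption.APPEND);
        }
    }

    public static void main(String[] args) throws IOException {
        Path f = Files.createTempFile("checkpoints", ".txt");
        FileCheckpoints cp = new FileCheckpoints(f);
        cp.markDone("-9223372036854775808", "0");
        System.out.println(cp.isDone("-9223372036854775808", "0")); // true
    }
}
```

The point is just that the checkpoint state lives outside the cluster you are timing out against.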

On Fri, Jan 16, 2015 at 6:31 PM, Mohammed Guller 
<moham...@glassbeam.com<mailto:moham...@glassbeam.com>> wrote:
Total system memory and heap size can't both be 8GB?

The timeout on the Astyanax client should be greater than the timeouts on the 
C* nodes, otherwise your client will timeout prematurely.

Also, have you tried increasing the timeout for the range queries to a higher 
number? It is not recommended to set them very high, because a lot of other 
problems may start happening, but then reading 800,000 partitions is not a 
normal operation.

Just as an experiment, can you set the range timeout to 45 seconds on each 
node and the timeout on the Astyanax client to 50 seconds? Restart the nodes 
after increasing the timeout and try again.
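For reference, the relevant knobs in cassandra.yaml would look something like this (values matching the 45-second experiment above; both settings require a node restart):

```yaml
# cassandra.yaml, on every node
read_request_timeout_in_ms: 45000     # single-partition reads
range_request_timeout_in_ms: 45000    # range scans, which AllRowsReader issues
```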

Mohammed

From: Ravi Agrawal 
[mailto:ragra...@clearpoolgroup.com<mailto:ragra...@clearpoolgroup.com>]
Sent: Friday, January 16, 2015 5:11 PM

To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: Retrieving all row keys of a CF


1) What is the heap size and total memory on each node? 8GB heap, 8GB total memory
2) How big is the cluster? 4 nodes
3) What are the read and range timeouts (in cassandra.yaml) on the C* nodes? 10 secs, 10 secs
4) What are the timeouts for the Astyanax client? 2 secs
5) Do you see GC pressure on the C* nodes? How long does GC for new gen and old gen take? GC occurs every 5 secs; we don't see huge GC pressure; <50ms
6) Does any node crash with OOM error when you try AllRowsReader? No

From: Mohammed Guller [mailto:moham...@glassbeam.com]
Sent: Friday, January 16, 2015 7:30 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: RE: Retrieving all row keys of a CF

A few questions:


1)      What is the heap size and total memory on each node?

2)      How big is the cluster?

3)      What are the read and range timeouts (in cassandra.yaml) on the C* 
nodes?

4)      What are the timeouts for the Astyanax client?

5)      Do you see GC pressure on the C* nodes? How long does GC for new gen 
and old gen take?

6)      Does any node crash with OOM error when you try AllRowsReader?

Mohammed

From: Ravi Agrawal [mailto:ragra...@clearpoolgroup.com]
Sent: Friday, January 16, 2015 4:14 PM
To: user@cassandra.apache.org<mailto:user@cassandra.apache.org>
Subject: Re: Retrieving all row keys of a CF

Hi,
Ruchir and I tried the query using the AllRowsReader recipe but had no luck. We are 
seeing a PoolTimeoutException.
SEVERE: [Thread_1] Error reading RowKeys
com.netflix.astyanax.connectionpool.exceptions.PoolTimeoutException: 
PoolTimeoutException: [host=servername, latency=2003(2003), attempts=4]Timed 
out waiting for connection
       at 
com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.waitForConnection(SimpleHostConnectionPool.java:231)
       at 
com.netflix.astyanax.connectionpool.impl.SimpleHostConnectionPool.borrowConnection(SimpleHostConnectionPool.java:198)
       at 
com.netflix.astyanax.connectionpool.impl.RoundRobinExecuteWithFailover.borrowConnection(RoundRobinExecuteWithFailover.java:84)
       at 
com.netflix.astyanax.connectionpool.impl.AbstractExecuteWithFailoverImpl.tryOperation(AbstractExecuteWithFailoverImpl.java:117)
       at 
com.netflix.astyanax.connectionpool.impl.AbstractHostPartitionConnectionPool.executeWithFailover(AbstractHostPartitionConnectionPool.java:338)
       at 
com.netflix.astyanax.thrift.ThriftColumnFamilyQueryImpl$2.execute(ThriftColumnFamilyQueryImpl.java:397)
       at 
com.netflix.astyanax.recipes.reader.AllRowsReader$1.call(AllRowsReader.java:447)
       at 
com.netflix.astyanax.recipes.reader.AllRowsReader$1.call(AllRowsReader.java:419)
       at java.util.concurrent.FutureTask.run(FutureTask.java:262)
       at 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
       at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
       at java.lang.Thread.run(Thread.java:745)

We did receive a portion of the data, which changes on every try. We used the following 
method.
boolean result = new AllRowsReader.Builder<String, String>(keyspace, CF_STANDARD1)
        .withColumnRange(null, null, false, 0) // fetch 0 columns; we only need the row keys
        .withPartitioner(null) // null means use the keyspace's partitioner
        .forEachRow(new Function<Row<String, String>, Boolean>() {
            @Override
            public Boolean apply(@Nullable Row<String, String> row) {
                // Process the row here ...
                return true; // returning false would abort the whole read
            }
        })
        .build()
        .call();

We also tried setting the concurrency level, as mentioned in this post 
(https://github.com/Netflix/astyanax/issues/411), on both Astyanax 
1.56.49 and 2.0.0. Still nothing.
