Re: No node was available to execute query error

Bowen Song Mon, 15 Mar 2021 14:28:05 -0700

There are different approaches, depending on the application's logic. Roughly speaking, there are two distinct scenarios:


1. Your application knows all the partition keys of the required data
   in advance, either by reading them from another data source (e.g.
   another Cassandra table, another database, a file, or an API) or by
   reconstructing them from other known information (e.g. sequential
   numbers, date times in a known range, etc.).
2. Your application needs all (or nearly all) rows from a given table,
   so you can use range requests to read everything out from that table.
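For the second scenario, one way to read a whole table is to split the partitioner's token range into slices and issue one range query per slice. The following is a minimal sketch, assuming the default Murmur3Partitioner (tokens from -2^63 to 2^63-1). The class name, the table ks.tbl and the partition key column pk are hypothetical, and executing the statements through the driver is left out.

```java
import java.math.BigInteger;
import java.util.ArrayList;
import java.util.List;

public class TokenRangeSplitter {
    // Murmur3Partitioner token bounds (assumption: the cluster uses the default partitioner).
    static final BigInteger MIN = BigInteger.valueOf(Long.MIN_VALUE);
    static final BigInteger MAX = BigInteger.valueOf(Long.MAX_VALUE);

    /** Splits [MIN, MAX] into n contiguous slices; each long[] is {startInclusive, endInclusive}. */
    public static List<long[]> split(int n) {
        if (n < 1) throw new IllegalArgumentException("n must be >= 1");
        BigInteger total = MAX.subtract(MIN).add(BigInteger.ONE); // 2^64 tokens
        BigInteger count = BigInteger.valueOf(n);
        List<long[]> slices = new ArrayList<>();
        for (int i = 0; i < n; i++) {
            BigInteger start = MIN.add(total.multiply(BigInteger.valueOf(i)).divide(count));
            BigInteger next = MIN.add(total.multiply(BigInteger.valueOf(i + 1)).divide(count));
            BigInteger end = next.subtract(BigInteger.ONE);
            slices.add(new long[] {start.longValueExact(), end.longValueExact()});
        }
        return slices;
    }

    /** Builds the CQL text for one slice (ks.tbl and pk are placeholder names). */
    public static String query(long[] slice) {
        return "SELECT * FROM ks.tbl WHERE token(pk) >= " + slice[0]
                + " AND token(pk) <= " + slice[1];
    }

    public static void main(String[] args) {
        for (long[] s : split(4)) {
            System.out.println(query(s));
        }
    }
}
```

Each slice can then be read with normal driver paging, and the slices can be processed one at a time or with a small, bounded amount of parallelism.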

However, before you choose the second option and create a table for each "source" value, I must warn you that creating hundreds of tables in Cassandra is a bad idea.

Ask yourself: what is really required to 'do something'? Do you really need *all* the data each time? Is it possible to make 'do something' incremental, so that you only need *some* of the data each time?



On 15/03/2021 19:33, Joe Obernberger wrote:

Thank you.
What is the best way to iterate over a very large number of rows in Cassandra? I know the DataStax driver lets Java fetch blocks of n records, but is that the best way?
-joe

On 3/15/2021 1:42 PM, Bowen Song wrote:
I personally try to avoid using secondary indexes, especially in large clusters.
SI is not scalable: because an SI query doesn't carry the partition key information, Cassandra must send it to nearly all nodes in a DC to get the answer. Thus, the more nodes you have in a cluster, the slower and more expensive an SI query becomes. Creating an SI on a table can also indirectly create large partitions in the index tables.
On 15/03/2021 17:27, Joe Obernberger wrote:
Great stuff - thank you. I've spent the morning here redesigning with smaller partitions.
If I have a large number of unique IDs that I want to regularly 'do something' with, would it make sense to have a table where a UUID is the partition key, and create a secondary index on a field (call it source) that I want to select from, where the number of UUIDs per source might be very large (billions)?
So - select * from table where source=?
The number of unique source values is small - maybe 1000
Whereas each source may have billions of UUIDs.

-Joe


On 3/15/2021 11:18 AM, Bowen Song wrote:
To be clear, this

    CREATE TABLE ... PRIMARY KEY (k1, k2);

is the same as:

    CREATE TABLE ... PRIMARY KEY ((k1), k2);

but they are NOT the same as:

    CREATE TABLE ... PRIMARY KEY ((k1, k2));
The first two statements create a table with a partition key k1 and a clustering key k2. The third statement creates a composite partition key from k1 and k2, so both k1 and k2 are partition key columns for this table.
Your example "create table xyz (uuid text, source text, primary key (source, uuid));" uses the same syntax as the first statement, which creates the table xyz with a partition key source and a clustering key uuid (which, BTW, is a non-reserved keyword).
A partition in Cassandra is solely determined by the partition key(s), and the clustering key(s) have nothing to do with it. The size of a compacted partition is determined by the number of rows in the partition and the size of each row. If the table doesn't have a clustering key, each partition will have at most one row. The row size is the serialized size of all data in that row, including tombstones.
You can reduce the partition size for a table by either reducing the serialized data size or adding more columns to the (composite) partition key. But please be aware that you will have to provide ALL partition key values when you read from or write to this table (other than for range, SI or MV queries), so you will need to consider the queries before designing the table schema. For scalability, you will need a predictable partition size that does not grow over time, or an actionable plan to re-partition the table when the partition size exceeds a certain threshold. Picking the threshold is more of an art than a science; generally speaking, it should stay below a few hundred MB, and often no more than 100 MB.
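As an illustration of adding a column to the composite partition key, a synthetic bucket can be derived from the row's own id so that one large logical partition is spread over a fixed number of smaller ones. This is only a sketch: the bucket column, the bucket count of 1024 and the class name are assumptions rather than part of the schema discussed here, and the table would need a key along the lines of PRIMARY KEY ((source, bucket), uuid).

```java
public class PartitionBucket {
    // Hypothetical bucket count; pick it so that each (source, bucket) partition stays small.
    public static final int BUCKETS = 1024;

    /** Maps an id to a stable bucket number in [0, BUCKETS). */
    public static int bucketFor(String id) {
        // String.hashCode() is specified by the JDK, so the result is stable across JVMs.
        // floorMod keeps the result non-negative even when hashCode() is negative.
        return Math.floorMod(id.hashCode(), BUCKETS);
    }

    public static void main(String[] args) {
        String id = "3f2b8c1e-0000-4000-8000-000000000001"; // made-up id for the demo
        // Writes and point reads must supply both partition key columns: source and bucket.
        System.out.println("bucket for " + id + " is " + bucketFor(id));
    }
}
```

The trade-off is that reading everything for one source now means one query per bucket, which fits the point above about considering the queries before designing the schema.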
On 15/03/2021 14:36, Joe Obernberger wrote:
Thank you Bowen - I'm redesigning the tables now. When you give Cassandra two parts to the primary key like
create table xyz (uuid text, source text, primary key (source, uuid));
How is the second part of the primary key used to determine partition size?
-Joe

On 3/12/2021 5:27 PM, Bowen Song wrote:
The partition size min/avg/max of 8409008/15096925/25109160 bytes looks fine for the table fieldcounts, but the number of partitions is a bit worrying. Only 3 partitions? Are you expecting the partition size (instead of the number of partitions) to grow in the future? That can lead to a lot of headaches.
Forget about the fieldcounts table for now, the doc table looks really bad. It has a min/avg/max partition size of 24602/7052951452/63771372175 bytes; the partition sizes are severely unevenly distributed, and the over 60 GB partition is way too big.
You really need to redesign your table schemas and avoid creating large or uneven partitions.
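To put those numbers in perspective, the cfstats byte counts can be compared against a target partition size. The sketch below uses the two maximum partition sizes quoted above and an assumed ceiling of 100 MB; the class and method names are made up for illustration.

```java
public class PartitionSizeCheck {
    // Assumed ceiling for a healthy partition, in bytes (100 MB).
    public static final long LIMIT_BYTES = 100L * 1000 * 1000;

    /** Returns how many times the limit a partition of the given size is. */
    public static double timesOverLimit(long partitionBytes) {
        return (double) partitionBytes / LIMIT_BYTES;
    }

    /** True when the partition is larger than the assumed ceiling. */
    public static boolean tooLarge(long partitionBytes) {
        return partitionBytes > LIMIT_BYTES;
    }

    public static void main(String[] args) {
        long fieldcountsMax = 25_109_160L;   // "Compacted partition maximum bytes" for fieldcounts
        long docMax = 63_771_372_175L;       // the same figure for doc
        System.out.printf("fieldcounts: %.2fx limit, too large: %b%n",
                timesOverLimit(fieldcountsMax), tooLarge(fieldcountsMax));
        System.out.printf("doc: %.2fx limit, too large: %b%n",
                timesOverLimit(docMax), tooLarge(docMax));
    }
}
```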
On 12/03/2021 18:52, Joe Obernberger wrote:
Thank you very much for helping me out on this! The table fieldcounts is currently pretty small - 6.4 million rows.
cfstats are:

Total number of tables: 81
----------------
Keyspace : doc
        Read Count: 3713134
        Read Latency: 0.2664131157130338 ms
        Write Count: 47513045
        Write Latency: 1.0725477948634947 ms
        Pending Flushes: 0
                Table: fieldcounts
                SSTable count: 3
                Space used (live): 16010248
                Space used (total): 16010248
                Space used by snapshots (total): 0
                Off heap memory used (total): 4947
                SSTable Compression Ratio: 0.3994304032360534
                Number of partitions (estimate): 3
                Memtable cell count: 0
                Memtable data size: 0
                Memtable off heap memory used: 0
                Memtable switch count: 0
                Local read count: 379
                Local read latency: NaN ms
                Local write count: 0
                Local write latency: NaN ms
                Pending flushes: 0
                Percent repaired: 100.0
                Bloom filter false positives: 0
                Bloom filter false ratio: 0.00000
                Bloom filter space used: 48
                Bloom filter off heap memory used: 24
                Index summary off heap memory used: 51
                Compression metadata off heap memory used: 4872
                Compacted partition minimum bytes: 8409008
                Compacted partition maximum bytes: 25109160
                Compacted partition mean bytes: 15096925
                Average live cells per slice (last five minutes): NaN
                Maximum live cells per slice (last five minutes): 0
                Average tombstones per slice (last five minutes): NaN
                Maximum tombstones per slice (last five minutes): 0
                Dropped Mutations: 0
Commitlog is on a separate spindle on the 7 node cluster. All disks are SATA (spinning rust as they say!). This is an R&D platform, but I will switch to NetworkTopologyStrategy. I'm using Prometheus and Grafana to monitor Cassandra and the CPU load is typically 100 to 200% on most of the nodes. Disk IO is typically pretty low.
Performance - in general Async is about 10x faster.
ExecuteAsync:
35mSec for 364 rows.
8120mSec for 205001 rows.
14788mSec for 345001 rows.
4117mSec for 86400 rows.

23,330 rows per second on average

Execute:
232mSec for 364 rows.
584869mSec for 1263283 rows
46290mSec for 86400 rows

2,160 rows per second on average
Curious - our largest table (doc) has the following stats - is it not partitioned well?
Total number of tables: 81
----------------
Keyspace : doc
        Read Count: 3713134
        Read Latency: 0.2664131157130338 ms
        Write Count: 47513045
        Write Latency: 1.0725477948634947 ms
        Pending Flushes: 0
                Table: doc
                SSTable count: 26
                Space used (live): 57124641753
                Space used (total): 57124641753
                Space used by snapshots (total): 113012646218
                Off heap memory used (total): 27331913
                SSTable Compression Ratio: 0.2531585373184219
                Number of partitions (estimate): 12
                Memtable cell count: 0
                Memtable data size: 0
                Memtable off heap memory used: 0
                Memtable switch count: 0
                Local read count: 27169
                Local read latency: NaN ms
                Local write count: 0
                Local write latency: NaN ms
                Pending flushes: 0
                Percent repaired: 0.0
                Bloom filter false positives: 0
                Bloom filter false ratio: 0.00000
                Bloom filter space used: 576
                Bloom filter off heap memory used: 368
                Index summary off heap memory used: 425
                Compression metadata off heap memory used: 27331120
                Compacted partition minimum bytes: 24602
                Compacted partition maximum bytes: 63771372175
                Compacted partition mean bytes: 7052951452
                Average live cells per slice (last five minutes): NaN
                Maximum live cells per slice (last five minutes): 0
                Average tombstones per slice (last five minutes): NaN
                Maximum tombstones per slice (last five minutes): 0
                Dropped Mutations: 0

Thanks again!

-Joe

On 3/12/2021 11:01 AM, Bowen Song wrote:
That sleep-then-retry works is just another indicator that it's likely a GC pause related issue. I'd recommend checking your Cassandra servers' GC logs first.
Do you know what the maximum partition size is for the doc.fieldcounts table? (Try the "nodetool cfstats doc.fieldcounts" command.) I suspect this table has large partitions, which usually leads to GC issues.
As for your failed executeAsync() insert issue, do you know how many concurrent in-flight queries you have? The Cassandra driver has a limit on these, and new executeAsync() calls will fail when the limit is reached.
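One way to stay under such a limit is to cap the number of in-flight requests on the application side. The sketch below is a generic pattern built on a Semaphore, not driver API: asyncCall stands in for something like session.executeAsync(...), and the class name and the limit of 2 are hypothetical.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionStage;
import java.util.concurrent.Semaphore;
import java.util.function.Supplier;

public class InFlightLimiter {
    private final Semaphore permits;

    public InFlightLimiter(int maxInFlight) {
        this.permits = new Semaphore(maxInFlight);
    }

    /** Blocks until a slot is free, starts the call, and frees the slot when the call completes. */
    public <T> CompletionStage<T> submit(Supplier<CompletionStage<T>> asyncCall) {
        permits.acquireUninterruptibly();
        try {
            // The slot is released on success and on failure alike.
            return asyncCall.get().whenComplete((result, error) -> permits.release());
        } catch (RuntimeException e) {
            permits.release(); // the call failed before it returned a stage
            throw e;
        }
    }

    /** Number of requests that could be started right now without blocking. */
    public int availableSlots() {
        return permits.availablePermits();
    }

    public static void main(String[] args) {
        InFlightLimiter limiter = new InFlightLimiter(2);
        for (int i = 0; i < 5; i++) {
            final int n = i;
            // Stand-in for session.executeAsync(statement).
            limiter.submit(() -> CompletableFuture.supplyAsync(() -> n * n))
                   .thenAccept(v -> System.out.println("done: " + v));
        }
    }
}
```

Because submit() blocks the producing thread when all slots are taken, a loop that reads from one result set and inserts asynchronously is slowed down to the pace the cluster can absorb instead of failing.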
I'm also a bit concerned about your "significantly" slower inserts. Inserts (excluding "INSERT IF NOT EXISTS") should be very fast in Cassandra. How slow are they? Are they always slow like that, or usually fast but some are much slower than others? What does the CPU usage & disk IO look like on the Cassandra server? Do you have commitlog on the same disk as the data? Is it a spinning disk, SATA SSD or NVMe?
BTW, you really shouldn't use SimpleStrategy for production environments.
On 12/03/2021 15:18, Joe Obernberger wrote:
The queries that are failing are:
select fieldvalue, count from doc.ordered_fieldcounts where source=? and fieldname=? limit 10
Created with:
CREATE TABLE doc.ordered_fieldcounts (
    source text,
    fieldname text,
    count bigint,
    fieldvalue text,
    PRIMARY KEY ((source, fieldname), count, fieldvalue)
) WITH CLUSTERING ORDER BY (count DESC, fieldvalue ASC)

and:
select fieldvalue, count from doc.fieldcounts where source=? and fieldname=?
Created with:
CREATE TABLE doc.fieldcounts (
    source text,
    fieldname text,
    fieldvalue text,
    count bigint,
    PRIMARY KEY (source, fieldname, fieldvalue)
)
This really seems like a driver issue. I put retry logic around the calls and now those queries work. Basically, if it throws an exception, I Thread.sleep(500) and then retry. This seems to be a continuing theme with Cassandra in general. Is this common practice?
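For reference, a sleep-then-retry loop like the one described can be written as a small helper with a bounded number of attempts and a growing delay instead of a fixed sleep. The following is a generic sketch, not driver functionality; the class name, attempt count and delays are illustrative assumptions, and it should only wrap operations that are safe to repeat.

```java
import java.util.concurrent.Callable;

public class Retry {
    /**
     * Runs the operation, trying at most maxAttempts times in total and
     * doubling the delay after each failed attempt.
     */
    public static <T> T withBackoff(Callable<T> operation, int maxAttempts, long initialDelayMs) {
        if (maxAttempts < 1) throw new IllegalArgumentException("maxAttempts must be >= 1");
        long delay = initialDelayMs;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.call();
            } catch (Exception e) {
                last = new RuntimeException("attempt " + attempt + " failed", e);
                if (attempt == maxAttempts) break; // out of attempts, give up
                try {
                    Thread.sleep(delay);
                } catch (InterruptedException ie) {
                    Thread.currentThread().interrupt();
                    throw new RuntimeException("interrupted while waiting to retry", ie);
                }
                delay *= 2;
            }
        }
        throw last;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        // Hypothetical operation that fails twice and then succeeds.
        String result = withBackoff(() -> {
            if (++calls[0] < 3) throw new IllegalStateException("transient failure");
            return "succeeded on attempt " + calls[0];
        }, 5, 10);
        System.out.println(result);
    }
}
```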
After adding this retry logic, an insert statement started failing with an illegal state exception when I retried it (which makes sense). This insert was using session.executeAsync(boundStatement). I changed that to just execute (instead of async) and now I get no errors and no retries anywhere. The insert is *significantly* slower when running execute vs executeAsync. When using executeAsync:
com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query
        at com.datastax.oss.driver.api.core.NoNodeAvailableException.copy(NoNodeAvailableException.java:40)
        at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:149)
        at com.datastax.oss.driver.internal.core.cql.MultiPageResultSet$RowIterator.maybeMoveToNextPage(MultiPageResultSet.java:99)
        at com.datastax.oss.driver.internal.core.cql.MultiPageResultSet$RowIterator.computeNext(MultiPageResultSet.java:91)
        at com.datastax.oss.driver.internal.core.cql.MultiPageResultSet$RowIterator.computeNext(MultiPageResultSet.java:79)
        at com.datastax.oss.driver.internal.core.util.CountingIterator.tryToComputeNext(CountingIterator.java:91)
        at com.datastax.oss.driver.internal.core.util.CountingIterator.hasNext(CountingIterator.java:86)
        at com.ngc.helios.fieldanalyzer.FTAProcess.handleOrderedFieldCounts(FTAProcess.java:684)
        at com.ngc.helios.fieldanalyzer.FTAProcess.storeResults(FTAProcess.java:214)
        at com.ngc.helios.fieldanalyzer.FTAProcess.startProcess(FTAProcess.java:190)
        at com.ngc.helios.fieldanalyzer.Main.main(Main.java:20)
The interesting part here is that the line that is now failing (line 684 in FTAProcess) is:
if (itRs.hasNext())
where itRs is an Iterator<Row> over a select query from another table. I'm iterating over a result set from a select and inserting those results via executeAsync.
-Joe

On 3/12/2021 9:07 AM, Bowen Song wrote:
Millions of rows in a single query? That sounds like a bad idea to me. Your "NoNodeAvailableException" could be caused by stop-the-world GC pauses, and the GC pauses are likely caused by the query itself.
On 12/03/2021 13:39, Joe Obernberger wrote:
Thank you Paul and Erick.  The keyspace is defined like this:
CREATE KEYSPACE doc WITH replication = {'class': 'SimpleStrategy', 'replication_factor': '3'} AND durable_writes = true;
Would that cause this?
The program that is having the problem selects data, calculates stuff, and inserts. It works with smaller selects, but when the number of rows is in the millions, I start to get this error. Since it works with smaller sets, I don't believe it to be a network error. All the nodes are definitely up as other processes are working OK, it's just this one program that fails.
The full stack trace:
Error: com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query
com.datastax.oss.driver.api.core.NoNodeAvailableException: No node was available to execute the query
        at com.datastax.oss.driver.api.core.NoNodeAvailableException.copy(NoNodeAvailableException.java:40)
        at com.datastax.oss.driver.internal.core.util.concurrent.CompletableFutures.getUninterruptibly(CompletableFutures.java:149)
        at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:53)
        at com.datastax.oss.driver.internal.core.cql.CqlRequestSyncProcessor.process(CqlRequestSyncProcessor.java:30)
        at com.datastax.oss.driver.internal.core.session.DefaultSession.execute(DefaultSession.java:230)
        at com.datastax.oss.driver.api.core.cql.SyncCqlSession.execute(SyncCqlSession.java:54)
        at com.abc.xxxx.fieldanalyzer.FTAProcess.udpateCassandraFTAMetrics(FTAProcess.java:275)
        at com.abc.xxxx.fieldanalyzer.FTAProcess.storeResults(FTAProcess.java:216)
        at com.abc.xxxx.fieldanalyzer.FTAProcess.startProcess(FTAProcess.java:199)
        at com.abc.xxxx.fieldanalyzer.Main.main(Main.java:20)

FTAProcess line 275 is:
ResultSet rs = session.execute(getFieldCounts.bind().setString(0, rb.getSource()).setString(1, rb.getFieldName()));
-Joe

On 3/12/2021 8:30 AM, Paul Chandler wrote:
Hi Joe
This could also be caused by the replication factor of the keyspace: if you have NetworkTopologyStrategy and it doesn't list a replication factor for the datacenter datacenter1, then you will get this error message too.
Paul
On 12 Mar 2021, at 13:07, Erick Ramirez <erick.rami...@datastax.com> wrote:
Does it get returned by the driver every single time? The NoNodeAvailableException gets thrown when (1) all nodes are down, or (2) all the contact points are invalid from the driver's perspective.
Is it possible there's no route/connectivity from your app server(s) to the 172.16.x.x network? If you post the full error message + full stacktrace, it might provide clues. Cheers!