sstable processing times
Hi folks,

I'm running a job on an offline node to test how long it takes to run sstablesplit against several large sstables. I'm a bit dismayed to see it took about 22 hours to process a 1.5 gigabyte sstable! I worry about the 32 gigabyte sstable that is my ultimate target to split.

This is running on an otherwise unloaded CentOS 7 server (Linux 3.10.0) with 4 cpus and 24 gigabytes of ram. Cassandra 3.11.0 and OpenJDK 1.8.0_252 are the installed versions of the software. The machine isn't very busy; it looks as though java is only making use of 1 of the 4 processors, and it's not using much of the available 24 gigabytes of memory either. All the memory usage is in the linux buffer cache, which I guess makes sense if it's just working on these large files w/o needing to do a lot of heavy computation on what it reads from them.

When you folks run sstablesplit, do you provide specific CASSANDRA_INCLUDE settings to increase its performance?

Jim

-
To unsubscribe, e-mail: user-unsubscr...@cassandra.apache.org
For additional commands, e-mail: user-h...@cassandra.apache.org
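For a rough sense of why the 32 GB file is worrying, here is the back-of-the-envelope extrapolation from the observed rate (a sketch only; it assumes, perhaps optimistically, that sstablesplit time scales linearly with file size):

```python
# Back-of-the-envelope extrapolation of sstablesplit runtime.
# Assumes processing time scales linearly with sstable size.
observed_hours = 22    # time taken for the 1.5 GB sstable
observed_gb = 1.5
target_gb = 32         # the sstable that is the ultimate target

projected_hours = observed_hours * target_gb / observed_gb
print(f"projected: {projected_hours:.0f} hours (~{projected_hours / 24:.1f} days)")
```

At that rate the 32 GB split would run for roughly 470 hours, which is why the single-core behavior matters.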
cassandra tracing's source_elapsed microseconds
Hi folks,

I've been looking at various articles on the TRACING ON output of cassandra. I'm not finding a definitive description of what the output means. https://docs.datastax.com/en/dse/6.7/cql/cql/cql_reference/cqlsh_commands/cqlshTracing.html says "Note: The source_elapsed column value is the elapsed time of the event on the source node in microseconds."

Am I correct in the understanding that the current row's source_elapsed value minus the source_elapsed value of the previous row for the same source node should tell me how long the current row took to execute? As an example:

 activity                                                           | timestamp                  | source        | source_elapsed | client
 ...
 Bloom filter allows skipping sstable 396 [ReadStage-3]             | 2020-10-08 10:31:48.631001 | 10.220.50.148 |            402 | 10.220.50.148
 Partition index with 0 entries found for sstable 382 [ReadStage-3] | 2020-10-08 10:31:48.636000 | 10.220.50.148 |           5819 | 10.220.50.148
 Bloom filter allows skipping sstable 380 [ReadStage-3]             | 2020-10-08 10:31:48.645000 | 10.220.50.148 |          14685 | 10.220.50.148
 Bloom filter allows skipping sstable 14 [ReadStage-3]              | 2020-10-08 10:31:48.645000 | 10.220.50.148 |          14714 | 10.220.50.148
 Partition index with 0 entries found for sstable 6 [ReadStage-3]   | 2020-10-08 10:31:48.666000 | 10.220.50.148 |          35410 | 10.220.50.148
 ...

Does the above indicate it took 20.696 milliseconds to run the last "Partition index with 0 entries found for sstable 6" activity?

Jim
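The arithmetic being asked about can be sketched as follows (assuming source_elapsed is a monotonically increasing per-node counter, so the delta between consecutive rows for the same source is the per-step duration):

```python
# source_elapsed values (microseconds) from the consecutive trace rows
# above, all for the same source node
elapsed = [402, 5819, 14685, 14714, 35410]

# Duration of each step = current source_elapsed minus the previous one
deltas_us = [b - a for a, b in zip(elapsed, elapsed[1:])]
print(deltas_us)                    # per-step durations in microseconds
print(deltas_us[-1] / 1000, "ms")   # last step: 35410 - 14714 = 20696 us = 20.696 ms
```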
Re: sstableloader - warning vs. failure?
Ok, thanks very much for the answer!

On Fri, Feb 7, 2020 at 9:00 PM Erick Ramirez wrote:

>> INFO [pool-1-thread-4] 2020-02-08 01:35:37,946 NoSpamLogger.java:91 -
>> Maximum memory usage reached (536870912), cannot allocate chunk of 1048576
>
> The message gets logged when SSTables are being cached and the cache fills
> up faster than objects are evicted from it. Note that the message is logged
> at INFO level (instead of WARN or ERROR) because there is no detrimental
> effect, but there will be a performance hit in the form of read latency.
> When space becomes available, it will just continue on to cache the next
> 64k chunk of the sstable.
>
> FWIW the default cache size (file_cache_size_in_mb in cassandra.yaml) is
> 512 MB (max memory of 536870912 in the log entry). Cheers!
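If those INFO messages become frequent on read-heavy nodes, one option suggested by the default mentioned above (a sketch only; the right value depends on the memory available on your nodes) is to raise the cache size in cassandra.yaml:

```yaml
# cassandra.yaml -- cache for sstable chunks (default: 512 MB, i.e. the
# 536870912 bytes seen in the NoSpamLogger message)
file_cache_size_in_mb: 1024   # example value; size to fit your memory budget
```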
sstableloader - warning vs. failure?
Hi folks,

When sstableloader hits a very large sstable, cassandra may end up logging a message like this:

INFO [pool-1-thread-4] 2020-02-08 01:35:37,946 NoSpamLogger.java:91 - Maximum memory usage reached (536870912), cannot allocate chunk of 1048576

The loading process doesn't abort, and the sstableloader stdout logging appears to end up reporting success, e.g., with a few 100% totals across the nodes reported:

progress: [/10.0.1.116]0:11/11 100% [/10.0.1.248]0:11/11 100% [/10.0.1.93]0:11/11 100% total: 100% 0.000KiB/s (avg: 36.156MiB/s)
progress: [/10.0.1.116]0:11/11 100% [/10.0.1.248]0:11/11 100% [/10.0.1.93]0:11/11 100% total: 100% 0.000KiB/s (avg: 34.914MiB/s)
progress: [/10.0.1.116]0:11/11 100% [/10.0.1.248]0:11/11 100% [/10.0.1.93]0:11/11 100% total: 100% 0.000KiB/s (avg: 33.794MiB/s)

Summary statistics:
   Connections per host    : 1
   Total files transferred : 33
   Total bytes transferred : 116.027GiB
   Total duration          : 3515748 ms
   Average transfer rate   : 33.794MiB/s
   Peak transfer rate      : 53.130MiB/s

In these situations is sstableloader hitting the memory issue and then retrying a few times until it succeeds? Or is it silently dropping data on the floor? I'd assume the former, but thought it'd be good to ask you folks to be sure...

Jim
Cassandra and UTF-8 BOM?
Hi folks,

I'm looking at a table that has a primary key defined as "publisher_id text". I've noticed some of the entries have what appears to me to be a UTF-8 BOM marker and some do not. https://docs.datastax.com/en/archived/cql/3.3/cql/cql_reference/cql_data_types_c.html says text is a UTF-8 encoded string.

If I look at the first 3 bytes of one of these columns:

$ dd if=~/tmp/sample.data of=/dev/stdout bs=1 count=3 2>/dev/null | hexdump
000 bbef 00bf
003

When I swap the byte order:

$ dd if=~/tmp/sample.data of=/dev/stdout bs=1 count=3 conv=swab 2>/dev/null | hexdump
000 efbb 00bf
003

And I think this matches the UTF-8 BOM (EF BB BF).

However, not all the rows have this prefix, and I'm wondering if this is a client issue (a client being inconsistent about how it's dealing with strings) or if Cassandra is doing something special on its own. The rest of the column falls within the US-ASCII codepoint compatible range of UTF-8, e.g., something as simple as 'abc', but in some cases it's got this marker in front of it.

Cassandra is treating the BOM-prefixed 'abc' as a distinct value from the plain 'abc', which certainly makes sense; for the sake of efficiency I assume it'd just be looking at the byte-for-byte values w/o layering meaning on top of it. But that means I'll need to clean the data up to be consistent, and I need to figure out how to prevent it from being reintroduced in the future.

Jim
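One way to confirm the prefix is the UTF-8 BOM, and to normalize values before they reach the database, is sketched below using Python's standard codecs module (the 'utf-8-sig' codec strips a single leading BOM during decode, if present):

```python
import codecs

raw = b"\xef\xbb\xbfabc"                # value with the suspected BOM prefix
assert raw.startswith(codecs.BOM_UTF8)  # BOM_UTF8 == b'\xef\xbb\xbf'

# Plain 'utf-8' keeps the BOM as U+FEFF; 'utf-8-sig' strips it
print(repr(raw.decode("utf-8")))        # '\ufeffabc' -- BOM survives
print(repr(raw.decode("utf-8-sig")))    # 'abc'       -- BOM removed
print(repr(b"abc".decode("utf-8-sig"))) # 'abc'       -- no BOM, unchanged
```

Running all inserted values through a 'utf-8-sig'-style decode (or an equivalent check in whatever language the client is written in) would both clean existing data during the migration and keep the marker from being reintroduced.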
n00b q re UPDATE v. INSERT in CQL
Hi folks,

I'm working on a clean-up task for some bad data in a cassandra db. The bad data in this case are values with mixed case that will need to be lowercased. In some tables the value that needs to be changed is a primary key, in other cases it is not.

From the reading I've done, the situations where I need to change a primary key column to lowercase will mean I need to perform an INSERT of the entire row using the new primary key values merged with the old non-primary-key values, followed by a DELETE of the old primary key row.

My question is, on a table where I need to update a column that isn't a primary key, should I perform a limited UPDATE in that situation like I would in SQL:

UPDATE ks.table SET col1 = ? WHERE pk1 = ? AND pk2 = ?

or will there be any downsides to that over an INSERT where I specify all columns?

INSERT INTO ks.table (pk1, pk2, col1, col2, ...) VALUES (?, ?, ?, ?, ...)

In SQL I'd never question just using the update, but my impression reading the blogosphere is that Cassandra has subtleties that I might not be grasping when it comes to UPDATE v. INSERT behavior...

Jim
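The primary-key rewrite described above can be sketched as plain data manipulation (column names here are hypothetical; in practice each new row would be written through your driver with the INSERT issued before the DELETE, so the row is never absent):

```python
# Hypothetical row as a dict; primary key columns can only be "changed"
# in Cassandra via INSERT of a new row + DELETE of the old one
old_row = {"pk1": "Acme", "pk2": "US-East", "col1": "MixedCase", "col2": 42}
pk_cols = ("pk1", "pk2")

# Build the replacement row: lowercase the PK columns, keep the rest as-is
new_row = {k: (v.lower() if k in pk_cols else v) for k, v in old_row.items()}
print(new_row)  # {'pk1': 'acme', 'pk2': 'us-east', 'col1': 'MixedCase', 'col2': 42}

# Order of operations against Cassandra (sketch):
#   1. INSERT INTO ks.table (...) VALUES (...)   -- the new_row values
#   2. DELETE FROM ks.table
#        WHERE pk1 = old_row['pk1'] AND pk2 = old_row['pk2']
```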
snapshots and 'dot' prefixed _index directories
Hi folks,

I took a nodetool snapshot of a keyspace in my cassandra 3.11 cluster and it included directories with a 'dot' prefix (often called a hidden file/directory). As an example:

/var/lib/cassandra/data/impactvizor/tableau_notification-04bfb600291e11e7aeab31f0f0e5804b/snapshots/1569974640/.tableau_notification_alert_id_index

Am I supposed to back up the files under the dot-prefixed directories the same as I do the other files? I ask because tar just complained that one of these files 'changed as we read it', which I wouldn't have expected given the documentation of how snapshots work.

Jim
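Before deciding what to archive, it can help to enumerate the hidden entries explicitly, since dot-prefixed names are skipped by naive shell globs like snapshots/1569974640/*. A small sketch (the directory layout below is a fabricated miniature of the example path above, just to demonstrate the scan):

```python
import tempfile
from pathlib import Path

# Build a miniature snapshot layout mimicking the example in the post
snap = Path(tempfile.mkdtemp()) / "snapshots" / "1569974640"
(snap / ".tableau_notification_alert_id_index").mkdir(parents=True)
(snap / ".tableau_notification_alert_id_index" / "mc-1-big-Data.db").touch()
(snap / "mc-1-big-Data.db").touch()

# List dot-prefixed entries explicitly so they aren't silently missed
hidden = sorted(p.name for p in snap.iterdir() if p.name.startswith("."))
print(hidden)  # ['.tableau_notification_alert_id_index']
```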