[jira] [Updated] (CASSANDRA-13307) The specification of protocol version in cqlsh means the python driver doesn't automatically downgrade protocol version.
[ https://issues.apache.org/jira/browse/CASSANDRA-13307?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

mck updated CASSANDRA-13307:
    Labels: doc-impacting  (was: )

> The specification of protocol version in cqlsh means the python driver
> doesn't automatically downgrade protocol version.
>
> Key: CASSANDRA-13307
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13307
> Project: Cassandra
> Issue Type: Bug
> Components: Tools
> Reporter: Matt Byrd
> Assignee: Matt Byrd
> Priority: Minor
> Labels: doc-impacting
> Fix For: 3.11.x
>
> Hi,
> Looks like we've regressed on the issue described in:
> https://issues.apache.org/jira/browse/CASSANDRA-9467
> In that we're no longer able to connect from newer cqlsh versions
> (e.g. trunk) to older versions of Cassandra with a lower version of the
> protocol (e.g. 2.1 with protocol version 3).
> The problem seems to be that we're relying on the ability for the client to
> automatically downgrade protocol version implemented in Cassandra here:
> https://issues.apache.org/jira/browse/CASSANDRA-12838
> and utilised in the python client here:
> https://datastax-oss.atlassian.net/browse/PYTHON-240
> The problem, however, came when we implemented:
> https://datastax-oss.atlassian.net/browse/PYTHON-537
> "Don't downgrade protocol version if explicitly set"
> (included when we bumped from 3.5.0 to 3.7.0 of the python driver as part of
> fixing: https://issues.apache.org/jira/browse/CASSANDRA-11534),
> since we do explicitly specify the protocol version in bin/cqlsh.py.
> I've got a patch which just adds an option to explicitly specify the protocol
> version (for those who want to do that) and then otherwise defaults to not
> setting the protocol version, i.e. using the protocol version from the client
> which we ship, which should by default be the same protocol as the server.
> Then it should downgrade gracefully as was intended.
> Let me know if that seems reasonable.
> Thanks,
> Matt

--
This message was sent by Atlassian JIRA (v6.3.15#6346)
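
The behaviour proposed in the ticket can be sketched roughly as below. This is only an illustration, not the actual patch: `build_cluster_kwargs` and `negotiate` are hypothetical helpers, and the version numbers passed in are made-up inputs. The one driver detail assumed is that the Python driver's `Cluster` accepts an optional `protocol_version` keyword and, per PYTHON-537, stops downgrading once it is set explicitly.

```python
def build_cluster_kwargs(protocol_version=None):
    """Build keyword arguments for the driver's Cluster (hypothetical helper).

    protocol_version is only passed through when the user asked for one,
    so that an unset value leaves the driver free to negotiate.
    """
    kwargs = {}
    if protocol_version is not None:
        kwargs["protocol_version"] = protocol_version
    return kwargs


def negotiate(client_max, server_max, explicit=None):
    """Toy model of the negotiation outcome (not the driver's real code)."""
    if explicit is not None:
        # An explicitly set version is never downgraded.
        if explicit > server_max:
            raise ConnectionError(
                "server does not support protocol v%d" % explicit)
        return explicit
    # Without an explicit version the client steps down until the server agrees.
    return min(client_max, server_max)
```

With no explicit version, `negotiate(4, 3)` settles on version 3, which is the graceful downgrade the ticket wants back; with `explicit=4` against the same server the connection fails instead.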
[jira] [Comment Edited] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965339#comment-15965339 ] Michael Shuler edited comment on CASSANDRA-12661 at 4/12/17 4:35 AM: - One of the new tests fail: http://cassci.datastax.com/job/trunk_testall/1509/testReport/org.apache.cassandra.service/GCInspectorTest/ensureLogLessThanWarn/ http://cassci.datastax.com/job/trunk_testall/1509/testReport/org.apache.cassandra.service/GCInspectorTest/ensureLogLessThanWarn_compression/ {noformat} Error Message Expected exception: java.lang.IllegalArgumentException Stacktrace junit.framework.AssertionFailedError: Expected exception: java.lang.IllegalArgumentException Standard Output ERROR [main] 2017-04-12 02:05:31,914 SubstituteLogger.java:250 - SLF4J: stderr INFO [main] 2017-04-12 02:05:32,110 YamlConfigurationLoader.java:89 - Configuration location: file:/home/automaton/cassandra/test/conf/cassandra.yaml DEBUG [main] 2017-04-12 02:05:32,112 YamlConfigurationLoader.java:108 - Loading settings from file:/home/automaton/cassandra/test/conf/cassandra.yaml INFO [main] 2017-04-12 02:05:32,995 Config.java:454 - Node configuration:[allocate_tokens_for_keyspace=null; authenticator=null; authorizer=null; auto_bootstrap=true; auto_snapshot=true; back_pressure_enabled=false; back_pressure_strategy=null; batch_size_fail_threshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_address=null; broadcast_rpc_address=null; buffer_pool_use_heap_if_exhausted=true; cas_contention_timeout_in_ms=1000; cdc_enabled=false; cdc_free_space_check_interval_ms=250; cdc_raw_directory=build/test/cassandra/cdc_raw:331; cdc_total_space_in_mb=0; client_encryption_options=; cluster_name=Test Cluster; column_index_cache_size_in_kb=2; column_index_size_in_kb=4; commit_failure_policy=stop; commitlog_compression=null; commitlog_directory=build/test/cassandra/commitlog:331; commitlog_max_compression_buffers_in_pool=3; 
commitlog_periodic_queue_size=-1; commitlog_segment_size_in_mb=5; commitlog_sync=batch; commitlog_sync_batch_window_in_ms=1.0; commitlog_sync_period_in_ms=0; commitlog_total_space_in_mb=null; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=0; concurrent_compactors=4; concurrent_counter_writes=32; concurrent_materialized_view_writes=32; concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32; counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; credentials_cache_max_entries=1000; credentials_update_interval_in_ms=-1; credentials_validity_in_ms=2000; cross_node_timeout=false; data_file_directories=[Ljava.lang.String;@4c178a76; disk_access_mode=mmap; disk_failure_policy=ignore; disk_optimization_estimate_percentile=0.95; disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd; dynamic_snitch=true; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=60; dynamic_snitch_update_interval_in_ms=100; enable_scripted_user_defined_functions=true; enable_user_defined_functions=true; enable_user_defined_functions_threads=true; encryption_options=null; endpoint_snitch=org.apache.cassandra.locator.SimpleSnitch; file_cache_size_in_mb=null; gc_log_threshold_in_ms=200; gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=[]; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; hints_compression=null; hints_directory=build/test/cassandra/hints:331; hints_flush_period_in_ms=1; ideal_consistency_level=null; incremental_backups=true; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; initial_token=null; inter_dc_stream_throughput_outbound_megabits_per_sec=200; inter_dc_tcp_nodelay=true; internode_authenticator=null; internode_compression=none; internode_recv_buff_size_in_bytes=0; internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647; 
key_cache_save_period=14400; key_cache_size_in_mb=null; listen_address=127.0.0.1; listen_interface=null; listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false; max_hint_window_in_ms=1080; max_hints_delivery_threads=2; max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null; max_streaming_retries=3; max_value_size_in_mb=256; memtable_allocation_type=offheap_objects; memtable_cleanup_threshold=null; memtable_flush_writers=0; memtable_heap_space_in_mb=null; memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50; native_transport_max_concurrent_connections=-1; native_transport_max_concurrent_connections_per_ip=-1; native_transport_max_frame_size_in_mb=256; native_transport_max_threads=128; native_transport_port=9373; native_transport_port_ssl=null; num_tokens=1; otc_coalescing_enough_coalesced_messages=8;
[jira] [Commented] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965339#comment-15965339 ] Michael Shuler commented on CASSANDRA-12661: The new tests fail: http://cassci.datastax.com/job/trunk_testall/1509/testReport/org.apache.cassandra.service/GCInspectorTest/ensureLogLessThanWarn/ http://cassci.datastax.com/job/trunk_testall/1509/testReport/org.apache.cassandra.service/GCInspectorTest/ensureLogLessThanWarn_compression/ {noformat} Error Message Expected exception: java.lang.IllegalArgumentException Stacktrace junit.framework.AssertionFailedError: Expected exception: java.lang.IllegalArgumentException Standard Output ERROR [main] 2017-04-12 02:05:31,914 SubstituteLogger.java:250 - SLF4J: stderr INFO [main] 2017-04-12 02:05:32,110 YamlConfigurationLoader.java:89 - Configuration location: file:/home/automaton/cassandra/test/conf/cassandra.yaml DEBUG [main] 2017-04-12 02:05:32,112 YamlConfigurationLoader.java:108 - Loading settings from file:/home/automaton/cassandra/test/conf/cassandra.yaml INFO [main] 2017-04-12 02:05:32,995 Config.java:454 - Node configuration:[allocate_tokens_for_keyspace=null; authenticator=null; authorizer=null; auto_bootstrap=true; auto_snapshot=true; back_pressure_enabled=false; back_pressure_strategy=null; batch_size_fail_threshold_in_kb=50; batch_size_warn_threshold_in_kb=5; batchlog_replay_throttle_in_kb=1024; broadcast_address=null; broadcast_rpc_address=null; buffer_pool_use_heap_if_exhausted=true; cas_contention_timeout_in_ms=1000; cdc_enabled=false; cdc_free_space_check_interval_ms=250; cdc_raw_directory=build/test/cassandra/cdc_raw:331; cdc_total_space_in_mb=0; client_encryption_options=; cluster_name=Test Cluster; column_index_cache_size_in_kb=2; column_index_size_in_kb=4; commit_failure_policy=stop; commitlog_compression=null; commitlog_directory=build/test/cassandra/commitlog:331; commitlog_max_compression_buffers_in_pool=3; commitlog_periodic_queue_size=-1; 
commitlog_segment_size_in_mb=5; commitlog_sync=batch; commitlog_sync_batch_window_in_ms=1.0; commitlog_sync_period_in_ms=0; commitlog_total_space_in_mb=null; compaction_large_partition_warning_threshold_mb=100; compaction_throughput_mb_per_sec=0; concurrent_compactors=4; concurrent_counter_writes=32; concurrent_materialized_view_writes=32; concurrent_reads=32; concurrent_replicates=null; concurrent_writes=32; counter_cache_keys_to_save=2147483647; counter_cache_save_period=7200; counter_cache_size_in_mb=null; counter_write_request_timeout_in_ms=5000; credentials_cache_max_entries=1000; credentials_update_interval_in_ms=-1; credentials_validity_in_ms=2000; cross_node_timeout=false; data_file_directories=[Ljava.lang.String;@4c178a76; disk_access_mode=mmap; disk_failure_policy=ignore; disk_optimization_estimate_percentile=0.95; disk_optimization_page_cross_chance=0.1; disk_optimization_strategy=ssd; dynamic_snitch=true; dynamic_snitch_badness_threshold=0.1; dynamic_snitch_reset_interval_in_ms=60; dynamic_snitch_update_interval_in_ms=100; enable_scripted_user_defined_functions=true; enable_user_defined_functions=true; enable_user_defined_functions_threads=true; encryption_options=null; endpoint_snitch=org.apache.cassandra.locator.SimpleSnitch; file_cache_size_in_mb=null; gc_log_threshold_in_ms=200; gc_warn_threshold_in_ms=1000; hinted_handoff_disabled_datacenters=[]; hinted_handoff_enabled=true; hinted_handoff_throttle_in_kb=1024; hints_compression=null; hints_directory=build/test/cassandra/hints:331; hints_flush_period_in_ms=1; ideal_consistency_level=null; incremental_backups=true; index_summary_capacity_in_mb=null; index_summary_resize_interval_in_minutes=60; initial_token=null; inter_dc_stream_throughput_outbound_megabits_per_sec=200; inter_dc_tcp_nodelay=true; internode_authenticator=null; internode_compression=none; internode_recv_buff_size_in_bytes=0; internode_send_buff_size_in_bytes=0; key_cache_keys_to_save=2147483647; key_cache_save_period=14400; 
key_cache_size_in_mb=null; listen_address=127.0.0.1; listen_interface=null; listen_interface_prefer_ipv6=false; listen_on_broadcast_address=false; max_hint_window_in_ms=1080; max_hints_delivery_threads=2; max_hints_file_size_in_mb=128; max_mutation_size_in_kb=null; max_streaming_retries=3; max_value_size_in_mb=256; memtable_allocation_type=offheap_objects; memtable_cleanup_threshold=null; memtable_flush_writers=0; memtable_heap_space_in_mb=null; memtable_offheap_space_in_mb=null; min_free_space_per_drive_in_mb=50; native_transport_max_concurrent_connections=-1; native_transport_max_concurrent_connections_per_ip=-1; native_transport_max_frame_size_in_mb=256; native_transport_max_threads=128; native_transport_port=9373; native_transport_port_ssl=null; num_tokens=1; otc_coalescing_enough_coalesced_messages=8; otc_coalescing_strategy=DISABLED; otc_coalescing_window_us=200;
[jira] [Updated] (CASSANDRA-13257) Add repair streaming preview
[ https://issues.apache.org/jira/browse/CASSANDRA-13257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-13257:
    Reviewer: Marcus Eriksson  (was: Yuki Morishita)

> Add repair streaming preview
>
> Key: CASSANDRA-13257
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13257
> Project: Cassandra
> Issue Type: New Feature
> Components: Streaming and Messaging
> Reporter: Blake Eggleston
> Assignee: Blake Eggleston
> Fix For: 4.0
>
> It would be useful to be able to estimate the amount of repair streaming that
> needs to be done, without actually doing any streaming. Our main motivation
> for having something like this is validating CASSANDRA-9143 in production,
> but I’d imagine it could also be a useful tool in troubleshooting.
[jira] [Commented] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965188#comment-15965188 ]

Blake Eggleston commented on CASSANDRA-12661:

Committed as {{1096f9f5e77ff7b17cc7f9fe5aba008834899251}}, thanks!

> Make gc_log and gc_warn settable at runtime
> ---
>
> Key: CASSANDRA-12661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12661
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Edward Capriolo
> Priority: Minor
>
> Changes:
> * Move gc_log_threshold_in_ms and gc_warn_threshold_in_ms close together in the config
> * rename variables to match properties
> * add unit tests to ensure hydration
> * add unit tests to ensure variables are set properly
> * minor perf (do not construct string from buffer if not logging)
[jira] [Updated] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Blake Eggleston updated CASSANDRA-12661:
    Resolution: Fixed
    Assignee: Edward Capriolo
    Status: Resolved  (was: Patch Available)

> Make gc_log and gc_warn settable at runtime
> ---
>
> Key: CASSANDRA-12661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12661
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Edward Capriolo
> Assignee: Edward Capriolo
> Priority: Minor
>
> Changes:
> * Move gc_log_threshold_in_ms and gc_warn_threshold_in_ms close together in the config
> * rename variables to match properties
> * add unit tests to ensure hydration
> * add unit tests to ensure variables are set properly
> * minor perf (do not construct string from buffer if not logging)
cassandra git commit: Make gc_log and gc_warn settable at runtime
Repository: cassandra
Updated Branches:
  refs/heads/trunk b207f0d95 -> 1096f9f5e

Make gc_log and gc_warn settable at runtime

Patch by Edward Capriolo; reviewed by Jon Haddad for CASSANDRA-12661

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/1096f9f5
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/1096f9f5
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/1096f9f5

Branch: refs/heads/trunk
Commit: 1096f9f5e77ff7b17cc7f9fe5aba008834899251
Parents: b207f0d
Author: Edward Capriolo
Authored: Sat Sep 17 12:07:20 2016 -0400
Committer: Blake Eggleston
Committed: Tue Apr 11 17:14:11 2017 -0700

----------------------------------------------------------------------
 CHANGES.txt                                     |  1 +
 conf/cassandra.yaml                             | 14 ++--
 .../org/apache/cassandra/config/Config.java     |  2 +-
 .../apache/cassandra/service/GCInspector.java   | 75
 .../cassandra/service/GCInspectorMXBean.java    | 11 ++-
 .../cassandra/service/GCInspectorTest.java      | 60
 6 files changed, 139 insertions(+), 24 deletions(-)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1096f9f5/CHANGES.txt
----------------------------------------------------------------------
diff --git a/CHANGES.txt b/CHANGES.txt
index 5c38307..4f3cb3b 100644
--- a/CHANGES.txt
+++ b/CHANGES.txt
@@ -1,4 +1,5 @@
 4.0
+ * Make gc_log and gc_warn settable at runtime (CASSANDRA-12661)
  * Take number of files in L0 in account when estimating remaining compaction tasks (CASSANDRA-13354)
  * Skip building views during base table streams on range movements (CASSANDRA-13065)
  * Improve error messages for +/- operations on maps and tuples (CASSANDRA-13197)

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1096f9f5/conf/cassandra.yaml
----------------------------------------------------------------------
diff --git a/conf/cassandra.yaml b/conf/cassandra.yaml
index f2c4c84..1c54830 100644
--- a/conf/cassandra.yaml
+++ b/conf/cassandra.yaml
@@ -981,10 +981,6 @@ inter_dc_tcp_nodelay: false
 tracetype_query_ttl: 86400
 tracetype_repair_ttl: 604800
 
-# By default, Cassandra logs GC Pauses greater than 200 ms at INFO level
-# This threshold can be adjusted to minimize logging if necessary
-# gc_log_threshold_in_ms: 200
-
 # If unset, all GC Pauses greater than gc_log_threshold_in_ms will log at
 # INFO level
 # UDFs (user defined functions) are disabled by default.
@@ -1062,10 +1058,14 @@ unlogged_batch_across_partitions_warn_threshold: 10
 # Log a warning when compacting partitions larger than this value
 compaction_large_partition_warning_threshold_mb: 100
 
+# GC Pauses greater than 200 ms will be logged at INFO level
+# This threshold can be adjusted to minimize logging if necessary
+# gc_log_threshold_in_ms: 200
+
 # GC Pauses greater than gc_warn_threshold_in_ms will be logged at WARN level
-# Adjust the threshold based on your application throughput requirement
-# By default, Cassandra logs GC Pauses greater than 200 ms at INFO level
-gc_warn_threshold_in_ms: 1000
+# Adjust the threshold based on your application throughput requirement. Setting to 0
+# will deactivate the feature.
+# gc_warn_threshold_in_ms: 1000
 
 # Maximum size of any value in SSTables. Safety measure to detect SSTable corruption
 # early. Any value size larger than this threshold will result into marking an SSTable

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1096f9f5/src/java/org/apache/cassandra/config/Config.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/config/Config.java b/src/java/org/apache/cassandra/config/Config.java
index 1461cd4..b86429c 100644
--- a/src/java/org/apache/cassandra/config/Config.java
+++ b/src/java/org/apache/cassandra/config/Config.java
@@ -267,7 +267,7 @@ public class Config
     public volatile int index_summary_resize_interval_in_minutes = 60;
 
     public int gc_log_threshold_in_ms = 200;
-    public int gc_warn_threshold_in_ms = 0;
+    public int gc_warn_threshold_in_ms = 1000;
 
     // TTL for different types of trace events.
     public int tracetype_query_ttl = (int) TimeUnit.DAYS.toSeconds(1);

http://git-wip-us.apache.org/repos/asf/cassandra/blob/1096f9f5/src/java/org/apache/cassandra/service/GCInspector.java
----------------------------------------------------------------------
diff --git a/src/java/org/apache/cassandra/service/GCInspector.java b/src/java/org/apache/cassandra/service/GCInspector.java
index e7cfcd0..b016249 100644
--- a/src/java/org/apache/cassandra/service/GCInspector.java
+++ b/src/java/org/apache/cassandra/service/GCInspector.java
@@ -17,6 +17,7 @@
  */
 package org.apache.cassandra.service;
 
+import java.io.IOException;
 import
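
The two thresholds in this commit interact: pauses above `gc_log_threshold_in_ms` log at INFO, pauses above `gc_warn_threshold_in_ms` log at WARN, and a warn threshold of 0 turns the warning off. The sketch below shows one plausible shape for runtime setters that keep the pair consistent. It is an assumption-laden illustration in Python, not the Java code from the commit: the class and method names are hypothetical, and the exact rejection rules are guessed from the yaml comments and from the `ensureLogLessThanWarn` test name. Only the defaults (200 and 1000) come from the diff.

```python
class GcThresholds:
    """Hypothetical holder for the two GC logging thresholds (milliseconds)."""

    def __init__(self, log_ms=200, warn_ms=1000):
        self.log_ms = log_ms
        self.warn_ms = warn_ms

    def set_log_ms(self, value):
        if value <= 0:
            raise ValueError("log threshold must be positive")
        # Assumed rule: the INFO threshold may not exceed an active WARN threshold.
        if self.warn_ms != 0 and value > self.warn_ms:
            raise ValueError("log threshold must not exceed warn threshold")
        self.log_ms = value

    def set_warn_ms(self, value):
        if value < 0:
            raise ValueError("warn threshold must not be negative")
        # 0 deactivates the warning, so it is exempt from the ordering check.
        if value != 0 and value < self.log_ms:
            raise ValueError("warn threshold must not be below log threshold")
        self.warn_ms = value

    def level_for(self, pause_ms):
        """Return the log level for a GC pause, or None if it is not logged."""
        if self.warn_ms != 0 and pause_ms > self.warn_ms:
            return "WARN"
        if pause_ms > self.log_ms:
            return "INFO"
        return None
```

Validating in the setters rather than at logging time means a bad value is rejected at the moment an operator tries to apply it.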
[jira] [Commented] (CASSANDRA-12661) Make gc_log and gc_warn settable at runtime
[ https://issues.apache.org/jira/browse/CASSANDRA-12661?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15965112#comment-15965112 ]

Jon Haddad commented on CASSANDRA-12661:

Sorry for the long delay. I've had a million things on my plate. This looks good, I'm +1 to merge it.

> Make gc_log and gc_warn settable at runtime
> ---
>
> Key: CASSANDRA-12661
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12661
> Project: Cassandra
> Issue Type: New Feature
> Reporter: Edward Capriolo
> Priority: Minor
>
> Changes:
> * Move gc_log_threshold_in_ms and gc_warn_threshold_in_ms close together in the config
> * rename variables to match properties
> * add unit tests to ensure hydration
> * add unit tests to ensure variables are set properly
> * minor perf (do not construct string from buffer if not logging)
[jira] [Comment Edited] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread
[ https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964943#comment-15964943 ]

Ariel Weisberg edited comment on CASSANDRA-13265 at 4/11/17 8:51 PM:

There seem to be some build issues in various branches? Maybe because I rebased?

You should register with CircleCI so it will automatically build and run the unit tests for you out of your repo when you commit. When you rebase there will be a circle.yml in each branch that will automatically have it run the build.

||Code|utests|dtests||
|[2.2|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-2.2]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-2%2E2]||
|[3.0|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.0]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E0]||
|[3.11|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.11]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E11]||
|[trunk|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-trunk]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13625-trunk]||

was (Author: aweisberg):
There seem to be some build issues in various branches? Maybe because I rebased?

You should register with CircleCI so it will automatically build and run the unit tests for you out of your repo when you commit.

||Code|utests|dtests||
|[2.2|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-2.2]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-2%2E2]||
|[3.0|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.0]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E0]||
|[3.11|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.11]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E11]||
|[trunk|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-trunk]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13625-trunk]||

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
> Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 1.8.0_112-b15)
> Linux 3.16
> Reporter: Christian Esken
> Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz,
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
> I observed that sometimes a single node in a Cassandra cluster fails to
> communicate to the other nodes. This can happen at any time, during peak load
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going into details, I want to state that I have analyzed the
> situation and am already developing a possible fix. Here is the analysis so
> far:
> - A Threaddump in this situation showed 324 Threads in the
> OutboundTcpConnection class that want to lock the backlog queue for doing
> expiration.
> - A class histogram shows 262508 instances of
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain
> number of queued messages, it starts thrashing itself to death. Each of the
> Threads fully locks the Queue for reading and writing by calling
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operations can it progress with actually
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and
> fully lock the Queue.
> This means: Writing blocks the Queue for reading, and readers might even be
> starved, which makes the situation even worse.
> ----
> The setup is:
> - 3-node cluster
> - replication factor 2
> - Consistency LOCAL_ONE
> - No remote DCs
> - high write throughput (10 INSERT statements per second and more during
> peak times).
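
One general way to stop many threads from piling up on the backlog for expiration is to let at most one thread run the expiration pass while the others skip it and carry on. The sketch below illustrates only that idea. It is not the fix being developed for this ticket, it is written in Python rather than Java for brevity, and `Backlog`, `enqueue`, and `expire_if_idle` are hypothetical names. It also assumes messages are queued in timestamp order and that nothing else removes from the head while a pass runs.

```python
import threading
import time
from collections import deque


class Backlog:
    """Toy message backlog where only one thread at a time expires entries."""

    def __init__(self, ttl_seconds):
        self._ttl = ttl_seconds
        self._queue = deque()  # entries are (enqueue_time, message)
        self._expire_lock = threading.Lock()

    def enqueue(self, message, now=None):
        now = time.monotonic() if now is None else now
        self._queue.append((now, message))
        self.expire_if_idle(now)

    def expire_if_idle(self, now):
        """Drop expired entries from the head; return how many were dropped."""
        # Non-blocking acquire: if another thread is already expiring,
        # skip the pass instead of queueing up behind it.
        if not self._expire_lock.acquire(blocking=False):
            return 0
        try:
            dropped = 0
            # Oldest entries sit at the head, so stop at the first live one.
            while self._queue and now - self._queue[0][0] > self._ttl:
                self._queue.popleft()
                dropped += 1
            return dropped
        finally:
            self._expire_lock.release()

    def __len__(self):
        return len(self._queue)
```

The design choice here is that a skipped expiration pass costs nothing, because the thread that holds the lock is already doing the same work.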
[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread
[ https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964943#comment-15964943 ]

Ariel Weisberg commented on CASSANDRA-13265:

There seem to be some build issues in various branches? Maybe because I rebased?

You should register with CircleCI so it will automatically build and run the unit tests for you out of your repo when you commit.

||Code|utests|dtests||
|[2.2|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-2.2]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-2%2E2]||
|[3.0|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.0]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E0]||
|[3.11|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-3.11]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13265-3%2E11]||
|[trunk|https://github.com/aweisberg/cassandra/pull/new/cassandra-13265-trunk]|[utests|https://circleci.com/gh/aweisberg/cassandra/tree/cassandra-13625-trunk]||

> Expiration in OutboundTcpConnection can block the reader Thread
> ---
>
> Key: CASSANDRA-13265
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13265
> Project: Cassandra
> Issue Type: Bug
> Environment: Cassandra 3.0.9
> Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version 1.8.0_112-b15)
> Linux 3.16
> Reporter: Christian Esken
> Assignee: Christian Esken
> Fix For: 3.0.x
>
> Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz,
> cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz
>
> I observed that sometimes a single node in a Cassandra cluster fails to
> communicate to the other nodes. This can happen at any time, during peak load
> or low load. Restarting that single node from the cluster fixes the issue.
> Before going into details, I want to state that I have analyzed the
> situation and am already developing a possible fix. Here is the analysis so
> far:
> - A Threaddump in this situation showed 324 Threads in the
> OutboundTcpConnection class that want to lock the backlog queue for doing
> expiration.
> - A class histogram shows 262508 instances of
> OutboundTcpConnection$QueuedMessage.
> What is the effect of it? As soon as the Cassandra node has reached a certain
> number of queued messages, it starts thrashing itself to death. Each of the
> Threads fully locks the Queue for reading and writing by calling
> iterator.next(), making the situation worse and worse.
> - Writing: Only after 262508 locking operations can it progress with actually
> writing to the Queue.
> - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and
> fully lock the Queue.
> This means: Writing blocks the Queue for reading, and readers might even be
> starved, which makes the situation even worse.
> ----
> The setup is:
> - 3-node cluster
> - replication factor 2
> - Consistency LOCAL_ONE
> - No remote DCs
> - high write throughput (10 INSERT statements per second and more during
> peak times).
[jira] [Comment Edited] (CASSANDRA-12805) Website documentation for commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964821#comment-15964821 ]

Evan Prothro edited comment on CASSANDRA-12805 at 4/11/17 7:05 PM:

LGTM in {{22acb00235ee081d3555cb1ff2780805e0268b07}}, thanks!

was (Author: eprothro):
LGTM in 22acb00235ee081d3555cb1ff2780805e0268b07, thanks!

> Website documentation for commitlog
> ---
>
> Key: CASSANDRA-12805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12805
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Reporter: Hau Phan
> Assignee: Hau Phan
> Priority: Minor
> Labels: documentation
> Attachments: 12805-trunk.txt
>
> Updated Storage Engine page for commitlogs
> Commit:
> https://github.com/nothau/cassandra/commit/f90038e9f35281bdd58dabb25f21836a690e56f5
[jira] [Commented] (CASSANDRA-12805) Website documentation for commitlog
[ https://issues.apache.org/jira/browse/CASSANDRA-12805?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964821#comment-15964821 ]

Evan Prothro commented on CASSANDRA-12805:

LGTM in 22acb00235ee081d3555cb1ff2780805e0268b07, thanks!

> Website documentation for commitlog
> ---
>
> Key: CASSANDRA-12805
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12805
> Project: Cassandra
> Issue Type: Improvement
> Components: Documentation and Website
> Reporter: Hau Phan
> Assignee: Hau Phan
> Priority: Minor
> Labels: documentation
> Attachments: 12805-trunk.txt
>
> Updated Storage Engine page for commitlogs
> Commit:
> https://github.com/nothau/cassandra/commit/f90038e9f35281bdd58dabb25f21836a690e56f5
[cassandra] Git Push Summary
Repository: cassandra Updated Tags: refs/tags/3.0.13-tentative [created] 91661ec29
[6/6] cassandra git commit: Merge branch 'cassandra-3.11' into trunk
Merge branch 'cassandra-3.11' into trunk

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/b207f0d9
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/b207f0d9
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/b207f0d9

Branch: refs/heads/trunk
Commit: b207f0d95d0554c4beb344e35261027a00082081
Parents: f6f5012 fe8e211
Author: Michael Shuler
Authored: Tue Apr 11 12:54:40 2017 -0500
Committer: Michael Shuler
Committed: Tue Apr 11 12:54:40 2017 -0500

----------------------------------------------------------------------
----------------------------------------------------------------------
[3/6] cassandra git commit: Prep 3.0.13 release
Prep 3.0.13 release

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/91661ec2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/91661ec2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/91661ec2

Branch: refs/heads/trunk
Commit: 91661ec296c6d089e3238e1a72f3861c449326aa
Parents: f63ea27
Author: Michael Shuler
Authored: Tue Apr 11 12:54:01 2017 -0500
Committer: Michael Shuler
Committed: Tue Apr 11 12:54:01 2017 -0500

----------------------------------------------------------------------
 debian/changelog | 6 ++
 1 file changed, 6 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/91661ec2/debian/changelog
----------------------------------------------------------------------
diff --git a/debian/changelog b/debian/changelog
index 5ced67c..6067541 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+cassandra (3.0.13) unstable; urgency=medium
+
+  * New release
+
+ -- Michael Shuler Tue, 11 Apr 2017 12:52:26 -0500
+
 cassandra (3.0.12) unstable; urgency=medium
 
   * New release
[4/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fe8e2110
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fe8e2110
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fe8e2110

Branch: refs/heads/trunk
Commit: fe8e21109371c5d768204da4076d168c8b2fc01a
Parents: c975142 91661ec
Author: Michael Shuler
Authored: Tue Apr 11 12:54:15 2017 -0500
Committer: Michael Shuler
Committed: Tue Apr 11 12:54:15 2017 -0500

----------------------------------------------------------------------
----------------------------------------------------------------------
[1/6] cassandra git commit: Prep 3.0.13 release
Repository: cassandra
Updated Branches:
  refs/heads/cassandra-3.0 f63ea2727 -> 91661ec29
  refs/heads/cassandra-3.11 c97514243 -> fe8e21109
  refs/heads/trunk f6f50129d -> b207f0d95

Prep 3.0.13 release

Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo
Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/91661ec2
Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/91661ec2
Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/91661ec2

Branch: refs/heads/cassandra-3.0
Commit: 91661ec296c6d089e3238e1a72f3861c449326aa
Parents: f63ea27
Author: Michael Shuler
Authored: Tue Apr 11 12:54:01 2017 -0500
Committer: Michael Shuler
Committed: Tue Apr 11 12:54:01 2017 -0500

----------------------------------------------------------------------
 debian/changelog | 6 ++
 1 file changed, 6 insertions(+)
----------------------------------------------------------------------

http://git-wip-us.apache.org/repos/asf/cassandra/blob/91661ec2/debian/changelog
----------------------------------------------------------------------
diff --git a/debian/changelog b/debian/changelog
index 5ced67c..6067541 100644
--- a/debian/changelog
+++ b/debian/changelog
@@ -1,3 +1,9 @@
+cassandra (3.0.13) unstable; urgency=medium
+
+  * New release
+
+ -- Michael Shuler Tue, 11 Apr 2017 12:52:26 -0500
+
 cassandra (3.0.12) unstable; urgency=medium
 
   * New release
[2/6] cassandra git commit: Prep 3.0.13 release
Prep 3.0.13 release Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/91661ec2 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/91661ec2 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/91661ec2 Branch: refs/heads/cassandra-3.11 Commit: 91661ec296c6d089e3238e1a72f3861c449326aa Parents: f63ea27 Author: Michael Shuler Authored: Tue Apr 11 12:54:01 2017 -0500 Committer: Michael Shuler Committed: Tue Apr 11 12:54:01 2017 -0500 -- debian/changelog | 6 ++ 1 file changed, 6 insertions(+) -- http://git-wip-us.apache.org/repos/asf/cassandra/blob/91661ec2/debian/changelog -- diff --git a/debian/changelog b/debian/changelog index 5ced67c..6067541 100644 --- a/debian/changelog +++ b/debian/changelog @@ -1,3 +1,9 @@ +cassandra (3.0.13) unstable; urgency=medium + + * New release + + -- Michael Shuler Tue, 11 Apr 2017 12:52:26 -0500 + cassandra (3.0.12) unstable; urgency=medium * New release
[5/6] cassandra git commit: Merge branch 'cassandra-3.0' into cassandra-3.11
Merge branch 'cassandra-3.0' into cassandra-3.11 Project: http://git-wip-us.apache.org/repos/asf/cassandra/repo Commit: http://git-wip-us.apache.org/repos/asf/cassandra/commit/fe8e2110 Tree: http://git-wip-us.apache.org/repos/asf/cassandra/tree/fe8e2110 Diff: http://git-wip-us.apache.org/repos/asf/cassandra/diff/fe8e2110 Branch: refs/heads/cassandra-3.11 Commit: fe8e21109371c5d768204da4076d168c8b2fc01a Parents: c975142 91661ec Author: Michael Shuler Authored: Tue Apr 11 12:54:15 2017 -0500 Committer: Michael Shuler Committed: Tue Apr 11 12:54:15 2017 -0500 -- --
[cassandra] Git Push Summary
Repository: cassandra Updated Branches: refs/heads/3.0 [deleted] 2d6fd7824
[jira] [Updated] (CASSANDRA-13228) SASI index on partition key part doesn't match
[ https://issues.apache.org/jira/browse/CASSANDRA-13228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrés de la Peña updated CASSANDRA-13228: -- Status: Patch Available (was: In Progress) > SASI index on partition key part doesn't match > -- > > Key: CASSANDRA-13228 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13228 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Hannu Kröger >Assignee: Andrés de la Peña > Labels: sasi > > I created a SASI index on first part of multi-part partition key. Running > query using that index doesn't seem to work. > I have here a log of queries that should indicate the issue: > {code}cqlsh:test> CREATE TABLE test1(name text, event_date date, data_type > text, bytes int, PRIMARY KEY ((name, event_date), data_type)); > cqlsh:test> CREATE CUSTOM INDEX test_index ON test1(name) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > cqlsh:test> INSERT INTO test1(name, event_date, data_type, bytes) > values('1234', '2010-01-01', 'sensor', 128); > cqlsh:test> INSERT INTO test1(name, event_date, data_type, bytes) > values('abcd', '2010-01-02', 'sensor', 500); > cqlsh:test> select * from test1 where NAME = '1234'; > name | event_date | data_type | bytes > --++---+--- > (0 rows) > cqlsh:test> CONSISTENCY ALL; > Consistency level set to ALL. > cqlsh:test> select * from test1 where NAME = '1234'; > name | event_date | data_type | bytes > --++---+--- > (0 rows){code} > Note! Creating a SASI index on single part partition key, SASI index creation > fails. Apparently this should not work at all, so is it about missing > validation on index creation? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13228) SASI index on partition key part doesn't match
[ https://issues.apache.org/jira/browse/CASSANDRA-13228?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964618#comment-15964618 ] Andrés de la Peña commented on CASSANDRA-13228: --- Here is a patch forbidding the creation of SASI indexes over partition key columns during index options validation: ||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:13228-3.11]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13228-3.11-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13228-3.11-dtest/]| ||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:13228-trunk]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13228-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13228-trunk-dtest/]| > SASI index on partition key part doesn't match > -- > > Key: CASSANDRA-13228 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13228 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Hannu Kröger >Assignee: Andrés de la Peña > Labels: sasi > > I created a SASI index on first part of multi-part partition key. Running > query using that index doesn't seem to work. > I have here a log of queries that should indicate the issue: > {code}cqlsh:test> CREATE TABLE test1(name text, event_date date, data_type > text, bytes int, PRIMARY KEY ((name, event_date), data_type)); > cqlsh:test> CREATE CUSTOM INDEX test_index ON test1(name) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > cqlsh:test> INSERT INTO test1(name, event_date, data_type, bytes) > values('1234', '2010-01-01', 'sensor', 128); > cqlsh:test> INSERT INTO test1(name, event_date, data_type, bytes) > values('abcd', '2010-01-02', 'sensor', 500); > cqlsh:test> select * from test1 where NAME = '1234'; > name | event_date | data_type | bytes > --++---+--- > (0 rows) > cqlsh:test> CONSISTENCY ALL; > Consistency level set to ALL. 
> cqlsh:test> select * from test1 where NAME = '1234'; > name | event_date | data_type | bytes > --++---+--- > (0 rows){code} > Note! Creating a SASI index on single part partition key, SASI index creation > fails. Apparently this should not work at all, so is it about missing > validation on index creation? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread
[ https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964613#comment-15964613 ] Ariel Weisberg commented on CASSANDRA-13265: Squashing is preferred, but I like to keep the history. When I squash I create a second -squashed branch to hold the squash commit. I never delete branches and tolerate the cluttered namespace. If we used pull requests I would delete branches since the pull request preserves the information, but we don't :-( Since people tend to work on multiple tickets they don't name the branch cassandra-3.0 they do something like cassandra-13625-3.0. The commit process I follow is http://cassandra.apache.org/doc/latest/development/how_to_commit.html. For the commit message don't list multiple reviewers just the one in the JIRA (me). I have been told that line is automatically parsed so we want to stick to the expected format. Also some OCD people want a line break in between the first and last line of the commit message. Having CHANGES.TXT in the patch is helpful so I don't forget to add it in one branch. If it's not there and I follow the commit process I have to add it at each branch. For the entry also include the ticket number in parens at the end. I'll kick off the tests now. > Expiration in OutboundTcpConnection can block the reader Thread > --- > > Key: CASSANDRA-13265 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13265 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.0.9 > Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version > 1.8.0_112-b15) > Linux 3.16 >Reporter: Christian Esken >Assignee: Christian Esken > Fix For: 3.0.x > > Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, > cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz > > > I observed that sometimes a single node in a Cassandra cluster fails to > communicate to the other nodes. This can happen at any time, during peak load > or low load. 
Restarting that single node from the cluster fixes the issue. > Before going into details, I want to state that I have analyzed the > situation and am already developing a possible fix. Here is the analysis so > far: > - A Threaddump in this situation showed 324 Threads in the > OutboundTcpConnection class that want to lock the backlog queue for doing > expiration. > - A class histogram shows 262508 instances of > OutboundTcpConnection$QueuedMessage. > What is the effect of it? As soon as the Cassandra node has reached a certain > amount of queued messages, it starts thrashing itself to death. Each of the > Threads fully locks the Queue for reading and writing by calling > iterator.next(), making the situation worse and worse. > - Writing: Only after 262508 locking operations can it progress with actually > writing to the Queue. > - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and > fully lock the Queue > This means: Writing blocks the Queue for reading, and readers might even be > starved which makes the situation even worse. > - > The setup is: > - 3-node cluster > - replication factor 2 > - Consistency LOCAL_ONE > - No remote DC's > - high write throughput (10 INSERT statements per second and more during > peak times). > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Assigned] (CASSANDRA-13228) SASI index on partition key part doesn't match
[ https://issues.apache.org/jira/browse/CASSANDRA-13228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrés de la Peña reassigned CASSANDRA-13228: - Assignee: Andrés de la Peña (was: Alex Petrov) > SASI index on partition key part doesn't match > -- > > Key: CASSANDRA-13228 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13228 > Project: Cassandra > Issue Type: Bug > Components: sasi >Reporter: Hannu Kröger >Assignee: Andrés de la Peña > Labels: sasi > > I created a SASI index on first part of multi-part partition key. Running > query using that index doesn't seem to work. > I have here a log of queries that should indicate the issue: > {code}cqlsh:test> CREATE TABLE test1(name text, event_date date, data_type > text, bytes int, PRIMARY KEY ((name, event_date), data_type)); > cqlsh:test> CREATE CUSTOM INDEX test_index ON test1(name) USING > 'org.apache.cassandra.index.sasi.SASIIndex'; > cqlsh:test> INSERT INTO test1(name, event_date, data_type, bytes) > values('1234', '2010-01-01', 'sensor', 128); > cqlsh:test> INSERT INTO test1(name, event_date, data_type, bytes) > values('abcd', '2010-01-02', 'sensor', 500); > cqlsh:test> select * from test1 where NAME = '1234'; > name | event_date | data_type | bytes > --++---+--- > (0 rows) > cqlsh:test> CONSISTENCY ALL; > Consistency level set to ALL. > cqlsh:test> select * from test1 where NAME = '1234'; > name | event_date | data_type | bytes > --++---+--- > (0 rows){code} > Note! Creating a SASI index on single part partition key, SASI index creation > fails. Apparently this should not work at all, so is it about missing > validation on index creation? -- This message was sent by Atlassian JIRA (v6.3.15#6346)
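An earlier comment on this ticket proposes forbidding the creation of SASI indexes over partition key columns during index options validation. A minimal sketch of that kind of check follows. It is not the actual patch: the class name, method name, and error message are hypothetical, and the table is modelled as nothing more than a list of partition key column names.

```java
import java.util.List;

// Hypothetical, simplified model. These names are NOT Cassandra's real classes.
final class SasiTargetValidation {

    /**
     * Rejects an index target that is part of the partition key.
     * Any other column passes this particular check.
     */
    static void validateTarget(String target, List<String> partitionKeyColumns) {
        if (partitionKeyColumns.contains(target)) {
            throw new IllegalArgumentException(
                "Cannot create SASI index on partition key column '" + target + "'");
        }
    }

    public static void main(String[] args) {
        // Partition key of the test1 table from the report: (name, event_date).
        List<String> partitionKey = List.of("name", "event_date");

        validateTarget("bytes", partitionKey); // regular column: no exception

        try {
            validateTarget("name", partitionKey); // partition key part: rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With a check of this shape, the CREATE CUSTOM INDEX statement from the report would fail at creation time instead of producing an index that silently returns no rows.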
[jira] [Commented] (CASSANDRA-13265) Expiration in OutboundTcpConnection can block the reader Thread
[ https://issues.apache.org/jira/browse/CASSANDRA-13265?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964522#comment-15964522 ] Christian Esken commented on CASSANDRA-13265: - Done. 2 organizational topics left: I will add the required line to the commit message. Looks OK?!? bq. patch by Christian Esken; reviewed by Ariel Weisberg and Jason Brown for CASSANDRA-13265 My proposal for CHANGES.txt would be the following text. Can you do that, Ariel? I do not know in which versions to add that, as they are upcoming versions. bq. Expire OutboundTcpConnection messages by a single Thread Here are the branches. The cassandra-3.0 branch is already squashed. If that branch is OK, I will also squash the other 3 branches. https://github.com/christian-esken/cassandra/commits/cassandra-3.0 https://github.com/christian-esken/cassandra/commits/cassandra-3.11 https://github.com/christian-esken/cassandra/commits/trunk https://github.com/christian-esken/cassandra/commits/cassandra-2.2 > Expiration in OutboundTcpConnection can block the reader Thread > --- > > Key: CASSANDRA-13265 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13265 > Project: Cassandra > Issue Type: Bug > Environment: Cassandra 3.0.9 > Java HotSpot(TM) 64-Bit Server VM version 25.112-b15 (Java version > 1.8.0_112-b15) > Linux 3.16 >Reporter: Christian Esken >Assignee: Christian Esken > Fix For: 3.0.x > > Attachments: cassandra.pb-cache4-dus.2017-02-17-19-36-26.chist.xz, > cassandra.pb-cache4-dus.2017-02-17-19-36-26.td.xz > > > I observed that sometimes a single node in a Cassandra cluster fails to > communicate to the other nodes. This can happen at any time, during peak load > or low load. Restarting that single node from the cluster fixes the issue. > Before going into details, I want to state that I have analyzed the > situation and am already developing a possible fix. 
Here is the analysis so > far: > - A Threaddump in this situation showed 324 Threads in the > OutboundTcpConnection class that want to lock the backlog queue for doing > expiration. > - A class histogram shows 262508 instances of > OutboundTcpConnection$QueuedMessage. > What is the effect of it? As soon as the Cassandra node has reached a certain > amount of queued messages, it starts thrashing itself to death. Each of the > Threads fully locks the Queue for reading and writing by calling > iterator.next(), making the situation worse and worse. > - Writing: Only after 262508 locking operations can it progress with actually > writing to the Queue. > - Reading: Is also blocked, as 324 Threads try to do iterator.next(), and > fully lock the Queue > This means: Writing blocks the Queue for reading, and readers might even be > starved which makes the situation even worse. > - > The setup is: > - 3-node cluster > - replication factor 2 > - Consistency LOCAL_ONE > - No remote DC's > - high write throughput (10 INSERT statements per second and more during > peak times). > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
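The proposed CHANGES.txt entry, "Expire OutboundTcpConnection messages by a single Thread", can be illustrated with a small sketch. This is not the code from the linked branches: the class and method names below are hypothetical, and the backlog is reduced to a plain LinkedBlockingQueue. The idea shown is that an AtomicBoolean lets only one thread at a time walk the queue for expired entries, while every other caller returns immediately instead of queueing up behind the iterator's locks.

```java
import java.util.Iterator;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical stand-in for the backlog; not the real OutboundTcpConnection code.
final class SingleExpirerBacklog {

    /** A queued message reduced to its expiration timestamp. */
    static final class Msg {
        final long expiresAtNanos;
        Msg(long expiresAtNanos) { this.expiresAtNanos = expiresAtNanos; }
    }

    private final LinkedBlockingQueue<Msg> backlog = new LinkedBlockingQueue<>();

    // True while some thread is running the expiration pass.
    private final AtomicBoolean expiring = new AtomicBoolean(false);

    /** Adds a message, then attempts an expiration pass without ever waiting for one. */
    void enqueue(Msg m, long nowNanos) {
        backlog.add(m);
        expireIfIdle(nowNanos);
    }

    /**
     * Removes expired messages if no other thread is already doing so.
     * Returns the number removed, or -1 if another thread holds the guard.
     */
    int expireIfIdle(long nowNanos) {
        if (!expiring.compareAndSet(false, true)) {
            return -1; // someone else is expiring: skip rather than contend
        }
        try {
            int removed = 0;
            Iterator<Msg> it = backlog.iterator();
            while (it.hasNext()) {
                if (it.next().expiresAtNanos <= nowNanos) {
                    it.remove();
                    removed++;
                }
            }
            return removed;
        } finally {
            expiring.set(false);
        }
    }

    int size() { return backlog.size(); }
}
```

The trade-off in this sketch is that a writer may occasionally skip an expiration pass that was due, which is acceptable as long as the next caller picks it up.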
[jira] [Commented] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()
[ https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964465#comment-15964465 ] Corentin Chary commented on CASSANDRA-13432: I checked, 3.x has a different code to count tombstones so it's likely not affected > MemtableReclaimMemory can get stuck because of lack of timeout in > getTopLevelColumns() > -- > > Key: CASSANDRA-13432 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13432 > Project: Cassandra > Issue Type: Bug >Reporter: Corentin Chary > Fix For: 2.1.x > > > This might affect 3.x too, I'm not sure. > {code} > $ nodetool tpstats > Pool NameActive Pending Completed Blocked All > time blocked > MutationStage 0 0 32135875 0 > 0 > ReadStage 114 0 29492940 0 > 0 > RequestResponseStage 0 0 86090931 0 > 0 > ReadRepairStage 0 0 166645 0 > 0 > CounterMutationStage 0 0 0 0 > 0 > MiscStage 0 0 0 0 > 0 > HintedHandoff 0 0 47 0 > 0 > GossipStage 0 0 188769 0 > 0 > CacheCleanupExecutor 0 0 0 0 > 0 > InternalResponseStage 0 0 0 0 > 0 > CommitLogArchiver 0 0 0 0 > 0 > CompactionExecutor0 0 86835 0 > 0 > ValidationExecutor0 0 0 0 > 0 > MigrationStage0 0 0 0 > 0 > AntiEntropyStage 0 0 0 0 > 0 > PendingRangeCalculator0 0 92 0 > 0 > Sampler 0 0 0 0 > 0 > MemtableFlushWriter 0 0563 0 > 0 > MemtablePostFlush 0 0 1500 0 > 0 > MemtableReclaimMemory 129534 0 > 0 > Native-Transport-Requests41 0 54819182 0 > 1896 > {code} > {code} > "MemtableReclaimMemory:195" - Thread t@6268 >java.lang.Thread.State: WAITING > at sun.misc.Unsafe.park(Native Method) > at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) > at > org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283) > at > org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417) > at > org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151) > at > org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) > at > 
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) >Locked ownable synchronizers: > - locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker) > "SharedPool-Worker-195" - Thread t@989 >java.lang.Thread.State: RUNNABLE > at > org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690) > at > org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650) > at > org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171) > at > org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143) > at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240) > at > org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483) > at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153) > at > org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184) > at >
[jira] [Updated] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()
[ https://issues.apache.org/jira/browse/CASSANDRA-13432?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Corentin Chary updated CASSANDRA-13432: --- Description: This might affect 3.x too, I'm not sure. {code} $ nodetool tpstats Pool NameActive Pending Completed Blocked All time blocked MutationStage 0 0 32135875 0 0 ReadStage 114 0 29492940 0 0 RequestResponseStage 0 0 86090931 0 0 ReadRepairStage 0 0 166645 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 HintedHandoff 0 0 47 0 0 GossipStage 0 0 188769 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 0 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor0 0 86835 0 0 ValidationExecutor0 0 0 0 0 MigrationStage0 0 0 0 0 AntiEntropyStage 0 0 0 0 0 PendingRangeCalculator0 0 92 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0563 0 0 MemtablePostFlush 0 0 1500 0 0 MemtableReclaimMemory 129534 0 0 Native-Transport-Requests41 0 54819182 0 1896 {code} {code} "MemtableReclaimMemory:195" - Thread t@6268 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283) at org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417) at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker) "SharedPool-Worker-195" - Thread t@989 java.lang.Thread.State: RUNNABLE at org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690) at 
org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650) at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171) at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143) at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240) at org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483) at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153) at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184) at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:156) at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146) at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at com.google.common.collect.AbstractIterator.hasNext(AbstractIterator.java:138) at org.apache.cassandra.db.filter.SliceQueryFilter.collectReducedColumns(SliceQueryFilter.java:263) at
[jira] [Created] (CASSANDRA-13432) MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns()
Corentin Chary created CASSANDRA-13432: -- Summary: MemtableReclaimMemory can get stuck because of lack of timeout in getTopLevelColumns() Key: CASSANDRA-13432 URL: https://issues.apache.org/jira/browse/CASSANDRA-13432 Project: Cassandra Issue Type: Bug Reporter: Corentin Chary Fix For: 2.1.x This might affect 3.x too, I'm not sure. {code} $ nodetool tpstats Pool NameActive Pending Completed Blocked All time blocked MutationStage 0 0 32135875 0 0 ReadStage 114 0 29492940 0 0 RequestResponseStage 0 0 86090931 0 0 ReadRepairStage 0 0 166645 0 0 CounterMutationStage 0 0 0 0 0 MiscStage 0 0 0 0 0 HintedHandoff 0 0 47 0 0 GossipStage 0 0 188769 0 0 CacheCleanupExecutor 0 0 0 0 0 InternalResponseStage 0 0 0 0 0 CommitLogArchiver 0 0 0 0 0 CompactionExecutor0 0 86835 0 0 ValidationExecutor0 0 0 0 0 MigrationStage0 0 0 0 0 AntiEntropyStage 0 0 0 0 0 PendingRangeCalculator0 0 92 0 0 Sampler 0 0 0 0 0 MemtableFlushWriter 0 0563 0 0 MemtablePostFlush 0 0 1500 0 0 MemtableReclaimMemory 129534 0 0 Native-Transport-Requests41 0 54819182 0 1896 {code} {code} "MemtableReclaimMemory:195" - Thread t@6268 java.lang.Thread.State: WAITING at sun.misc.Unsafe.park(Native Method) at java.util.concurrent.locks.LockSupport.park(LockSupport.java:304) at org.apache.cassandra.utils.concurrent.WaitQueue$AbstractSignal.awaitUninterruptibly(WaitQueue.java:283) at org.apache.cassandra.utils.concurrent.OpOrder$Barrier.await(OpOrder.java:417) at org.apache.cassandra.db.ColumnFamilyStore$Flush$1.runMayThrow(ColumnFamilyStore.java:1151) at org.apache.cassandra.utils.WrappedRunnable.run(WrappedRunnable.java:28) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) at java.lang.Thread.run(Thread.java:745) Locked ownable synchronizers: - locked <6e7b1160> (a java.util.concurrent.ThreadPoolExecutor$Worker) "SharedPool-Worker-195" - Thread t@989 java.lang.Thread.State: RUNNABLE at 
org.apache.cassandra.db.RangeTombstoneList.addInternal(RangeTombstoneList.java:690) at org.apache.cassandra.db.RangeTombstoneList.insertFrom(RangeTombstoneList.java:650) at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:171) at org.apache.cassandra.db.RangeTombstoneList.add(RangeTombstoneList.java:143) at org.apache.cassandra.db.DeletionInfo.add(DeletionInfo.java:240) at org.apache.cassandra.db.ArrayBackedSortedColumns.delete(ArrayBackedSortedColumns.java:483) at org.apache.cassandra.db.ColumnFamily.addAtom(ColumnFamily.java:153) at org.apache.cassandra.db.filter.QueryFilter$2.getNext(QueryFilter.java:184) at org.apache.cassandra.db.filter.QueryFilter$2.hasNext(QueryFilter.java:156) at org.apache.cassandra.utils.MergeIterator$Candidate.advance(MergeIterator.java:146) at org.apache.cassandra.utils.MergeIterator$ManyToOne.advance(MergeIterator.java:125) at org.apache.cassandra.utils.MergeIterator$ManyToOne.computeNext(MergeIterator.java:99) at com.google.common.collect.AbstractIterator.tryToComputeNext(AbstractIterator.java:143) at
[jira] [Commented] (CASSANDRA-9200) Sequences
[ https://issues.apache.org/jira/browse/CASSANDRA-9200?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964277#comment-15964277 ] Aleksey Yeschenko commented on CASSANDRA-9200: -- The related issues were all closed with essentially "Won't Fix/Later" resolution, just like this ticket. The objections here are still standing, however, so I'm not sure reopening the ticket - or opening a new one - would lead to anything. > Sequences > - > > Key: CASSANDRA-9200 > URL: https://issues.apache.org/jira/browse/CASSANDRA-9200 > Project: Cassandra > Issue Type: New Feature >Reporter: Jonathan Ellis > > UUIDs are usually the right choice for surrogate keys, but sometimes > application constraints dictate an increasing numeric value. > We could do this by using LWT to reserve "blocks" of the sequence for each > member of the cluster, which would eliminate paxos contention at the cost of > not being strictly increasing. > PostgreSQL syntax: > http://www.postgresql.org/docs/9.4/static/sql-createsequence.html -- This message was sent by Atlassian JIRA (v6.3.15#6346)
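The block-reservation idea in the ticket description can be sketched as follows. This is only an illustration under stated assumptions: the shared counter is an in-memory AtomicLong standing in for a row updated with a lightweight transaction, and every class and method name here is hypothetical. Each node reserves a block of values with one compare-and-set and then hands them out locally, so contention on the shared counter occurs once per block rather than once per value.

```java
import java.util.concurrent.atomic.AtomicLong;

// Hypothetical sketch of block-reserved sequences; not a Cassandra API.
final class BlockSequence {

    /** In-memory stand-in for a counter row guarded by a conditional update. */
    static final class SharedCounter {
        private final AtomicLong next = new AtomicLong(0);

        long read() { return next.get(); }

        /** Mimics a conditional update: succeeds only if the value is unchanged. */
        boolean compareAndSet(long expected, long updated) {
            return next.compareAndSet(expected, updated);
        }
    }

    private final SharedCounter shared;
    private final long blockSize;
    private long cursor = 0; // next value to hand out locally
    private long limit = 0;  // exclusive end of the reserved block

    BlockSequence(SharedCounter shared, long blockSize) {
        this.shared = shared;
        this.blockSize = blockSize;
    }

    /** Returns the next value, reserving a new block when the current one is used up. */
    synchronized long nextValue() {
        if (cursor == limit) {
            reserveBlock();
        }
        return cursor++;
    }

    // Retries the conditional update until this node wins a block.
    private void reserveBlock() {
        while (true) {
            long start = shared.read();
            if (shared.compareAndSet(start, start + blockSize)) {
                cursor = start;
                limit = start + blockSize;
                return;
            }
        }
    }
}
```

With two nodes sharing one counter and a block size of 3, node A hands out 0, 1, 2 and node B hands out 3, 4, 5. If B is asked first after A's initial call, the values observed across the cluster are 0, 3, 1, ...: unique, but not strictly increasing, which is the cost the description mentions.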
[jira] [Commented] (CASSANDRA-13427) testall failure in org.apache.cassandra.index.internal.CassandraIndexTest.indexOnRegularColumn
[ https://issues.apache.org/jira/browse/CASSANDRA-13427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964245#comment-15964245 ] Andrés de la Peña commented on CASSANDRA-13427: --- Thanks [~ifesdjeen], now it's perfect, +1 > testall failure in > org.apache.cassandra.index.internal.CassandraIndexTest.indexOnRegularColumn > -- > > Key: CASSANDRA-13427 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13427 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov > > Because of the name clash, there's a following failure happening (extremely > infrequently, it's worth noting, seen it only once, no further traces / > instances found): > {code} > Error setting schema for test (query was: CREATE INDEX v_index ON > cql_test_keyspace.table_22(v)) > {code} > Stacktrace: > {code} > java.lang.RuntimeException: Error setting schema for test (query was: CREATE > INDEX v_index ON cql_test_keyspace.table_22(v)) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (CASSANDRA-13427) testall failure in org.apache.cassandra.index.internal.CassandraIndexTest.indexOnRegularColumn
[ https://issues.apache.org/jira/browse/CASSANDRA-13427?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrés de la Peña updated CASSANDRA-13427: -- Status: Ready to Commit (was: Patch Available) > testall failure in > org.apache.cassandra.index.internal.CassandraIndexTest.indexOnRegularColumn > -- > > Key: CASSANDRA-13427 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13427 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov > > Because of the name clash, there's a following failure happening (extremely > infrequently, it's worth noting, seen it only once, no further traces / > instances found): > {code} > Error setting schema for test (query was: CREATE INDEX v_index ON > cql_test_keyspace.table_22(v)) > {code} > Stacktrace: > {code} > java.lang.RuntimeException: Error setting schema for test (query was: CREATE > INDEX v_index ON cql_test_keyspace.table_22(v)) > {code} -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Commented] (CASSANDRA-8675) COPY TO/FROM broken for newline characters
[ https://issues.apache.org/jira/browse/CASSANDRA-8675?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964228#comment-15964228 ] Christophe commented on CASSANDRA-8675: --- This COPY FROM issue should be considered a real bug. If someone runs COPY TO followed by COPY FROM, it is reasonable to expect that the data loaded should exactly match the data extracted. But because of this issue, that's not the case when the original data contains strings with escaped characters. > COPY TO/FROM broken for newline characters > -- > > Key: CASSANDRA-8675 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8675 > Project: Cassandra > Issue Type: Bug > Environment: [cqlsh 5.0.1 | Cassandra 2.1.2 | CQL spec 3.2.0 | Native > protocol v3] > Ubuntu 14.04 64-bit >Reporter: Lex Lythius > Labels: cqlsh > Fix For: 2.1.3 > > Attachments: copytest.csv > > > Exporting/importing does not preserve contents when texts containing newline > (and possibly other) characters are involved: > {code:sql} > cqlsh:test> create table if not exists copytest (id int primary key, t text); > cqlsh:test> insert into copytest (id, t) values (1, 'This has a newline > ... character'); > cqlsh:test> insert into copytest (id, t) values (2, 'This has a quote " > character'); > cqlsh:test> insert into copytest (id, t) values (3, 'This has a fake tab \t > character (typed backslash, t)'); > cqlsh:test> select * from copytest; > id | t > +- > 1 | This has a newline\ncharacter > 2 |This has a quote " character > 3 | This has a fake tab \t character (entered slash-t text) > (3 rows) > cqlsh:test> copy copytest to '/tmp/copytest.csv'; > 3 rows exported in 0.034 seconds. > cqlsh:test> copy copytest from '/tmp/copytest.csv'; > 3 rows imported in 0.005 seconds. 
> cqlsh:test> select * from copytest; > id | t > +--- > 1 | This has a newlinencharacter > 2 | This has a quote " character > 3 | This has a fake tab \t character (typed backslash, t) > (3 rows) > {code} > I tried replacing \n in the CSV file with \\n, which just expands to \n in > the table; and with an actual newline character, which fails with error since > it prematurely terminates the record. > It seems backslashes are only used to take the following character as a > literal > Until this is fixed, what would be the best way to refactor an old table with > a new, incompatible structure maintaining its content and name, since we > can't rename tables?
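For illustration only, the round-trip property the comment asks for (data loaded after an export should equal the data exported) can be sketched with Python's standard csv module. This is not cqlsh's COPY implementation; the helper names export_rows and import_rows are hypothetical, and the sketch assumes that fields are protected by CSV quoting rather than by backslash escaping.

```python
import csv
import io

# The three sample rows from the report: a real newline, a double quote,
# and a literal backslash followed by "t".
rows = [
    (1, "This has a newline\ncharacter"),
    (2, 'This has a quote " character'),
    (3, "This has a fake tab \\t character (typed backslash, t)"),
]


def export_rows(data):
    """Serialize rows to CSV text, quoting only the fields that need it."""
    buf = io.StringIO(newline="")
    csv.writer(buf, quoting=csv.QUOTE_MINIMAL).writerows(data)
    return buf.getvalue()


def import_rows(text):
    """Parse CSV text back into (int, str) tuples."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [(int(row_id), value) for row_id, value in reader]


# The imported rows should equal the exported rows.
restored = import_rows(export_rows(rows))
```

With quoting, the embedded newline stays inside a quoted field instead of terminating the record, and the backslash in row 3 is treated as an ordinary character.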
[jira] [Commented] (CASSANDRA-13276) Regression on CASSANDRA-11416: can't load snapshots of tables with dropped columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15964196#comment-15964196 ] Andrés de la Peña commented on CASSANDRA-13276: --- Here is the patch for 3.0.x, 3.x and trunk:
||[3.0|https://github.com/apache/cassandra/compare/cassandra-3.0...adelapena:13276-3.0]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13276-3.0-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13276-3.0-dtest/]|
||[3.11|https://github.com/apache/cassandra/compare/cassandra-3.11...adelapena:13276-3.11]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13276-3.11-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13276-3.11-dtest/]|
||[trunk|https://github.com/apache/cassandra/compare/trunk...adelapena:13276-trunk]|[utests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13276-trunk-testall/]|[dtests|http://cassci.datastax.com/view/Dev/view/adelapena/job/adelapena-13276-trunk-dtest/]|
And a new dtest can be found [here|https://github.com/riptano/cassandra-dtest/compare/master...adelapena:CASSANDRA-13276]. Thanks for reporting the bug.
> Regression on CASSANDRA-11416: can't load snapshots of tables with dropped > columns > -- > > Key: CASSANDRA-13276 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13276 > Project: Cassandra > Issue Type: Bug >Reporter: Matt Kopit >Assignee: Andrés de la Peña > Fix For: 3.0.13, 3.11.0, 4.0 > > > I'm running Cassandra 3.10 and running into the exact same issue described in > CASSANDRA-11416:
> 1. A table is created with columns 'a' and 'b'
> 2. Data is written to the table
> 3. Drop column 'b'
> 4. Take a snapshot
> 5. Drop the table
> 6. Run the snapshot schema.cql to recreate the table and then run the alter
> 7. Try to restore the snapshot data using sstableloader
> sstableloader yields the error:
> java.lang.RuntimeException: Unknown column b during deserialization
[jira] [Updated] (CASSANDRA-13276) Regression on CASSANDRA-11416: can't load snapshots of tables with dropped columns
[ https://issues.apache.org/jira/browse/CASSANDRA-13276?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Andrés de la Peña updated CASSANDRA-13276: -- Status: Patch Available (was: In Progress)
> Regression on CASSANDRA-11416: can't load snapshots of tables with dropped > columns > -- > > Key: CASSANDRA-13276 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13276 > Project: Cassandra > Issue Type: Bug >Reporter: Matt Kopit >Assignee: Andrés de la Peña > Fix For: 3.0.13, 3.11.0, 4.0 > > > I'm running Cassandra 3.10 and running into the exact same issue described in > CASSANDRA-11416:
> 1. A table is created with columns 'a' and 'b'
> 2. Data is written to the table
> 3. Drop column 'b'
> 4. Take a snapshot
> 5. Drop the table
> 6. Run the snapshot schema.cql to recreate the table and then run the alter
> 7. Try to restore the snapshot data using sstableloader
> sstableloader yields the error:
> java.lang.RuntimeException: Unknown column b during deserialization
[jira] [Commented] (CASSANDRA-8457) nio MessagingService
[ https://issues.apache.org/jira/browse/CASSANDRA-8457?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963989#comment-15963989 ] Sylvain Lebresne commented on CASSANDRA-8457: -
bq. One remaining open issue is the current patch does not expire messages on the producer thread, like what we used to do in {{OutboundTcpConnection#enqueue}}
But was that even a good idea? I'm not convinced. The chances of a message expiring on the sending node are low to start with, and the chance of it expiring _on the producer thread_ is remote, so it doesn't feel like it's worth bothering. Especially since we'll expire the message a tiny bit later when processing it anyway. Overall, the benefit to code clarity of not checking for expiration in too many places outweighs, imo, any benefit we'd have here.
{quote} bq. In {{connectionTimeout()}}, what happens when a connection attempt times out? This is what {{OutboundTcpConnection}} does, and I've replicated it here. {quote}
Fair enough, but can't we at least log a warning? Sounds to me like this would be an improvement.
bq. TBH, I think both of these behaviors are incorrect
I agree and never fully understood the reasoning here; it's just not something that seems to have created problems. Imo, there are just 2 cases:
# we have a version match: success, just use the connection.
# we have a version mismatch: disconnect from that connection and reconnect with what we know will be a proper version. Don't do anything to the backlog; we're still just negotiating our connection and there is no reason to do anything with outstanding messages.
Other than that, I had a quick look over the branch and it looks good, but unfortunately you seem to have squashed everything, which makes it really hard to check the changes made since my last review, and I don't have the time right now to redo a full careful read of the whole patch.
> nio MessagingService > > > Key: CASSANDRA-8457 > URL: https://issues.apache.org/jira/browse/CASSANDRA-8457 > Project: Cassandra > Issue Type: New Feature >Reporter: Jonathan Ellis >Assignee: Jason Brown >Priority: Minor > Labels: netty, performance > Fix For: 4.x > > > Thread-per-peer (actually two each incoming and outbound) is a big > contributor to context switching, especially for larger clusters. Let's look > at switching to nio, possibly via Netty.
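As a rough sketch of the two handshake cases described in the comment above (version match, version mismatch), the following Python model shows one way the logic could look. It is not the patch's code: the class OutboundConnection, its fields, and the method on_handshake_reply are hypothetical names, and it assumes the version reported by the peer is the "proper version" to retry with.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class OutboundConnection:
    """Hypothetical model of one outbound connection; not Cassandra's class."""
    proposed_version: int
    backlog: deque = field(default_factory=deque)
    established: bool = False
    reconnect_count: int = 0

    def on_handshake_reply(self, peer_version: int) -> str:
        """Handle the peer's version reply and return the action taken."""
        if peer_version == self.proposed_version:
            # Case 1: versions match, so the connection is usable as-is.
            self.established = True
            return "use"
        # Case 2: mismatch. Drop this connection and retry with the version
        # the peer reported (an assumption); queued messages are left alone.
        self.established = False
        self.proposed_version = peer_version
        self.reconnect_count += 1
        return "reconnect"


# A mismatch triggers one reconnect; the retry then matches.
conn = OutboundConnection(proposed_version=12, backlog=deque(["msg-1", "msg-2"]))
first = conn.on_handshake_reply(peer_version=11)
second = conn.on_handshake_reply(peer_version=11)
```

The backlog is never touched in either branch, which mirrors the point that outstanding messages need no handling while the connection is still being negotiated.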
[jira] [Commented] (CASSANDRA-13427) testall failure in org.apache.cassandra.index.internal.CassandraIndexTest.indexOnRegularColumn
[ https://issues.apache.org/jira/browse/CASSANDRA-13427?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963960#comment-15963960 ] Alex Petrov commented on CASSANDRA-13427: - [~adelapena] great find! Sorry I missed it. Fixed, rebased and re-triggered CI. > testall failure in > org.apache.cassandra.index.internal.CassandraIndexTest.indexOnRegularColumn > -- > > Key: CASSANDRA-13427 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13427 > Project: Cassandra > Issue Type: Bug >Reporter: Alex Petrov >Assignee: Alex Petrov > > Because of the name clash, there's a following failure happening (extremely > infrequently, it's worth noting, seen it only once, no further traces / > instances found): > {code} > Error setting schema for test (query was: CREATE INDEX v_index ON > cql_test_keyspace.table_22(v)) > {code} > Stacktrace: > {code} > java.lang.RuntimeException: Error setting schema for test (query was: CREATE > INDEX v_index ON cql_test_keyspace.table_22(v)) > {code}
[jira] [Commented] (CASSANDRA-13304) Add checksumming to the native protocol
[ https://issues.apache.org/jira/browse/CASSANDRA-13304?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15963944#comment-15963944 ] Sylvain Lebresne commented on CASSANDRA-13304: -- bq. defaulting to self-signed unverified TLS I'll admit I'm not enough of a TLS expert to know whether there are potential problems with that solution, but I do very much like the idea of re-using an existing, standard solution here if possible, so I think we should assess the pros and cons of that approach. > Add checksumming to the native protocol > --- > > Key: CASSANDRA-13304 > URL: https://issues.apache.org/jira/browse/CASSANDRA-13304 > Project: Cassandra > Issue Type: Improvement > Components: Core >Reporter: Michael Kjellman >Assignee: Michael Kjellman > Labels: client-impacting > Attachments: 13304_v1.diff, boxplot-read-throughput.png, > boxplot-write-throughput.png > > > The native binary transport implementation doesn't include checksums. This > makes it highly susceptible to silently inserting corrupted data either due > to hardware issues causing bit flips on the sender/client side, C*/receiver > side, or network in between. > Attaching an implementation that makes checksum'ing mandatory (assuming both > client and server know about a protocol version that supports checksums) -- > and also adds checksumming to clients that request compression. > The serialized format looks something like this:
> {noformat}
> *                      1 1 1 1 1 1 1 1 1 1 2 2 2 2 2 2 2 2 2 2 3 3
> *  0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |  Number of Compressed Chunks  |     Compressed Length (e1)    /
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * /  Compressed Length cont. (e1) |    Uncompressed Length (e1)   /
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * | Uncompressed Length cont. (e1)| CRC32 Checksum of Lengths (e1)|
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * | Checksum of Lengths cont. (e1)|    Compressed Bytes (e1)    +//
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                      CRC32 Checksum (e1)                     ||
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Length (e2)                     |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                   Uncompressed Length (e2)                    |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                CRC32 Checksum of Lengths (e2)                 |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Bytes (e2)                    +//
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                      CRC32 Checksum (e2)                     ||
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Length (en)                     |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                   Uncompressed Length (en)                    |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                CRC32 Checksum of Lengths (en)                 |
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                    Compressed Bytes (en)                    +//
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> * |                      CRC32 Checksum (en)                     ||
> * +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
> {noformat}
> The first pass here adds checksums only to the actual contents of the frame > body itself (and doesn't actually checksum lengths and headers). While it > would be great to fully add checksumming across the entire protocol, the > proposed implementation will ensure we at least catch corrupted data and > likely protect ourselves pretty well anyways.
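To make the layout in the diagram concrete, here is a minimal Python sketch of a chunked, checksummed encoding and its verification. It is an illustration under stated assumptions, not the attached patch: the field widths (16-bit chunk count, 32-bit lengths and CRC32 values, big-endian) are read off the diagram, zlib stands in for LZ4 so the example needs only the standard library, each final CRC32 is assumed to cover the compressed bytes, and the function names and default chunk size are hypothetical.

```python
import struct
import zlib


def encode_chunks(payload: bytes, chunk_size: int = 32 * 1024) -> bytes:
    """Split payload into chunks and emit: count, then per chunk
    lengths, CRC32 of lengths, compressed bytes, CRC32 of those bytes."""
    chunks = [payload[i:i + chunk_size]
              for i in range(0, len(payload), chunk_size)] or [b""]
    out = [struct.pack(">H", len(chunks))]           # number of chunks
    for chunk in chunks:
        compressed = zlib.compress(chunk)            # stand-in for LZ4
        lengths = struct.pack(">II", len(compressed), len(chunk))
        out.append(lengths)
        out.append(struct.pack(">I", zlib.crc32(lengths)))
        out.append(compressed)
        out.append(struct.pack(">I", zlib.crc32(compressed)))
    return b"".join(out)


def decode_chunks(frame: bytes) -> bytes:
    """Verify both checksums of every chunk, then decompress and join."""
    (count,) = struct.unpack_from(">H", frame, 0)
    offset = 2
    parts = []
    for _ in range(count):
        lengths = frame[offset:offset + 8]
        compressed_len, uncompressed_len = struct.unpack(">II", lengths)
        offset += 8
        (lengths_crc,) = struct.unpack_from(">I", frame, offset)
        offset += 4
        if zlib.crc32(lengths) != lengths_crc:
            raise ValueError("corrupted chunk lengths")
        compressed = frame[offset:offset + compressed_len]
        offset += compressed_len
        (body_crc,) = struct.unpack_from(">I", frame, offset)
        offset += 4
        if zlib.crc32(compressed) != body_crc:
            raise ValueError("corrupted chunk body")
        chunk = zlib.decompress(compressed)
        if len(chunk) != uncompressed_len:
            raise ValueError("unexpected uncompressed length")
        parts.append(chunk)
    return b"".join(parts)
```

Checking the lengths checksum before reading the body means a corrupted length field is rejected before it can be used to slice the frame.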
> I didn't go to the trouble of implementing a Snappy Checksum'ed Compressor implementation, as Snappy has been deprecated for a while, is really slow and crappy compared to LZ4, and we should do everything in our power to make sure no one in the community is still using it. I left it in (for obvious backwards-compatibility reasons) for old clients that don't know about the new protocol.
> The current protocol has a 256MB (max) frame body, where the serialized contents are simply written into the frame body.
> If the client sends a compression option in the startup, we will install a FrameCompressor inline.