[jira] [Commented] (CASSANDRA-14320) dtest tools/jmxutils.py JolokiaAgent raises TypeError using json.loads on bytes

2018-03-16 Thread ASF GitHub Bot (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14320?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403243#comment-16403243
 ] 

ASF GitHub Bot commented on CASSANDRA-14320:


GitHub user ptbannister opened a pull request:

https://github.com/apache/cassandra-dtest/pull/21

tools/jmxutils.py decode bytes to string before passing to json.loads

See CASSANDRA-14320 - addresses TypeError raised by calling json.loads on 
bytes without decoding to string, also fixes a visual indent for PEP-8 
compliance in the same file.

Result of this change can be seen easily by running the deprecated repair 
tests (repair_tests/deprecated_repair_test.py) with and without this 
modification.

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ptbannister/cassandra-dtest CASSANDRA-14320

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/cassandra-dtest/pull/21.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #21


commit dc48608ffe707804a41299957a72917805b9a684
Author: Patrick Bannister 
Date:   2018-03-17T03:32:50Z

tools/jmxutils.py decode bytes to string before passing to json.loads




> dtest tools/jmxutils.py JolokiaAgent raises TypeError using json.loads on 
> bytes
> ---
>
> Key: CASSANDRA-14320
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14320
> Project: Cassandra
>  Issue Type: Bug
>  Components: Testing
>Reporter: Patrick Bannister
>Priority: Minor
>  Labels: Python3, dtest, python3
> Fix For: 3.0.x, 3.11.x
>
>
> JolokiaAgent in tools/jmxutils.py raises a TypeError when used, because its 
> _query function calls json.loads on a bytes object, and json.loads accepts 
> only str input on Python versions before 3.6.
> {code:java}
>     def _query(self, body, verbose=True):
>         request_data = json.dumps(body).encode("utf-8")
>         url = 'http://%s:8778/jolokia/' % (self.node.network_interfaces['binary'][0],)
>         req = urllib.request.Request(url)
>         response = urllib.request.urlopen(req, data=request_data, timeout=10.0)
>         if response.code != 200:
>             raise Exception("Failed to query Jolokia agent; HTTP response code: %d; response: %s" % (response.code, response.readlines()))
>
>         # response is an http.client.HTTPResponse, an io.BufferedIOBase subclass, so reads return bytes
>         raw_response = response.readline()
>         response = json.loads(raw_response)  # this raises a TypeError now
>         if response['status'] != 200:
>             stacktrace = response.get('stacktrace')
>             if stacktrace and verbose:
>                 print("Stacktrace from Jolokia error follows:")
>                 for line in stacktrace.splitlines():
>                     print(line)
>             raise Exception("Jolokia agent returned non-200 status: %s" % (response,))
>         return response{code}
> This can be seen clearly by running the deprecated repair tests 
> (repair_tests/deprecated_repair_test.py). They all fail right now because of 
> this TypeError.
> This is a side effect of the migration to Python 3, which makes bytes objects 
> fundamentally different from strings. This will also happen anytime we try to 
> json.loads data returned from stdout or stderr piped from subprocess. I need 
> to take a closer look at offline_tools_test.py and 
> cqlsh_tests/cqlsh_copy_tests.py, because I suspect they're impacted as well.
> We can fix this issue by decoding bytes objects to strings before calling 
> json.loads(). For example, in the above:
> {code:java}
>     response = json.loads(raw_response.decode(encoding='utf-8')){code}
> I have a fix for the JolokiaAgent problem - I'll submit a pull request to 
> cassandra-dtest once I have this issue number to reference.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Created] (CASSANDRA-14320) dtest tools/jmxutils.py JolokiaAgent raises TypeError using json.loads on bytes

2018-03-16 Thread Patrick Bannister (JIRA)
Patrick Bannister created CASSANDRA-14320:
-

 Summary: dtest tools/jmxutils.py JolokiaAgent raises TypeError 
using json.loads on bytes
 Key: CASSANDRA-14320
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14320
 Project: Cassandra
  Issue Type: Bug
  Components: Testing
Reporter: Patrick Bannister
 Fix For: 3.0.x, 3.11.x


JolokiaAgent in tools/jmxutils.py raises a TypeError when used, because its 
_query function calls json.loads on a bytes object, and json.loads accepts 
only str input on Python versions before 3.6.
{code:java}
    def _query(self, body, verbose=True):
        request_data = json.dumps(body).encode("utf-8")
        url = 'http://%s:8778/jolokia/' % (self.node.network_interfaces['binary'][0],)
        req = urllib.request.Request(url)
        response = urllib.request.urlopen(req, data=request_data, timeout=10.0)
        if response.code != 200:
            raise Exception("Failed to query Jolokia agent; HTTP response code: %d; response: %s" % (response.code, response.readlines()))

        # response is an http.client.HTTPResponse, an io.BufferedIOBase subclass, so reads return bytes
        raw_response = response.readline()
        response = json.loads(raw_response)  # this raises a TypeError now
        if response['status'] != 200:
            stacktrace = response.get('stacktrace')
            if stacktrace and verbose:
                print("Stacktrace from Jolokia error follows:")
                for line in stacktrace.splitlines():
                    print(line)
            raise Exception("Jolokia agent returned non-200 status: %s" % (response,))
        return response{code}
This can be seen clearly by running the deprecated repair tests 
(repair_tests/deprecated_repair_test.py). They all fail right now because of 
this TypeError.

This is a side effect of the migration to Python 3, which makes bytes objects 
fundamentally different from strings. This will also happen anytime we try to 
json.loads data returned from stdout or stderr piped from subprocess. I need to 
take a closer look at offline_tools_test.py and 
cqlsh_tests/cqlsh_copy_tests.py, because I suspect they're impacted as well.

We can fix this issue by decoding bytes objects to strings before calling 
json.loads(). For example, in the above:
{code:java}
    response = json.loads(raw_response.decode(encoding='utf-8')){code}
I have a fix for the JolokiaAgent problem - I'll submit a pull request to 
cassandra-dtest once I have this issue number to reference.
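
The decode-before-parse pattern described above can be sketched as follows. This is a minimal illustration that assumes UTF-8 payloads; the helper name `loads_text_or_bytes` is hypothetical and is not part of cassandra-dtest. It also exercises the subprocess case mentioned above, where stdout arrives as bytes.

```python
import json
import subprocess
import sys


def loads_text_or_bytes(payload, encoding="utf-8"):
    """Parse JSON from str or bytes (hypothetical helper, not in cassandra-dtest)."""
    if isinstance(payload, (bytes, bytearray)):
        # Decode explicitly so json.loads always receives a str.
        payload = payload.decode(encoding)
    return json.loads(payload)


# Bytes, as returned by reading an HTTP response body.
raw_response = b'{"status": 200, "value": 42}'
response = loads_text_or_bytes(raw_response)

# Bytes piped from a child process's stdout are handled the same way.
child = subprocess.run(
    [sys.executable, "-c", "print('{\"status\": 200}')"],
    stdout=subprocess.PIPE,
    check=True,
)
child_response = loads_text_or_bytes(child.stdout)
```

Routing every call through one helper keeps the encoding assumption in a single place, which may be easier to audit than scattering `.decode()` calls across the tests.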






[jira] [Updated] (CASSANDRA-14319) nodetool rebuild from DC lets you pass invalid datacenters

2018-03-16 Thread mck (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14319?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

mck updated CASSANDRA-14319:

Fix Version/s: 4.x
   3.11.x
   3.0.x
   2.2.x
   2.1.x

> nodetool rebuild from DC lets you pass invalid datacenters 
> ---
>
> Key: CASSANDRA-14319
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14319
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Jon Haddad
>Priority: Major
> Fix For: 2.1.x, 2.2.x, 3.0.x, 3.11.x, 4.x
>
>
> If you pass an invalid datacenter to nodetool rebuild, you'll get an error 
> like this:
> {code}
> Unable to find sufficient sources for streaming range 
> (3074457345618258602,-9223372036854775808] in keyspace system_distributed
> {code}
> Unfortunately, this is a rabbit hole of frustration if you are using caps for 
> your DC names and you pass in a lowercase DC name, or you just typo the DC.  
> Let's do the following:
> # Check the DC name that's passed in against the list of DCs we know about
> # If we don't find it, let's output a reasonable error, and list all the DCs 
> someone could put in.
> # Ideally we indicate which keyspaces are set to replicate to this DC and 
> which aren't
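
The three steps proposed above can be sketched in Python. This is only an illustration of the logic, not nodetool's implementation: the function names, the `known_datacenters` collection, and the `replication_by_keyspace` mapping are all assumptions made for the example.

```python
def validate_datacenter(requested, known_datacenters):
    """Return the requested DC if it is known, else raise a descriptive error."""
    if requested in known_datacenters:
        return requested
    # Suggest names that differ only by case, a likely typo per the report.
    hints = sorted(dc for dc in known_datacenters
                   if dc.lower() == requested.lower())
    message = "Unknown datacenter %r; known datacenters: %s" % (
        requested, ", ".join(sorted(known_datacenters)))
    if hints:
        message += " (did you mean %s?)" % ", ".join(hints)
    raise ValueError(message)


def keyspaces_by_replication(datacenter, replication_by_keyspace):
    """Split keyspaces into those that replicate to the DC and those that do not."""
    replicated = sorted(ks for ks, dcs in replication_by_keyspace.items()
                        if dcs.get(datacenter, 0) > 0)
    not_replicated = sorted(ks for ks in replication_by_keyspace
                            if ks not in replicated)
    return replicated, not_replicated
```

For example, `validate_datacenter("dc1", ["DC1", "DC2"])` raises a `ValueError` that lists both known datacenters and suggests `DC1`.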






[jira] [Commented] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

2018-03-16 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403211#comment-16403211
 ] 

Vinay Chella commented on CASSANDRA-12937:
--

Good one. Yes, we have been bitten by this in the past; I am picking it up. 

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: mck
>Priority: Minor
>  Labels: lhf
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).






[jira] [Assigned] (CASSANDRA-12937) Default setting (yaml) for SSTable compression

2018-03-16 Thread Vinay Chella (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella reassigned CASSANDRA-12937:


Assignee: Vinay Chella

> Default setting (yaml) for SSTable compression
> --
>
> Key: CASSANDRA-12937
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12937
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: mck
>Assignee: Vinay Chella
>Priority: Minor
>  Labels: lhf
>
> In many situations the choice of compression for sstables is more relevant to 
> the disks attached than to the schema and data.
> This issue is to add to cassandra.yaml a default value for sstable 
> compression that new tables will inherit (instead of the defaults found in 
> {{CompressionParams.DEFAULT}}).
> Examples where this can be relevant are filesystems that do on-the-fly 
> compression (btrfs, zfs) or specific disk configurations or even specific C* 
> versions (see CASSANDRA-10995 ).






[jira] [Assigned] (CASSANDRA-14303) NetworkTopologyStrategy could have a "default replication" option

2018-03-16 Thread Vinay Chella (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14303?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella reassigned CASSANDRA-14303:


Assignee: Joseph Lynch

> NetworkTopologyStrategy could have a "default replication" option
> -
>
> Key: CASSANDRA-14303
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14303
> Project: Cassandra
>  Issue Type: Improvement
>  Components: Configuration
>Reporter: Joseph Lynch
>Assignee: Joseph Lynch
>Priority: Minor
>
> Right now when creating a keyspace with {{NetworkTopologyStrategy}} the user 
> has to manually specify the datacenters they want their data replicated to 
> with parameters, e.g.:
> {noformat}
>  CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc1': 3, 'dc2': 3}{noformat}
> This is a poor user interface because it requires the creator of the keyspace 
> (typically a developer) to know the layout of the Cassandra cluster (which 
> may or may not be controlled by them). Also, at least in my experience, folks 
> typo the datacenters _all_ the time. To work around this I see a number of 
> users creating automation around this where the automation describes the 
> Cassandra cluster and automatically expands out to all the dcs that Cassandra 
> knows about. Why can't Cassandra just do this for us, re-using the previously 
> forbidden {{replication_factor}} option (for backwards compatibility):
> {noformat}
>  CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'replication_factor': 3}{noformat}
> This would automatically replicate this Keyspace to all datacenters that are 
> present in the cluster. If you need to _override_ the default you could 
> supply a datacenter name, e.g.:
> {noformat}
> > CREATE KEYSPACE test WITH replication = {'class': 
> > 'NetworkTopologyStrategy', 'replication_factor': 3, 'dc1': 2}
> > DESCRIBE KEYSPACE test
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc1': '2', 'dc2': 3} AND durable_writes = true;
> {noformat}
> On the implementation side I think this may be reasonably straightforward to 
> do an auto-expansion at the time of keyspace creation (or alter), where the 
> above would automatically expand to list out the datacenters. We could allow 
> this to be recomputed whenever an AlterKeyspaceStatement runs so that to add 
> datacenters you would just run:
> {noformat}
> ALTER KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'replication_factor': 3}{noformat}
> and this would compare the dcs in the current schema against the dcs in the 
> cluster and add any new ones (_for safety reasons, auto-generation would 
> never remove a dc unless it is explicitly set to zero_). Removing a datacenter 
> becomes an alter that includes an override for the dc you want to remove (or, 
> of course, you can always skip the auto-expansion and just use the old way):
> {noformat}
> // Tell it explicitly not to replicate to dc2
> > ALTER KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> > 'replication_factor': 3, 'dc2': 0}
> > DESCRIBE KEYSPACE test
> CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 
> 'dc1': '3'} AND durable_writes = true;{noformat}
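
The auto-expansion proposed above can be sketched in Python as one possible reading of the rules: `replication_factor` supplies a default for every datacenter the cluster knows about, explicit per-datacenter entries override it, and a datacenter is dropped only when it is explicitly set to zero. The function name `expand_replication` and its parameters are hypothetical and do not correspond to any Cassandra API.

```python
def expand_replication(options, known_datacenters, current=None):
    """Expand a 'replication_factor' default into per-datacenter settings.

    options: mapping from the statement, for example
        {'class': 'NetworkTopologyStrategy', 'replication_factor': 3, 'dc1': 2}
    known_datacenters: datacenters currently present in the cluster
    current: per-datacenter replication already in the schema (for ALTER)
    """
    reserved = ("class", "replication_factor")
    overrides = {dc: int(rf) for dc, rf in options.items() if dc not in reserved}

    # Start from the existing schema so expansion never silently drops a dc.
    per_dc = dict(current or {})
    default_rf = options.get("replication_factor")
    if default_rf is not None:
        for dc in known_datacenters:
            per_dc[dc] = int(default_rf)
    # Explicit entries always win over the default.
    per_dc.update(overrides)

    expanded = {"class": options["class"]}
    for dc in sorted(per_dc):
        if per_dc[dc] > 0:  # an explicit zero removes the datacenter
            expanded[dc] = per_dc[dc]
    return expanded
```

Under these assumptions, the CREATE example above (`'replication_factor': 3, 'dc1': 2` in a cluster with `dc1` and `dc2`) expands to `dc1: 2, dc2: 3`, and the ALTER example with `'dc2': 0` leaves only `dc1: 3`.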






[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403184#comment-16403184
 ] 

Dikang Gu commented on CASSANDRA-14118:
---

Sure, I was going to deal with commit log and cache later. I can try to 
abstract them in this patch.

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write path code, making it independent of existing storage 
> engine implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.






[jira] [Commented] (CASSANDRA-14232) Add metric for coordinator writes per column family

2018-03-16 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14232?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403173#comment-16403173
 ] 

Vinay Chella commented on CASSANDRA-14232:
--

Patch looks good. Can you update NEWS.txt and documentation in the source tree? 

> Add metric for coordinator writes per column family
> ---
>
> Key: CASSANDRA-14232
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14232
> Project: Cassandra
>  Issue Type: Improvement
>Reporter: Sumanth Pasupuleti
>Assignee: Sumanth Pasupuleti
>Priority: Major
> Attachments: 14232-trunk.txt
>
>
> Includes write ops and latencies at coordinator per column family.
> Relevant discussion in dev mailing list - 
> [https://lists.apache.org/thread.html/f68f694b13b670a1fa28fa75620304603fc89e94ec515933199f4c37@%3Cdev.cassandra.apache.org%3E]
> Below are a few advantages of having such a metric:
>  * Ability to identify the specific column family to which coordinator writes 
> are slow
>  * Also useful in a multi-tenant cluster, where different column families are 
> owned by different teams






[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403165#comment-16403165
 ] 

Blake Eggleston commented on CASSANDRA-14118:
-

[~dikanggu], the commit log and cache are part of the storage implementation. 
The write path's interaction with them needs to be abstracted away. We can't 
assume that other storage implementations will be using our implementations of 
those.

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write path code, making it independent of existing storage 
> engine implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.






[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403152#comment-16403152
 ] 

Dikang Gu commented on CASSANDRA-14118:
---

Hmm, I'm not sure what else you want me to abstract. As I mentioned, on the 
local code path, Keyspace.applyInternal -> cfs.apply() -> writehandler.apply(), 
no memtable or sstable is involved anymore. The only remaining component is the 
commitlog, which I already pass as a parameter to the apply() function. 

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write path code, making it independent of existing storage 
> engine implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.






[jira] [Comment Edited] (CASSANDRA-12151) Audit logging for database activity

2018-03-16 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403146#comment-16403146
 ] 

Vinay Chella edited comment on CASSANDRA-12151 at 3/17/18 12:12 AM:


[~djoshi3]

Implemented all the code review comments provided in this JIRA thread and on 
the GitHub PR, except the one below:
{quote}Consider refactoring your code to add a netty handler that invokes an 
auditing interface. The advantage of this approach would be that, when audit 
logging is disabled, you can take this handler out of the netty pipeline. This 
way there is zero performance impact when the audit is disabled. You can define 
a IAuditLogger interface that has sufficient contextual information to log all 
queries. This will help make the audit logging implementation pluggable.
{quote}
I am creating a follow-up JIRA to discuss this in more detail.

At a high level, this changeset includes the following changes:
 # Extended and reused FullQueryLogger in logging audit events
 # Combined and Simplified FQL and AuditLog entry points in the request path
 # AuditLogEntryType::allStatementsMap - Instead of creating an explicit map of 
statements, type of statement is being added to the actual class itself. This 
makes new statements easy to manage
 # AuditLogFilter::loadFilters - Simplified filter loading logic, easy to add 
new filters if needed
 # CQL query auditing can now be filtered on user level.
 # Added documentation in the doc folder
 # Removed ConsistencyLevel in logging details
 # Added more test cases
 # Implemented code review comments provided in this JIRA as well as Github PR

 
||[branch|https://github.com/vinaykumarchella/cassandra/tree/trunk_CASSANDRA-12151]||
|[PR for trunk|https://github.com/vinaykumarchella/cassandra/pull/2/commits]|
|[circleci|https://circleci.com/gh/vinaykumarchella/cassandra/tree/trunk_CASSANDRA-12151]|

 

We ran a cassandra-stress test with this patch and attached the results 
([^CASSANDRA_12151-benchmark.html]). Here is the high-level summary:

Note: The tests below were run on an AWS i2.2xl instance. 
 {{cass-stress cmd: write n=100 -rate threads=10 -graph 
file=CASSANDRA_12151-benchmark.html}}
||WRITE - Test Suite||Throughput||Latency Mean||Latency 95th||Latency 99th||
|trunk|13,925 op/s|0.7 ms|1.1 ms|1.7 ms|
|CASSANDRA-12151:Disabled AuditLog|14,422 op/s|0.7 ms|1.1 ms|1.6 ms|
|CASSANDRA-12151:FQL based AuditLog with Sync|13,372 op/s|0.7 ms|1.2 ms|1.7 ms|
|CASSANDRA-12151:FQL based AuditLog with Async|12,908 op/s|0.8 ms|1.2 ms|1.9 ms|
|CASSANDRA-12151:SLF4j based AuditLog|10,520 op/s|0.9 ms|1.6 ms|2.4 ms|


 {{cass-stress cmd: mixed n=100 -rate threads=10 -graph 
file=CASSANDRA_12151-benchmark.html}}
||MIXED - Test Suite||Throughput||Latency Mean||Latency 95th||Latency 99th||
|trunk|12,939 op/s [READ: 6,494 op/s, WRITE: 6,444 op/s]|0.7 ms [READ: 0.8 ms, WRITE: 0.7 ms]|1.2 ms [READ: 1.3 ms, WRITE: 1.2 ms]|1.7 ms [READ: 1.8 ms, WRITE: 1.7 ms]|
|CASSANDRA-12151: Disabled AuditLog|12,840 op/s [READ: 6,421 op/s, WRITE: 6,419 op/s]|0.8 ms [READ: 0.8 ms, WRITE: 0.7 ms]|1.2 ms [READ: 1.3 ms, WRITE: 1.2 ms]|1.8 ms [READ: 1.8 ms, WRITE: 1.7 ms]|
|CASSANDRA-12151: FQL based AuditLog with Sync|10,932 op/s [READ: 5,452 op/s, WRITE: 5,481 op/s]|0.9 ms [READ: 1.0 ms, WRITE: 0.8 ms]|1.5 ms [READ: 1.6 ms, WRITE: 1.4 ms]|2.3 ms [READ: 2.4 ms, WRITE: 2.1 ms]|
|CASSANDRA-12151: FQL based AuditLog with Async|11,146 op/s [READ: 5,565 op/s, WRITE: 5,581 op/s]|0.9 ms [READ: 0.9 ms, WRITE: 0.8 ms]|1.5 ms [READ: 1.5 ms, WRITE: 1.4 ms]|2.2 ms [READ: 2.2 ms, WRITE: 2.1 ms]|
|CASSANDRA-12151: SLF4j based AuditLog|9,764 op/s [READ: 4,883 op/s, WRITE: 4,882 op/s]|1.0 ms [READ: 1.0 ms, WRITE: 1.0 ms]|1.7 ms [READ: 1.7 ms, WRITE: 1.6 ms]|2.5 ms [READ: 2.6 ms, WRITE: 2.4 ms]|

 

Looking at the results, with AuditLog feature disabled, there appears to be no 
measurable difference in performance. FQL appears to have little or no overhead 
in WRITE only workloads, and a minor overhead in MIXED workload. SLF4J appears 
to have minor regressions in both workloads (with mixed slightly worse).
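
To make the comparison concrete, the relative throughput change of each configuration against trunk can be computed from the tables above. This is a small sketch; the throughput figures are copied from the tables, while the dictionary and function names are illustrative only.

```python
# Overall throughput in op/s, copied from the WRITE and MIXED tables above.
WRITE_THROUGHPUT = {
    "trunk": 13925,
    "Disabled AuditLog": 14422,
    "FQL based AuditLog with Sync": 13372,
    "FQL based AuditLog with Async": 12908,
    "SLF4j based AuditLog": 10520,
}
MIXED_THROUGHPUT = {
    "trunk": 12939,
    "Disabled AuditLog": 12840,
    "FQL based AuditLog with Sync": 10932,
    "FQL based AuditLog with Async": 11146,
    "SLF4j based AuditLog": 9764,
}


def relative_change(baseline, value):
    """Percentage change of value relative to baseline."""
    return (value - baseline) / baseline * 100.0


def summarize(throughput, baseline_key="trunk"):
    """Map each non-baseline configuration to its rounded % change vs. baseline."""
    baseline = throughput[baseline_key]
    return {
        name: round(relative_change(baseline, ops), 1)
        for name, ops in throughput.items()
        if name != baseline_key
    }


for label, table in (("WRITE", WRITE_THROUGHPUT), ("MIXED", MIXED_THROUGHPUT)):
    for name, change in summarize(table).items():
        print("%s %-32s %+.1f%%" % (label, name, change))
```

Negative values indicate lower throughput than trunk; run-to-run variance is not captured by a single run per configuration.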


was (Author: vinaykumarcse):
[~djoshi3]

Implemented all the code reviews comments provided in this JIRA thread as well 
as Github PR. Except one below

{quote}
Consider refactoring your code to add a netty handler that invokes an auditing 
interface. The advantage of this approach would be that, when audit logging is 
disabled, you can take this handler out of the netty pipeline. This way there 
is zero performance impact when the audit is disabled. You can define a 
IAuditLogger interface that has sufficient contextual information to log all 
queries. This will help make the audit logging implementation pluggable.
{quote}

I am creating a follow-up JIRA to discuss the more details on this.

On a high level, this changeset includes following changes

# Extended and reused FullQueryLogger in 

[jira] [Updated] (CASSANDRA-12151) Audit logging for database activity

2018-03-16 Thread Vinay Chella (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Vinay Chella updated CASSANDRA-12151:
-
Attachment: CASSANDRA_12151-benchmark.html

> Audit logging for database activity
> ---
>
> Key: CASSANDRA-12151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: stefan setyadi
>Assignee: Vinay Chella
>Priority: Major
> Fix For: 4.x
>
> Attachments: 12151.txt, CASSANDRA_12151-benchmark.html, 
> DesignProposal_AuditingFeature_ApacheCassandra_v1.docx
>
>
> We would like a way to enable Cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> It should also be able to log connection attempts and changes to the 
> user/roles.
> I was thinking of making a new keyspace and inserting an entry for every 
> activity that occurs.
> Then it would be possible to query for a specific activity or for queries 
> targeting a specific keyspace and column family.






[jira] [Commented] (CASSANDRA-12151) Audit logging for database activity

2018-03-16 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-12151?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403146#comment-16403146
 ] 

Vinay Chella commented on CASSANDRA-12151:
--

[~djoshi3]

Implemented all the code review comments provided in this JIRA thread and on 
the GitHub PR, except the one below:

{quote}
Consider refactoring your code to add a netty handler that invokes an auditing 
interface. The advantage of this approach would be that, when audit logging is 
disabled, you can take this handler out of the netty pipeline. This way there 
is zero performance impact when the audit is disabled. You can define a 
IAuditLogger interface that has sufficient contextual information to log all 
queries. This will help make the audit logging implementation pluggable.
{quote}

I am creating a follow-up JIRA to discuss this in more detail.

At a high level, this changeset includes the following changes:

# Extended and reused FullQueryLogger in logging audit events
# Combined and Simplified FQL and AuditLog entry points in the request path
# AuditLogEntryType::allStatementsMap - Instead of creating an explicit map of 
statements, type of statement is being added to the actual class itself. This 
makes new statements easy to manage
# AuditLogFilter::loadFilters - Simplified filter loading logic, easy to add 
new filters if needed
# CQL query auditing can now be filtered on user level.
# Added documentation in the doc folder
# Removed ConsistencyLevel in logging details
# Added more test cases
# Implemented code review comments provided in this JIRA as well as Github PR

\\

||[branch|https://github.com/vinaykumarchella/cassandra/tree/trunk_CASSANDRA-12151]||
|[PR for trunk|https://github.com/vinaykumarchella/cassandra/pull/2/commits]|
|[circleci|https://circleci.com/gh/vinaykumarchella/cassandra/tree/trunk_CASSANDRA-12151]|

\\

We ran a cassandra-stress test with this patch and attached the results. 
Here is the high-level summary:

Note: The tests below were run on an AWS i2.2xl instance.
\\
{{cass-stress cmd: write n=100 -rate threads=10 -graph 
file=CASSANDRA_12151-benchmark.html}}
||WRITE - Test Suite||Throughput||Latency Mean||Latency 95th||Latency 99th||
|trunk|13,925 op/s|0.7 ms|1.1 ms|1.7 ms|
|CASSANDRA-12151:Disabled AuditLog|14,422 op/s|0.7 ms|1.1 ms|1.6 ms|
|CASSANDRA-12151:FQL based AuditLog with Sync|13,372 op/s|0.7 ms|1.2 ms|1.7 ms|
|CASSANDRA-12151:FQL based AuditLog with Async|12,908 op/s|0.8 ms|1.2 ms|1.9 ms|
|CASSANDRA-12151:SLF4j based AuditLog|10,520 op/s|0.9 ms|1.6 ms|2.4 ms|
\\
{{cass-stress cmd: mixed n=100 -rate threads=10 -graph 
file=CASSANDRA_12151-benchmark.html}}
||MIXED - Test Suite||Throughput||Latency Mean||Latency 95th||Latency 99th||
|trunk|12,939 op/s [READ: 6,494 op/s, WRITE: 6,444 op/s]|0.7 ms [READ: 0.8 ms, WRITE: 0.7 ms]|1.2 ms [READ: 1.3 ms, WRITE: 1.2 ms]|1.7 ms [READ: 1.8 ms, WRITE: 1.7 ms]|
|CASSANDRA-12151: Disabled AuditLog|12,840 op/s [READ: 6,421 op/s, WRITE: 6,419 op/s]|0.8 ms [READ: 0.8 ms, WRITE: 0.7 ms]|1.2 ms [READ: 1.3 ms, WRITE: 1.2 ms]|1.8 ms [READ: 1.8 ms, WRITE: 1.7 ms]|
|CASSANDRA-12151: FQL based AuditLog with Sync|10,932 op/s [READ: 5,452 op/s, WRITE: 5,481 op/s]|0.9 ms [READ: 1.0 ms, WRITE: 0.8 ms]|1.5 ms [READ: 1.6 ms, WRITE: 1.4 ms]|2.3 ms [READ: 2.4 ms, WRITE: 2.1 ms]|
|CASSANDRA-12151: FQL based AuditLog with Async|11,146 op/s [READ: 5,565 op/s, WRITE: 5,581 op/s]|0.9 ms [READ: 0.9 ms, WRITE: 0.8 ms]|1.5 ms [READ: 1.5 ms, WRITE: 1.4 ms]|2.2 ms [READ: 2.2 ms, WRITE: 2.1 ms]|
|CASSANDRA-12151: SLF4j based AuditLog|9,764 op/s [READ: 4,883 op/s, WRITE: 4,882 op/s]|1.0 ms [READ: 1.0 ms, WRITE: 1.0 ms]|1.7 ms [READ: 1.7 ms, WRITE: 1.6 ms]|2.5 ms [READ: 2.6 ms, WRITE: 2.4 ms]|

\\

Looking at the results, with AuditLog feature disabled, there appears to be no 
measurable difference in performance. FQL appears to have little or no overhead 
in WRITE only workloads, and a minor overhead in MIXED workload. SLF4J appears 
to have minor regressions in both workloads (with mixed slightly worse).

> Audit logging for database activity
> ---
>
> Key: CASSANDRA-12151
> URL: https://issues.apache.org/jira/browse/CASSANDRA-12151
> Project: Cassandra
>  Issue Type: New Feature
>Reporter: stefan setyadi
>Assignee: Vinay Chella
>Priority: Major
> Fix For: 4.x
>
> Attachments: 12151.txt, 
> DesignProposal_AuditingFeature_ApacheCassandra_v1.docx
>
>
> We would like a way to enable Cassandra to log database activity being done 
> on our server.
> It should show username, remote address, timestamp, action type, keyspace, 
> column family, and the query statement.
> It should also be able to log connection attempts and changes to the 
> user/roles.
> I was thinking of making a new keyspace and inserting an entry for every 
> activity that occurs.
> Then 

[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403067#comment-16403067
 ] 

Blake Eggleston commented on CASSANDRA-14118:
-

Right, I saw that commit. I'm just waiting for you to abstract away the other 
stuff I mentioned.

> Refactor write path
> ---
>
> Key: CASSANDRA-14118
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14118
> Project: Cassandra
>  Issue Type: Sub-task
>  Components: Core
>Reporter: Dikang Gu
>Assignee: Dikang Gu
>Priority: Major
>
> As part of the pluggable storage engine effort, we'd like to modularize the 
> write path code, making it independent of existing storage 
> engine implementation details.
> For now, refer to 
> https://docs.google.com/document/d/1suZlvhzgB6NIyBNpM9nxoHxz_Ri7qAm-UEO8v8AIFsc
>  for high level designs.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16403005#comment-16403005
 ] 

Dikang Gu commented on CASSANDRA-14118:
---

[~bdeggleston], that's what I already have. I created a PR so it's easy for you 
to review: https://github.com/apache/cassandra/pull/209







[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402647#comment-16402647
 ] 

Blake Eggleston commented on CASSANDRA-14118:
-

Yes, you'll need to pass something like that into {{WriteHandler.apply}}







[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Dikang Gu (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402620#comment-16402620
 ] 

Dikang Gu commented on CASSANDRA-14118:
---

[~bdeggleston], on the local write path, the memtable/sstables are hidden 
inside the CassandraWriteHandler.apply(...) already. Only the CommitLogPosition 
is passed as a parameter in the apply signature:
{quote}void apply(PartitionUpdate update, UpdateTransaction indexer, 
OpOrder.Group opGroup, CommitLogPosition commitLogPosition)
{quote}
I think a storage engine implementation can use the commitLogPosition if it 
needs it, or ignore it if it doesn't. What do you think?
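
The proposal above can be sketched roughly as follows. This is only an illustration under simplifying assumptions: `SketchLogPosition`, `SketchWriteHandler`, `LogAwareEngine` and `SelfDurableEngine` are hypothetical stand-ins, not the classes from the PR, and the real `apply` signature also takes an `UpdateTransaction` and an `OpOrder.Group`.

```java
import java.util.ArrayList;
import java.util.List;

final class WriteHandlerSketch {
    // Hypothetical stand-in for CommitLogPosition (segment id plus offset).
    static final class SketchLogPosition {
        final long segmentId;
        final int offset;

        SketchLogPosition(long segmentId, int offset) {
            this.segmentId = segmentId;
            this.offset = offset;
        }
    }

    // Hypothetical, simplified handler: the position is offered to every engine.
    interface SketchWriteHandler {
        void apply(String update, SketchLogPosition position);
    }

    // An engine that relies on the shared commit log records the last position it applied.
    static final class LogAwareEngine implements SketchWriteHandler {
        final List<String> applied = new ArrayList<>();
        SketchLogPosition lastApplied;

        @Override
        public void apply(String update, SketchLogPosition position) {
            applied.add(update);
            lastApplied = position;
        }
    }

    // An engine with its own durability mechanism ignores the position entirely.
    static final class SelfDurableEngine implements SketchWriteHandler {
        final List<String> applied = new ArrayList<>();

        @Override
        public void apply(String update, SketchLogPosition position) {
            applied.add(update);
        }
    }

    public static void main(String[] args) {
        SketchLogPosition position = new SketchLogPosition(3L, 128);
        LogAwareEngine aware = new LogAwareEngine();
        SelfDurableEngine selfDurable = new SelfDurableEngine();
        aware.apply("update-1", position);
        selfDurable.apply("update-1", position);
        System.out.println("log-aware engine saw segment " + aware.lastApplied.segmentId);
        System.out.println("self-durable engine applied " + selfDurable.applied.size() + " update(s)");
    }
}
```

Both engines share one call site; whether the position is used is a private decision of each implementation.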

 







[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402587#comment-16402587
 ] 

Blake Eggleston commented on CASSANDRA-14118:
-

{quote}is it me or would a storage engine need to know information about the 
commit log to operate correctly?
{quote}
I might be misunderstanding you, but I don’t think it would. I think we should 
make it available to use, but not make it a requirement. If an implementation 
isn't using the commit log, there’s no need to communicate with it. And yes, 
not using the C* commit log would probably preclude supporting cdc.







[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-16 Thread Vinay Chella (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402526#comment-16402526
 ] 

Vinay Chella commented on CASSANDRA-14318:
--

+1 on your perf test results. Patch LGTM.

> Debug logging can create massive performance issues
> ---
>
> Key: CASSANDRA-14318
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Alexander Dejanovski
>Priority: Major
>  Labels: performance
> Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
>
> Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
> flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png
>
>
> In many cases (especially very low latency ones), debug logging can add 
> significant overhead on the read path in 2.2, as we've seen when upgrading 
> clusters from 2.0 to 2.2.
> The performance impact was especially noticeable in client-side metrics, 
> where p99 could be up to 10 times higher, while the ClientRequest metrics 
> recorded by Cassandra didn't show any overhead.
> The graph below shows latencies recorded on the client side, first with debug 
> logging on and then without it:
> !debuglogging.png!
> We generated a flame graph before turning off debug logging; it shows that the 
> read call stack is dominated by debug logging:
> !flame_graph_snapshot.png!
> I've attached the original flame graph for exploration.
> With debug logging disabled, the new flame graph shows that the read call 
> stack gets extremely thin, which is further confirmed by client-recorded 
> metrics:
> !flame22 nodebug sjk svg.png!
> The query pager code has been reworked since 3.0 and it looks like the 
> log.debug() calls are gone there, but for 2.2 users, and to prevent such 
> issues from appearing with default settings, I really think debug logging 
> should be disabled by default.






[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402478#comment-16402478
 ] 

Ariel Weisberg commented on CASSANDRA-14118:


Or maybe I'm not following and you just mean CFS.apply should basically become 
CFS.getWriteHandler(). That makes sense.







[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Ariel Weisberg (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402473#comment-16402473
 ] 

Ariel Weisberg commented on CASSANDRA-14118:


[~bdeggleston] is it me or would a storage engine need to know information 
about the commit log to operate correctly? We don't want the storage engine to 
have its own commit log. Unless of course we want to delegate the commit log to 
a storage engine which has implications for other systems like CDC.

The storage engine needs to communicate up to what log entry is durable so the 
commit log can be truncated.
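
The truncation handshake described above could look roughly like this. It is a hedged sketch with hypothetical names (`CommitLogTruncationSketch`, `DurabilityReporter`, `oldestSegmentStillNeeded`), not Cassandra's commit log API: the engine reports the oldest segment it still needs, and the log discards everything strictly older.

```java
import java.util.TreeMap;

final class CommitLogTruncationSketch {
    // Hypothetical callback: the engine names the oldest segment that may still
    // hold data it has not yet made durable on its own.
    interface DurabilityReporter {
        long oldestSegmentStillNeeded();
    }

    // Segments keyed by id; the values are placeholder file names.
    private final TreeMap<Long, String> segments = new TreeMap<>();

    void addSegment(long id) {
        segments.put(id, "CommitLog-" + id + ".log");
    }

    // Discards every segment strictly older than the reported one and
    // returns how many segments were dropped.
    int truncate(DurabilityReporter engine) {
        int before = segments.size();
        segments.headMap(engine.oldestSegmentStillNeeded(), false).clear();
        return before - segments.size();
    }

    int segmentCount() {
        return segments.size();
    }

    public static void main(String[] args) {
        CommitLogTruncationSketch log = new CommitLogTruncationSketch();
        log.addSegment(1L);
        log.addSegment(2L);
        log.addSegment(3L);
        int dropped = log.truncate(() -> 3L);
        System.out.println("dropped " + dropped + ", remaining " + log.segmentCount());
    }
}
```

An engine that keeps its own log would have no reason to implement the reporter, which is the trade-off being discussed here.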







[jira] [Created] (CASSANDRA-14319) nodetool rebuild from DC lets you pass invalid datacenters

2018-03-16 Thread Jon Haddad (JIRA)
Jon Haddad created CASSANDRA-14319:
--

 Summary: nodetool rebuild from DC lets you pass invalid 
datacenters 
 Key: CASSANDRA-14319
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14319
 Project: Cassandra
  Issue Type: Improvement
Reporter: Jon Haddad


If you pass an invalid datacenter to nodetool rebuild, you'll get an error like 
this:

{code}
Unable to find sufficient sources for streaming range 
(3074457345618258602,-9223372036854775808] in keyspace system_distributed
{code}

Unfortunately, this is a rabbit hole of frustration if you are using caps for 
your DC names and you pass in a lowercase DC name, or you just typo the DC.  

Let's do the following:

# Check the DC name that's passed in against the list of DCs we know about
# If we don't find it, let's output a reasonable error, and list all the DCs 
someone could put in.
# Ideally we indicate which keyspaces are set to replicate to this DC and which 
aren't
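
The first two steps could be sketched as below. This is an illustration only: `DatacenterCheck` and `validate` are hypothetical names, the set of known DCs is passed in rather than read from the cluster, and the per-keyspace replication hint from the third step is not covered.

```java
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

final class DatacenterCheck {
    // Returns an empty Optional when the DC is known; otherwise an error
    // message that lists the valid names and flags a likely case mismatch.
    static Optional<String> validate(String requested, Set<String> knownDcs) {
        if (knownDcs.contains(requested)) {
            return Optional.empty();
        }
        TreeSet<String> sorted = new TreeSet<>(knownDcs); // stable ordering for the message
        StringBuilder message = new StringBuilder()
            .append("Unknown datacenter '").append(requested).append("'. ")
            .append("Known datacenters: ").append(String.join(", ", sorted));
        for (String dc : sorted) {
            if (dc.equalsIgnoreCase(requested)) {
                message.append(". Did you mean '").append(dc).append("'?");
                break;
            }
        }
        return Optional.of(message.toString());
    }

    public static void main(String[] args) {
        Set<String> known = Set.of("DC1", "DC2");
        System.out.println(validate("DC1", known).orElse("ok"));
        System.out.println(validate("dc1", known).orElse("ok"));
    }
}
```

Failing fast with the list of valid names avoids the misleading "Unable to find sufficient sources" error shown above.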






[jira] [Commented] (CASSANDRA-13740) Orphan hint file gets created while node is being removed from cluster

2018-03-16 Thread Jaydeepkumar Chovatia (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-13740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402241#comment-16402241
 ] 

Jaydeepkumar Chovatia commented on CASSANDRA-13740:
---

Thanks [~iamaleksey] for the review.

The reasoning behind {{RING_DELAY}} is as follows. One thing is clear in this 
fix: we have to delay {{StorageProxy.excise()}}, which means we have to add 
some sleep. We have two options for the sleep:
 1. Hardcode an arbitrary value, for example delay {{StorageProxy.excise()}} 
by 10 seconds
 OR
 2. Other nodes in the ring will no longer accept writes once they learn that 
the given node is no longer part of the ring. Hence I have used {{RING_DELAY}}, 
which is the general delay used in [many 
places|https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/gms/Gossiper.java#L553],
 and after this delay we can assume the ring has stabilized. So my theory is 
that once the ring has stabilized, everyone in the ring will have learned about 
the node that just left, and at that point it is safe to call 
{{StorageProxy.excise()}}. Please let me know if my understanding is not 
correct; I can change it to some hardcoded value, say 20 seconds.

I will incorporate the other code review comments and send you an updated patch 
soon.
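
As a rough illustration of the second option, the excise step can be scheduled to run only after the ring delay has elapsed. This is a sketch, not the patch itself: `DelayedExciseSketch` and `exciseAfter` are hypothetical names, and the delay is passed in as a plain number of milliseconds instead of being read from {{RING_DELAY}}.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicBoolean;

final class DelayedExciseSketch {
    // Single daemon thread so a pending task never keeps the JVM alive.
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(runnable -> {
            Thread thread = new Thread(runnable, "delayed-excise");
            thread.setDaemon(true);
            return thread;
        });

    // Runs the excise action once the given delay has passed.
    ScheduledFuture<?> exciseAfter(long delayMillis, Runnable excise) {
        return scheduler.schedule(excise, delayMillis, TimeUnit.MILLISECONDS);
    }

    public static void main(String[] args) throws Exception {
        AtomicBoolean excised = new AtomicBoolean(false);
        DelayedExciseSketch sketch = new DelayedExciseSketch();
        // 50 ms stands in for the ring delay in this demo.
        ScheduledFuture<?> pending = sketch.exciseAfter(50, () -> excised.set(true));
        pending.get(); // wait for the delayed action to finish
        System.out.println("excised = " + excised.get());
    }
}
```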

> Orphan hint file gets created while node is being removed from cluster
> --
>
> Key: CASSANDRA-13740
> URL: https://issues.apache.org/jira/browse/CASSANDRA-13740
> Project: Cassandra
>  Issue Type: Bug
>  Components: Core, Hints
>Reporter: Jaydeepkumar Chovatia
>Assignee: Jaydeepkumar Chovatia
>Priority: Minor
> Fix For: 3.0.x, 3.11.x
>
> Attachments: 13740-3.0.15.txt, gossip_hang_test.py
>
>
> I found this new issue during my tests: whenever a node is being removed, 
> a hint file for that node gets written and stays inside the hint directory 
> forever. I debugged the code and found that it is due to a race condition 
> between [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
>  and [HintsWriteExecutor.java::closeWriter | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L106]
> . 
>  
> *Time t1* Node is down, as a result Hints are being written by 
> [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L195]
> *Time t2* Node is removed from cluster as a result it calls 
> [HintsService.java-exciseStore | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L327]
>  which removes hint files for the node being removed
> *Time t3* Mutation stage keeps pumping Hints through [HintService.java::write 
> | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L145]
>  which again calls [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215]
>  and new orphan file gets created
> I was writing a new dtest for {CASSANDRA-13562, CASSANDRA-13308} and that 
> helped me reproduce this new bug. I will submit patch for this new dtest 
> later.
> I also tried following to check how this orphan hint file responds:
> 1. I tried {{nodetool truncatehints }} but it fails as node is no 
> longer part of the ring
> 2. I then tried {{nodetool truncatehints}}, that still doesn’t remove hint 
> file because it is not yet included in the [dispatchDequeue | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsStore.java#L53]
> Reproducible steps:
> Please find dTest python file {{gossip_hang_test.py}} attached which 
> reproduces this bug.
> Solution:
> This is due to the race condition mentioned above. Since 
> {{HintsWriteExecutor.java}} creates a thread pool with only 1 worker, the 
> solution becomes fairly simple. Whenever we [HintService.java::excise | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsService.java#L303]
>  a host, just store it in-memory, and check for already evicted host inside 
> [HintsWriteExecutor.java::flush | 
> https://github.com/apache/cassandra/blob/cassandra-3.0/src/java/org/apache/cassandra/hints/HintsWriteExecutor.java#L215].
>  If already evicted host is found then ignore hints.
> Jaydeep
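
The proposed solution can be sketched as follows. The names (`HintFlushSketch`, `excise`, `flush`) are hypothetical and the hint store is reduced to an in-memory list; the sketch only shows the check described above, not the actual HintsService or HintsWriteExecutor code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

final class HintFlushSketch {
    // Hosts that have been excised; kept in memory only, as proposed above.
    private final Set<UUID> excisedHosts = ConcurrentHashMap.newKeySet();
    // Stand-in for hint files on disk.
    private final List<String> written = new ArrayList<>();

    void excise(UUID hostId) {
        excisedHosts.add(hostId);
    }

    // Writes the hint unless the host was already excised.
    // Returns true if the hint was written, false if it was ignored.
    boolean flush(UUID hostId, String hint) {
        if (excisedHosts.contains(hostId)) {
            return false; // would otherwise recreate an orphan hint file
        }
        written.add(hostId + ":" + hint);
        return true;
    }

    int writtenCount() {
        return written.size();
    }

    public static void main(String[] args) {
        HintFlushSketch hints = new HintFlushSketch();
        UUID host = UUID.randomUUID();
        System.out.println("before excise: " + hints.flush(host, "hint-1"));
        hints.excise(host);
        System.out.println("after excise: " + hints.flush(host, "hint-2"));
    }
}
```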




[jira] [Commented] (CASSANDRA-14118) Refactor write path

2018-03-16 Thread Blake Eggleston (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14118?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402233#comment-16402233
 ] 

Blake Eggleston commented on CASSANDRA-14118:
-

[~dikanggu] that's a lot closer, yes. You haven't abstracted all of the storage 
layer implementation details though. You should be able to step through the 
entire local write path without seeing anything related to commit logs, caches, 
sstables, or memtables.
 








[jira] [Updated] (CASSANDRA-14315) ThrottledUnfilteredIterator failed on UnfilteredRowIterator with only partition level info

2018-03-16 Thread Paulo Motta (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14315?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Paulo Motta updated CASSANDRA-14315:

Reviewer: Paulo Motta

> ThrottledUnfilteredIterator failed on UnfilteredRowIterator with only 
> partition level info
> --
>
> Key: CASSANDRA-14315
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14315
> Project: Cassandra
>  Issue Type: Bug
>  Components: Materialized Views
>Reporter: ZhaoYang
>Assignee: ZhaoYang
>Priority: Major
> Fix For: 4.0
>
>
> When repairing a base table with an MV, in order to avoid OOM, CASSANDRA-13299 
> added ThrottledUnfilteredIterator to split a large partition into small 
> chunks, but it didn't properly handle a partition without any unfiltered.
> {code:title=repro}
> // create cell tombstone, range tombstone, partition deletion
> createTable("CREATE TABLE %s (pk int, ck1 int, ck2 int, v1 int, v2 int, 
> PRIMARY KEY (pk, ck1, ck2))");
> // partition deletion
> execute("DELETE FROM %s USING TIMESTAMP 160 WHERE pk=1");
> // flush and generate 1 sstable
> ColumnFamilyStore cfs = 
> Keyspace.open(keyspace()).getColumnFamilyStore(currentTable());
> cfs.forceBlockingFlush();
> cfs.disableAutoCompaction();
> cfs.forceMajorCompaction();
> assertEquals(1, cfs.getLiveSSTables().size());
> SSTableReader reader = cfs.getLiveSSTables().iterator().next();
> try (ISSTableScanner scanner = reader.getScanner();
> CloseableIterator<UnfilteredRowIterator> throttled = 
> ThrottledUnfilteredIterator.throttle(scanner, 100))
> {
> assertTrue(throttled.hasNext());
> UnfilteredRowIterator iterator = throttled.next();
> assertFalse(throttled.hasNext());
> assertFalse(iterator.hasNext());
> assertEquals(iterator.partitionLevelDeletion().markedForDeleteAt(), 160);
> }
> {code}
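
The behaviour the repro expects can be illustrated with a much simplified model. In this sketch, which assumes a partition is just a list of rows, `ThrottleSketch.throttle` is a hypothetical helper and not the real ThrottledUnfilteredIterator: it splits rows into chunks, but still emits one empty chunk for a partition with no rows, so partition-level information such as a partition deletion still has something to travel with.

```java
import java.util.ArrayList;
import java.util.List;

final class ThrottleSketch {
    // Splits rows into chunks of at most maxPerChunk elements.
    // A partition with no rows still yields exactly one (empty) chunk.
    static <T> List<List<T>> throttle(List<T> rows, int maxPerChunk) {
        if (maxPerChunk <= 0) {
            throw new IllegalArgumentException("maxPerChunk must be positive");
        }
        List<List<T>> chunks = new ArrayList<>();
        for (int start = 0; start < rows.size(); start += maxPerChunk) {
            int end = Math.min(start + maxPerChunk, rows.size());
            chunks.add(new ArrayList<>(rows.subList(start, end)));
        }
        if (chunks.isEmpty()) {
            chunks.add(new ArrayList<>()); // carrier for partition-level info only
        }
        return chunks;
    }

    public static void main(String[] args) {
        System.out.println(throttle(List.of(1, 2, 3), 2).size() + " chunks for 3 rows");
        System.out.println(throttle(List.of(), 100).size() + " chunk for an empty partition");
    }
}
```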






[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-16 Thread Jeremiah Jordan (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16402060#comment-16402060
 ] 

Jeremiah Jordan commented on CASSANDRA-5836:


CASSANDRA-12681 for the NTS change

> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bill Hathaway
>Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.






[jira] [Updated] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-16 Thread Alexander Dejanovski (JIRA)

 [ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Alexander Dejanovski updated CASSANDRA-14318:
-
Status: Patch Available  (was: Open)







[jira] [Commented] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-16 Thread Alexander Dejanovski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14318?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401971#comment-16401971
 ] 

Alexander Dejanovski commented on CASSANDRA-14318:
--

Here's the [patch for 
2.2|https://github.com/apache/cassandra/compare/cassandra-2.2...thelastpickle:disable-debug-logging-by-default]

It should be mergeable in 3.0/3.11/4.0 without a problem.







[jira] [Created] (CASSANDRA-14318) Debug logging can create massive performance issues

2018-03-16 Thread Alexander Dejanovski (JIRA)
Alexander Dejanovski created CASSANDRA-14318:


 Summary: Debug logging can create massive performance issues
 Key: CASSANDRA-14318
 URL: https://issues.apache.org/jira/browse/CASSANDRA-14318
 Project: Cassandra
  Issue Type: Bug
Reporter: Alexander Dejanovski
 Fix For: 2.2.x, 3.0.x, 3.11.x, 4.x
 Attachments: debuglogging.png, flame22 nodebug sjk svg.png, 
flame22-nodebug-sjk.svg, flame22-sjk.svg, flame_graph_snapshot.png

In many cases (especially very low latency ones), debug logging can add 
significant overhead on the read path in 2.2, as we've seen when upgrading 
clusters from 2.0 to 2.2.

The performance impact was especially noticeable in client-side metrics, 
where p99 could be up to 10 times higher, while the ClientRequest metrics 
recorded by Cassandra didn't show any overhead.

The graph below shows latencies recorded on the client side, first with debug 
logging on and then without it:

!debuglogging.png!

We generated a flame graph before turning off debug logging; it shows that the 
read call stack is dominated by debug logging:

!flame_graph_snapshot.png!

I've attached the original flame graph for exploration.

With debug logging disabled, the new flame graph shows that the read call stack 
gets extremely thin, which is further confirmed by client-recorded metrics:

!flame22 nodebug sjk svg.png!

The query pager code has been reworked since 3.0 and it looks like the 
log.debug() calls are gone there, but for 2.2 users, and to prevent such issues 
from appearing with default settings, I really think debug logging should be 
disabled by default.






[jira] [Commented] (CASSANDRA-14317) Auditing Plug-in for Cassandra

2018-03-16 Thread Jason Brown (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401960#comment-16401960
 ] 

Jason Brown commented on CASSANDRA-14317:
-

[~eanujwa] As this is a new feature, it will have to go into 4.0 (trunk), which 
is where CASSANDRA-12151 is going. Why introduce yet another ticket to do the 
same thing instead of focusing on CASSANDRA-12151?


> Auditing Plug-in for Cassandra
> --
>
> Key: CASSANDRA-14317
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14317
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Tools
> Environment: Cassandra 3.11.x
>Reporter: Anuj Wadehra
>Priority: Major
>  Labels: security
>   Original Estimate: 504h
>  Remaining Estimate: 504h
>
> Cassandra lacks a database auditing feature. Until the new feature is 
> implemented as part of CASSANDRA-12151, a database auditing plug-in can be 
> built. The plug-in can be implemented and plugged into Cassandra by 
> customizing components such as the Query Handler, Authenticator and Role 
> Manager. The auditing plug-in shall log all CQL queries and user logins. 






[jira] [Comment Edited] (CASSANDRA-14317) Auditing Plug-in for Cassandra

2018-03-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401930#comment-16401930
 ] 

Chris Lohfink edited comment on CASSANDRA-14317 at 3/16/18 1:51 PM:


Is this a duplicate of CASSANDRA-13983, triggers, or CDC? Something to get by 
until CASSANDRA-12151?


was (Author: cnlwsu):
Is this a duplicate of CASSANDRA-13983, triggers, CDC, or CASSANDRA-12151?







[jira] [Commented] (CASSANDRA-14317) Auditing Plug-in for Cassandra

2018-03-16 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401930#comment-16401930
 ] 

Chris Lohfink commented on CASSANDRA-14317:
---

Is this a duplicate of CASSANDRA-13983, triggers, CDC, or CASSANDRA-12151?







[jira] [Comment Edited] (CASSANDRA-11243) Memory LEAK CqlInputFormat

2018-03-16 Thread Pankaj (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401805#comment-16401805
 ] 

Pankaj edited comment on CASSANDRA-11243 at 3/16/18 12:23 PM:
--

Any update on the fix? I'm facing the same issue with 
flink-cassandra-connector_2.11: 1.3.2. The system crashes with an out-of-memory 
error (Java heap space).


was (Author: pmishra01):
Any update on the fix? I'm facing the same issue with 
flink-cassandra-connector_2.11: 1.3.2

> Memory LEAK CqlInputFormat
> --
>
> Key: CASSANDRA-11243
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11243
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 14.04.04 LTS
> Hadoop 2.7
> Cassandra 3.3
>Reporter: Matteo Zuccon
>Priority: Major
>
> Error: "util.ResourceLeakDetector: LEAK: You are creating too many 
> HashedWheelTimer instances.  HashedWheelTimer is a shared resource that must 
> be reused across the JVM,so that only a few instances are created"
> Using CqlInputFormat.class as the input format for a Hadoop MapReduce program 
> (on a distributed Hadoop cluster) gives a memory leak error.
> Version of the library used:
> <dependency>
>   <groupId>org.apache.cassandra</groupId>
>   <artifactId>cassandra-all</artifactId>
>   <version>3.3</version>
> </dependency>
> The same jar works on a single-node Hadoop configuration; the memory 
> leak error shows up in the cluster Hadoop configuration.






[jira] [Commented] (CASSANDRA-11243) Memory LEAK CqlInputFormat

2018-03-16 Thread Pankaj (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-11243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401805#comment-16401805
 ] 

Pankaj commented on CASSANDRA-11243:


Any update on the fix? I'm facing the same issue with 
flink-cassandra-connector_2.11: 1.3.2

> Memory LEAK CqlInputFormat
> --
>
> Key: CASSANDRA-11243
> URL: https://issues.apache.org/jira/browse/CASSANDRA-11243
> Project: Cassandra
>  Issue Type: Bug
> Environment: Ubuntu 14.04.04 LTS
> Hadoop 2.7
> Cassandra 3.3
>Reporter: Matteo Zuccon
>Priority: Major
>
> Error: "util.ResourceLeakDetector: LEAK: You are creating too many 
> HashedWheelTimer instances.  HashedWheelTimer is a shared resource that must 
> be reused across the JVM,so that only a few instances are created"
> Using CqlInputFormat.class as the input format for a Hadoop MapReduce program 
> (on a distributed Hadoop cluster) gives a memory leak error.
> Version of the library used:
> <dependency>
>   <groupId>org.apache.cassandra</groupId>
>   <artifactId>cassandra-all</artifactId>
>   <version>3.3</version>
> </dependency>
> The same jar works on a single-node Hadoop configuration; the memory 
> leak error shows up in the cluster Hadoop configuration.






[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-16 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401552#comment-16401552
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}That was always a datastax recommendation so I don't know where it came 
from. As I'm sure you're aware, the Cassandra docs are quite sparse in the 
operations area, but all of this should be documented properly.{quote}

Given the above comment from [~jjordan], the only reason I can still see to use 
{{auto_bootstrap=false}} is to make the new token allocation algorithm work in 
the new DC.  I would then also strongly argue for following the DSE example and 
deprecating the {{allocate_tokens_for_keyspace}} option, precisely because it 
requires you to add your new DC to your NTS data keyspace before starting the 
nodes there.  The allocator depends solely on the local DC replication factor; 
it doesn't need the keyspace to be replicated to the new nodes initially.
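
To illustrate the argument only: the toy Python sketch below contrasts 
looking up the local DC replication factor through a keyspace's NTS options 
with passing the replication factor directly. The function names and the 
replication map are hypothetical; this is not Cassandra's allocator code.

```python
def rf_from_keyspace(replication, local_dc):
    """Hypothetical: derive the local DC replication factor from a keyspace's
    NetworkTopologyStrategy options. Fails until the DC has been added."""
    if local_dc not in replication:
        raise KeyError("DC %r is not in the keyspace replication yet" % local_dc)
    return int(replication[local_dc])


def rf_direct(local_rf):
    """Hypothetical: the replication factor is configured directly, so no
    keyspace has to mention the new DC beforehand."""
    return int(local_rf)


# Assumed example options for an NTS keyspace that only knows about dc1.
keyspace_replication = {"class": "NetworkTopologyStrategy", "dc1": 3}

existing_dc_rf = rf_from_keyspace(keyspace_replication, "dc1")

try:
    rf_from_keyspace(keyspace_replication, "dc2")
    new_dc_lookup_failed = False
except KeyError:
    # The new DC must be added to the keyspace before its nodes start.
    new_dc_lookup_failed = True

new_dc_rf = rf_direct(3)
```

With the keyspace-based lookup, the new DC has to be added to the keyspace 
first; with a plain replication factor there is no such ordering constraint.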

{quote}It's literally that it's irrelevant what the first node does. If 
auto_bootstrap is true for the first node, it's a no-op, if it's false, it's a 
defined no-op. The first node still respects auto_bootstrap, but the result is 
the same for either true or false. This is always going to be the case.{quote}

I'm fully aware of that.  The problem is how to make sure that a node starting 
up *correctly* assumes that it is the very first one.

{quote}The first node would be defined as a node that only has itself as a 
seed, and no existing knowledge of any other node in the cluster.{quote}

OK, but this implies that you have to start the very first node differently 
from the rest of the cluster.  If you want to have 3 seed nodes, what you do 
currently is just list all of them in configuration and deploy nodes one by 
one, starting with the seeds, with identical config and you're done.

With your proposed approach, there are two extra steps:
1. Deploy the very first seed node with a different config, i.e. only itself in 
the seeds list.
2. After the other seed nodes are there (or all nodes are there), restart the 
first node with the complete seeds list.

So that already makes startup more complicated than it is currently.  And don't 
forget the pluggable seed providers: how is this going to work together with 
them reliably?
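
The quoted "first node" definition and the two extra steps can be sketched 
roughly as follows. This is a minimal Python illustration under assumed 
names ({{is_first_node}}, the example addresses); it is not how Cassandra 
implements or would implement the check.

```python
# Assumed example addresses for a cluster with three seed nodes.
FULL_SEEDS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]


def is_first_node(own_address, seeds, known_peers):
    """Hypothetical check for the quoted definition: the node has only itself
    as a seed and no existing knowledge of any other node in the cluster."""
    return set(seeds) == {own_address} and len(known_peers) == 0


# Today: every node is deployed with the identical, complete seeds list,
# so no node would ever be recognised as the first one.
first_with_identical_config = is_first_node("10.0.0.1", FULL_SEEDS, [])

# Step 1 of the proposed approach: the very first seed lists only itself.
first_with_special_config = is_first_node("10.0.0.1", ["10.0.0.1"], [])

# Step 2: after the other seeds are up, the first node is restarted with the
# complete list and already knows its peers.
first_after_restart = is_first_node(
    "10.0.0.1", FULL_SEEDS, ["10.0.0.2", "10.0.0.3"]
)
```

The check only succeeds when the first node is started with a config that 
differs from the rest of the cluster, which is the extra deployment step 
described above.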

{quote}it's the fact that things get implemented without documentation{quote}

But this is exactly what I mean.  Whether it's because of attitude or not is 
just my judgement, so let's set that aside.

My point is: by spending time on writing decent documentation (preferably 
before starting on the code!) it could be possible to avoid certain 
implementation pitfalls.  In some extreme cases, like the aforementioned token 
allocation option, it would become obvious that the implementation and the very 
name of the option are wrong: it should be about replication factor and not at 
all about keyspace name.


> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bill Hathaway
>Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.






[jira] [Commented] (CASSANDRA-5836) Seed nodes should be able to bootstrap without manual intervention

2018-03-16 Thread Oleksandr Shulgin (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-5836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16401531#comment-16401531
 ] 

Oleksandr Shulgin commented on CASSANDRA-5836:
--

{quote}Since we have changed NTS such that you can’t set the new DC name in 
until after there are nodes in that DC this is no longer something someone 
could easily do by going in the “wrong” order and altering keyspaces 
first.{quote}

Whoa, but in which version?  Trunk?  DSE?  We are using Apache Cassandra 3.0 
and that one definitely doesn't check DC names at all.


> Seed nodes should be able to bootstrap without manual intervention
> --
>
> Key: CASSANDRA-5836
> URL: https://issues.apache.org/jira/browse/CASSANDRA-5836
> Project: Cassandra
>  Issue Type: Bug
>Reporter: Bill Hathaway
>Priority: Minor
>
> The current logic doesn't allow a seed node to be bootstrapped.  If a user 
> wants to bootstrap a node configured as a seed (for example to replace a seed 
> node via replace_token), they first need to remove the node's own IP from the 
> seed list, and then start the bootstrap process.  This seems like an 
> unnecessary step since a node never uses itself as a seed.
> I think it would be a better experience if the logic was changed to allow a 
> seed node to bootstrap without manual intervention when there are other seed 
> nodes up in a ring.


