[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-08-07 Thread mck (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16571508#comment-16571508
 ] 

mck commented on CASSANDRA-14435:
-

reviewed. +1 from me.

> Diag. Events: JMX events
> 
>
> Key: CASSANDRA-14435
> URL: https://issues.apache.org/jira/browse/CASSANDRA-14435
> Project: Cassandra
>  Issue Type: New Feature
>  Components: Observability
>Reporter: Stefan Podkowinski
>Assignee: Stefan Podkowinski
>Priority: Major
> Fix For: 4.x
>
>
> Nodes currently use JMX events for progress reporting on bootstrap and 
> repairs. This might also be an option to expose diagnostic events to external 
> subscribers.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: commits-unsubscr...@cassandra.apache.org
For additional commands, e-mail: commits-h...@cassandra.apache.org



[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-08-04 Thread Stefan Podkowinski (JIRA)


[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16569129#comment-16569129
 ] 

Stefan Podkowinski commented on CASSANDRA-14435:


The latest version of the code has been squashed and tested. It now basically 
follows the design as described in my previous post.

* [github|https://github.com/spodkowinski/cassandra/tree/CASSANDRA-14435]
* [circleci|https://circleci.com/gh/spodkowinski/cassandra/382]
* [dtests|https://builds.apache.org/view/A-D/view/Cassandra/job/Cassandra-devbranch-dtest/602/]





[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-25 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16490609#comment-16490609
 ] 

Stefan Podkowinski commented on CASSANDRA-14435:


As already pointed out in this discussion, we need to be careful to avoid 
contention around the JMX notification (ring) buffer. I've now changed the 
approach implemented in this ticket to stop broadcasting events directly as 
part of JMX notifications. Instead, notifications will only be used to announce 
the last (greatest) ID for each event type. Clients will be able to detect 
whether new events are available by keeping a local list of IDs and subscribing 
to notifications with ID updates. To make this work, IDs must be monotonically 
increasing and comparable (e.g. Long or TimeUUID). As notifications on updated 
IDs will be broadcast periodically, missing a notification isn't an issue: the 
full list of IDs will be received at the next broadcast interval.

The actual events will be available through a standard MBean method call, which 
accepts the ID of the client's last retrieved event and returns a limited 
number of events newer than the provided ID. This call can be polled remotely 
until the latest event has been retrieved.
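
A minimal sketch of the polling scheme described above, assuming Long IDs and 
String payloads. The class and method names ({{DiagEventStoreSketch}}, 
{{lastEventId}}, {{eventsSince}}) are made up for illustration and are not taken 
from the patch. The value returned by {{lastEventId}} stands in for what a 
periodic notification would announce.

```java
import java.util.SortedMap;
import java.util.TreeMap;
import java.util.Map;
import java.util.concurrent.ConcurrentNavigableMap;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical bounded event store; names are illustrative, not from the patch. */
class DiagEventStoreSketch {
    private final ConcurrentNavigableMap<Long, String> events = new ConcurrentSkipListMap<>();
    private final AtomicLong nextId = new AtomicLong(1);
    private final int capacity;

    DiagEventStoreSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Stores an event under the next (monotonically increasing) ID, evicting the oldest beyond capacity. */
    long store(String event) {
        long id = nextId.getAndIncrement();
        events.put(id, event);
        while (events.size() > capacity) {
            events.pollFirstEntry();
        }
        return id;
    }

    /** The greatest stored ID, i.e. what a periodic notification would announce; 0 if empty. */
    long lastEventId() {
        Map.Entry<Long, String> last = events.lastEntry();
        return last == null ? 0L : last.getKey();
    }

    /** Returns at most limit events with an ID strictly greater than lastSeenId, oldest first. */
    SortedMap<Long, String> eventsSince(long lastSeenId, int limit) {
        SortedMap<Long, String> batch = new TreeMap<>();
        for (Map.Entry<Long, String> e : events.tailMap(lastSeenId, false).entrySet()) {
            if (batch.size() >= limit) {
                break;
            }
            batch.put(e.getKey(), e.getValue());
        }
        return batch;
    }

    public static void main(String[] args) {
        DiagEventStoreSketch store = new DiagEventStoreSketch(100);
        for (int i = 1; i <= 7; i++) {
            store.store("event-" + i);
        }

        long lastSeen = 0;                     // client-side state
        long announced = store.lastEventId();  // would arrive via a notification
        while (lastSeen < announced) {
            SortedMap<Long, String> batch = store.eventsSince(lastSeen, 3);
            if (batch.isEmpty()) {
                break; // everything up to the announced ID was already evicted
            }
            batch.forEach((id, event) -> System.out.println(id + " " + event));
            lastSeen = batch.lastKey();
        }
    }
}
```

Because the client only compares IDs, a dropped notification costs nothing 
more than a delay until the next announcement.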




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-11 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16471772#comment-16471772
 ] 

mck commented on CASSANDRA-14435:
-

{quote}That or do like in CASSANDRA-13480 where there is an operation to check 
recent events or something when notifications are lost. {quote}

Yes, I think this is the idea [~cnlwsu]. With CASSANDRA-13460 it'll be possible 
to query the list via a JMX endpoint (and a virtual table?).




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-03 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462629#comment-16462629
 ] 

Chris Lohfink commented on CASSANDRA-14435:
---

We should at least make it clear that it's just best effort and that there's a 
high likelihood of missing events, to make sure people don't rely on the events 
for alerting or anything.

That, or do like in CASSANDRA-13480, where there is an operation to check recent 
events or something when notifications are lost. Can test with 
{{-Djmx.remote.x.notification.buffer.size=1}}




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-03 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462475#comment-16462475
 ] 

Stefan Podkowinski commented on CASSANDRA-14435:


{quote}While I think it's a good idea here, we should probably at least have a 
note in the yaml that enabling it may impact operational tooling if using the 
broadcaster. With the shared event buffer (1000), the more we use it (even if 
no one is listening to that mbean's events) the more lost notifications will 
occur. On an active node we already end up losing a lot of events if the client 
is anywhere with relevant latency from the node. Increasing the buffer isn't 
really a good option as it puts massive pressure on the heap as the composite 
data objects (particularly streaming ones) are huge.
{quote}
JMX surely isn't the most scalable and robust eventing solution. But any diag. 
event consumers would also fall into the "operational tooling" category, and 
tool creators and users should be aware of latency- and contention-based 
limitations. It's not ideal, but selectively sending infrequent events shouldn't 
hurt that much either.

We can always improve from here and work on a more scalable long-term solution. 
Maybe something based on Chronicle Queue with a CQL streaming extension and/or 
virtual tables on top. But that's not strictly related to diagnostic events and 
shouldn't prevent us from continuing to use JMX until we have another solution.
{quote}One of the issues I can see is that events are sent on the current 
thread, ref NotificationBroadcasterSupport.defaultExecutor.
{quote}
I've pushed a commit 
[here|https://github.com/spodkowinski/cassandra/commit/c7df1333e84f5b91ebe61161ab4d669fe8da9b32]
 to share the same executor introduced in CASSANDRA-12146.
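
For context, a small sketch of the general JDK mechanism under discussion, not 
of the linked commit: a {{NotificationBroadcasterSupport}} created with its 
no-argument constructor runs listeners on the thread that calls 
{{sendNotification}}, while one created with an {{Executor}} hands delivery to 
that executor. The class name {{AsyncBroadcasterSketch}}, the thread name and 
the notification type are made up for this example.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicReference;
import javax.management.Notification;
import javax.management.NotificationBroadcasterSupport;

/** Illustrative only; shows which thread ends up running a notification listener. */
class AsyncBroadcasterSketch {

    /** Sends one notification and returns the name of the thread that ran the listener. */
    static String deliveryThread(NotificationBroadcasterSupport broadcaster) {
        AtomicReference<String> threadName = new AtomicReference<>();
        CountDownLatch delivered = new CountDownLatch(1);
        broadcaster.addNotificationListener((notification, handback) -> {
            threadName.set(Thread.currentThread().getName());
            delivered.countDown();
        }, null, null);

        broadcaster.sendNotification(new Notification("sketch.event", "sketch-source", 1L));
        try {
            // Wait for asynchronous delivery; returns immediately in the synchronous case.
            delivered.await(5, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return threadName.get();
    }

    public static void main(String[] args) {
        ExecutorService executor = Executors.newSingleThreadExecutor(runnable -> {
            Thread t = new Thread(runnable, "jmx-notifications"); // hypothetical name
            t.setDaemon(true);
            return t;
        });

        // Default: the listener runs on the sending thread.
        System.out.println("default:  " + deliveryThread(new NotificationBroadcasterSupport()));
        // With an executor: the sending thread only enqueues the delivery.
        System.out.println("executor: " + deliveryThread(new NotificationBroadcasterSupport(executor)));

        executor.shutdown();
    }
}
```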




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-02 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16462003#comment-16462003
 ] 

mck commented on CASSANDRA-14435:
-

[~cnlwsu], thanks for highlighting past issues with JMX in CASSANDRA-13480.




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-02 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461865#comment-16461865
 ] 

mck commented on CASSANDRA-14435:
-

Thanks [~cnlwsu], am reading up on it.
Do you have any suggestions for improving JMX in C*, ref 
https://issues.apache.org/jira/browse/CASSANDRA-14346?focusedCommentId=16459583&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16459583




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-02 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461849#comment-16461849
 ] 

Chris Lohfink commented on CASSANDRA-14435:
---

JMX notifications are stateless with respect to clients. The server keeps a 
cyclic buffer of events with IDs. When a polling client asks for an update, it 
sends the last ID it has seen. If that ID is no longer in the buffer, the events 
between the last one read and the oldest one still buffered are lost. Between 
JVM events and the existing events going into that buffer, we frequently lose 
events as it is. Not that we can't use it, but it's a global, limited resource 
that can be sensitive to higher latencies between JMX client and server.
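
To illustrate how a slow client loses events, here is a toy model of such a 
cyclic buffer. It is a simplification for this discussion, not the JDK's actual 
notification buffer; the class {{NotificationRingSketch}} and its methods are 
invented for the example.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

/** Toy model of a shared cyclic notification buffer with sequence numbers. */
class NotificationRingSketch {
    private final Deque<String> buffer = new ArrayDeque<>();
    private final int capacity;
    private long earliest = 0; // sequence number of the oldest buffered entry
    private long next = 0;     // sequence number the next entry will receive

    NotificationRingSketch(int capacity) {
        this.capacity = capacity;
    }

    /** Appends an entry, overwriting (dropping) the oldest one when the buffer is full. */
    void add(String payload) {
        if (buffer.size() == capacity) {
            buffer.removeFirst();
            earliest++;
        }
        buffer.addLast(payload);
        next++;
    }

    /** Sequence number a client should resume from after reading everything. */
    long nextSequence() {
        return next;
    }

    /** Number of entries already overwritten for a client resuming at startSeq. */
    long lostSince(long startSeq) {
        return Math.max(0, earliest - startSeq);
    }

    /** Entries still buffered with a sequence number of at least startSeq. */
    List<String> fetch(long startSeq) {
        List<String> out = new ArrayList<>();
        long seq = earliest;
        for (String payload : buffer) {
            if (seq >= startSeq) {
                out.add(payload);
            }
            seq++;
        }
        return out;
    }

    public static void main(String[] args) {
        NotificationRingSketch ring = new NotificationRingSketch(3);
        ring.add("e0");
        ring.add("e1");
        long resumeAt = ring.nextSequence(); // client has read e0 and e1

        // The client is slow; five more entries arrive before its next poll.
        for (int i = 2; i < 7; i++) {
            ring.add("e" + i);
        }
        System.out.println("lost: " + ring.lostSince(resumeAt));
        System.out.println("received: " + ring.fetch(resumeAt));
    }
}
```

Since every producer shares the one buffer, additional event types shorten the 
time a client has before its resume position is overwritten.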




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-02 Thread mck (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461804#comment-16461804
 ] 

mck commented on CASSANDRA-14435:
-

{quote}With the shared event buffer (1000){quote}
I'm lost, where's this buffer you mention [~cnlwsu]?
One of the issues I can see is that events are sent on the current thread, ref 
{{NotificationBroadcasterSupport.defaultExecutor}}.




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-02 Thread Chris Lohfink (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16461508#comment-16461508
 ] 

Chris Lohfink commented on CASSANDRA-14435:
---

Would be nice to be able to enable it for other mechanisms like native 
transport but not JMX.

While I think it's a good idea here, we should probably at least have a note in 
the yaml that enabling it may impact operational tooling if using the 
broadcaster. With the shared event buffer (1000), the more we use it (even if 
no one is listening to that mbean's events) the more lost notifications will 
occur. On an active node we already end up losing a lot of events if the client 
is anywhere with relevant latency from the node. Increasing the buffer isn't 
really a good option as it puts massive pressure on the heap as the composite 
data objects (particularly streaming ones) are huge.




[jira] [Commented] (CASSANDRA-14435) Diag. Events: JMX events

2018-05-02 Thread Stefan Podkowinski (JIRA)

[ 
https://issues.apache.org/jira/browse/CASSANDRA-14435?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16460990#comment-16460990
 ] 

Stefan Podkowinski commented on CASSANDRA-14435:


Quick way to give this a local test:

* Compile and start
* {{jconsole localhost:7199}}
* Enable events if disabled in yaml: {{o.a.c.diag DiagnosticEventService}} -> 
{{resumePublishing()}}
* Start emitting dummy events: {{o.a.c.diag DummyEventEmitter}} -> 
{{dummyEventEmitIntervalMillis(1000)}}
* Enable listening to dummy events: {{o.a.c.diag DiagnosticEvents}} -> 
{{enableEvents(org.apache.cassandra.diag.DummyEvent)}}
* Go to {{o.a.c.diag DiagnosticEvents}} Notifications and subscribe


