[jira] [Created] (STORM-1768) Reduce noise in worker logs
Derek Dagit created STORM-1768: -- Summary: Reduce noise in worker logs Key: STORM-1768 URL: https://issues.apache.org/jira/browse/STORM-1768 Project: Apache Storm Issue Type: Improvement Components: storm-core Affects Versions: 0.10.0 Reporter: Derek Dagit Much of the time when debugging a problematic topology, worker logs are filled with many statements like netty connection issues, so that more relevant information is harder to find. Perhaps we can identify some of the noisiest logs and change their log level to DEBUG or TRACE. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] storm pull request: STORM-433: Give users visibility to the depth ...
Github user dsKarthick commented on the pull request: https://github.com/apache/storm/pull/236#issuecomment-217241771 @abhishekagarwal87 I am excited to see your screenshot as well. Also, I am curious as to why you chose to use https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/metric/internal/LatencyStatAndMetric.java over https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/metric/internal/CountStatAndMetric.java? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[jira] [Commented] (STORM-433) Give users visibility to the depth of queues at each bolt
[ https://issues.apache.org/jira/browse/STORM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272783#comment-15272783 ] ASF GitHub Bot commented on STORM-433: -- Github user abhishekagarwal87 commented on the pull request: https://github.com/apache/storm/pull/236#issuecomment-217232912 I will do that, Eric. I am using https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/metric/internal/LatencyStatAndMetric.java to store the windowed values. It is easy to add the instantaneous values to the result map, so that is not a problem. I will post a screenshot and the PR soon. Maybe that will clear up the confusion. > Give users visibility to the depth of queues at each bolt > - > > Key: STORM-433 > URL: https://issues.apache.org/jira/browse/STORM-433 > Project: Apache Storm > Issue Type: Wish > Components: storm-core >Reporter: Dane Hammer >Assignee: Kyle Nusbaum >Priority: Minor > > I envision being able to browse the Storm UI and see where queues of tuples > are backing up. > Today if I see latencies increasing at a bolt, it may not be due to anything > specific to that bolt, but that it is backed up behind an overwhelmed bolt > (which has too low of parallelism or too high of latency). > I would expect this could use sampling like the metrics reported to the UI > today, and just retrieve data from netty about the state of the queues. I > wouldn't imagine supporting zeromq on the first pass.
[jira] [Commented] (STORM-1749) Fix storm-starter links in the github code
[ https://issues.apache.org/jira/browse/STORM-1749?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272767#comment-15272767 ] ASF GitHub Bot commented on STORM-1749: --- GitHub user abhishekagarwal87 opened a pull request: https://github.com/apache/storm/pull/1401 STORM-1749: Fix storm-starter github links You can merge this pull request into a Git repository by running: $ git pull https://github.com/abhishekagarwal87/storm documentation Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1401.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1401 commit 0458b0078dcaaeb1d7290e8c77c31cd46ac0f1e2 Author: Abhishek Agarwal Date: 2016-05-05T18:11:40Z STORM-1749: Fix storm-starter github links > Fix storm-starter links in the github code > -- > > Key: STORM-1749 > URL: https://issues.apache.org/jira/browse/STORM-1749 > Project: Apache Storm > Issue Type: Bug > Components: examples >Reporter: Abhishek Agarwal >Assignee: Abhishek Agarwal > > github links in storm-starter readme.md are still pointing to the old package > name.
[GitHub] storm pull request: STORM-1749: Fix storm-starter github links
GitHub user abhishekagarwal87 opened a pull request: https://github.com/apache/storm/pull/1401 STORM-1749: Fix storm-starter github links You can merge this pull request into a Git repository by running: $ git pull https://github.com/abhishekagarwal87/storm documentation Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1401.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1401 commit 0458b0078dcaaeb1d7290e8c77c31cd46ac0f1e2 Author: Abhishek Agarwal Date: 2016-05-05T18:11:40Z STORM-1749: Fix storm-starter github links
[GitHub] storm pull request: STORM-433: Give users visibility to the depth ...
Github user erikdw commented on the pull request: https://github.com/apache/storm/pull/236#issuecomment-217228874 @abhishekagarwal87 : awesome news! Any way you can post a screenshot of the UI you are currently proposing? At least please do so with the PR when you send it. That could help others to brainstorm how to put such values into the UI. Maybe you're instead asking for suggestions on how to handle *obtaining* the instantaneous values?
[jira] [Commented] (STORM-433) Give users visibility to the depth of queues at each bolt
[ https://issues.apache.org/jira/browse/STORM-433?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272741#comment-15272741 ] ASF GitHub Bot commented on STORM-433: -- Github user erikdw commented on the pull request: https://github.com/apache/storm/pull/236#issuecomment-217228874 @abhishekagarwal87 : awesome news! Any way you can post a screenshot of the UI you are currently proposing? At least please do so with the PR when you send it. That could help others to brainstorm how to put such values into the UI. Maybe you're instead asking for suggestions on how to handle *obtaining* the instantaneous values? > Give users visibility to the depth of queues at each bolt > - > > Key: STORM-433 > URL: https://issues.apache.org/jira/browse/STORM-433 > Project: Apache Storm > Issue Type: Wish > Components: storm-core >Reporter: Dane Hammer >Assignee: Kyle Nusbaum >Priority: Minor > > I envision being able to browse the Storm UI and see where queues of tuples > are backing up. > Today if I see latencies increasing at a bolt, it may not be due to anything > specific to that bolt, but that it is backed up behind an overwhelmed bolt > (which has too low of parallelism or too high of latency). > I would expect this could use sampling like the metrics reported to the UI > today, and just retrieve data from netty about the state of the queues. I > wouldn't imagine supporting zeromq on the first pass.
[jira] [Commented] (STORM-1730) LocalCluster#shutdown() does not terminate all storm threads/thread pools.
[ https://issues.apache.org/jira/browse/STORM-1730?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272736#comment-15272736 ] ASF GitHub Bot commented on STORM-1730: --- GitHub user fbyrne opened a pull request: https://github.com/apache/storm/pull/1400 STORM-1730 LocalCluster#shutdown() does not terminate all storm threads/thread pools. The thread pool executor used by the `DisruptorQueue.FlusherPool` is in a static context of the DisruptorQueue. To avoid major changes to how this queue's lifecycle is managed, I added a simple fix: create a thread pool factory which creates daemon threads. The threads are now also named with the 'disruptor-flush' prefix to be consistent with the TimerTask. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fbyrne/storm master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1400.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1400 commit 3c78314e18e21cd188cd7b8e5ae3f2143ed75cbd Author: Fergus Byrne Date: 2016-05-05T16:32:33Z STORM-1730 LocalCluster#shutdown() does not terminate all storm threads/thread pools. The thread pool that is remaining is in a static context of the DisruptorQueue. To avoid major changes to how this queue's lifecycle is managed, the fix is to create a thread pool factory which creates daemon threads. The threads are now also named with the 'disruptor-flush' prefix to be consistent with the TimerTask. > LocalCluster#shutdown() does not terminate all storm threads/thread pools.
> -- > > Key: STORM-1730 > URL: https://issues.apache.org/jira/browse/STORM-1730 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.0 > Environment: Windows 7 x64 > Oracle Java 1.8.0 u92 x64 >Reporter: Fergus Byrne >Priority: Minor > Attachments: Thread Pool '47' remaining..png, storm-shutdown-issue.zip > > > When using the LocalCluster in a test setup, LocalCluster#shutdown() does not > shut down all executor services it starts. In my test case, there is a single > thread pool executor service that is not shut down and is not a daemon. This keeps > the JVM alive when it is expected to terminate. > Please see the attached test case. In my example, thread pool 47 is not > shut down. Naming here is conditional on threading.
[GitHub] storm pull request: STORM-1730 LocalCluster#shutdown() does not te...
GitHub user fbyrne opened a pull request: https://github.com/apache/storm/pull/1400 STORM-1730 LocalCluster#shutdown() does not terminate all storm threads/thread pools. The thread pool executor used by the `DisruptorQueue.FlusherPool` is in a static context of the DisruptorQueue. To avoid major changes to how this queue's lifecycle is managed, I added a simple fix: create a thread pool factory which creates daemon threads. The threads are now also named with the 'disruptor-flush' prefix to be consistent with the TimerTask. You can merge this pull request into a Git repository by running: $ git pull https://github.com/fbyrne/storm master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1400.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1400 commit 3c78314e18e21cd188cd7b8e5ae3f2143ed75cbd Author: Fergus Byrne Date: 2016-05-05T16:32:33Z STORM-1730 LocalCluster#shutdown() does not terminate all storm threads/thread pools. The thread pool that is remaining is in a static context of the DisruptorQueue. To avoid major changes to how this queue's lifecycle is managed, the fix is to create a thread pool factory which creates daemon threads. The threads are now also named with the 'disruptor-flush' prefix to be consistent with the TimerTask.
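For readers unfamiliar with the approach described in this pull request, here is a minimal sketch of a thread factory that produces named daemon threads. It is not the code from the pull request: the class name `DaemonThreadFactorySketch` and the `-N` numbering suffix are assumptions made for illustration, and only the 'disruptor-flush' prefix comes from the description above.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration; not the class added by the pull request.
class DaemonThreadFactorySketch implements ThreadFactory {
    private final String prefix;
    private final AtomicInteger counter = new AtomicInteger();

    DaemonThreadFactorySketch(String prefix) {
        this.prefix = prefix;
    }

    @Override
    public Thread newThread(Runnable task) {
        // Name each thread "<prefix>-<n>" so it is easy to spot in a thread dump.
        Thread t = new Thread(task, prefix + "-" + counter.getAndIncrement());
        // Daemon threads do not keep the JVM alive once all non-daemon threads end.
        t.setDaemon(true);
        return t;
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool =
                Executors.newFixedThreadPool(2, new DaemonThreadFactorySketch("disruptor-flush"));
        // Wait for the task so its output is visible before main returns.
        pool.submit(() -> System.out.println(Thread.currentThread().getName()
                + " daemon=" + Thread.currentThread().isDaemon())).get();
        // The pool is deliberately never shut down; the JVM can still exit
        // because its worker threads are daemons.
    }
}
```

The trade-off of this style of fix is that the pool is still never shut down explicitly; it simply stops blocking JVM exit, which matches the stated goal of avoiding larger lifecycle changes.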
[jira] [Updated] (STORM-1767) metrics log entries are being appended to root log
[ https://issues.apache.org/jira/browse/STORM-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Daniel Drozdzewski updated STORM-1767: -- Description: Current setup of metrics logger ( {{storm/log4j2/worker.xml}}) uses fully qualified name of the class where the logging is happening from i.e `org.apache.storm.metric.LoggingMetricsConsumer`, which is problematic and does not achieve the original intent as stated by the METRICS appender defined in {{storm/log4j2/worker.xml}}. Currently the metrics logger created explicitly by using the name above: {{LoggerFactory.getLogger("org.apache.storm.metric.LoggingMetricsConsumer")}} or implicitly from within the {{LoggingMetricsConsumer}} by calling {{LoggerFactory.getLogger(LoggingMetricsConsumer.class)}} will be logging to **root** logger. This happens because logger names use Java namespaces and as such create hierarchies. The solution is to name metrics logger outside of {{org.apache.storm.*}} namespace which is what is happening for all other non-root loggers defined within the {{storm/log4j2/worker.xml}} file. This will also mean a code change to {{LoggingMetricsConsumer}} class itself for it to use the logger with an explicit name matching the name defined in the {{worker.xml}} file. The fix is easy. was: Current setup of metrics logger (`storm/log4j2/worker.xml`) uses fully qualified name of the class where the logging is happening from i.e `org.apache.storm.metric.LoggingMetricsConsumer`, which is problematic and does not achieve the original intent as stated by the METRICS appender defined in `storm/log4j2/worker.xml`. Currently the metrics logger created explicitly by using the name above: `LoggerFactory.getLogger("org.apache.storm.metric.LoggingMetricsConsumer")` or implicitly from within the `LoggingMetricsConsumer` by calling `LoggerFactory.getLogger(LoggingMetricsConsumer.class)` will be logging to **root** logger. 
This happens because logger names use Java namespaces and as such create hierarchies. The solution is to name metrics logger outside of `org.apache.storm.*` namespace which is what is happening for all other non-root loggers defined within the `storm/log4j2/worker.xml` file. This will also mean a code change to `LoggingMetricsConsumer` class itself for it to use the logger with an explicit name matching the name defined in the `worker.xml` file. The fix is easy. > metrics log entries are being appended to root log > -- > > Key: STORM-1767 > URL: https://issues.apache.org/jira/browse/STORM-1767 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.0, 2.0.0 >Reporter: Daniel Drozdzewski > Labels: easyfix > Fix For: 1.0.0, 2.0.0 > > > Current setup of metrics logger ( {{storm/log4j2/worker.xml}}) uses fully > qualified name of the class where the logging is happening from i.e > `org.apache.storm.metric.LoggingMetricsConsumer`, which is problematic and > does not achieve the original intent as stated by the METRICS appender > defined in {{storm/log4j2/worker.xml}}. > Currently the metrics logger created explicitly by using the name above: > {{LoggerFactory.getLogger("org.apache.storm.metric.LoggingMetricsConsumer")}} > or implicitly from within the {{LoggingMetricsConsumer}} by calling > {{LoggerFactory.getLogger(LoggingMetricsConsumer.class)}} will be logging to > **root** logger. > This happens because logger names use Java namespaces and as such create > hierarchies. > The solution is to name metrics logger outside of {{org.apache.storm.*}} > namespace which is what is happening for all other non-root loggers defined > within the {{storm/log4j2/worker.xml}} file. > This will also mean a code change to {{LoggingMetricsConsumer}} class itself > for it to use the logger with an explicit name matching the name defined in > the {{worker.xml}} file. > The fix is easy.
[jira] [Created] (STORM-1767) metrics log entries are being appended to root log
Daniel Drozdzewski created STORM-1767: - Summary: metrics log entries are being appended to root log Key: STORM-1767 URL: https://issues.apache.org/jira/browse/STORM-1767 Project: Apache Storm Issue Type: Bug Components: storm-core Affects Versions: 1.0.0, 2.0.0 Reporter: Daniel Drozdzewski Fix For: 1.0.0, 2.0.0 The current setup of the metrics logger (`storm/log4j2/worker.xml`) uses the fully qualified name of the class where the logging happens, i.e. `org.apache.storm.metric.LoggingMetricsConsumer`, which is problematic and does not achieve the original intent as stated by the METRICS appender defined in `storm/log4j2/worker.xml`. Currently, the metrics logger, whether created explicitly by using the name above: `LoggerFactory.getLogger("org.apache.storm.metric.LoggingMetricsConsumer")` or implicitly from within the `LoggingMetricsConsumer` by calling `LoggerFactory.getLogger(LoggingMetricsConsumer.class)`, will be logging to the **root** logger. This happens because logger names use Java namespaces and as such create hierarchies. The solution is to name the metrics logger outside of the `org.apache.storm.*` namespace, which is what is happening for all other non-root loggers defined within the `storm/log4j2/worker.xml` file. This will also mean a code change to the `LoggingMetricsConsumer` class itself for it to use the logger with an explicit name matching the name defined in the `worker.xml` file. The fix is easy.
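To make the "logger names create hierarchies" point concrete, the sketch below computes the ancestor chain implied by a dot-separated logger name. It is only an illustration of the naming rule, not Log4j 2 code: the class `LoggerHierarchySketch`, the method `ancestors`, and the `METRICS_LOG` example name are assumptions. Which appenders actually receive an event also depends on the logging configuration (for example additivity settings), which this sketch does not model.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of dot-separated logger name hierarchies.
class LoggerHierarchySketch {
    // Returns the ancestors of a logger name, nearest first, ending with ROOT.
    static List<String> ancestors(String loggerName) {
        List<String> chain = new ArrayList<>();
        String name = loggerName;
        int dot;
        // Strip one trailing ".segment" at a time to walk up the hierarchy.
        while ((dot = name.lastIndexOf('.')) >= 0) {
            name = name.substring(0, dot);
            chain.add(name);
        }
        chain.add("ROOT");
        return chain;
    }

    public static void main(String[] args) {
        // A class-based name sits several levels below org.apache.storm.
        System.out.println(ancestors("org.apache.storm.metric.LoggingMetricsConsumer"));
        // A flat name (assumed here) has no package-level ancestors at all.
        System.out.println(ancestors("METRICS_LOG"));
    }
}
```

A flat name keeps the metrics logger out of any settings applied to the `org.apache.storm` namespace, which is the separation the issue asks for.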
[GitHub] storm pull request:
Github user diginoise commented on the pull request: https://github.com/apache/storm/commit/96da31ed0068dde3b4309ee417382be5084ce941#commitcomment-17368296 Jira ticket created: https://issues.apache.org/jira/browse/STORM-1767
[jira] [Commented] (STORM-1767) metrics log entries are being appended to root log
[ https://issues.apache.org/jira/browse/STORM-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272551#comment-15272551 ] ASF GitHub Bot commented on STORM-1767: --- Github user diginoise commented on the pull request: https://github.com/apache/storm/commit/96da31ed0068dde3b4309ee417382be5084ce941#commitcomment-17368296 Jira ticket created: https://issues.apache.org/jira/browse/STORM-1767 > metrics log entries are being appended to root log > -- > > Key: STORM-1767 > URL: https://issues.apache.org/jira/browse/STORM-1767 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.0, 2.0.0 >Reporter: Daniel Drozdzewski > Labels: easyfix > Fix For: 1.0.0, 2.0.0 > > > Current setup of metrics logger (`storm/log4j2/worker.xml`) uses fully > qualified name of the class where the logging is happening from i.e > `org.apache.storm.metric.LoggingMetricsConsumer`, which is problematic and > does not achieve the original intent as stated by the METRICS appender > defined in `storm/log4j2/worker.xml`. > Currently the metrics logger created explicitly by using the name above: > `LoggerFactory.getLogger("org.apache.storm.metric.LoggingMetricsConsumer")` > or implicitly from within the `LoggingMetricsConsumer` by calling > `LoggerFactory.getLogger(LoggingMetricsConsumer.class)` will be logging to > **root** logger. > This happens because logger names use Java namespaces and as such create > hierarchies. > The solution is to name metrics logger outside of `org.apache.storm.*` > namespace which is what is happening for all other non-root loggers defined > within the `storm/log4j2/worker.xml` file. > This will also mean a code change to `LoggingMetricsConsumer` class itself > for it to use the logger with an explicit name matching the name defined in > the `worker.xml` file. > The fix is easy.
[GitHub] storm pull request:
Github user diginoise commented on the pull request: https://github.com/apache/storm/commit/96da31ed0068dde3b4309ee417382be5084ce941#commitcomment-17368100 doing so now...
[GitHub] storm pull request:
Github user harshach commented on the pull request: https://github.com/apache/storm/commit/96da31ed0068dde3b4309ee417382be5084ce941#commitcomment-17368081 @diginoise can you please file a JIRA
[GitHub] storm pull request:
Github user diginoise commented on the pull request: https://github.com/apache/storm/commit/96da31ed0068dde3b4309ee417382be5084ce941#commitcomment-17367854 Please note that naming the logger `org.apache.storm.metric.LoggingMetricsConsumer` will cause the logger created within this class (using the class's name) to actually use the ROOT logger instead of this logger, so the metrics entries are logged into the standard OUT log of each worker. One must create a separate logger outside of the `org.apache.storm.*` hierarchy, i.e. the name cannot use the fully qualified name of the class where the logging happens. Calling it, say, `METRICS_LOG` and then creating it by calling `LoggerFactory.getLogger("METRICS_LOG")` inside `LoggingMetricsConsumer` would prevent log entries from appearing in the logs governed by the ROOT logger.
[GitHub] storm pull request: update readme.md
GitHub user ddmonk opened a pull request: https://github.com/apache/storm/pull/1399 update readme.md fix the wrong command `$ bin/storm sql order_filtering order_filtering.sql` to `$ bin/storm sql order_filtering.sql order_filtering` You can merge this pull request into a Git repository by running: $ git pull https://github.com/ddmonk/storm master Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/1399.patch To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #1399 commit 75edcfc9bbed47328b10203b9a0c71f33c585760 Author: ddmonk Date: 2016-05-05T10:20:39Z update readme.md fix the wrong shell `$ bin/storm sql order_filtering order_filtering.sql` to `$ bin/storm sql order_filtering.sql order_filtering`
[jira] [Commented] (STORM-1705) Cap on number of retries for a failed message in kafka spout
[ https://issues.apache.org/jira/browse/STORM-1705?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272134#comment-15272134 ] ASF GitHub Bot commented on STORM-1705: --- Github user abhishekagarwal87 commented on a diff in the pull request: https://github.com/apache/storm/pull/1331#discussion_r62167427 --- Diff: external/storm-kafka/src/jvm/org/apache/storm/kafka/PartitionManager.java --- @@ -46,6 +46,9 @@ private final CountMetric _fetchAPIMessageCount; --- End diff -- @harshach - is it fine with you to do the refactoring separately? Apologies for a long pause. > Cap on number of retries for a failed message in kafka spout > > > Key: STORM-1705 > URL: https://issues.apache.org/jira/browse/STORM-1705 > Project: Apache Storm > Issue Type: New Feature > Components: storm-kafka >Reporter: Abhishek Agarwal >Assignee: Abhishek Agarwal > > The kafka-spout module based on newer APIs has a cap on the number of times > a message is to be retried. It would be a good feature to add to the older kafka > spout code as well.
[GitHub] storm pull request: STORM-1705: Cap number of retries for a failed...
Github user abhishekagarwal87 commented on a diff in the pull request: https://github.com/apache/storm/pull/1331#discussion_r62167427 --- Diff: external/storm-kafka/src/jvm/org/apache/storm/kafka/PartitionManager.java --- @@ -46,6 +46,9 @@ private final CountMetric _fetchAPIMessageCount; --- End diff -- @harshach - is it fine with you to do the refactoring separately? Apologies for a long pause.
[GitHub] storm pull request: STORM-433: Give users visibility to the depth ...
Github user abhishekagarwal87 commented on the pull request: https://github.com/apache/storm/pull/236#issuecomment-217110521 I am preparing a patch for publishing in-backlog (receive-queue) and out-backlog (send-queue). I can see the average value of these metrics on the UI over each time window period, but only in the executors section (no aggregation at the component level, etc.). Users may also be interested in the instantaneous value of these metrics, and I don't know how I will fit that into the UI. Any suggestions are welcome.
[jira] [Commented] (STORM-1750) Report-error-and-die may not kill the worker
[ https://issues.apache.org/jira/browse/STORM-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272081#comment-15272081 ] ASF GitHub Bot commented on STORM-1750: --- Github user srdo commented on a diff in the pull request: https://github.com/apache/storm/pull/1384#discussion_r62162389
--- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
@@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                 Zookeeper.setData(zkWriter, path, data);
             } else {
                 Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
-                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
+                try {
+                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
+                } catch (RuntimeException e) {
--- End diff --
Good idea, I hadn't noticed that function :) > Report-error-and-die may not kill the worker > > > Key: STORM-1750 > URL: https://issues.apache.org/jira/browse/STORM-1750 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 0.10.0, 1.0.0, 2.0.0 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Critical > > The report-error-and-die function in executor.clj calls report-error, which > can throw exceptions if Curator runs into any kind of trouble while > registering the error. I suspect this may happen with network errors, but it > can also happen if two executors for the same component throw exceptions at > the same time and no errors have been registered for the component > previously. This is because both calls to report-error-and-die update the > lastErrorPath, and ZkStateStorage set_data doesn't catch the potential > NodeExistsException that may be thrown from the create call. > If an exception is thrown from report-error, the suicide-fn is never called, > and the worker keeps running sans the crashed executor.
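The failure mode in the quoted issue description (an exception from report-error prevents suicide-fn from running) can be sketched in a few lines. This is a hedged illustration in Java rather than the Clojure of executor.clj, and it is not the change made in the pull request: the class `ReportErrorAndDieSketch` and its two `Runnable` parameters are hypothetical stand-ins for report-error and suicide-fn.

```java
// Hypothetical illustration; the real function lives in executor.clj.
class ReportErrorAndDieSketch {
    // Runs the reporting step, then always runs the suicide step,
    // even if reporting throws.
    static void reportErrorAndDie(Runnable reportError, Runnable suicide) {
        try {
            reportError.run();
        } catch (RuntimeException e) {
            // Reporting failed (for example a ZooKeeper problem); note it and continue.
            System.err.println("report-error failed: " + e);
        } finally {
            // The finally block guarantees the worker is still asked to die.
            suicide.run();
        }
    }

    public static void main(String[] args) {
        reportErrorAndDie(
                () -> { throw new IllegalStateException("ZooKeeper unavailable"); },
                () -> System.out.println("suicide-fn called"));
    }
}
```

Without the `try`/`finally`, the exception from the first step would propagate and the second step would be skipped, which is the behaviour the issue reports.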
[jira] [Commented] (STORM-1750) Report-error-and-die may not kill the worker
[ https://issues.apache.org/jira/browse/STORM-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272034#comment-15272034 ] ASF GitHub Bot commented on STORM-1750: --- Github user Jackyele commented on a diff in the pull request: https://github.com/apache/storm/pull/1384#discussion_r62157087
--- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
@@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                 Zookeeper.setData(zkWriter, path, data);
             } else {
                 Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
-                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
+                try {
+                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
+                } catch (RuntimeException e) {
--- End diff --
Yeah, we have to catch RuntimeException first; how about using **Utils.exceptionCauseIsInstanceOf** instead for consistency, just like https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/zookeeper/Zookeeper.java#L165 does? > Report-error-and-die may not kill the worker > > > Key: STORM-1750 > URL: https://issues.apache.org/jira/browse/STORM-1750 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 0.10.0, 1.0.0, 2.0.0 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Critical > > The report-error-and-die function in executor.clj calls report-error, which > can throw exceptions if Curator runs into any kind of trouble while > registering the error. I suspect this may happen with network errors, but it > can also happen if two executors for the same component throw exceptions at > the same time and no errors have been registered for the component > previously. This is because both calls to report-error-and-die update the > lastErrorPath, and ZkStateStorage set_data doesn't catch the potential > NodeExistsException that may be thrown from the create call.
> If an exception is thrown from report-error, the suicide-fn is never called, > and the worker keeps running sans the crashed executor.
[GitHub] storm pull request: STORM-1750: Ensure worker dies when report-err...
Github user Jackyele commented on a diff in the pull request: https://github.com/apache/storm/pull/1384#discussion_r62157087
--- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
@@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                 Zookeeper.setData(zkWriter, path, data);
             } else {
                 Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
-                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
+                try {
+                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
+                } catch (RuntimeException e) {
--- End diff --
Yeah, we have to catch RuntimeException first; how about using **Utils.exceptionCauseIsInstanceOf** instead for consistency, just like https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/zookeeper/Zookeeper.java#L165 does?
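As a rough sketch of the pattern under discussion (catch the wrapping `RuntimeException`, then inspect its cause chain), the self-contained example below may help. Everything in it is a stand-in: `CauseCheckSketch`, `causeIsInstanceOf`, the nested `NodeExistsException`, and `createNode` are hypothetical and only mimic the idea behind `Utils.exceptionCauseIsInstanceOf` and ZooKeeper's exception; they are not Storm's or ZooKeeper's actual code.

```java
// Hypothetical illustration of unwrapping a wrapped exception.
class CauseCheckSketch {
    // Stand-in for ZooKeeper's NodeExistsException (assumption for this sketch).
    static class NodeExistsException extends Exception {
        private static final long serialVersionUID = 1L;
    }

    // Walks the cause chain and reports whether any link is of the given type.
    static boolean causeIsInstanceOf(Class<? extends Throwable> type, Throwable error) {
        for (Throwable t = error; t != null; t = t.getCause()) {
            if (type.isInstance(t)) {
                return true;
            }
        }
        return false;
    }

    // Simulates a create call that wraps the checked exception, as described in the thread.
    static void createNode() {
        throw new RuntimeException(new NodeExistsException());
    }

    // Returns true if this caller created the node, false if it already existed.
    static boolean createIfAbsent() {
        try {
            createNode();
            return true;
        } catch (RuntimeException e) {
            if (causeIsInstanceOf(NodeExistsException.class, e)) {
                return false; // another writer created the node first
            }
            throw e; // anything else is a genuine failure
        }
    }

    public static void main(String[] args) {
        System.out.println("created=" + createIfAbsent());
    }
}
```

Rethrowing unrelated exceptions keeps genuine failures visible while tolerating the benign race between two writers.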
[jira] [Commented] (STORM-1750) Report-error-and-die may not kill the worker
[ https://issues.apache.org/jira/browse/STORM-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15272006#comment-15272006 ]

ASF GitHub Bot commented on STORM-1750:
---------------------------------------

Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1384#discussion_r62155412

    --- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
    @@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                     Zookeeper.setData(zkWriter, path, data);
                 } else {
                     Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
    -                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                try {
    +                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                } catch (RuntimeException e) {
    --- End diff --

    Not sure how? createNode() wraps it in RuntimeException, so it has to be unwrapped first, right? https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/zookeeper/Zookeeper.java#L129
[GitHub] storm pull request: STORM-1750: Ensure worker dies when report-err...
Github user srdo commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1384#discussion_r62155412

    --- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
    @@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                     Zookeeper.setData(zkWriter, path, data);
                 } else {
                     Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
    -                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                try {
    +                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                } catch (RuntimeException e) {
    --- End diff --

    Not sure how? createNode() wraps it in RuntimeException, so it has to be unwrapped first, right? https://github.com/apache/storm/blob/master/storm-core/src/jvm/org/apache/storm/zookeeper/Zookeeper.java#L129
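The wrapping described above can be illustrated with a self-contained sketch. Everything here is hypothetical: `FakeStore` stands in for the ZooKeeper helper, the nested `NodeExistsException` stands in for ZooKeeper's exception of the same name, and falling back to `setData` when the node already exists is an assumption, since the diff in this thread is cut off at the `catch` clause.

```java
import java.util.HashMap;
import java.util.Map;

final class CreateOrSetSketch {

    /** Hypothetical checked exception standing in for ZooKeeper's NodeExistsException. */
    static final class NodeExistsException extends Exception {
        NodeExistsException(String path) {
            super("node already exists: " + path);
        }
    }

    /** Hypothetical in-memory store whose createNode wraps checked exceptions. */
    static final class FakeStore {
        private final Map<String, String> nodes = new HashMap<>();

        void createNode(String path, String data) {
            try {
                if (nodes.containsKey(path)) {
                    throw new NodeExistsException(path);
                }
                nodes.put(path, data);
            } catch (Exception e) {
                // The checked exception is hidden inside a RuntimeException,
                // so callers cannot catch NodeExistsException directly.
                throw new RuntimeException(e);
            }
        }

        void setData(String path, String data) {
            nodes.put(path, data);
        }

        String get(String path) {
            return nodes.get(path);
        }
    }

    /** Creates the node, or overwrites it if another writer created it first. */
    static void createOrSet(FakeStore store, String path, String data) {
        try {
            store.createNode(path, data);
        } catch (RuntimeException e) {
            if (e.getCause() instanceof NodeExistsException) {
                store.setData(path, data); // assumed fallback, see lead-in
            } else {
                throw e; // unrelated failures still propagate
            }
        }
    }

    public static void main(String[] args) {
        FakeStore store = new FakeStore();
        createOrSet(store, "/errors/last", "first");
        createOrSet(store, "/errors/last", "second");
        System.out.println(store.get("/errors/last"));
    }
}
```

Because the caller only ever sees a `RuntimeException`, a `catch (NodeExistsException e)` clause around `createNode` would never fire; the cause has to be inspected instead.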
[jira] [Commented] (STORM-1750) Report-error-and-die may not kill the worker
[ https://issues.apache.org/jira/browse/STORM-1750?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15271996#comment-15271996 ]

ASF GitHub Bot commented on STORM-1750:
---------------------------------------

Github user Jackyele commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1384#discussion_r62154514

    --- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
    @@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                     Zookeeper.setData(zkWriter, path, data);
                 } else {
                     Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
    -                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                try {
    +                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                } catch (RuntimeException e) {
    --- End diff --

    Why not catch NodeExistsException directly?
[GitHub] storm pull request: STORM-1750: Ensure worker dies when report-err...
Github user Jackyele commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1384#discussion_r62154514

    --- Diff: storm-core/src/jvm/org/apache/storm/cluster/ZKStateStorage.java ---
    @@ -189,7 +190,15 @@ public void set_data(String path, byte[] data, List acls) {
                     Zookeeper.setData(zkWriter, path, data);
                 } else {
                     Zookeeper.mkdirs(zkWriter, Zookeeper.parentPath(path), acls);
    -                Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                try {
    +                    Zookeeper.createNode(zkWriter, path, data, CreateMode.PERSISTENT, acls);
    +                } catch (RuntimeException e) {
    --- End diff --

    Why not catch NodeExistsException directly?