[GitHub] storm issue #1767: STORM-2194: Report error and die, not report error or die
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1767 @sathyafmt sorry I have taken so long to respond December was a really crazy month for me. From STORM-2194 I see that the SocketTimeoutException goes through the code being changed. The RMI code does not go through that path at all. ``` 2016-12-01 04:24:41.721 STDERR [INFO] Error: Exception thrown by the agent : java.rmi.server.ExportException: Port already in use: 56700; nested exception is: 2016-12-01 04:24:41.722 STDERR [INFO] java.net.BindException: Address already in use ``` If it did then we would have exited because BindException and ExportException are neither InterruptedIOException nor InterruptedException. So this patch, nor the one I proposed would have any impact on the RMI case at all. Something else is catching the ExportException and printing to STDERR the error message above. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1767: STORM-2194: Report error and die, not report error or die
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1767 We should be able to fix this with code like. ``` (if (or (exception-cause? InterruptedException error) (and (exception-cause? java.io.InterruptedIOException error) (not (exception-cause? java.net.SocketTimeoutException ``` --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1767: STORM-2194: Report error and die, not report error or die
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1767 @chawco Okay so I understand the issue better now. SocketTimeoutException is a subclass of InterruptedIOException. https://docs.oracle.com/javase/7/docs/api/java/net/SocketTimeoutException.html I could argue that it is a mistake on the part of java and that it is wrong, but that is already set in stone so we have to deal with it. I see two options. 1) We can treat a SocketTimeoutException differently from other InterruptedIOExceptions, 2) or we can just treat all InterruptedIOExceptions as fatal. We started ignoring InterruptedIOExceptions because we would occasionally run into them in the supervisor or nimbus local cluster tests and that would fail everything. Having proper behavior is more important than having super stable unit tests, but if we can have both (option 1) I think that would be best. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1767: STORM-2194: Report error and die, not report error or die
Github user chawco commented on the issue: https://github.com/apache/storm/pull/1767 So, there's some ambiguity in place here that would need to get resolved. Basically, there's two sources for InterruptedException/InterruptedIOException -- one from Storm itself, and one from the user's code (See: [STORM-2194](https://issues.apache.org/jira/browse/STORM-2194) If we want to have different behaviour from these two sources we do need some solution to disambiguate the uses. Basically, the problem is this: If anything in the user's thread raises these exceptions, the executor thread will terminate *but the worker will not*. This leaves the entire topology in a zombified state. I find it difficult to see that this behaviour is "as intended". @revans2 can you suggest a way we can disambiguate between Storm initiated and user initiated exceptions here? I'm having a real tough time thinking how we could accomplish that. Alternatively, can you propose an alternate implementation for signalling this shutdown? I'm happy to take on that work to make things shutdown in a more appropriate manner, as long as we get to fix this zombie topology behaviour. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1767: STORM-2194: Report error and die, not report error or die
Github user sathyafmt commented on the issue: https://github.com/apache/storm/pull/1767 @revans2 - His change in storm-1.0.2 is in the report-error-and-die function, so it should shutdown, correct ? --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---
[GitHub] storm issue #1767: STORM-2194: Report error and die, not report error or die
Github user revans2 commented on the issue: https://github.com/apache/storm/pull/1767 -1. InterruptedException we often use to indicate that the process is shutting down, and we want to not blow up when we see them. Perhaps what we want to do is to better document it. --- If your project is set up for it, you can reply to this email and have your reply appear on GitHub as well. If your project does not have this feature enabled and wishes so, or if the feature is enabled but not working, please contact infrastructure at infrastruct...@apache.org or file a JIRA ticket with INFRA. ---