[jira] [Commented] (STORM-329) Add Option to Config Message handling strategy when connection timeout

ASF GitHub Bot (JIRA) Mon, 17 Nov 2014 21:30:07 -0800

    [ https://issues.apache.org/jira/browse/STORM-329?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14215731#comment-14215731 ]


ASF GitHub Bot commented on STORM-329:
--------------------------------------

Github user tedxia commented on the pull request:

    https://github.com/apache/storm/pull/268#issuecomment-63424599
  
    @ptgoetz I have merged storm/master into this patch, and all tests pass except 
"storm-core/test/clj/backtype/storm/multilang_test.clj". The test results (after 
I removed multilang_test.clj) are:
    ```
    [INFO] Reactor Summary:
    [INFO] 
    [INFO] Storm ............................................. SUCCESS [1.340s]
    [INFO] maven-shade-clojure-transformer ................... SUCCESS [1.712s]
    [INFO] Storm Core ........................................ SUCCESS [6:04.103s]
    [INFO] storm-starter ..................................... SUCCESS [7.010s]
    [INFO] storm-kafka ....................................... SUCCESS [1:03.183s]
    [INFO] storm-hdfs ........................................ SUCCESS [2.056s]
    [INFO] storm-hbase ....................................... SUCCESS [2.186s]
    [INFO] Storm Binary Distribution ......................... SUCCESS [0.185s]
    [INFO] Storm Source Distribution ......................... SUCCESS [0.136s]
    ```
    
    When I run multilang_test.clj, I get an exception like this:
    ```
    java.lang.Exception: Shell Process Exception: Exception in bolt: "\xE4" on US-ASCII - /usr/lib/ruby/1.9.1/json/common.rb:148:in `encode'\n/usr/lib/ruby/1.9.1/json/common.rb:148:in `initialize'\n/usr/lib/ruby/1.9.1/json/common.rb:148:in `new'\n/usr/lib/ruby/1.9.1/json/common.rb:148:in `parse'\n/tmp/81b49de0-4ee0-493a-afe0-6286e393fb14/supervisor/stormdist/test-1-1416288370/resources/storm.rb:39:in `read_message'\n/tmp/81b49de0-4ee0-493a-afe0-6286e393fb14/supervisor/stormdist/test-1-1416288370/resources/storm.rb:57:in `read_command'\n/tmp/81b49de0-4ee0-493a-afe0-6286e393fb14/supervisor/stormdist/test-1-1416288370/resources/storm.rb:190:in `run'\ntester_bolt.rb:37:in `<main>'
            at backtype.storm.task.ShellBolt.handleError(ShellBolt.java:188) [classes/:na]
            at backtype.storm.task.ShellBolt.access$1100(ShellBolt.java:69) [classes/:na]
            at backtype.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:331) [classes/:na]
            at java.lang.Thread.run(Thread.java:662) [na:1.6.0_37]
    124526 [Thread-1209] ERROR backtype.storm.task.ShellBolt - Halting process: ShellBolt died.
    java.lang.RuntimeException: backtype.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
    Serializer Exception:
    
    
            at backtype.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:101) ~[classes/:na]
            at backtype.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:318) ~[classes/:na]
            at java.lang.Thread.run(Thread.java:662) [na:1.6.0_37]
    124527 [Thread-1209] ERROR backtype.storm.daemon.executor - 
    java.lang.RuntimeException: backtype.storm.multilang.NoOutputException: Pipe to subprocess seems to be broken! No output read.
    Serializer Exception:
    
    
            at backtype.storm.utils.ShellProcess.readShellMsg(ShellProcess.java:101) ~[classes/:na]
            at backtype.storm.task.ShellBolt$BoltReaderRunnable.run(ShellBolt.java:318) ~[classes/:na]
            at java.lang.Thread.run(Thread.java:662) [na:1.6.0_37]
    ```
    When I run multilang_test.clj on storm/master, I get the same exception, so I 
think this may be a problem with my local environment. Can you merge this into 
the 0.9.3 branch and run the tests again? Thanks a lot.


> Add Option to Config Message handling strategy when connection timeout
> ----------------------------------------------------------------------
>
>                 Key: STORM-329
>                 URL: https://issues.apache.org/jira/browse/STORM-329
>             Project: Apache Storm
>          Issue Type: Improvement
>    Affects Versions: 0.9.2-incubating
>            Reporter: Sean Zhong
>            Priority: Minor
>              Labels: Netty
>             Fix For: 0.9.3-rc2
>
>         Attachments: storm-329.patch, worker-kill-recover3.jpg
>
>
> This is to address a [concern brought 
> up|https://github.com/apache/incubator-storm/pull/103#issuecomment-43632986] 
> during the work at STORM-297:
> {quote}
> [~revans2] wrote: Your logic makes sense to me on why these calls are 
> blocking. My biggest concern around the blocking is in the case of a worker 
> crashing. If a single worker crashes this can block the entire topology from 
> executing until that worker comes back up. In some cases I can see that being 
> something that you would want. In other cases I can see speed being the 
> primary concern and some users would like to get partial data fast, rather 
> than accurate data later.
> Could we make it configurable on a follow up JIRA where we can have a max 
> limit to the buffering that is allowed, before we block, or throw data away 
> (which is what zeromq does)?
> {quote}
> If a worker crashes suddenly, how should we handle the messages that were 
> supposed to be delivered to it?
> 1. Should we buffer all messages indefinitely?
> 2. Should we block message sending until the connection is resumed?
> 3. Should we configure a buffer limit, buffer messages first, and block once 
> the limit is reached?
> 4. Should we neither block nor buffer too much, but instead drop the 
> messages and rely on Storm's built-in failover mechanism?
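
The four options in the issue description can be compared side by side with a small sketch. This is only an illustration under stated assumptions: `PendingMessages`, `Policy`, `Action` and the `limit` parameter are hypothetical names that do not come from Storm or from the attached patch, and option 4 is interpreted here as "buffer up to a small limit, then drop".

```java
import java.util.ArrayDeque;
import java.util.Deque;

/**
 * Hypothetical sketch of the four strategies listed in the issue.
 * None of these names exist in Storm; they are for illustration only.
 */
final class PendingMessages {
    /** One constant per option in the issue description (1 to 4). */
    enum Policy { BUFFER_UNBOUNDED, BLOCK, BUFFER_THEN_BLOCK, DROP }

    /** What happened to (or must happen with) a message offered while the peer is down. */
    enum Action { BUFFERED, MUST_BLOCK, DROPPED }

    private final Policy policy;
    private final int limit; // assumed knob; used by BUFFER_THEN_BLOCK and DROP only
    private final Deque<byte[]> buffer = new ArrayDeque<>();

    PendingMessages(Policy policy, int limit) {
        this.policy = policy;
        this.limit = limit;
    }

    /** Decide the fate of one message while the connection is unavailable. */
    Action offer(byte[] message) {
        switch (policy) {
            case BUFFER_UNBOUNDED:          // option 1: memory grows without bound
                buffer.addLast(message);
                return Action.BUFFERED;
            case BLOCK:                     // option 2: caller waits for reconnect
                return Action.MUST_BLOCK;
            case BUFFER_THEN_BLOCK:         // option 3: buffer up to limit, then wait
                if (buffer.size() < limit) {
                    buffer.addLast(message);
                    return Action.BUFFERED;
                }
                return Action.MUST_BLOCK;
            case DROP:                      // option 4: buffer up to limit, then drop
                if (buffer.size() < limit) {
                    buffer.addLast(message);
                    return Action.BUFFERED;
                }
                return Action.DROPPED;      // replay is left to tuple failure handling
            default:
                throw new AssertionError(policy);
        }
    }

    /** Number of messages currently held for the unavailable worker. */
    int size() {
        return buffer.size();
    }
}
```

The sketch returns a decision instead of actually blocking so that the policies stay easy to compare; a real transport would also need to flush the buffer on reconnect and be safe under concurrent senders.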



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
