[ 
https://issues.apache.org/jira/browse/STORM-163?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Michael Noll updated STORM-163:
-------------------------------

    Description: 
https://github.com/nathanmarz/storm/issues/512

{code}
(deftest test-builtin-metrics-2
  (with-simulated-time-local-cluster
    [cluster :daemon-conf {TOPOLOGY-ENABLE-MESSAGE-TIMEOUTS true
                           TOPOLOGY-MESSAGE-TIMEOUT-SECS 5}]
    (let [feeder (feeder-spout ["field1"])
          tracker (AckFailMapTracker.)
          _ (.setAckFailDelegate feeder tracker)
          topology (thrift/mk-topology
                    {"myspout" (thrift/mk-spout-spec feeder)}
                    {"mybolt" (thrift/mk-bolt-spec {"myspout" :shuffle}
                                                   ack-every-other)})]
      (submit-local-topology (:nimbus cluster)
                             "metrics-tester"
                             {TOPOLOGY-ENABLE-MESSAGE-TIMEOUTS true
                              TOPOLOGY-MESSAGE-TIMEOUT-SECS 5}
                             topology)
      (.feed feeder ["a"] 1)
      (.feed feeder ["b"] 2)
      (advance-cluster-time cluster 10)
      (assert-failed tracker 2))))
{code}

The above unit test just hangs.
This isn't a one-off: a whole class of unit tests hangs when the amount passed
to advance-cluster-time is close to (but greater than) message-timeout-secs.

I noticed that when I added system executors in order to get heap size metrics,
an existing metrics unit test started to fail at assert-failed where it
previously succeeded. So it seems that the amount by which advance-cluster-time
has to exceed the message timeout is not constant. This might explain why the
zookeeper 3.4.5 upgrade caused unit tests to hang (mk-in-process zookeeper has
slower performance and start-up time).
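
One possible way to see why the margin would not be constant is sketched below.
It is a toy model in plain Clojure, not Storm code, and it rests on an
assumption that is not established in this report: that pending tuples are kept
in a two-bucket rotating map that is rotated once every message-timeout-secs,
so a tuple is failed on the second rotation after it was emitted. The names
expiry-time and failed-within? are hypothetical.

```clojure
;; Hypothetical model, not Storm code: a two-bucket pending map that is
;; rotated once every `timeout` seconds of simulated time.

(defn expiry-time
  "Simulated time at which an entry inserted at time `t` is expired, assuming
  rotations fire at first-tick, first-tick + timeout, ... and an entry is
  dropped on the second rotation after its insertion."
  [first-tick timeout t]
  (let [;; number of rotations that have already fired at or before t
        fired (if (< t first-tick)
                0
                (inc (quot (- t first-tick) timeout)))
        ;; time of the first rotation strictly after t
        next-tick (+ first-tick (* fired timeout))]
    ;; the first rotation ages the entry, the second one expires it
    (+ next-tick timeout)))

(defn failed-within?
  "True if an entry inserted at `t` has expired once the clock has been
  advanced by `advance` seconds past `t`."
  [first-tick timeout t advance]
  (<= (expiry-time first-tick timeout t) (+ t advance)))

;; Same timeout (5) and same feed time (0); only the phase of the first
;; rotation differs.
(expiry-time 1 5 0)      ;; => 6
(expiry-time 5 5 0)      ;; => 10
(failed-within? 1 5 0 7) ;; => true
(failed-within? 5 5 0 7) ;; => false
```

Under that assumption, the delay before a tuple is failed falls anywhere
between the timeout and twice the timeout, depending on when the tuple was fed
relative to the rotation schedule. Anything that shifts that phase, such as
extra executors or slower start-up, would change how far the clock has to be
advanced.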

The test also passes if I run it on its own with a lein2 test selector, but
fails if I run all unit tests together.

I spent 6 hours trying to fix the unit tests, but haven't figured it out yet.




> Simulated Cluster Time doesn't work well for me.
> ------------------------------------------------
>
>                 Key: STORM-163
>                 URL: https://issues.apache.org/jira/browse/STORM-163
>             Project: Apache Storm (Incubating)
>          Issue Type: Bug
>            Reporter: James Xu
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v6.2#6252)
