[jira] [Updated] (STORM-3384) storm set-log-level command throws wrong exception when the topology is not running
[ https://issues.apache.org/jira/browse/STORM-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3384: --- Fix Version/s: (was: 1.2.4) 1.2.3 > storm set-log-level command throws wrong exception when the topology is not > running > --- > > Key: STORM-3384 > URL: https://issues.apache.org/jira/browse/STORM-3384 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 1.0.6, 1.1.3, 1.2.2 >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Trivial > Labels: pull-request-available > Fix For: 1.2.3, 1.1.4 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > https://github.com/apache/storm/blob/1.1.x-branch/storm-core/src/clj/org/apache/storm/command/set_log_level.clj#L31 > will throw an exception like the following if the topology is not running > {code:java} > 3396 [main] INFO b.s.c.set-log-level - Sent log config > LogConfig(named_logger_level:{ROOT=LogLevel(action:UPDATE, > target_log_level:DEBUG, reset_log_level_timeout_secs:30)}) for topology w > Exception in thread "main" java.lang.IllegalArgumentException: No matching > field found: IllegalArgumentException for class java.lang.String > at clojure.lang.Reflector.getInstanceField(Reflector.java:271) > at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:315) > at > backtype.storm.command.set_log_level$get_storm_id.invoke(set_log_level.clj:31) > at > backtype.storm.command.set_log_level$_main.doInvoke(set_log_level.clj:75) > at clojure.lang.RestFn.applyTo(RestFn.java:137) > at backtype.storm.command.set_log_level.main(Unknown Source) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
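The stack trace above is the classic symptom of Clojure reflection being asked to treat `IllegalArgumentException` as a field of a `String` instead of constructing the exception. A minimal Java sketch of the intended check (the method name mirrors the Clojure helper `get-storm-id`; the map lookup is a hypothetical stand-in for the cluster-state call, not Storm's actual API):

```java
import java.util.Map;

public class SetLogLevelCheck {
    // Hypothetical stand-in for the name->id lookup that set_log_level.clj
    // performs against cluster state; the real code is Clojure.
    static String getStormId(Map<String, String> nameToId, String topologyName) {
        String stormId = nameToId.get(topologyName);
        if (stormId == null) {
            // Construct the exception properly. The buggy Clojure version
            // effectively asked reflection for a field named
            // "IllegalArgumentException" on a String, producing the confusing
            // "No matching field found" error shown above.
            throw new IllegalArgumentException(
                "Topology '" + topologyName + "' is not running");
        }
        return stormId;
    }
}
```

With a check like this, a user who mistypes the topology name gets a clear "not running" message instead of a reflection error.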
[jira] [Updated] (STORM-3381) Upgrading to Zookeeper 3.4.14 added an LGPL dependency
[ https://issues.apache.org/jira/browse/STORM-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3381: --- Fix Version/s: (was: 1.2.4) 1.2.3 > Upgrading to Zookeeper 3.4.14 added an LGPL dependency > -- > > Key: STORM-3381 > URL: https://issues.apache.org/jira/browse/STORM-3381 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.2.3, 1.1.4 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Blocker > Labels: pull-request-available > Fix For: 2.0.0, 1.2.3, 1.1.4 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > https://issues.apache.org/jira/browse/ZOOKEEPER-3367 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3233) Upgrade zookeeper client to newest version (3.4.13)
[ https://issues.apache.org/jira/browse/STORM-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3233: --- Fix Version/s: (was: 1.2.4) 1.2.3 > Upgrade zookeeper client to newest version (3.4.13) > --- > > Key: STORM-3233 > URL: https://issues.apache.org/jira/browse/STORM-3233 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Michal Koziorowski >Assignee: Peter Chamberlain >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.2.3, 1.1.4 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Hi, > I would like to see new zookeeper client (3.4.13) used in storm. > New release contains an important fix for cloud environments where zookeeper > servers have dynamic ips > ([https://jira.apache.org/jira/browse/ZOOKEEPER-2184]). > If possible, it would be nice to see updated zookeeper also on older storm > versions (1.2.x, 1.1.x) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (STORM-3233) Upgrade zookeeper client to newest version (3.4.13)
[ https://issues.apache.org/jira/browse/STORM-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16830397#comment-16830397 ] P. Taylor Goetz commented on STORM-3233: [~Srdo] yep. I'll revert and do another RC tomorrow. > Upgrade zookeeper client to newest version (3.4.13) > --- > > Key: STORM-3233 > URL: https://issues.apache.org/jira/browse/STORM-3233 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Michal Koziorowski >Assignee: Peter Chamberlain >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.1.4, 1.2.4 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Hi, > I would like to see new zookeeper client (3.4.13) used in storm. > New release contains an important fix for cloud environments where zookeeper > servers have dynamic ips > ([https://jira.apache.org/jira/browse/ZOOKEEPER-2184]). > If possible, it would be nice to see updated zookeeper also on older storm > versions (1.2.x, 1.1.x) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3384) storm set-log-level command throws wrong exception when the topology is not running
[ https://issues.apache.org/jira/browse/STORM-3384?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3384: --- Fix Version/s: (was: 1.2.3) 1.2.4 > storm set-log-level command throws wrong exception when the topology is not > running > --- > > Key: STORM-3384 > URL: https://issues.apache.org/jira/browse/STORM-3384 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 1.0.6, 1.1.3, 1.2.2 >Reporter: Ethan Li >Assignee: Ethan Li >Priority: Trivial > Labels: pull-request-available > Fix For: 1.1.4, 1.2.4 > > Time Spent: 2h 20m > Remaining Estimate: 0h > > https://github.com/apache/storm/blob/1.1.x-branch/storm-core/src/clj/org/apache/storm/command/set_log_level.clj#L31 > will throw an exception like the following if the topology is not running > {code:java} > 3396 [main] INFO b.s.c.set-log-level - Sent log config > LogConfig(named_logger_level:{ROOT=LogLevel(action:UPDATE, > target_log_level:DEBUG, reset_log_level_timeout_secs:30)}) for topology w > Exception in thread "main" java.lang.IllegalArgumentException: No matching > field found: IllegalArgumentException for class java.lang.String > at clojure.lang.Reflector.getInstanceField(Reflector.java:271) > at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:315) > at > backtype.storm.command.set_log_level$get_storm_id.invoke(set_log_level.clj:31) > at > backtype.storm.command.set_log_level$_main.doInvoke(set_log_level.clj:75) > at clojure.lang.RestFn.applyTo(RestFn.java:137) > at backtype.storm.command.set_log_level.main(Unknown Source) > {code} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3381) Upgrading to Zookeeper 3.4.14 added an LGPL dependency
[ https://issues.apache.org/jira/browse/STORM-3381?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3381: --- Fix Version/s: (was: 1.2.3) 1.2.4 > Upgrading to Zookeeper 3.4.14 added an LGPL dependency > -- > > Key: STORM-3381 > URL: https://issues.apache.org/jira/browse/STORM-3381 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.2.3, 1.1.4 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Blocker > Labels: pull-request-available > Fix For: 2.0.0, 1.1.4, 1.2.4 > > Time Spent: 1h 20m > Remaining Estimate: 0h > > https://issues.apache.org/jira/browse/ZOOKEEPER-3367 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3233) Upgrade zookeeper client to newest version (3.4.13)
[ https://issues.apache.org/jira/browse/STORM-3233?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3233: --- Fix Version/s: (was: 1.2.3) 1.2.4 > Upgrade zookeeper client to newest version (3.4.13) > --- > > Key: STORM-3233 > URL: https://issues.apache.org/jira/browse/STORM-3233 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Michal Koziorowski >Assignee: Peter Chamberlain >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.1.4, 1.2.4 > > Time Spent: 4h 10m > Remaining Estimate: 0h > > Hi, > I would like to see new zookeeper client (3.4.13) used in storm. > New release contains an important fix for cloud environments where zookeeper > servers have dynamic ips > ([https://jira.apache.org/jira/browse/ZOOKEEPER-2184]). > If possible, it would be nice to see updated zookeeper also on older storm > versions (1.2.x, 1.1.x) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3348) Incorrect message when group id is not provided as kafka spout config on storm ui
[ https://issues.apache.org/jira/browse/STORM-3348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3348: --- Fix Version/s: (was: 1.2.3) 1.2.4 > Incorrect message when group id is not provided as kafka spout config on > storm ui > - > > Key: STORM-3348 > URL: https://issues.apache.org/jira/browse/STORM-3348 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x, 1.2.2 > Environment: prod >Reporter: Vivek Ojha >Assignee: Vivek Ojha >Priority: Trivial > Labels: pull-request-available > Fix For: 2.0.0, 1.2.4 > > Time Spent: 1.5h > Remaining Estimate: 0h > > Steps to reproduce the issue - > # Use Kafka as the source for a spout. > # Don't provide a group id in the spout configuration. > # Start the topology and go to the Storm UI topology page. > # Instead of showing Kafka spout lag, it shows the following message - > "Offset lags for kafka not supported for older versions. Please update kafka > spout to latest version.", even though the Kafka spout has the correct version. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3339) Port all the AtomicReference to ConcurrentHashMap for Nimbus
[ https://issues.apache.org/jira/browse/STORM-3339?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3339: --- Fix Version/s: (was: 2.0.0) > Port all the AtomicReference to ConcurrentHashMap for Nimbus > > > Key: STORM-3339 > URL: https://issues.apache.org/jira/browse/STORM-3339 > Project: Apache Storm > Issue Type: Improvement > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Danny Chan >Assignee: Danny Chan >Priority: Major > > Now many concurrently accessed resources in Nimbus.java use > AtomicReference to make them thread safe. The resources are summarized > below: > 1. heartbeatsCache > 2. schedulingStartTimeNs > 3. idToSchedStatus > 4. nodeIdToResources > 5. idToWorkerResources > 6. idToExecutors > Items 1, 4, 5 and 6 may grow huge if we have hundreds of topologies on the > cluster. When we update an AtomicReference, we actually pass in a Function and > use compareAndSet to update the whole value to the new one returned by the > Function; in that case, we must do a reference copy and merge the changes, > which seems unnecessary. > I think the use of AtomicReference is a legacy of the old Clojure code; > we can replace them entirely with ConcurrentHashMap, which offers better > performance. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
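The copy-and-merge cost described in STORM-3339 above can be seen by contrasting the two update styles. A hedged sketch (the cache names and value types are illustrative, not Storm's actual fields):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicReference;

public class CacheUpdateStyles {
    // Old style: the whole map lives behind one AtomicReference.
    static final AtomicReference<Map<String, Integer>> refCache =
        new AtomicReference<>(new HashMap<>());

    // Updating a single key copies and replaces the entire map: O(n) work
    // (and garbage) per update, which hurts with hundreds of topologies.
    static void refUpdate(String topoId, int workers) {
        refCache.updateAndGet(old -> {
            Map<String, Integer> copy = new HashMap<>(old); // full copy + merge
            copy.put(topoId, workers);
            return copy;
        });
    }

    // New style: ConcurrentHashMap updates one key atomically, no whole-map copy.
    static final ConcurrentHashMap<String, Integer> chmCache = new ConcurrentHashMap<>();

    static void chmUpdate(String topoId, int workers) {
        chmCache.merge(topoId, workers, Integer::sum);
    }
}
```

Both styles are thread safe; the ConcurrentHashMap version simply pays per-key instead of per-map.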
[jira] [Updated] (STORM-3291) Worker can't run as the user who submitted the topology
[ https://issues.apache.org/jira/browse/STORM-3291?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3291: --- Fix Version/s: (was: 2.0.0) > Worker can't run as the user who submitted the topology > --- > > Key: STORM-3291 > URL: https://issues.apache.org/jira/browse/STORM-3291 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 1.2.2 >Reporter: liuzhaokun >Assignee: liuzhaokun >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > Without a principal, the worker can't be launched as the user who submitted the > topology even if we set "supervisor.run.worker.as.user" to true, because the > submitterUser will be overwritten by the user who launched nimbus. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (STORM-3144) Extend metrics on Nimbus
[ https://issues.apache.org/jira/browse/STORM-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz closed STORM-3144. -- Resolution: Duplicate > Extend metrics on Nimbus > > > Key: STORM-3144 > URL: https://issues.apache.org/jira/browse/STORM-3144 > Project: Apache Storm > Issue Type: Improvement > Components: storm-webapp >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Assignee: Zhengdai Hu >Priority: Major > > Metrics include: > # File upload time > # Nimbus restart count > # Nimbus loss of leadership: meter marking when a nimbus node gains or loses > leadership > # Excessive scheduling time (both duration distribution and current longest) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (STORM-3144) Extend metrics on Nimbus
[ https://issues.apache.org/jira/browse/STORM-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reopened STORM-3144: > Extend metrics on Nimbus > > > Key: STORM-3144 > URL: https://issues.apache.org/jira/browse/STORM-3144 > Project: Apache Storm > Issue Type: Improvement > Components: storm-webapp >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Assignee: Zhengdai Hu >Priority: Major > > Metrics include: > # File upload time > # Nimbus restart count > # Nimbus loss of leadership: meter marking when a nimbus node gains or loses > leadership > # Excessive scheduling time (both duration distribution and current longest) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3144) Extend metrics on Nimbus
[ https://issues.apache.org/jira/browse/STORM-3144?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3144: --- Fix Version/s: (was: 2.0.0) > Extend metrics on Nimbus > > > Key: STORM-3144 > URL: https://issues.apache.org/jira/browse/STORM-3144 > Project: Apache Storm > Issue Type: Improvement > Components: storm-webapp >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Assignee: Zhengdai Hu >Priority: Major > > Metrics include: > # File upload time > # Nimbus restart count > # Nimbus loss of leadership: meter marking when a nimbus node gains or loses > leadership > # Excessive scheduling time (both duration distribution and current longest) -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (STORM-1356) StormSQL Explain Execute Plan
[ https://issues.apache.org/jira/browse/STORM-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz closed STORM-1356. -- Resolution: Duplicate > StormSQL Explain Execute Plan > -- > > Key: STORM-1356 > URL: https://issues.apache.org/jira/browse/STORM-1356 > Project: Apache Storm > Issue Type: New Feature > Components: storm-sql >Reporter: darion yaphet >Assignee: Jungtaek Lim >Priority: Major > > StormSQL may need an explain method to show a SQL statement's topology component > structure, like MySQL and other RDBMSs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (STORM-1356) StormSQL Explain Execute Plan
[ https://issues.apache.org/jira/browse/STORM-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reopened STORM-1356: > StormSQL Explain Execute Plan > -- > > Key: STORM-1356 > URL: https://issues.apache.org/jira/browse/STORM-1356 > Project: Apache Storm > Issue Type: New Feature > Components: storm-sql >Reporter: darion yaphet >Assignee: Jungtaek Lim >Priority: Major > > StormSQL may need an explain method to show a SQL statement's topology component > structure, like MySQL and other RDBMSs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-1356) StormSQL Explain Execute Plan
[ https://issues.apache.org/jira/browse/STORM-1356?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1356: --- Fix Version/s: (was: 1.1.0) (was: 2.0.0) > StormSQL Explain Execute Plan > -- > > Key: STORM-1356 > URL: https://issues.apache.org/jira/browse/STORM-1356 > Project: Apache Storm > Issue Type: New Feature > Components: storm-sql >Reporter: darion yaphet >Assignee: Jungtaek Lim >Priority: Major > > StormSQL may need an explain method to show a SQL statement's topology component > structure, like MySQL and other RDBMSs. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-1492) With nimbus.seeds set to default, a nimbus for localhost may appear "Offline"
[ https://issues.apache.org/jira/browse/STORM-1492?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-1492. Resolution: Fixed > With nimbus.seeds set to default, a nimbus for localhost may appear "Offline" > - > > Key: STORM-1492 > URL: https://issues.apache.org/jira/browse/STORM-1492 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.0 >Reporter: P. Taylor Goetz >Assignee: Jungtaek Lim >Priority: Minor > Fix For: 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > > With the default value for {{nimbus.seeds}} ({{["localhost"]}}) Storm UI may > list one "Offline" nimbus for localhost, and another as "Leader" for the > resolved machine name. > Steps to reproduce (assumes ZK is running; all on local machine): > 1. Clean install of 1.0.0-SNAPSHOT (do not modify {{storm.yaml}}) > 2. Start nimbus > 3. Start supervisor > 4. Start ui > 5. Navigate to http://localhost:8080 > A workaround is to modify {{storm.yaml}} and replace "localhost" with the > hostname of the machine in {{nimbus.seeds}}. > While trivial to correct, this may confuse users. One approach is to simply > document this behavior. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
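The workaround described in STORM-1492 above amounts to a one-line change in storm.yaml. A minimal sketch, assuming the machine's resolved hostname is `storm-host.example.com` (a placeholder to replace with your own):

```yaml
# storm.yaml workaround for the duplicate "Offline" nimbus entry:
# list the machine's real hostname instead of the default "localhost".
nimbus.seeds: ["storm-host.example.com"]
```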
[jira] [Updated] (STORM-1692) HBaseSecurityUtil#login modifies the current UGI causing issues if two instances are running with different credentials
[ https://issues.apache.org/jira/browse/STORM-1692?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1692: --- Fix Version/s: (was: 2.0.0) > HBaseSecurityUtil#login modifies the current UGI causing issues if two > instances are running with different credentials > --- > > Key: STORM-1692 > URL: https://issues.apache.org/jira/browse/STORM-1692 > Project: Apache Storm > Issue Type: Bug > Components: storm-hbase >Reporter: Satish Duggana >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-1651) Add event time based support for trident windowing.
[ https://issues.apache.org/jira/browse/STORM-1651?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1651: --- Fix Version/s: (was: 2.0.0) > Add event time based support for trident windowing. > --- > > Key: STORM-1651 > URL: https://issues.apache.org/jira/browse/STORM-1651 > Project: Apache Storm > Issue Type: New Feature > Components: trident >Reporter: Satish Duggana >Assignee: Satish Duggana >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Closed] (STORM-1697) artifacts symlink not created
[ https://issues.apache.org/jira/browse/STORM-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz closed STORM-1697. -- Resolution: Invalid > artifacts symlink not created > -- > > Key: STORM-1697 > URL: https://issues.apache.org/jira/browse/STORM-1697 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Zhuo Liu >Priority: Major > > No artifacts symlink is generated under the worker's current directory. GC log, > jstack and heap dump will not work. > 2016-04-07 17:43:19.909 STDERR [INFO] Java HotSpot(TM) 64-Bit Server VM > warning: Cannot open file artifacts/gc.log due to No such file or directory > 2016-04-07 17:43:19.913 STDERR [INFO] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-1697) artifacts symlink not created
[ https://issues.apache.org/jira/browse/STORM-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1697: --- Fix Version/s: (was: 2.0.0) > artifacts symlink not created > -- > > Key: STORM-1697 > URL: https://issues.apache.org/jira/browse/STORM-1697 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Zhuo Liu >Priority: Major > > No artifacts symlink is generated under the worker's current directory. GC log, > jstack and heap dump will not work. > 2016-04-07 17:43:19.909 STDERR [INFO] Java HotSpot(TM) 64-Bit Server VM > warning: Cannot open file artifacts/gc.log due to No such file or directory > 2016-04-07 17:43:19.913 STDERR [INFO] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (STORM-1697) artifacts symlink not created
[ https://issues.apache.org/jira/browse/STORM-1697?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reopened STORM-1697: > artifacts symlink not created > -- > > Key: STORM-1697 > URL: https://issues.apache.org/jira/browse/STORM-1697 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 2.0.0 >Reporter: Zhuo Liu >Priority: Major > > No artifacts symlink is generated under the worker's current directory. GC log, > jstack and heap dump will not work. > 2016-04-07 17:43:19.909 STDERR [INFO] Java HotSpot(TM) 64-Bit Server VM > warning: Cannot open file artifacts/gc.log due to No such file or directory > 2016-04-07 17:43:19.913 STDERR [INFO] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2393) Support Hortonworks schema registry with HDFS connector
[ https://issues.apache.org/jira/browse/STORM-2393?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2393: --- Fix Version/s: (was: 2.0.0) > Support Hortonworks schema registry with HDFS connector > --- > > Key: STORM-2393 > URL: https://issues.apache.org/jira/browse/STORM-2393 > Project: Apache Storm > Issue Type: Improvement >Reporter: Satish Duggana >Assignee: Satish Duggana >Priority: Major > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2646) NimbusClient Class cast exception when nimbus seeds is not an array of hosts
[ https://issues.apache.org/jira/browse/STORM-2646?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2646: --- Fix Version/s: (was: 2.0.0) > NimbusClient Class cast exception when nimbus seeds is not an array of hosts > > > Key: STORM-2646 > URL: https://issues.apache.org/jira/browse/STORM-2646 > Project: Apache Storm > Issue Type: Bug > Components: storm-client >Affects Versions: 1.0.3 >Reporter: Eugeniu Cararus >Priority: Major > Labels: client, config, nimbus > Original Estimate: 2h > Time Spent: 2h 20m > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2669) Extend the BinaryEventDataScheme in storm-eventhubs to include MessageId in addition to system properties
[ https://issues.apache.org/jira/browse/STORM-2669?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2669: --- Fix Version/s: (was: 2.0.0) > Extend the BinaryEventDataScheme in storm-eventhubs to include MessageId in > addition to system properties > - > > Key: STORM-2669 > URL: https://issues.apache.org/jira/browse/STORM-2669 > Project: Apache Storm > Issue Type: Improvement > Components: storm-eventhubs >Affects Versions: 2.0.0 > Environment: Ubuntu/Azure >Reporter: Ravi Tandon >Priority: Minor > > Currently there are two types of EventDataScheme included with the > storm-eventhubs spout. > The default is the StringEventDataScheme that emits a single output field, > the message itself as a string. > There is an additional BinaryEventDataScheme that passes the message as is, > but also has two additional fields, metadata and system_metadata, which are > passed by the eventhubs-client. > The system_metadata only contains the sequence number, offset and enqueued > time of an event. > As part of recent requirements by certain applications for tracking an event, > they also need the partition id. The partition id is NOT sent by the > eventhubs-client; instead, the partition manager in the spout already has this > information. > The goal of this JIRA is to introduce another output field in > BinaryEventDataScheme that contains the MessageId for an event. The MessageId > will contain the partitionId, sequence number and offset information, so any > downstream bolt can locate where the message arrived from. > I will also be fixing any maven checkstyle warnings/errors in the files that > I will be committing changes in. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
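The proposed MessageId in STORM-2669 above could be a small value type. This is a hedged sketch with assumed field names, not the actual storm-eventhubs API:

```java
public class EventHubMessageId {
    // Field names are assumptions: the partition id would come from the
    // spout's partition manager, the sequence number and offset from the
    // event's system properties.
    final String partitionId;
    final long sequenceNumber;
    final String offset;

    EventHubMessageId(String partitionId, long sequenceNumber, String offset) {
        this.partitionId = partitionId;
        this.sequenceNumber = sequenceNumber;
        this.offset = offset;
    }

    // A compact rendering a downstream bolt could log or key on.
    @Override
    public String toString() {
        return partitionId + ":" + sequenceNumber + "@" + offset;
    }
}
```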
[jira] [Resolved] (STORM-2725) Support GPUs and other generic resource types in scheduling of topologes
[ https://issues.apache.org/jira/browse/STORM-2725?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2725. Resolution: Fixed > Support GPUs and other generic resource types in scheduling of topologes > > > Key: STORM-2725 > URL: https://issues.apache.org/jira/browse/STORM-2725 > Project: Apache Storm > Issue Type: Epic > Components: storm-core >Reporter: Govind Menon >Assignee: Govind Menon >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0 > > Time Spent: 0.5h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2745) Hdfs Open Files problem
[ https://issues.apache.org/jira/browse/STORM-2745?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2745: --- Fix Version/s: (was: 1.x) (was: 2.0.0) > Hdfs Open Files problem > --- > > Key: STORM-2745 > URL: https://issues.apache.org/jira/browse/STORM-2745 > Project: Apache Storm > Issue Type: New Feature > Components: storm-hdfs >Affects Versions: 2.0.0, 1.x >Reporter: Shoeb >Priority: Major > Labels: features, pull-request-available, starter > Original Estimate: 48h > Time Spent: 50m > Remaining Estimate: 47h 10m > > Issue: > The problem exists when there are multiple HDFS writers in the writersMap. Each > writer keeps an open HDFS handle to its file. In the case of an inactive writer (i.e. > one that has not consumed any data for a long period), the files are not > closed and always remain open. > Ideally, these files should be closed and the HDFS writers removed from the > writersMap. > Solution: > Implement a ClosingFilesPolicy based on tick tuple intervals. At each > tick tuple, all writers are checked and closed if they have been idle for a long time. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
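The ClosingFilesPolicy idea in STORM-2745 above can be sketched as a tick-driven sweep over the writer map. This is an illustrative stand-in under stated assumptions, not the storm-hdfs implementation; `Writer` here is a hypothetical handle type:

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.Map;

public class IdleWriterReaper {
    static class Writer {
        long lastUsedMs;   // updated each time data is written through this handle
        boolean closed;
        Writer(long lastUsedMs) { this.lastUsedMs = lastUsedMs; }
        void close() { closed = true; } // would close the underlying HDFS file handle
    }

    final Map<String, Writer> writersMap = new HashMap<>(); // path -> writer
    final long maxIdleMs;

    IdleWriterReaper(long maxIdleMs) { this.maxIdleMs = maxIdleMs; }

    // Called on every tick tuple: close and evict writers idle too long,
    // so inactive files do not keep HDFS handles open indefinitely.
    int onTick(long nowMs) {
        int closed = 0;
        for (Iterator<Map.Entry<String, Writer>> it = writersMap.entrySet().iterator();
             it.hasNext();) {
            Map.Entry<String, Writer> e = it.next();
            if (nowMs - e.getValue().lastUsedMs > maxIdleMs) {
                e.getValue().close();
                it.remove();
                closed++;
            }
        }
        return closed;
    }
}
```

The tick interval and idle threshold would be bolt configuration in a real deployment.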
[jira] [Updated] (STORM-3045) Microsoft Azure EventHubs: Storm Spout and Bolt improvements
[ https://issues.apache.org/jira/browse/STORM-3045?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3045: --- Fix Version/s: (was: 2.0.0) > Microsoft Azure EventHubs: Storm Spout and Bolt improvements > > > Key: STORM-3045 > URL: https://issues.apache.org/jira/browse/STORM-3045 > Project: Apache Storm > Issue Type: Improvement > Components: storm-eventhubs >Affects Versions: 2.0.0 >Reporter: Sreeram Garlapati >Assignee: Sreeram Garlapati >Priority: Minor > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3016) Nimbus gets down when job has large amount of parallelism components
[ https://issues.apache.org/jira/browse/STORM-3016?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3016: --- Fix Version/s: (was: 2.0.0) > Nimbus gets down when job has large amount of parallelism components > > > Key: STORM-3016 > URL: https://issues.apache.org/jira/browse/STORM-3016 > Project: Apache Storm > Issue Type: Improvement > Components: storm-core >Affects Versions: 2.0.0 >Reporter: StaticMian >Priority: Major > Labels: security > Attachments: nimbus.log > > Original Estimate: 96h > Remaining Estimate: 96h > > When a job with a large number of parallel components (total parallelism > rising to 5000, for example) is submitted to the Storm cluster, Nimbus might > crash. The workflow is as below: > 1) Nimbus computes the assignment > 2) Nimbus sends the assignment to ZooKeeper > 3) When the assignment mapping info string is too long, because the > total parallelism of the job is too large, sending this info to ZooKeeper will fail > (the znode data length defaults to 1 MB) > 4) Nimbus keeps trying to send this assignment info; after > several attempts it gives up and crashes. When that happens, the stability of the > cluster is greatly impacted. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
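Step 3 above can be guarded against up front: check the serialized assignment size against ZooKeeper's default 1 MB znode limit (the `jute.maxbuffer` setting) before writing, instead of retrying a write that can never succeed. A hedged sketch; the guard method is hypothetical, not Nimbus code:

```java
import java.nio.charset.StandardCharsets;

public class ZkAssignmentGuard {
    // ZooKeeper's default jute.maxbuffer: 1 MB per znode payload.
    static final int ZK_MAX_ZNODE_BYTES = 1024 * 1024;

    // Detect oversized assignment payloads before the ZK write, so the
    // caller can split/compress the data or fail fast with a clear error
    // rather than retrying until Nimbus gives up.
    static boolean fitsInZnode(String serializedAssignment) {
        int bytes = serializedAssignment.getBytes(StandardCharsets.UTF_8).length;
        return bytes <= ZK_MAX_ZNODE_BYTES;
    }
}
```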
[jira] [Updated] (STORM-3101) Fix unexpected metrics registration in StormMetricsRegistry
[ https://issues.apache.org/jira/browse/STORM-3101?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3101: --- Fix Version/s: (was: 2.0.0) > Fix unexpected metrics registration in StormMetricsRegistry > --- > > Key: STORM-3101 > URL: https://issues.apache.org/jira/browse/STORM-3101 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Assignee: Zhengdai Hu >Priority: Major > Labels: pull-request-available > Time Spent: 3.5h > Remaining Estimate: 0h > > Metrics registered using StormMetricsRegistry are all added through > static methods from the registry class, and attached to a singleton > MetricRegistry object per process. Currently most metrics are bound to > classes (static), so the issue occurs when metrics from irrelevant components > are accidentally registered in the class initialization phase. > For example, a process running the supervisor daemon will incorrectly register > metrics from nimbus when the BasicContainer class is initialized and statically > imports > "org.apache.storm.daemon.nimbus.Nimbus.MIN_VERSION_SUPPORT_RPC_HEARTBEAT", > which triggers initialization of the Nimbus class and all its metrics registration, > even though no nimbus daemon functionality will be used and no nimbus > metrics will be updated. > This creates many garbage metrics and makes metrics hard to read. Therefore > we should filter metrics registration by the type of daemon that the process > actually runs. > For implementation details please see the pull request. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
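The accidental-registration mechanism in STORM-3101 above is plain JVM class-initialization semantics: reading a non-constant static field of a class runs that class's static initializers. A self-contained sketch; the class and metric names are stand-ins for Nimbus and BasicContainer, not Storm's real code:

```java
import java.util.ArrayList;
import java.util.List;

public class MetricsInitDemo {
    static final List<String> REGISTRY = new ArrayList<>();

    // Stand-in for Nimbus: class initialization registers nimbus-only metrics
    // as a side effect.
    static class Nimbus {
        // Non-constant static field: reading it forces Nimbus class init
        // (a compile-time constant would be inlined and would NOT).
        static final int[] MIN_VERSION = {2, 0};
        static {
            REGISTRY.add("nimbus:num-gained-leadership");
        }
    }

    // Stand-in for BasicContainer running in the supervisor daemon: merely
    // touching Nimbus.MIN_VERSION drags nimbus metrics into this process.
    static int supervisorCodePath() {
        return Nimbus.MIN_VERSION[0];
    }
}
```

Filtering registration by the daemon type, as the issue proposes, breaks this accidental coupling even when class init is triggered.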
[jira] [Updated] (STORM-3104) Delayed worker launch due to accidental transitioning in state machine
[ https://issues.apache.org/jira/browse/STORM-3104?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3104: --- Fix Version/s: (was: 2.0.0) > Delayed worker launch due to accidental transitioning in state machine > -- > > Key: STORM-3104 > URL: https://issues.apache.org/jira/browse/STORM-3104 > Project: Apache Storm > Issue Type: Bug > Components: storm-server >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Priority: Major > > In Slot.java, there is a comparison in > {code:java} > handleWaitingForBlobUpdate() > {code} > between the dynamic state's current assignment and the new assignment, which > accidentally routes a state machine that just transitioned to > WAITING_FOR_BLOB_LOCALIZATION back into WAITING_FOR_BLOB_LOCALIZATION, because the > current assignment in this case is highly likely to be null and different > from the new assignment (I'm not sure if that is guaranteed). This causes a delay for > a worker to start/restart. > The symptom can be reproduced by launching an empty storm server and submitting > any topology.
Here's the log sample (relevant transition starting from > 2018-06-13 16:57:12.274 o.a.s.d.s.Slot SLOT_6700 [DEBUG]): > {code:sh} > 2018-06-13 16:57:10.254 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE EMPTY > msInState: 6024 -> EMPTY msInState: 6024 > 2018-06-13 16:57:10.255 o.a.s.d.s.Slot SLOT_6700 [DEBUG] STATE EMPTY > 2018-06-13 16:57:10.257 o.a.s.d.s.Slot SLOT_6700 [DEBUG] Transition from > EMPTY to WAITING_FOR_BLOB_LOCALIZATION > 2018-06-13 16:57:10.257 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE EMPTY > msInState: 6027 -> WAITING_FOR_BLOB_LOCALIZATION msInState: 0 > 2018-06-13 16:57:10.258 o.a.s.d.s.Slot SLOT_6700 [DEBUG] STATE > WAITING_FOR_BLOB_LOCALIZATION > 2018-06-13 16:57:10.258 o.a.s.d.s.Slot SLOT_6700 [DEBUG] pendingChangingBlobs > are [] > 2018-06-13 16:57:11.259 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE > WAITING_FOR_BLOB_LOCALIZATION msInState: 1003 -> > WAITING_FOR_BLOB_LOCALIZATION msInState: 1003 > 2018-06-13 16:57:11.260 o.a.s.d.s.Slot SLOT_6700 [DEBUG] STATE > WAITING_FOR_BLOB_LOCALIZATION > 2018-06-13 16:57:11.260 o.a.s.d.s.Slot SLOT_6700 [DEBUG] found changing blobs > [BLOB CHANGING LOCAL TOPO BLOB TOPO_CONF test-1-1528927024 > LocalAssignment(topology_id:test-1-1528927024, > executors:[ExecutorInfo(task_start:10, task_end:10), > ExecutorInfo(task_start:16, task_end:16), ExecutorInfo(task_start:4, > task_end:4), ExecutorInfo(task_start:7, task_end:7), > ExecutorInfo(task_start:1, task_end:1), ExecutorInfo(task_start:13, > task_end:13)], resources:WorkerResources(mem_on_heap:768.0, mem_off_heap:0.0, > cpu:60.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, > resources:{offheap.memory.mb=0.0, onheap.memory.mb=768.0, > cpu.pcore.percent=60.0}, shared_resources:{}), owner:zhu02), BLOB CHANGING > LOCAL TOPO BLOB TOPO_CODE test-1-1528927024 > LocalAssignment(topology_id:test-1-1528927024, > executors:[ExecutorInfo(task_start:10, task_end:10), > ExecutorInfo(task_start:16, task_end:16), ExecutorInfo(task_start:4, > task_end:4), ExecutorInfo(task_start:7, 
task_end:7), > ExecutorInfo(task_start:1, task_end:1), ExecutorInfo(task_start:13, > task_end:13)], resources:WorkerResources(mem_on_heap:768.0, mem_off_heap:0.0, > cpu:60.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, > resources:{offheap.memory.mb=0.0, onheap.memory.mb=768.0, > cpu.pcore.percent=60.0}, shared_resources:{}), owner:zhu02)] moving them to > pending... > 2018-06-13 16:57:12.262 o.a.s.d.s.Slot SLOT_6700 [INFO] STATE > WAITING_FOR_BLOB_LOCALIZATION msInState: 2005 -> > WAITING_FOR_BLOB_LOCALIZATION msInState: 2005 > 2018-06-13 16:57:12.263 o.a.s.d.s.Slot SLOT_6700 [DEBUG] STATE > WAITING_FOR_BLOB_LOCALIZATION > 2018-06-13 16:57:12.263 o.a.s.d.s.Slot SLOT_6700 [DEBUG] found changing blobs > [BLOB CHANGING LOCAL TOPO BLOB TOPO_JAR test-1-1528927024 > LocalAssignment(topology_id:test-1-1528927024, > executors:[ExecutorInfo(task_start:10, task_end:10), > ExecutorInfo(task_start:16, task_end:16), ExecutorInfo(task_start:4, > task_end:4), ExecutorInfo(task_start:7, task_end:7), > ExecutorInfo(task_start:1, task_end:1), ExecutorInfo(task_start:13, > task_end:13)], resources:WorkerResources(mem_on_heap:768.0, mem_off_heap:0.0, > cpu:60.0, shared_mem_on_heap:0.0, shared_mem_off_heap:0.0, > resources:{offheap.memory.mb=0.0, onheap.memory.mb=768.0, > cpu.pcore.percent=60.0}, shared_resources:{}), owner:zhu02)] moving them to > pending... > 2018-06-13 16:57:12.274 o.a.s.d.s.Slot SLOT_6700 [DEBUG] pendingLocalization > LocalAssignment(topology_id:test-1-1528927024, > executors:[ExecutorInfo(task_start:10, task_end:10), > ExecutorInfo(task_start:16, task_end:16), ExecutorInfo(task_start:4, > task_end:4), ExecutorInfo(task_start:7, task_end:7), >
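The comparison described in the ticket can be guarded so that a slot whose current assignment is still null does not spuriously route back into WAITING_FOR_BLOB_LOCALIZATION. A minimal illustrative sketch of that guard; the class, state names, and method below are simplified stand-ins, not Storm's actual Slot code:

```java
import java.util.Objects;

// Illustrative stand-in for Storm's Slot state handling; not the real class.
class SlotSketch {
    enum MachineState { WAITING_FOR_BLOB_LOCALIZATION, WAITING_FOR_WORKER_START }

    /**
     * Decide the next state after a blob update. Treating a null current
     * assignment as "no change yet" (rather than "different from the new
     * assignment") keeps the machine from re-entering
     * WAITING_FOR_BLOB_LOCALIZATION it only just left.
     */
    static MachineState nextState(String currentAssignment, String newAssignment) {
        if (currentAssignment == null || Objects.equals(currentAssignment, newAssignment)) {
            // Nothing actually changed (or no assignment existed yet):
            // proceed toward starting the worker instead of re-localizing.
            return MachineState.WAITING_FOR_WORKER_START;
        }
        return MachineState.WAITING_FOR_BLOB_LOCALIZATION;
    }
}
```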
[jira] [Updated] (STORM-3112) Incremental scheduling supports
[ https://issues.apache.org/jira/browse/STORM-3112?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3112: --- Fix Version/s: (was: 2.0.0) > Incremental scheduling supports > --- > > Key: STORM-3112 > URL: https://issues.apache.org/jira/browse/STORM-3112 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server >Affects Versions: 2.0.0 >Reporter: Yuzhao Chen >Assignee: Yuzhao Chen >Priority: Major > Labels: pull-request-available > Time Spent: 3h > Remaining Estimate: 0h > > As https://issues.apache.org/jira/browse/STORM-3093 describes, a > scheduling round currently performs a complete scan and computation over all the > topologies on the cluster, which is very heavy work once topologies grow > into the hundreds. > So this JIRA refactors the scheduling logic to consider only the > topologies that actually need scheduling. > Improvements: > 1. Cache the id -> storm base mapping, which reduces the pressure on ZooKeeper. > 2. Only schedule the topologies that need it: those with dead executors or not > enough running workers. > 3. Some schedulers still need a full scheduling pass, e.g. the > IsolationScheduler. > 4. Cache scheduling resources across scheduling rounds, i.e. nodeId > -> used slots, nodeId -> used resources, nodeId -> total resources. > Since https://issues.apache.org/jira/browse/STORM-3093 already caches the > storm-id -> executors mapping, a scheduling round now does the following: > 1. Scan all the active storm bases (cached) and the local > storm-conf/storm-topology, then refresh the heartbeats cache; this tells us > which topologies need scheduling. > 2. Compute scheduleAssignment only for the topologies that need scheduling. > Robustness when nimbus restarts: > 1. The cached storm bases are taken care of by ILocalAssignmentsBackend. > 2. The scheduling cache is refreshed by a full scheduling of all topologies the > first time scheduling runs. 
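The "only schedule the topologies that need it" filter described above can be sketched as follows. All of the types and names here are hypothetical illustrations, not Storm's real scheduler API:

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the incremental-scheduling filter; not Storm code.
class IncrementalSchedulerSketch {

    static class TopologySummary {
        final String id;
        final int deadExecutors;
        final int runningWorkers;
        final int desiredWorkers;

        TopologySummary(String id, int deadExecutors, int runningWorkers, int desiredWorkers) {
            this.id = id;
            this.deadExecutors = deadExecutors;
            this.runningWorkers = runningWorkers;
            this.desiredWorkers = desiredWorkers;
        }
    }

    /** A topology needs (re)scheduling if it has dead executors or too few running workers. */
    static boolean needsScheduling(TopologySummary t) {
        return t.deadExecutors > 0 || t.runningWorkers < t.desiredWorkers;
    }

    /** Instead of recomputing the whole cluster, collect only the topologies that need work. */
    static List<String> topologiesToSchedule(List<TopologySummary> all) {
        List<String> out = new ArrayList<>();
        for (TopologySummary t : all) {
            if (needsScheduling(t)) {
                out.add(t.id);
            }
        }
        return out;
    }
}
```

A full pass (as the IsolationScheduler would require) is then just the degenerate case where every topology is treated as needing scheduling.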
-- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3173) flush metrics to ScheduledReporter on shutdown
[ https://issues.apache.org/jira/browse/STORM-3173?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3173: --- Fix Version/s: (was: 2.0.0) > flush metrics to ScheduledReporter on shutdown > -- > > Key: STORM-3173 > URL: https://issues.apache.org/jira/browse/STORM-3173 > Project: Apache Storm > Issue Type: Improvement > Components: storm-server, storm-webapp >Affects Versions: 2.0.0 >Reporter: Zhengdai Hu >Assignee: Zhengdai Hu >Priority: Minor > Labels: pull-request-available > Time Spent: 6h 40m > Remaining Estimate: 0h > > We lose shutdown-related metrics that we should alert on. We > should flush metrics to the ScheduledReporter on shutdown. > https://github.com/apache/storm/blob/master/storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java#L4497 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
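The requested behavior, one final flush before the daemon exits, boils down to calling the reporter's report() method once from a shutdown path before stopping it. A minimal stdlib-only sketch using a stand-in Reporter interface in place of codahale's ScheduledReporter (whose real report() likewise performs one immediate flush):

```java
// Flush-on-shutdown pattern sketch; Reporter is a hypothetical stand-in.
class ShutdownFlushSketch {

    interface Reporter {
        void report(); // flush current metric values to the sink now
        void stop();   // halt the periodic reporting schedule
    }

    /** Flush once, then stop; this is what the shutdown path should run. */
    static void flushAndStop(Reporter r) {
        r.report();
        r.stop();
    }

    /** Register the final flush with the JVM so it runs even on daemon exit. */
    static void installShutdownHook(Reporter r) {
        Runtime.getRuntime().addShutdownHook(new Thread(() -> flushAndStop(r)));
    }
}
```

Without the explicit report() call, any metrics updated after the last scheduled interval are silently dropped when the process exits.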
[jira] [Updated] (STORM-3056) Add a test for quickly rebinding to a port
[ https://issues.apache.org/jira/browse/STORM-3056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3056: --- Fix Version/s: (was: 1.2.2) (was: 1.1.3) (was: 2.0.0) > Add a test for quickly rebinding to a port > -- > > Key: STORM-3056 > URL: https://issues.apache.org/jira/browse/STORM-3056 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.1.2, 1.2.1 >Reporter: Raghav Kumar Gautam >Assignee: Raghav Kumar Gautam >Priority: Minor > Labels: pull-request-available > Time Spent: 50m > Remaining Estimate: 0h > > We need to add a test for the bug fix of STORM-3039. We try to rebind to port > 6700 a few times and expect it to be usable quickly. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-3039) Ports of killed topologies remain in TIME_WAIT state preventing to start new topology
[ https://issues.apache.org/jira/browse/STORM-3039?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3039: --- Fix Version/s: 1.2.2 > Ports of killed topologies remain in TIME_WAIT state preventing to start new > topology > - > > Key: STORM-3039 > URL: https://issues.apache.org/jira/browse/STORM-3039 > Project: Apache Storm > Issue Type: Improvement >Affects Versions: 1.1.2, 1.2.1 >Reporter: Gergely Hajós >Assignee: Gergely Hajós >Priority: Major > Labels: pull-request-available > Fix For: 2.0.0, 1.1.3, 1.2.2 > > Time Spent: 1h 50m > Remaining Estimate: 0h > > When topology is killed the slot ports (supervisor.slots.ports) remain in > TIME_WAIT state. In that case new topology can not be started, because > workers throw the following error: > {code:java} > 2018-04-20 08:37:08.742 o.a.s.d.worker main [ERROR] Error on initialization > of server mk-worker > org.apache.storm.shade.org.jboss.netty.channel.ChannelException: Failed to > bind to: 0.0.0.0/0.0.0.0:6700 > at > org.apache.storm.shade.org.jboss.netty.bootstrap.ServerBootstrap.bind(ServerBootstrap.java:272) > ~[storm-core-1.2.1.jar:1.2.1] > at org.apache.storm.messaging.netty.Server.(Server.java:101) > ~[storm-core-1.2.1.jar:1.2.1] > at org.apache.storm.messaging.netty.Context.bind(Context.java:67) > ~[storm-core-1.2.1.jar:1.2.1] > at > org.apache.storm.daemon.worker$worker_data$fn__10395.invoke(worker.clj:285) > ~[storm-core-1.2.1.jar:1.2.1] > at org.apache.storm.util$assoc_apply_self.invoke(util.clj:931) > ~[storm-core-1.2.1.jar:1.2.1] > at org.apache.storm.daemon.worker$worker_data.invoke(worker.clj:282) > ~[storm-core-1.2.1.jar:1.2.1] > at > org.apache.storm.daemon.worker$fn__10693$exec_fn__3301__auto__$reify__10695.run(worker.clj:626) > ~[storm-core-1.2.1.jar:1.2.1] > at java.security.AccessController.doPrivileged(Native Method) ~[?:1.8.0_161] > at javax.security.auth.Subject.doAs(Subject.java:422) ~[?:1.8.0_161] > at > 
org.apache.storm.daemon.worker$fn__10693$exec_fn__3301__auto10694.invoke(worker.clj:624) > ~[storm-core-1.2.1.jar:1.2.1] > at clojure.lang.AFn.applyToHelper(AFn.java:178) ~[clojure-1.7.0.jar:?] > at clojure.lang.AFn.applyTo(AFn.java:144) ~[clojure-1.7.0.jar:?] > at clojure.core$apply.invoke(core.clj:630) ~[clojure-1.7.0.jar:?] > at > org.apache.storm.daemon.worker$fn__10693$mk_worker__10784.doInvoke(worker.clj:598) > [storm-core-1.2.1.jar:1.2.1] > at clojure.lang.RestFn.invoke(RestFn.java:512) [clojure-1.7.0.jar:?] > at org.apache.storm.daemon.worker$_main.invoke(worker.clj:787) > [storm-core-1.2.1.jar:1.2.1] > at clojure.lang.AFn.applyToHelper(AFn.java:165) [clojure-1.7.0.jar:?] > at clojure.lang.AFn.applyTo(AFn.java:144) [clojure-1.7.0.jar:?] > at org.apache.storm.daemon.worker.main(Unknown Source) > [storm-core-1.2.1.jar:1.2.1] > Caused by: java.net.BindException: Address already in use > at sun.nio.ch.Net.bind0(Native Method) ~[?:1.8.0_161] > at sun.nio.ch.Net.bind(Net.java:433) ~[?:1.8.0_161] > at sun.nio.ch.Net.bind(Net.java:425) ~[?:1.8.0_161] > at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:223) > ~[?:1.8.0_161] > at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74) > ~[?:1.8.0_161] > at > org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioServerBoss$RegisterTask.run(NioServerBoss.java:193) > ~[storm-core-1.2.1.jar:1.2.1] > at > org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.processTaskQueue(AbstractNioSelector.java:391) > ~[storm-core-1.2.1.jar:1.2.1] > at > org.apache.storm.shade.org.jboss.netty.channel.socket.nio.AbstractNioSelector.run(AbstractNioSelector.java:315) > ~[storm-core-1.2.1.jar:1.2.1] > at > org.apache.storm.shade.org.jboss.netty.channel.socket.nio.NioServerBoss.run(NioServerBoss.java:42) > ~[storm-core-1.2.1.jar:1.2.1] > at > org.apache.storm.shade.org.jboss.netty.util.ThreadRenamingRunnable.run(ThreadRenamingRunnable.java:108) > ~[storm-core-1.2.1.jar:1.2.1] > 
at > org.apache.storm.shade.org.jboss.netty.util.internal.DeadLockProofWorker$1.run(DeadLockProofWorker.java:42) > ~[storm-core-1.2.1.jar:1.2.1] > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) > ~[?:1.8.0_161] > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) > ~[?:1.8.0_161] > at java.lang.Thread.run(Thread.java:748) ~[?:1.8.0_161] > {code} > > This exception occurs often when topologies stopped and started automatically. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
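A common remedy for the BindException above is to set SO_REUSEADDR before binding, which lets a fresh listener take a port whose previous socket is still in TIME_WAIT. A stdlib-only sketch of the quick-rebind check the companion ticket STORM-3056 asks for; this is illustrative, not Storm's actual Netty-based server:

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.ServerSocket;

// Sketch: verify a port can be rebound immediately after being closed.
class RebindSketch {

    /** Pick an ephemeral port the OS considers free right now. */
    static int freePort() {
        try (ServerSocket s = new ServerSocket(0)) {
            return s.getLocalPort();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    /** Bind, close, and immediately rebind the same port. */
    static boolean canRebindQuickly(int port) {
        try {
            ServerSocket first = new ServerSocket();
            first.setReuseAddress(true);              // must be set before bind()
            first.bind(new InetSocketAddress(port));
            first.close();

            ServerSocket second = new ServerSocket();
            second.setReuseAddress(true);
            second.bind(new InetSocketAddress(port)); // fails without SO_REUSEADDR if the address lingers
            second.close();
            return true;
        } catch (IOException e) {
            return false;                             // "Address already in use"
        }
    }
}
```

Note that setReuseAddress() has an effect only when called before bind(), which is why the socket is created unbound here.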
[jira] [Updated] (STORM-3035) JMS Spout ack method causes failure in some cases
[ https://issues.apache.org/jira/browse/STORM-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3035: --- Fix Version/s: (was: 2.0.0) > JMS Spout ack method causes failure in some cases > - > > Key: STORM-3035 > URL: https://issues.apache.org/jira/browse/STORM-3035 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > JMS Spout ack method assumes that the set "toCommit" is always non-empty but > if a fail is invoked (that clears the "toCommit") followed by an ack, it can > cause failure. > > {noformat} > 2018-03-09 08:43:03,220 GMT-0500 MCO-432882-L2 > [Thread-36-inboundSpout-executor[5 5]] 7.0.0 ERROR > logging$eval1$fn__7.invoke Async loop died! java.lang.RuntimeException: > java.util.NoSuchElementException at > org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485) > ~[storm-core-1.1.0.2.6.3.0- > 235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) > ~[storm-core- > 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.utils.DisruptorQueue.consumeBatch(DisruptorQueue.java:441) > ~[storm-core-1.1.0.2.6.3.0- > 235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.disruptor$consume_batch.invoke(disruptor.clj:69) > ~[storm-core- > 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.daemon.executor$fn__6856$fn__6871$fn__6902.invoke(executor.clj:627) > ~[storm-core-1.1.0.2.6.3.0- > 235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.util$async_loop$fn__555.invoke(util.clj:484) [storm-core- > 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at clojure.lang.AFn.run(AFn.java:22) > [clojure-1.7.0.jar:?] 
at > java.lang.Thread.run(Thread.java:745) [?:1.8.0_111] Caused by: > java.util.NoSuchElementException at > java.util.TreeMap.key(TreeMap.java:1327) ~[?:1.8.0_111] at > java.util.TreeMap.firstKey(TreeMap.java:290) ~ > [?:1.8.0_111] at java.util.TreeSet.first(TreeSet.java:394) ~[?:1.8.0_111] at > org.apache.storm.jms.spout.JmsSpout.ack(JmsSpout.java:251) ~[classes/:?] at > org.apache.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:446) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.daemon.executor$fn__6856$tuple_action_fn__6862.invoke(executor.clj:535) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.daemon.executor$mk_task_receiver$fn__6845.invoke(executor.clj:462) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.disruptor$clojure_handler$reify__6558.onEvent(disruptor.clj:40) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] ... 7 more > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
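The NoSuchElementException in the trace comes from calling first() on an empty TreeSet after fail() has cleared it; guarding that call avoids the crash. A simplified sketch of the guarded ack logic, where the field name and message-id type are stand-ins for the spout's real state:

```java
import java.util.TreeSet;

// Simplified model of the JMS spout's pending-commit bookkeeping.
class JmsAckSketch {
    private final TreeSet<Long> toCommit = new TreeSet<>();

    /** A fail clears the pending commits, as described in the ticket. */
    void fail(long msgId) {
        toCommit.clear();
    }

    void emitted(long msgId) {
        toCommit.add(msgId);
    }

    /** Returns the next id that would be committed, or null if nothing is pending. */
    Long ack(long msgId) {
        toCommit.remove(msgId);
        if (toCommit.isEmpty()) {
            return null; // guard: TreeSet.first() throws NoSuchElementException here
        }
        return toCommit.first();
    }
}
```

The fix is just the isEmpty() check before first(); everything else mirrors the fail-then-ack sequence that triggered the crash.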
[jira] [Updated] (STORM-3035) JMS Spout ack method causes failure in some cases
[ https://issues.apache.org/jira/browse/STORM-3035?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-3035: --- Fix Version/s: (was: 1.2.2) > JMS Spout ack method causes failure in some cases > - > > Key: STORM-3035 > URL: https://issues.apache.org/jira/browse/STORM-3035 > Project: Apache Storm > Issue Type: Bug >Reporter: Arun Mahadevan >Assignee: Arun Mahadevan >Priority: Major > Labels: pull-request-available > Time Spent: 1h > Remaining Estimate: 0h > > JMS Spout ack method assumes that the set "toCommit" is always non-empty but > if a fail is invoked (that clears the "toCommit") followed by an ack, it can > cause failure. > > {noformat} > 2018-03-09 08:43:03,220 GMT-0500 MCO-432882-L2 > [Thread-36-inboundSpout-executor[5 5]] 7.0.0 ERROR > logging$eval1$fn__7.invoke Async loop died! java.lang.RuntimeException: > java.util.NoSuchElementException at > org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:485) > ~[storm-core-1.1.0.2.6.3.0- > 235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.utils.DisruptorQueue.consumeBatchWhenAvailable(DisruptorQueue.java:451) > ~[storm-core- > 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.utils.DisruptorQueue.consumeBatch(DisruptorQueue.java:441) > ~[storm-core-1.1.0.2.6.3.0- > 235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.disruptor$consume_batch.invoke(disruptor.clj:69) > ~[storm-core- > 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.daemon.executor$fn__6856$fn__6871$fn__6902.invoke(executor.clj:627) > ~[storm-core-1.1.0.2.6.3.0- > 235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.util$async_loop$fn__555.invoke(util.clj:484) [storm-core- > 1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at clojure.lang.AFn.run(AFn.java:22) > [clojure-1.7.0.jar:?] 
at > java.lang.Thread.run(Thread.java:745) [?:1.8.0_111] Caused by: > java.util.NoSuchElementException at > java.util.TreeMap.key(TreeMap.java:1327) ~[?:1.8.0_111] at > java.util.TreeMap.firstKey(TreeMap.java:290) ~ > [?:1.8.0_111] at java.util.TreeSet.first(TreeSet.java:394) ~[?:1.8.0_111] at > org.apache.storm.jms.spout.JmsSpout.ack(JmsSpout.java:251) ~[classes/:?] at > org.apache.storm.daemon.executor$ack_spout_msg.invoke(executor.clj:446) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.daemon.executor$fn__6856$tuple_action_fn__6862.invoke(executor.clj:535) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.daemon.executor$mk_task_receiver$fn__6845.invoke(executor.clj:462) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.disruptor$clojure_handler$reify__6558.onEvent(disruptor.clj:40) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] at > org.apache.storm.utils.DisruptorQueue.consumeBatchToCursor(DisruptorQueue.java:472) > ~[storm-core-1.1.0.2.6.3.0-235.jar:1.1.0.2.6.3.0-235] ... 7 more > {noformat} -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (STORM-2955) Problems with download page
[ https://issues.apache.org/jira/browse/STORM-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reassigned STORM-2955: -- Assignee: P. Taylor Goetz > Problems with download page > --- > > Key: STORM-2955 > URL: https://issues.apache.org/jira/browse/STORM-2955 > Project: Apache Storm > Issue Type: Bug > Environment: http://storm.apache.org/downloads.html >Reporter: Sebb >Assignee: P. Taylor Goetz >Priority: Major > > Mostly the download page is OK, however there are some areas where it could > be improved. > 1) Use https (SSL) for links to KEYS, sigs and hashes > 2) Use www.apache.org rather than www.us.apache.org. The former will redirect > as necessary to the US or EU host. > 3) Old releases should use archive.apache.org. Please fix the links for 1.1.1 > and 1.0.5 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2955) Problems with download page
[ https://issues.apache.org/jira/browse/STORM-2955?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2955. Resolution: Fixed Update published (may take some time to appear). > Problems with download page > --- > > Key: STORM-2955 > URL: https://issues.apache.org/jira/browse/STORM-2955 > Project: Apache Storm > Issue Type: Bug > Environment: http://storm.apache.org/downloads.html >Reporter: Sebb >Assignee: P. Taylor Goetz >Priority: Major > > Mostly the download page is OK, however there are some areas where it could > be improved. > 1) Use https (SSL) for links to KEYS, sigs and hashes > 2) Use www.apache.org rather than www.us.apache.org. The former will redirect > as necessary to the US or EU host. > 3) Old releases should use archive.apache.org. Please fix the links for 1.1.1 > and 1.0.5 -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2956) Please delete old releases from mirroring system
[ https://issues.apache.org/jira/browse/STORM-2956?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2956. Resolution: Fixed Older releases have been archived. > Please delete old releases from mirroring system > > > Key: STORM-2956 > URL: https://issues.apache.org/jira/browse/STORM-2956 > Project: Apache Storm > Issue Type: Bug >Reporter: Sebb >Priority: Major > > To reduce the load on the ASF mirrors, projects are required to delete old > releases [1] > Please can you remove all non-current releases? > i.e 1.0.0-1.0.5, 1.1.0, 1.1.1 > It's unfair to expect the 3rd party mirrors to carry old releases. > Note that older releases can still be linked from the download page, but such > links should use the archive server. > Thanks! > [1] http://www.apache.org/dev/release.html#when-to-archive -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2951) Storm binaries packages oncrpc jar which is LGPL
[ https://issues.apache.org/jira/browse/STORM-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2951. Resolution: Fixed Fix Version/s: 1.2.1 Added exclusion for oncrpc to the storm-core pom. > Storm binaries packages oncrpc jar which is LGPL > - > > Key: STORM-2951 > URL: https://issues.apache.org/jira/browse/STORM-2951 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Arun Mahadevan >Assignee: P. Taylor Goetz >Priority: Major > Fix For: 1.2.1 > > > With the recent storm metrics changes storm packages oncrpc-1.0.7.jar which > is LGPL licence. > > [https://mvnrepository.com/artifact/org.acplt/oncrpc/1.0.7] > > I am not sure if its ok to package libraries with LGPL license in storm > distribution. > > Its coming from metrics-ganglia dependency in storm-core. > [~ptgoetz], can you provide inputs ? If this needs to be excluded, I can > craft a patch and push it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Assigned] (STORM-2951) Storm binaries packages oncrpc jar which is LGPL
[ https://issues.apache.org/jira/browse/STORM-2951?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reassigned STORM-2951: -- Assignee: P. Taylor Goetz > Storm binaries packages oncrpc jar which is LGPL > - > > Key: STORM-2951 > URL: https://issues.apache.org/jira/browse/STORM-2951 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 1.2.0 >Reporter: Arun Mahadevan >Assignee: P. Taylor Goetz >Priority: Major > > With the recent storm metrics changes storm packages oncrpc-1.0.7.jar which > is LGPL licence. > > [https://mvnrepository.com/artifact/org.acplt/oncrpc/1.0.7] > > I am not sure if its ok to package libraries with LGPL license in storm > distribution. > > Its coming from metrics-ganglia dependency in storm-core. > [~ptgoetz], can you provide inputs ? If this needs to be excluded, I can > craft a patch and push it. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2942) Remove javadoc and source jars from toollib directory in binary distribution
[ https://issues.apache.org/jira/browse/STORM-2942?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2942. Resolution: Fixed Fix Version/s: 1.1.2 1.2.0 > Remove javadoc and source jars from toollib directory in binary distribution > > > Key: STORM-2942 > URL: https://issues.apache.org/jira/browse/STORM-2942 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 1.1.0, 1.1.1 >Reporter: P. Taylor Goetz >Assignee: P. Taylor Goetz >Priority: Major > Fix For: 1.2.0, 1.1.2 > > > Need to update the assembly to only include the classes jar. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2943) Binary distribution includes storm-kafka-monitor source/javadoc in toollib directory
[ https://issues.apache.org/jira/browse/STORM-2943?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2943. Resolution: Duplicate Duplicate of STORM-2942 > Binary distribution includes storm-kafka-monitor source/javadoc in toollib > directory > > > Key: STORM-2943 > URL: https://issues.apache.org/jira/browse/STORM-2943 > Project: Apache Storm > Issue Type: Bug > Components: storm-kafka-monitor >Affects Versions: 1.1.0, 1.1.1, 1.2.0, 1.1.2 >Reporter: Alexandre Vermeerbergen >Assignee: P. Taylor Goetz >Priority: Blocker > > Quoting Alexandre's RC3 vote: > > {quote}I hate to be the one who always give bad news, but as a matter of > facts, Storm 1.2.0 RC3 installation from binary artifacts (both > apache-storm-1.2.0-src.tar.gz and apache-storm-1.2.0.zip) leads to "by > default KO Kafka monitor" in Nimbus UI (which dirty exceptions in > ui.log) > Here's for example what I get from apache-storm-1.2.0-src.tar.gz > downloaded from > [https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.2.0-rc3/apache-storm-1.2.0-src.tar.gz]: > $ tar ztvf apache-storm-1.2.0.tar.gz apache-storm-1.2.0/toollib > -rwxrwxrwx ptgoetz/staff 16999 2018-02-06 21:22 > apache-storm-1.2.0/toollib/storm-kafka-monitor-1.2.0-sources.jar > -rwxrwxrwx ptgoetz/staff 93461 2018-02-06 21:22 > apache-storm-1.2.0/toollib/storm-kafka-monitor-1.2.0-javadoc.jar > -rwxrwxrwx ptgoetz/staff 21591320 2018-02-06 21:22 > apache-storm-1.2.0/toollib/storm-kafka-monitor-1.2.0.jar > And here's what I see in ui.log: > org.apache.storm.kafka.spout.KafkaSpout > 2018-02-07 16:49:57.153 o.a.s.u.TopologySpoutLag qtp1997623038-18 > [WARN] Exception message:Error: Could not find or load main class > .usr.local.Storm.storm-stable.toollib.storm-kafka-monitor-1.2.0-javadoc.jar > org.apache.storm.utils.ShellUtils$ExitCodeException: Error: Could not > find or load main class > .usr.local.Storm.storm-stable.toollib.storm-kafka-monitor-1.2.0-javadoc.jar > at 
org.apache.storm.utils.ShellUtils.runCommand(ShellUtils.java:231) > ~[storm-core-1.2.0.jar:1.2.0] > at org.apache.storm.utils.ShellUtils.run(ShellUtils.java:161) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.utils.ShellUtils$ShellCommandExecutor.execute(ShellUtils.java:371) > ~[storm-core-1.2.0.jar:1.2.0] > at org.apache.storm.utils.ShellUtils.execCommand(ShellUtils.java:461) > ~[storm-core-1.2.0.jar:1.2.0] > at org.apache.storm.utils.ShellUtils.execCommand(ShellUtils.java:444) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.utils.TopologySpoutLag.getLagResultForKafka(TopologySpoutLag.java:163) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.utils.TopologySpoutLag.getLagResultForNewKafkaSpout(TopologySpoutLag.java:189) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.utils.TopologySpoutLag.lag(TopologySpoutLag.java:57) > ~[storm-core-1.2.0.jar:1.2.0] > at org.apache.storm.ui.core$topology_lag.invoke(core.clj:805) > ~[storm-core-1.2.0.jar:1.2.0] > at org.apache.storm.ui.core$fn__9586.invoke(core.clj:1165) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.compojure.core$make_route$fn__5979.invoke(core.clj:100) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.compojure.core$if_route$fn__5967.invoke(core.clj:46) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.compojure.core$if_method$fn__5960.invoke(core.clj:31) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.compojure.core$routing$fn__5985.invoke(core.clj:113) > ~[storm-core-1.2.0.jar:1.2.0] > at clojure.core$some.invoke(core.clj:2570) ~[clojure-1.7.0.jar:?] > at > org.apache.storm.shade.compojure.core$routing.doInvoke(core.clj:113) > ~[storm-core-1.2.0.jar:1.2.0] > at clojure.lang.RestFn.applyTo(RestFn.java:139) > ~[clojure-1.7.0.jar:?] > at clojure.core$apply.invoke(core.clj:632) ~[clojure-1.7.0.jar:?] 
> at > org.apache.storm.shade.compojure.core$routes$fn__5989.invoke(core.clj:118) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.ring.middleware.cors$wrap_cors$fn__8894.invoke(cors.clj:149) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.ring.middleware.json$wrap_json_params$fn__8841.invoke(json.clj:56) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.ring.middleware.multipart_params$wrap_multipart_params$fn__6621.invoke(multipart_params.clj:118) > ~[storm-core-1.2.0.jar:1.2.0] > at > org.apache.storm.shade.ring.middleware.reload$wrap_reload$fn__7904.invoke(reload.clj:22) > ~[storm-core-1.2.0.jar:1.2.0] >
[jira] [Created] (STORM-2942) Remove javadoc and source jars from toollib directory in binary distribution
P. Taylor Goetz created STORM-2942: -- Summary: Remove javadoc and source jars from toollib directory in binary distribution Key: STORM-2942 URL: https://issues.apache.org/jira/browse/STORM-2942 Project: Apache Storm Issue Type: Bug Affects Versions: 1.1.1, 1.1.0 Reporter: P. Taylor Goetz Assignee: P. Taylor Goetz Need to update the assembly to only include the classes jar. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Updated] (STORM-2709) Release Apache Storm 1.1.2
[ https://issues.apache.org/jira/browse/STORM-2709?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2709: --- Fix Version/s: (was: 1.1.2) > Release Apache Storm 1.1.2 > -- > > Key: STORM-2709 > URL: https://issues.apache.org/jira/browse/STORM-2709 > Project: Apache Storm > Issue Type: Epic >Reporter: Jungtaek Lim >Priority: Major > > This is to track remaining issues on releasing Storm 1.1.2. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2912) Tick tuple is being shared without resetting start time and incur side-effect to break metrics
[ https://issues.apache.org/jira/browse/STORM-2912?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2912. Resolution: Fixed Fix Version/s: 1.0.6 1.1.2 1.2.0 2.0.0 Merged to master, 1.x-branch, 1.1.x-branch, 1.0.x-branch. > Tick tuple is being shared without resetting start time and incur side-effect > to break metrics > -- > > Key: STORM-2912 > URL: https://issues.apache.org/jira/browse/STORM-2912 > Project: Apache Storm > Issue Type: Bug > Components: storm-client, storm-core >Affects Versions: 2.0.0, 1.2.0, 1.1.2, 1.0.6 >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim >Priority: Blocker > Labels: pull-request-available > Fix For: 2.0.0, 1.2.0, 1.1.2, 1.0.6 > > Time Spent: 3h 10m > Remaining Estimate: 0h > > In STORM-2786 we applied a small optimization: create the tick tuple only > once and reuse it. The optimization completely makes sense, but for the > built-in metrics, once the reused tick tuple is selected for sampling we never > reset its start time unless it is selected for sampling again; hence every further tick > tuple is considered sampled with the start time unchanged. > What I've observed is that the delta for the tick tuple gradually increases over > some time period, then resets to 0, which messes up the latencies. It also messes > up the executed count, because the tuple is always considered a selected tuple (hence > recorded at 20x for tick tuples), but unless the tick tuple interval is very > small, that effect is much smaller than the latency one. > > Here's part of a log showing this issue. Please take a look at the DELTA values, > which shouldn't be this large. 
> {code:java} > 2018-01-25 13:34:41.464 o.a.s.d.executor Thread-14-__acker-executor[1 1] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [30] > TASK: 1 DELTA: 0 > 2018-01-25 13:34:41.658 o.a.s.d.executor Thread-12-counter-executor[3 3] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [3] > TASK: 3 DELTA: 87083 > 2018-01-25 13:34:41.658 o.a.s.d.executor Thread-8-counter-executor[2 2] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [3] > TASK: 2 DELTA: 6003 > 2018-01-25 13:34:41.728 o.a.s.d.executor Thread-26-counter-executor[5 5] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [3] > TASK: 5 DELTA: 30036 > 2018-01-25 13:34:41.728 o.a.s.d.executor > Thread-4-intermediateRanker-executor[8 8] [INFO] Execute done TUPLE source: > __system:-1, stream: __tick, id: {}, [2] TASK: 8 DELTA: 4001 > 2018-01-25 13:34:41.729 o.a.s.d.executor Thread-32-counter-executor[4 4] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [3] > TASK: 4 DELTA: 156155 > 2018-01-25 13:34:41.813 o.a.s.d.executor Thread-16-finalRanker-executor[6 6] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [2] > TASK: 6 DELTA: 24043 > 2018-01-25 13:34:41.813 o.a.s.d.executor > Thread-10-intermediateRanker-executor[7 7] [INFO] Execute done TUPLE source: > __system:-1, stream: __tick, id: {}, [2] TASK: 7 DELTA: 14025 > 2018-01-25 13:34:41.813 o.a.s.d.executor > Thread-18-intermediateRanker-executor[9 9] [INFO] Execute done TUPLE source: > __system:-1, stream: __tick, id: {}, [2] TASK: 9 DELTA: 52091 > 2018-01-25 13:34:41.886 o.a.s.d.executor > Thread-28-intermediateRanker-executor[10 10] [INFO] Execute done TUPLE > source: __system:-1, stream: __tick, id: {}, [2] TASK: 10 DELTA: 18025 > 2018-01-25 13:34:43.731 o.a.s.d.executor > Thread-4-intermediateRanker-executor[8 8] [INFO] Execute done TUPLE source: > __system:-1, stream: __tick, id: {}, [2] TASK: 8 DELTA: 6004 > 
2018-01-25 13:34:43.817 o.a.s.d.executor Thread-16-finalRanker-executor[6 6] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [2] > TASK: 6 DELTA: 26047 > 2018-01-25 13:34:43.817 o.a.s.d.executor > Thread-10-intermediateRanker-executor[7 7] [INFO] Execute done TUPLE source: > __system:-1, stream: __tick, id: {}, [2] TASK: 7 DELTA: 16029 > 2018-01-25 13:34:43.817 o.a.s.d.executor > Thread-18-intermediateRanker-executor[9 9] [INFO] Execute done TUPLE source: > __system:-1, stream: __tick, id: {}, [2] TASK: 9 DELTA: 1 > 2018-01-25 13:34:43.890 o.a.s.d.executor > Thread-28-intermediateRanker-executor[10 10] [INFO] Execute done TUPLE > source: __system:-1, stream: __tick, id: {}, [2] TASK: 10 DELTA: 20029 > 2018-01-25 13:34:44.661 o.a.s.d.executor Thread-12-counter-executor[3 3] > [INFO] Execute done TUPLE source: __system:-1, stream: __tick, id: {}, [3] > TASK: 3 DELTA: 90086 > 2018-01-25 13:34:44.662 o.a.s.d.executor Thread-8-counter-executor[2 2] > [INFO]
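The fix direction implied above is to reset the sampled start time every time the shared tick tuple is dispatched, instead of letting a stale timestamp persist across reuses. A rough model of that behavior; the names here are illustrative, and Storm's real tuple and sampling code differ:

```java
// Illustrative model of the bug: a reused tuple whose sample start time
// must be reset on every dispatch, or recorded deltas grow unbounded.
class TickTupleSketch {
    private long sampleStartTimeMs = -1; // -1 means "not sampled this round"

    /** Called each time the shared tick tuple is handed to an executor. */
    void onDispatch(boolean selectedForSampling, long nowMs) {
        // Always reset: either stamp a fresh start time or clear the old one,
        // so a previous round's timestamp can never leak into this round.
        sampleStartTimeMs = selectedForSampling ? nowMs : -1;
    }

    /** Delta recorded when the tuple finishes executing; 0 if unsampled. */
    long executeDelta(long nowMs) {
        return sampleStartTimeMs >= 0 ? nowMs - sampleStartTimeMs : 0;
    }
}
```

With the reset in place, an unsampled dispatch contributes a delta of 0 rather than the ever-growing values seen in the DELTA log lines above.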
[jira] [Resolved] (STORM-2160) Expose interface to MetricRegistry via TopologyContext
[ https://issues.apache.org/jira/browse/STORM-2160?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2160. Resolution: Fixed Fix Version/s: 1.2.0 > Expose interface to MetricRegistry via TopologyContext > -- > > Key: STORM-2160 > URL: https://issues.apache.org/jira/browse/STORM-2160 > Project: Apache Storm > Issue Type: Improvement >Reporter: Alessandro Bellina >Assignee: P. Taylor Goetz >Priority: Major > Fix For: 1.2.0 > > > Since each worker has an instance of MetricRegistry, expose via > TopologyContext to .prepare and .open for end-users to register their own > codahale Meters. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
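The idea in STORM-2160, a per-worker registry that components reach through TopologyContext in .prepare/.open, can be sketched with a toy registry. All names below are illustrative stand-ins, not Storm's or codahale's actual API:

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Toy stand-in for a per-worker, codahale-style MetricRegistry that a
// TopologyContext could expose to bolts and spouts.
public class MetricRegistrySketch {
    private final Map<String, LongAdder> counters = new ConcurrentHashMap<>();

    // Register (or look up) a named counter, as a bolt might do in prepare().
    public LongAdder counter(String name) {
        return counters.computeIfAbsent(name, k -> new LongAdder());
    }

    public long count(String name) {
        LongAdder c = counters.get(name);
        return c == null ? 0 : c.sum();
    }

    public static void main(String[] args) {
        MetricRegistrySketch registry = new MetricRegistrySketch();
        // In prepare(): the bolt obtains its counter from the shared registry.
        LongAdder acked = registry.counter("myBolt.acked");
        // In execute(): the bolt updates it per tuple.
        for (int i = 0; i < 3; i++) acked.increment();
        System.out.println(registry.count("myBolt.acked")); // 3
    }
}
```

Because the registry is shared per worker, two executors asking for the same name get the same counter, which is what makes later aggregation by a reporter straightforward.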
[jira] [Resolved] (STORM-2164) Create simple generic plugin system to register codahale reporters
[ https://issues.apache.org/jira/browse/STORM-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2164. Resolution: Fixed Fix Version/s: 1.2.0 > Create simple generic plugin system to register codahale reporters > -- > > Key: STORM-2164 > URL: https://issues.apache.org/jira/browse/STORM-2164 > Project: Apache Storm > Issue Type: Improvement >Reporter: Alessandro Bellina >Assignee: P. Taylor Goetz >Priority: Major > Fix For: 1.2.0 > > > Configurable plugin interface s.t. daemons can instantiate codahale reporters. -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Resolved] (STORM-2161) Stop shading the codahale metrics library so that it is available to users
[ https://issues.apache.org/jira/browse/STORM-2161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2161. Resolution: Fixed Fix Version/s: 1.2.0 > Stop shading the codahale metrics library so that it is available to users > -- > > Key: STORM-2161 > URL: https://issues.apache.org/jira/browse/STORM-2161 > Project: Apache Storm > Issue Type: Sub-task >Reporter: Alessandro Bellina >Priority: Major > Fix For: 1.2.0 > > -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Reopened] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reopened STORM-2153: > New Metrics Reporting API > - > > Key: STORM-2153 > URL: https://issues.apache.org/jira/browse/STORM-2153 > Project: Apache Storm > Issue Type: Improvement >Reporter: P. Taylor Goetz >Assignee: P. Taylor Goetz >Priority: Major > Labels: pull-request-available > Fix For: 1.2.0 > > Time Spent: 34.5h > Remaining Estimate: 0h > > This is a proposal to provide a new metrics reporting API based on [Coda > Hale's metrics library | http://metrics.dropwizard.io/3.1.0/] (AKA > Dropwizard/Yammer metrics). > h2. Background > In a [discussion on the dev@ mailing list | > http://mail-archives.apache.org/mod_mbox/storm-dev/201610.mbox/%3ccagx0urh85nfh0pbph11pmc1oof6htycjcxsxgwp2nnofukq...@mail.gmail.com%3e] > a number of community and PMC members recommended replacing Storm’s metrics > system with a new API as opposed to enhancing the existing metrics system. > Some of the objections to the existing metrics API include: > # Metrics are reported as an untyped Java object, making it very difficult to > reason about how to report it (e.g. is it a gauge, a counter, etc.?) > # It is difficult to determine if metrics coming into the consumer are > pre-aggregated or not. > # Storm’s metrics collection occurs through a specialized bolt, which in > addition to potentially affecting system performance, complicates certain > types of aggregation when the parallelism of that bolt is greater than one. > In the discussion on the developer mailing list, there is growing consensus > for replacing Storm’s metrics API with a new API based on Coda Hale’s metrics > library. This approach has the following benefits: > # Coda Hale’s metrics library is very stable, performant, well thought out, > and widely adopted among open source projects (e.g. Kafka). 
> # The metrics library provides many existing metric types: Meters, Gauges, > Counters, Histograms, and more. > # The library has a pluggable “reporter” API for publishing metrics to > various systems, with existing implementations for: JMX, console, CSV, SLF4J, > Graphite, Ganglia. > # Reporters are straightforward to implement, and can be reused by any > project that uses the metrics library (i.e. would have broader application > outside of Storm). > As noted earlier, the metrics library supports pluggable reporters for > sending metrics data to other systems, and implementing a reporter is fairly > straightforward (an example reporter implementation can be found here). For > example, if someone develops a reporter based on Coda Hale’s metrics, it could > not only be used for pushing Storm metrics, but also for any system that used > the metrics library, such as Kafka. > h2. Scope of Effort > The effort to implement a new metrics API for Storm can be broken down into > the following development areas: > # Implement API for Storm's internal worker metrics: latencies, queue sizes, > capacity, etc. > # Implement API for user-defined, topology-specific metrics (exposed via the > {{org.apache.storm.task.TopologyContext}} class) > # Implement API for Storm daemons: nimbus, supervisor, etc. > h2. Relationship to Existing Metrics > This would be a new API that would not affect the existing metrics API. Upon > completion, the old metrics API would presumably be deprecated, but kept in > place for backward compatibility. > Internally, the current metrics API uses Storm bolts for the reporting > mechanism. The proposed metrics API would not depend on any of Storm's > messaging capabilities and instead use the [metrics library's built-in > reporter mechanism | > http://metrics.dropwizard.io/3.1.0/manual/core/#man-core-reporters]. 
This > would allow users to use existing {{Reporter}} implementations which are not > Storm-specific, and would simplify the process of collecting metrics. > Compared to Storm's {{IMetricCollector}} interface, implementing a reporter > for the metrics library is much more straightforward (an example can be found > [here | > https://github.com/dropwizard/metrics/blob/3.2-development/metrics-core/src/main/java/com/codahale/metrics/ConsoleReporter.java]). > The new metrics capability would not use or affect the ZooKeeper-based > metrics used by Storm UI. > h2. Relationship to JStorm Metrics > [TBD] > h2. Target Branches > [TBD] > h2. Performance Implications > [TBD] > h2. Metrics Namespaces > [TBD] > h2. Metrics Collected > *Worker* > || Namespace || Metric Type || Description || > *Nimbus* > || Namespace || Metric Type || Description || > *Supervisor* > || Namespace || Metric Type || Description || > h2. User-Defined Metrics > [TBD] -- This message was sent by Atlassian JIRA (v7.6.3#76005)
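The pluggable "reporter" mechanism the proposal leans on can be illustrated with a minimal sketch: a reporter is just a sink that is handed the current metric snapshot on a schedule. This is a toy model of the pattern, not the dropwizard API:

```java
import java.util.Map;
import java.util.TreeMap;
import java.util.function.Consumer;

// Minimal sketch of a pluggable reporter: anything that can consume a
// snapshot of named metric values. A console reporter is one plug-in;
// Graphite, JMX, CSV etc. would be others.
public class ReporterSketch {
    public interface Reporter {
        void report(Map<String, Long> snapshot);
    }

    // A console-style reporter, analogous in spirit to codahale's
    // ConsoleReporter: write each metric as "name=value" to the given sink.
    public static Reporter console(Consumer<String> out) {
        return snap -> snap.forEach((name, value) -> out.accept(name + "=" + value));
    }

    public static void main(String[] args) {
        Map<String, Long> snapshot = new TreeMap<>();
        snapshot.put("bolt.execute.latency", 7L);
        snapshot.put("worker.queue.size", 42L);
        console(System.out::println).report(snapshot);
    }
}
```

The point of the pattern is that the metrics producer never knows where data goes; swapping the destination means swapping the `Reporter`, with no change to the topology.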
[jira] [Resolved] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2153. Resolution: Fixed Fix Version/s: 1.2.0 > New Metrics Reporting API > - > > Key: STORM-2153 > URL: https://issues.apache.org/jira/browse/STORM-2153 > Project: Apache Storm > Issue Type: Improvement >Reporter: P. Taylor Goetz >Assignee: P. Taylor Goetz >Priority: Major > Labels: pull-request-available > Fix For: 1.2.0 > > Time Spent: 34.5h > Remaining Estimate: 0h -- This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (STORM-2854) Expose IEventLogger to make event logging pluggable
[ https://issues.apache.org/jira/browse/STORM-2854?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16290328#comment-16290328 ] P. Taylor Goetz commented on STORM-2854: [~kabhwan] I’m not sure I follow; can you elaborate on the use case? > Expose IEventLogger to make event logging pluggable > --- > > Key: STORM-2854 > URL: https://issues.apache.org/jira/browse/STORM-2854 > Project: Apache Storm > Issue Type: Improvement > Components: storm-client >Reporter: Jungtaek Lim >Assignee: Jungtaek Lim > > When the "Event Logger" feature was first designed, the implementation was made > pluggable (which is why IEventLogger exists), but at the time we had no actual > use case other than writing events to a file, so we simplified the implementation. > Now we have a use case that also writes events to a file, but with awareness of > the structure of each event, so that it can be easily parsed by a log feeder. In > this case we want a custom IEventLogger that represents events in our own format. > There's another issue as well: EventInfo has a `ts` field that stores an epoch > timestamp, but it is defined as String, not long. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
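A minimal sketch of what the requested pluggable, structure-aware event logger could look like, with `ts` as a long rather than a String. The interface and field names here are assumptions for illustration, not Storm's actual IEventLogger API:

```java
// Sketch of a pluggable, structure-aware event logger. Field and interface
// names are hypothetical; only the shape of the idea matches STORM-2854.
public class StructuredEventLoggerSketch {
    // EventInfo with ts as a long epoch timestamp (the issue notes the real
    // one wrongly declares it as String).
    public static final class EventInfo {
        public final long ts;
        public final String component;
        public final int task;

        public EventInfo(long ts, String component, int task) {
            this.ts = ts;
            this.component = component;
            this.task = task;
        }
    }

    // The pluggable part: any formatter can be swapped in.
    public interface EventLogger {
        String format(EventInfo e);
    }

    // A JSON-like structured format that a log feeder could parse field by field,
    // instead of a free-form line.
    public static final EventLogger JSON_LIKE = e ->
        String.format("{\"ts\":%d,\"component\":\"%s\",\"task\":%d}",
            e.ts, e.component, e.task);

    public static void main(String[] args) {
        EventInfo e = new EventInfo(1513000000000L, "counter", 3);
        System.out.println(JSON_LIKE.format(e));
    }
}
```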
[jira] [Updated] (STORM-2796) Flux: Provide means for invoking static factory methods and improve non-primitive number handling
[ https://issues.apache.org/jira/browse/STORM-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2796: --- Description: Provide a means to invoke static factory methods for flux components. E.g: Java signature: {code} public static MyComponent newInstance(String... params) {code} Yaml: {code} className: "org.apache.storm.flux.test.MyComponent" factory: "newInstance" factoryArgs: ["a", "b", "c"] {code} Also include a fix for non-primitive numbers, so constructs like the following work: Java constructor: {code} public TestBolt(Long l){} {code} Yaml: {code} - id: "bolt-4" className: "org.apache.storm.flux.test.TestBolt" constructorArgs: - 10 parallelism: 1 {code} (Before fix the above would fail because snakeyaml would convert `10` to an Integer.) was: Provide a means to invoke static factory methods for flux components. E.g: Java signature: {code} public static MyComponent newInstance(String... params) {code} Yaml: {code} className: "org.apache.storm.flux.test.MyComponent" factory: "newInstance" factoryArgs: ["a", "b", "c"] {code} > Flux: Provide means for invoking static factory methods and improve > non-primitive number handling > - > > Key: STORM-2796 > URL: https://issues.apache.org/jira/browse/STORM-2796 > Project: Apache Storm > Issue Type: Improvement > Components: Flux >Affects Versions: 2.0.0, 1.1.1, 1.2.0, 1.0.6 >Reporter: P. Taylor Goetz >Assignee: P. Taylor Goetz > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Provide a means to invoke static factory methods for flux components. E.g: > Java signature: > {code} > public static MyComponent newInstance(String... 
params) > {code} > Yaml: > {code} > className: "org.apache.storm.flux.test.MyComponent" > factory: "newInstance" > factoryArgs: ["a", "b", "c"] > {code} > Also include a fix for non-primitive numbers, so constructs like the > following work: > Java constructor: > {code} > public TestBolt(Long l){} > {code} > Yaml: > {code} > - id: "bolt-4" > className: "org.apache.storm.flux.test.TestBolt" > constructorArgs: > - 10 > parallelism: 1 > {code} > (Before fix the above would fail because snakeyaml would convert `10` to an > Integer.) -- This message was sent by Atlassian JIRA (v6.4.14#64029)
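The factory mechanism described above boils down to a reflective static-method call, plus a numeric widening step for the snakeyaml Integer case. A hypothetical sketch of both (method and class names are illustrative, not Flux's actual implementation):

```java
import java.lang.reflect.Method;

// Sketch of how a Flux-style builder might invoke a static factory method by
// name, and widen snakeyaml's Integer to Long when the target expects one.
public class FactoryInvokeSketch {

    // Invoke className.factory(args...) reflectively, matching the
    // varargs signature factory(String...).
    public static Object invokeFactory(String className, String factory, String... args) {
        try {
            Class<?> clazz = Class.forName(className);
            Method m = clazz.getMethod(factory, String[].class);
            return m.invoke(null, (Object) args); // null receiver: static method
        } catch (ReflectiveOperationException e) {
            throw new RuntimeException(e);
        }
    }

    // snakeyaml parses "10" as Integer; widen it when the constructor or
    // factory parameter wants a Long (the non-primitive-number fix).
    public static Object widenIfNeeded(Object val, Class<?> target) {
        if (val instanceof Integer && (target == Long.class || target == long.class)) {
            return ((Integer) val).longValue();
        }
        return val;
    }

    // Stand-in for org.apache.storm.flux.test.MyComponent.
    public static class MyComponent {
        public final String joined;
        MyComponent(String joined) { this.joined = joined; }
        public static MyComponent newInstance(String... params) {
            return new MyComponent(String.join(",", params));
        }
    }

    public static void main(String[] args) {
        MyComponent c = (MyComponent) invokeFactory(
            FactoryInvokeSketch.class.getName() + "$MyComponent",
            "newInstance", "a", "b", "c");
        System.out.println(c.joined); // a,b,c
    }
}
```

Note the `(Object) args` cast: without it, reflection would spread the String[] across separate parameters instead of passing it as the single varargs argument.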
[jira] [Updated] (STORM-2796) Flux: Provide means for invoking static factory methods and improve non-primitive number handling
[ https://issues.apache.org/jira/browse/STORM-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2796: --- Summary: Flux: Provide means for invoking static factory methods and improve non-primitive number handling (was: Flux: Provide means for invoking static factory methods) > Flux: Provide means for invoking static factory methods and improve > non-primitive number handling > - > > Key: STORM-2796 > URL: https://issues.apache.org/jira/browse/STORM-2796 > Project: Apache Storm > Issue Type: Improvement > Components: Flux >Affects Versions: 2.0.0, 1.1.1, 1.2.0, 1.0.6 >Reporter: P. Taylor Goetz >Assignee: P. Taylor Goetz > Labels: pull-request-available > Time Spent: 0.5h > Remaining Estimate: 0h > > Provide a means to invoke static factory methods for flux components. E.g: > Java signature: > {code} > public static MyComponent newInstance(String... params) > {code} > Yaml: > {code} > className: "org.apache.storm.flux.test.MyComponent" > factory: "newInstance" > factoryArgs: ["a", "b", "c"] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Updated] (STORM-2796) Flux: Provide means for invoking static factory methods
[ https://issues.apache.org/jira/browse/STORM-2796?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2796: --- Description: Provide a means to invoke static factory methods for flux components. E.g: Java signature: {code} public static MyComponent newInstance(String... params) {code} Yaml: {code} className: "org.apache.storm.flux.test.MyComponent" factory: "newInstance" factoryArgs: ["a", "b", "c"] {code} was: Provide a means to invoke static factory methods for flux components. E.g: Java signature: {code} public static MyComponent newInstance(String... params) {code} Yaml: {code} className: "org.apache.storm.flux.test.MyComponent" factory: "factory" factoryArgs: ["a", "b", "c"] {code} > Flux: Provide means for invoking static factory methods > --- > > Key: STORM-2796 > URL: https://issues.apache.org/jira/browse/STORM-2796 > Project: Apache Storm > Issue Type: Improvement > Components: Flux >Affects Versions: 2.0.0, 1.1.1, 1.2.0, 1.0.6 >Reporter: P. Taylor Goetz >Assignee: P. Taylor Goetz >Priority: Normal > > Provide a means to invoke static factory methods for flux components. E.g: > Java signature: > {code} > public static MyComponent newInstance(String... params) > {code} > Yaml: > {code} > className: "org.apache.storm.flux.test.MyComponent" > factory: "newInstance" > factoryArgs: ["a", "b", "c"] > {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Created] (STORM-2796) Flux: Provide means for invoking static factory methods
P. Taylor Goetz created STORM-2796: -- Summary: Flux: Provide means for invoking static factory methods Key: STORM-2796 URL: https://issues.apache.org/jira/browse/STORM-2796 Project: Apache Storm Issue Type: Improvement Components: Flux Affects Versions: 2.0.0, 1.1.1, 1.2.0, 1.0.6 Reporter: P. Taylor Goetz Assignee: P. Taylor Goetz Priority: Normal Provide a means to invoke static factory methods for flux components. E.g: Java signature: {code} public static MyComponent newInstance(String... params) {code} Yaml: {code} className: "org.apache.storm.flux.test.MyComponent" factory: "factory" factoryArgs: ["a", "b", "c"] {code} -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (STORM-2316) Enumeration support for properties configuration
[ https://issues.apache.org/jira/browse/STORM-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reassigned STORM-2316: -- Assignee: P. Taylor Goetz > Enumeration support for properties configuration > > > Key: STORM-2316 > URL: https://issues.apache.org/jira/browse/STORM-2316 > Project: Apache Storm > Issue Type: Improvement > Components: Flux >Affects Versions: 1.0.2 >Reporter: Bogdan Rudka >Assignee: P. Taylor Goetz > > It would be great if a Flux builder will resolve enumeration within > properties configuration. This feature is only available for constructor > arguments. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
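Resolving a string property value to an enum constant, as requested here, is a small reflective check. A sketch under the assumption that the builder knows the target property type (names are illustrative, not Flux's code):

```java
// Sketch of enum resolution for a Flux-style builder: when the target
// property type is an enum, map the YAML string onto the enum constant.
public class EnumResolveSketch {
    public enum Units { SECONDS, MINUTES }

    @SuppressWarnings({"unchecked", "rawtypes"})
    public static Object resolve(Class<?> targetType, String value) {
        if (targetType.isEnum()) {
            // Enum.valueOf throws IllegalArgumentException on an unknown name,
            // which surfaces YAML typos early.
            return Enum.valueOf((Class) targetType, value);
        }
        return value; // non-enum targets: pass the string through unchanged
    }

    public static void main(String[] args) {
        System.out.println(resolve(Units.class, "SECONDS")); // SECONDS
        System.out.println(resolve(String.class, "plain"));  // plain
    }
}
```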
[jira] [Resolved] (STORM-1390) Optimization of flux
[ https://issues.apache.org/jira/browse/STORM-1390?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-1390. Resolution: Invalid > Optimization of flux > > > Key: STORM-1390 > URL: https://issues.apache.org/jira/browse/STORM-1390 > Project: Apache Storm > Issue Type: Improvement > Components: Flux >Reporter: lispking > > If you find the right constructor, you do not have to look any further. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Commented] (STORM-2628) Minor issues with the site
[ https://issues.apache.org/jira/browse/STORM-2628?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16149513#comment-16149513 ] P. Taylor Goetz commented on STORM-2628: +1 > Minor issues with the site > -- > > Key: STORM-2628 > URL: https://issues.apache.org/jira/browse/STORM-2628 > Project: Apache Storm > Issue Type: Bug > Components: asf-site >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Trivial > Attachments: STORM-2628.patch > > > Found a few minor issues with the site: > * https://svn.apache.org/repos/asf/storm/site/_config.yml excludes > READEME.md, when it should exclude README.md. I don't think this page should > be there http://storm.apache.org/README.md > * about.md seems unused. I can't find a link to it anywhere, but the page is > available via direct link at http://storm.apache.org/about.html. This is a > different page from the one linked by the "Project Information" link. If > about.md isn't used I think we should remove it. > * The about.html layout page is missing the integrates item present in > about.md. http://storm.apache.org/about/integrates.html should probably be > present as an item in the menu on the left on the "Project Information" page. -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Resolved] (STORM-1114) Racing condition in trident zookeeper zk-node create/delete
[ https://issues.apache.org/jira/browse/STORM-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-1114. Resolution: Fixed Fix Version/s: 1.1.1 1.0.4 0.10.3 2.0.0 Merged to master, 1.x-branch, 1.0.x-branch, and 0.10.x-branch. > Racing condition in trident zookeeper zk-node create/delete > --- > > Key: STORM-1114 > URL: https://issues.apache.org/jira/browse/STORM-1114 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Zhuo Liu >Assignee: P. Taylor Goetz > Fix For: 2.0.0, 0.10.3, 1.0.4, 1.1.1 > > Time Spent: 50m > Remaining Estimate: 0h > > In production, some Trident topologies hit a bug where workers try to create a > zk-node that already exists, or delete a zk-node that has already been deleted. > This causes the worker process to die. > > We analyzed the problem and found a race condition in Trident > TransactionalState's zk-node create and delete code. > Failure stack trace in worker.log: > {noformat} > Caused by: > org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException: > KeeperErrorCode = NodeExists for /ignoreStoredMetadata > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:119) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) 
> ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:239) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:193) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > storm.trident.topology.state.TransactionalState.forPath(TransactionalState.java:83) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > storm.trident.topology.state.TransactionalState.createNode(TransactionalState.java:100) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > storm.trident.topology.state.TransactionalState.setData(TransactionalState.java:115) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > ... 
9 more > 2015-10-14 18:10:43.786 b.s.util [ERROR] Halting process: ("Worker died") > {noformat} > {noformat} > Caused by: > org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: > KeeperErrorCode = NoNode for /rainbowHdfsPath > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:239) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:234) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230) >
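The race reported here is the classic check-then-act problem; the usual remedy, and the spirit of the fix, is to make create and delete idempotent rather than dying on NodeExists/NoNode. A sketch with a ConcurrentMap standing in for the ZooKeeper namespace (this is illustrative, not Storm's actual patch):

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

// Illustrates the fix pattern for a STORM-1114-style race: tolerate a node
// that already exists (another worker won the race to create it) and a node
// that is already gone (another worker deleted it first).
public class IdempotentZkSketch {
    private final ConcurrentMap<String, byte[]> nodes = new ConcurrentHashMap<>();

    // Naive create: fails when another worker created the node first,
    // mirroring KeeperException.NodeExistsException killing the worker.
    public void createOrThrow(String path, byte[] data) {
        if (nodes.putIfAbsent(path, data) != null) {
            throw new IllegalStateException("NodeExists: " + path);
        }
    }

    // Idempotent create: a concurrent creation is simply tolerated.
    public void createIfAbsent(String path, byte[] data) {
        nodes.putIfAbsent(path, data);
    }

    // Idempotent delete: a missing node (already deleted elsewhere) is fine.
    public void deleteIfPresent(String path) {
        nodes.remove(path); // no NoNode failure if absent
    }

    public int size() {
        return nodes.size();
    }

    public static void main(String[] args) {
        IdempotentZkSketch zk = new IdempotentZkSketch();
        zk.createIfAbsent("/ignoreStoredMetadata", new byte[0]);
        zk.createIfAbsent("/ignoreStoredMetadata", new byte[0]); // no exception
        zk.deleteIfPresent("/rainbowHdfsPath"); // absent: still no exception
        System.out.println(zk.size()); // 1
    }
}
```

Against real ZooKeeper the same effect is achieved by catching `NodeExistsException` on create and `NoNodeException` on delete and treating them as success.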
[jira] [Updated] (STORM-1114) Racing condition in trident zookeeper zk-node create/delete
[ https://issues.apache.org/jira/browse/STORM-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1114: --- Priority: Major (was: Minor) Issue Type: Bug (was: Documentation) > Racing condition in trident zookeeper zk-node create/delete > --- > > Key: STORM-1114 > URL: https://issues.apache.org/jira/browse/STORM-1114 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Zhuo Liu >Assignee: P. Taylor Goetz > Time Spent: 10m > Remaining Estimate: 0h -- This message was sent by Atlassian JIRA (v6.4.14#64029)
[jira] [Assigned] (STORM-1114) Racing condition in trident zookeeper zk-node create/delete
[ https://issues.apache.org/jira/browse/STORM-1114?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reassigned STORM-1114: -- Assignee: P. Taylor Goetz > Racing condition in trident zookeeper zk-node create/delete > --- > > Key: STORM-1114 > URL: https://issues.apache.org/jira/browse/STORM-1114 > Project: Apache Storm > Issue Type: Documentation > Components: storm-core >Reporter: Zhuo Liu >Assignee: P. Taylor Goetz >Priority: Minor > Time Spent: 10m > Remaining Estimate: 0h > > In production for some trident topology, we met the bug that some workers are > trying to create a zk-node that is already existent or delete a zk node that > has already been deleted. This causes the worker process to die. > > We dissect the problem and figure out that there exists racing condition in > trident TransactionalState's zk-node create and delete codes. > failure stack trace in worker.log: > {noformat} > Caused by: > org.apache.storm.shade.org.apache.zookeeper.KeeperException$NodeExistsException: > KeeperErrorCode = NodeExists for /ignoreStoredMetadata > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:119) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:676) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$11.call(CreateBuilderImpl.java:660) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > 
org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.pathInForeground(CreateBuilderImpl.java:656) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.protectedPathInForeground(CreateBuilderImpl.java:441) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl.forPath(CreateBuilderImpl.java:431) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:239) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.CreateBuilderImpl$3.forPath(CreateBuilderImpl.java:193) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > storm.trident.topology.state.TransactionalState.forPath(TransactionalState.java:83) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > storm.trident.topology.state.TransactionalState.createNode(TransactionalState.java:100) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > storm.trident.topology.state.TransactionalState.setData(TransactionalState.java:115) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > ... 
9 more > 2015-10-14 18:10:43.786 b.s.util [ERROR] Halting process: ("Worker died") > {noformat} > {noformat} > Caused by: > org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: > KeeperErrorCode = NoNode for /rainbowHdfsPath > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.delete(ZooKeeper.java:873) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:239) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl$5.call(DeleteBuilderImpl.java:234) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:107) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl.pathInForeground(DeleteBuilderImpl.java:230) > ~[storm-core-0.10.1.y.jar:0.10.1.y] > at > org.apache.storm.shade.org.apache.curator.framework.imps.DeleteBuilderImpl.forPath(DeleteBuilderImpl.java:215) >
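The two stack traces above are the classic NodeExists/NoNode race pair. A minimal sketch of the fix direction — treat both outcomes as success so the worker survives when another worker wins the race. The class name and the in-memory set standing in for ZooKeeper are invented for illustration; the real fix would catch `KeeperException.NodeExistsException` / `NoNodeException` around the Curator calls in `TransactionalState`:

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Illustrative in-memory stand-in for a ZooKeeper namespace. The real fix
// wraps the Curator create/delete calls and swallows NodeExistsException /
// NoNodeException instead of letting the worker process die.
class TolerantZkOps {
    static final Set<String> nodes = ConcurrentHashMap.newKeySet();

    // Idempotent create: "already exists" is success, not a fatal error.
    static boolean createIfAbsent(String path) {
        return nodes.add(path); // false plays the role of NodeExists
    }

    // Idempotent delete: "no node" is success, not a fatal error.
    static boolean deleteIfPresent(String path) {
        return nodes.remove(path); // false plays the role of NoNode
    }
}
```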
[jira] [Created] (STORM-2416) Release Packaging Improvements
P. Taylor Goetz created STORM-2416: -- Summary: Release Packaging Improvements Key: STORM-2416 URL: https://issues.apache.org/jira/browse/STORM-2416 Project: Apache Storm Issue Type: Improvement Components: build Reporter: P. Taylor Goetz Assignee: P. Taylor Goetz This issue is to address distribution packaging improvements discussed on the dev@ list: 1. Move remaining examples to "examples" directory. 2. Package examples as source-only, to be compiled by users 3. Remove connector jars from binary distribution (since they are available in Maven, and we want to discourage users from hand-crafting topology jars) -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2374) Storm Kafka Client Func Interface Must be Serializable
[ https://issues.apache.org/jira/browse/STORM-2374?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2374. Resolution: Fixed Fix Version/s: 1.1.0 2.0.0 Merged to master and 1.x. > Storm Kafka Client Func Interface Must be Serializable > -- > > Key: STORM-2374 > URL: https://issues.apache.org/jira/browse/STORM-2374 > Project: Apache Storm > Issue Type: Bug > Components: storm-kafka-client >Affects Versions: 2.0.0, 1.1.0 >Reporter: Hugo Louro >Assignee: Hugo Louro >Priority: Blocker > Fix For: 2.0.0, 1.1.0 > > Time Spent: 3h > Remaining Estimate: 0h > -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2372) Pacemaker client doesn't clean up heartbeats properly.
[ https://issues.apache.org/jira/browse/STORM-2372?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2372. Resolution: Fixed Fix Version/s: 1.0.4 1.1.0 2.0.0 Merged to master, 1.x, 1.0.x. > Pacemaker client doesn't clean up heartbeats properly. > -- > > Key: STORM-2372 > URL: https://issues.apache.org/jira/browse/STORM-2372 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Kyle Nusbaum >Assignee: Kyle Nusbaum > Fix For: 2.0.0, 1.1.0, 1.0.4 > > Time Spent: 2h 10m > Remaining Estimate: 0h > > Paths are not deleted correctly. Pacemaker's delete-path operates by matching > a prefix against all the keys in the map. > The issue here is that the prefix is given a '/' on the end, but keys don't > have a trailing '/' if there is no 'subkey'. > i.e. delete path /foo/bar/baz/ doesn't match the key /foo/bar/baz > The path has to have the trailing '/' so that delete path /foo/bar/baz > doesn't also delete /foo/bar/bazoo > The solution here is to tack on a '/' to every key when checking against the > prefix. > We also want to send the delete command to *every* pacemaker server rather > than just the normal write client. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
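The trailing-'/' rule described in STORM-2372 fits in a few lines. A sketch under the assumption that keys and prefixes are plain '/'-separated strings (class and method names invented):

```java
// Normalize both sides to end in exactly one '/' so that prefix
// /foo/bar/baz matches key /foo/bar/baz (no trailing subkey) but
// does NOT match key /foo/bar/bazoo.
class PacemakerPaths {
    static String withTrailingSlash(String s) {
        return s.endsWith("/") ? s : s + "/";
    }

    static boolean matchesPrefix(String key, String prefix) {
        return withTrailingSlash(key).startsWith(withTrailingSlash(prefix));
    }
}
```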
[jira] [Reopened] (STORM-2334) Bolt for Joining streams
[ https://issues.apache.org/jira/browse/STORM-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reopened STORM-2334: > Bolt for Joining streams > > > Key: STORM-2334 > URL: https://issues.apache.org/jira/browse/STORM-2334 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: 2.0.0, 1.1.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h > > Create a general purpose windowed bolt that performs Joins on multiple data > streams. > Since, depending on the topo config, the bolt could be receiving data either > on 'default' streams or on named streams, the join bolt should be able to > differentiate the incoming data based on names of upstream components as well > as stream names. > *Example:* > The following SQL style join involving 4 tables : > {code} > select userId, key4, key2, key3 > from stream1 > join stream2 on stream2.userId = stream1.key1 > join stream3 on stream3.key3 = stream2.userId > left join stream4 on stream4.key4 = stream3.key3 > {code} > Could be expressed using the Join Bolt over 4 named streams as : > {code} > new JoinBolt(STREAM, "stream1", "key1") //'STREAM' arg indicates that > stream1/2/3/4 are names of streams. 'key1' is the key on which > .join ("stream2", "userId", "stream1") //join stream2 on > stream2.userId=stream1.key1 > .join ("stream3", "key3","stream2") //join stream3 on > stream3.key3=stream2.userId > .leftjoin ("stream4", "key4","stream3") //left join stream4 on > stream4.key4=stream3.key3 > .select("userId, key4, key2, key3") // choose output fields > .withWindowLength(..) 
> .withSlidingInterval(..); > {code} > Or based on named source components : > {code} > new JoinBolt(SOURCE, "kafkaSpout1", "key1") //'SOURCE' arg indicates that > kafkaSpout1, hdfsSpout3 etc are names of upstream components > .join ("kafkaSpout2", "userId","kafkaSpout1" ) > .join ("hdfsSpout3", "key3", "kafkaSpout2") > .leftjoin ("mqttSpout1", "key4", "hdfsSpout3") > .select ("userId, key4, key2, key3") > .withWindowLength(..) > .withSlidingInterval(..); > {code} > In order for the tuples to be joined correctly, 'fields grouping' should be > employed on the incoming streams. Each stream should be grouped on the same > key on which it will be joined against the other streams. This is a > restriction compared to SQL, which allows joining a table with others on any key > and any number of keys. > *For example:* If 'Stream1' is Fields Grouped on 'key1', we cannot use a > different 'key2' on 'Stream1' to join it with other streams. However, > 'Stream1' can be joined using the same key with multiple other streams as > shown in this SQL. > {code} > select > from stream1 > join stream2 on stream2.userId = stream1.key1 > join stream3 on stream3.key3 = stream1.key2 // not supportable in Join > Bolt > {code} > Consequently the join bolt's syntax is a bit simplified compared to SQL. The > key name for any given stream only appears once, as soon as the stream is > introduced for the first time in the join. Thereafter that key is implicitly > used for joining. See the case of 'stream3' being joined with both 'stream2' > and 'stream4' in the first example. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (STORM-2334) Bolt for Joining streams
[ https://issues.apache.org/jira/browse/STORM-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2334: --- Fix Version/s: (was: 1.1.1) 1.1.0 > Bolt for Joining streams > > > Key: STORM-2334 > URL: https://issues.apache.org/jira/browse/STORM-2334 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: 2.0.0, 1.1.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2334) Bolt for Joining streams
[ https://issues.apache.org/jira/browse/STORM-2334?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2334. Resolution: Fixed > Bolt for Joining streams > > > Key: STORM-2334 > URL: https://issues.apache.org/jira/browse/STORM-2334 > Project: Apache Storm > Issue Type: Bug >Affects Versions: 2.0.0, 1.x >Reporter: Roshan Naik >Assignee: Roshan Naik > Fix For: 2.0.0, 1.1.0 > > Time Spent: 6h 10m > Remaining Estimate: 0h -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Resolved] (STORM-2250) Kafka Spout Refactoring to Increase Modularity and Testability
[ https://issues.apache.org/jira/browse/STORM-2250?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2250. Resolution: Fixed Fix Version/s: 1.1.0 2.0.0 Patches merged to master and 1.x-branch. > Kafka Spout Refactoring to Increase Modularity and Testability > -- > > Key: STORM-2250 > URL: https://issues.apache.org/jira/browse/STORM-2250 > Project: Apache Storm > Issue Type: Improvement > Components: storm-kafka-client >Affects Versions: 2.0.0, 1.0.2 >Reporter: Stig Rohde Døssing >Assignee: Stig Rohde Døssing >Priority: Minor > Fix For: 2.0.0, 1.1.0 > > Time Spent: 7.5h > Remaining Estimate: 0h > > Per the discussion here https://github.com/apache/storm/pull/1826 the > KafkaSpout class should be split up a bit, and the unit tests should be > improved to use time simulation and not break encapsulation on the spout to > test. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
[jira] [Updated] (STORM-1464) storm-hdfs should support writing to multiple files
[ https://issues.apache.org/jira/browse/STORM-1464?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1464: --- Fix Version/s: 1.1.0 > storm-hdfs should support writing to multiple files > --- > > Key: STORM-1464 > URL: https://issues.apache.org/jira/browse/STORM-1464 > Project: Apache Storm > Issue Type: Improvement > Components: storm-hdfs >Reporter: Aaron Dossett >Assignee: Aaron Dossett > Labels: avro > Fix For: 2.0.0, 1.1.0 > > Time Spent: 20m > Remaining Estimate: 0h > > Examples of when this is needed include: > - One avro bolt writing multiple schemas, each of which require a different > file. Schema evolution is a common use of avro and the avro bolt should > support that seamlessly. > - Partitioning output to different directories based on the tuple contents. > For example, if the tuple contains a "USER" field, it should be possible to > partition based on that value. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
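For the second bullet (partitioning output by tuple contents, e.g. a "USER" field), the shape of a partitioner is roughly the following. All names here are invented for illustration — this is not the storm-hdfs API — and a `Map` stands in for a Storm `Tuple`:

```java
import java.util.Map;

// Illustrative partitioner: derive a subdirectory from a tuple field so
// records for USER=alice land under <base>/alice/... The Map stands in
// for a Storm Tuple in this sketch.
class FieldPartitioner {
    static String partitionPath(Map<String, Object> tuple, String field) {
        Object v = tuple.get(field);
        return v == null ? "unknown" : v.toString();
    }
}
```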
[jira] [Resolved] (STORM-2270) Kafka spout should consume from latest when zk committed offset bigger than latest offset
[ https://issues.apache.org/jira/browse/STORM-2270?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-2270. Resolution: Fixed Fix Version/s: 1.1.0 Merged to 1.x-branch. > Kafka spout should consume from latest when zk committed offset bigger than > latest offset > - > > Key: STORM-2270 > URL: https://issues.apache.org/jira/browse/STORM-2270 > Project: Apache Storm > Issue Type: Bug > Components: storm-kafka >Affects Versions: 1.0.0, 0.9.6 >Reporter: Yuzhao Chen > Labels: easyfix > Fix For: 1.1.0 > > Time Spent: 2h > Remaining Estimate: 0h > > Kafka spout should consume from latest when ZK offset bigger than latest > offset[ an TopicOffsetOutOfRangeException thrown out ], especially when Kafka > topic change it's leader and some data lost, if we consume from earliest > offset, much meaningless duplicate records will be re-consumed. So, we should > consume from latest offset instead. -- This message was sent by Atlassian JIRA (v6.3.15#6346)
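The decision rule requested in STORM-2270 is small: when the ZK-committed offset lies outside the broker's valid range, resume from the nearest valid end instead of blindly resetting to earliest. A sketch with invented names:

```java
// Where consumption should resume given the ZK-committed offset and the
// broker's earliest/latest valid offsets. If the committed offset is past
// latest (e.g. after a leader change with data loss), start from latest
// rather than earliest, avoiding a flood of duplicate records.
class OffsetReset {
    static long startingOffset(long committed, long earliest, long latest) {
        if (committed > latest) return latest;     // out of range high
        if (committed < earliest) return earliest; // log was purged
        return committed;                          // still valid
    }
}
```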
[jira] [Commented] (STORM-2292) Kafka spout enhancement, for out of range edge cases
[ https://issues.apache.org/jira/browse/STORM-2292?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838622#comment-15838622 ] P. Taylor Goetz commented on STORM-2292: [~jianbzhou] Can you repost your changes as a diff (the attachment is full of HTML) or better yet a pull request? > Kafka spout enhancement, for our of range edge cases > > > Key: STORM-2292 > URL: https://issues.apache.org/jira/browse/STORM-2292 > Project: Apache Storm > Issue Type: Improvement > Components: storm-kafka >Affects Versions: 0.10.0 >Reporter: WayneZhou >Assignee: Hugo Louro > Fix For: 0.10.0 > > Attachments: KafkaSpout.java.txt > > Original Estimate: 336h > Remaining Estimate: 336h > > @hmcl and all, we have communicated via email for a while and going forward > let's talk in this thread so everyone is in same page. > Base on the spout from the community(written by you), we have several fixes > and it worked quite stable in our production for about 6 months. > We want to share the latest spout to you and could you please kindly help > review and merge to the community version if any fix is reasonable? we want > to avoid diverging too much from the community version. > Below are our major fixes: > For failed message, in next tuple method, originally the spout seek back to > the non-continuous offset, so the failed message will be polled again for > retry, say we seek back to message 10 for retry, now if kafka log file was > purged, earliest offset is 1000, it means we will seek to 10 but reset to > 1000 as per the reset policy, and we cannot poll the message 10, so spout not > work. 
> Our fix is: we manually catch the out of range exception, commit the offset > to earliest offset first, then seek to the earliest offset > Currently the way to find next committed offset is very complex, under some > edge cases – a), if no message acked back because bolt has some issue or > cannot catch up with the spout emit; b) seek back is happened frequently and > it is much faster than the message be acked back > We give each message a status – None, emit, acked, failed(if failed number is > bigger than the maximum retry, set to acked) > One of our use cases need ordering in partition level, so after seek back for > retry, we re-emit all the follow messages again no matter they have emitted > or not, if possible, maybe you can give an option here to configure it – > either re-emit all the message from the failed one, or just emit the failed > one, same as current version. > We record the message count for acked, failed, emitted, just for statistics. > Could you please kindly help review and let us know if you can merge it into > the community version? Any comments/concern pls feel free to let us know. > Btw, our code is attached in this Jira. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
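The per-message status tracking described above ("None, emit, acked, failed; failed beyond the maximum retry is treated as acked") can be sketched as follows. All names are invented for illustration and `MAX_RETRIES` is an assumed config value, not the contributor's actual code:

```java
import java.util.HashMap;
import java.util.Map;

// Track each offset's lifecycle; once a message exhausts its retry budget
// it is marked ACKED so the spout moves past it instead of seeking back
// to the same failed offset forever.
class RetryTracker {
    enum Status { NONE, EMITTED, ACKED, FAILED }
    static final int MAX_RETRIES = 3; // assumed, would come from config

    final Map<Long, Status> status = new HashMap<>();
    final Map<Long, Integer> failures = new HashMap<>();

    void emit(long offset) { status.put(offset, Status.EMITTED); }
    void ack(long offset)  { status.put(offset, Status.ACKED); }

    void fail(long offset) {
        int n = failures.merge(offset, 1, Integer::sum);
        status.put(offset, n > MAX_RETRIES ? Status.ACKED : Status.FAILED);
    }
}
```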
[jira] [Commented] (STORM-2321) Nimbus did not come up after restart
[ https://issues.apache.org/jira/browse/STORM-2321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15838518#comment-15838518 ] P. Taylor Goetz commented on STORM-2321: [~kabhwan] Does this only affect the 1.0.x line, or is 1.x affected as well? > Nimbus did not come up after restart > > > Key: STORM-2321 > URL: https://issues.apache.org/jira/browse/STORM-2321 > Project: Apache Storm > Issue Type: Bug > Components: blobstore >Affects Versions: 1.0.0, 1.0.1, 1.0.2 >Reporter: Raghav Kumar Gautam >Assignee: Jungtaek Lim >Priority: Critical > Labels: ha > > The nimbus was restarted during HA testing. After the restart the nimbus > failed to come up. > {code} > 2017-01-18 04:57:58.231 o.a.s.s.o.a.c.f.s.ConnectionStateManager [INFO] State > change: CONNECTED > 2017-01-18 04:57:58.247 o.a.s.b.BlobStoreUtils [ERROR] Could not update the > blob with keyKillLeaderThenSubmitNewTopology1-1-1484715309-stormjar.jar > 2017-01-18 04:57:58.273 o.a.s.b.KeySequenceNumber [ERROR] Exception {} > org.apache.storm.shade.org.apache.zookeeper.KeeperException$NoNodeException: > KeeperErrorCode = NoNode for > /blobstore/KillLeaderThenSubmitNewTopology1-1-1484715309-stormjar.jar > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:111) > at > org.apache.storm.shade.org.apache.zookeeper.KeeperException.create(KeeperException.java:51) > at > org.apache.storm.shade.org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1590) > at > org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:214) > at > org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl$3.call(GetChildrenBuilderImpl.java:203) > at > org.apache.storm.shade.org.apache.curator.RetryLoop.callWithRetry(RetryLoop.java:108) > at > org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.pathInForeground(GetChildrenBuilderImpl.java:200) > at > 
org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:191) > at > org.apache.storm.shade.org.apache.curator.framework.imps.GetChildrenBuilderImpl.forPath(GetChildrenBuilderImpl.java:38) > at > org.apache.storm.blobstore.KeySequenceNumber.getKeySequenceNumber(KeySequenceNumber.java:149) > at > org.apache.storm.daemon.nimbus$get_version_for_key.invoke(nimbus.clj:456) > at > org.apache.storm.daemon.nimbus$mk_reified_nimbus$reify__9548.createStateInZookeeper(nimbus.clj:2056) > at > org.apache.storm.generated.Nimbus$Processor$createStateInZookeeper.getResult(Nimbus.java:3755) > at > org.apache.storm.generated.Nimbus$Processor$createStateInZookeeper.getResult(Nimbus.java:3740) > at > org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:39) > at > org.apache.storm.thrift.TBaseProcessor.process(TBaseProcessor.java:39) > at > org.apache.storm.security.auth.SaslTransportPlugin$TUGIWrapProcessor.process(SaslTransportPlugin.java:144) > at > org.apache.storm.thrift.server.TThreadPoolServer$WorkerProcess.run(TThreadPoolServer.java:286) > at > java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) > at > java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) > at java.lang.Thread.run(Thread.java:745) > 2017-01-18 04:57:58.274 o.a.s.s.o.a.c.f.i.CuratorFrameworkImpl [INFO] > backgroundOperationsLoop exiting > 2017-01-18 04:57:58.296 o.a.s.m.n.Login [INFO] successfully logged in. > 2017-01-18 04:57:58.309 o.a.s.s.o.a.z.ZooKeeper [INFO] Session: > 0x359afc1eaa2009b closed > 2017-01-18 04:57:58.309 o.a.s.s.o.a.z.ClientCnxn [INFO] EventThread shut down > 2017-01-18 04:57:58.310 o.a.s.t.s.TThreadPoolServer [ERROR] Error occurred > during processing of message. 
> java.util.NoSuchElementException > at java.util.TreeMap.key(TreeMap.java:1327) > at java.util.TreeMap.lastKey(TreeMap.java:297) > at java.util.TreeSet.last(TreeSet.java:401) > at > org.apache.storm.blobstore.KeySequenceNumber.getKeySequenceNumber(KeySequenceNumber.java:206) > at > org.apache.storm.daemon.nimbus$get_version_for_key.invoke(nimbus.clj:456) > at > org.apache.storm.daemon.nimbus$mk_reified_nimbus$reify__9548.createStateInZookeeper(nimbus.clj:2056) > at > org.apache.storm.generated.Nimbus$Processor$createStateInZookeeper.getResult(Nimbus.java:3755) > at > org.apache.storm.generated.Nimbus$Processor$createStateInZookeeper.getResult(Nimbus.java:3740) > at > org.apache.storm.thrift.ProcessFunction.process(ProcessFunction.java:39) > at >
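The second stack trace is `TreeSet.last()` called on an empty set inside `KeySequenceNumber.getKeySequenceNumber`. The defensive shape of a fix is simply to guard the call; the method name and fallback below are illustrative, not the actual patch:

```java
import java.util.TreeSet;

// TreeSet.last() throws NoSuchElementException on an empty set, which is
// exactly the nimbus crash above. Guarding the call lets nimbus fall back
// to a starting sequence number instead of dying during startup.
class KeySequenceGuard {
    static int maxOrDefault(TreeSet<Integer> seqs, int fallback) {
        return seqs.isEmpty() ? fallback : seqs.last();
    }
}
```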
[jira] [Commented] (STORM-1856) Release Apache Storm 1.1.0
[ https://issues.apache.org/jira/browse/STORM-1856?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15830483#comment-15830483 ] P. Taylor Goetz commented on STORM-1856: [~kabhwan] I added those to the epic. The epic link is still there, it's just buried. If you type "Release" into the epic link field it should show up in the drop-down. > Release Apache Storm 1.1.0 > -- > > Key: STORM-1856 > URL: https://issues.apache.org/jira/browse/STORM-1856 > Project: Apache Storm > Issue Type: Epic >Reporter: P. Taylor Goetz > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Issue Comment Deleted] (STORM-2228) KafkaSpout does not replay properly when a topic maps to multiple streams
[ https://issues.apache.org/jira/browse/STORM-2228?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2228: --- Comment: was deleted (was: Added to 1.1.0 release epic.) > KafkaSpout does not replay properly when a topic maps to multiple streams > - > > Key: STORM-2228 > URL: https://issues.apache.org/jira/browse/STORM-2228 > Project: Apache Storm > Issue Type: Bug > Components: storm-kafka-client >Affects Versions: 1.0.0, 2.0.0, 1.0.1, 1.0.2, 1.1.0, 1.0.3 >Reporter: Robert Joseph Evans >Assignee: Robert Joseph Evans >Priority: Blocker > > In the example. > KafkaSpoutTopologyMainNamedTopics.java > The code creates a TuplesBuilder and a KafkaSpoutStreams > {code} > protected KafkaSpoutTuplesBuildergetTuplesBuilder() { > return new KafkaSpoutTuplesBuilderNamedTopics.Builder<>( > new TopicsTest0Test1TupleBuilder (TOPICS[0], > TOPICS[1]), > new TopicTest2TupleBuilder (TOPICS[2])) > .build(); > } > protected KafkaSpoutStreams getKafkaSpoutStreams() { > final Fields outputFields = new Fields("topic", "partition", "offset", > "key", "value"); > final Fields outputFields1 = new Fields("topic", "partition", "offset"); > return new KafkaSpoutStreamsNamedTopics.Builder(outputFields, STREAMS[0], > new String[]{TOPICS[0], TOPICS[1]}) // contents of topics test, test1, sent > to test_stream > .addStream(outputFields, STREAMS[0], new String[]{TOPICS[2]}) // > contents of topic test2 sent to test_stream > .addStream(outputFields1, STREAMS[2], new String[]{TOPICS[2]}) > // contents of topic test2 sent to test2_stream > .build(); > } > {code} > Essentially the code is trying to take {{TOPICS\[0]}}, {{TOPICS\[1]}}, and > {{TOPICS\[2]}} translate them to {{Fields("topic", "partition", "offset", > "key", "value")}} and output them on {{STREAMS\[0]}}. Then just for > {{TOPICS\[2]}} they want it to be output as {{Fields("topic", "partition", > "offset")}} to {{STREAMS\[2]}}. 
(Don't know what happened to {{STREAMS\[1]}}) > There are two issues here. First with how the TupleBuilder and the > SpoutStreams are split up, but coupled {{STREAMS\[2]}} is actually getting > the full "topic" "partition" "offset" "key" "value", but this is minor. The > real issue is that the code uses the same KafkaSpoutMessageId for all the > tuples emitted to both {{STREAMS\[1]}} and {{STREAMS\[2]}}. > https://git.corp.yahoo.com/storm/storm/blob/5bcbb8d6d700d0d238d23f8f6d3976667aaedab9/external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/KafkaSpout.java#L284-L304 > The code, however, is written to assume that it will only ever get one > ack/fail for a given KafkaSpoutMessageId. This means that if one of the > emitted tuple trees succeeds and then the other fails, the failure will not > result in anything being replayed! This violates how Storm is intended to > work. > I discovered this as a part of STORM-2225, and I am fine with fixing it on > STORM-2225 (I would just remove support for that functionality because there > are other ways of doing this correctly). But that would not maintain > backwards compatibility and I am not sure it would be appropriate for 1.x > releases. I really would like to have feedback from others on this. > I can put something into 1.x where it will throw an exception if acking is > enabled and this situation is present, but I don't want to spend the time > trying to do reference counting on the number of tuples actually emitted. If > someone else wants to do that I would be happy to turn this JIRA over to them. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
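The reference-counting idea floated at the end — decide replay vs. commit only once every tuple emitted for a given KafkaSpoutMessageId has reported back, and replay if any of them failed — looks roughly like this. The class, method names, and string return values are invented for illustration:

```java
import java.util.HashMap;
import java.util.Map;

// One Kafka record fans out to N Storm tuples (one per stream). Count the
// outstanding tuples per offset and remember whether any of them failed;
// only when the count reaches zero do we know whether to replay or commit.
class RefCountedMsgId {
    final Map<Long, Integer> pending = new HashMap<>();
    final Map<Long, Boolean> anyFailed = new HashMap<>();

    void emitted(long offset, int streams) {
        pending.put(offset, streams);
        anyFailed.put(offset, false);
    }

    // Record one ack/fail; returns "WAIT" until all tuples report back,
    // then "REPLAY" if any failed, else "COMMIT".
    String onAckOrFail(long offset, boolean failed) {
        if (failed) anyFailed.put(offset, true);
        int left = pending.merge(offset, -1, Integer::sum);
        if (left > 0) return "WAIT";
        return anyFailed.get(offset) ? "REPLAY" : "COMMIT";
    }
}
```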
[jira] [Commented] (STORM-2223) Storm PMML Bolt
[ https://issues.apache.org/jira/browse/STORM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15704143#comment-15704143 ] P. Taylor Goetz commented on STORM-2223: [~harsha_ch] Our bylaws are essentially project specific amendments to Apache policy. PMC members need to be aware of that, and track ASF policy. A new JIRA created by a hortonworks employee, assigned to another hortonworks employee without open discussion could very easily be construed (in this case likely minor, but nonetheless applicable) as evidence of undue corporate influence over a project. (And yes, I am a hortonworks employee myself.) [~hmclouro] Any discussions from an informal meetup should always be brought back to the public list. There are no exceptions to that rule. > Storm PMML Bolt > --- > > Key: STORM-2223 > URL: https://issues.apache.org/jira/browse/STORM-2223 > Project: Apache Storm > Issue Type: Improvement >Reporter: Sriharsha Chintalapani >Assignee: Hugo Louro > > This JIRA is to build a Storm PMML bolt which uses JPMML library to load PMML > doc and evaluate the incoming tuples based on the user provided PMML doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-2223) Storm PMML Bolt
[ https://issues.apache.org/jira/browse/STORM-2223?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15703802#comment-15703802 ] P. Taylor Goetz commented on STORM-2223: Why was this assigned without a public discussion or a request from the assignee? I may have missed it. What libraries are planned for use, and what are the associated licenses? I've not seen many PMML libraries with ASF-compatible licenses, unfortunately. > Storm PMML Bolt > --- > > Key: STORM-2223 > URL: https://issues.apache.org/jira/browse/STORM-2223 > Project: Apache Storm > Issue Type: Improvement >Reporter: Sriharsha Chintalapani >Assignee: Hugo Louro > > This JIRA is to build a Storm PMML bolt which uses JPMML library to load PMML > doc and evaluate the incoming tuples based on the user provided PMML doc. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-2164) Create simple generic plugin system to register codahale reporters
[ https://issues.apache.org/jira/browse/STORM-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15654647#comment-15654647 ] P. Taylor Goetz commented on STORM-2164: [~abellina] the feature branch will be in the Apache repo. We can relax the commit rules for that branch while we work on features, then when ready, we create a pull request from that branch to an official version branch for formal review. I'll create a branch off 1.x-branch called "metrics_v2" so we can start creating pull requests, etc. bq. I should have some code to show for this JIRA soon. I've been generating metrics from the workers and reporting them using the file system, and having Slots pick them up and send over to Nimbus via thrift. This is not related to this JIRA specifically, but I wanted to get that going before doing configs. Yes, there's going to be overlap with what we're doing. For now, I have a hard-coded reporter in my code that will obviously go away once your config work is ready. > Create simple generic plugin system to register codahale reporters > -- > > Key: STORM-2164 > URL: https://issues.apache.org/jira/browse/STORM-2164 > Project: Apache Storm > Issue Type: Improvement >Reporter: Alessandro Bellina >Assignee: Alessandro Bellina > > Configurable plugin interface s.t. daemons can instantiate codahale reporters. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Assigned] (STORM-2159) Codahale-ize Executor and Worker built-in stats
[ https://issues.apache.org/jira/browse/STORM-2159?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz reassigned STORM-2159: -- Assignee: P. Taylor Goetz > Codahale-ize Executor and Worker built-in stats > - > > Key: STORM-2159 > URL: https://issues.apache.org/jira/browse/STORM-2159 > Project: Apache Storm > Issue Type: Improvement >Reporter: Alessandro Bellina >Assignee: P. Taylor Goetz > > We want built-in metrics in CommonStats and [Spout|Bolt]ExecutorStats to use > codahale Meter directly. > Each worker process should keep a MetricRegistry and have a configurable set > of reporters that can be instantiated. > Other reporters can be added to the registry for direct reporting to the > metric backend of choice. > Note that I am proposing we leave the old metrics through ZooKeeper in place after > this task (taken care of by STORM-2168), but we can change this if others > prefer to go cold turkey. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
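The per-worker registry plus pluggable-reporter shape described above can be sketched in plain Java. This is a minimal, self-contained stand-in, not the real Coda Hale MetricRegistry or any Storm API; the class and metric names are made up for illustration:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

// Hypothetical sketch: each worker keeps one registry of named counters,
// executors increment them directly, and any number of configured reporters
// can snapshot the registry on their own schedule.
public class WorkerMetrics {
    private final Map<String, LongAdder> meters = new ConcurrentHashMap<>();

    // Executors mark events against a named meter, codahale-style.
    public void mark(String name) {
        meters.computeIfAbsent(name, k -> new LongAdder()).increment();
    }

    // A reporter only ever sees an immutable point-in-time snapshot.
    public interface Reporter {
        void report(Map<String, Long> snapshot);
    }

    public void reportTo(Reporter reporter) {
        Map<String, Long> snapshot = new HashMap<>();
        meters.forEach((name, adder) -> snapshot.put(name, adder.sum()));
        reporter.report(snapshot);
    }

    public static void main(String[] args) {
        WorkerMetrics registry = new WorkerMetrics();
        registry.mark("emitted");
        registry.mark("emitted");
        registry.mark("acked");
        registry.reportTo(snap ->
                System.out.println("emitted=" + snap.get("emitted") + " acked=" + snap.get("acked")));
    }
}
```

The real library's Meter additionally tracks rates; the point here is only the registry/reporter split the comment proposes.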
[jira] [Commented] (STORM-2164) Create simple generic plugin system to register codahale reporters
[ https://issues.apache.org/jira/browse/STORM-2164?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15644883#comment-15644883 ] P. Taylor Goetz commented on STORM-2164: [~abellina] Can you share your basic design idea for this? I'm going to pick up the task for exposing default worker metrics via the Coda Hale API. For the metrics reporters, I was considering adding wrappers around the existing reporters that would make it easier to configure reporters via YAML configuration in {{storm.yaml}}. My reasoning for the wrappers is to provide a bridge to the fluent-style API the reporters currently use for configuration. So the config might look something like this in {{storm.yaml}}:
{code}
storm.metrics.reporters:
  - reporter.class: "org.apache.storm.metrics2.GraphiteReporter"
    config:
      report.interval.seconds: 60
      graphite.host: "localhost"
      graphite.port: 2003
      # more graphite reporter specific config...
  - reporter.class: "org.apache.storm.metrics2.ConsoleReporter"
    config:
      report.interval.seconds: 60
{code}
Then each reporter implementation would have a no-args constructor and a {{prepare(Map config)}} method that would receive the above "config" map. Does that align with what you are thinking? Would you be interested in doing this work in a feature branch to make collaboration easier? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
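The no-args-constructor plus {{prepare(Map config)}} contract proposed above can be sketched as follows. Everything here is hypothetical (the interface name, the stub reporter, and the reflective loader are illustrations of the proposal, not Storm code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the proposed plugin contract: the "reporter.class" value from
// storm.yaml is instantiated reflectively via a no-args constructor, then
// handed its "config" sub-map through prepare().
public class ReporterLoader {

    public interface MetricsReporter {
        void prepare(Map<String, Object> config);
    }

    // Hypothetical stand-in for a real reporter implementation.
    public static class ConsoleReporterStub implements MetricsReporter {
        long intervalSecs;

        @Override
        public void prepare(Map<String, Object> config) {
            intervalSecs = ((Number) config.get("report.interval.seconds")).longValue();
        }
    }

    public static MetricsReporter load(String className, Map<String, Object> config)
            throws Exception {
        MetricsReporter reporter = (MetricsReporter)
                Class.forName(className).getDeclaredConstructor().newInstance();
        reporter.prepare(config);
        return reporter;
    }

    public static void main(String[] args) throws Exception {
        Map<String, Object> config = new HashMap<>();
        config.put("report.interval.seconds", 60);
        MetricsReporter r = load(ConsoleReporterStub.class.getName(), config);
        System.out.println("interval=" + ((ConsoleReporterStub) r).intervalSecs);
    }
}
```

The reflective load-by-class-name step is what lets the reporter list live entirely in {{storm.yaml}} with no compile-time registry.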
[jira] [Created] (STORM-2176) Workers do not shutdown cleanly and worker hooks don't run when a topology is killed
P. Taylor Goetz created STORM-2176: -- Summary: Workers do not shutdown cleanly and worker hooks don't run when a topology is killed Key: STORM-2176 URL: https://issues.apache.org/jira/browse/STORM-2176 Project: Apache Storm Issue Type: Bug Affects Versions: 1.0.0, 1.0.1, 1.0.2 Reporter: P. Taylor Goetz Priority: Critical This appears to have been introduced in the 1.0.0 release. The issue does not seem to affect 0.10.2. When a topology is killed and workers receive the notification to shut down, they do not shut down cleanly, so worker hooks never get invoked. When a worker shuts down cleanly, the worker logs should contain entries such as the following:
{code}
2016-10-28 18:52:06.273 b.s.d.worker [INFO] Shut down transfer thread
2016-10-28 18:52:06.279 b.s.d.worker [INFO] Shutting down default resources
2016-10-28 18:52:06.287 b.s.d.worker [INFO] Shut down default resources
2016-10-28 18:52:06.351 b.s.d.worker [INFO] Disconnecting from storm cluster state context
2016-10-28 18:52:06.359 b.s.d.worker [INFO] Shut down worker exclaim-1-1477680593 61bddd66-0fda-4556-b742-4b63f0df6fc1 6700
{code}
In the 1.0.x line of releases (and presumably 1.x, though I haven't checked) this does not happen -- the worker shutdown process appears to get stuck shutting down executors (https://github.com/apache/storm/blob/v1.0.2/storm-core/src/clj/org/apache/storm/daemon/worker.clj#L666), no further log messages are seen in the worker log, and worker hooks do not run. There are two properties that affect how workers exit. The first is the configuration property {{supervisor.worker.shutdown.sleep.secs}}, which defaults to 1 second. This corresponds to how long the supervisor will wait for a worker to exit gracefully before forcibly killing it with {{kill -9}}. When this happens the supervisor will log that the worker terminated with exit code 137 (128 + 9). 
The second property is a hard-coded 1 second delay (https://github.com/apache/storm/blob/v1.0.2/storm-core/src/clj/org/apache/storm/util.clj#L463) added as a shutdown hook that will call {{Runtime.halt()}} if the delay is exceeded. When this happens, the supervisor will log that the worker terminated with exit code 20 (hard-coded). Side Note: The hard-coded halt delay in worker.clj and the default value for {{supervisor.worker.shutdown.sleep.secs}} both being 1 second should probably be changed, since it creates a race to see whether the supervisor delay or the worker delay wins. To test this, I set {{supervisor.worker.shutdown.sleep.secs}} to 15 to allow plenty of time for the worker to exit gracefully, and deployed and killed a topology. In this case the supervisor consistently reported exit code 20 for the worker, indicating the hard-coded shutdown hook caused the worker to exit. I thought the hard-coded 1 second shutdown hook delay might not be long enough for the worker to shut down cleanly. To test that hypothesis, I changed the hard-coded delay to 10 seconds, leaving {{supervisor.worker.shutdown.sleep.secs}} at 15 seconds. Again the supervisor reported an exit code of 20 for the worker, and there were no log messages indicating the worker had exited cleanly and that the worker hook had run. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
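The race between the halt hook and graceful executor shutdown can be sketched as a small simulation. The timings are illustrative, and the final print stands in for the real {{Runtime.halt(20)}} call; none of this is Storm code:

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Simulates the shutdown race described above: a graceful shutdown that
// takes longer than the 1-second watchdog loses, just as the hard-coded
// halt hook preempts executor shutdown in the worker.
public class ShutdownRace {
    public static void main(String[] args) throws Exception {
        CountDownLatch clean = new CountDownLatch(1);

        // Graceful executor shutdown that takes 3 seconds, i.e. longer
        // than the hard-coded 1-second halt delay.
        Thread graceful = new Thread(() -> {
            try {
                Thread.sleep(3000);
            } catch (InterruptedException ignored) {
            }
            clean.countDown();
        });
        graceful.setDaemon(true);
        graceful.start();

        // Stand-in for the shutdown hook: wait at most 1 second, then give
        // up; the real code calls Runtime.halt(20) at this point.
        if (clean.await(1, TimeUnit.SECONDS)) {
            System.out.println("worker exited cleanly");
        } else {
            System.out.println("halt wins: worker terminated with exit code 20");
        }
    }
}
```

Because the graceful path takes 3 seconds against a 1-second watchdog, the watchdog always wins here, which mirrors the consistently observed exit code 20.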
[jira] [Commented] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15595862#comment-15595862 ] P. Taylor Goetz commented on STORM-2153: One of the things we're going to have to work out is how to properly aggregate different metric types, if we can at all. For simple metric types like Counters, it is straightforward. But things get more complicated with metrics like Meters and Histograms -- there's no way to really aggregate percentiles. The problem is that metrics maintain a certain amount of state necessary to calculate certain values like percentiles. When a reporter reports those values, it takes a point-in-time snapshot of those calculations, so all the data used to calculate those values is no longer available. As an example, let's say we want to have a Histogram for execute latency (so we can report various percentiles), and we want to aggregate those statistics across a topology. So worker A and worker B report their histogram snapshots to nimbus. How does nimbus aggregate those values without the datasets the calculations were based on? > New Metrics Reporting API > - > > Key: STORM-2153 > URL: https://issues.apache.org/jira/browse/STORM-2153 > Project: Apache Storm > Issue Type: Improvement >Reporter: P. Taylor Goetz > > This is a proposal to provide a new metrics reporting API based on [Coda > Hale's metrics library | http://metrics.dropwizard.io/3.1.0/] (AKA > Dropwizard/Yammer metrics). > h2. Background > In a [discussion on the dev@ mailing list | > http://mail-archives.apache.org/mod_mbox/storm-dev/201610.mbox/%3ccagx0urh85nfh0pbph11pmc1oof6htycjcxsxgwp2nnofukq...@mail.gmail.com%3e] > a number of community and PMC members recommended replacing Storm’s metrics > system with a new API as opposed to enhancing the existing metrics system. 
> Some of the objections to the existing metrics API include: > # Metrics are reported as an untyped Java object, making it very difficult to > reason about how to report it (e.g. is it a gauge, a counter, etc.?) > # It is difficult to determine if metrics coming into the consumer are > pre-aggregated or not. > # Storm’s metrics collection occurs through a specialized bolt, which in > addition to potentially affecting system performance, complicates certain > types of aggregation when the parallelism of that bolt is greater than one. > In the discussion on the developer mailing list, there is growing consensus > for replacing Storm’s metrics API with a new API based on Coda Hale’s metrics > library. This approach has the following benefits: > # Coda Hale’s metrics library is very stable, performant, well thought out, > and widely adopted among open source projects (e.g. Kafka). > # The metrics library provides many existing metric types: Meters, Gauges, > Counters, Histograms, and more. > # The library has a pluggable “reporter” API for publishing metrics to > various systems, with existing implementations for: JMX, console, CSV, SLF4J, > Graphite, Ganglia. > # Reporters are straightforward to implement, and can be reused by any > project that uses the metrics library (i.e. would have broader application > outside of Storm) > As noted earlier, the metrics library supports pluggable reporters for > sending metrics data to other systems, and implementing a reporter is fairly > straightforward (an example reporter implementation can be found here). For > example if someone develops a reporter based on Coda Hale’s metrics, it could > not only be used for pushing Storm metrics, but also for any system that used > the metrics library, such as Kafka. > h2. 
Scope of Effort > The effort to implement a new metrics API for Storm can be broken down into > the following development areas: > # Implement API for Storms internal worker metrics: latencies, queue sizes, > capacity, etc. > # Implement API for user defined, topology-specific metrics (exposed via the > {{org.apache.storm.task.TopologyContext}} class) > # Implement API for storm daemons: nimbus, supervisor, etc. > h2. Relationship to Existing Metrics > This would be a new API that would not affect the existing metrics API. Upon > completion, the old metrics API would presumably be deprecated, but kept in > place for backward compatibility. > Internally the current metrics API uses Storm bolts for the reporting > mechanism. The proposed metrics API would not depend on any of Storm's > messaging capabilities and instead use the [metrics library's built-in > reporter mechanism | > http://metrics.dropwizard.io/3.1.0/manual/core/#man-core-reporters]. This > would allow users to use existing {{Reporter}} implementations which are not > Storm-specific, and would simplify the process of collecting metrics. > Compared to Storm's {{IMetricCollector}} interface, implementing a reporter
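The percentile-aggregation problem raised in the comment above can be made concrete with a self-contained example: the mean of two per-worker p99 snapshots is not the p99 of the combined data. The latency values and the nearest-rank percentile are synthetic illustrations, not anything from the metrics library:

```java
import java.util.stream.IntStream;

// Demonstrates why histogram snapshots don't aggregate: combining
// per-worker p99 values (here by averaging) gives a different answer than
// computing p99 over the pooled data.
public class PercentileAggregation {

    // Nearest-rank percentile over an already-sorted array.
    static double p(int[] sorted, double q) {
        return sorted[(int) Math.ceil(q * sorted.length) - 1];
    }

    public static void main(String[] args) {
        int[] workerA = IntStream.rangeClosed(1, 100).toArray();   // 1..100 ms
        int[] workerB = IntStream.rangeClosed(101, 200).toArray(); // 101..200 ms
        int[] combined = IntStream.rangeClosed(1, 200).toArray();  // pooled data

        double avgOfP99 = (p(workerA, 0.99) + p(workerB, 0.99)) / 2;
        System.out.println("mean of per-worker p99 = " + avgOfP99);
        System.out.println("true combined p99 = " + p(combined, 0.99));
    }
}
```

The two results differ because the percentile calculation needs the underlying samples, which are exactly what a snapshot discards.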
[jira] [Commented] (STORM-2165) Implement default worker metrics reporter and changes in Supervisor to read metrics
[ https://issues.apache.org/jira/browse/STORM-2165?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15592615#comment-15592615 ] P. Taylor Goetz commented on STORM-2165: Please see this for background: https://issues.apache.org/jira/browse/STORM-2169 If we use that approach for aggregating, then aggregating at the supervisor would actually lead to more traffic to the metric store -- in addition to forwarding the original metric, the supervisor would also be sending a value for each level of aggregation. I think if we have sane worker reporting intervals we can avoid too much of an impact on the metrics store. Another future optimization would be to try to stagger worker report times to avoid all workers reporting at once, but I'd have to think more about how we would do that. > Implement default worker metrics reporter and changes in Supervisor to read > metrics > --- > > Key: STORM-2165 > URL: https://issues.apache.org/jira/browse/STORM-2165 > Project: Apache Storm > Issue Type: Improvement >Reporter: Alessandro Bellina > > A reporter that writes metrics to a known location such that Supervisor can > read them and send them to the collection service (initially Nimbus). > Also for this ticket to be complete, Supervisors should be able to read and > send metrics to Nimbus. > This doesn't involve aggregation, just individual worker metrics. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
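One simple way to stagger worker report times, as floated above, is to derive a per-worker initial delay from a stable hash of the worker id, so first reports spread across the reporting interval. The worker ids and the hashing scheme here are made up for illustration:

```java
// Sketch of staggered reporting: each worker's first report is offset by a
// deterministic amount within the interval, derived from its id, so all
// workers don't hit the metrics store at the same instant.
public class StaggeredReporting {
    public static void main(String[] args) {
        int intervalSecs = 60;
        boolean allInRange = true;
        for (String workerId : new String[]{"host1:6700", "host1:6701", "host2:6700"}) {
            // floorMod keeps the offset in [0, intervalSecs) even for
            // negative hash codes.
            int offsetSecs = Math.floorMod(workerId.hashCode(), intervalSecs);
            allInRange &= offsetSecs >= 0 && offsetSecs < intervalSecs;
            System.out.println(workerId + " -> first report at +" + offsetSecs + "s");
        }
        System.out.println("all offsets within one interval: " + allInRange);
    }
}
```

After the initial offset, each worker would report at the fixed interval, keeping the load on the store spread out.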
[jira] [Commented] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589768#comment-15589768 ] P. Taylor Goetz commented on STORM-2153: One concern I have with having nimbus as the destination of aggregated metrics is nimbus HA. If a new nimbus becomes leader, presumably all metrics will be lost or reset to stale values, unless there is some sort of replication. Would a "metrics store" need to be HA? I would probably argue that it isn't, provided the reporters handle a metrics store outage gracefully without affecting performance (since the reporter threads are not on the critical path, this wouldn't be too hard). I also like the idea of doing the aggregation at the "metrics store" level. That would simplify things, since all metrics reporters could push values to that store without doing any intermediate aggregation. (I'm using scare quotes around "metrics store" because it is starting to sound like we are talking about a new service/daemon.)
[jira] [Commented] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589547#comment-15589547 ] P. Taylor Goetz commented on STORM-2153: bq. There really are several smaller pieces of this project that each need to be addressed separately. I'd say let's make the broken-down tasks subtasks of this JIRA. -- This message was sent by Atlassian JIRA
[jira] [Commented] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15589447#comment-15589447 ] P. Taylor Goetz commented on STORM-2153: Copying your wish list here for easy reference: bq. 1. Aggregation at component level (Average, Sum etc) As I mentioned in an earlier comment, this is likely going to require something centralized in order to aggregate cross-worker metrics. We'll need more discussion to work this out. bq. 2. Blacklist/whitelist Metrics supports the concept of filtering: https://github.com/dropwizard/metrics/blob/3.2-development/metrics-core/src/main/java/com/codahale/metrics/MetricFilter.java bq. 3. Allow only numbers for values All the metric types in the library are pretty well typed to only number values. Gauges support non-number types, but I don't see us using many non-number types. bq. 4. Efficient routing of built-in metrics to UI (current they get tagged along with executor heartbeat which puts pressure on zookeeper) I feel UI integration is out of scope for the time being, but can be a follow on effort. bq. 5. Worker/JVM level metrics which are not owned by a particular component Metrics JVM support: http://metrics.dropwizard.io/3.1.0/manual/jvm/ bq. 6. Percentiles for latency metrics such as p99, p95 etc This is handled automatically for a number of metrics: https://github.com/dropwizard/metrics/blob/3.2-development/metrics-core/src/main/java/com/codahale/metrics/Snapshot.java bq. 7. Aggregation at stream level, and machine level This goes back to point #1. We'll have to plan carefully how we support aggregation. bq. 8. way to subscribe cluster metrics I think this is covered by the metrics reporter mechanism. bq. 9. counter stats as non-sampled if it doesn't hurt performance We should be able to determine that with some micro-benchmarking. bq. 10. more metrics like serialization/deserialization latency, queue status Agreed. It's actually very easy to add new metrics. 
We do, however, want to be careful not to over-instrument and affect performance. bq. 11. Dynamically turning on/off specific metrics That I haven't thought about too much, but we could probably do something like a dynamic filter (see #2).
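The filter-based blacklist/whitelist (point 2) and the dynamic on/off idea (point 11) in the comment above combine naturally: a reporter consults a predicate per metric name, and flipping a shared flag changes what passes at runtime. The metric names and the prefix convention here are illustrative, not Storm's; the real library's MetricFilter also receives the Metric instance, which this sketch omits:

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.Predicate;

// Sketch of name-based metric filtering with a runtime toggle: internal
// ("__"-prefixed) metrics are always dropped, and "debug." metrics are
// reported only while the flag is set.
public class MetricFilters {
    public static void main(String[] args) {
        AtomicBoolean debugMetricsEnabled = new AtomicBoolean(false);

        Predicate<String> filter = name ->
                !name.startsWith("__")
                && (debugMetricsEnabled.get() || !name.startsWith("debug."));

        System.out.println(filter.test("spout.emitted"));     // ordinary metric
        System.out.println(filter.test("__system.uptime"));   // always blacklisted
        System.out.println(filter.test("debug.ser.latency")); // off by default

        debugMetricsEnabled.set(true);                        // dynamic toggle
        System.out.println(filter.test("debug.ser.latency")); // now passes
    }
}
```

Because the predicate reads the flag on every call, changing the flag takes effect on the next report cycle without rebuilding the reporter.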
[jira] [Updated] (STORM-2153) New Metrics Reporting API
[ https://issues.apache.org/jira/browse/STORM-2153?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-2153: --- Description: This is a proposal to provide a new metrics reporting API based on [Coda Hale's metrics library | http://metrics.dropwizard.io/3.1.0/] (AKA Dropwizard/Yammer metrics). [remainder of the description is identical to the proposal text quoted under STORM-2153 above]