[GitHub] storm issue #2621: [STORM-3017] Refactor pacemaker client exception handling
Github user Ethanlm commented on the issue: https://github.com/apache/storm/pull/2621 @revans2 @srdo Could you please take a look if you get a chance? Thanks ---
[GitHub] storm issue #2634: [STORM-3021] Fix wrong usages of Config.TOPOLOGY_WORKERS ...
Github user Ethanlm commented on the issue: https://github.com/apache/storm/pull/2634 @revans2 @roshannaik Could you please take a look if you get a chance, as we discussed this in https://issues.apache.org/jira/browse/STORM-3021 ---
[GitHub] storm pull request #2634: [STORM-3021] Fix wrong usages of Config.TOPOLOGY_W...
Github user roshannaik commented on a diff in the pull request: https://github.com/apache/storm/pull/2634#discussion_r182545551

--- Diff: docs/Resource_Aware_Scheduler_overview.md ---
@@ -184,6 +184,10 @@ The user can set some default configurations for the Resource Aware Scheduler in

```
topology.worker.max.heap.size.mb: 768.0
```
+### Warning
+
+The number of workers will be dynamically calculated by the Resource Aware Scheduler. The `Config.TOPOLOGY_WORKERS` will not be honored.
--- End diff ---

May I suggest a minor rewording: "If Resource Aware Scheduling is enabled, it will dynamically calculate the number of workers and the `topology.workers` setting is ignored."

---
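For context on the warning discussed above, the worker count under the Resource Aware Scheduler is derived from resource requests rather than `topology.workers`. A minimal sketch of that idea, assuming the estimate is total requested on-heap memory divided by the per-worker max heap (this mirrors the intent of `getEstimatedWorkerCountForRASTopo`, not its exact code):

```java
public class RasWorkerEstimate {
    // Hedged sketch: a resource-aware scheduler packs executors into workers,
    // so the worker count falls out of memory requests, not topology.workers.
    // Here we assume: workers = ceil(total on-heap MB / worker max heap MB).
    public static int estimateWorkers(double totalOnHeapMb, double workerMaxHeapMb) {
        return (int) Math.ceil(totalOnHeapMb / workerMaxHeapMb);
    }

    public static void main(String[] args) {
        // With the documented default topology.worker.max.heap.size.mb: 768.0,
        // a topology requesting 2000 MB on-heap would be packed into 3 workers.
        System.out.println(estimateWorkers(2000.0, 768.0));
    }
}
```

Any `topology.workers` value the user sets plays no part in this calculation, which is why the doc warning says it is ignored.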
[GitHub] storm pull request #2634: [STORM-3021] Fix wrong usages of Config.TOPOLOGY_W...
Github user roshannaik commented on a diff in the pull request: https://github.com/apache/storm/pull/2634#discussion_r182552073

--- Diff: storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java ---
@@ -305,8 +305,9 @@ private void setupFlushTupleTimer(final Map topologyConf, final
     }

     private void setupBackPressureCheckTimer(final Map topologyConf) {
-        final Integer workerCount = ObjectReader.getInt(topologyConf.get(Config.TOPOLOGY_WORKERS));
-        if (workerCount <= 1) {
+        Set nonLocalTasks = Sets.difference(workerState.getTaskToComponent().keySet(),
--- End diff ---

Perhaps we can consolidate these set difference calls to count single/multi-worker mode into a single function? See https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java#L230

---
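The consolidation suggested in the comment above could look something like the following. This is a hypothetical helper, not the actual Worker.java code: it treats the worker as running in single-worker mode when every task in the topology is local to this worker (using plain JDK sets in place of Guava's `Sets.difference`):

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class WorkerModeCheck {
    // Hypothetical consolidated helper: the worker is in single-worker mode
    // when no task in the topology is assigned to another worker, i.e. the
    // set difference (all tasks) - (local tasks) is empty.
    public static boolean isSingleWorker(Map<Integer, String> taskToComponent,
                                         Set<Integer> localTaskIds) {
        Set<Integer> nonLocalTasks = new HashSet<>(taskToComponent.keySet());
        nonLocalTasks.removeAll(localTaskIds); // JDK stand-in for Guava Sets.difference
        return nonLocalTasks.isEmpty();
    }

    public static void main(String[] args) {
        Map<Integer, String> tasks = Map.of(1, "spout", 2, "bolt", 3, "bolt");
        System.out.println(isSingleWorker(tasks, Set.of(1, 2, 3))); // all tasks local
        System.out.println(isSingleWorker(tasks, Set.of(1, 2)));    // task 3 elsewhere
    }
}
```

Both call sites (the back-pressure check timer and the earlier check around line 230) could then share this one predicate instead of repeating the set difference.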
[GitHub] storm pull request #2634: [STORM-3021] Fix wrong usages of Config.TOPOLOGY_W...
Github user roshannaik commented on a diff in the pull request: https://github.com/apache/storm/pull/2634#discussion_r182549691

--- Diff: storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java ---
@@ -3032,9 +3019,19 @@ public void submitTopologyWithOpts(String topoName, String uploadedJarLocation,
     // if the other config does not have it set.
     topology = normalizeTopology(totalConf, topology);

-    // set the number of acker executors;
-    totalConfToSave.put(Config.TOPOLOGY_ACKER_EXECUTORS, getNumOfAckerExecs(totalConf, topology));
-    LOG.debug("Config.TOPOLOGY_ACKER_EXECUTORS set to: {}", totalConfToSave.get(Config.TOPOLOGY_ACKER_EXECUTORS));
+    // if the Resource Aware Scheduler is used,
+    // we might need to set the number of acker executors and eventlogger executors to be the estimated number of workers.
+    if (ServerUtils.isRAS(conf)) {
+        int estimatedNumWorker = ServerUtils.getEstimatedWorkerCountForRASTopo(totalConf, topology);
+        int numAckerExecs = ObjectReader.getInt(totalConf.get(Config.TOPOLOGY_ACKER_EXECUTORS), estimatedNumWorker);
+        int numEventLoggerExecs = ObjectReader.getInt(totalConf.get(Config.TOPOLOGY_EVENTLOGGER_EXECUTORS), estimatedNumWorker);
+
+        totalConfToSave.put(Config.TOPOLOGY_ACKER_EXECUTORS, numAckerExecs);
--- End diff ---

Is this a case where we are overriding the acker & eventLogger count settings that are set by the user?

---
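On the question above: the pattern `ObjectReader.getInt(value, default)` in the diff only falls back to the default when the user did not set the config. A simplified, self-contained sketch of that defaulting behavior (config keys written as string literals, and a simplified stand-in for `ObjectReader.getInt`):

```java
import java.util.HashMap;
import java.util.Map;

public class RasExecutorDefaults {
    // Simplified stand-in for ObjectReader.getInt(value, default): keep the
    // user's value when present, otherwise fall back to the default.
    static int getIntOrDefault(Object value, int dflt) {
        return value == null ? dflt : ((Number) value).intValue();
    }

    // Sketch of the defaulting logic in the diff (names simplified, not the
    // actual Nimbus code): acker/eventlogger executor counts default to the
    // estimated worker count only when the user left them unset.
    public static Map<String, Object> applyDefaults(Map<String, Object> conf,
                                                    int estimatedWorkers) {
        Map<String, Object> out = new HashMap<>(conf);
        out.put("topology.acker.executors",
                getIntOrDefault(conf.get("topology.acker.executors"), estimatedWorkers));
        out.put("topology.eventlogger.executors",
                getIntOrDefault(conf.get("topology.eventlogger.executors"), estimatedWorkers));
        return out;
    }

    public static void main(String[] args) {
        Map<String, Object> conf = new HashMap<>();
        conf.put("topology.acker.executors", 2); // user-set value
        Map<String, Object> out = applyDefaults(conf, 5);
        System.out.println(out.get("topology.acker.executors"));       // user value kept
        System.out.println(out.get("topology.eventlogger.executors")); // defaulted
    }
}
```

So a user-supplied acker or eventlogger count is preserved; only unset values are replaced by the estimated worker count.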
[GitHub] storm pull request #2634: [STORM-3021] Fix wrong usages of Config.TOPOLOGY_W...
Github user roshannaik commented on a diff in the pull request: https://github.com/apache/storm/pull/2634#discussion_r182628537

--- Diff: storm-client/src/jvm/org/apache/storm/daemon/worker/Worker.java ---
@@ -305,8 +305,9 @@ private void setupFlushTupleTimer(final Map topologyConf, final
     }

     private void setupBackPressureCheckTimer(final Map topologyConf) {
-        final Integer workerCount = ObjectReader.getInt(topologyConf.get(Config.TOPOLOGY_WORKERS));
-        if (workerCount <= 1) {
+        Set nonLocalTasks = Sets.difference(workerState.getTaskToComponent().keySet(),
--- End diff ---

Some other places that may need similar fixes:
- https://github.com/apache/storm/blob/master/examples/storm-loadgen/src/main/java/org/apache/storm/loadgen/CaptureLoad.java#L109
- https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/daemon/StormCommon.java#L363-L364
- https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/daemon/StormCommon.java#L276

We may want to check all usages of that setting in the code base.

---
[GitHub] storm pull request #2634: [STORM-3021] Fix wrong usages of Config.TOPOLOGY_W...
Github user roshannaik commented on a diff in the pull request: https://github.com/apache/storm/pull/2634#discussion_r182584203

--- Diff: storm-server/src/main/java/org/apache/storm/daemon/nimbus/Nimbus.java ---
@@ -3032,9 +3019,19 @@ public void submitTopologyWithOpts(String topoName, String uploadedJarLocation,
     // if the other config does not have it set.
     topology = normalizeTopology(totalConf, topology);

-    // set the number of acker executors;
-    totalConfToSave.put(Config.TOPOLOGY_ACKER_EXECUTORS, getNumOfAckerExecs(totalConf, topology));
-    LOG.debug("Config.TOPOLOGY_ACKER_EXECUTORS set to: {}", totalConfToSave.get(Config.TOPOLOGY_ACKER_EXECUTORS));
+    // if the Resource Aware Scheduler is used,
+    // we might need to set the number of acker executors and eventlogger executors to be the estimated number of workers.
+    if (ServerUtils.isRAS(conf)) {
+        int estimatedNumWorker = ServerUtils.getEstimatedWorkerCountForRASTopo(totalConf, topology);
+        int numAckerExecs = ObjectReader.getInt(totalConf.get(Config.TOPOLOGY_ACKER_EXECUTORS), estimatedNumWorker);
+        int numEventLoggerExecs = ObjectReader.getInt(totalConf.get(Config.TOPOLOGY_EVENTLOGGER_EXECUTORS), estimatedNumWorker);
+
+        totalConfToSave.put(Config.TOPOLOGY_ACKER_EXECUTORS, numAckerExecs);
--- End diff ---

Some other places that may need similar fixes:
- https://github.com/apache/storm/blob/master/examples/storm-loadgen/src/main/java/org/apache/storm/loadgen/CaptureLoad.java#L109
- https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/daemon/StormCommon.java#L363-L364
- https://github.com/apache/storm/blob/master/storm-client/src/jvm/org/apache/storm/daemon/StormCommon.java#L276

We may want to check all usages of that setting in the code base.

---
[GitHub] storm pull request #2637: Map of Spout configurations from `storm-kafka` to ...
GitHub user srishtyagrawal opened a pull request: https://github.com/apache/storm/pull/2637

Map of Spout configurations from `storm-kafka` to `storm-kafka-client`

As per @srdo's and @ptgoetz's replies on the Storm Dev mailing list, I am adding the spout configuration map to the `storm-kafka-client` document. [The gist](https://gist.github.com/srishtyagrawal/850b0c3f661cf3c620c27f314791224b), with the initial changes, had comments from @srdo and questions from me, which I am pasting here for convenience:

Last comment by @srdo:

Thanks, I think this is nearly there. The maxOffsetBehind section says that "If a failing tuple's offset is less than maxOffsetBehind, the spout stops retrying the tuple." Shouldn't it be more than? I.e. if the latest offset is 100 and you set maxOffsetBehind to 50, and then offset 30 fails, offset 30 is more than maxOffsetBehind behind the latest offset, so it is not retried.

Regarding the links, I think we should try to use links that automatically point at the right release. There's some documentation about it here: https://github.com/apache/storm-site#how-release-specific-docs-work, and example usage: "The allowed values are listed in the FirstPollOffsetStrategy javadocs" (from https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md). It would be great if you fix any broken links you find, or any links that are hard-coded to point at a specific release.

My reply:

I copied the [maxOffsetBehind documentation](https://docs.hortonworks.com/HDPDocuments/HDP2/HDP-2.6.2/bk_storm-component-guide/content/storm-kafkaspout-config-core.html) from here. It is confusing because, in your earlier example, the value 30 itself is less than 100-50, but I like the idea of adding "behind" to make it clearer. As there is more than one scenario where maxOffsetBehind is used, I have modified the documentation to use the fail scenario as an example.

Thanks for the documentation on links; I will fix all the existing links and the ones which are currently broken in the storm-kafka-client documentation.

Question: It seems like none of the release-related links in [storm-kafka-client.md](https://github.com/apache/storm/blob/master/docs/storm-kafka-client.md) work. I looked at other docs as well, for example [Hooks.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Hooks.md), [Concepts.md](https://github.com/apache/storm/blob/09e01231cc427004bab475c9c70f21fa79cfedef/docs/Concepts.md), [Configuration.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Configuration.md), and [Common-patterns.md](https://github.com/apache/storm/blob/a4afacd9617d620f50cf026fc599821f7ac25c79/docs/Common-patterns.md) (the first four documents I looked into for relative links), where these links gave a 404. I have yet to figure out why these links don't work.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/srishtyagrawal/storm migrateSpoutConfigs

Alternatively you can review and apply these changes as the patch at: https://github.com/apache/storm/pull/2637.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message: This closes #2637

commit 2ca4fc851c17e1cb8a4208fe5cb0c3916551080b
Author: Srishty Agrawal
Date: 2018-04-19T00:13:57Z

Map of Spout configurations from storm-kafka to storm-kafka-client ---
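The maxOffsetBehind semantics discussed in the thread above can be made concrete with a small sketch. The rule, as clarified by @srdo, is that a failed tuple is not retried when it lags the latest offset by more than maxOffsetBehind (the method name here is hypothetical, for illustration only):

```java
public class MaxOffsetBehindExample {
    // Illustrative predicate, not storm-kafka's actual API: a failed tuple is
    // retried only while (latestOffset - failedOffset) <= maxOffsetBehind,
    // i.e. while it has not fallen more than maxOffsetBehind behind.
    public static boolean shouldRetry(long latestOffset, long failedOffset,
                                      long maxOffsetBehind) {
        return latestOffset - failedOffset <= maxOffsetBehind;
    }

    public static void main(String[] args) {
        // srdo's example: latest offset 100, maxOffsetBehind 50, offset 30 fails.
        // 100 - 30 = 70 > 50, so the tuple is too far behind and is not retried.
        System.out.println(shouldRetry(100, 30, 50));
        // Offset 60 is only 40 behind, within maxOffsetBehind, so it is retried.
        System.out.println(shouldRetry(100, 60, 50));
    }
}
```

This is why the doc wording needed "behind": the comparison is against the tuple's distance from the latest offset, not against the offset value itself.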