[jira] [Created] (STORM-2963) Updates to Performance.md
Roshan Naik created STORM-2963:
----------------------------------

             Summary: Updates to Performance.md
                 Key: STORM-2963
                 URL: https://issues.apache.org/jira/browse/STORM-2963
             Project: Apache Storm
          Issue Type: Improvement
    Affects Versions: 2.0.0
            Reporter: Roshan Naik
            Assignee: Roshan Naik
             Fix For: 2.0.0

--
This message was sent by Atlassian JIRA (v7.6.3#76005)
[jira] [Commented] (STORM-2947) Review and fix/remove deprecated things in Storm 2.0.0
[ https://issues.apache.org/jira/browse/STORM-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369615#comment-16369615 ]

Roshan Naik commented on STORM-2947:
------------------------------------

*Notes:*

*Settings to deprecate in 1.x:*
As they have been removed in 2.x:
 - topology.sleep.spout.wait.strategy.time.ms
 - topology.backpressure.enable
 - backpressure.disruptor.high.watermark
 - backpressure.disruptor.low.watermark
 - backpressure.znode.timeout.secs
 - backpressure.znode.update.freq.secs
 - task.backpressure.poll.secs
 - topology.executor.receive.buffer.size: 1024 #batched
 - topology.executor.send.buffer.size: 1024 #individual messages
 - topology.transfer.buffer.size: 1024 # batched
 - topology.bolts.outgoing.overflow.buffer.enable: false
 - topology.disruptor.wait.timeout.millis: 1000
 - topology.disruptor.batch.size: 100
 - topology.disruptor.batch.timeout.millis:

*Classes to deprecate in 1.x:*
 - org.apache.storm.spout.SleepSpoutWaitStrategy
 - org.apache.storm.spout.ISpoutWaitStrategy
 - org.apache.storm.spout.NothingEmptyEmitStrategy

*Settings to remove in 2.x:*
 - nimbus.host
 - storm.messaging.netty.max_retries
 - storm.local.mode.zmq

> Review and fix/remove deprecated things in Storm 2.0.0
> ------------------------------------------------------
>
>                 Key: STORM-2947
>                 URL: https://issues.apache.org/jira/browse/STORM-2947
>             Project: Apache Storm
>          Issue Type: Task
>          Components: storm-client, storm-hdfs, storm-kafka, storm-server, storm-solr
>    Affects Versions: 2.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>
> We've been deprecating things but haven't had time to replace or get rid of
> them. It would be better if we took the time to review and address them.
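The settings listed in the STORM-2947 comment above could be checked against an existing 1.x configuration before an upgrade. The following is only a minimal sketch under stated assumptions: `RemovedSettingsCheck` and `findRemoved` are hypothetical names, not part of Storm, and the configuration is modelled as a plain `Map` rather than loaded through any Storm API. The key list is copied from the comment.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

/** Hypothetical helper (not part of Storm): flags settings the comment lists as removed in 2.x. */
final class RemovedSettingsCheck {

    // Keys copied from the "Settings to deprecate in 1.x" list in the comment.
    static final List<String> REMOVED_IN_2X = Arrays.asList(
        "topology.sleep.spout.wait.strategy.time.ms",
        "topology.backpressure.enable",
        "backpressure.disruptor.high.watermark",
        "backpressure.disruptor.low.watermark",
        "backpressure.znode.timeout.secs",
        "backpressure.znode.update.freq.secs",
        "task.backpressure.poll.secs",
        "topology.executor.receive.buffer.size",
        "topology.executor.send.buffer.size",
        "topology.transfer.buffer.size",
        "topology.bolts.outgoing.overflow.buffer.enable",
        "topology.disruptor.wait.timeout.millis",
        "topology.disruptor.batch.size",
        "topology.disruptor.batch.timeout.millis");

    /** Returns the removed keys present in conf, in the order of the list above. */
    static List<String> findRemoved(Map<String, Object> conf) {
        return REMOVED_IN_2X.stream()
            .filter(conf::containsKey)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Example configuration; the values are illustrative only.
        Map<String, Object> conf = new LinkedHashMap<>();
        conf.put("topology.workers", 2);
        conf.put("topology.backpressure.enable", true);
        conf.put("topology.disruptor.batch.size", 100);

        for (String key : findRemoved(conf)) {
            System.out.println("Setting removed in 2.x: " + key);
        }
    }
}
```

A check like this only reports the keys; what replaces each setting in 2.x is not covered by the comment and is left out here.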
[jira] [Comment Edited] (STORM-2947) Review and fix/remove deprecated things in Storm 2.0.0
[ https://issues.apache.org/jira/browse/STORM-2947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369615#comment-16369615 ]

Roshan Naik edited comment on STORM-2947 at 2/20/18 12:39 AM:
--------------------------------------------------------------

A few things that I am aware of:

*Settings to deprecate in 1.x:*
As they have been removed in 2.x:
 - topology.sleep.spout.wait.strategy.time.ms
 - topology.backpressure.enable
 - backpressure.disruptor.high.watermark
 - backpressure.disruptor.low.watermark
 - backpressure.znode.timeout.secs
 - backpressure.znode.update.freq.secs
 - task.backpressure.poll.secs
 - topology.executor.receive.buffer.size: 1024 #batched
 - topology.executor.send.buffer.size: 1024 #individual messages
 - topology.transfer.buffer.size: 1024 # batched
 - topology.bolts.outgoing.overflow.buffer.enable: false
 - topology.disruptor.wait.timeout.millis: 1000
 - topology.disruptor.batch.size: 100
 - topology.disruptor.batch.timeout.millis:

*Classes to deprecate in 1.x:*
 - org.apache.storm.spout.SleepSpoutWaitStrategy
 - org.apache.storm.spout.ISpoutWaitStrategy
 - org.apache.storm.spout.NothingEmptyEmitStrategy

*Settings to remove in 2.x:*
 - nimbus.host
 - storm.messaging.netty.max_retries
 - storm.local.mode.zmq

was (Author: roshan_naik):

*Notes:*

*Settings to deprecate in 1.x:*
As they have been removed in 2.x:
 - topology.sleep.spout.wait.strategy.time.ms
 - topology.backpressure.enable
 - backpressure.disruptor.high.watermark
 - backpressure.disruptor.low.watermark
 - backpressure.znode.timeout.secs
 - backpressure.znode.update.freq.secs
 - task.backpressure.poll.secs
 - topology.executor.receive.buffer.size: 1024 #batched
 - topology.executor.send.buffer.size: 1024 #individual messages
 - topology.transfer.buffer.size: 1024 # batched
 - topology.bolts.outgoing.overflow.buffer.enable: false
 - topology.disruptor.wait.timeout.millis: 1000
 - topology.disruptor.batch.size: 100
 - topology.disruptor.batch.timeout.millis:

*Classes to deprecate in 1.x:*
 - org.apache.storm.spout.SleepSpoutWaitStrategy
 - org.apache.storm.spout.ISpoutWaitStrategy
 - org.apache.storm.spout.NothingEmptyEmitStrategy

*Settings to remove in 2.x:*
 - nimbus.host
 - storm.messaging.netty.max_retries
 - storm.local.mode.zmq

> Review and fix/remove deprecated things in Storm 2.0.0
> ------------------------------------------------------
>
>                 Key: STORM-2947
>                 URL: https://issues.apache.org/jira/browse/STORM-2947
>             Project: Apache Storm
>          Issue Type: Task
>          Components: storm-client, storm-hdfs, storm-kafka, storm-server, storm-solr
>    Affects Versions: 2.0.0
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>
> We've been deprecating things but haven't had time to replace or get rid of
> them. It would be better if we took the time to review and address them.
[jira] [Created] (STORM-2964) Deprecate old Spout wait strategy model in 1.x version line
Jungtaek Lim created STORM-2964:
-----------------------------------

             Summary: Deprecate old Spout wait strategy model in 1.x version line
                 Key: STORM-2964
                 URL: https://issues.apache.org/jira/browse/STORM-2964
             Project: Apache Storm
          Issue Type: Task
          Components: storm-client
            Reporter: Jungtaek Lim

This is a follow-up issue for STORM-2958: deprecate the old Spout wait strategy in the 1.x version line, clearly noting that it will be removed in Storm 2.0.0.
[jira] [Commented] (STORM-2958) Use new wait strategies for Spout as well
[ https://issues.apache.org/jira/browse/STORM-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16369735#comment-16369735 ]

Jungtaek Lim commented on STORM-2958:
-------------------------------------

[~roshan_naik]
I filed a new issue (STORM-2964) about deprecating the old spout model in the 1.x version line. I'd really appreciate it if you could take it up. Please let me know if you are busy with other tasks; I'll be happy to take it up.

> Use new wait strategies for Spout as well
> -----------------------------------------
>
>                 Key: STORM-2958
>                 URL: https://issues.apache.org/jira/browse/STORM-2958
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-client
>    Affects Versions: 2.0.0
>            Reporter: Roshan Naik
>            Assignee: Roshan Naik
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> STORM-2306 introduced a new configurable wait strategy system for these
> situations:
> * BackPressure Wait (used by spout & bolt)
> * No incoming data (used by bolt)
> There is another wait situation in the spout: when no emits are generated
> in a nextTuple() call, or when max.spout.pending has been reached. This
> Jira is to transition the spout wait strategy from the old model to the new
> model. Thereby we have a uniform model for dealing with wait strategies.
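To make the idea of a pluggable wait strategy for an idle spout concrete, here is a rough sketch. It is not Storm's interface or implementation: `SpoutWaitSketch`, `WaitStrategy`, `Progressive` and `Stage` are hypothetical names, and the staged back-off (spin, then short parks, then longer parks) is just one plausible policy chosen for illustration. The loop in `main` stands in for the situation described in the issue, where `nextTuple()` produces no emits.

```java
import java.util.concurrent.TimeUnit;
import java.util.concurrent.locks.LockSupport;

/** Hypothetical sketch of a pluggable wait strategy; none of these types come from Storm. */
final class SpoutWaitSketch {

    enum Stage { SPIN, PARK, SLEEP }

    /** Called each time the caller had nothing to do; returns the updated idle counter. */
    interface WaitStrategy {
        int idle(int idleCounter);
    }

    /** Backs off in stages as the idle counter grows. */
    static final class Progressive implements WaitStrategy {
        private final int spinLimit;
        private final int parkLimit;
        private final long sleepMillis;

        Progressive(int spinLimit, int parkLimit, long sleepMillis) {
            this.spinLimit = spinLimit;
            this.parkLimit = parkLimit;
            this.sleepMillis = sleepMillis;
        }

        /** Which stage applies for a given number of consecutive idle calls. */
        Stage stageFor(int idleCounter) {
            if (idleCounter < spinLimit) {
                return Stage.SPIN;
            }
            if (idleCounter < spinLimit + parkLimit) {
                return Stage.PARK;
            }
            return Stage.SLEEP;
        }

        @Override
        public int idle(int idleCounter) {
            switch (stageFor(idleCounter)) {
                case SPIN:
                    break; // return immediately so the caller retries at once
                case PARK:
                    LockSupport.parkNanos(1L); // shortest possible pause
                    break;
                default:
                    LockSupport.parkNanos(TimeUnit.MILLISECONDS.toNanos(sleepMillis));
                    break;
            }
            return idleCounter + 1;
        }
    }

    public static void main(String[] args) {
        WaitStrategy strategy = new Progressive(3, 3, 1L);
        int idleCounter = 0;
        for (int i = 0; i < 10; i++) {
            boolean emitted = (i == 5); // pretend nextTuple() emitted only on this iteration
            // Reset the counter after useful work; otherwise back off a little further.
            idleCounter = emitted ? 0 : strategy.idle(idleCounter);
        }
        System.out.println("idle counter after loop: " + idleCounter);
    }
}
```

Passing the counter in and out keeps the strategy object stateless, so the same instance could serve several wait situations; whether Storm does this is not stated in the issue.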
[jira] [Created] (STORM-2962) KeyValue State Resharding
Jungtaek Lim created STORM-2962:
-----------------------------------

             Summary: KeyValue State Resharding
                 Key: STORM-2962
                 URL: https://issues.apache.org/jira/browse/STORM-2962
             Project: Apache Storm
          Issue Type: Improvement
          Components: storm-client
            Reporter: Jungtaek Lim

Storm's KeyValueState uses a namespace that is typically composed of component name + task id, which means each key-value pair is bound to a task. To allow rebalancing a topology with a different parallelism for a stateful component, we should support resharding of the state.

I have further questions about the current State implementation, but those should be filed as separate issues after verification.
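The binding between keys and tasks described in STORM-2962 can be illustrated with a small sketch. Everything here is an assumption made for illustration: `ReshardingSketch` is a hypothetical class, the namespace format `component-taskIndex` is invented (the issue only says "component name + task id"), and routing a key by hash modulo parallelism is a simplification rather than Storm's actual grouping code. The sketch shows which keys would end up under a different namespace after a parallelism change, which is the state that resharding would have to move.

```java
import java.util.Arrays;
import java.util.Collection;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical illustration of why task-bound state needs resharding; not Storm code. */
final class ReshardingSketch {

    /** Assumed namespace format; the issue only says "component name + task id". */
    static String namespace(String component, int taskIndex) {
        return component + "-" + taskIndex;
    }

    /** Assumed routing: non-negative hash of the key modulo the parallelism. */
    static int taskFor(String key, int parallelism) {
        return Math.floorMod(key.hashCode(), parallelism);
    }

    /** Maps each key whose namespace changes to a "from -> to" description. */
    static Map<String, String> moves(String component, Collection<String> keys,
                                     int oldParallelism, int newParallelism) {
        Map<String, String> result = new TreeMap<>();
        for (String key : keys) {
            String from = namespace(component, taskFor(key, oldParallelism));
            String to = namespace(component, taskFor(key, newParallelism));
            if (!from.equals(to)) {
                result.put(key, from + " -> " + to);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> keys = Arrays.asList("user-1", "user-2", "user-3", "user-4");
        // Rebalance a stateful component from 2 tasks to 3 tasks.
        moves("counter", keys, 2, 3)
            .forEach((key, move) -> System.out.println(key + ": " + move));
    }
}
```

Any key reported by `moves` has state stored under the old namespace that the new owning task would not find without a resharding step.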
[jira] [Resolved] (STORM-2958) Use new wait strategies for Spout as well
[ https://issues.apache.org/jira/browse/STORM-2958?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved STORM-2958.
---------------------------------
    Resolution: Fixed

Thanks [~roshan_naik], I merged into master.

> Use new wait strategies for Spout as well
> -----------------------------------------
>
>                 Key: STORM-2958
>                 URL: https://issues.apache.org/jira/browse/STORM-2958
>             Project: Apache Storm
>          Issue Type: Improvement
>          Components: storm-client
>    Affects Versions: 2.0.0
>            Reporter: Roshan Naik
>            Assignee: Roshan Naik
>            Priority: Major
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 40m
>  Remaining Estimate: 0h
>
> STORM-2306 introduced a new configurable wait strategy system for these
> situations
> * BackPressure Wait (used by spout & bolt)
> * No incoming data (used by bolt)
> There is another wait situation in the spout when there are no emits
> generated in a nextTuple() or if max.spout.pending has been reached. This
> Jira is to transition the spout wait strategy from the old model to the new
> model. Thereby we have a uniform model for dealing with wait strategies.
[jira] [Created] (STORM-2960) Better to stress importance of setting up proper OS account for Storm processes
Jungtaek Lim created STORM-2960:
-----------------------------------

             Summary: Better to stress importance of setting up proper OS account for Storm processes
                 Key: STORM-2960
                 URL: https://issues.apache.org/jira/browse/STORM-2960
             Project: Apache Storm
          Issue Type: Documentation
          Components: documentation
            Reporter: Jungtaek Lim
            Assignee: Jungtaek Lim

We have SECURITY.md and also a "Firewall/OS level Security" section, but the document doesn't explicitly state that Storm processes should run under properly restricted OS account(s). We may also want to note that workers, which can execute arbitrary code, run under the Supervisor's OS account by default.
[jira] [Updated] (STORM-2960) Better to stress importance of setting up proper OS account for Storm processes
[ https://issues.apache.org/jira/browse/STORM-2960?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated STORM-2960:
----------------------------------
    Labels: pull-request-available  (was: )

> Better to stress importance of setting up proper OS account for Storm processes
> -------------------------------------------------------------------------------
>
>                 Key: STORM-2960
>                 URL: https://issues.apache.org/jira/browse/STORM-2960
>             Project: Apache Storm
>          Issue Type: Documentation
>          Components: documentation
>            Reporter: Jungtaek Lim
>            Assignee: Jungtaek Lim
>            Priority: Major
>              Labels: pull-request-available
>
> We have SECURITY.md and also a "Firewall/OS level Security" section, but the
> document doesn't explicitly state that Storm processes should run under
> properly restricted OS account(s). We may also want to note that workers,
> which can execute arbitrary code, run under the Supervisor's OS account by
> default.
[jira] [Resolved] (STORM-2939) Create interface for processing worker metrics
[ https://issues.apache.org/jira/browse/STORM-2939?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim resolved STORM-2939.
---------------------------------
    Resolution: Fixed

Marking as resolved, as [~agresch] stated that the work for this issue is finished.

> Create interface for processing worker metrics
> ----------------------------------------------
>
>                 Key: STORM-2939
>                 URL: https://issues.apache.org/jira/browse/STORM-2939
>             Project: Apache Storm
>          Issue Type: Improvement
>    Affects Versions: 2.0.0
>            Reporter: Aaron Gresch
>            Assignee: Aaron Gresch
>            Priority: Minor
>              Labels: pull-request-available
>             Fix For: 2.0.0
>
>          Time Spent: 1h 40m
>  Remaining Estimate: 0h
>
> In Container.java, we send worker metrics to Nimbus to store to RocksDB.
> Other implementations (HBase, etc.) may want to process them differently.
[jira] [Updated] (STORM-2961) Refactoring duplicate code in Topology Builder classes
[ https://issues.apache.org/jira/browse/STORM-2961?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated STORM-2961:
----------------------------------
    Labels: pull-request-available  (was: )

> Refactoring duplicate code in Topology Builder classes
> ------------------------------------------------------
>
>                 Key: STORM-2961
>                 URL: https://issues.apache.org/jira/browse/STORM-2961
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: Kishor Patil
>            Assignee: Kishor Patil
>            Priority: Minor
>              Labels: pull-request-available
>
> Most subclasses of the {{BaseConfigurationDeclarer}} class duplicate the code
> for {{addResource}} and {{addResources}}.
[jira] [Updated] (STORM-2954) Add HBase metricstore implementation
[ https://issues.apache.org/jira/browse/STORM-2954?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

ASF GitHub Bot updated STORM-2954:
----------------------------------
    Labels: pull-request-available  (was: )

> Add HBase metricstore implementation
> ------------------------------------
>
>                 Key: STORM-2954
>                 URL: https://issues.apache.org/jira/browse/STORM-2954
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: Aaron Gresch
>            Assignee: Aaron Gresch
>            Priority: Major
>              Labels: pull-request-available
>
> In addition to RocksDB, we would like an HBase implementation of a
> MetricStore class.
[jira] [Closed] (STORM-2878) Supervisor collapse continuously when there is a expired assignment for overdue storm
[ https://issues.apache.org/jira/browse/STORM-2878?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jungtaek Lim closed STORM-2878.
-------------------------------
    Resolution: Duplicate
      Assignee: Yuzhao Chen

> Supervisor collapse continuously when there is a expired assignment for overdue storm
> -------------------------------------------------------------------------------------
>
>                 Key: STORM-2878
>                 URL: https://issues.apache.org/jira/browse/STORM-2878
>             Project: Apache Storm
>          Issue Type: Bug
>          Components: storm-core, storm-server
>    Affects Versions: 2.0.0, 1.x
>            Reporter: Yuzhao Chen
>            Assignee: Yuzhao Chen
>            Priority: Critical
>              Labels: patch
>
> Currently, when a topology is reassigned or killed in a cluster, the supervisor
> deletes 4 files for the overdue storm:
> - storm-code
> - storm-ser
> - storm-jar
> - LocalAssignment
>
> Slot.java
>     static DynamicState cleanupCurrentContainer(DynamicState dynamicState, StaticState staticState, MachineState nextState) throws Exception {
>         assert(dynamicState.container != null);
>         assert(dynamicState.currentAssignment != null);
>         assert(dynamicState.container.areAllProcessesDead());
>
>         dynamicState.container.cleanUp();
>         staticState.localizer.releaseSlotFor(dynamicState.currentAssignment, staticState.port);
>         DynamicState ret = dynamicState.withCurrentAssignment(null, null);
>         if (nextState != null) {
>             ret = ret.withState(nextState);
>         }
>         return ret;
>     }
>
> But this is not done as a transaction: if an exception occurs while deleting
> storm-code/ser/jar, an overdue local assignment is left on disk.
> Then, when the supervisor restarts after that exception, the slots are
> initialized and the containers are recovered from the LocalAssignments. The
> blob store tries to fetch the files from Nimbus/Master but gets a
> KeyNotFoundException, and the supervisor collapses again.
> This happens continuously, and the supervisor never recovers until we clean
> up all the local assignments manually.
>
> This is the stack:
> 2017-12-27 14:15:04.434 o.a.s.l.AsyncLocalizer [INFO] Cleaning up unused topologies in /opt/meituan/storm/data/supervisor/stormdist
> 2017-12-27 14:15:04.434 o.a.s.d.s.AdvancedFSOps [INFO] Deleting path /opt/meituan/storm/data/supervisor/stormdist/app_dpsr_realtime_shop_vane_allcates-14-1513685785
> 2017-12-27 14:15:04.445 o.a.s.d.s.Slot [INFO] STATE EMPTY msInState: 109 -> WAITING_FOR_BASIC_LOCALIZATION msInState: 1
> 2017-12-27 14:15:04.471 o.a.s.d.s.Supervisor [INFO] Starting supervisor with id 255d3fed-f3ee-4c7e-8a08-b693c9a6a072 at host gq-data-rt48.gq.sankuai.com.
> 2017-12-27 14:15:04.502 o.a.s.u.Utils [ERROR] An exception happened while downloading /opt/meituan/storm/data/supervisor/tmp/ca4f8174-59be-40a4-b431-dbc8b697f063/stormjar.jar from blob store.
> org.apache.storm.generated.KeyNotFoundException: null
>     at org.apache.storm.generated.Nimbus$beginBlobDownload_result$beginBlobDownload_resultStandardScheme.read(Nimbus.java:26656) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.generated.Nimbus$beginBlobDownload_result$beginBlobDownload_resultStandardScheme.read(Nimbus.java:26624) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.generated.Nimbus$beginBlobDownload_result.read(Nimbus.java:26555) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.thrift.TServiceClient.receiveBase(TServiceClient.java:86) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.generated.Nimbus$Client.recv_beginBlobDownload(Nimbus.java:864) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.generated.Nimbus$Client.beginBlobDownload(Nimbus.java:851) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.blobstore.NimbusBlobStore.getBlob(NimbusBlobStore.java:357) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.utils.Utils.downloadResourcesAsSupervisorAttempt(Utils.java:598) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.utils.Utils.downloadResourcesAsSupervisorImpl(Utils.java:582) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.utils.Utils.downloadResourcesAsSupervisor(Utils.java:574) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.downloadBaseBlobs(AsyncLocalizer.java:123) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:148) ~[storm-core-1.1.2-mt001.jar:?]
>     at org.apache.storm.localizer.AsyncLocalizer$DownloadBaseBlobsDistributed.call(AsyncLocalizer.java:101) ~[storm-core-1.1.2-mt001.jar:?]
>     at