[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209799#comment-15209799 ]

ASF GitHub Bot commented on STORM-1632:
---
Github user arunmahadevan commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200677018

@harshach I had removed the sleep in the latest run to match what @roshannaik was evaluating.

> Disable event logging by default
>
> Key: STORM-1632
> URL: https://issues.apache.org/jira/browse/STORM-1632
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Priority: Blocker
> Fix For: 1.0.0
>
> EventLogging has a performance penalty. For a simple speed-of-light topology
> with a single instance of a spout and a bolt, disabling event logging
> delivers a 7% to 9% perf improvement (with acker count = 1).
> Event logging can be enabled when there is a need to debug, but should be turned off
> by default.
> **Update:** with acker=0 the observed impact was much higher... **25%**
> faster when event loggers = 0

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
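The setting under discussion is the number of event logger executors per topology. As a hedged illustration (the key name is Storm 1.x's `topology.eventlogger.executors`; the value shown reflects this PR's proposal, not necessarily what shipped), turning event logging off or back on looks like:

```yaml
# storm.yaml, or passed as per-topology config:
# number of event logger tasks for the topology.
# 0 disables the debug event stream entirely;
# set to 1 (or more) only when event sampling is actually needed.
topology.eventlogger.executors: 0
```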
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209798#comment-15209798 ]

ASF GitHub Bot commented on STORM-1632:
---
Github user harshach commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200674706

@arunmahadevan I saw the earlier comment. Was that topology run with the 10ms sleep in the spout? @roshannaik do you also have some numbers like Arun posted, events per sec?
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209789#comment-15209789 ]

ASF GitHub Bot commented on STORM-1632:
---
Github user arunmahadevan commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200672407

@harshach
> @arunmahadevan @HeartSaVioR Looking at the perf analysis from @roshannaik it looks like there is enough evidence to consider this a serious performance issue.

Can you also take a look at the results I observed, where the throughput difference is negligible? I am for disabling it if there's a consensus on the results and that it really affects performance.
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209778#comment-15209778 ]

ASF GitHub Bot commented on STORM-1632:
---
Github user harshach commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200670850

@arunmahadevan @ptgoetz we are not worried about a 0.4 to 0.5% effect on throughput; in most cases no one is going to notice that. Let's wait for @roshannaik's topology, and you can run it and see: if it's still 0.4% then we can ignore this. @ptgoetz can you add details on how you ran your benchmark?
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209771#comment-15209771 ]

ASF GitHub Bot commented on STORM-1632:
---
Github user arunmahadevan commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200669173

@roshannaik did you try passing the map values as arguments and taking the measurements? Based on your earlier results it appeared that the PersistentMap lookup was causing the hit (I still think it could very well be due to profiler overhead). Here are the changes I made: https://github.com/arunmahadevan/storm/commit/7eae5ec9f63cee82c49980a3bedf5f0dfe4e3a8d . I would like to see how it affects your profiling.

I don't think a 0.4 to 0.5% increase in throughput should be a reason to completely disable a feature, and spouts that emit tuples in a tight loop are not a very common use case. I am for documenting this feature so that users can adjust the config values based on their needs, rather than turning it off.
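The change being debated above replaces a per-tuple config-map lookup with a value read once and passed along. The following is a hypothetical, self-contained sketch of that shape (the key name and methods are illustrative, not Storm's actual hot path), useful only to make the two variants concrete:

```java
import java.util.Map;

// Contrast of the two patterns discussed in the thread: a Map lookup on every
// tuple vs. the flag hoisted out and passed as an argument (the shape of the
// linked commit). Both compute the same thing; only the per-iteration work differs.
class LookupCost {
    // Per-tuple map lookup, roughly the pattern the profile pointed at.
    static long withLookup(Map<String, Boolean> conf, int n) {
        long hits = 0;
        for (int i = 0; i < n; i++) {
            if (Boolean.TRUE.equals(conf.get("eventlogger.enabled"))) hits++;
        }
        return hits;
    }

    // The value is read once by the caller and passed in, so the loop body
    // touches no map at all.
    static long withArgument(boolean enabled, int n) {
        long hits = 0;
        for (int i = 0; i < n; i++) {
            if (enabled) hits++;
        }
        return hits;
    }
}
```

Whether the lookup is measurable outside a profiler is exactly the open question in the thread; real numbers depend on the JVM and the surrounding code.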
[jira] [Commented] (STORM-1654) HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
[ https://issues.apache.org/jira/browse/STORM-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209767#comment-15209767 ]

ASF GitHub Bot commented on STORM-1654:
---
Github user harshach commented on the pull request: https://github.com/apache/storm/pull/1251#issuecomment-200668555

+1

> HBaseBolt creates tick tuples with no interval when we don't set
> flushIntervalSecs
>
> Key: STORM-1654
> URL: https://issues.apache.org/jira/browse/STORM-1654
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-hbase
> Affects Versions: 1.0.0, 2.0.0
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
> Priority: Critical
>
> As STORM-1219 noted, we can't get the topology's message timeout seconds
> inside getComponentConfiguration(), so the logic that applies a flush interval
> of half the message timeout has no effect.
> Unless we set flushIntervalSecs explicitly, the tick tuple interval is set to
> 0 seconds, i.e. no interval.
> Other bolts were fixed in STORM-1219, but HBaseBolt seems to have been missed.
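The bug above comes down to a bolt returning a tick-tuple frequency of 0 from getComponentConfiguration() when flushIntervalSecs is unset. A minimal, self-contained sketch of the fallback the fix introduces (the config key is Storm's real `topology.tick.tuple.freq.secs`; the class and method are illustrative, not HBaseBolt's actual code):

```java
import java.util.HashMap;
import java.util.Map;

// Sketch of the fix's shape: a bolt requests periodic tick tuples by
// returning this key from getComponentConfiguration(). If the user never set
// flushIntervalSecs, fall back to a non-zero default instead of emitting 0,
// which Storm interprets as "no ticks at all".
class FlushIntervalSketch {
    static final String TICK_FREQ_KEY = "topology.tick.tuple.freq.secs";
    static final int DEFAULT_FLUSH_INTERVAL_SECS = 1; // the 1s default the PR proposes

    // flushIntervalSecs as configured by the user; 0 means "unset".
    static Map<String, Object> componentConfiguration(int flushIntervalSecs) {
        int effective = flushIntervalSecs > 0 ? flushIntervalSecs : DEFAULT_FLUSH_INTERVAL_SECS;
        Map<String, Object> conf = new HashMap<>();
        conf.put(TICK_FREQ_KEY, effective);
        return conf;
    }
}
```

The fallback is hardcoded rather than derived from the message timeout precisely because, per STORM-1219, the timeout is not visible at this point in the component lifecycle.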
[jira] [Commented] (STORM-1654) HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
[ https://issues.apache.org/jira/browse/STORM-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209766#comment-15209766 ]

ASF GitHub Bot commented on STORM-1654:
---
Github user harshach commented on the pull request: https://github.com/apache/storm/pull/1252#issuecomment-200668486

+1
[jira] [Commented] (STORM-1268) port backtype.storm.daemon.builtin-metrics to java
[ https://issues.apache.org/jira/browse/STORM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209764#comment-15209764 ]

ASF GitHub Bot commented on STORM-1268:
---
Github user unsleepy22 commented on the pull request: https://github.com/apache/storm/pull/1218#issuecomment-200667388

@abhishekagarwal87 done~

> port backtype.storm.daemon.builtin-metrics to java
> --
>
> Key: STORM-1268
> URL: https://issues.apache.org/jira/browse/STORM-1268
> Project: Apache Storm
> Issue Type: New Feature
> Components: storm-core
> Reporter: Robert Joseph Evans
> Assignee: Cody
> Labels: java-migration, jstorm-merger
>
> Built-in metrics
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209752#comment-15209752 ]

ASF GitHub Bot commented on STORM-1632:
---
Github user HeartSaVioR commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200662946

OK, I'm assuming this is a valid performance hit, whether it is small or huge. Since the choice is six of one, half a dozen of the other, I think whether this is on or off by default should be decided by whether its use cases are valid for production. Since STORM-954 was created by @harshach, I guess he considered the use cases for this. If we expect most use cases to be in development, sure, I agree we can just disable it by default and document how to enable it.
[jira] [Commented] (STORM-1268) port backtype.storm.daemon.builtin-metrics to java
[ https://issues.apache.org/jira/browse/STORM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209730#comment-15209730 ]

ASF GitHub Bot commented on STORM-1268:
---
Github user abhishekagarwal87 commented on the pull request: https://github.com/apache/storm/pull/1218#issuecomment-200660046

@unsleepy22 can you squash the commits?
[jira] [Assigned] (STORM-1537) Upgrade to Kryo 3
[ https://issues.apache.org/jira/browse/STORM-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Abhishek Agarwal reassigned STORM-1537:
---
Assignee: Abhishek Agarwal (was: P. Taylor Goetz)

> Upgrade to Kryo 3
> -
>
> Key: STORM-1537
> URL: https://issues.apache.org/jira/browse/STORM-1537
> Project: Apache Storm
> Issue Type: Improvement
> Affects Versions: 1.0.0
> Reporter: Oscar Boykin
> Assignee: Abhishek Agarwal
> Fix For: 1.0.0
>
> In storm, Kryo (2.21) is used for serialization:
> https://github.com/apache/storm/blob/02a44c7fc1b7b3a1571b326fde7bcae13e1b5c8d/pom.xml#L231
> The user must use the same version storm does, or there will be a java class
> error at runtime.
> Storm depends on a quasi-abandoned library, carbonite:
> https://github.com/apache/storm/blob/02a44c7fc1b7b3a1571b326fde7bcae13e1b5c8d/pom.xml#L210
> which depends on Kryo 2.21 and Twitter chill 0.3.6:
> https://github.com/sritchie/carbonite/blob/master/project.clj#L8
> Chill, currently on 0.7.3, would like to upgrade to Kryo 3.0.3:
> https://github.com/twitter/chill/pull/245
> because Spark, also depending on chill, would like to upgrade for performance
> improvements and bugfixes:
> https://issues.apache.org/jira/browse/SPARK-11416
> Unfortunately, summingbird depends on storm:
> https://github.com/twitter/summingbird/blob/develop/build.sbt#L34
> so, if chill is upgraded and that gets on the classpath, summingbird will
> break at runtime.
> I propose:
> 1) copy the carbonite code into storm. It is likely the only consumer.
> 2) bump the storm kryo dependency after chill upgrades: recall that storm
> actually depends on chill-java, a dependency that could possibly be removed
> after you pull carbonite in.
> 3) once a new version of storm is published, summingbird (and scalding) can
> upgrade to the latest chill.
> Also, I hope for:
> 4) we as a JVM community get better about classpath isolation and versioning.
> Diamonds like this in one big classpath make large codebases very fragile.
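The lock-step versioning described above boils down to a single pin in storm's pom.xml. As a hypothetical illustration only (coordinates follow Kryo 2.x's Maven layout; the linked pom is the authority for what storm actually declared), the shape of the dependency that every topology must agree with is:

```xml
<!-- Illustrative sketch of the Kryo pin discussed above: storm and every
     topology jar on the same worker classpath must resolve to the same Kryo,
     or serialization fails at runtime with class/format errors. -->
<dependency>
  <groupId>com.esotericsoftware.kryo</groupId>
  <artifactId>kryo</artifactId>
  <version>2.21</version>
</dependency>
```

This is why the proposal above sequences the changes: carbonite is absorbed first, then the pin can move to Kryo 3 in step with chill's own upgrade.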
[GitHub] storm pull request: STORM-676 Trident implementation for sliding a...
Github user satishd commented on a diff in the pull request: https://github.com/apache/storm/pull/1072#discussion_r57272190

--- Diff: external/storm-hbase/src/main/java/org/apache/storm/hbase/trident/windowing/HBaseWindowsStore.java ---
@@ -0,0 +1,275 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ *     http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied. See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.storm.hbase.trident.windowing;
+
+import com.esotericsoftware.kryo.Kryo;
+import com.esotericsoftware.kryo.io.Input;
+import com.esotericsoftware.kryo.io.Output;
+import org.apache.hadoop.conf.Configuration;
+import org.apache.hadoop.hbase.client.Delete;
+import org.apache.hadoop.hbase.client.Get;
+import org.apache.hadoop.hbase.client.HTable;
+import org.apache.hadoop.hbase.client.Put;
+import org.apache.hadoop.hbase.client.Result;
+import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException;
+import org.apache.hadoop.hbase.client.Scan;
+import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter;
+import org.apache.storm.trident.windowing.WindowsStore;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.ByteArrayOutputStream;
+import java.io.IOException;
+import java.io.InterruptedIOException;
+import java.io.UnsupportedEncodingException;
+import java.nio.ByteBuffer;
+import java.util.ArrayList;
+import java.util.Collection;
+import java.util.Iterator;
+import java.util.List;
+import java.util.Queue;
+import java.util.concurrent.ConcurrentLinkedQueue;
+
+/**
+ * This class stores entries into hbase instance of the given configuration.
+ */
+public class HBaseWindowsStore implements WindowsStore {
+    private static final Logger LOG = LoggerFactory.getLogger(HBaseWindowsStore.class);
+    public static final String UTF_8 = "utf-8";
+
+    private final ThreadLocal<HTable> threadLocalHtable;
+    private Queue<HTable> htables = new ConcurrentLinkedQueue<>();
+    private final byte[] family;
+    private final byte[] qualifier;
+
+    public HBaseWindowsStore(final Configuration config, final String tableName, byte[] family, byte[] qualifier) {
+        this.family = family;
+        this.qualifier = qualifier;
+
+        threadLocalHtable = new ThreadLocal<HTable>() {
+            @Override
+            protected HTable initialValue() {
+                try {
+                    HTable hTable = new HTable(config, tableName);
+                    htables.add(hTable);
+                    return hTable;
+                } catch (IOException e) {
+                    throw new RuntimeException(e);
+                }
+            }
+        };
+    }
+
+    private HTable htable() {
+        return threadLocalHtable.get();
+    }
+
+    private byte[] effectiveKey(String key) {
+        try {
+            return key.getBytes(UTF_8);
+        } catch (UnsupportedEncodingException e) {
+            throw new RuntimeException(e);
+        }
+    }
+
+    @Override
+    public Object get(String key) {
+        WindowsStore.Entry.nonNullCheckForKey(key);
+
+        byte[] effectiveKey = effectiveKey(key);
+        Get get = new Get(effectiveKey);
+        Result result = null;
+        try {
+            result = htable().get(get);
+        } catch (IOException e) {
+            throw new RuntimeException(e);
+        }
+
+        if (result.isEmpty()) {
+            return null;
+        }
+
+        Kryo kryo = new Kryo();
+        Input input = new Input(result.getValue(family, qualifier));
+        Object resultObject = kryo.readClassAndObject(input);
+        return resultObject;
+    }
+
+    @Override
+    public Iterable<Object> get(List<String> keys) {
+        List<Get> gets = new ArrayList<>();
+        for (String key : keys) {
+            WindowsStore.Entry.nonNullCheckForKey(key);
+
+            byte[]
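Note that get() in the diff above constructs a new Kryo on every read; STORM-1649 (below) tracks optimizing that. One common remedy, sketched here as an assumption rather than as what the actual patch does, is to cache one instance per thread, mirroring the ThreadLocal pattern the class already uses for HTable (Kryo is not thread-safe but is reusable within a thread). The sketch is generic so it stays self-contained:

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.Supplier;

// Generic per-thread cache of an expensive-to-construct object (e.g. a Kryo
// instance). Each thread builds the object once via the factory and then
// reuses it, instead of constructing a fresh one per call.
class PerThreadCache<T> {
    private final AtomicInteger creations = new AtomicInteger();
    private final ThreadLocal<T> cache;

    PerThreadCache(Supplier<T> factory) {
        this.cache = ThreadLocal.withInitial(() -> {
            creations.incrementAndGet(); // count expensive constructions
            return factory.get();
        });
    }

    T get() { return cache.get(); }          // same instance within a thread
    int creations() { return creations.get(); } // how many were ever built
}
```

Usage would be e.g. `PerThreadCache<Kryo> kryos = new PerThreadCache<>(Kryo::new);` with `kryos.get()` in the read path; repeated calls on one thread construct the object exactly once.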
[jira] [Commented] (STORM-676) Storm Trident support for sliding/tumbling windows
[ https://issues.apache.org/jira/browse/STORM-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209695#comment-15209695 ]

ASF GitHub Bot commented on STORM-676:
--
Github user satishd commented on a diff in the pull request: https://github.com/apache/storm/pull/1072#discussion_r57272190
[jira] [Updated] (STORM-1649) Optimize Kryo instaces creation in HBaseWindowsStore
[ https://issues.apache.org/jira/browse/STORM-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Satish Duggana updated STORM-1649:
--
Fix Version/s: 1.0.0

> Optimize Kryo instaces creation in HBaseWindowsStore
>
> Key: STORM-1649
> URL: https://issues.apache.org/jira/browse/STORM-1649
> Project: Apache Storm
> Issue Type: Improvement
> Components: trident
> Reporter: Satish Duggana
> Assignee: Satish Duggana
> Fix For: 1.0.0, 2.0.0
[jira] [Updated] (STORM-1649) Optimize Kryo instaces creation in HBaseWindowsStore
[ https://issues.apache.org/jira/browse/STORM-1649?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Satish Duggana updated STORM-1649:
--
Summary: Optimize Kryo instaces creation in HBaseWindowsStore (was: Use DefaultStateSerializer for serialize/deserialize tuple/trigger instances in HBaseWindowsStore)

> Optimize Kryo instaces creation in HBaseWindowsStore
>
> Key: STORM-1649
> URL: https://issues.apache.org/jira/browse/STORM-1649
> Project: Apache Storm
> Issue Type: Improvement
> Components: trident
> Reporter: Satish Duggana
> Assignee: Satish Duggana
> Fix For: 2.0.0
[jira] [Commented] (STORM-1654) HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
[ https://issues.apache.org/jira/browse/STORM-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209658#comment-15209658 ]

ASF GitHub Bot commented on STORM-1654:
---
GitHub user HeartSaVioR opened a pull request: https://github.com/apache/storm/pull/1252

STORM-1654 (1.x) HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs

Same patch as #1251, based on 1.x-branch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/storm STORM-1654-1.x-branch

Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/storm/pull/1252.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1252

commit cdbde7e2d15ae5f9a75e4ec5c0ad94bfc9bca1d8
Author: Jungtaek Lim
Date: 2016-03-24T02:57:28Z

    STORM-1654 HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
    * set a 'default' flush interval (1s) on HBaseBolt
    * since taking half of the message timeout secs doesn't work
[jira] [Commented] (STORM-1654) HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
[ https://issues.apache.org/jira/browse/STORM-1654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209653#comment-15209653 ]

ASF GitHub Bot commented on STORM-1654:
---------------------------------------

GitHub user HeartSaVioR opened a pull request:

    https://github.com/apache/storm/pull/1251

    STORM-1654 HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs

    Please also refer to #893 to see why this change is necessary. Set a 'default' flush interval on HBaseBolt, since taking half of the message timeout secs doesn't work, as #893 showed. Since we have to hardcode a default flush interval for now, I think we can make it 1s: I guess that's enough for HBase, and no one ever sets the tuple timeout to 1 second. (If that were the case, all bolts that batch using tick tuples would also fail.) I'll craft a separate PR to be applied to 1.x-branch.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/HeartSaVioR/storm STORM-1654

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/storm/pull/1251.patch

To close this pull request, make a commit to your master/trunk branch with (at least) the following in the commit message:

    This closes #1251

commit c6ed8d0c419dba8bdb479a432d5e7855f78b012b
Author: Jungtaek Lim
Date: 2016-03-24T02:57:28Z

    STORM-1654 HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
    * set a 'default' flush interval (1s) on HBaseBolt, since taking half of the message timeout secs doesn't work

> HBaseBolt creates tick tuples with no interval when we don't set flushIntervalSecs
>
> Key: STORM-1654
> URL: https://issues.apache.org/jira/browse/STORM-1654
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-hbase
> Affects Versions: 1.0.0, 2.0.0
> Reporter: Jungtaek Lim
> Assignee: Jungtaek Lim
> Priority: Critical
>
> As STORM-1219 addressed, we can't get the topology's message timeout seconds at getComponentConfiguration(), so the logic that applies a flush interval of half the message timeout has no effect.
> Unless we set flushIntervalSecs explicitly, the tick tuple interval is set to 0 seconds, i.e. no interval.
> Other bolts were fixed as part of STORM-1219, but HBaseBolt seems to have been missed.
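The mechanism behind both PRs is the tick-tuple frequency a bolt returns from `getComponentConfiguration()`: if the interval resolves to 0, Storm delivers no tick tuples and buffered writes are never flushed. A minimal sketch of the proposed fallback logic, assuming a hypothetical helper class (only the `topology.tick.tuple.freq.secs` key is Storm's; the class and method names are illustrative, not HBaseBolt's actual code):

```java
import java.util.HashMap;
import java.util.Map;

public class TickConfigSketch {
    // Same string as Storm's Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS.
    static final String TICK_KEY = "topology.tick.tuple.freq.secs";
    // Hardcoded default proposed in the PR: flush every 1s when the user sets nothing.
    static final int DEFAULT_FLUSH_INTERVAL_SECS = 1;

    // Builds the component conf a bolt would return from getComponentConfiguration().
    // A null flushIntervalSecs previously fell through to 0 (no tick tuples at all).
    public static Map<String, Object> componentConf(Integer flushIntervalSecs) {
        Map<String, Object> conf = new HashMap<>();
        int interval = (flushIntervalSecs != null) ? flushIntervalSecs
                                                   : DEFAULT_FLUSH_INTERVAL_SECS;
        conf.put(TICK_KEY, interval);
        return conf;
    }
}
```

With this fallback, an unset interval yields a 1-second tick instead of disabling ticks entirely, while an explicit `withFlushIntervalSecs`-style setting still wins.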
[jira] [Commented] (STORM-1268) port backtype.storm.daemon.builtin-metrics to java
[ https://issues.apache.org/jira/browse/STORM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209580#comment-15209580 ]

ASF GitHub Bot commented on STORM-1268:
---------------------------------------

Github user unsleepy22 commented on the pull request:

    https://github.com/apache/storm/pull/1218#issuecomment-200613922

    thanks, addressed

> port backtype.storm.daemon.builtin-metrics to java
>
> Key: STORM-1268
> URL: https://issues.apache.org/jira/browse/STORM-1268
> Project: Apache Storm
> Issue Type: New Feature
> Components: storm-core
> Reporter: Robert Joseph Evans
> Assignee: Cody
> Labels: java-migration, jstorm-merger
>
> Built-in metrics
[jira] [Commented] (STORM-1268) port backtype.storm.daemon.builtin-metrics to java
[ https://issues.apache.org/jira/browse/STORM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209576#comment-15209576 ]

ASF GitHub Bot commented on STORM-1268:
---------------------------------------

Github user unsleepy22 commented on a diff in the pull request:

    https://github.com/apache/storm/pull/1218#discussion_r57266090

--- Diff: storm-core/src/jvm/org/apache/storm/daemon/metrics/BuiltinMetricsUtil.java ---
@@ -0,0 +1,77 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.storm.daemon.metrics;
+
+import java.util.HashMap;
+import java.util.Map;
+import org.apache.storm.Config;
+import org.apache.storm.metric.api.IMetric;
+import org.apache.storm.metric.api.IStatefulObject;
+import org.apache.storm.metric.api.StateMetric;
+import org.apache.storm.stats.BoltExecutorStats;
+import org.apache.storm.stats.SpoutExecutorStats;
+import org.apache.storm.stats.StatsUtil;
+import org.apache.storm.task.TopologyContext;
+
+public class BuiltinMetricsUtil {
+    public static BuiltinMetrics mkData(String type, Object stats) {
+        if (StatsUtil.SPOUT.equals(type)) {
+            return new BuiltinSpoutMetrics((SpoutExecutorStats) stats);
+        }
+        return new BuiltinBoltMetrics((BoltExecutorStats) stats);
+    }
+
+    public static void registerIconnectionServerMetric(Object server, Map stormConf, TopologyContext context) {
+        if (server instanceof IStatefulObject) {
+            registerMetric("__recv-iconnection", new StateMetric((IStatefulObject) server), stormConf, context);
+        }
+    }
+
+    public static void registerIconnectionClientMetrics(final Map nodePort2socket, Map stormConf, TopologyContext context) {
+        IMetric metric = new IMetric() {
+            @Override
+            public Object getValueAndReset() {
+                Map
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209564#comment-15209564 ]

ASF GitHub Bot commented on STORM-1650:
---------------------------------------

Github user roshannaik commented on the pull request:

    https://github.com/apache/storm/pull/1250#issuecomment-200609948

    My gut feeling, based on profiling Storm recently, is that there are much bigger bottlenecks elsewhere that will eclipse any potential improvement this can deliver. I would recommend verifying with a simple benchmark and quantifying the performance improvement in throughput/latency this delivers. Having said that, assuming it is a safe optimization, even if it does not improve perf in the current code base, it should eventually help as other bottlenecks get addressed.

> improve performance by XORShiftRandom
>
> Key: STORM-1650
> URL: https://issues.apache.org/jira/browse/STORM-1650
> Project: Apache Storm
> Issue Type: Improvement
> Reporter: John Fang
> Assignee: John Fang
>
> '''Implement a random number generator based on the XORShift algorithm discovered by George Marsaglia. This RNG is observed to be 4.5 times faster than {@link Random} in benchmarks, at the cost of abandoning thread-safety. So it's recommended to create a new {@link XORShiftRandom} for each thread.'''
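For context on what the (now closed) PR proposed, a Marsaglia-style XORShift generator is only a few lines of code. The sketch below is an illustration of the technique, not the class from the PR; the triple-shift constants (13, 7, 17) are the classic 64-bit parameters from Marsaglia's paper:

```java
// Minimal XORShift64 RNG (Marsaglia). Deliberately not thread-safe:
// as the JIRA notes, create one instance per thread.
public class XORShiftRandom {
    private long state;

    public XORShiftRandom(long seed) {
        // The all-zero state is a fixed point, so map seed 0 to a nonzero constant.
        this.state = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    public long nextLong() {
        long x = state;
        x ^= x << 13;
        x ^= x >>> 7;   // logical shift; arithmetic shift would break the cycle
        x ^= x << 17;
        state = x;
        return x;
    }

    // Reduce to [0, n) for positive n; floorMod keeps the result non-negative.
    public int nextInt(int n) {
        return (int) Math.floorMod(nextLong(), (long) n);
    }
}
```

Uses like Storm's tuple routing only need a cheap, roughly uniform draw per emit, which is exactly the trade-off (speed over thread-safety and statistical quality) the PR was betting on.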
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209495#comment-15209495 ]

ASF GitHub Bot commented on STORM-1650:
---------------------------------------

Github user hustfxj closed the pull request at:

    https://github.com/apache/storm/pull/1250

> improve performance by XORShiftRandom
>
> Key: STORM-1650
> URL: https://issues.apache.org/jira/browse/STORM-1650
> Project: Apache Storm
> Issue Type: Improvement
> Reporter: John Fang
> Assignee: John Fang
>
> '''Implement a random number generator based on the XORShift algorithm discovered by George Marsaglia. This RNG is observed to be 4.5 times faster than {@link Random} in benchmarks, at the cost of abandoning thread-safety. So it's recommended to create a new {@link XORShiftRandom} for each thread.'''
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209475#comment-15209475 ]

ASF GitHub Bot commented on STORM-1632:
---------------------------------------

Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1217#issuecomment-200591500

    We're debating six versus a half dozen. Do we disable it by default and explicitly tell users they have to turn it on for the UI functionality to work? Or do we enable it by default and tell users to disable it per topology to realize a small performance gain? I could go either way, but the latter seems like a better experience for users new to the feature. Also, the minor performance hit is eclipsed by the performance improvements in 1.0, and it can easily be turned off. It just needs to be documented clearly, IMO.

> Disable event logging by default
>
> Key: STORM-1632
> URL: https://issues.apache.org/jira/browse/STORM-1632
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Priority: Blocker
> Fix For: 1.0.0
>
> Event logging has a performance penalty. For a simple speed-of-light topology with a single instance of a spout and a bolt, disabling event logging delivers a 7% to 9% perf improvement (with acker count = 1).
> Event logging can be enabled when there is a need to debug, but turned off by default.
> **Update:** with acker=0 the observed impact was much higher... **25%** faster when event loggers = 0
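The per-topology switch being debated is the number of event logger executors: zero disables the feature (and the UI's tuple-debugging view) for that topology. A sketch of the submit-time configuration, assuming Storm 1.0's `Config.setNumEventLoggers` helper, which is backed by the `topology.eventlogger.executors` setting:

```java
import org.apache.storm.Config;

Config conf = new Config();
// 0 event logger tasks => no event logging overhead for this topology;
// equivalent to setting topology.eventlogger.executors: 0 in the conf.
conf.setNumEventLoggers(0);
// StormSubmitter.submitTopology("my-topology", conf, builder.createTopology());
```

Note the constraint raised later in the thread: this is fixed at submit time, so a topology submitted with zero event loggers cannot have event logging switched on while it runs.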
[jira] [Commented] (STORM-1625) Move storm-sql dependencies out of lib folder
[ https://issues.apache.org/jira/browse/STORM-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209424#comment-15209424 ]

ASF GitHub Bot commented on STORM-1625:
---------------------------------------

Github user haohui commented on the pull request:

    https://github.com/apache/storm/pull/1239#issuecomment-200583518

    +1

> Move storm-sql dependencies out of lib folder
>
> Key: STORM-1625
> URL: https://issues.apache.org/jira/browse/STORM-1625
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core, storm-sql
> Reporter: Abhishek Agarwal
> Assignee: Abhishek Agarwal
>
> We shade guava classes inside storm-core so that clients can work with any version of guava without clashes. However, storm-sql-core has a transitive dependency on guava and thus the guava jar is still getting shipped in the lib folder.
[jira] [Commented] (STORM-1580) Secure hdfs spout failed
[ https://issues.apache.org/jira/browse/STORM-1580?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209389#comment-15209389 ]

Roshan Naik commented on STORM-1580:
------------------------------------

[~ght] fyi.. I am beginning to take a look at this.

> Secure hdfs spout failed
>
> Key: STORM-1580
> URL: https://issues.apache.org/jira/browse/STORM-1580
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-hdfs
> Reporter: guoht
> Labels: security
>
> Some errors occurred when using the secure hdfs spout:
> "Login successful for user t...@example.com using keytab file /home/test/test.keytab
> 2016-02-26 10:33:14 o.a.h.i.Client [WARN] Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2016-02-26 10:33:14 o.a.h.i.Client [WARN] Exception encountered while connecting to the server : javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
> 2016-02-26 10:33:14 o.a.h.i.r.RetryInvocationHandler [INFO] Exception while invoking getFileInfo of class ClientNamenodeProtocolTranslatorPB over hnn025/192.168.137.2:8020 after 1 fail over attempts. Trying to fail over immediately.
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "HDD021/192.168.137.6"; destination host is: "hnn025":8020;"
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209337#comment-15209337 ]

ASF GitHub Bot commented on STORM-1632:
---------------------------------------

Github user HeartSaVioR commented on the pull request:

    https://github.com/apache/storm/pull/1217#issuecomment-200571684

    @roshannaik I'm sorry, I picked the wrong phrase, `micro optimization`, which confused what I meant; `local optimization` seems clearer. Btw, I guess @ptgoetz got my point. Yes, I agree we need a micro-benchmark to clear out variables, but I think it has to be re-evaluated with a normal benchmark to see how it behaves in a relatively normal situation, especially if it touches functionality. If it didn't touch functionality I would say "Awesome work!" even for under 5% of performance gain from a local optimization. (The reason STORM-1526 and STORM-1539 didn't need to be re-evaluated with a normal benchmark is that they didn't affect any functionality.) And I guess this overhead (0.006720889 ms = 6720.889 ns spent in send_to_eventlogger per tuple, as @arunmahadevan posted) is relatively small compared to what Storm has to do to process a tuple - enqueue and dequeue, finding the task id to send to, serde, transfer - where we may find spots to improve. Though I agree it's inside the critical path, so we may want to find an alternative that doesn't touch functionality. If you really want to disable it by default, it would be better to post to the dev mailing list to build consensus first.

> Disable event logging by default
>
> Key: STORM-1632
> URL: https://issues.apache.org/jira/browse/STORM-1632
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Priority: Blocker
> Fix For: 1.0.0
>
> Event logging has a performance penalty. For a simple speed-of-light topology with a single instance of a spout and a bolt, disabling event logging delivers a 7% to 9% perf improvement (with acker count = 1).
> Event logging can be enabled when there is a need to debug, but turned off by default.
> **Update:** with acker=0 the observed impact was much higher... **25%** faster when event loggers = 0
[jira] [Commented] (STORM-1030) Hive Connector Fixes
[ https://issues.apache.org/jira/browse/STORM-1030?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209278#comment-15209278 ]

ASF GitHub Bot commented on STORM-1030:
---------------------------------------

Github user Parth-Brahmbhatt commented on the pull request:

    https://github.com/apache/storm/pull/871#issuecomment-200564783

    This PR has been open for a long time; I am still +1 and will merge it this weekend if no one objects.

> Hive Connector Fixes
>
> Key: STORM-1030
> URL: https://issues.apache.org/jira/browse/STORM-1030
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-hive
> Reporter: Sriharsha Chintalapani
> Assignee: Sriharsha Chintalapani
> Priority: Blocker
> Fix For: 1.0.0
>
> 1. Schedule Hive transaction heartbeats outside of the execute method.
> 2. Fix retiring idleWriters.
> 3. Do not call flush if there is no data added to a txnbatch.
> 4. Catch any exception and abort the transaction.
[jira] [Resolved] (STORM-1653) BYLAWS has disappeared
[ https://issues.apache.org/jira/browse/STORM-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans resolved STORM-1653.
----------------------------------------
Resolution: Invalid

It is still there: http://svn.apache.org/viewvc/storm/site/contribute/BYLAWS.md?view=log
Go to the Community pull-down; Bylaws is listed there.

> BYLAWS has disappeared
>
> Key: STORM-1653
> URL: https://issues.apache.org/jira/browse/STORM-1653
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Kyle Nusbaum
> Assignee: Robert Joseph Evans
>
> When moving asf-site to svn, BYLAWS (and probably some other stuff) disappeared.
> Someone needs to compare the previous docs with what's on SVN now and fill in any missing spots.
[jira] [Assigned] (STORM-1653) BYLAWS has disappeared
[ https://issues.apache.org/jira/browse/STORM-1653?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Robert Joseph Evans reassigned STORM-1653:
------------------------------------------
Assignee: Robert Joseph Evans

> BYLAWS has disappeared
>
> Key: STORM-1653
> URL: https://issues.apache.org/jira/browse/STORM-1653
> Project: Apache Storm
> Issue Type: Bug
> Reporter: Kyle Nusbaum
> Assignee: Robert Joseph Evans
>
> When moving asf-site to svn, BYLAWS (and probably some other stuff) disappeared.
> Someone needs to compare the previous docs with what's on SVN now and fill in any missing spots.
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209253#comment-15209253 ]

ASF GitHub Bot commented on STORM-1632:
---------------------------------------

Github user ptgoetz commented on the pull request:

    https://github.com/apache/storm/pull/1217#issuecomment-200559629

    I did a quick benchmark on a real cluster (albeit a VMware cluster) and found that there was a throughput hit, but it was small -- about 0.4%. I'm okay with leaving the defaults as is and documenting how to disable the feature. If there's a better solution, I'm okay with waiting for a post-1.0 release.

> Disable event logging by default
>
> Key: STORM-1632
> URL: https://issues.apache.org/jira/browse/STORM-1632
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Priority: Blocker
> Fix For: 1.0.0
>
> Event logging has a performance penalty. For a simple speed-of-light topology with a single instance of a spout and a bolt, disabling event logging delivers a 7% to 9% perf improvement (with acker count = 1).
> Event logging can be enabled when there is a need to debug, but turned off by default.
> **Update:** with acker=0 the observed impact was much higher... **25%** faster when event loggers = 0
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209223#comment-15209223 ]

ASF GitHub Bot commented on STORM-1632:
---------------------------------------

Github user harshach commented on the pull request:

    https://github.com/apache/storm/pull/1217#issuecomment-200553473

    @arunmahadevan @HeartSaVioR Looking at the perf analysis from @roshannaik, there seems to be enough evidence to consider this a serious performance issue. Given that event logging is a new feature and we do have evidence it's causing a perf issue, I am +1 on disabling it by default. I understand that once it is disabled it can't be enabled in a running topology, and that is OK. For most use cases this might be used in a dev cluster rather than a production cluster. Also, this is a blocker for the 1.0 release; let's get this merged in and see if there is a better way to enable it by default, which we can do in the 1.1 release.

> Disable event logging by default
>
> Key: STORM-1632
> URL: https://issues.apache.org/jira/browse/STORM-1632
> Project: Apache Storm
> Issue Type: Bug
> Components: storm-core
> Reporter: Roshan Naik
> Assignee: Roshan Naik
> Priority: Blocker
> Fix For: 1.0.0
>
> Event logging has a performance penalty. For a simple speed-of-light topology with a single instance of a spout and a bolt, disabling event logging delivers a 7% to 9% perf improvement (with acker count = 1).
> Event logging can be enabled when there is a need to debug, but turned off by default.
> **Update:** with acker=0 the observed impact was much higher... **25%** faster when event loggers = 0
[jira] [Commented] (STORM-1268) port backtype.storm.daemon.builtin-metrics to java
[ https://issues.apache.org/jira/browse/STORM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209165#comment-15209165 ] ASF GitHub Bot commented on STORM-1268: --- Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/1218#issuecomment-200542649 It looks good to me; I am +1, but I would like to see the one minor naming comment addressed. > nodePort2Socket can be renamed to nodePortToSocket. > port backtype.storm.daemon.builtin-metrics to java > -- > > Key: STORM-1268 > URL: https://issues.apache.org/jira/browse/STORM-1268 > Project: Apache Storm > Issue Type: New Feature > Components: storm-core >Reporter: Robert Joseph Evans >Assignee: Cody > Labels: java-migration, jstorm-merger > > Built-in metrics -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1632) Disable event logging by default
[ https://issues.apache.org/jira/browse/STORM-1632?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209171#comment-15209171 ] ASF GitHub Bot commented on STORM-1632: --- Github user roshannaik commented on the pull request: https://github.com/apache/storm/pull/1217#issuecomment-200543287 @arunmahadevan :-) ... I am not taking the throughput measurements while the profiler is attached. It will take some time for me to iterate over and analyze your attempts and JProfiler usage to see what is going on. At a quick glance I see multiple differences in your topology setup. But the profiler screenshots that I have posted are hopefully evidence that I didn't cook it up :-). You can try with the topology I described; I shall also post a GitHub link to the topology I am using soon. @HeartSaVioR I am a bit puzzled to see an 8% or 25% perf diff (for a given topology) being referred to as a *micro* optimization. This is a case of potentially significant overhead being imposed on the common code path by an infrequently used code path. Quite the contrary, I feel one should have a very good justification to leave this turned on. It is not feasible to do a full-fledged Yahoo-style benchmark to identify and fix all such issues. Micro-benchmarking is essential. Here we are looking at a simple case of the emit() call dominating most of the time within nextTuple(); the spout computation itself takes a negligible % of the time. I have deliberately separated #1242 out from this, as this PR is about simply disabling a DEBUG config setting, as opposed to modifying code to avoid repetitive lookups. Seeking and testing an alternative implementation for event logging (unless it's trivial) might, I felt, be tricky at this late stage of 1.x.
[jira] [Commented] (STORM-1625) Move storm-sql dependencies out of lib folder
[ https://issues.apache.org/jira/browse/STORM-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209142#comment-15209142 ] ASF GitHub Bot commented on STORM-1625: --- Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/1239#issuecomment-200538695 I am fine with that too. +1. We can make this simpler in the future. > Move storm-sql dependencies out of lib folder > - > > Key: STORM-1625 > URL: https://issues.apache.org/jira/browse/STORM-1625 > Project: Apache Storm > Issue Type: Bug > Components: storm-core, storm-sql >Reporter: Abhishek Agarwal >Assignee: Abhishek Agarwal > > We shade guava classes inside storm-core so that clients can work with any > version of guava without clashes. However, storm-sql-core has a > transitive dependency on guava and thus the guava jar is still getting > shipped in the lib folder. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-676) Storm Trident support for sliding/tumbling windows
[ https://issues.apache.org/jira/browse/STORM-676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209117#comment-15209117 ] ASF GitHub Bot commented on STORM-676: -- Github user ptgoetz commented on a diff in the pull request: https://github.com/apache/storm/pull/1072#discussion_r57232035 --- Diff: external/storm-hbase/src/main/java/org/apache/storm/hbase/trident/windowing/HBaseWindowsStore.java --- @@ -0,0 +1,275 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, + * software distributed under the License is distributed on an + * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY + * KIND, either express or implied. See the License for the + * specific language governing permissions and limitations + * under the License. 
+ */ +package org.apache.storm.hbase.trident.windowing; + +import com.esotericsoftware.kryo.Kryo; +import com.esotericsoftware.kryo.io.Input; +import com.esotericsoftware.kryo.io.Output; +import org.apache.hadoop.conf.Configuration; +import org.apache.hadoop.hbase.client.Delete; +import org.apache.hadoop.hbase.client.Get; +import org.apache.hadoop.hbase.client.HTable; +import org.apache.hadoop.hbase.client.Put; +import org.apache.hadoop.hbase.client.Result; +import org.apache.hadoop.hbase.client.RetriesExhaustedWithDetailsException; +import org.apache.hadoop.hbase.client.Scan; +import org.apache.hadoop.hbase.filter.FirstKeyOnlyFilter; +import org.apache.storm.trident.windowing.WindowsStore; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.ByteArrayOutputStream; +import java.io.IOException; +import java.io.InterruptedIOException; +import java.io.UnsupportedEncodingException; +import java.nio.ByteBuffer; +import java.util.ArrayList; +import java.util.Collection; +import java.util.Iterator; +import java.util.List; +import java.util.Queue; +import java.util.concurrent.ConcurrentLinkedQueue; + +/** + * This class stores entries into an HBase instance with the given configuration.
+ * + */ +public class HBaseWindowsStore implements WindowsStore { +private static final Logger LOG = LoggerFactory.getLogger(HBaseWindowsStore.class); +public static final String UTF_8 = "utf-8"; + +private final ThreadLocal<HTable> threadLocalHtable; +private Queue<HTable> htables = new ConcurrentLinkedQueue<>(); +private final byte[] family; +private final byte[] qualifier; + +public HBaseWindowsStore(final Configuration config, final String tableName, byte[] family, byte[] qualifier) { +this.family = family; +this.qualifier = qualifier; + +threadLocalHtable = new ThreadLocal<HTable>() { +@Override +protected HTable initialValue() { +try { +HTable hTable = new HTable(config, tableName); +htables.add(hTable); +return hTable; +} catch (IOException e) { +throw new RuntimeException(e); +} +} +}; + +} + +private HTable htable() { +return threadLocalHtable.get(); +} + +private byte[] effectiveKey(String key) { +try { +return key.getBytes(UTF_8); +} catch (UnsupportedEncodingException e) { +throw new RuntimeException(e); +} +} + +@Override +public Object get(String key) { +WindowsStore.Entry.nonNullCheckForKey(key); + +byte[] effectiveKey = effectiveKey(key); +Get get = new Get(effectiveKey); +Result result = null; +try { +result = htable().get(get); +} catch (IOException e) { +throw new RuntimeException(e); +} + +if(result.isEmpty()) { +return null; +} + +Kryo kryo = new Kryo(); +Input input = new Input(result.getValue(family, qualifier)); +Object resultObject = kryo.readClassAndObject(input); +return resultObject; + +} + +
[jira] [Commented] (STORM-1056) allow supervisor log filename to be configurable via ENV variable
[ https://issues.apache.org/jira/browse/STORM-1056?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209115#comment-15209115 ] Erik Weathers commented on STORM-1056: -- [~kabhwan]: ahh, seems that [the release notes for storm 0.10.0|https://storm.apache.org/2015/11/05/storm0100-released.html] were just missing STORM-1056, but it's actually present in v0.10.0: * https://github.com/apache/storm/blob/v0.10.0/bin/storm.py#L80 And in the binary release tarball: {code} (/tmp) % wget http://www.carfab.com/apachesoftware/storm/apache-storm-0.10.0/apache-storm-0.10.0.tar.gz ... (/tmp) % tar -xf apache-storm-0.10.0.tar.gz (/tmp/apache-storm-0.10.0) % grep SUPERVI bin/storm.py STORM_SUPERVISOR_LOG_FILE = os.getenv('STORM_SUPERVISOR_LOG_FILE', "supervisor.log") "-Dlogfile.name=" + STORM_SUPERVISOR_LOG_FILE, {code} > allow supervisor log filename to be configurable via ENV variable > - > > Key: STORM-1056 > URL: https://issues.apache.org/jira/browse/STORM-1056 > Project: Apache Storm > Issue Type: Task > Components: storm-core >Reporter: Erik Weathers >Assignee: Erik Weathers >Priority: Minor > Fix For: 0.9.6 > > > *Requested feature:* allow configuring the supervisor's log filename when > launching it via an ENV variable. > *Motivation:* The storm-on-mesos project (https://github.com/mesos/storm) > relies on multiple Storm Supervisor processes per worker host, where each > Supervisor is dedicated to a particular topology. This is part of the > framework's functionality of separating topologies from each other. i.e., > storm-on-mesos is a multi-tenant system. But before the change requested in > this issue, the logs from all supervisors on a worker host will be written > into a supervisor log with a single name of supervisor.log. If all logs are > written to a common location on the mesos host, then all logs go to the same > log file. 
Instead it would be desirable to separate the supervisor logs > per-topology, so that each tenant/topology-owner can peruse the logs that are > related to their own topology. Thus this ticket is requesting the ability to > configure the supervisor log via an environment variable whilst invoking > bin/storm.py (or bin/storm in pre-0.10 storm releases). > When this ticket is fixed, we will include the topology ID into the > supervisor log filename for storm-on-mesos. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
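The two `bin/storm.py` lines quoted in the {code} block above are the whole mechanism. As a quick illustration, the fallback logic can be sketched in Python (the function wrapper and names below are illustrative, not part of the actual script):

```python
import os

def supervisor_logfile_opt(env=None):
    """Build the -Dlogfile.name JVM option the way bin/storm.py (v0.10.0) does:
    fall back to "supervisor.log" unless STORM_SUPERVISOR_LOG_FILE overrides it."""
    env = os.environ if env is None else env
    name = env.get("STORM_SUPERVISOR_LOG_FILE", "supervisor.log")
    return "-Dlogfile.name=" + name

# A framework like storm-on-mesos would export the env var per topology
# before launching each supervisor, e.g.:
#   STORM_SUPERVISOR_LOG_FILE=supervisor-<topology-id>.log bin/storm supervisor
```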
[jira] [Commented] (STORM-1617) storm.apache.org has no release specific documentation
[ https://issues.apache.org/jira/browse/STORM-1617?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209111#comment-15209111 ] Robert Joseph Evans commented on STORM-1617: I just filed INFRA-11528 > storm.apache.org has no release specific documentation > -- > > Key: STORM-1617 > URL: https://issues.apache.org/jira/browse/STORM-1617 > Project: Apache Storm > Issue Type: Bug > Components: asf-site >Reporter: Robert Joseph Evans >Assignee: Robert Joseph Evans > > Our current documentation on http://storm.apache.org/ has no place to put > release-specific documentation. Every other project I know of has a little > bit of generic information, but most of the site is tied to a specific > release. I would like to copy that model, and I propose that the following > files be maintained only on the asf-site branch. All other files will be > moved to a documentation directory on the individual branches, and a copy of > each, along with the generated javadocs, will be copied to a release-specific > directory for each release. > ./index.html > ./releases.html (Combined downloads + documentation for each release) > ./contribute/BYLAWS.md > ./contribute/Contributing-to-Storm.md > ./contribute/People.md (Perhaps just PMC for this) > ./_posts > ./news.html > ./feed.xml > ./getting-help.html > ./LICENSE.html (Apache License) > ./talksAndVideos.md > Possibly also: > ./about/deployment.md > ./about/fault-tolerant.md > ./about/free-and-open-source.md > ./about/guarantees-data-processing.md > ./about/integrates.md > ./about/multi-language.md > ./about/scalable.md > ./about/simple-api.md > ./about.md -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1625) Move storm-sql dependencies out of lib folder
[ https://issues.apache.org/jira/browse/STORM-1625?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209095#comment-15209095 ] ASF GitHub Bot commented on STORM-1625: --- Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/1239#issuecomment-200529436 I'm +1 for merging this. I think moving the jars out of the lib folder is the most important part for the 1.0 release. How to package the SQL dependencies for the best user experience can be addressed in a follow-up JIRA or as simple documentation instructing users what jars need to go where. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Resolved] (STORM-1622) Topology jars referring to shaded classes of older version of storm jar cannot be run in Storm 1.0.0
[ https://issues.apache.org/jira/browse/STORM-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-1622. Resolution: Fixed Fix Version/s: 1.0.0 Thanks [~abhishek.agarwal]. Merged to 1.x-branch. > Topology jars referring to shaded classes of older version of storm jar > cannot be run in Storm 1.0.0 > > > Key: STORM-1622 > URL: https://issues.apache.org/jira/browse/STORM-1622 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Reporter: Abhishek Agarwal >Assignee: Abhishek Agarwal > Fix For: 1.0.0 > > > This commit > https://github.com/apache/storm/commit/12fdaa01b20eb144bf18a231a05c92873c90aa09 > changes the package names of shaded classes inside storm. These classes > are shipped inside the Maven release of the storm-core jar and can be depended upon by > the topology jar. A jar built with an older version of storm-core that uses this > dependency won't run on a newer version of the storm cluster. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1622) Topology jars referring to shaded classes of older version of storm jar cannot be run in Storm 1.0.0
[ https://issues.apache.org/jira/browse/STORM-1622?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209082#comment-15209082 ] ASF GitHub Bot commented on STORM-1622: --- Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/1240 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] storm pull request: STORM-1622: Rename classes with older third pa...
[jira] [Commented] (STORM-1279) port backtype.storm.daemon.supervisor to java
[ https://issues.apache.org/jira/browse/STORM-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209079#comment-15209079 ] ASF GitHub Bot commented on STORM-1279: --- Github user jerrypeng commented on a diff in the pull request: https://github.com/apache/storm/pull/1184#discussion_r57228512 --- Diff: storm-core/src/jvm/org/apache/storm/daemon/supervisor/SyncSupervisorEvent.java --- @@ -0,0 +1,632 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.storm.daemon.supervisor; + +import org.apache.commons.io.FileUtils; +import org.apache.storm.Config; +import org.apache.storm.blobstore.BlobStore; +import org.apache.storm.blobstore.ClientBlobStore; +import org.apache.storm.cluster.IStateStorage; +import org.apache.storm.cluster.IStormClusterState; +import org.apache.storm.event.EventManager; +import org.apache.storm.generated.*; +import org.apache.storm.localizer.LocalResource; +import org.apache.storm.localizer.LocalizedResource; +import org.apache.storm.localizer.Localizer; +import org.apache.storm.utils.*; +import org.apache.thrift.transport.TTransportException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.net.JarURLConnection; +import java.net.URL; +import java.nio.file.Files; +import java.nio.file.StandardCopyOption; +import java.util.*; +import java.util.concurrent.atomic.AtomicInteger; + +public class SyncSupervisorEvent implements Runnable { + +private static final Logger LOG = LoggerFactory.getLogger(SyncSupervisorEvent.class); + +private EventManager syncSupEventManager; +private EventManager syncProcessManager; +private IStormClusterState stormClusterState; +private LocalState localState; +private SyncProcessEvent syncProcesses; +private SupervisorData supervisorData; + +public SyncSupervisorEvent(SupervisorData supervisorData, SyncProcessEvent syncProcesses, EventManager syncSupEventManager, +EventManager syncProcessManager) { + +this.syncProcesses = syncProcesses; +this.syncSupEventManager = syncSupEventManager; +this.syncProcessManager = syncProcessManager; +this.stormClusterState = supervisorData.getStormClusterState(); +this.localState = supervisorData.getLocalState(); +this.supervisorData = supervisorData; +} + +@Override +public void run() { +try { +Map conf = supervisorData.getConf(); +Runnable syncCallback = new EventManagerPushCallback(this, 
syncSupEventManager); +List stormIds = stormClusterState.assignments(syncCallback); +Map> assignmentsSnapshot = +getAssignmentsSnapshot(stormClusterState, stormIds, supervisorData.getAssignmentVersions().get(), syncCallback); +Map stormIdToProfilerActions = getProfileActions(stormClusterState, stormIds); + +Set allDownloadedTopologyIds = SupervisorUtils.readDownLoadedStormIds(conf); +Map stormcodeMap = readStormCodeLocations(assignmentsSnapshot); +Map existingAssignment = localState.getLocalAssignmentsMap(); +if (existingAssignment == null) { +existingAssignment = new HashMap<>(); +} + +Map allAssignment = +readAssignments(assignmentsSnapshot, existingAssignment, supervisorData.getAssignmentId(), supervisorData.getSyncRetry()); + +Map newAssignment = new HashMap<>(); +Set assignedStormIds = new HashSet<>(); + +for (Map.Entry
[jira] [Resolved] (STORM-1537) Upgrade to Kryo 3
[ https://issues.apache.org/jira/browse/STORM-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz resolved STORM-1537. Resolution: Fixed Fix Version/s: 1.0.0 Merged to 1.x-branch. > Upgrade to Kryo 3 > - > > Key: STORM-1537 > URL: https://issues.apache.org/jira/browse/STORM-1537 > Project: Apache Storm > Issue Type: Improvement >Affects Versions: 1.0.0 >Reporter: Oscar Boykin >Assignee: P. Taylor Goetz > Fix For: 1.0.0 > > > In storm, Kryo (2.21) is used for serialization: > https://github.com/apache/storm/blob/02a44c7fc1b7b3a1571b326fde7bcae13e1b5c8d/pom.xml#L231 > The user must use the same version storm does, or there will be a java class > error at runtime. > Storm depends on a quasi-abandoned library: carbonite: > https://github.com/apache/storm/blob/02a44c7fc1b7b3a1571b326fde7bcae13e1b5c8d/pom.xml#L210 > which depends on Kryo 2.21 and Twitter chill 0.3.6: > https://github.com/sritchie/carbonite/blob/master/project.clj#L8 > Chill, currently on 0.7.3, would like to upgrade to Kryo 3.0.3: > https://github.com/twitter/chill/pull/245 > because Spark, also depending on chill, would like to upgrade for performance > improvements and bugfixes. > https://issues.apache.org/jira/browse/SPARK-11416 > Unfortunately, summingbird depends on storm: > https://github.com/twitter/summingbird/blob/develop/build.sbt#L34 > so, if chill is upgraded, and that gets on the classpath, summingbird will > break at runtime. > I propose: > 1) copy the carbonite code into storm. It is likely the only consumer. > 2) bump the storm kryo dependency after chill upgrades: recall that storm > actually depends on chill-java. A dependency that could possibly be removed > after you pull carbonite in. > 3) once a new version of storm is published, summingbird (and scalding) can > upgrade to the latest chill. > Also, I hope for: > 4) we as a JVM community get better about classpath isolation and versioning. 
> Diamonds like this in one big classpath make large codebases very fragile. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1537) Upgrade to Kryo 3
[ https://issues.apache.org/jira/browse/STORM-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209066#comment-15209066 ] ASF GitHub Bot commented on STORM-1537: --- Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/1223 -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1537) Upgrade to Kryo 3
[ https://issues.apache.org/jira/browse/STORM-1537?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15209062#comment-15209062 ] ASF GitHub Bot commented on STORM-1537: --- Github user ptgoetz commented on the pull request: https://github.com/apache/storm/pull/1223#issuecomment-200521136 +1 Tested on a real cluster and compared to 1.x-branch before and after this patch. Throughput increased and latency decreased. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Created] (STORM-1653) BYLAWS has disappeared
Kyle Nusbaum created STORM-1653: --- Summary: BYLAWS has disappeared Key: STORM-1653 URL: https://issues.apache.org/jira/browse/STORM-1653 Project: Apache Storm Issue Type: Bug Reporter: Kyle Nusbaum When moving asf-site to svn, BYLAWS (and probably some other stuff) disappeared. Someone needs to compare the previous docs with what's on SVN now and fill in any missing spots. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[GitHub] storm pull request: Upgraded HBase version to 1.1.0
Github user asfgit closed the pull request at: https://github.com/apache/storm/pull/1197
[jira] [Commented] (STORM-1611) port org.apache.storm.pacemaker.pacemaker to java
[ https://issues.apache.org/jira/browse/STORM-1611?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208962#comment-15208962 ] ASF GitHub Bot commented on STORM-1611: --- Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/1195#issuecomment-200497814 By using gauges we're re-implementing stuff that coda hale does, namely windowed stats. Just take a look at DRPCServer.java, like I mentioned. You can use Meters or Histograms (I don't care). But they should both allow you to delete a large amount of code. It should basically come down to 3-5 lines initializing the meters/histograms. Some of the current ones will be combined, for instance `largestHeartbeatSize` and `averageHeartbeatSize` can be combined into one Histogram automatically. Just set up a Histogram and mark it with the heartbeat size whenever a heartbeat comes in. Most of the work is just deleting code. > port org.apache.storm.pacemaker.pacemaker to java > - > > Key: STORM-1611 > URL: https://issues.apache.org/jira/browse/STORM-1611 > Project: Apache Storm > Issue Type: New Feature >Reporter: John Fang >Assignee: John Fang > -- This message was sent by Atlassian JIRA (v6.3.4#6332)
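The suggestion above — replacing hand-maintained gauges with a single Codahale (Dropwizard Metrics) Histogram — can be sketched roughly as follows. The class and metric names are illustrative, not taken from the pull request:

```java
import com.codahale.metrics.Histogram;
import com.codahale.metrics.MetricRegistry;

// Hypothetical sketch: one Histogram subsumes the separate
// largestHeartbeatSize / averageHeartbeatSize bookkeeping.
class HeartbeatMetrics {
    private final MetricRegistry registry = new MetricRegistry();
    private final Histogram heartbeatSizes = registry.histogram("heartbeat-sizes");

    // Mark the histogram whenever a heartbeat comes in.
    void onHeartbeat(long sizeBytes) {
        heartbeatSizes.update(sizeBytes);
    }

    // Max and mean fall out of the Snapshot with no extra code.
    long largestHeartbeatSize() {
        return heartbeatSizes.getSnapshot().getMax();
    }

    double averageHeartbeatSize() {
        return heartbeatSizes.getSnapshot().getMean();
    }
}
```

By default `MetricRegistry.histogram` backs the histogram with an exponentially decaying reservoir, so the reported max and mean are biased toward recent samples — the windowed behavior the comment refers to.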
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208942#comment-15208942 ] ASF GitHub Bot commented on STORM-1650: --- Github user revans2 commented on the pull request: https://github.com/apache/storm/pull/1250#issuecomment-200490900 @hustfxj If you want to close this pull request I will leave it up to you. In some situations the uniqueness of a number is important, and having Random emit truly unique values in a thread-safe way is important. This is specifically when creating the tuple IDs that will be used with acking. In other situations we are using Random to do sub-sampling where the uniqueness of the numbers is not critical. The correctness of the system is not compromised if the numbers are repeated or poorly distributed. The only concern for those situations would be if we violate a contract, like we generate a random number that is not within the range specified by the API, or we deadlock, etc. Looking at the code here we will never violate those constraints. The worst thing that happens is that we may repeat some "random" numbers in different threads because the compiler cached the seed in a local register and didn't write back to memory. For me the extra performance can outweigh the less than ideal situation. Looking at ThreadLocalRandom, they are using unsafe operations and I don't know enough about the internal implementation to feel comfortable saying whether it will or will not violate any of the constraints, but I don't think it will. > improve performance by XORShiftRandom > - > > Key: STORM-1650 > URL: https://issues.apache.org/jira/browse/STORM-1650 > Project: Apache Storm > Issue Type: Improvement >Reporter: John Fang >Assignee: John Fang > > '''Implement a random number generator based on the XORShift algorithm > discovered by George Marsaglia. This RNG is observed to be 4.5 times faster than > {@link Random} in benchmarks, at the cost of abandoning thread-safety. So > it's recommended to create a new {@link XORShiftRandom} for each thread.''' -- This message was sent by Atlassian JIRA (v6.3.4#6332)
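For context, the trade-off being debated can be illustrated with a minimal xorshift64 generator in the style of Marsaglia's algorithm — a sketch for illustration only, not the implementation from the pull request:

```java
// Minimal xorshift64 sketch (George Marsaglia's algorithm). The state is a
// plain field with no volatile/synchronization, which is exactly the
// thread-safety trade-off under discussion: fast, but per-thread only.
final class XorShiftRandom {
    private long seed;

    XorShiftRandom(long seed) {
        // Zero is a fixed point of xorshift, so substitute a nonzero constant.
        this.seed = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    long nextLong() {
        long x = seed;
        x ^= (x << 13);
        x ^= (x >>> 7);
        x ^= (x << 17);
        seed = x;
        return x;
    }

    // Bounded variant; like Random#nextInt(bound), result is in [0, bound).
    int nextInt(int bound) {
        return (int) Math.floorMod(nextLong(), (long) bound);
    }
}
```

With no memory barriers, two threads sharing one instance may observe stale seeds and emit duplicate values — harmless for sub-sampling, unacceptable for unique tuple IDs, which is the distinction drawn above.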
[GitHub] storm pull request: Upgraded HBase version to 1.1.0
Github user knusbaum commented on the pull request: https://github.com/apache/storm/pull/1197#issuecomment-200489523 +1
[jira] [Commented] (STORM-587) trident transactional state in zk should be namespaced with topology id
[ https://issues.apache.org/jira/browse/STORM-587?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208899#comment-15208899 ] ASF GitHub Bot commented on STORM-587: -- Github user harshach closed the pull request at: https://github.com/apache/storm/pull/475 > trident transactional state in zk should be namespaced with topology id > --- > > Key: STORM-587 > URL: https://issues.apache.org/jira/browse/STORM-587 > Project: Apache Storm > Issue Type: Improvement > Components: storm-core >Reporter: Parth Brahmbhatt >Assignee: Sriharsha Chintalapani > > Currently when a trident transaction spout is initialized it creates a node > in zk under /transactional with the spout name as the node's name. This is > pretty dangerous as any other topology can be submitted with same spout name > and now these 2 spouts will be overwriting each other's states. I believe it > is better to namespace this with topologyId just like all other zk entries > under /storm. -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208782#comment-15208782 ] ASF GitHub Bot commented on STORM-1650: --- Github user hustfxj commented on the pull request: https://github.com/apache/storm/pull/1250#issuecomment-200444929 @revans2 It is used in a non-thread-safe way, especially in spout/bolt threads, so we think it may not make sense. Should I close the PR? -- This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1279) port backtype.storm.daemon.supervisor to java
[ https://issues.apache.org/jira/browse/STORM-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208733#comment-15208733 ] ASF GitHub Bot commented on STORM-1279: --- Github user hustfxj commented on a diff in the pull request: https://github.com/apache/storm/pull/1184#discussion_r57193403 --- Diff: storm-core/src/jvm/org/apache/storm/daemon/supervisor/SyncSupervisorEvent.java --- @@ -0,0 +1,632 @@ +/** + * Licensed to the Apache Software Foundation (ASF) under one + * or more contributor license agreements. See the NOTICE file + * distributed with this work for additional information + * regarding copyright ownership. The ASF licenses this file + * to you under the Apache License, Version 2.0 (the + * "License"); you may not use this file except in compliance + * with the License. You may obtain a copy of the License at + * + * http://www.apache.org/licenses/LICENSE-2.0 + * + * Unless required by applicable law or agreed to in writing, software + * distributed under the License is distributed on an "AS IS" BASIS, + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. + * See the License for the specific language governing permissions and + * limitations under the License. 
+ */ +package org.apache.storm.daemon.supervisor; + +import org.apache.commons.io.FileUtils; +import org.apache.storm.Config; +import org.apache.storm.blobstore.BlobStore; +import org.apache.storm.blobstore.ClientBlobStore; +import org.apache.storm.cluster.IStateStorage; +import org.apache.storm.cluster.IStormClusterState; +import org.apache.storm.event.EventManager; +import org.apache.storm.generated.*; +import org.apache.storm.localizer.LocalResource; +import org.apache.storm.localizer.LocalizedResource; +import org.apache.storm.localizer.Localizer; +import org.apache.storm.utils.*; +import org.apache.thrift.transport.TTransportException; +import org.slf4j.Logger; +import org.slf4j.LoggerFactory; + +import java.io.File; +import java.io.FileOutputStream; +import java.io.IOException; +import java.net.JarURLConnection; +import java.net.URL; +import java.nio.file.Files; +import java.nio.file.StandardCopyOption; +import java.util.*; +import java.util.concurrent.atomic.AtomicInteger; + +public class SyncSupervisorEvent implements Runnable { + +private static final Logger LOG = LoggerFactory.getLogger(SyncSupervisorEvent.class); + +private EventManager syncSupEventManager; +private EventManager syncProcessManager; +private IStormClusterState stormClusterState; +private LocalState localState; +private SyncProcessEvent syncProcesses; +private SupervisorData supervisorData; + +public SyncSupervisorEvent(SupervisorData supervisorData, SyncProcessEvent syncProcesses, EventManager syncSupEventManager, +EventManager syncProcessManager) { + +this.syncProcesses = syncProcesses; +this.syncSupEventManager = syncSupEventManager; +this.syncProcessManager = syncProcessManager; +this.stormClusterState = supervisorData.getStormClusterState(); +this.localState = supervisorData.getLocalState(); +this.supervisorData = supervisorData; +} + +@Override +public void run() { +try { +Map conf = supervisorData.getConf(); +Runnable syncCallback = new EventManagerPushCallback(this, 
syncSupEventManager); +List stormIds = stormClusterState.assignments(syncCallback); +Map> assignmentsSnapshot = +getAssignmentsSnapshot(stormClusterState, stormIds, supervisorData.getAssignmentVersions().get(), syncCallback); +Map stormIdToProfilerActions = getProfileActions(stormClusterState, stormIds); + +Set allDownloadedTopologyIds = SupervisorUtils.readDownLoadedStormIds(conf); +Map stormcodeMap = readStormCodeLocations(assignmentsSnapshot); +Map existingAssignment = localState.getLocalAssignmentsMap(); +if (existingAssignment == null) { +existingAssignment = new HashMap<>(); +} + +Map allAssignment = +readAssignments(assignmentsSnapshot, existingAssignment, supervisorData.getAssignmentId(), supervisorData.getSyncRetry()); + +Map newAssignment = new HashMap<>(); +Set assignedStormIds = new HashSet<>(); + +for (Map.Entry
[jira] [Updated] (STORM-1469) Unable to deploy large topologies on apache storm
[ https://issues.apache.org/jira/browse/STORM-1469?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] P. Taylor Goetz updated STORM-1469: --- Removing from 1.0 release epic since this has been fixed in the 1.x-branch. > Unable to deploy large topologies on apache storm > - > > Key: STORM-1469 > URL: https://issues.apache.org/jira/browse/STORM-1469 > Project: Apache Storm > Issue Type: Bug > Components: storm-core >Affects Versions: 1.0.0, 2.0.0 >Reporter: Rudra Sharma >Assignee: Kishor Patil > Fix For: 1.0.0 > > > When deploying to a nimbus a topology which is larger in size >17MB, we get > an exception. In storm 0.9.3 this could be mitigated by using the following > config on the storm.yaml to increse the buffer size to handle the topology > size. i.e. 50MB would be > nimbus.thrift.max_buffer_size: 5000 > This configuration does not resolve the issue in the master branch of storm > and we cannot deploy topologies which are large in size. > Here is the log on the client side when attempting to deploy to the nimbus > node: > java.lang.RuntimeException: org.apache.thrift7.transport.TTransportException > at > backtype.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:251) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:272) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > backtype.storm.StormSubmitter.submitTopology(StormSubmitter.java:155) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > com.trustwave.siem.storm.topology.deployer.TopologyDeployer.deploy(TopologyDeployer.java:149) > [siem-ng-storm-deployer-cloud.jar:] > at > com.trustwave.siem.storm.topology.deployer.TopologyDeployer.main(TopologyDeployer.java:87) > [siem-ng-storm-deployer-cloud.jar:] > Caused by: org.apache.thrift7.transport.TTransportException > at > org.apache.thrift7.transport.TIOStreamTransport.read(TIOStreamTransport.java:132) > 
~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at org.apache.thrift7.transport.TTransport.readAll(TTransport.java:86) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > org.apache.thrift7.transport.TFramedTransport.readFrame(TFramedTransport.java:129) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > org.apache.thrift7.transport.TFramedTransport.read(TFramedTransport.java:101) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at org.apache.thrift7.transport.TTransport.readAll(TTransport.java:86) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > org.apache.thrift7.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > org.apache.thrift7.protocol.TBinaryProtocol.readI32(TBinaryProtocol.java:318) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > org.apache.thrift7.protocol.TBinaryProtocol.readMessageBegin(TBinaryProtocol.java:219) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > org.apache.thrift7.TServiceClient.receiveBase(TServiceClient.java:77) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > backtype.storm.generated.Nimbus$Client.recv_submitTopology(Nimbus.java:238) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > backtype.storm.generated.Nimbus$Client.submitTopology(Nimbus.java:222) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > at > backtype.storm.StormSubmitter.submitTopologyAs(StormSubmitter.java:237) > ~[storm-core-0.11.0-SNAPSHOT.jar:0.11.0-SNAPSHOT] > ... 4 more > Here is the log on the server side (nimbus.log): > 2016-01-13 10:48:07.206 o.a.s.d.nimbus [INFO] Cleaning inbox ... 
deleted: > stormjar-c8666220-fa19-426b-a7e4-c62dfb57f1f0.jar > 2016-01-13 10:55:09.823 o.a.s.d.nimbus [INFO] Uploading file from client to > /var/storm-data/nimbus/inbox/stormjar-80ecdf05-6a25-4281-8c78-10062ac5e396.jar > 2016-01-13 10:55:11.910 o.a.s.d.nimbus [INFO] Finished uploading file from > client: > /var/storm-data/nimbus/inbox/stormjar-80ecdf05-6a25-4281-8c78-10062ac5e396.jar > 2016-01-13 10:55:12.084 o.a.t.s.AbstractNonblockingServer$FrameBuffer [WARN] > Exception while invoking! > org.apache.thrift7.transport.TTransportException: Frame size (17435758) > larger than max length (16384000)! > at > org.apache.thrift7.transport.TFramedTransport.readFrame(TFramedTransport.java:137) > at > org.apache.thrift7.transport.TFramedTransport.read(TFramedTransport.java:101) > at org.apache.thrift7.transport.TTransport.readAll(TTransport.java:86) > at > org.apache.thrift7.protocol.TBinaryProtocol.readAll(TBinaryProtocol.java:429) > at >
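The workaround described in the report corresponds to a storm.yaml entry along these lines (the value is illustrative — it must exceed the serialized topology size — and per the report it was honored on 0.9.3 but not on the master branch at the time):

```yaml
# storm.yaml on the Nimbus host: raise the thrift frame limit.
# The log above shows a 17435758-byte frame rejected at a 16384000-byte cap,
# so the limit must be set above the topology's serialized size.
nimbus.thrift.max_buffer_size: 50000000   # roughly 50 MB, as in the report
```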
[jira] [Commented] (STORM-1279) port backtype.storm.daemon.supervisor to java
[ https://issues.apache.org/jira/browse/STORM-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208645#comment-15208645 ] ASF GitHub Bot commented on STORM-1279:
---
Github user hustfxj commented on a diff in the pull request: https://github.com/apache/storm/pull/1184#discussion_r57185550

--- Diff: storm-core/src/jvm/org/apache/storm/daemon/supervisor/SyncProcessEvent.java ---
@@ -0,0 +1,428 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.storm.daemon.supervisor;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.commons.lang.StringUtils;
+import org.apache.storm.Config;
+import org.apache.storm.container.cgroup.CgroupManager;
+import org.apache.storm.daemon.supervisor.workermanager.IWorkerManager;
+import org.apache.storm.generated.ExecutorInfo;
+import org.apache.storm.generated.LSWorkerHeartbeat;
+import org.apache.storm.generated.LocalAssignment;
+import org.apache.storm.generated.WorkerResources;
+import org.apache.storm.utils.ConfigUtils;
+import org.apache.storm.utils.LocalState;
+import org.apache.storm.utils.Time;
+import org.apache.storm.utils.Utils;
+import org.eclipse.jetty.util.ConcurrentHashSet;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+import org.yaml.snakeyaml.Yaml;
+
+import java.io.File;
+import java.io.FileWriter;
+import java.io.IOException;
+import java.util.*;
+
+/**
+ * 1. to kill are those in allocated that are dead or disallowed 2. kill the ones that should be dead - read pids, kill -9 and individually remove file - rmr
+ * heartbeat dir, rmdir pid dir, rmdir id dir (catch exception and log) 3. of the rest, figure out what assignments aren't yet satisfied 4. generate new worker
+ * ids, write new "approved workers" to LS 5. create local dir for worker id 5. launch new workers (give worker-id, port, and supervisor-id) 6. wait for workers
+ * launch
+ */
+public class SyncProcessEvent implements Runnable {
+
+    private static Logger LOG = LoggerFactory.getLogger(SyncProcessEvent.class);
+
+    private LocalState localState;
+    private SupervisorData supervisorData;
+    public static final ExecutorInfo SYSTEM_EXECUTOR_INFO = new ExecutorInfo(-1, -1);
+
+    private class ProcessExitCallback implements Utils.ExitCodeCallable {
+        private final String logPrefix;
+        private final String workerId;
+
+        public ProcessExitCallback(String logPrefix, String workerId) {
+            this.logPrefix = logPrefix;
+            this.workerId = workerId;
+        }
+
+        @Override
+        public Object call() throws Exception {
+            return null;
+        }
+
+        @Override
+        public Object call(int exitCode) {
+            LOG.info("{} exited with code: {}", logPrefix, exitCode);
+            supervisorData.getDeadWorkers().add(workerId);
+            return null;
+        }
+    }
+
+    public SyncProcessEvent() {
+
+    }
+
+    public SyncProcessEvent(SupervisorData supervisorData) {
+        init(supervisorData);
+    }
+
+    //TODO: initData is intended to local supervisor, so we will remove them after porting worker.clj to java
+    public void init(SupervisorData supervisorData) {
+        this.supervisorData = supervisorData;
+        this.localState = supervisorData.getLocalState();
+    }
+
+    @Override
+    public void run() {
+        LOG.debug("Syncing processes");
+        try {
+            Map conf = supervisorData.getConf();
+            Map assignedExecutors = localState.getLocalAssignmentsMap();
+
+            if (assignedExecutors == null) {
+                assignedExecutors = new HashMap<>();
+            }
+            int now = Time.currentTimeSecs();
+
+            Map
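The numbered steps in the quoted class javadoc (kill workers that are dead or disallowed, find assignments not yet satisfied, generate new worker ids, launch, wait) can be sketched as a simplified reconciliation loop. All names and types below are illustrative stand-ins, not the actual supervisor port:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.UUID;

// Hypothetical sketch of the supervisor sync-processes steps described in the javadoc.
public class SyncSketch {
    public static Map<Integer, String> sync(Map<Integer, String> assignedPortToTopology,
                                            Map<String, Integer> runningWorkerToPort,
                                            Set<String> deadWorkers) {
        // Steps 1-2: drop workers that are dead, or whose port is no longer assigned.
        Map<String, Integer> keep = new HashMap<>();
        for (Map.Entry<String, Integer> e : runningWorkerToPort.entrySet()) {
            boolean disallowed = !assignedPortToTopology.containsKey(e.getValue());
            if (!deadWorkers.contains(e.getKey()) && !disallowed) {
                keep.put(e.getKey(), e.getValue());
                // (in the real code, killed workers also get heartbeat/pid dirs removed)
            }
        }
        // Steps 3-5: assignments whose port has no surviving worker get a fresh worker id.
        Map<Integer, String> toLaunch = new HashMap<>();
        Set<Integer> satisfiedPorts = new HashSet<>(keep.values());
        for (Map.Entry<Integer, String> e : assignedPortToTopology.entrySet()) {
            if (!satisfiedPorts.contains(e.getKey())) {
                toLaunch.put(e.getKey(), UUID.randomUUID().toString()); // new worker id
            }
        }
        return toLaunch; // step 6: the caller waits for these workers to launch
    }
}
```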
[jira] [Commented] (STORM-1268) port backtype.storm.daemon.builtin-metrics to java
[ https://issues.apache.org/jira/browse/STORM-1268?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208637#comment-15208637 ] ASF GitHub Bot commented on STORM-1268:
---
Github user unsleepy22 commented on the pull request: https://github.com/apache/storm/pull/1218#issuecomment-200407083

ping @revans2, could you take some time to have a look? This PR blocks task.clj.

> port backtype.storm.daemon.builtin-metrics to java
> --------------------------------------------------
>
>                 Key: STORM-1268
>                 URL: https://issues.apache.org/jira/browse/STORM-1268
>             Project: Apache Storm
>          Issue Type: New Feature
>          Components: storm-core
>            Reporter: Robert Joseph Evans
>            Assignee: Cody
>              Labels: java-migration, jstorm-merger
>
> Built-in metrics

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208626#comment-15208626 ] ASF GitHub Bot commented on STORM-1650:
---
Github user hustfxj commented on the pull request: https://github.com/apache/storm/pull/1250#issuecomment-200402469

@revans2 ThreadLocalRandom is 20% faster than XORShiftRandom, but ThreadLocalRandom is static. Yes, we can't use XORShiftRandom in executor.clj right now because of thread safety, but if we ensure that every spout/bolt thread has its own XORShiftRandom object, then we can.

> improve performance by XORShiftRandom
> -------------------------------------
>
>                 Key: STORM-1650
>                 URL: https://issues.apache.org/jira/browse/STORM-1650
>             Project: Apache Storm
>          Issue Type: Improvement
>            Reporter: John Fang
>            Assignee: John Fang
>
> '''Implement a random number generator based on the XORShift algorithm
> discovered by George Marsaglia. This RNG is observed to be 4.5 times faster
> than {@link Random} in benchmarks, at the cost of abandoning thread-safety.
> So it's recommended to create a new {@link XORShiftRandom} for each thread.'''

--
This message was sent by Atlassian JIRA (v6.3.4#6332)
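For reference, a minimal xorshift generator in the style Marsaglia described might look like the sketch below. This is not Storm's actual XORShiftRandom class, and like it, it is deliberately not thread-safe, so one instance should be created per thread:

```java
// Minimal xorshift64 sketch (Marsaglia-style shifts 13/7/17). NOT thread-safe:
// create one instance per spout/bolt thread, as the JIRA description recommends.
public class XorShiftRandom {
    private long seed;

    public XorShiftRandom() {
        this(System.nanoTime());
    }

    public XorShiftRandom(long seed) {
        // xorshift gets stuck at 0, so remap a zero seed to an arbitrary constant
        this.seed = (seed == 0) ? 0x9E3779B97F4A7C15L : seed;
    }

    public long nextLong() {
        long x = seed;
        x ^= x << 13;
        x ^= x >>> 7;
        x ^= x << 17;
        seed = x;
        return x;
    }

    public int nextInt(int bound) {
        long r = nextLong() % bound;
        return (int) (r < 0 ? r + bound : r); // fold negative remainders into [0, bound)
    }
}
```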
[jira] [Commented] (STORM-1279) port backtype.storm.daemon.supervisor to java
[ https://issues.apache.org/jira/browse/STORM-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208619#comment-15208619 ] ASF GitHub Bot commented on STORM-1279:
---
Github user jerrypeng commented on a diff in the pull request: https://github.com/apache/storm/pull/1184#discussion_r57182643

--- Diff: storm-core/src/jvm/org/apache/storm/daemon/supervisor/SyncProcessEvent.java ---
[jira] [Commented] (STORM-1279) port backtype.storm.daemon.supervisor to java
[ https://issues.apache.org/jira/browse/STORM-1279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208613#comment-15208613 ] ASF GitHub Bot commented on STORM-1279:
---
Github user jerrypeng commented on a diff in the pull request: https://github.com/apache/storm/pull/1184#discussion_r57181897

--- Diff: storm-core/src/jvm/org/apache/storm/daemon/supervisor/SyncProcessEvent.java ---
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208545#comment-15208545 ] ASF GitHub Bot commented on STORM-1650:
---
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/1250#discussion_r57174134

--- Diff: storm-core/src/clj/org/apache/storm/config.clj ---
@@ -53,7 +54,7 @@
   [freq]
   (let [freq (int freq)
         start (int 0)
-        r (java.util.Random.)
+        r (XORShiftRandom.)
--- End diff --

Here too the code can be called from multiple threads, but the worst that happens is we double up on sampling some items for metrics. Ideally this would have a comment explaining this.
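The sampler being patched above fires roughly once per `freq` calls, at a random position within each window of `freq` calls. A hypothetical Java rendering of that shape (names are illustrative, not Storm's config.clj code):

```java
import java.util.Random;
import java.util.function.Supplier;

// Illustrative "even sampler" sketch: returns true exactly once per window of
// `freq` calls, at a randomly drawn offset within each window.
public class EvenSampler {
    public static Supplier<Boolean> create(int freq, Random random) {
        // state[0] = calls made in the current window, state[1] = target offset
        int[] state = {0, random.nextInt(freq)};
        return () -> {
            boolean hit = state[0] == state[1];
            state[0]++;
            if (state[0] >= freq) {      // window finished: reset and re-draw target
                state[0] = 0;
                state[1] = random.nextInt(freq);
            }
            return hit;
        };
    }
}
```

As the review comment notes, calling the returned function from multiple threads can at worst double up on some samples; the sketch mirrors that single-threaded assumption.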
[jira] [Commented] (STORM-1650) improve performance by XORShiftRandom
[ https://issues.apache.org/jira/browse/STORM-1650?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15208538#comment-15208538 ] ASF GitHub Bot commented on STORM-1650:
---
Github user revans2 commented on a diff in the pull request: https://github.com/apache/storm/pull/1250#discussion_r57173711

--- Diff: storm-core/src/clj/org/apache/storm/daemon/executor.clj ---
@@ -75,7 +76,7 @@
   "Returns a function that returns a vector of which task indices to send tuple to, or just a single task index."
   [^WorkerTopologyContext context component-id stream-id ^Fields out-fields thrift-grouping ^List target-tasks topo-conf]
   (let [num-tasks (count target-tasks)
-        random (Random.)
+        random (XORShiftRandom.)
--- End diff --

Again here the grouping could be called from multiple threads, but in this case doubling up on some numbers should not be a big deal.
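One way to get a non-thread-safe RNG per executor thread, as suggested earlier in the thread, is a ThreadLocal. A sketch of the pattern (illustrative names, with java.util.Random standing in for the custom generator; this is not Storm's actual grouping code):

```java
import java.util.Arrays;
import java.util.List;
import java.util.Random;

// Sketch: each thread gets its own RNG, so a shuffle-style grouping can pick a
// target task without cross-thread interference on the generator's state.
public class ShuffleChooser {
    // one RNG per calling thread; a real version would hold the custom XORShiftRandom
    private static final ThreadLocal<Random> RNG = ThreadLocal.withInitial(Random::new);

    public static int chooseTask(List<Integer> targetTasks) {
        return targetTasks.get(RNG.get().nextInt(targetTasks.size()));
    }
}
```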
[GitHub] storm pull request: [STORM-1279] port backtype.storm.daemon.superv...
Github user jerrypeng commented on a diff in the pull request: https://github.com/apache/storm/pull/1184#discussion_r57171985

--- Diff: storm-core/src/jvm/org/apache/storm/daemon/supervisor/SyncSupervisorEvent.java ---
@@ -0,0 +1,632 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements. See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership. The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License. You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+package org.apache.storm.daemon.supervisor;
+
+import org.apache.commons.io.FileUtils;
+import org.apache.storm.Config;
+import org.apache.storm.blobstore.BlobStore;
+import org.apache.storm.blobstore.ClientBlobStore;
+import org.apache.storm.cluster.IStateStorage;
+import org.apache.storm.cluster.IStormClusterState;
+import org.apache.storm.event.EventManager;
+import org.apache.storm.generated.*;
+import org.apache.storm.localizer.LocalResource;
+import org.apache.storm.localizer.LocalizedResource;
+import org.apache.storm.localizer.Localizer;
+import org.apache.storm.utils.*;
+import org.apache.thrift.transport.TTransportException;
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+import java.io.File;
+import java.io.FileOutputStream;
+import java.io.IOException;
+import java.net.JarURLConnection;
+import java.net.URL;
+import java.nio.file.Files;
+import java.nio.file.StandardCopyOption;
+import java.util.*;
+import java.util.concurrent.atomic.AtomicInteger;
+
+public class SyncSupervisorEvent implements Runnable {
+
+    private static final Logger LOG = LoggerFactory.getLogger(SyncSupervisorEvent.class);
+
+    private EventManager syncSupEventManager;
+    private EventManager syncProcessManager;
+    private IStormClusterState stormClusterState;
+    private LocalState localState;
+    private SyncProcessEvent syncProcesses;
+    private SupervisorData supervisorData;
+
+    public SyncSupervisorEvent(SupervisorData supervisorData, SyncProcessEvent syncProcesses, EventManager syncSupEventManager,
+            EventManager syncProcessManager) {
+
+        this.syncProcesses = syncProcesses;
+        this.syncSupEventManager = syncSupEventManager;
+        this.syncProcessManager = syncProcessManager;
+        this.stormClusterState = supervisorData.getStormClusterState();
+        this.localState = supervisorData.getLocalState();
+        this.supervisorData = supervisorData;
+    }
+
+    @Override
+    public void run() {
+        try {
+            Map conf = supervisorData.getConf();
+            Runnable syncCallback = new EventManagerPushCallback(this, syncSupEventManager);
+            List stormIds = stormClusterState.assignments(syncCallback);
+            Map> assignmentsSnapshot =
+                    getAssignmentsSnapshot(stormClusterState, stormIds, supervisorData.getAssignmentVersions().get(), syncCallback);
+            Map stormIdToProfilerActions = getProfileActions(stormClusterState, stormIds);
+
+            Set allDownloadedTopologyIds = SupervisorUtils.readDownLoadedStormIds(conf);
+            Map stormcodeMap = readStormCodeLocations(assignmentsSnapshot);
+            Map existingAssignment = localState.getLocalAssignmentsMap();
+            if (existingAssignment == null) {
+                existingAssignment = new HashMap<>();
+            }
+
+            Map allAssignment =
+                    readAssignments(assignmentsSnapshot, existingAssignment, supervisorData.getAssignmentId(), supervisorData.getSyncRetry());
+
+            Map newAssignment = new HashMap<>();
+            Set assignedStormIds = new HashSet<>();
+
+            for (Map.Entry entry : allAssignment.entrySet()) {
+                if (supervisorData.getiSupervisor().confirmAssigned(entry.getKey())) {
+                    newAssignment.put(entry.getKey(), entry.getValue());
+
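The loop at the end of the quoted diff keeps only the assignments that the ISupervisor implementation confirms. A minimal stand-alone sketch of that filtering step, with hypothetical simplified types in place of Storm's LocalAssignment map:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Predicate;

// Sketch of the confirmAssigned filtering step; types and names are simplified
// stand-ins, not the actual SyncSupervisorEvent code.
public class AssignmentFilter {
    public static Map<Integer, String> confirm(Map<Integer, String> allAssignment,
                                               Predicate<Integer> confirmAssigned) {
        Map<Integer, String> newAssignment = new HashMap<>();
        for (Map.Entry<Integer, String> e : allAssignment.entrySet()) {
            if (confirmAssigned.test(e.getKey())) { // mirrors iSupervisor.confirmAssigned(port)
                newAssignment.put(e.getKey(), e.getValue());
            }
        }
        return newAssignment;
    }
}
```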