[GitHub] storm issue #1784: STORM-2205: Racecondition in getting nimbus summaries whi...

2016-11-17 Thread harshach
Github user harshach commented on the issue:

https://github.com/apache/storm/pull/1784
  
+1


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm issue #1779: STORM-1694: Kafka Spout Trident Implementation Using New ...

2016-11-17 Thread harshach
Github user harshach commented on the issue:

https://github.com/apache/storm/pull/1779
  
+1




Re: Request write access to Storm's wiki page on Confluent

2016-11-17 Thread P. Taylor Goetz
Done.

-Taylor

> On Nov 17, 2016, at 1:27 PM, Hugo Da Cruz Louro wrote:
> 
> Hi,
> 
> I would like to request write access to the wiki page on Confluence. I am
> currently registered but I don't have write access.
> 
> I have a couple of benchmarks and performance analyses that I would like to
> share with the community.
> 
> Thanks in advance.
> Best,
> Hugo



Request write access to Storm's wiki page on Confluent

2016-11-17 Thread Hugo Da Cruz Louro
Hi,

I would like to request write access to the wiki page on Confluence. I am
currently registered but I don't have write access.

I have a couple of benchmarks and performance analyses that I would like to
share with the community.

Thanks in advance.
Best,
Hugo


[GitHub] storm pull request #1785: [STORM-2201] Add dynamic scheduler configuration l...

2016-11-17 Thread ppoulosk
GitHub user ppoulosk opened a pull request:

https://github.com/apache/storm/pull/1785

[STORM-2201] Add dynamic scheduler configuration loading.

This adds an interface and two implementations, one that will load from a
local file, and another that will load a config from Artifactory.

It also modifies the ResourceAwareScheduler and Multitenant schedulers to
have a plugin interface and configuration entries for the plugin.

Unit tests are also added for the implementations.
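
A minimal sketch of what such a loader plugin could look like, for readers
following the description above. The names `SchedulerConfigLoader` and
`LocalFileConfigLoader` are hypothetical and are not taken from the pull
request, and the sketch reads a Java properties file only to stay within the
standard library; the format used by the actual patch is not stated in this
thread.

```java
import java.io.IOException;
import java.io.Reader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;
import java.util.Properties;

// Hypothetical plugin interface: a scheduler would call load() whenever it
// wants the latest scheduler configuration.
interface SchedulerConfigLoader {
    Map<String, Object> load() throws IOException;
}

// Hypothetical local-file implementation. It re-reads the file on every call,
// so edits to the file are picked up without restarting the process.
class LocalFileConfigLoader implements SchedulerConfigLoader {
    private final Path path;

    LocalFileConfigLoader(Path path) {
        this.path = path;
    }

    @Override
    public Map<String, Object> load() throws IOException {
        Properties props = new Properties();
        try (Reader reader = Files.newBufferedReader(path, StandardCharsets.UTF_8)) {
            props.load(reader);
        }
        Map<String, Object> conf = new HashMap<>();
        for (String name : props.stringPropertyNames()) {
            conf.put(name, props.getProperty(name));
        }
        return conf;
    }
}

class ConfigLoaderDemo {
    public static void main(String[] args) throws IOException {
        Path file = Files.createTempFile("scheduler", ".properties");
        Files.write(file, "pool.alice=4\n".getBytes(StandardCharsets.UTF_8));

        SchedulerConfigLoader loader = new LocalFileConfigLoader(file);
        System.out.println(loader.load());

        // Simulate an operator editing the file while the scheduler is running.
        Files.write(file, "pool.alice=8\n".getBytes(StandardCharsets.UTF_8));
        System.out.println(loader.load());
    }
}
```

A second implementation that fetches the same data from a remote store would
implement the same interface, so the scheduler would not need to know where
the configuration comes from.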

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/ppoulosk/storm STORM-2201

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/1785.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1785


commit a3024280e0aa1352ff7c0bd79f5d0e04fafff844
Author: Paul Poulosky 
Date:   2016-09-16T19:11:12Z

Add dynamic scheduler configuration loading.  This has an interface and
two implementations, one that will load from a local file, and another
that will load a config from artifactory.

Merge pull request #807 from ppoulosk/YSTORM-3095-redux

[YSTORM-3095]  [YSTORM-3779]  Re-merge and fix Artifactory scheduler plugins

Move to org.apache






[GitHub] storm issue #1739: STORM-1443 [Storm SQL] Support customizing parallelism in...

2016-11-17 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/1739
  
@vesense 
I have actually had some time to think about STORM-2147.
There may be several ways to pass the partition count upstream; the easiest
might be adding a method to DataSourcesProvider. My concern is that we are
only thinking about partition count right now, while Calcite supports table
statistics, which include estimated row count (not available in a streaming
environment), partitioning attributes, etc. Is passing the partition count
alone sufficient? I'm not sure.

Since this sits on top of STORM-1446, which requires an understanding of
Calcite, learning Calcite matters more if you want to keep working on Storm
SQL, especially because Storm SQL lacks reviewers to move it forward. So if
you haven't had the time yet, feel free to take your time to learn it.




[GitHub] storm issue #1739: STORM-1443 [Storm SQL] Support customizing parallelism in...

2016-11-17 Thread vesense
Github user vesense commented on the issue:

https://github.com/apache/storm/pull/1739
  
@HeartSaVioR Overall looks good to me.
I have a question: I'm now working on STORM-2147, which I think should be
based on STORM-1443. With this PR we can set the parallelism by specifying
`PARALLELISM` in SQL. How can I do this in `DataSourcesProvider` (i.e. how do
I set the partition number in data sources: new APIs or something else)?





[GitHub] storm issue #1742: STORM-2170 [Storm SQL] Add built-in socket datasource to ...

2016-11-17 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/1742
  
@vesense Thanks for reviewing, I addressed all of your comments.




Re: [DISCUSS] Release Storm 1.1.0

2016-11-17 Thread Satish Duggana
STORM-2205: Race condition in getting nimbus summaries while ZK
connections are reconnected.

This issue seems to occur in our environments and I would like this to be
part of 1.1.0.

Thanks,
Satish.

On Thu, Nov 17, 2016 at 9:36 AM, Jungtaek Lim  wrote:

> I have no idea on storm-kafka-client, but some bugfix issues for
> storm-kafka-client are waiting for reviewing / merging.
>
> STORM-2014 
> STORM-2087 
> STORM-2104 
>
> If someone can review them in several days it would be great.
>
> I hope that we include the currently open pull requests for Storm SQL so
> that we can release a more usable Storm SQL, but I'm also OK with postponing
> them to the next release if they hold up this one.
>
> STORM-1446 
> STORM-1443 
> STORM-2148 
> STORM-2170 
>
> I can see some pull requests which address Trident implementations for
> storm-kafka-client, storm-mongodb, storm-cassandra.
>
> storm-kafka-client: STORM-1694 (patch for 2.0 is merged, patch for 1.x is
> ready for reviewing)
> storm-cassandra: STORM-1369
> storm-mongodb: STORM-1607
>
> If we want to cut the release now, we could include only bugfix issues and
> postpone others. Otherwise we could discuss and include some or all of the
> above.
>
> What do you think? When do we want to start the release process for 1.1.0?
>
> - Jungtaek Lim (HeartSaVioR)
>
> On Wed, Nov 16, 2016 at 4:11 AM, P. Taylor Goetz wrote:
>
> Thanks Xin, I added it to the 1.1.0 epic.
>
> -Taylor
>
> > On Nov 15, 2016, at 9:01 AM, Xin Wang  wrote:
> >
> > STORM-2198 ( PR: https://github.com/apache/storm/pull/1773 ) fixes a bug
> > in storm-hdfs. Should we consider including it?
> >
> > Thanks,
> > Xin Wang (vesense)
> >
> > 2016-11-15 10:03 GMT+08:00 Jungtaek Lim :
> >
> >> Some issues on Storm SQL are resolved but not documented yet. I'll file an
> >> issue and assign it to the 1.1.0 release epic.
> >> I also want to address dropping aggregation and join in Storm SQL
> >> Trident mode before releasing. I'll assign that too.
> >>
> >> - Jungtaek Lim (HeartSaVioR)
> >>
> >>
> >> On Tue, Nov 15, 2016 at 5:55 AM, P. Taylor Goetz wrote:
> >>
> >>> I think we’re very close. I would like to confirm that the 1.x-branch is
> >>> not affected by STORM-2176.
> >>>
> >>> The worker lifecycle API was added in 1.0, but doesn’t work in any
> >>> released version due to STORM-2176.
> >>>
> >>> If there are any other open JIRAs that anyone is passionate about, now
> >>> would be a good time to assign them to the 1.1.0 release epic (STORM-1856).
> >>>
> >>> -Taylor
> >>>
> >>>
> >>>
>  On Oct 27, 2016, at 12:19 PM, Jungtaek Lim  wrote:
> 
>  Finally Pacemaker H/A, Supervisor V2, and Storm SQL PRs which were
> >> opened
>  at the last mail (4 weeks ago) are all merged to 1.x branch.
> 
>  There're some more PRs on Storm SQL opened, but given that we can
> >> release
>  new minor at any time when we feel it's enough change, I can wait for
> >> it.
>  They didn't get reviewed yet indeed.
> 
>  Is there something else we would want to include it to 1.1.0?
> 
>  Thanks,
>  Jungtaek Lim (HeartSaVioR)
> 
>  On Sat, Oct 1, 2016 at 9:30 AM, Jungtaek Lim wrote:
> 
> > Personally, merging and porting back to three branches are painful
> >>> enough,
> > especially we don't have merging script and having verbose process (I
> >>> mean
> > CHANGELOG).
> > It would be better if merging process is automated (by running script
> >> or
> > so), so I'd +1 to revisit Harsha's suggestion (adopting Kafka merge
> >>> script)
> > and modify script to fit to Storm.
> > (It will not work if it's the case we need to handle PRs for each
> >>> version
> > line, since 'Close' in commit log doesn't close the PR if its target
> >>> branch
> > is not master.)
> >
> > Anyway, without automation I don't want to maintain more version
> >> lines.
> > I'm looking at the announces from other projects, and others are only
> > maintaining two version lines.
> > Since we maintain 2.0.0 version line we can't reduce version lines to
> >> 2,
> > but hopefully at most 3.
> >
> > Btw, let's check pending pull requests and enumerate which can be
> >>> included
> > in 1.0.0, and start/finish review and merge them soon.
> > For me Supervisor V2 and Pacemaker H/A, and 

[GitHub] storm issue #1742: STORM-2170 [Storm SQL] Add built-in socket datasource to ...

2016-11-17 Thread vesense
Github user vesense commented on the issue:

https://github.com/apache/storm/pull/1742
  
Thanks @HeartSaVioR, LGTM +1. I just left several comments.




[GitHub] storm pull request #1742: STORM-2170 [Storm SQL] Add built-in socket datasou...

2016-11-17 Thread vesense
Github user vesense commented on a diff in the pull request:

https://github.com/apache/storm/pull/1742#discussion_r88447536
  
--- Diff: external/sql/storm-sql-runtime/src/jvm/org/apache/storm/sql/runtime/datasource/socket/SocketDataSourcesProvider.java ---
@@ -0,0 +1,94 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.storm.sql.runtime.datasource.socket;
+
+import com.google.common.collect.Lists;
+import org.apache.storm.sql.runtime.DataSource;
+import org.apache.storm.sql.runtime.DataSourcesProvider;
+import org.apache.storm.sql.runtime.FieldInfo;
+import org.apache.storm.sql.runtime.FieldNameExtractor;
+import org.apache.storm.sql.runtime.ISqlTridentDataSource;
+import org.apache.storm.sql.runtime.SimpleSqlTridentConsumer;
+import org.apache.storm.sql.runtime.datasource.socket.trident.SocketState;
+import org.apache.storm.sql.runtime.datasource.socket.trident.SocketStateUpdater;
+import org.apache.storm.sql.runtime.datasource.socket.trident.TridentSocketSpout;
+import org.apache.storm.sql.runtime.serde.json.JsonSerializer;
+import org.apache.storm.trident.spout.ITridentDataSource;
+import org.apache.storm.trident.state.StateFactory;
+import org.apache.storm.trident.state.StateUpdater;
+
+import java.net.URI;
+import java.util.List;
+
+/**
+ * Create a Socket data source based on the URI and properties. The URI 
has the format of
+ * socket://[host]:[port]. Both of host and port are mandatory.
+ *
+ * Note that it connects to given host and port, and receive the message 
if it's used for input source,
+ * and send the message if it's used for output data source.
+ */
+public class SocketDataSourcesProvider implements DataSourcesProvider {
+@Override
+public String scheme() {
+return "socket";
+}
+
+private static class SocketTridentDataSource implements 
ISqlTridentDataSource {
+
+private final List fieldNames;
+private final String host;
+private final int port;
+
+SocketTridentDataSource(List fields, String host, int 
port) {
+this.fieldNames = Lists.transform(fields, new 
FieldNameExtractor());
--- End diff --

Maybe we need an upmerge. Use `FieldInfoUtils.getFieldNames` as the
replacement.
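
To illustrate the kind of helper this comment refers to, here is a small
sketch. It is an assumption-laden stand-in: `FieldInfoStub` and `FieldNames`
are hypothetical classes written for this example, and the real signature of
`FieldInfoUtils.getFieldNames` is not shown in this thread. One difference
worth noting is that Guava's `Lists.transform` returns a lazily evaluated
view of the input list, while the loop below builds an eager copy.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a field descriptor (name plus Java type).
class FieldInfoStub {
    private final String name;
    private final Class<?> type;

    FieldInfoStub(String name, Class<?> type) {
        this.name = name;
        this.type = type;
    }

    String name() {
        return name;
    }

    Class<?> type() {
        return type;
    }
}

// Hypothetical utility that centralizes the "fields to field names" step so
// each data source does not have to repeat the transformation.
class FieldNames {
    static List<String> getFieldNames(List<FieldInfoStub> fields) {
        List<String> names = new ArrayList<>(fields.size());
        for (FieldInfoStub field : fields) {
            names.add(field.name());
        }
        return names;
    }

    public static void main(String[] args) {
        List<FieldInfoStub> fields = Arrays.asList(
                new FieldInfoStub("id", Integer.class),
                new FieldInfoStub("message", String.class));
        System.out.println(getFieldNames(fields));
    }
}
```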




[GitHub] storm pull request #1742: STORM-2170 [Storm SQL] Add built-in socket datasou...

2016-11-17 Thread vesense
Github user vesense commented on a diff in the pull request:

https://github.com/apache/storm/pull/1742#discussion_r88448768
  
--- Diff: external/sql/storm-sql-runtime/src/resources/META-INF/services/org.apache.storm.sql.runtime.DataSourcesProvider ---
@@ -0,0 +1,32 @@
+#
+# Licensed to the Apache Software Foundation (ASF) under one
+# or more contributor license agreements.  See the NOTICE file
+# distributed with this work for additional information
+# regarding copyright ownership.  The ASF licenses this file
+# to you under the Apache License, Version 2.0 (the
+# "License"); you may not use this file except in compliance
+# with the License.  You may obtain a copy of the License at
+#
+# http://www.apache.org/licenses/LICENSE-2.0
+#
+# Unless required by applicable law or agreed to in writing, software
+# distributed under the License is distributed on an "AS IS" BASIS,
+# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+# See the License for the specific language governing permissions and
+# limitations under the License.
+#
+# contributor license agreements.  See the NOTICE file distributed with
--- End diff --

duplicate license




[GitHub] storm pull request #1742: STORM-2170 [Storm SQL] Add built-in socket datasou...

2016-11-17 Thread vesense
Github user vesense commented on a diff in the pull request:

https://github.com/apache/storm/pull/1742#discussion_r88449665
  
--- Diff: external/sql/storm-sql-runtime/src/jvm/org/apache/storm/sql/runtime/datasource/socket/SocketDataSourcesProvider.java ---
@@ -0,0 +1,94 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+package org.apache.storm.sql.runtime.datasource.socket;
+
+import com.google.common.collect.Lists;
+import org.apache.storm.sql.runtime.DataSource;
+import org.apache.storm.sql.runtime.DataSourcesProvider;
+import org.apache.storm.sql.runtime.FieldInfo;
+import org.apache.storm.sql.runtime.FieldNameExtractor;
+import org.apache.storm.sql.runtime.ISqlTridentDataSource;
+import org.apache.storm.sql.runtime.SimpleSqlTridentConsumer;
+import org.apache.storm.sql.runtime.datasource.socket.trident.SocketState;
+import org.apache.storm.sql.runtime.datasource.socket.trident.SocketStateUpdater;
+import org.apache.storm.sql.runtime.datasource.socket.trident.TridentSocketSpout;
+import org.apache.storm.sql.runtime.serde.json.JsonSerializer;
+import org.apache.storm.trident.spout.ITridentDataSource;
+import org.apache.storm.trident.state.StateFactory;
+import org.apache.storm.trident.state.StateUpdater;
+
+import java.net.URI;
+import java.util.List;
+
+/**
+ * Create a Socket data source based on the URI and properties. The URI 
has the format of
+ * socket://[host]:[port]. Both of host and port are mandatory.
+ *
+ * Note that it connects to given host and port, and receive the message 
if it's used for input source,
+ * and send the message if it's used for output data source.
+ */
+public class SocketDataSourcesProvider implements DataSourcesProvider {
+@Override
+public String scheme() {
+return "socket";
+}
+
+private static class SocketTridentDataSource implements 
ISqlTridentDataSource {
+
+private final List fieldNames;
+private final String host;
+private final int port;
+
+SocketTridentDataSource(List fields, String host, int 
port) {
+this.fieldNames = Lists.transform(fields, new 
FieldNameExtractor());
+this.host = host;
+this.port = port;
+}
+
+@Override
+public ITridentDataSource getProducer() {
+return new TridentSocketSpout(fieldNames, host, port);
+}
+
+@Override
+public SqlTridentConsumer getConsumer() {
+StateFactory stateFactory = new SocketState.Factory(host, 
port);
+StateUpdater stateUpdater = new 
SocketStateUpdater(new JsonSerializer(fieldNames));
--- End diff --

Should we use `SerdeUtils.getSerializer` to cover the common cases, or do we
only need to address the `Json` format?
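
As a side note on the Javadoc quoted in this diff, which says the URI has the
form socket://[host]:[port] with both parts mandatory, the check could be
sketched with `java.net.URI` as below. `SocketUriParser` and `Endpoint` are
hypothetical names for this example; the validation actually performed by the
pull request is not shown in the quoted diff.

```java
import java.net.URI;

// Hypothetical helper that validates a socket://host:port URI.
class SocketUriParser {

    // Simple holder for the two mandatory parts of the URI.
    static final class Endpoint {
        final String host;
        final int port;

        Endpoint(String host, int port) {
            this.host = host;
            this.port = port;
        }
    }

    static Endpoint parse(URI uri) {
        if (!"socket".equals(uri.getScheme())) {
            throw new IllegalArgumentException("Expected socket:// scheme: " + uri);
        }
        String host = uri.getHost();   // null when the host is missing
        int port = uri.getPort();      // -1 when the port is missing
        if (host == null || port == -1) {
            throw new IllegalArgumentException("Both host and port are mandatory: " + uri);
        }
        return new Endpoint(host, port);
    }

    public static void main(String[] args) {
        Endpoint endpoint = parse(URI.create("socket://localhost:8080"));
        System.out.println(endpoint.host + " " + endpoint.port);
    }
}
```

Failing fast at parse time keeps a malformed table definition from surfacing
later as a connection error inside the running topology.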


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #1784: STORM-2205: Racecondition in getting nimbus summar...

2016-11-17 Thread satishd
GitHub user satishd opened a pull request:

https://github.com/apache/storm/pull/1784

STORM-2205: Racecondition in getting nimbus summaries while ZK connections 
are reconnected.



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/satishd/storm storm-2205-1.x

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/1784.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #1784


commit a15b46d49a5ca28bfd196ea61e3bba213966a499
Author: Satish Duggana 
Date:   2016-11-17T08:40:19Z

STORM-2205 Handle race condition for nimbuses



