Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread P. Taylor Goetz
The question remains if we want to do this in the 1.1.0 release, or later.

If it's the 1.1.0 release we need to make the changes and cut another RC. I'm 
fine with that, but want to make sure we have consensus before going down that 
road.

-Taylor

> On Mar 24, 2017, at 10:57 PM, Harsha Chintalapani  wrote:
> 
> Agreed, a change like this would be confusing to users. Let's keep the
> original plan of moving non-connector modules out of external instead of
> introducing new changes
> that are not in scope of this discussion.
> My +1 still stands on keeping the current external/storm-* in place and
> move just sql and storm-perf into top-level. We can have discussion for
> storm 2.0 if we want to do
> more changes.
> 
> -Harsha
> 
>> On Fri, Mar 24, 2017 at 4:31 PM P. Taylor Goetz  wrote:
>> 
>> If we decide to change the structure of the distribution like this, I
>> think we should do it in master/2.0. If we want this for 1.1.0 we need to
>> cut a new release candidate.
>> 
>> Changing the distribution file structure can be
>> disruptive for users. Even the change to no longer include connector
>> binaries, as we've learned, will be a headache for some users.
>> 
>> IMHO, from an ops perspective, changes like this should be handled like
>> API changes.
>> 
>> -Taylor
>> 
>>> On Mar 24, 2017, at 4:07 PM, Hugo Da Cruz Louro 
>> wrote:
>>> 
>>> Another possibility is to keep the ‘external' module, and create sub
>> modules under it. The legacy structure would remain intact, while making
>> things more modular. An idea would be:
>>> 
>>> + external
>>>   + connectors
>>>   + tools
>>>   + monitoring
>>>   + etc
>>> 
>>> Hugo
>>> 
 On Mar 24, 2017, at 12:34 PM, P. Taylor Goetz 
>> wrote:
 
 For the background on why “external” was selected, you have to go back
>> to a lengthy discussion in Feb. 2014.
 
 Here’s the start of the thread:
 
 
>> http://mail-archives.apache.org/mod_mbox/storm-dev/201402.mbox/%3cee2bd0e2-254c-47af-8a53-257db7f05...@gmail.com%3e
 
 It continues into March:
 
 
>> http://mail-archives.apache.org/mod_mbox/storm-dev/201403.mbox/%3ccadimvzum1d3om30zayqq4xxe1vjbn7fumqcsgu+524oqgec...@mail.gmail.com%3e
 
 I’m -1 on renaming “external”. That’s the name chosen by the community
>> and it has been the norm for 3 years. Changing it would likely confuse
>> users.
 
 One of the ideas behind “external” was that it would contain components
>> that were not essential to running storm. That line has recently blurred
>> with some non-connector code sneaking in, so I’m okay with moving
>> non-connector code out of external. Another point in that thread was a
>> desire to avoid cluttering up the root directory, so we need to be careful
>> about what the destination for those components is.
 
 -Taylor
 
> On Mar 24, 2017, at 3:11 PM, Hugo Da Cruz Louro <
>> hlo...@hortonworks.com> wrote:
> 
> +1 non-connectors to top level
> +1 to renaming external to connectors
> 
> As for storm-kafka, if we are already touching the external modules,
>> all the modules should be a submodule of a parent module called
>> storm-kafka. I don’t think we should have 3 parent modules as we currently
>> have (storm-kafka, storm-kafka-client, storm-kafka-monitor)
> 
> The structure should be something along the lines (I don’t mean the
>> exact names;  we should find better ones. storm-kafka and
>> storm-kafka-client are not very self explanatory in my opinion)
> 
> + storm-kafka
>   + monitoring
>   + new-client
>   + old-client
> 
> If we have to create new modules or submodules (e.g. under utils) so
>> be it. The code should be in a module that is named after what its doing.
> 
> Hugo
> 
>> On Mar 24, 2017, at 11:15 AM, Priyank Shah 
>> wrote:
>> 
>> +1 to moving non-connectors to top level. I think we should keep
>> storm-kafka-monitor under external or connectors (after renaming).
>> 
>> Jungtaek, just to clarify on what you said regarding storm core
>> referencing storm-kafka-monitor. Like you said its just calling the script
>> from ui jvm. There is no dependency in terms of class files needed to run
>> the script from ui. The script itself adds a –cp argument and all it needs
>> is storm-kafka-monitor jar in classpath. As far as packaging the script is
>> concerned we can do what Satish suggested. i.e. move it to
>> storm-kafka-monitor in source and while packaging put it under bin.
>> Reiterating to make sure I am not mis-understanding anything.
>> 
>> On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:
>> 
>> +1 on moving non-connectors to top-level like sql and storm-perf.
>> 

Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Harsha Chintalapani
Agreed, a change like this would be confusing to users. Let's keep the
original plan of moving non-connector modules out of external instead of
introducing new changes
that are not in scope of this discussion.
My +1 still stands on keeping the current external/storm-* in place and
move just sql and storm-perf into top-level. We can have discussion for
storm 2.0 if we want to do
more changes.

-Harsha

On Fri, Mar 24, 2017 at 4:31 PM P. Taylor Goetz  wrote:

> If we decide to change the structure of the distribution like this, I
> think we should do it in master/2.0. If we want this for 1.1.0 we need to
> cut a new release candidate.
>
> Changing the distribution file structure can be
> disruptive for users. Even the change to no longer include connector
> binaries, as we've learned, will be a headache for some users.
>
> IMHO, from an ops perspective, changes like this should be handled like
> API changes.
>
> -Taylor
>
> > On Mar 24, 2017, at 4:07 PM, Hugo Da Cruz Louro 
> wrote:
> >
> > Another possibility is to keep the ‘external' module, and create sub
> modules under it. The legacy structure would remain intact, while making
> things more modular. An idea would be:
> >
> > + external
> >   + connectors
> >   + tools
> >   + monitoring
> >   + etc
> >
> > Hugo
> >
> >> On Mar 24, 2017, at 12:34 PM, P. Taylor Goetz 
> wrote:
> >>
> >> For the background on why “external” was selected, you have to go back
> to a lengthy discussion in Feb. 2014.
> >>
> >> Here’s the start of the thread:
> >>
> >>
> http://mail-archives.apache.org/mod_mbox/storm-dev/201402.mbox/%3cee2bd0e2-254c-47af-8a53-257db7f05...@gmail.com%3e
> >>
> >> It continues into March:
> >>
> >>
> http://mail-archives.apache.org/mod_mbox/storm-dev/201403.mbox/%3ccadimvzum1d3om30zayqq4xxe1vjbn7fumqcsgu+524oqgec...@mail.gmail.com%3e
> >>
> >> I’m -1 on renaming “external”. That’s the name chosen by the community
> and it has been the norm for 3 years. Changing it would likely confuse
> users.
> >>
> >> One of the ideas behind “external” was that it would contain components
> that were not essential to running storm. That line has recently blurred
> with some non-connector code sneaking in, so I’m okay with moving
> non-connector code out of external. Another point in that thread was a
> desire to avoid cluttering up the root directory, so we need to be careful
> about what the destination for those components is.
> >>
> >> -Taylor
> >>
> >>> On Mar 24, 2017, at 3:11 PM, Hugo Da Cruz Louro <
> hlo...@hortonworks.com> wrote:
> >>>
> >>> +1 non-connectors to top level
> >>> +1 to renaming external to connectors
> >>>
> >>> As for storm-kafka, if we are already touching the external modules,
> all the modules should be a submodule of a parent module called
> storm-kafka. I don’t think we should have 3 parent modules as we currently
> have (storm-kafka, storm-kafka-client, storm-kafka-monitor)
> >>>
> >>> The structure should be something along the lines (I don’t mean the
> exact names;  we should find better ones. storm-kafka and
> storm-kafka-client are not very self explanatory in my opinion)
> >>>
> >>> + storm-kafka
> >>>   + monitoring
> >>>   + new-client
> >>>   + old-client
> >>>
> >>> If we have to create new modules or submodules (e.g. under utils) so
> be it. The code should be in a module that is named after what its doing.
> >>>
> >>> Hugo
> >>>
>  On Mar 24, 2017, at 11:15 AM, Priyank Shah 
> wrote:
> 
>  +1 to moving non-connectors to top level. I think we should keep
> storm-kafka-monitor under external or connectors (after renaming).
> 
>  Jungtaek, just to clarify on what you said regarding storm core
> referencing storm-kafka-monitor. Like you said its just calling the script
> from ui jvm. There is no dependency in terms of class files needed to run
> the script from ui. The script itself adds a –cp argument and all it needs
> is storm-kafka-monitor jar in classpath. As far as packaging the script is
> concerned we can do what Satish suggested. i.e. move it to
> storm-kafka-monitor in source and while packaging put it under bin.
> Reiterating to make sure I am not mis-understanding anything.
> 
>  On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:
> 
>  +1 on moving non-connectors to top-level like sql and storm-perf.
>  Regarding storm-kafka-monitor we can move this into "util" folder or
> keep
>  in the external.
>  -Harsha
> 
>  On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana <
> satish.dugg...@gmail.com>
>  wrote:
> 
> > storm-kafka-monitor is not a connector by itself but it is related
> to kafka
> > connectors. So, any utility related to that connector should be part
> of
> > that connector module(can be a 

Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread P. Taylor Goetz
If we decide to change the structure of the distribution like this, I think we 
should do it in master/2.0. If we want this for 1.1.0 we need to cut a new 
release candidate.

Changing the distribution file structure can be disruptive for 
users. Even the change to no longer include connector binaries, as we've 
learned, will be a headache for some users.

IMHO, from an ops perspective, changes like this should be handled like API 
changes.

-Taylor

> On Mar 24, 2017, at 4:07 PM, Hugo Da Cruz Louro  
> wrote:
> 
> Another possibility is to keep the ‘external' module, and create sub modules 
> under it. The legacy structure would remain intact, while making things more 
> modular. An idea would be:
> 
> + external
>   + connectors
>   + tools
>   + monitoring
>   + etc
> 
> Hugo
> 
>> On Mar 24, 2017, at 12:34 PM, P. Taylor Goetz  wrote:
>> 
>> For the background on why “external” was selected, you have to go back to a 
>> lengthy discussion in Feb. 2014.
>> 
>> Here’s the start of the thread:
>> 
>> http://mail-archives.apache.org/mod_mbox/storm-dev/201402.mbox/%3cee2bd0e2-254c-47af-8a53-257db7f05...@gmail.com%3e
>>  
>> 
>> 
>> It continues into March:
>> 
>> http://mail-archives.apache.org/mod_mbox/storm-dev/201403.mbox/%3ccadimvzum1d3om30zayqq4xxe1vjbn7fumqcsgu+524oqgec...@mail.gmail.com%3e
>> 
>> I’m -1 on renaming “external”. That’s the name chosen by the community and 
>> it has been the norm for 3 years. Changing it would likely confuse users.
>> 
>> One of the ideas behind “external” was that it would contain components that 
>> were not essential to running storm. That line has recently blurred with 
>> some non-connector code sneaking in, so I’m okay with moving non-connector 
>> code out of external. Another point in that thread was a desire to avoid 
>> cluttering up the root directory, so we need to be careful about what the 
>> destination for those components is.
>> 
>> -Taylor
>> 
>>> On Mar 24, 2017, at 3:11 PM, Hugo Da Cruz Louro  
>>> wrote:
>>> 
>>> +1 non-connectors to top level
>>> +1 to renaming external to connectors
>>> 
>>> As for storm-kafka, if we are already touching the external modules, all the 
>>> modules should be a submodule of a parent module called storm-kafka. I 
>>> don’t think we should have 3 parent modules as we currently have 
>>> (storm-kafka, storm-kafka-client, storm-kafka-monitor)
>>> 
>>> The structure should be something along the lines (I don’t mean the exact 
>>> names;  we should find better ones. storm-kafka and storm-kafka-client are 
>>> not very self explanatory in my opinion)
>>> 
>>> + storm-kafka
>>>   + monitoring
>>>   + new-client
>>>   + old-client
>>> 
>>> If we have to create new modules or submodules (e.g. under utils) so be it. 
>>> The code should be in a module that is named after what its doing.
>>> 
>>> Hugo
>>> 
 On Mar 24, 2017, at 11:15 AM, Priyank Shah  wrote:
 
 +1 to moving non-connectors to top level. I think we should keep 
 storm-kafka-monitor under external or connectors (after renaming).
 
 Jungtaek, just to clarify on what you said regarding storm core 
 referencing storm-kafka-monitor. Like you said its just calling the script 
 from ui jvm. There is no dependency in terms of class files needed to run 
 the script from ui. The script itself adds a –cp argument and all it needs 
 is storm-kafka-monitor jar in classpath. As far as packaging the script is 
 concerned we can do what Satish suggested. i.e. move it to 
 storm-kafka-monitor in source and while packaging put it under bin. 
 Reiterating to make sure I am not mis-understanding anything.
 
 On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:
 
 +1 on moving non-connectors to top-level like sql and storm-perf.
 Regarding storm-kafka-monitor we can move this into "util" folder or keep
 in the external.
 -Harsha
 
 On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana 
 wrote:
 
> storm-kafka-monitor is not a connector by itself but it is related to 
> kafka
> connectors. So, any utility related to that connector should be part of
> that connector module(can be a submodule) instead of a top level module.
> core/ui uses this utility referring directly in a hacky way, which we may
> want to fix later. storm-kafka-monitor script exists in bin directory 
> which
> can be moved to storm-kafka-monitor module and the same script can be
> packaged as part of storm/bin directory while packaging the distribution.
> 
> Thanks,
> ~Satish.
> 
>> On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:
>> 
>> storm-kafka-monitor is referred by storm-core, 

[GitHub] storm issue #1924: STORM-2343: New Kafka spout can stop emitting tuples if m...

2017-03-24 Thread srdo
Github user srdo commented on the issue:

https://github.com/apache/storm/pull/1924
  
Very sorry about this, but this is still broken :( 

In the current implementation (pre this PR), the spout can be prevented 
from polling if there are more than maxUncommittedOffsets failed tuples. This 
is because the check for whether to poll isn't accounting for retriable tuples 
already being counted in the numUncommittedOffsets count.

The current fix is to allow failed tuples to retry always, even if 
maxUncommittedOffsets is exceeded, because failed tuples don't contribute 
further to the numUncommittedOffsets count. The problem has been to ensure that 
we don't also emit a bunch of new tuples when polling for retriable tuples, and 
end up ignoring maxUncommittedOffsets entirely. This was partially fixed by 
pausing partitions that have no retriable tuples.

With the previous seeking behavior, this was a complete fix, since 
maxPollRecords put a bound on how far the spout could read from the commit 
offset in a poll that was ignoring maxUncommittedOffsets. With the new seeking 
behavior this is broken. If the spout has emitted as many tuples as allowed, 
and the last (highest offset) tuple fails, the spout may now poll for a full 
batch of new tuples, starting from the failed tuple. This scenario can repeat 
arbitrarily many times, so maxUncommittedOffsets is completely ineffective. 

We don't want to go back to the old seeking behavior (IMO), because it 
meant that in the case where maxPollRecords is much lower than 
maxUncommittedOffsets (almost always), the spout might end up choking on failed 
tuples. For example, if maxPollRecords is 5, and tuples 0-4 are not ready for 
retry (they might have been retried already, and are now waiting for retry 
backoff), but tuples 5-9 are, the spout is unable to retry 5-9 (or anything else 
on that partition) because it keeps seeking back to 0, and polling out the 
first 5 tuples. Seeking directly to the retriable tuples should in most cases 
be more efficient as well, because in the old implementation we'd just be 
seeking to the last committed offset, polling, and discarding tuples until we 
reach the ones that can be retried.
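
The example above can be sketched as a toy model in Java. This is not the spout's code: the class and method names are hypothetical, and the model assumes a single partition where offsets 0-9 have all been emitted and failed, the committed offset is 0, and only 5-9 are past their retry backoff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Toy model of one partition; hypothetical names, not the KafkaSpout source. */
final class SeekSketch {

    /**
     * Models a poll that reads maxPollRecords offsets starting at seekOffset
     * and re-emits only the failed tuples whose retry backoff has expired.
     */
    static List<Long> pollForRetries(long seekOffset, int maxPollRecords, Set<Long> readyForRetry) {
        List<Long> emitted = new ArrayList<>();
        for (long offset = seekOffset; offset < seekOffset + maxPollRecords; offset++) {
            if (readyForRetry.contains(offset)) {
                emitted.add(offset);
            }
        }
        return emitted;
    }
}
```

With the ready set {5, 6, 7, 8, 9} and maxPollRecords 5, seeking to the committed offset 0 (old behavior) re-emits nothing on every poll, while seeking to offset 5 (new behavior) re-emits 5 through 9.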

We could probably fix the broken behavior by trying really hard not to emit 
new tuples when we're ignoring maxUncommittedOffsets, but that seems like it 
would be error prone and complicated to implement.

I think we might be able to fix this by ensuring that we don't 
"double-count" retriable tuples. When the spout is deciding whether to poll, it 
should deduct retriable tuples from numUncommittedOffsets when comparing to 
maxUncommittedOffsets.

Changing the poll check in this way is the same as enforcing the following 
constraint per partition, it seems to me:
* Poll only if `numNonRetriableEmittedTuples < maxUncommittedOffsets`. If 
there are at least that many nonretriable tuples, the poll won't be allowed because 
`numUncommittedOffsets = numRetriableTuples + numNonRetriableEmittedTuples`, so 
`numUncommittedOffsets - numRetriableTuples >= maxUncommittedOffsets`. 
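
A minimal Java sketch of the proposed check, assuming the counts are plain ints. The variable names follow this comment; the class and method names are hypothetical and are not taken from the spout source. The first method is included only for contrast and is an assumption about what a check without the deduction would look like.

```java
/** Sketch of the proposed poll check; hypothetical class, not the KafkaSpout source. */
final class PollCheckSketch {

    /** Contrast case (assumed): retriable tuples still count against the cap. */
    static boolean shouldPollCountingRetries(int numUncommittedOffsets, int maxUncommittedOffsets) {
        return numUncommittedOffsets < maxUncommittedOffsets;
    }

    /** Proposed check: deduct retriable tuples before comparing against the cap. */
    static boolean shouldPoll(int numUncommittedOffsets, int numRetriableTuples,
                              int maxUncommittedOffsets) {
        int numNonRetriableEmittedTuples = numUncommittedOffsets - numRetriableTuples;
        return numNonRetriableEmittedTuples < maxUncommittedOffsets;
    }
}
```

For instance, with a cap of 10 and 10 uncommitted tuples that are all retriable, the contrast check refuses to poll, whereas `shouldPoll(10, 10, 10)` allows it because no nonretriable tuples are outstanding.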

This should mean that the limit on uncommitted tuples on each partition is 
going to be `maxUncommittedOffsets + maxPollRecords - 1`, because the latest 
tuple that can be retried on a partition is the one at offset 
`maxUncommittedOffsets`, where there are `maxUncommittedOffsets - 1` 
uncommitted tuples "to the left". If the retry poll starts at that offset, it 
at most emits the retried tuple plus `maxPollRecords - 1` new tuples.
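
The arithmetic behind that bound can be written out as a small Java helper. This is only a worked version of the reasoning above with hypothetical names; the example values 100 and 5 are made up for illustration.

```java
/** Worked arithmetic for the per-partition bound; hypothetical helper. */
final class UncommittedBoundSketch {

    static int worstCaseUncommitted(int maxUncommittedOffsets, int maxPollRecords) {
        int toTheLeft = maxUncommittedOffsets - 1; // uncommitted tuples before the retried one
        int retried = 1;                           // the retriable tuple the poll starts at
        int newTuples = maxPollRecords - 1;        // rest of the batch, all new tuples
        return toTheLeft + retried + newTuples;    // = maxUncommittedOffsets + maxPollRecords - 1
    }
}
```

With maxUncommittedOffsets = 100 and maxPollRecords = 5, the helper returns 104.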

There shouldn't be any problems when multiple partitions have retriable 
tuples, where retriable tuples on one partition might be able to cause a 
different partition to break the uncommitted offset limit. This is because a 
partition will at minimum contribute 0 to numUncommittedOffsets (e.g. if all 
uncommitted tuples on that partition are retriable), because any retriable 
tuples being subtracted were already counted in numUncommittedOffsets when the 
tuples were originally emitted.

If we can enforce the limit on a per partition basis this way, there's no 
reason to worry about only emitting retriable tuples when we're exceeding 
maxUncommittedOffsets. 

I don't think there's a need for pausing partitions anymore either. It was 
meant to prevent polling for new tuples when there were retriable tuples, but 
we're no longer trying to prevent that, since the per partition cap is already 
ensuring we won't emit too many tuples. Pausing in this case would prioritize 
retriable tuples over new tuples (e.g. in the case where an unpaused consumer 
might choose to fetch from a nonretriable partition even though there are 
retriable tuples), but might lead to lower throughput overall (in the case 
where there are not enough messages on the retriable partitions to fill a 
batch). I've removed it again.

I've put up what I hope is the fix both here and on the 

[GitHub] storm pull request #2026: Eventhub3

2017-03-24 Thread SreeramGarlapati
Github user SreeramGarlapati commented on a diff in the pull request:

https://github.com/apache/storm/pull/2026#discussion_r108012858
  
--- Diff: external/storm-eventhubs/src/main/java/org/apache/storm/eventhubs/spout/EventDataScheme.java ---
@@ -44,27 +43,21 @@
 public class EventDataScheme implements IEventDataScheme {
 
     private static final long serialVersionUID = 1L;
-
+    private static final Logger logger = LoggerFactory.getLogger(EventDataScheme.class);
     @Override
-    public List deserialize(Message message) {
+    public List deserialize(EventData eventData) {
         final List fieldContents = new ArrayList();
-
-        Map metaDataMap = new HashMap();
         String messageData = "";
-
-        for (Section section : message.getPayload()) {
-            if (section instanceof Data) {
-                Data data = (Data) section;
-                messageData = new String(data.getValue().getArray());
-            } else if (section instanceof AmqpValue) {
-                AmqpValue amqpValue = (AmqpValue) section;
-                messageData = amqpValue.getValue().toString();
-            } else if (section instanceof ApplicationProperties) {
-                final ApplicationProperties applicationProperties = (ApplicationProperties) section;
-                metaDataMap = applicationProperties.getValue();
-            }
+        if(eventData.getBytes()!=null)
+            messageData = new String (eventData.getBytes());
+        else if(eventData.getObject()!=null){
+            try{
+                messageData = new String(Serializedeserializeutil.serialize(eventData.getObject()),Charset.defaultCharset());
+            }catch (IOException e){
+                logger.error("Failed to serialize object"+e.toString());
         }
-
+        }
+        Map metaDataMap = eventData.getProperties();
--- End diff --

>Map metaDataMap = eventData.getProperties();

The old solution is not optimal: it will result in too many unnecessary map
objects being initialized; please evaluate whether you want to change it.
If map.count()==0 (i.e., an empty map), don't even return them...
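
One possible reading of this suggestion, sketched in Java. The class and method below are hypothetical and are not part of storm-eventhubs. The sketch also assumes that the scheme is allowed to return a variable number of values, which this thread does not settle; if the declared output fields are fixed, omitting the map would change the tuple arity.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Hypothetical helper illustrating "do not return empty maps". */
final class MetadataFieldSketch {

    static List<Object> buildFields(String messageData, Map<String, Object> properties) {
        List<Object> fieldContents = new ArrayList<>();
        fieldContents.add(messageData);
        // Append the metadata map only when it actually carries entries,
        // so no empty map is allocated or emitted per event.
        if (properties != null && !properties.isEmpty()) {
            fieldContents.add(properties);
        }
        return fieldContents;
    }
}
```

An alternative that keeps the arity fixed would be to pass a shared immutable empty map such as `Collections.emptyMap()` instead of omitting the field.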


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm issue #2027: STORM-2432: Storm-Kafka-Client Trident Spout Seeks Incorr...

2017-03-24 Thread hmcl
Github user hmcl commented on the issue:

https://github.com/apache/storm/pull/2027
  
@satishd Done!




Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Hugo Da Cruz Louro
Another possibility is to keep the ‘external' module, and create sub modules 
under it. The legacy structure would remain intact, while making things more 
modular. An idea would be:

 + external 
   + connectors
   + tools
   + monitoring
   + etc

Hugo

> On Mar 24, 2017, at 12:34 PM, P. Taylor Goetz  wrote:
> 
> For the background on why “external” was selected, you have to go back to a 
> lengthy discussion in Feb. 2014.
> 
> Here’s the start of the thread:
> 
> http://mail-archives.apache.org/mod_mbox/storm-dev/201402.mbox/%3cee2bd0e2-254c-47af-8a53-257db7f05...@gmail.com%3e
>  
> 
> 
> It continues into March:
> 
> http://mail-archives.apache.org/mod_mbox/storm-dev/201403.mbox/%3ccadimvzum1d3om30zayqq4xxe1vjbn7fumqcsgu+524oqgec...@mail.gmail.com%3e
> 
> I’m -1 on renaming “external”. That’s the name chosen by the community and it 
> has been the norm for 3 years. Changing it would likely confuse users.
> 
> One of the ideas behind “external” was that it would contain components that 
> were not essential to running storm. That line has recently blurred with some 
> non-connector code sneaking in, so I’m okay with moving non-connector code 
> out of external. Another point in that thread was a desire to avoid 
> cluttering up the root directory, so we need to be careful about what the 
> destination for those components is.
> 
> -Taylor
> 
>> On Mar 24, 2017, at 3:11 PM, Hugo Da Cruz Louro  
>> wrote:
>> 
>> +1 non-connectors to top level
>> +1 to renaming external to connectors
>> 
>> As for storm-kafka, if we are already touching the external modules, all the 
>> modules should be a submodule of a parent module called storm-kafka. I don’t 
>> think we should have 3 parent modules as we currently have (storm-kafka, 
>> storm-kafka-client, storm-kafka-monitor)
>> 
>> The structure should be something along the lines (I don’t mean the exact 
>> names;  we should find better ones. storm-kafka and storm-kafka-client are 
>> not very self explanatory in my opinion)
>> 
>> + storm-kafka
>>  + monitoring
>>  + new-client
>>  + old-client
>> 
>> If we have to create new modules or submodules (e.g. under utils) so be it. 
>> The code should be in a module that is named after what its doing.
>> 
>> Hugo
>> 
>>> On Mar 24, 2017, at 11:15 AM, Priyank Shah  wrote:
>>> 
>>> +1 to moving non-connectors to top level. I think we should keep 
>>> storm-kafka-monitor under external or connectors (after renaming).
>>> 
>>> Jungtaek, just to clarify on what you said regarding storm core referencing 
>>> storm-kafka-monitor. Like you said its just calling the script from ui jvm. 
>>> There is no dependency in terms of class files needed to run the script 
>>> from ui. The script itself adds a –cp argument and all it needs is 
>>> storm-kafka-monitor jar in classpath. As far as packaging the script is 
>>> concerned we can do what Satish suggested. i.e. move it to 
>>> storm-kafka-monitor in source and while packaging put it under bin. 
>>> Reiterating to make sure I am not mis-understanding anything.
>>> 
>>> On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:
>>> 
>>>  +1 on moving non-connectors to top-level like sql and storm-perf.
>>>  Regarding storm-kafka-monitor we can move this into "util" folder or keep
>>>  in the external.
>>>  -Harsha
>>> 
>>>  On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana 
>>>  wrote:
>>> 
 storm-kafka-monitor is not a connector by itself but it is related to kafka
 connectors. So, any utility related to that connector should be part of
 that connector module(can be a submodule) instead of a top level module.
 core/ui uses this utility referring directly in a hacky way, which we may
 want to fix later. storm-kafka-monitor script exists in bin directory which
 can be moved to storm-kafka-monitor module and the same script can be
 packaged as part of storm/bin directory while packaging the distribution.
 
 Thanks,
 ~Satish.
 
 On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:
 
> storm-kafka-monitor is referred by storm-core, though it's referenced via
> executing command. Yes it's a bit odd to place it as top directory, but
> it's not a connector for that reason too. Neither is ideal for me, so
> ironically, either is fine.
> 
> - Jungtaek Lim (HeartSaVioR)
> 
> On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:
> 
>> +1 except for storm-kafka-monitor module as this utility is more about
>> querying topic/partition offsets of kafka spouts in a topology. Do not
 we
>> want to push this module into connectors/kafka as a submodule along
 with
>> other submodules including old/new kafka spout modules?
>> 

Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread P. Taylor Goetz
For the background on why “external” was selected, you have to go back to a 
lengthy discussion in Feb. 2014.

Here’s the start of the thread:

http://mail-archives.apache.org/mod_mbox/storm-dev/201402.mbox/%3cee2bd0e2-254c-47af-8a53-257db7f05...@gmail.com%3e
 


It continues into March:

http://mail-archives.apache.org/mod_mbox/storm-dev/201403.mbox/%3ccadimvzum1d3om30zayqq4xxe1vjbn7fumqcsgu+524oqgec...@mail.gmail.com%3e

I’m -1 on renaming “external”. That’s the name chosen by the community and it 
has been the norm for 3 years. Changing it would likely confuse users.

One of the ideas behind “external” was that it would contain components that 
were not essential to running storm. That line has recently blurred with some 
non-connector code sneaking in, so I’m okay with moving non-connector code out 
of external. Another point in that thread was a desire to avoid cluttering up 
the root directory, so we need to be careful about what the destination for 
those components is.

-Taylor

> On Mar 24, 2017, at 3:11 PM, Hugo Da Cruz Louro  
> wrote:
> 
> +1 non-connectors to top level
> +1 to renaming external to connectors
> 
> As for storm-kafka, if we are already touching the external modules, all the 
> modules should be a submodule of a parent module called storm-kafka. I don’t 
> think we should have 3 parent modules as we currently have (storm-kafka, 
> storm-kafka-client, storm-kafka-monitor)
> 
> The structure should be something along the lines (I don’t mean the exact 
> names;  we should find better ones. storm-kafka and storm-kafka-client are 
> not very self explanatory in my opinion)
> 
> + storm-kafka
>   + monitoring
>   + new-client
>   + old-client
> 
> If we have to create new modules or submodules (e.g. under utils) so be it. 
> The code should be in a module that is named after what its doing.
> 
> Hugo
> 
>> On Mar 24, 2017, at 11:15 AM, Priyank Shah  wrote:
>> 
>> +1 to moving non-connectors to top level. I think we should keep 
>> storm-kafka-monitor under external or connectors (after renaming).
>> 
>> Jungtaek, just to clarify on what you said regarding storm core referencing 
>> storm-kafka-monitor. Like you said its just calling the script from ui jvm. 
>> There is no dependency in terms of class files needed to run the script from 
>> ui. The script itself adds a –cp argument and all it needs is 
>> storm-kafka-monitor jar in classpath. As far as packaging the script is 
>> concerned we can do what Satish suggested. i.e. move it to 
>> storm-kafka-monitor in source and while packaging put it under bin. 
>> Reiterating to make sure I am not mis-understanding anything.
>> 
>> On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:
>> 
>>   +1 on moving non-connectors to top-level like sql and storm-perf.
>>   Regarding storm-kafka-monitor we can move this into "util" folder or keep
>>   in the external.
>>   -Harsha
>> 
>>   On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana 
>>   wrote:
>> 
>>> storm-kafka-monitor is not a connector by itself but it is related to kafka
>>> connectors. So, any utility related to that connector should be part of
>>> that connector module(can be a submodule) instead of a top level module.
>>> core/ui uses this utility referring directly in a hacky way, which we may
>>> want to fix later. storm-kafka-monitor script exists in bin directory which
>>> can be moved to storm-kafka-monitor module and the same script can be
>>> packaged as part of storm/bin directory while packaging the distribution.
>>> 
>>> Thanks,
>>> ~Satish.
>>> 
>>> On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:
>>> 
 storm-kafka-monitor is referred by storm-core, though it's referenced via
 executing command. Yes it's a bit odd to place it as top directory, but
 it's not a connector for that reason too. Neither is ideal for me, so
 ironically, either is fine.
 
 - Jungtaek Lim (HeartSaVioR)
 
 On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:
 
> +1 except for storm-kafka-monitor module as this utility is more about
> querying topic/partition offsets of kafka spouts in a topology. Do not
>>> we
> want to push this module into connectors/kafka as a submodule along
>>> with
> other submodules including old/new kafka spout modules?
> 
> Thanks,
> Satish.
> 
> 
> On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer 
 wrote:
> 
>> +1
>> 
>> Makes sense to move the non-connectors to top level and keep only the
>> connectors under “connectors” folder.
>> 
>> 
>> On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
>> 
>>> (Sent this yesterday but can't find this from storm-dev mbox...
 sending
> 

Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Hugo Da Cruz Louro
+1 non-connectors to top level
+1 to renaming external to connectors

As for storm-kafka, if we are already touching the external modules, all the 
modules should be a submodule of a parent module called storm-kafka. I don’t 
think we should have 3 parent modules as we currently have (storm-kafka, 
storm-kafka-client, storm-kafka-monitor)

The structure should be something along the lines (I don’t mean the exact 
names;  we should find better ones. storm-kafka and storm-kafka-client are not 
very self explanatory in my opinion)

+ storm-kafka
   + monitoring
   + new-client
   + old-client

If we have to create new modules or submodules (e.g. under utils), so be it. The 
code should be in a module that is named after what it's doing.

Hugo
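
To make the layout above concrete, here is a rough sketch of how the three current Kafka modules might map onto it. The mapping is an assumption for illustration (storm-kafka as the old client, storm-kafka-client as the new one), and the submodule names are the placeholders from the tree above, not agreed names. `proposed_path` and `PROPOSED_LAYOUT` are hypothetical helpers, not anything in the Storm build.

```python
# Hypothetical mapping from current module names to the nested layout
# sketched in the tree above; submodule names are placeholders only.
PROPOSED_LAYOUT = {
    "storm-kafka": "storm-kafka/old-client",
    "storm-kafka-client": "storm-kafka/new-client",
    "storm-kafka-monitor": "storm-kafka/monitoring",
}


def proposed_path(module, root="external"):
    """Return where a module would live under the proposed layout."""
    # Modules without an entry in the mapping keep their current location.
    return f"{root}/{PROPOSED_LAYOUT.get(module, module)}"


for name in ("storm-kafka", "storm-kafka-client", "storm-kafka-monitor", "storm-hdfs"):
    print(f"{name} -> {proposed_path(name)}")
```

Passing root="connectors" would show the same layout after the rename discussed in this thread.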

> On Mar 24, 2017, at 11:15 AM, Priyank Shah  wrote:
> 
> +1 to moving non-connectors to top level. I think we should keep 
> storm-kafka-monitor under external or connectors (after renaming).
> 
> Jungtaek, just to clarify on what you said regarding storm core referencing 
> storm-kafka-monitor. Like you said, it's just calling the script from the ui jvm. 
> There is no dependency in terms of class files needed to run the script from 
> ui. The script itself adds a -cp argument and all it needs is the 
> storm-kafka-monitor jar in classpath. As far as packaging the script is 
> concerned, we can do what Satish suggested, i.e. move it to 
> storm-kafka-monitor in source and while packaging put it under bin. 
> Reiterating to make sure I am not misunderstanding anything.
> 
> On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:
> 
>+1 on moving non-connectors to top-level like sql and storm-perf.
>Regarding storm-kafka-monitor we can move this into "util" folder or keep
>in the external.
>-Harsha
> 
>On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana 
>wrote:
> 
>> storm-kafka-monitor is not a connector by itself but it is related to kafka
>> connectors. So, any utility related to that connector should be part of
>> that connector module(can be a submodule) instead of a top level module.
>> core/ui uses this utility referring directly in a hacky way, which we may
>> want to fix later. storm-kafka-monitor script exists in bin directory which
>> can be moved to storm-kafka-monitor module and the same script can be
>> packaged as part of storm/bin directory while packaging the distribution.
>> 
>> Thanks,
>> ~Satish.
>> 
>> On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:
>> 
>>> storm-kafka-monitor is referred by storm-core, though it's referenced via
>>> executing command. Yes it's a bit odd to place it as top directory, but
>>> it's not a connector for that reason too. Neither is ideal for me, so
>>> ironically, either is fine.
>>> 
>>> - Jungtaek Lim (HeartSaVioR)
>>> 
>>> On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:
>>> 
 +1 except for storm-kafka-monitor module as this utility is more about
 querying topic/partition offsets of kafka spouts in a topology. Do not
>> we
 want to push this module into connectors/kafka as a submodule along
>> with
 other submodules including old/new kafka spout modules?
 
 Thanks,
 Satish.
 
 
 On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer 
>>> wrote:
 
> +1
> 
> Makes sense to move the non-connectors to top level and keep only the
> connectors under “connectors” folder.
> 
> 
> On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
> 
>> (Sent this yesterday but can't find this from storm-dev mbox...
>>> sending
 it
>> again)
>> 
>> Hi dev,
>> 
>> I'd like to start discussion regarding moving non-connectors modules
>>> out
> of
>> external, maybe top directory.
>> 
>> "external" directory has non-connectors (SQL, Flux,
>>> storm-kafka-monitor,
>> storm-submit-tools), and except Flux, others should be placed to the
> binary
>> dist. since Storm itself (not from user topology) needs to refer
>> them.
>> 
>> They're actually tied to the core of Storm, so I feel that it would
>> be
>> better to treat them (including Flux) as non-external, maybe same
>>> level
 as
>> storm-core.
>> (I'm not sure what "external" actually means for Storm project btw.)
>> 
>> In addition, after doing that I'd like to change the directory name
>> "external" to "connector" or so, so that the name could be
 self-describing
>> and we can only place connectors to that directory.
>> (I know it would be painful for already opened pull requests, so no
 strong
>> opinion regarding this.)
>> 
>> Looking forward to your opinion!
>> 
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
> 
> 
 
>>> 
>> 
> 
> 



Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Priyank Shah
+1 to moving non-connectors to top level. I think we should keep 
storm-kafka-monitor under external or connectors (after renaming).

Jungtaek, just to clarify on what you said regarding storm core referencing 
storm-kafka-monitor. Like you said, it's just calling the script from the ui jvm. 
There is no dependency in terms of class files needed to run the script from 
ui. The script itself adds a -cp argument and all it needs is the 
storm-kafka-monitor jar in classpath. As far as packaging the script is 
concerned, we can do what Satish suggested, i.e. move it to storm-kafka-monitor 
in source and while packaging put it under bin. Reiterating to make sure I am 
not misunderstanding anything.
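
To illustrate the point about there being no class-file dependency, here is a minimal sketch of how a caller could launch the tool as a separate JVM process with only the monitor jar on the classpath. This is not the actual script or the UI code; the jar path, main class name and tool arguments below are all hypothetical placeholders.

```python
import os
import shlex


def build_monitor_command(jar_paths, main_class, tool_args, java="java"):
    """Build the argv for running a tool in its own JVM via -cp."""
    if not jar_paths:
        raise ValueError("at least one jar is required on the classpath")
    # Only the tool's own jar(s) go on the classpath; nothing from the
    # caller's classpath is needed because the tool runs as a child process.
    classpath = os.pathsep.join(jar_paths)
    return [java, "-cp", classpath, main_class, *tool_args]


# Hypothetical values for illustration only.
cmd = build_monitor_command(
    ["/opt/storm/lib/storm-kafka-monitor.jar"],  # hypothetical jar location
    "example.monitor.Main",                      # hypothetical main class
    ["--topics", "events"],                      # hypothetical arguments
)
print(shlex.join(cmd))
```

The resulting list could be handed to subprocess.run(cmd, capture_output=True) by whichever process needs the output, which is why the script's location in the source tree and its location in the packaged bin directory can be decided independently.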

On 3/24/17, 9:14 AM, "Harsha Chintalapani"  wrote:

+1 on moving non-connectors to top-level like sql and storm-perf.
Regarding storm-kafka-monitor we can move this into "util" folder or keep
in the external.
-Harsha

On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana 
wrote:

> storm-kafka-monitor is not a connector by itself but it is related to 
kafka
> connectors. So, any utility related to that connector should be part of
> that connector module(can be a submodule) instead of a top level module.
> core/ui uses this utility referring directly in a hacky way, which we may
> want to fix later. storm-kafka-monitor script exists in bin directory 
which
> can be moved to storm-kafka-monitor module and the same script can be
> packaged as part of storm/bin directory while packaging the distribution.
>
> Thanks,
> ~Satish.
>
> On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:
>
> > storm-kafka-monitor is referred by storm-core, though it's referenced 
via
> > executing command. Yes it's a bit odd to place it as top directory, but
> > it's not a connector for that reason too. Neither is ideal for me, so
> > ironically, either is fine.
> >
> > - Jungtaek Lim (HeartSaVioR)
> >
> > On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:
> >
> > > +1 except for storm-kafka-monitor module as this utility is more about
> > > querying topic/partition offsets of kafka spouts in a topology. Do not
> we
> > > want to push this module into connectors/kafka as a submodule along
> with
> > > other submodules including old/new kafka spout modules?
> > >
> > > Thanks,
> > > Satish.
> > >
> > >
> > > On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer 
> > wrote:
> > >
> > > > +1
> > > >
> > > > Makes sense to move the non-connectors to top level and keep only 
the
> > > > connectors under “connectors” folder.
> > > >
> > > >
> > > > On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
> > > >
> > > > >(Sent this yesterday but can't find this from storm-dev mbox...
> > sending
> > > it
> > > > >again)
> > > > >
> > > > >Hi dev,
> > > > >
> > > > >I'd like to start discussion regarding moving non-connectors 
modules
> > out
> > > > of
> > > > >external, maybe top directory.
> > > > >
> > > > >"external" directory has non-connectors (SQL, Flux,
> > storm-kafka-monitor,
> > > > >storm-submit-tools), and except Flux, others should be placed to 
the
> > > > binary
> > > > >dist. since Storm itself (not from user topology) needs to refer
> them.
> > > > >
> > > > >They're actually tied to the core of Storm, so I feel that it would
> be
> > > > >better to treat them (including Flux) as non-external, maybe same
> > level
> > > as
> > > > >storm-core.
> > > > >(I'm not sure what "external" actually means for Storm project 
btw.)
> > > > >
> > > > >In addition, after doing that I'd like to change the directory name
> > > > >"external" to "connector" or so, so that the name could be
> > > self-describing
> > > > >and we can only place connectors to that directory.
> > > > >(I know it would be painful for already opened pull requests, so no
> > > strong
> > > > >opinion regarding this.)
> > > > >
> > > > >Looking forward to your opinion!
> > > > >
> > > > >Thanks,
> > > > >Jungtaek Lim (HeartSaVioR)
> > > >
> > > >
> > >
> >
>




Re: [VOTE] Release Apache Storm 1.1.0 (RC3)

2017-03-24 Thread Hugo Da Cruz Louro
+1 (binding)
- unzipped .zip sources and binaries
- mvn clean install in unzipped sources
- created uber jar for storm-solr and storm-kafka-client
- tested storm-solr and storm-kafka-client in local cluster mode and remote 
cluster mode

The issue I mentioned earlier can safely be downgraded to critical, as after 
further evaluation I consider there are workarounds. Furthermore, the fix can 
easily be incorporated in a minor release, as I stated in the follow-up 
discussion emails.

Thanks
Hugo



On Mar 24, 2017, at 12:11 AM, Jungtaek Lim wrote:

1. I'm OK to contain uber jar only for storm-starter, as example for
starter.
The size should be smaller enough if we remove connector dependencies in
storm-starter. Currently it depends on storm-hdfs, storm-hbase,
storm-redis, and we could move out some topologies to remove dependencies.

2. Yeah maintaining doc page for connector in website is fine for me. More
clearly, either is fine.

3. SQL runner in Storm SQL compiles SQL to trident topology, so in order to
run SQL runner, binary dist. needs to have dependencies for SQL runner. Not
same as other connectors.
We might want to make uber jar for Storm SQL core and put to binary dist.
to make it clear.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Fri, Mar 24, 2017 at 3:52 PM, Arun Mahadevan wrote:

Could you cast your vote? If you are still not satisfied with excluding
jars you can cast -0 or even -1.

I am not fully convinced with the current binary distribution.

1. Why do we expect users to build the examples source from the binary
distribution? Since these are most likely to be used by new users they will
find it difficult if the build breaks. I checked other distributions like
spark and they have the example jar inside the binary, but the size is
pretty small. If we remove the shading and only keep the storm-starter.jar
the size will be pretty small. Other option is to release a separate binary
(like apache-storm-examples-xyz.jar) with just the example jars.
2. Keeping the connectors out of the binary is good. We should also remove
the directories (with only the README.md). The users can find this info
from the website.
3. Storm-sql jars are kept in the binary. Is there some reason? May be
this should also be removed from the binary to be consistent.

Thanks,
Arun

On 3/23/17, 7:34 PM, "Jungtaek Lim" wrote:

+1 to the latter.

I'm in favor of documenting the change to release note, and also docs so
that website can be reflected. The users who are affected to the change
wouldn't be much, since using dependency management tool (Maven, Gradle,
and so on) has been recommended for creating topology jar.

For me it's not a blocker for release.

Arun, I initiated another thread to discuss moving non-connectors to the
top directory.
Could you cast your vote? If you are still not satisfied with excluding
jars you can cast -0 or even -1.

- Jungtaek Lim (HeartSaVioR)

On Thu, Mar 23, 2017 at 10:43 PM, P. Taylor Goetz wrote:

Do we want to cancel this RC in order to better document the changes, or
will documenting it in the release announcement suffice for now
(provided
documentation is added for subsequent releases)?

I’m partial to the latter, but am open to others’ opinions.

-Taylor


On Mar 22, 2017, at 9:49 AM, Bobby Evans wrote:

+1 I built from the tag and ran using a single node cluster.
The examples and external components are excluded because they are
huge. Because of shading we distribute the same copy of them multiple
times.
I agree with Alexandre. We should document this change better, because
it is confusing for people to get a release that used to have these in it,
but does not any more.


- Bobby

On Tuesday, March 21, 2017, 10:46:38 PM CDT, Arun Mahadevan <
ar...@apache.org> wrote:

Verified the artifacts. Compiled examples and ran some sample topologies.
Looks good.

BTW, why are the external modules excluded from the binaries (the .zip
and .tar.gz). Isn’t it better if the binary distribution includes them?
Maybe it was already discussed but I am missing it. The sql directory
however seems to include the jars so it looks inconsistent.

- Arun


On 3/22/17, 12:56 AM, "P. Taylor Goetz" wrote:

This is a call to vote on releasing Apache Storm 1.1.0 (rc3)

Full list of changes in this release:



https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;h=68fbab3c4f91359bd397d93a157830542839b002;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31

The tag/commit to be voted upon is v1.1.0:



https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=7fa62404feb6b86b3143c851b46237580720eb6b;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31


Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Harsha Chintalapani
+1 on moving non-connectors to top-level like sql and storm-perf.
Regarding storm-kafka-monitor we can move this into "util" folder or keep
in the external.
-Harsha

On Fri, Mar 24, 2017 at 2:23 AM Satish Duggana 
wrote:

> storm-kafka-monitor is not a connector by itself but it is related to kafka
> connectors. So, any utility related to that connector should be part of
> that connector module(can be a submodule) instead of a top level module.
> core/ui uses this utility referring directly in a hacky way, which we may
> want to fix later. storm-kafka-monitor script exists in bin directory which
> can be moved to storm-kafka-monitor module and the same script can be
> packaged as part of storm/bin directory while packaging the distribution.
>
> Thanks,
> ~Satish.
>
> On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:
>
> > storm-kafka-monitor is referred by storm-core, though it's referenced via
> > executing command. Yes it's a bit odd to place it as top directory, but
> > it's not a connector for that reason too. Neither is ideal for me, so
> > ironically, either is fine.
> >
> > - Jungtaek Lim (HeartSaVioR)
> >
> > On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:
> >
> > > +1 except for storm-kafka-monitor module as this utility is more about
> > > querying topic/partition offsets of kafka spouts in a topology. Do not
> we
> > > want to push this module into connectors/kafka as a submodule along
> with
> > > other submodules including old/new kafka spout modules?
> > >
> > > Thanks,
> > > Satish.
> > >
> > >
> > > On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer 
> > wrote:
> > >
> > > > +1
> > > >
> > > > Makes sense to move the non-connectors to top level and keep only the
> > > > connectors under “connectors” folder.
> > > >
> > > >
> > > > On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
> > > >
> > > > >(Sent this yesterday but can't find this from storm-dev mbox...
> > sending
> > > it
> > > > >again)
> > > > >
> > > > >Hi dev,
> > > > >
> > > > >I'd like to start discussion regarding moving non-connectors modules
> > out
> > > > of
> > > > >external, maybe top directory.
> > > > >
> > > > >"external" directory has non-connectors (SQL, Flux,
> > storm-kafka-monitor,
> > > > >storm-submit-tools), and except Flux, others should be placed to the
> > > > binary
> > > > >dist. since Storm itself (not from user topology) needs to refer
> them.
> > > > >
> > > > >They're actually tied to the core of Storm, so I feel that it would
> be
> > > > >better to treat them (including Flux) as non-external, maybe same
> > level
> > > as
> > > > >storm-core.
> > > > >(I'm not sure what "external" actually means for Storm project btw.)
> > > > >
> > > > >In addition, after doing that I'd like to change the directory name
> > > > >"external" to "connector" or so, so that the name could be
> > > self-describing
> > > > >and we can only place connectors to that directory.
> > > > >(I know it would be painful for already opened pull requests, so no
> > > strong
> > > > >opinion regarding this.)
> > > > >
> > > > >Looking forward to your opinion!
> > > > >
> > > > >Thanks,
> > > > >Jungtaek Lim (HeartSaVioR)
> > > >
> > > >
> > >
> >
>


[GitHub] storm pull request #2028: Fix headers

2017-03-24 Thread zinenko
GitHub user zinenko opened a pull request:

https://github.com/apache/storm/pull/2028

Fix headers



You can merge this pull request into a Git repository by running:

$ git pull https://github.com/zinenko/storm patch-1

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/storm/pull/2028.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #2028


commit 37665b5c51cae70ae021ff8ab2f1d43b1e1d607b
Author: AZ 
Date:   2017-03-24T09:56:08Z

Fix headers






Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Satish Duggana
storm-kafka-monitor is not a connector by itself but it is related to kafka
connectors. So, any utility related to that connector should be part of
that connector module(can be a submodule) instead of a top level module.
core/ui uses this utility referring directly in a hacky way, which we may
want to fix later. storm-kafka-monitor script exists in bin directory which
can be moved to storm-kafka-monitor module and the same script can be
packaged as part of storm/bin directory while packaging the distribution.

Thanks,
~Satish.

On Fri, Mar 24, 2017 at 1:07 PM, Jungtaek Lim  wrote:

> storm-kafka-monitor is referred by storm-core, though it's referenced via
> executing command. Yes it's a bit odd to place it as top directory, but
> it's not a connector for that reason too. Neither is ideal for me, so
> ironically, either is fine.
>
> - Jungtaek Lim (HeartSaVioR)
>
> On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:
>
> > +1 except for storm-kafka-monitor module as this utility is more about
> > querying topic/partition offsets of kafka spouts in a topology. Do not we
> > want to push this module into connectors/kafka as a submodule along with
> > other submodules including old/new kafka spout modules?
> >
> > Thanks,
> > Satish.
> >
> >
> > On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer 
> wrote:
> >
> > > +1
> > >
> > > Makes sense to move the non-connectors to top level and keep only the
> > > connectors under “connectors” folder.
> > >
> > >
> > > On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
> > >
> > > >(Sent this yesterday but can't find this from storm-dev mbox...
> sending
> > it
> > > >again)
> > > >
> > > >Hi dev,
> > > >
> > > >I'd like to start discussion regarding moving non-connectors modules
> out
> > > of
> > > >external, maybe top directory.
> > > >
> > > >"external" directory has non-connectors (SQL, Flux,
> storm-kafka-monitor,
> > > >storm-submit-tools), and except Flux, others should be placed to the
> > > binary
> > > >dist. since Storm itself (not from user topology) needs to refer them.
> > > >
> > > >They're actually tied to the core of Storm, so I feel that it would be
> > > >better to treat them (including Flux) as non-external, maybe same
> level
> > as
> > > >storm-core.
> > > >(I'm not sure what "external" actually means for Storm project btw.)
> > > >
> > > >In addition, after doing that I'd like to change the directory name
> > > >"external" to "connector" or so, so that the name could be
> > self-describing
> > > >and we can only place connectors to that directory.
> > > >(I know it would be painful for already opened pull requests, so no
> > strong
> > > >opinion regarding this.)
> > > >
> > > >Looking forward to your opinion!
> > > >
> > > >Thanks,
> > > >Jungtaek Lim (HeartSaVioR)
> > >
> > >
> >
>


Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Jungtaek Lim
storm-kafka-monitor is referred by storm-core, though it's referenced via
executing command. Yes it's a bit odd to place it as top directory, but
it's not a connector for that reason too. Neither is ideal for me, so
ironically, either is fine.

- Jungtaek Lim (HeartSaVioR)

On Fri, Mar 24, 2017 at 4:19 PM, Satish Duggana wrote:

> +1 except for storm-kafka-monitor module as this utility is more about
> querying topic/partition offsets of kafka spouts in a topology. Do not we
> want to push this module into connectors/kafka as a submodule along with
> other submodules including old/new kafka spout modules?
>
> Thanks,
> Satish.
>
>
> On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer  wrote:
>
> > +1
> >
> > Makes sense to move the non-connectors to top level and keep only the
> > connectors under “connectors” folder.
> >
> >
> > On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
> >
> > >(Sent this yesterday but can't find this from storm-dev mbox... sending
> it
> > >again)
> > >
> > >Hi dev,
> > >
> > >I'd like to start discussion regarding moving non-connectors modules out
> > of
> > >external, maybe top directory.
> > >
> > >"external" directory has non-connectors (SQL, Flux, storm-kafka-monitor,
> > >storm-submit-tools), and except Flux, others should be placed to the
> > binary
> > >dist. since Storm itself (not from user topology) needs to refer them.
> > >
> > >They're actually tied to the core of Storm, so I feel that it would be
> > >better to treat them (including Flux) as non-external, maybe same level
> as
> > >storm-core.
> > >(I'm not sure what "external" actually means for Storm project btw.)
> > >
> > >In addition, after doing that I'd like to change the directory name
> > >"external" to "connector" or so, so that the name could be
> self-describing
> > >and we can only place connectors to that directory.
> > >(I know it would be painful for already opened pull requests, so no
> strong
> > >opinion regarding this.)
> > >
> > >Looking forward to your opinion!
> > >
> > >Thanks,
> > >Jungtaek Lim (HeartSaVioR)
> >
> >
>


Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Satish Duggana
+1 except for storm-kafka-monitor module as this utility is more about
querying topic/partition offsets of kafka spouts in a topology. Do not we
want to push this module into connectors/kafka as a submodule along with
other submodules including old/new kafka spout modules?

Thanks,
Satish.


On Fri, Mar 24, 2017 at 12:10 PM, Arun Iyer  wrote:

> +1
>
> Makes sense to move the non-connectors to top level and keep only the
> connectors under “connectors” folder.
>
>
> On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:
>
> >(Sent this yesterday but can't find this from storm-dev mbox... sending it
> >again)
> >
> >Hi dev,
> >
> >I'd like to start discussion regarding moving non-connectors modules out
> of
> >external, maybe top directory.
> >
> >"external" directory has non-connectors (SQL, Flux, storm-kafka-monitor,
> >storm-submit-tools), and except Flux, others should be placed to the
> binary
> >dist. since Storm itself (not from user topology) needs to refer them.
> >
> >They're actually tied to the core of Storm, so I feel that it would be
> >better to treat them (including Flux) as non-external, maybe same level as
> >storm-core.
> >(I'm not sure what "external" actually means for Storm project btw.)
> >
> >In addition, after doing that I'd like to change the directory name
> >"external" to "connector" or so, so that the name could be self-describing
> >and we can only place connectors to that directory.
> >(I know it would be painful for already opened pull requests, so no strong
> >opinion regarding this.)
> >
> >Looking forward to your opinion!
> >
> >Thanks,
> >Jungtaek Lim (HeartSaVioR)
>
>


Re: [VOTE] Release Apache Storm 1.1.0 (RC3)

2017-03-24 Thread Jungtaek Lim
1. I'm OK to contain uber jar only for storm-starter, as example for
starter.
The size should be smaller enough if we remove connector dependencies in
storm-starter. Currently it depends on storm-hdfs, storm-hbase,
storm-redis, and we could move out some topologies to remove dependencies.

2. Yeah maintaining doc page for connector in website is fine for me. More
clearly, either is fine.

3. SQL runner in Storm SQL compiles SQL to trident topology, so in order to
run SQL runner, binary dist. needs to have dependencies for SQL runner. Not
same as other connectors.
We might want to make uber jar for Storm SQL core and put to binary dist.
to make it clear.

Thanks,
Jungtaek Lim (HeartSaVioR)

On Fri, Mar 24, 2017 at 3:52 PM, Arun Mahadevan wrote:

> >Could you cast your vote? If you are still not satisfied with excluding
> >jars you can cast -0 or even -1.
>
> I am not fully convinced with the current binary distribution.
>
> 1. Why do we expect users to build the examples source from the binary
> distribution? Since these are most likely to be used by new users they will
> find it difficult if the build breaks. I checked other distributions like
> spark and they have the example jar inside the binary, but the size is
> pretty small. If we remove the shading and only keep the storm-starter.jar
> the size will be pretty small. Other option is to release a separate binary
> (like apache-storm-examples-xyz.jar) with just the example jars.
> 2. Keeping the connectors out of the binary is good. We should also remove
> the directories (with only the README.md). The users can find this info
> from the website.
> 3. Storm-sql jars are kept in the binary. Is there some reason? May be
> this should also be removed from the binary to be consistent.
>
> Thanks,
> Arun
>
> On 3/23/17, 7:34 PM, "Jungtaek Lim"  wrote:
>
> >+1 to the latter.
> >
> >I'm in favor of documenting the change to release note, and also docs so
> >that website can be reflected. The users who are affected to the change
> >wouldn't be much, since using dependency management tool (Maven, Gradle,
> >and so on) has been recommended for creating topology jar.
> >
> >For me it's not a blocker for release.
> >
> >Arun, I initiated another thread to discuss moving non-connectors to the
> >top directory.
> >Could you cast your vote? If you are still not satisfied with excluding
> >jars you can cast -0 or even -1.
> >
> >- Jungtaek Lim (HeartSaVioR)
> >
> >On Thu, Mar 23, 2017 at 10:43 PM, P. Taylor Goetz wrote:
> >
> >> Do we want to cancel this RC in order to better document the changes, or
> >> will documenting it in the release announcement suffice for now
> (provided
> >> documentation is added for subsequent releases)?
> >>
> >> I’m partial to the latter, but am open to others’ opinions.
> >>
> >> -Taylor
> >>
> >>
> >> > On Mar 22, 2017, at 9:49 AM, Bobby Evans  >
> >> wrote:
> >> >
> >> > +1 I built from the tag and ran using a single node cluster.
> >> > The examples and external components are excluded because they are
> >> huge.  Because of shading we distribute the same copy of them
> multiple
> >> times.
> >> > I agree with Alexandre.  We should document this change better,
> because
> >> it is confusing for people to get a release that used to have these in
> it,
> >> but does not any more.
> >> >
> >> >
> >> > - Bobby
> >> >
> >> > On Tuesday, March 21, 2017, 10:46:38 PM CDT, Arun Mahadevan <
> >> ar...@apache.org> wrote:Verified the artifacts. Compiled examples and
> ran
> >> some sample topologies. Looks good.
> >> >
> >> > BTW, why are the external modules excluded from the binaries (the .zip
> >> and .tar.gz). Isn’t it better if the binary distribution includes them?
> >> Maybe it was already discussed but I am missing it. The sql directory
> >> however seems to include the jars so it looks inconsistent.
> >> >
> >> > - Arun
> >> >
> >> >
> >> > On 3/22/17, 12:56 AM, "P. Taylor Goetz"  wrote:
> >> >
> >> >> This is a call to vote on releasing Apache Storm 1.1.0 (rc3)
> >> >>
> >> >> Full list of changes in this release:
> >> >>
> >> >>
> >>
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;h=68fbab3c4f91359bd397d93a157830542839b002;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
> >> >>
> >> >> The tag/commit to be voted upon is v1.1.0:
> >> >>
> >> >>
> >>
> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=7fa62404feb6b86b3143c851b46237580720eb6b;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
> >> >>
> >> >> The source archive being voted upon can be found here:
> >> >>
> >> >>
> >>
> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/apache-storm-1.1.0-src.tar.gz
> >> >>
> >> >> Other release files, signatures and digests can be found here:
> >> >>
> >> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/
> >> >>
> >> >> The release artifacts are signed with the following 

Re: [DISCUSS] Storm 2.0 Roadmap

2017-03-24 Thread Arun Mahadevan
+1 to release with the porting completed. I think it's mainly the UI server and 
log viewer that are pending. 

We can start doing the regression and performance tests for whatever is already 
ported.

If anyone is running the master branch in their pre-prod / prod environments, 
it will be good to know and give us more confidence.

The other features can be added in follow up releases.

Regards,
Arun


On 3/24/17, 11:47 AM, "Satish Duggana"  wrote:

>+1 to have 2.0 with porting and performance(it should be at least as good
>as 1.x release) issues addressed
>
>We can target other tasks(mentioned by Taylor and Jungtaek) for 2.x-branch.
>
>
>Exactly-once support:
>While thinking through the exactly-once support design, it was realized that
>it is better to avoid acking tuples and implement exactly-once by snapshotting
>barriers. It seems JStorm folks followed similar design, they claim it
>gives better performance. This feature is essential for beam runner and we
>can decide on respective approaches though.
>
>Beam Runner
>Lets hold on this for now and keep it in Storm till 2.x. We should avoid
>having a minimal beam runner in haste. It is better to address STORM-2284,
>exactly-once and other windowing enhancements to enable beam runner.
>
>JStorm
>Agree with Jungtaek on looking at the latest JStorm and align/scope with
>the features for 2.x.
>
>STORM-2284
>We may want to look at JStorm worker before working on respective
>components in this epic to pull appropriate enhancements.
>
>YARN/MESOS
>Supporting Storm on YARN/Mesos for 2.x.
>
>Thanks,
>Satish.
>
>
>On Fri, Mar 24, 2017 at 9:09 AM, Jungtaek Lim  wrote:
>
>> First of all, +1 to complete only port work and do sanity check (including
>> performance regression), and release.
>>
>> If we can get STORM-2284 within deterministic time frame (say 2~3 months)
>> that should be great, but if not I'd in favor of postponing that to later
>> 2.x release.
>>
>> JStorm released their new versions after code donation. So there're more
>> things we could get ideas from, or even adopt from.
>> https://github.com/alibaba/jstorm/blob/master/history.md
>> As you noticed from release note link, we also need to update phase 2 since
>> they already changed what we're planning to do in phase 2. For example,
>> they changed backpressure to end-to-end, and changed to use snapshot rather
>> than acker.
>> May be sure, JStorm pulled many features from today's Storm, like Flux,
>> Windowing, more shuffle groupings, log search, log level change, and so on.
>>
>> STORM-2426 is due to the limitation of Spout lifecycle (all the things
>> are done in single thread), and STORM-1358 (JStorm's multi-thread Spout)
>> can remedy this (despite that Spout implementation may
>> need to guarantee thread-safety later). It's not a just improvement but
>> close to design concern so would like to address sooner than other things
>> in phase 2.
>>
>> For Storm SQL side, I've lost progress but major work would be adopting
>> group by with windowing. It was not available from Calcite but will be
>> available at next release (1.12.0).
>> I've filed this to STORM-2405, but windowing & micro
>> batch is not intuitive, so I would like to change the underlying API to
>> stream API in SQL. Also filed this to STORM-2406.
>>
>> Just 2 cents btw, hopefully I would like to see metrics V2 sooner since we
>> lost metrics even when doing normal operation like restarting worker,
>> rebalancing, and so on. Eventually we need to fight with dynamic scaling,
>> and then metrics will be broken often.
>>
>> Thanks,
>> Jungtaek Lim (HeartSaVioR)
>>
>> On Fri, Mar 24, 2017 at 5:05 AM, Harsha Chintalapani wrote:
>>
>> > Storm 2.0 migration to java in itself is a big win and would attract
>> > wider community and adoption. So my vote would be to resolve the first 3
>> > items to get a release out.
>> > All the other features mentioned are great to have but shouldn't be
>> > blockers for 2.0 release.
>> >
>> > -Harsha
>> >
>> > On Thu, Mar 23, 2017 at 11:51 AM P. Taylor Goetz 
>> > wrote:
>> >
>> > > With the 1.1.0 release nearing completion, I’d like to turn our
>> attention
>> > > to 2.0 and develop a plan for what features, etc. to include.
>> > >
>> > > The following 3 are what I feel are the minimum for a 2.0 release.
>> These
>> > > could likely be resolved relatively quickly:
>> > >
>> > > * Performance — I’ve not benchmarked the master branch vs. 1.0.x or
>> 1.1.x
>> > > in a while, but I feel it will be important to make sure there are no
>> > > performance regressions, and would hope that we actually have a
>> > performance
>> > > improvement over previous versions. To that end (e.g. if there is in
>> > fact a
>> > > performance 

Re: [VOTE] Release Apache Storm 1.1.0 (RC3)

2017-03-24 Thread Arun Mahadevan
>Could you cast your vote? If you are still not satisfied with excluding
>jars you can cast -0 or even -1.

I am not fully convinced with the current binary distribution.

1. Why do we expect users to build the examples source from the binary 
distribution? Since these are most likely to be used by new users they will 
find it difficult if the build breaks. I checked other distributions like spark 
and they have the example jar inside the binary, but the size is pretty small. 
If we remove the shading and only keep the storm-starter.jar the size will be 
pretty small. Other option is to release a separate binary (like 
apache-storm-examples-xyz.jar) with just the example jars.
2. Keeping the connectors out of the binary is good. We should also remove the 
directories (with only the README.md). The users can find this info from the 
website.
3. Storm-sql jars are kept in the binary. Is there some reason? Maybe these
should also be removed from the binary to be consistent.

Thanks,
Arun

On 3/23/17, 7:34 PM, "Jungtaek Lim"  wrote:

>+1 to the latter.
>
>I'm in favor of documenting the change in the release notes, and also in the
>docs so that the website reflects it. Few users would be affected by the
>change, since using a dependency management tool (Maven, Gradle, and so on)
>has been recommended for creating topology jars.
>
>For me it's not a blocker for release.
>
>Arun, I initiated another thread to discuss moving non-connectors to the
>top directory.
>Could you cast your vote? If you are still not satisfied with excluding
>jars you can cast -0 or even -1.
>
>- Jungtaek Lim (HeartSaVioR)
>
>On Thu, Mar 23, 2017 at 10:43 PM, P. Taylor Goetz wrote:
>
>> Do we want to cancel this RC in order to better document the changes, or
>> will documenting it in the release announcement suffice for now (provided
>> documentation is added for subsequent releases)?
>>
>> I’m partial to the latter, but am open to others’ opinions.
>>
>> -Taylor
>>
>>
>> > On Mar 22, 2017, at 9:49 AM, Bobby Evans wrote:
>> >
>> > +1 I built from the tag and ran using a single node cluster.
>> > The examples and external components are excluded because they are
>> > huge.  Because of shading we distribute the same copy of them multiple
>> > times.
>> > I agree with Alexandre.  We should document this change better, because
>> > it is confusing for people to get a release that used to have these in it,
>> > but does not any more.
>> >
>> >
>> > - Bobby
>> >
>> > On Tuesday, March 21, 2017, 10:46:38 PM CDT, Arun Mahadevan
>> > <ar...@apache.org> wrote:
>> >
>> > Verified the artifacts. Compiled examples and ran some sample topologies.
>> > Looks good.
>> >
>> > BTW, why are the external modules excluded from the binaries (the .zip
>> > and .tar.gz)? Isn’t it better if the binary distribution includes them?
>> > Maybe it was already discussed but I am missing it. The sql directory
>> > however seems to include the jars so it looks inconsistent.
>> >
>> > - Arun
>> >
>> >
>> > On 3/22/17, 12:56 AM, "P. Taylor Goetz"  wrote:
>> >
>> >> This is a call to vote on releasing Apache Storm 1.1.0 (rc3)
>> >>
>> >> Full list of changes in this release:
>> >>
>> >>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=CHANGELOG.md;h=68fbab3c4f91359bd397d93a157830542839b002;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
>> >>
>> >> The tag/commit to be voted upon is v1.1.0:
>> >>
>> >>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;h=7fa62404feb6b86b3143c851b46237580720eb6b;hb=e40d213de7067f7d3aa4d4992b81890d8ed6ff31
>> >>
>> >> The source archive being voted upon can be found here:
>> >>
>> >>
>> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/apache-storm-1.1.0-src.tar.gz
>> >>
>> >> Other release files, signatures and digests can be found here:
>> >>
>> >> https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.1.0-rc3/
>> >>
>> >> The release artifacts are signed with the following key:
>> >>
>> >>
>> https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
>> >>
>> >> The Nexus staging repository for this release is:
>> >>
>> >> https://repository.apache.org/content/repositories/orgapachestorm-1047
>> >>
>> >> Please vote on releasing this package as Apache Storm 1.1.0.
>> >>
>> >> When voting, please list the actions taken to verify the release.
>> >>
>> >> This vote will be open for at least 72 hours.
>> >>
>> >> [ ] +1 Release this package as Apache Storm 1.1.0
>> >> [ ]  0 No opinion
>> >> [ ] -1 Do not release this package because...
>> >>
>> >> Thanks to everyone who contributed to this release.
>> >>
>> >> -Taylor
>> >
>>
>>




Re: [DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Arun Iyer
+1 

Makes sense to move the non-connectors to top level and keep only the 
connectors under “connectors” folder.


On 3/24/17, 12:00 PM, "Jungtaek Lim"  wrote:

>(Sent this yesterday but can't find this from storm-dev mbox... sending it
>again)
>
>Hi dev,
>
>I'd like to start a discussion about moving non-connector modules out of
>external, maybe to the top directory.
>
>The "external" directory has non-connectors (SQL, Flux, storm-kafka-monitor,
>storm-submit-tools), and all of these except Flux should be placed in the
>binary dist, since Storm itself (not the user topology) needs to refer to them.
>
>They're actually tied to the core of Storm, so I feel that it would be
>better to treat them (including Flux) as non-external, maybe same level as
>storm-core.
>(I'm not sure what "external" actually means for Storm project btw.)
>
>In addition, after doing that I'd like to change the directory name
>"external" to "connector" or so, so that the name could be self-describing
>and we can only place connectors to that directory.
>(I know it would be painful for already opened pull requests, so no strong
>opinion regarding this.)
>
>Looking forward to your opinion!
>
>Thanks,
>Jungtaek Lim (HeartSaVioR)



Re: [VOTE] Release Apache Storm 1.1.0 (RC3)

2017-03-24 Thread Satish Duggana
+1 for RC3 and minor release with recent major/critical bug fixes in few
weeks time. Most of the storm-kafka-client issues are resolved.


Hugo,
You may want to vote again as you have changed your opinion mentioned in
your earlier mail.


Taylor,
It seems Hugo said this issue can be considered as critical instead of
Blocker in earlier mail as mentioned below.

>> Since this is an external module connector perhaps we can get this fix
>> onto a minor release instead of being a blocker. I will update the JIRA
>> accordingly.

Thanks,
Satish.


On Fri, Mar 24, 2017 at 7:33 AM, Jungtaek Lim  wrote:

> I'm OK to continue release process if we plan to do bugfix release sooner.
> (say, within 1 month, or just after all opened storm-kafka-client issues
> will be addressed.)
>
> We have other storm-kafka-client issues which are all major or critical but
> are not included in RC3 either. Having them in 1.1.0 is ideal, but on the
> other hand I don't want to see 1.1.0 dragged out any longer. Explaining the
> "known issues" in the release notes would be sufficient, and 1.1.1 can be
> released soon to resolve them.
>
> If we really would like to cancel RC3, I propose that we announce "code
> freeze" for 1.x branch (via sending mail to dev@), and address only issues
> on epic, and restart RC ASAP.
>
> tl;dr. I'm still +1 for the RC3, regardless of documentation about changes
> on binary dist. and storm-kafka-client issues.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Fri, Mar 24, 2017 at 4:53 AM, Harsha Chintalapani wrote:
>
> > I propose that we continue with the release and get this patch into the
> > next minor release (probably 1.1.1?). Users don't need to upgrade their
> > cluster to get this change, so I would say this is not a show-stopper for
> > the release.
> >
> > -Harsha
> >
> > On Thu, Mar 23, 2017 at 9:18 AM P. Taylor Goetz wrote:
> >
> > > Hugo, now that you are a PMC member your vote is binding.
> > >
> > > While releases can’t be vetoed (a release vote needs at least three +1s
> > > and more +1s than -1s), it’s typical to take negative votes under serious
> > > consideration.
> > >
> > > So considering Hugo’s -1 vote, do we want to cancel in order to include
> > > the proposed fix? It would also give us a chance to update the connector
> > > documentation regarding location of binary artifacts.
> > >
> > > -Taylor
> > >
> > >
> > > > On Mar 23, 2017, at 11:58 AM, Hugo Da Cruz Louro
> > > > <hlo...@hortonworks.com> wrote:
> > > >
> > > > -1 (non binding)
> > > >
> > > > This blocker bug was just reported -
> > > > https://issues.apache.org/jira/browse/STORM-2432
> > > > Here is the fix: https://github.com/apache/storm/pull/2027
> > > >
> > > > I consider this a blocker bug because if the user chooses the
> > > > UNCOMMITTED_LATEST first poll strategy, no data gets polled at all.
> > > >
> > > > Thanks,
> > > > Hugo
> > > >
> > > > On Mar 23, 2017, at 8:48 AM, Harsha Chintalapani wrote:
> > > >
> > > > +1 for documenting and releasing.
> > > > -Harsha
> > > >
> > > > On Thu, Mar 23, 2017 at 7:04 AM Jungtaek Lim <kabh...@gmail.com> wrote:
> > > >
> > > > +1 to the latter.
> > > >
> > > > I'm in favor of documenting the change in the release notes, and also
> > > > in the docs so that the website reflects it. Few users would be
> > > > affected by the change, since using a dependency management tool
> > > > (Maven, Gradle, and so on) has been recommended for creating topology
> > > > jars.
> > > >
> > > > For me it's not a blocker for release.
> > > >
> > > > Arun, I initiated another thread to discuss moving non-connectors to
> > > > the top directory.
> > > > Could you cast your vote? If you are still not satisfied with
> > > > excluding jars you can cast -0 or even -1.
> > > >
> > > > - Jungtaek Lim (HeartSaVioR)
> > > >
> > > > On Thu, Mar 23, 2017 at 10:43 PM, P. Taylor Goetz <ptgo...@gmail.com>
> > > > wrote:
> > > >
> > > > Do we want to cancel this RC in order to better document the changes,
> > > > or will documenting it in the release announcement suffice for now
> > > > (provided documentation is added for subsequent releases)?
> > > >
> > > > I’m partial to the latter, but am open to others’ opinions.
> > > >
> > > > -Taylor
> > > >
> > > >
> > > > On Mar 22, 2017, at 9:49 AM, Bobby Evans wrote:
> > > >
> > > > +1 I built from the tag and ran using a single node cluster.
> > > > The examples and external components are excluded because they are
> > > > huge.  Because of shading we distribute the same copy of them
> > > > multiple times.
> > > > I agree with Alexandre.  We should document this change better,
> > > > because it is confusing for people to get a release that used to have
> > > > these in it, but does not any more.
> > > >
> > > >
> > > > - Bobby
> 
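
For readers following the vote arithmetic in this thread: the rule Taylor
states above (a release vote needs at least three +1s and more +1s than -1s)
can be written as a tiny predicate. This is only a sketch. The class and
method names are hypothetical, and counting only binding votes is an
assumption based on his remark that a PMC member's vote is binding.

```java
/** Hypothetical helper for illustration; not part of Storm or any ASF tooling. */
final class ReleaseVoteSketch {
    private ReleaseVoteSketch() {}

    /**
     * Returns true when the tally satisfies the rule quoted in the thread:
     * at least three +1s and more +1s than -1s.
     * Assumption: callers pass binding votes only.
     */
    static boolean passes(int plusOnes, int minusOnes) {
        return plusOnes >= 3 && plusOnes > minusOnes;
    }
}
```

Under this rule a tally of four +1s against one -1 still passes, which is
why a single -1 cannot veto a release but is usually weighed seriously.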

[GitHub] storm issue #2027: STORM-2432: Storm-Kafka-Client Trident Spout Seeks Incorr...

2017-03-24 Thread satishd
Github user satishd commented on the issue:

https://github.com/apache/storm/pull/2027
  
@hmcl I will take care of pushing these changes to master branch once this 
is merged into 1.x-branch.


---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at infrastruct...@apache.org or file a JIRA ticket
with INFRA.
---


[GitHub] storm pull request #2027: STORM-2432: Storm-Kafka-Client Trident Spout Seeks...

2017-03-24 Thread satishd
Github user satishd commented on a diff in the pull request:

https://github.com/apache/storm/pull/2027#discussion_r107842202
  
--- Diff: external/storm-kafka-client/src/main/java/org/apache/storm/kafka/spout/trident/KafkaTridentSpoutEmitter.java ---
@@ -149,7 +150,7 @@ private void emitTuples(TridentCollector collector, ConsumerRecords record
 
 /**
  * Determines the offset of the next fetch. For failed batches lastBatchMeta is not null and contains the fetch
- * offset of the failed batch. In this scenario the next fetch will take place at the offset of the failed batch.
+ * offset of the failed batch. In this scenario the next fetch will take place at the offset of the failed batch + 1.
--- End diff --

minor nit: Just to make it clear.
>next fetch will take place at (offset of the failed batch + 1). 
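
The rule described in that Javadoc comment can be sketched as follows. This
is a minimal illustration, not the code in KafkaTridentSpoutEmitter: the
class names, the `offset` field, and the use of an empty result for "no
previous batch" are all hypothetical.

```java
import java.util.OptionalLong;

/** Hypothetical stand-in for the batch metadata; not the real Storm class. */
final class BatchMetaSketch {
    final long offset; // offset recorded for the previous (failed) batch

    BatchMetaSketch(long offset) {
        this.offset = offset;
    }
}

final class NextFetchSketch {
    private NextFetchSketch() {}

    /**
     * If metadata from a previous batch exists, resume at its offset + 1.
     * Otherwise return empty so the caller can apply its own starting policy
     * (assumption: that policy is decided elsewhere).
     */
    static OptionalLong nextFetchOffset(BatchMetaSketch lastBatchMeta) {
        if (lastBatchMeta == null) {
            return OptionalLong.empty();
        }
        return OptionalLong.of(lastBatchMeta.offset + 1);
    }
}
```

Writing the expression as (offset + 1) in one place makes the intent in the
review comment explicit: the increment applies to the offset, not to the batch.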




[DISCUSS] Move non-connectors modules to out of external

2017-03-24 Thread Jungtaek Lim
(Sent this yesterday but can't find this from storm-dev mbox... sending it
again)

Hi dev,

I'd like to start a discussion about moving non-connector modules out of
external, maybe to the top directory.

The "external" directory has non-connectors (SQL, Flux, storm-kafka-monitor,
storm-submit-tools), and all of these except Flux should be placed in the
binary dist, since Storm itself (not the user topology) needs to refer to them.

They're actually tied to the core of Storm, so I feel that it would be
better to treat them (including Flux) as non-external, maybe same level as
storm-core.
(I'm not sure what "external" actually means for Storm project btw.)

In addition, after doing that I'd like to change the directory name
"external" to "connector" or so, so that the name could be self-describing
and we can only place connectors to that directory.
(I know it would be painful for already opened pull requests, so no strong
opinion regarding this.)

Looking forward to your opinion!

Thanks,
Jungtaek Lim (HeartSaVioR)


[GitHub] storm issue #2027: STORM-2432: Storm-Kafka-Client Trident Spout Seeks Incorr...

2017-03-24 Thread satishd
Github user satishd commented on the issue:

https://github.com/apache/storm/pull/2027
  
+1 LGTM




Re: [DISCUSS] Storm 2.0 Roadmap

2017-03-24 Thread Satish Duggana
+1 to have 2.0 with porting and performance (it should be at least as good
as the 1.x release) issues addressed.

We can target other tasks (mentioned by Taylor and Jungtaek) for the 2.x branch.


Exactly-once support:
While thinking through the exactly-once support design, it became clear that
it is better to avoid acking tuples and implement exactly-once by snapshotting
barriers. It seems the JStorm folks followed a similar design; they claim it
gives better performance. This feature is essential for the Beam runner, and
we can still decide on the respective approaches.

Beam Runner
Lets hold on this for now and keep it in Storm till 2.x. We should avoid
having a minimal beam runner in haste. It is better to address STORM-2284,
exactly-once and other windowing enhancements to enable beam runner.

JStorm
Agree with Jungtaek on looking at the latest JStorm and align/scope with
the features for 2.x.

STORM-2284
We may want to look at JStorm worker before working on respective
components in this epic to pull appropriate enhancements.

YARN/MESOS
Supporting Storm on YARN/Mesos for 2.x.

Thanks,
Satish.


On Fri, Mar 24, 2017 at 9:09 AM, Jungtaek Lim  wrote:

> First of all, +1 to complete only port work and do sanity check (including
> performance regression), and release.
>
> If we can get STORM-2284 within a deterministic time frame (say 2~3 months)
> that would be great, but if not I'd be in favor of postponing it to a later
> 2.x release.
>
> JStorm released their new versions after code donation. So there're more
> things we could get ideas from, or even adopt from.
> https://github.com/alibaba/jstorm/blob/master/history.md
> As you noticed from release note link, we also need to update phase 2 since
> they already changed what we're planning to do in phase 2. For example,
> they changed backpressure to end-to-end, and changed to use snapshot rather
> than acker.
> To be sure, JStorm pulled many features from today's Storm, like Flux,
> Windowing, more shuffle groupings, log search, log level change, and so on.
>
> STORM-2426 is due to the limitation of the Spout lifecycle (everything is
> done in a single thread), and STORM-1358 (JStorm's multi-thread Spout) can
> remedy this (although Spout implementations may then need to guarantee
> thread-safety). It's not just an improvement but closer to a design concern,
> so I would like to address it sooner than other things in phase 2.
>
> For Storm SQL side, I've lost progress but major work would be adopting
> group by with windowing. It was not available from Calcite but will be
> available at next release (1.12.0).
> I've filed this to STORM-2405, but windowing & micro
> batch is not intuitive, so I would like to change the underlying API to
> stream API in SQL. Also filed this to STORM-2406.
>
> Just 2 cents btw, hopefully I would like to see metrics V2 sooner since we
> lost metrics even when doing normal operation like restarting worker,
> rebalancing, and so on. Eventually we need to fight with dynamic scaling,
> and then metrics will be broken often.
>
> Thanks,
> Jungtaek Lim (HeartSaVioR)
>
> On Fri, Mar 24, 2017 at 5:05 AM, Harsha Chintalapani wrote:
>
> > Storm 2.0 migration to java in itself is a big win and would attract
> > wider community and adoption. So my vote would be to resolve the first 3
> > items to get a release out.
> > All the other features mentioned are great to have but shouldn't be
> > blockers for 2.0 release.
> >
> > -Harsha
> >
> > On Thu, Mar 23, 2017 at 11:51 AM P. Taylor Goetz 
> > wrote:
> >
> > > With the 1.1.0 release nearing completion, I’d like to turn our
> attention
> > > to 2.0 and develop a plan for what features, etc. to include.
> > >
> > > The following 3 are what I feel are the minimum for a 2.0 release.
> These
> > > could likely be resolved relatively quickly:
> > >
> > > * Performance — I’ve not benchmarked the master branch vs. 1.0.x or
> 1.1.x
> > > in a while, but I feel it will be important to make sure there are no
> > > performance regressions, and would hope that we actually have a
> > performance
> > > improvement over previous versions. To that end (e.g. if there is in
> > fact a
> > > performance regression), the proposals that Roshan Naik put together
> for
> > > revising the threading and execution model (STORM-2307) and replacing
> > > Disruptor with JCTools (STORM-2306) warrant review and consideration.
> See
> > > also STORM-2284 which is the parent JIRA.
> > >
> > > * Finish porting Storm UI to java (STORM-1311)
> > >
> > > * Finish porting log viewer to java (STORM-1280)
> > >
> > > The following are items that are nice to have in 2.0, but I don’t feel
> > are
> > > absolutely necessary for an initial 2.0 release:
> > >
> > > * Beam Runner (I wouldn’t tie this to 2.0, 

[GitHub] storm issue #1943: [STORM-2363]Provide configuration to set the number of Ro...

2017-03-24 Thread liu-zhaokun
Github user liu-zhaokun commented on the issue:

https://github.com/apache/storm/pull/1943
  
@HeartSaVioR 
OK! Thanks for your reply.




[GitHub] storm issue #1943: [STORM-2363]Provide configuration to set the number of Ro...

2017-03-24 Thread HeartSaVioR
Github user HeartSaVioR commented on the issue:

https://github.com/apache/storm/pull/1943
  
@liu-zhaokun 
Sorry, we're in the middle of the release vote for 1.1.0 and personally I
would like to postpone everything not related to the release. Please stay
tuned to the dev@ mailing list.




[GitHub] storm issue #1943: [STORM-2363]Provide configuration to set the number of Ro...

2017-03-24 Thread liu-zhaokun
Github user liu-zhaokun commented on the issue:

https://github.com/apache/storm/pull/1943
  
Hi, @HeartSaVioR
Have you reviewed my PR? Looking forward to your reply. Thanks.

