Re: [DISCUSS] Call for PMC Members and Contributors

2023-02-28 Thread P. Taylor Goetz
Thank you both for your willingness to help the project.

Stay tuned.

-Taylor

> On Feb 25, 2023, at 10:15 AM, Alexandre Vermeerbergen wrote:
> 
> Hi Taylor,
> 
> Same for me, I'd be happy to help since I use Apache Storm (as I wrote
> in a previous mail), so it seems obvious to me that I should do
> something in return, whatever it is!
> 
> Alex
> 
> On Fri, Feb 24, 2023 at 13:01, Richard Zowalla wrote:
>> 
>> Hi Taylor,
>> 
>> as written in the other thread, I am happy to help.
>> 
>> If needed, I can also do the mechanical work of a release
>> (I already know the quirks of the ASF infrastructure).
>> 
>> Regards
>> Richard
>> 
>> On Wednesday, 22.02.2023 at 18:02 -0500, P. Taylor Goetz wrote:
>>> Quick update:
>>> 
>>> The ASF Board voted to accept both our report and the resolution to
>>> change PMC Chair. Board feedback on our report was positive.
>>> 
>>> The next step is to expand the PMC/Committers group. Official
>>> votes will necessarily be private, but I see no reason nominations
>>> can’t be public.
>>> 
>>> So let’s open a thread to talk about adding new contributors to the
>>> PMC. Feel free to:
>>> 
>>> 1. Volunteer to support the project as a PMC Member/Committer.
>>> 2. Nominate someone to become a PMC Member/Committer.
>>> 
>>> I have a few candidates in mind, and will follow up accordingly.
>>> Votes for new members will be private, but successful votes for new
>>> members will be announced on the public lists.
>>> 
>>> One important way to contribute is to volunteer as a release manager
>>> for any given release, even if we release infrequently (e.g. only in
>>> response to a serious bug or security issue). I don’t have the
>>> bandwidth to act in that role, though I will commit to support voting
>>> on releases, etc. Ideally we would want at least two folks ready to
>>> step into that role, but we can get by with one if necessary.
>>> 
>>> Thanks again to everyone who’s expressed interest in contributing to
>>> the project.
>>> 
>>> - Taylor
>> 



[DISCUSS] Call for PMC Members and Contributors

2023-02-22 Thread P. Taylor Goetz
Quick update:

The ASF Board voted to accept both our report and the resolution to change PMC 
Chair. Board feedback on our report was positive.

The next step is to expand the PMC/Committers group. Official votes will 
necessarily be private, but I see no reason nominations can’t be public.

So let’s open a thread to talk about adding new contributors to the PMC. Feel 
free to:

1. Volunteer to support the project as a PMC Member/Committer.
2. Nominate someone to become a PMC Member/Committer.

I have a few candidates in mind, and will follow up accordingly. Votes for new 
members will be private, but successful votes for new members will be announced 
on the public lists.

One important way to contribute is to volunteer as a release manager for any 
given release, even if we release infrequently (e.g. only in response to a 
serious bug or security issue). I don’t have the bandwidth to act in that role, 
though I will commit to support voting on releases, etc. Ideally we would want 
at least two folks ready to step into that role, but we can get by with one if 
necessary.

Thanks again to everyone who’s expressed interest in contributing to the 
project.

- Taylor

Re: [DISCUSSION] Apache Storm and moving to the Attic

2023-02-01 Thread P. Taylor Goetz
An anecdote that might be applicable:

Look at version control systems over the last few decades:

CVS -> Apache Subversion -> Git

Each transition was a major improvement.

That may lead you to believe that Subversion is dead and long gone. And you 
would be wrong.

Subversion is still alive because it still has a user base. It also has one of 
the most open-door policies for entry, if not the most: Want committer access? 
Just ask and you’re in. If you eff up, someone will catch it and revert it.


We’ve had a few people express interest in continuing the project. Why not 
elevate them to Committer/PMC?

-Taylor


> On Feb 2, 2023, at 12:06 AM, P. Taylor Goetz  wrote:
> 
> IMO, there are two scenarios:
> 
> 1. Storm is dead. No one uses it anymore, and no one needs security 
> patches, etc.
> 2. Storm is in maintenance mode. While new features may not be added, there 
> are enough contributors left to at least address any security concerns.
> 
> Whether we pursue the attic or project continuation is really up to the 
> community. Anyone who wants to step up and help continue the project should be 
> considered a PMC candidate.
> 
> As I’ve said in other threads, I’m willing to act as PMC chair whatever the 
> PMC decides.
> 
> But for now we have no official liaison to the ASF board. PMC members need 
> to elect a chair, whether we choose to continue or head to the attic.
> 
> I would lean toward lowering the bar for PMC membership to anyone interested 
> in ongoing maintenance. 
> 
> I’m still willing to serve as the board liaison (PMC Chair) in either case of 
> retirement or continuation. I’d also support anyone else willing to take the 
> position.  The position needs to be filled either way.
> 
> This is a decision that affects both developers and users. Thank you Aaron 
> for bringing it to the attention of the broader community.
> 
> -Taylor
> 
>> On Feb 1, 2023, at 9:26 PM, sunil yadav wrote:
>> 
>> One side note: I'm looking to see what folks are migrating to, or have 
>> moved to, if they are no longer using Storm actively. Any cloud native 
>> DAG platform to recommend?
>> 
>> -Sunil
>> 
>> On Wed, Feb 1, 2023 at 6:12 PM Stephen Powis via user wrote:
>> Ah Yea, also not opposed to the move to the attic if it's determined to be 
>> most appropriate, but I had a similar experience as Richard where I 
>> submitted some PRs, asked for comments/thoughts/review, and even offered to 
>> take over ownership of one of the sub modules and got no responses.  Perhaps 
>> I just submitted under the wrong forum to get an answer, but definitely a 
>> bummer, and would love to see the project/community get revived.
>> 
>> - Stephen
>> 
>> On Wed, Feb 1, 2023 at 10:09 PM Richard Zowalla wrote:
>> Hi Aaron,
>> 
>> I am CC'ing the users@ list as it wasn't included in the initial
>> proposal. Perhaps there are people or institutions in the wild who
>> want to volunteer or give it a try (still at the ASF level).
>> 
>> Of course it shouldn't prevent a VOTE for moving to the attic, but it is
>> an additional try to get some more attention (if needed).
>> 
>> Regards
>> Richard
>> 
>> 
>> On Tuesday, 31.01.2023 at 08:00 -0600, Aaron Niskode-Dossett wrote:
>> > Thank you to those who responded. Given the low number of responses, I
>> > am going to start a vote on moving to the attic tomorrow.
>> > 
>> > One other note: A vote to move to the attic would not mean the end of
>> > Storm. The project can be forked by anyone who wants to continue the
>> > work (just as it can be forked today); it would just be outside of the
>> > ASF.
>> > 
>> > On Thu, Jan 26, 2023 at 3:12 AM Richard Zowalla wrote:
>> > 
>> > > Hi all,
>> > > 
>> > > I have also experienced that dev@ activity is quite low. I provided
>> > > some PRs and asked questions a few months ago and didn't get any
>> > > feedback, although the PRs got merged in the end. I think that we
>> > > need some sort of community rebuilding / revival if we want to
>> > > maintain Storm in a sustainable way (like it is done for TomEE or
>> > > OpenNLP).
>> > > 
>> > > Nevertheless, as I am also a committer on StormCrawler, which
>> > > relies on Storm (obviously) _and_ is used in our research wo

Re: [DISCUSSION] Apache Storm and moving to the Attic

2023-02-01 Thread P. Taylor Goetz
IMO, there are two scenarios:

1. Storm is dead. No one uses it anymore, and no one needs security patches, 
etc.
2. Storm is in maintenance mode. While new features may not be added, there are 
enough contributors left to at least address any security concerns.

Whether we pursue the attic or project continuation is really up to the 
community. Anyone who wants to step up and help continue the project should be 
considered a PMC candidate.

As I’ve said in other threads, I’m willing to act as PMC chair whatever the PMC 
decides.

But for now we have no official liaison to the ASF board. PMC members need to 
elect a chair, whether we choose to continue or head to the attic.

I would lean toward lowering the bar for PMC membership to anyone interested in 
ongoing maintenance. 

I’m still willing to serve as the board liaison (PMC Chair) in either case of 
retirement or continuation. I’d also support anyone else willing to take the 
position.  The position needs to be filled either way.

This is a decision that affects both developers and users. Thank you Aaron for 
bringing it to the attention of the  broader community.

-Taylor

> On Feb 1, 2023, at 9:26 PM, sunil yadav wrote:
> 
> One side note: I'm looking to see what folks are migrating to, or have 
> moved to, if they are no longer using Storm actively. Any cloud native 
> DAG platform to recommend?
> 
> -Sunil
> 
> On Wed, Feb 1, 2023 at 6:12 PM Stephen Powis via user wrote:
> Ah Yea, also not opposed to the move to the attic if it's determined to be 
> most appropriate, but I had a similar experience as Richard where I submitted 
> some PRs, asked for comments/thoughts/review, and even offered to take over 
> ownership of one of the sub modules and got no responses.  Perhaps I just 
> submitted under the wrong forum to get an answer, but definitely a bummer, 
> and would love to see the project/community get revived.
> 
> - Stephen
> 
> On Wed, Feb 1, 2023 at 10:09 PM Richard Zowalla wrote:
> Hi Aaron,
> 
> I am CC'ing the users@ list as it wasn't included in the initial
> proposal. Perhaps there are people or institutions in the wild who
> want to volunteer or give it a try (still at the ASF level).
> 
> Of course it shouldn't prevent a VOTE for moving to the attic, but it is
> an additional try to get some more attention (if needed).
> 
> Regards
> Richard
> 
> 
> On Tuesday, 31.01.2023 at 08:00 -0600, Aaron Niskode-Dossett wrote:
> > Thank you to those who responded. Given the low number of responses, I
> > am going to start a vote on moving to the attic tomorrow.
> > 
> > One other note: A vote to move to the attic would not mean the end of
> > Storm. The project can be forked by anyone who wants to continue the
> > work (just as it can be forked today); it would just be outside of the
> > ASF.
> > 
> > On Thu, Jan 26, 2023 at 3:12 AM Richard Zowalla wrote:
> > 
> > > Hi all,
> > > 
> > > I have also experienced that dev@ activity is quite low. I provided
> > > some PRs and asked questions a few months ago and didn't get any
> > > feedback, although the PRs got merged in the end. I think that we
> > > need some sort of community rebuilding / revival if we want to
> > > maintain Storm in a sustainable way (like it is done for TomEE or
> > > OpenNLP).
> > > 
> > > Nevertheless, as I am also a committer on StormCrawler, which
> > > relies on Storm (obviously) _and_ is used in our research work, I am
> > > also happy to jump in as a volunteer if some people are needed.
> > > 
> > > Regards
> > > Richard
> > > 
> > > 
> > > On Wednesday, 25.01.2023 at 22:00 +, Bipin Prasad wrote:
> > > > I am volunteering to take on the role of PMC chair for Storm. I am
> > > > not quite sure about the process.
> > > > Thanks, Bipin Prasad
> > > > 
> > > > 
> > > > On Wednesday, January 25, 2023, 4:50 PM, Aaron Niskode-Dossett wrote:
> > > > 
> > > > Hello Storm developer community,
> > > > 
> > > > In the past year or so this project has slowed down and the Project
> > > > Management Committee [PMC] has almost no active members.  The PMC
> > > > chair resigned in 2022 with due notice and no one has since
> > > > volunteered to assume those duties.  I myself am an inactive PMC
> > > > member.
> > > > 
> > > > This suggests to me that it's time to consider moving this project
> > > > to the Attic [1].  I would view this as the natural culmination of a
> > > > very successful Apache project and not as a mark of failure.  There
> > > > are many, many successful and influential projects in the attic.
> > > > 
> > > > The alternative to moving to the attic would be to reconstitute the
> > > > PMC with *new members* of the development community willing to take
> > > > on th

[ANNOUNCE] Apache Storm 2.0.0 Released

2019-05-30 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 2.0.0.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2019/05/30/storm200-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 2.0.0

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-2.0.0/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[CVE-2018-8008] Apache Storm arbitrary file write vulnerability

2018-06-05 Thread P. Taylor Goetz
CVE-2018-8008: Apache Storm arbitrary file write vulnerability

Severity: Important

Vendor:
The Apache Software Foundation

Versions Affected:
Apache Storm 1.2.1
Apache Storm 1.1.2

Description:
Apache Storm versions 1.0.6 and earlier, 1.1.2 and earlier, and 1.2.1 and 
earlier expose an arbitrary file write vulnerability that can be exploited 
using a specially crafted zip archive (other archive formats are affected as 
well: bzip2, tar, xz, war, cpio, 7z) that holds path traversal filenames. When 
such a filename is concatenated to the target extraction directory, the final 
path ends up outside of the target folder.
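
To illustrate the class of problem described above, here is a minimal Python
sketch of the kind of check an extractor can apply before writing an archive
entry. It is not the fix that was applied in Storm; the function name
`is_within_directory` and the example paths are hypothetical and chosen only
for illustration.

```python
import os


def is_within_directory(target_dir: str, entry_name: str) -> bool:
    """Return True if entry_name, joined onto target_dir, stays inside target_dir.

    Hypothetical helper for illustration; not taken from the Storm code base.
    """
    base = os.path.realpath(target_dir)
    # Joining and then normalizing collapses any "../" segments, so a
    # traversal filename resolves to its real destination before comparison.
    candidate = os.path.realpath(os.path.join(base, entry_name))
    try:
        return os.path.commonpath([base, candidate]) == base
    except ValueError:
        # Raised when the two paths cannot be compared (e.g. different drives).
        return False


# A benign entry stays inside the extraction directory ...
print(is_within_directory("/tmp/extract", "conf/storm.yaml"))
# ... while a path traversal filename escapes it and should be rejected.
print(is_within_directory("/tmp/extract", "../../etc/passwd"))
```

An extractor would call a check like this for every entry and skip or abort on
any entry for which it returns False.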

Mitigation:
1.2.1 users should upgrade to version 1.2.2.
1.1.2 users should upgrade to version 1.1.3.
1.0.6 users should upgrade to version 1.1.3.

Apache Storm 1.2.2 artifacts are available for immediate download here:

http://www.us.apache.org/dist/storm/apache-storm-1.2.2/

Apache Storm 1.1.3 artifacts are available for immediate download here:

http://www.us.apache.org/dist/storm/apache-storm-1.1.3/

Credit:
This issue was discovered by Snyk Security Research Team

References:
http://storm.apache.org/2018/06/04/storm122-released.html
http://storm.apache.org/2018/06/04/storm113-released.html

P. Taylor Goetz

[CVE-2018-1332] Apache Storm user impersonation vulnerability

2018-06-05 Thread P. Taylor Goetz
CVE-2018-1332: Apache Storm user impersonation vulnerability

Severity: Important

Vendor:
The Apache Software Foundation

Versions Affected:
Apache Storm 1.2.1
Apache Storm 1.1.2

Description:
Apache Storm versions 1.0.6 and earlier, 1.1.2 and earlier, and 1.2.1 and 
earlier expose a vulnerability that could allow a user to impersonate another 
user when communicating with some Storm daemons.


Mitigation:
1.2.1 users should upgrade to version 1.2.2.
1.1.2 users should upgrade to version 1.1.3.
1.0.6 users should upgrade to version 1.1.3.

Apache Storm 1.2.2 artifacts are available for immediate download here:

http://www.us.apache.org/dist/storm/apache-storm-1.2.2/

Apache Storm 1.1.3 artifacts are available for immediate download here:

http://www.us.apache.org/dist/storm/apache-storm-1.1.3/

Credit:
This issue was discovered by Bobby Evans of the Apache Storm PMC

References:
http://storm.apache.org/2018/06/04/storm122-released.html
http://storm.apache.org/2018/06/04/storm113-released.html

P. Taylor Goetz

[ANNOUNCE] Apache Storm 1.1.3 Released

2018-06-04 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.1.3.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2018/06/04/storm113-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.1.3

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.1.3/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.2.2 Released

2018-06-04 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.2.2.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2018/06/04/storm122-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.2.2

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.2.2/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.2.1 Released

2018-02-19 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.2.1.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2018/02/19/storm121-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.2.1

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.2.1/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.2.0 Released

2018-02-16 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.2.0.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2018/02/15/storm120-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.2.0

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.2.0/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.1.2 Released

2018-02-16 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.1.2.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2018/02/15/storm112-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.1.2

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.1.2/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.0.6 Released

2018-02-16 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.6.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2018/02/14/storm106-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.6

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.0.6/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.0.5 Released

2017-09-15 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.5.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2017/09/15/storm105-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.5

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: http://www.us.apache.org/dist/storm/apache-storm-1.0.5/RELEASE_NOTES.html
[2]: https://issues.apache.org/jira/browse/STORM

[CVE-2017-9799] Apache Storm Possible Code Execution As A Different User

2017-08-09 Thread P. Taylor Goetz
Severity: High

Vendor: The Apache Software Foundation

Versions Affected:
Apache Storm 1.0.0, 1.0.1, 1.0.2, 1.0.3
Apache Storm 1.1.0

Description:
It was found that under some situations and configurations of Storm it is 
theoretically possible for the owner of a topology to trick the supervisor into 
launching a worker as a different, non-root user. In the worst case this could 
lead to secure credentials of the other user being compromised. This 
vulnerability only applies to Apache Storm installations with security 
components enabled.

Mitigation:
Users of the affected versions should apply one of the following mitigations:

- Upgrade to Apache Storm 1.0.4 or later
- Upgrade to Apache Storm 1.1.1 or later

Apache Storm 1.1.1 and 1.0.4 can be downloaded here:

http://storm.apache.org/downloads.html

Credit:
This issue was identified by the Apache Storm PMC

References:
https://github.com/apache/storm/blob/v1.1.1/SECURITY.md
https://github.com/apache/storm/blob/v1.0.4/SECURITY.md

[ANNOUNCE] Apache Storm 1.1.1 Released

2017-08-01 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.1.1.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2017/08/01/storm111-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.1.1

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.1.1/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM

[ANNOUNCE] Apache Storm 1.0.4 Released

2017-07-28 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.4.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2017/07/28/storm104-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.4

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.0.4/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM

Re: Storm topology freezes and does not process tuples from Kafka

2017-07-14 Thread P. Taylor Goetz
> Supervisor log at the time of freeze looks like below
> 
> 2017-07-12 14:38:46.712 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started


There are two situations where you would see those messages: When a topology is 
first deployed, and when a worker has died and is being restarted.

I suspect the latter. Have you looked at the worker logs for any indication 
that the workers might be crashing and what might be causing it?
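
As a rough aid for spotting restart episodes, the following Python sketch
summarizes how long each worker ID was reported as "still hasn't started" in a
supervisor log. It assumes each log entry sits on a single line in the format
quoted above (the mail client wrapped the lines); the function name
`pending_worker_spans` is hypothetical, and the two sample lines are copied
from the quoted log.

```python
import re
from datetime import datetime

# Matches lines such as:
# 2017-07-12 14:38:46.712 o.a.s.d.supervisor [INFO] <worker-id> still hasn't started
PATTERN = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) \S+ \[INFO\] (\S+) still hasn't started$"
)


def pending_worker_spans(lines):
    """Map worker ID -> (seconds between first and last message, message count)."""
    spans = {}
    for line in lines:
        match = PATTERN.match(line.strip())
        if not match:
            continue  # ignore unrelated log lines
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S.%f")
        worker = match.group(2)
        first, _last, count = spans.get(worker, (ts, ts, 0))
        spans[worker] = (first, ts, count + 1)
    return {
        worker: ((last - first).total_seconds(), count)
        for worker, (first, last, count) in spans.items()
    }


sample = [
    "2017-07-12 14:38:46.712 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started",
    "2017-07-12 14:38:50.714 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started",
]
for worker_id, (seconds, count) in pending_worker_spans(sample).items():
    print(worker_id, seconds, count)
```

Long or frequently repeated spans for a worker ID outside of the initial
deployment would point to the restart case, and the timestamps tell you where
to look in the worker logs.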

What components are involved in your topology?

-Taylor


> On Jul 12, 2017, at 5:26 AM, Sreeram  wrote:
> 
> Hi,
> 
> I am observing that my Storm topology intermittently freezes and does
> not continue to process tuples from Kafka. This happens frequently and
> when it happens this freeze lasts for 5 to 15 minutes. No content is
> written to any of the worker log files during this time.
> 
> The version of Storm I use is 1.0.2 and Kafka version is 0.9.0.
> 
> Any suggestions to solve the issue?
> 
> Thanks,
> Sreeram
> 
> Supervisor log at the time of freeze looks like below
> 
> 2017-07-12 14:38:46.712 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:47.212 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:47.712 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:48.213 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:48.713 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:49.213 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:49.713 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:50.214 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 2017-07-12 14:38:50.714 o.a.s.d.supervisor [INFO] d8958816-5bc8-449e-94e3-87ddbb2c3d02 still hasn't started
> 
> 
> Thread stacks (sample)
> Most of the worker threads during this freeze period look like one of the
> two stack traces below.
> 
> Thread 104773: (state = BLOCKED)
> - sun.misc.Unsafe.park(boolean, long) @bci=0 (Compiled frame; information may be imprecise)
> - java.util.concurrent.locks.LockSupport.parkNanos(java.lang.Object, long) @bci=20, line=215 (Compiled frame)
> - java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(java.util.concurrent.SynchronousQueue$TransferStack$SNode, boolean, long) @bci=160, line=460 (Compiled frame)
> - java.util.concurrent.SynchronousQueue$TransferStack.transfer(java.lang.Object, boolean, long) @bci=102, line=362 (Compiled frame)
> - java.util.concurrent.SynchronousQueue.poll(long, java.util.concurrent.TimeUnit) @bci=11, line=941 (Compiled frame)
> - java.util.concurrent.ThreadPoolExecutor.getTask() @bci=134, line=1066 (Compiled frame)
> - java.util.concurrent.ThreadPoolExecutor.runWorker(java.util.concurrent.ThreadPoolExecutor$Worker) @bci=26, line=1127 (Compiled frame)
> - java.util.concurrent.ThreadPoolExecutor$Worker.run() @bci=5, line=617 (Compiled frame)
> - java.lang.Thread.run() @bci=11, line=745 (Compiled frame)
> 
> Thread 147495: (state = IN_NATIVE)
> - sun.nio.ch.EPollArrayWrapper.epollWait(long, int, long, int) @bci=0 (Compiled frame; information may be imprecise)
> - sun.nio.ch.EPollArrayWrapper.poll(long) @bci=18, line=269 (Compiled frame)
> - sun.nio.ch.EPollSelectorImpl.doSelect(long) @bci=28, line=93 (Compiled frame)
> - sun.nio.ch.SelectorImpl.lockAndDoSelect(long) @bci=37, line=86 (Compiled frame)
> - sun.nio.ch.SelectorImpl.select(long) @bci=30, line=97 (Compiled frame)
> - org.apache.kafka.common.network.Selector.select(long) @bci=35, line=425 (Compiled frame)
> - org.apache.kafka.common.network.Selector.poll(long) @bci=81, line=254 (Compiled frame)
> - org.apache.kafka.clients.NetworkClient.poll(long, long) @bci=84, line=270 (Compiled frame)
> - org.apache.kafka.clients.producer.internals.Sender.run(long) @bci=343, line=216 (Compiled frame)
> - org.apache.kafka.clients.producer.internals.Sender.run() @bci=27, line=128 (Interpreted frame)
> - java.lang.Thread.run() @bci=11, line=745 (Compiled frame)



Re: Decreasing value of Complete Latency in Storm UI

2017-07-14 Thread P. Taylor Goetz
Over how long a period do you see the complete latency decreasing? Does it 
stabilize at some point?

It’s typical for a topology to start out slow in terms of latency, then speed 
up as the worker JVMs “warm up.” The warm up period can last several minutes.

The complete latency metric should give you a good idea what end-to-end latency 
is like. Just realize that the latency numbers in Storm UI are approximations 
due to sampling. Storm samples a small percentage of tuples for calculating 
latencies. You can configure it to sample all, but it will kill performance and 
should only be done for debugging purposes.
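
As a rough illustration of the sampling setting described above, the sketch 
below builds a topology configuration map that sets topology.stats.sample.rate 
(the key behind Config.TOPOLOGY_STATS_SAMPLE_RATE) to 1.0 so that every tuple 
is sampled. The class and method names are made up for illustration, and a 
plain Map stands in for Storm's Config class so the snippet compiles without 
Storm on the classpath.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical helper; a plain Map stands in for org.apache.storm.Config.
class LatencySamplingConf {
    // Key behind Config.TOPOLOGY_STATS_SAMPLE_RATE.
    static final String SAMPLE_RATE_KEY = "topology.stats.sample.rate";

    // Returns a topology conf that samples every tuple (debugging only,
    // since sampling everything is expected to hurt throughput).
    static Map<String, Object> sampleEverything() {
        Map<String, Object> conf = new HashMap<>();
        conf.put(SAMPLE_RATE_KEY, 1.0d);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(sampleEverything());
    }
}
```

With the real API the same entry would be put into the Config object that is 
passed when submitting the topology.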

-Taylor


> On Jul 14, 2017, at 7:29 AM, preethini v  wrote:
> 
> Hi,
> 
> I am running WordCountTopology with 3 worker nodes. The parallelism of spout, 
> split and count is 5, 8 and 12 respectively. I have enabled acking to measure 
> the complete latency of the topology.
> 
> I am considering  complete latency as a measure of end-to-end latency.
> 
> The complete latency is the time from when a tuple is emitted by a Spout until 
> Spout.ack() is called. Thus, it covers the tuple processing time, the time 
> the tuple spends in the internal input/output buffers, and the wait until the 
> ack for the tuple is received by the Spout.
> 
> The stats from storm UI show that the complete latency for a topology keeps 
> decreasing with time. 
> 
> 1. Is this normal? 
> 2. If yes, What explains the continuous decreasing complete latency value? 
> 3. Is complete latency a good measure of end-to-end latency of a topology?
> 
> Thanks,
> Preethini



Re: How about Flink compatibility mode in Storm

2017-05-19 Thread P. Taylor Goetz
Hi Alexandre,

I’m not aware of any effort to port the Flink API to Storm. The closest thing 
to what you are looking for in Storm is a new “Streams” API that closely 
resembles the Java 8 Streams API, allows usage of lambdas, etc. That API will 
likely become available in the upcoming Storm 2.0 release.

A word of caution regarding Flink’s Storm compatibility layer: there is some 
important fine print [1] to be aware of. I’m also unsure of the future of that 
effort, as it hasn’t been appreciably updated in almost 2 years aside from a 
recent update to package names for compatibility with Storm 1.0.

-Taylor

[1] 
https://github.com/apache/flink/blob/master/flink-contrib/flink-storm/README.md

> On May 19, 2017, at 3:29 PM, Alexandre Vermeerbergen 
>  wrote:
> 
> Hello,
> 
> These days, it's becoming difficult to choose between Storm and Flink. I love 
> Storm's bolts & spouts philosophy, but I'm missing the event-time based sliding 
> windows promised by Flink.
> 
> Flink offers a solution to run Storm topologies.
> 
> It's nice, but we're quite used to setting up Storm clusters, with 
> scalability, HA and upgrades in mind. Redoing everything with Mesos or one of 
> the other schedulers supported by Flink is feasible, but it will require some 
> time for us to be as production-ready as we are today with Storm.
> 
> Since we're so used to Storm, we'd love to see the opposite option: 
> the ability to use Flink's high-level API and semantics over Storm.
> 
> Is there any such project that'll eventually allow running Flink code in a 
> Storm topology?
> 
> Best regards,
> Alexandre Vermeerbergen.



Re: When would you have multiple tasks in an executor?

2017-05-13 Thread P. Taylor Goetz
Each executor is a separate JVM thread, so in some cases you may want to 
conserve threads. Consider a case where you have many tasks, some of which are 
very lightweight. In that case, it may make sense to group the lightweight 
tasks under one executor/thread. JVM threads come with a certain amount of 
overhead.

It's largely a tuning situation. In many cases the defaults are fine. But we 
expose it to developers for the situations where the defaults might not be the 
best.
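
To make the executor/task relationship concrete, here is a small 
self-contained sketch. It is not Storm code: it only models how a component's 
task ids could be spread over a smaller number of executor threads, e.g. 8 
tasks over 3 executors. In a real topology the counts would come from the 
parallelism hint passed to TopologyBuilder.setBolt(...) and from 
setNumTasks(...). The class name and the contiguous split used here are 
assumptions for illustration, and Storm's own assignment may differ in detail.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative model only; not part of the Storm API.
class TaskGrouping {
    // Splits task ids 0..numTasks-1 into numExecutors contiguous groups,
    // giving the first (numTasks % numExecutors) groups one extra task.
    static List<List<Integer>> group(int numTasks, int numExecutors) {
        if (numExecutors <= 0 || numTasks < numExecutors) {
            throw new IllegalArgumentException("need 1 <= executors <= tasks");
        }
        List<List<Integer>> executors = new ArrayList<>();
        int base = numTasks / numExecutors;
        int extra = numTasks % numExecutors;
        int next = 0;
        for (int e = 0; e < numExecutors; e++) {
            int size = base + (e < extra ? 1 : 0);
            List<Integer> tasks = new ArrayList<>();
            for (int i = 0; i < size; i++) {
                tasks.add(next++);
            }
            executors.add(tasks);
        }
        return executors;
    }

    public static void main(String[] args) {
        // 8 lightweight tasks share 3 threads instead of occupying 8.
        System.out.println(group(8, 3));
    }
}
```

Each inner list represents the tasks that one thread would run in turn, which 
is the thread-saving effect described above.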

The fact that Flux doesn't support it is a simple oversight -- core 
functionality was the initial focus. I'll look into adding it to Flux, but 
would also support any pull request that added the capability.

(Adding dev@ to include anyone who might be interested.)

-Taylor

> On May 13, 2017, at 7:24 PM, S G  wrote:
> 
> Thanks Anshu.
> If you can provide an example, that would be much appreciated.
> 
>> On Sat, May 13, 2017 at 1:24 PM, anshu shukla  wrote:
>> The main reason for having multiple tasks in an executor is dynamic scaling 
>> based on input rate or resource requirements, using storm rebalance.
>> 
>>> On Sun, May 14, 2017 at 12:52 AM, S G  wrote:
>>> 
>>> Hi,
>>> 
>>> As per this guide: 
>>> http://storm.apache.org/releases/current/Understanding-the-parallelism-of-a-Storm-topology.html,
>>> 
>>> An executor is a thread containing many tasks.
>>> And each task is an instance of a spout or a bolt.
>>> 
>>> I cannot imagine when someone would want to have multiple tasks in an 
>>> executor.
>>> AFAIK, the only possible use-case is when the tasks in an executor have to be 
>>> synchronous, i.e. the first task must wait for the second task (in the same 
>>> executor) to complete and that must wait for the third task (in the same 
>>> executor) to complete and so on.
>>> 
>>> But that requires the tasks in an executor to be different from each other,
>>> whereas the current API does not permit tasks of different types within an 
>>> executor.
>>> 
>>> So I am "guessing" that multiple tasks in an executor is just a historical 
>>> artifact with no real use-case.
>>> And that is also a reason why Flux has no option to specify num-tasks.
>>> 
>>> $ grep -ri task storm/flux/flux-core/src
>>> ===> nothing !
>>> 
>>> Please confirm if that is a correct understanding.
>>> Also, if that is correct, we might be able to squeeze some performance by 
>>> allowing only a single task per executor.
>>> 
>>> Thanks
>>> SG
>>> 
>>> 
>> 
>> 
>> 
>> -- 
>> Thanks & Regards,
>> Anshu Shukla
> 


Re: Performance of Multi-Lang protocol

2017-05-12 Thread P. Taylor Goetz
Adding dev@ mailing list...

There is definitely a performance hit. But it shouldn't be as drastic as you 
describe.

Can you share some of your environment characteristics?

I've been looking at the Apache Arrow project (full disclosure: I'm a PMC 
member) as a means for improved performance (it essentially would remove the 
performance hit for serialize/deserialize operations). This is particularly 
relevant to multi-lang, but could also apply to same-machine inter-worker 
communication.

At this point I don't feel Arrow is at a production level of maturity, but it 
is getting close. I definitely feel it's worth exploring at the PoC level.

-Taylor

> On May 12, 2017, at 6:56 PM, Mauro Giusti  wrote:
> 
> Hi –
> We are using multi-lang to pass data between storm and mono –
>  
> We observe a 6x time increase when messages go from spout to bolt if the bolt 
> is in mono vs. being in Java –
>  
> Java can process 10,000 records in 0.7 seconds, while mono requires 4.5 
> seconds.
> The mono bolt was an empty one created with Storm.Net.Adapter library
>  
> This is on a single machine topology – we are still in dev phase and using 
> this solution for now -
>  
> Is this expected?
> Should we try to minimize multi-lang and inter-process or is this a problem 
> with my specific scenario (mono and/or single machine) ?
>  
> Thank you –
> Mauro.


Re: Searching Archives

2017-05-03 Thread P. Taylor Goetz
Give this a try:

https://lists.apache.org/list.html?d...@storm.apache.org

-Taylor

> On May 3, 2017, at 3:16 PM, Ramin Farajollah (BLOOMBERG/ 731 LEX) 
>  wrote:
> 
> Good idea but it does not work for me. Does it work for you?
> 
> STORM-1843 site:mail-archives.apache.org
> 
> Should find something like this:
> http://mail-archives.apache.org/mod_mbox/storm-dev/201705.mbox/%3CCAFA8zGmABq-muz%2BJjmH4SCN-qOtUQHrc8Aehdiw2YSAzbR8yKw%40mail.gmail.com%3E
> 
> 
> 
> From: mfo...@hortonworks.com 
> Subject: Re: Searching Archives
> Google indexes the apache email archives.  Specify 
> “site:mail-archives.apache.org” in your search string.
> 
>  
> 
> From: "Ramin Farajollah (BLOOMBERG/ 731 LEX)" 
> Reply-To: "user@storm.apache.org" , Ramin Farajollah 
> 
> Date: Wednesday, May 3, 2017 at 8:18 AM
> To: "user@storm.apache.org" 
> Subject: Searching Archives
> 
>  
> 
> Hi,
> 
>  
> 
> Is there a way to search the archives? I only see browsing by month.
> 
>  
> 
> If available, this will significantly reduce repeat questions.
> 
>  
> 
> http://mail-archives.apache.org/mod_mbox/storm-user/
> 
>  
> 
> 
> 
> << "A mind is like a parachute. It doesn't work if it is not open." Frank 
> Zappa >>
> 


Re: github source code gone?

2017-04-28 Thread P. Taylor Goetz
Apache infra is working on it. This is a GitHub issue, not an ASF one. The 
canonical ASF repos are unaffected, so our source repo is safe.

-Taylor

> On Apr 28, 2017, at 5:46 PM, Michael Moss  wrote:
> 
> https://github.com/apache/storm
> 
> "This repository is empty."
> 
> Am I looking in the wrong place?


Re: [ANNOUNCE] Apache Storm 1.1.0 Released

2017-03-30 Thread P. Taylor Goetz
Thanks for the catch. I will correct it.

> On Mar 30, 2017, at 2:23 PM, Alexandre Vermeerbergen 
>  wrote:
> 
> Hello,
> Looks like there's a small mistake in download page 
> (http://storm.apache.org/downloads.html):
> 
> It says for 1.1.0 release:
> ==
> Storm artifacts are hosted in Maven Central 
> <http://search.maven.org/#search%7Cga%7C1%7Cg%3A%22org.apache.storm%22>. You 
> can add Storm as a dependency with the following coordinates:
> groupId: org.apache.storm
> artifactId: storm-core
> version: 1.0.2
> ==
> Shouldn't the version rather be "1.1.0" instead of "1.0.2" ?
> 
> regards,
> Alexandre Vermeerbergen
> 
> 2017-03-30 19:46 GMT+02:00 P. Taylor Goetz <ptgo...@apache.org>:
> The Apache Storm community is pleased to announce the release of Apache Storm 
> version 1.1.0.
> 
> Storm is a distributed, fault-tolerant, and high-performance realtime 
> computation system that provides strong guarantees on the processing of data. 
> You can read more about Storm on the project website:
> 
> http://storm.apache.org
> 
> Downloads of source and binary distributions are listed in our download
> section:
> 
> http://storm.apache.org/downloads.html
> 
> You can read more about this release in the following blog post:
> 
> http://storm.apache.org/2017/03/29/storm110-released.html
> 
> Distribution artifacts are available in Maven Central at the following 
> coordinates:
> 
> groupId: org.apache.storm
> artifactId: storm-core
> version: 1.1.0
> 
> The full list of changes is available here[1]. Please let us know [2] if you 
> encounter any problems.
> 
> Regards,
> 
> The Apache Storm Team
> 
> [1]: https://github.com/apache/storm/blob/v1.1.0/CHANGELOG.md
> [2]: https://issues.apache.org/jira/browse/STORM



[ANNOUNCE] Apache Storm 1.1.0 Released

2017-03-30 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.1.0.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2017/03/29/storm110-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.1.0

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.1.0/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM

Re: Stateful topology hangs

2017-02-22 Thread P. Taylor Goetz
What version of Storm are you using? And which Kafka spout (i.e. storm-kafka or 
storm-kafka-client)?

-Taylor

> On Feb 22, 2017, at 4:32 PM, Abhishek Raj  wrote:
> 
> Hello, I am using storm's state management feature in a topology. 
> The topology has a kafkaspout and a StatefulBolt which uses 
> RedisKeyValueState. What I observe is that after some time of running 
> smoothly, the spout just stops consuming from the kafka topic and the 
> $checkpointspout stops emitting any checkpoint tuples. The topology just 
> hangs and there are no error messages in the logs. The spout acts as if there 
> are no more messages to consume even though there are. This happens very 
> randomly and if I restart the topology the error may or may not come.
> 
> Can anyone please help in debugging this? I tried searching on jira but 
> couldn't find a bug related to this issue.
> 
> Thanks,
> 
> -- 
> Abhishek



Re: Is there a way to register metrics in flux yaml

2017-02-15 Thread P. Taylor Goetz
Hi Marc,

Currently there is not a way to directly register metrics using Flux. It is 
something that is probably doable, though. Feel free to file a JIRA to add this 
functionality.

-Taylor

> On Feb 15, 2017, at 1:49 PM, Marc Zbyszynski  wrote:
> 
> Hello Everyone,
> 
> I am using flux to deploy multilang topologies with a lot of FluxShellBolt 
> instances. Do you know if there is any way to register a metric directly in 
> the flux yaml without writing any Java code? I know that the multilang 
> protocol supports submitting data to metrics that have been registered, but I 
> don't see any way to register the metrics in Flux or by invoking a method on 
> FluxShellBolt. Is that right, or am I missing something?
> 
> Thank you for your help!
> 
> -Marc



Re: [ANNOUNCE] Apache Storm 1.0.3 Released

2017-02-14 Thread P. Taylor Goetz
GitHub releases are created every time you create a tag, for example during the 
release candidate process. They should not be considered official releases. 
Official releases are also always signed with the GPG signature of an Apache 
committer.

It’s always best to download releases from an official mirror.

-Taylor



> On Feb 14, 2017, at 4:50 PM, Andrew Xor  wrote:
> 
> from my understanding you can always fall back to github releases 
> <https://github.com/apache/storm/releases> as well
> 
> Best,
> 
> A.
> 
> On Tue, Feb 14, 2017 at 9:44 PM, P. Taylor Goetz <ptgo...@apache.org> wrote:
> My apologies for sending this out early; not all web servers are in sync yet, 
> so you may get a 404 depending on which server you hit.
> 
> Direct link for download: 
> http://www.apache.org/dyn/closer.lua/storm/apache-storm-1.0.3/
> 
> Changelog: https://github.com/apache/storm/blob/v1.0.3/CHANGELOG.md
> 
> -Taylor
> 
> > On Feb 14, 2017, at 3:43 PM, P. Taylor Goetz <ptgo...@apache.org> wrote:
> >
> > The Apache Storm community is pleased to announce the release of Apache 
> > Storm version 1.0.3.
> >
> > Storm is a distributed, fault-tolerant, and high-performance realtime 
> > computation system that provides strong guarantees on the processing of 
> > data. You can read more about Storm on the project website:
> >
> > http://storm.apache.org
> >
> > Downloads of source and binary distributions are listed in our download
> > section:
> >
> > http://storm.apache.org/downloads.html
> >
> > You can read more about this release in the following blog post:
> >
> > http://storm.apache.org/2017/02/14/storm103-released.html
> >
> > Distribution artifacts are available in Maven Central at the following 
> > coordinates:
> >
> > groupId: org.apache.storm
> > artifactId: storm-core
> > version: 1.0.3
> >
> > The full list of changes is available here[1]. Please let us know [2] if 
> > you encounter any problems.
> >
> > Regards,
> >
> > The Apache Storm Team
> >
> > [1]: https://github.com/apache/storm/blob/v1.0.3/CHANGELOG.md
> > [2]: https://issues.apache.org/jira/browse/STORM
> 
> 



Re: [ANNOUNCE] Apache Storm 1.0.3 Released

2017-02-14 Thread P. Taylor Goetz
My apologies for sending this out early; not all web servers are in sync yet, 
so you may get a 404 depending on which server you hit.

Direct link for download: 
http://www.apache.org/dyn/closer.lua/storm/apache-storm-1.0.3/

Changelog: https://github.com/apache/storm/blob/v1.0.3/CHANGELOG.md

-Taylor

> On Feb 14, 2017, at 3:43 PM, P. Taylor Goetz  wrote:
> 
> The Apache Storm community is pleased to announce the release of Apache Storm 
> version 1.0.3.
> 
> Storm is a distributed, fault-tolerant, and high-performance realtime 
> computation system that provides strong guarantees on the processing of data. 
> You can read more about Storm on the project website:
> 
> http://storm.apache.org
> 
> Downloads of source and binary distributions are listed in our download
> section:
> 
> http://storm.apache.org/downloads.html
> 
> You can read more about this release in the following blog post:
> 
> http://storm.apache.org/2017/02/14/storm103-released.html
> 
> Distribution artifacts are available in Maven Central at the following 
> coordinates:
> 
> groupId: org.apache.storm
> artifactId: storm-core
> version: 1.0.3
> 
> The full list of changes is available here[1]. Please let us know [2] if you 
> encounter any problems.
> 
> Regards,
> 
> The Apache Storm Team
> 
> [1]: https://github.com/apache/storm/blob/v1.0.3/CHANGELOG.md
> [2]: https://issues.apache.org/jira/browse/STORM



[ANNOUNCE] Apache Storm 1.0.3 Released

2017-02-14 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.3.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

http://storm.apache.org/2017/02/14/storm103-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.3

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.0.3/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM

Re: Storm (Core) versus Storm (with Trident)

2017-01-25 Thread P. Taylor Goetz
If you could provide additional information about your environment (as much as 
possible) that would help us answer your questions.

-Taylor

> On Jan 25, 2017, at 5:35 PM, Thomas Cristanis  
> wrote:
> 
> Thank you, P. Taylor.
> I asked this question because I'm running some basic benchmarks and I have the 
> opposite result. The Storm Core (Kafka) is about 2x and tries more than 
> Trident (Kafka).
> I still do not understand what's happening.
> 
> 
> --
> Thomas Cristanis
> 
> 2017-01-25 19:21 GMT-03:00 P. Taylor Goetz :
>> It comes down to a tradeoff between throughput and latency (as well as how 
>> and what you do in your topologies), and tuning parameters.
>> 
>> In the benchmarking I’ve done, Storm Core has lower latency and lower 
>> throughput, Trident has higher throughput (~2x) and higher latency (~2-3x).
>> 
>> But again, it really depends on what other systems you’re interacting with. 
>> One slow external system could easily negate any performance difference 
>> between the two.
>> 
>> -Taylor
>> 
>> 
>> > On Jan 25, 2017, at 4:03 PM, Thomas Cristanis  
>> > wrote:
>> >
>> > Has anyone done any benchmarks comparing the performance of Apache Storm 
>> > (Core) against Apache Storm (with Trident)?
>> > If so, what were the results?
>> >
>> > --
>> > Thomas Cristanis
>> 
> 


Re: Storm (Core) versus Storm (with Trident)

2017-01-25 Thread P. Taylor Goetz
It comes down to a tradeoff between throughput and latency (as well as how and 
what you do in your topologies), and tuning parameters.

In the benchmarking I’ve done, Storm Core has lower latency and lower 
throughput, Trident has higher throughput (~2x) and higher latency (~2-3x).

But again, it really depends on what other systems you’re interacting with. One 
slow external system could easily negate any performance difference between the 
two.

-Taylor


> On Jan 25, 2017, at 4:03 PM, Thomas Cristanis  
> wrote:
> 
> Has anyone done any benchmarks comparing the performance of Apache Storm 
> (Core) against Apache Storm (with Trident)?
> If so, what were the results?
> 
> --
> Thomas Cristanis



Re: Zero Worker Bug in Topology on 1.0.2

2016-11-22 Thread P. Taylor Goetz
What does the Storm UI show? Are you out of slots?

-Taylor


> On Nov 21, 2016, at 6:28 PM, Joaquin Menchaca  wrote:
> 
> I am not sure what is causing this, but it seems that after 70 days of 
> running the cluster, it will not be able to run a topology with any workers 
> or tasks.
> 
> I tried a sample topology, which used to work, but now it is broken?
> 
> storm jar 
> /usr/lib/apache/storm/1.0.2/examples/storm-starter/storm-starter-topologies-1.0.2.jar
>  org.apache.storm.starter.ExclamationTopology ExclamationTopology
> 
> 
> Topology_name        Status     Num_tasks  Num_workers  Uptime_secs
> -------------------------------------------------------------------
> ExclamationTopology  ACTIVE     0          0            9
> 
> Any suggestions?
> 
> -- 
> 
> 是故勝兵先勝而後求戰,敗兵先戰而後求勝。



Re: Explicit Topology Parallelism Shape

2016-11-07 Thread P. Taylor Goetz
Yes, to do what you want you would need to implement a custom scheduler.

More details can be found here: 
http://storm.apache.org/releases/1.0.2/Storm-Scheduler.html 


-Taylor

> On Nov 7, 2016, at 11:31 AM, Arthur Maciejewicz  wrote:
> 
> Hello All,
> 
>  Is there a way to make the "shape" of topology parallelism explicit? For 
> example, assume:
> 
> * W workers on W nodes
> * 2 spouts with 1 executor each (total of 2 executors)
> * 1 "mapper" bolt with M executors
> * 1 "process" bolt with N executors.
> 
> Is it currently possible to co-locate the spouts with the "mappers" in 2 
> workers, while pinning the N "process" executors to the remaining (W-2) 
> workers?
> 
> Visually this is what I want to do:
> 
> 
>  2 Workers             W-2 Workers
> 
>                        +---------+
>  +-------+             | process |
>  | spout |             +---------+
>  |   +   |  +-->       +---------+
>  |  map  |             | process |
>  +-------+             +---------+
>                        +---------+
>  +-------+             | process |
>  | spout |             +---------+
>  |   +   |  +-->           ...
>  |  map  |             +---------+
>  +-------+             | process |
>                        +---------+
>                        +---------+
>                        | process |
>                        +---------+
> Thanks,
> 
> Arthur



Re: Storm over both Ethernet and Infiniband on the same cluster

2016-11-04 Thread P. Taylor Goetz
In your storm.yaml configuration on each supervisor machine, if you set 
“storm.local.hostname” to the Infiniband IP that should do what you want.

-Taylor


> On Nov 4, 2016, at 2:06 PM, Muhammad Haseeb Javed <11besemja...@seecs.edu.pk> 
> wrote:
> 
> That is definitely a plausible solution but not the one that would apply to 
> my case as it is a shared user cluster with me only having basic, non-root 
> access to it. I can not even do a sudo on it, let alone disable network 
> interconnects. Do you have any other solution in mind, be it complex, by 
> which this could be done?
> 
> On Fri, Nov 4, 2016 at 1:56 PM, Matt Foley wrote:
> Sorry if this is too crude, but the simplest answer is to disable one network 
> at a time, on all the servers in the cluster.  This can be done in software, 
> with a usually minor edit of the OS configuration files (which ones depends 
> on the particular linux you’re running) then restarting the network services. 
>  Rebooting is not usually required.
> 
>  
> 
> I’m assuming that, since you’re running benchmarks, you have dedicated 
> hardware that you can tweak as desired.
> 
>  
> 
> Hope this helps,
> 
> --Matt
> 
>  
> 
> From: Muhammad Haseeb Javed <11besemja...@seecs.edu.pk>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Thursday, November 3, 2016 at 10:49 PM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Storm over both Ethernet and Infiniband on the same cluster
> 
>  
> 
> I am trying to benchmark performance differences for Storm when run on 
> Ethernet compared to an Infiniband network. I have a cluster in which all 
> nodes are connected through both Ethernet and Infiniband. 
> 
>  
> 
> But I am lost as to how to configure Storm to perform communication over 
> Infiniband in one case and Ethernet in another. From what I understand, Storm 
> does not allow us to specify the IPs of slave (supervisor) nodes, or else we 
> could have just entered the address of the Infiniband interconnect there.
> 
> 



Re: Messages are not being delivered fast enough

2016-11-01 Thread P. Taylor Goetz
You can safely ignore that message, as it only relates to delivery of metrics 
information (i.e. not topology data). It has since been set to a DEBUG level 
message, but that change isn’t in an official release yet.

What it means is that the handler got more than one metrics message when it was 
only expecting one. In that case it will take only the last metrics message 
and discard the rest.

-Taylor

> On Oct 18, 2016, at 1:56 AM, Daniccan VP  wrote:
> 
> Hi,
> 
> I am getting the following warning messages often in my Storm Cluster. I am 
> using Kafka Spout to get the messages from Kafka to Storm. What does this 
> warning mean ?
> 
> 2016-10-17 09:01:40.359 o.a.s.m.n.StormClientHandler [WARN] Messages are not 
> being delivered fast enough, got 2 metrics messages at once
> 
> Thanks and Regards,
> Daniccan VP | Junior Software Engineer
> Email : danic...@iqsystech.com





Re: Is topology.tick.tuple.freq.secs a topology-level or component-level setting?

2016-10-31 Thread P. Taylor Goetz
Hi Matt,

It depends on where/how you set it:

storm.yaml --> cluster wide
topology conf. --> topology wide
getComponentConfiguration --> component specific

If you want to control it at the component level, leave it out of the 
storm.yaml and topology configuration.
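
A minimal sketch of the component-level option, assuming one bolt wants a tick 
every second and another every 10 seconds. In a real bolt this method would 
override getComponentConfiguration() (available via BaseRichBolt and similar 
base classes); here a standalone class with hypothetical names is used so the 
snippet compiles without Storm on the classpath. The key string is the one 
behind Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS.

```java
import java.util.HashMap;
import java.util.Map;

// Stand-in for a bolt class; the class name and intervals are illustrative.
class TickingBoltSketch {
    // Key behind Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS.
    static final String TICK_KEY = "topology.tick.tuple.freq.secs";

    private final int tickSecs;

    TickingBoltSketch(int tickSecs) {
        this.tickSecs = tickSecs;
    }

    // Same shape as the getComponentConfiguration() a real bolt would override:
    // the returned map applies to this component only.
    public Map<String, Object> getComponentConfiguration() {
        Map<String, Object> conf = new HashMap<>();
        conf.put(TICK_KEY, tickSecs);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(new TickingBoltSketch(1).getComponentConfiguration());
        System.out.println(new TickingBoltSketch(10).getComponentConfiguration());
    }
}
```

Each instance reports its own interval, which is what makes the setting 
component specific when it is supplied this way.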

-Taylor


> On Oct 31, 2016, at 2:54 PM, Matt Foley  wrote:
> 
> Hi,
> There is ambiguity in the available documentation:
> 
> The name “topology.tick.tuple.freq.secs” implies it is topology-level.
> But it is clearly set at the component level, in 
> Bolt::getComponentConfiguration() method.
> At which level does it take effect?
> 
> The primary example blog at 
> http://hortonworks.com/blog/apache-storm-design-pattern-micro-batching/ 
>  
> says:
> “Cons: Tick tuples are topology specific—not bolt specific. As such, you 
> cannot use them for managing different batch schedules for different 
> components within topologies.”
> 
> And a couple of weeks ago I found a Stack Overflow comment (unfortunately I can’t 
> find it again now) that if you try to set different tick frequencies for 
> different bolts in the same topology, Storm would use the smaller interval.
> 
> BUT the release note (which is official, come to think of it) at 
> http://storm.apache.org/2012/08/02/storm080-released.html 
>  says:
> TOPOLOGY-TICK-TUPLE-FREQ-SECS is “Meant to be used as a component-specific 
> configuration.”
> 
> 
> Which is correct?  The underlying question is, CAN you set different tick 
> frequencies for different bolts in the same topology?
> Or if you try, what happens?
> 
> Thanks,
> --Matt





Re: WorkerHook deserialization problem

2016-10-28 Thread P. Taylor Goetz
I was able to verify this to be a bug in how worker hooks work in local mode.

In trying to see if this affects distributed mode as well, I found a more 
serious issue that prevents workers from shutting down gracefully (and thus 
prevents shutdown hooks from running):

https://issues.apache.org/jira/browse/STORM-2176 


So for the time being I don’t believe worker shutdown hooks work in either 
local or distributed mode. I can confirm the start portion of worker hooks 
functions properly, but not shutdown. Hopefully we will be able to fix both 
these issues in an upcoming release.

-Taylor


> On Oct 21, 2016, at 9:58 AM, Kevin Peek  wrote:
> 
> I am running into problems with WorkerHooks on a local cluster. Even using 
> only a BaseWorkerHook, I get an Exception. When I run the following code, an 
> EOFException is thrown - it seems the Worker is trying to deserialize an 
> empty byte[] for one of the WorkerHooks. Comment out the line adding the hook 
> and this runs fine.
> 
> Can someone help me understand what is going wrong here and whether or not 
> this is strictly an issue with the LocalCluster and how I am using it.
> 
> 
> TopologyBuilder builder = new TopologyBuilder();
> builder.setSpout("spoutId", new RandomNumberSpout());
> builder.addWorkerHook(new BaseWorkerHook());
> StormTopology topology = builder.createTopology();
> Config config = new Config();
> config.setMessageTimeoutSecs(1);
> String topologyName = "dummy-topology";
> 
> LocalCluster cluster = new LocalCluster();
> cluster.submitTopology(topologyName, config, topology);
> Thread.sleep(5000);
> cluster.killTopology(topologyName);
> Thread.sleep(1);
> cluster.shutdown();
> 
> 
> Produces:
> 
> 
> java.lang.RuntimeException: java.io.EOFException
> 
>   at org.apache.storm.utils.Utils.javaDeserialize(Utils.java:185)
>   at 
> org.apache.storm.daemon.worker$run_worker_shutdown_hooks$iter__8540__8544$fn__8545.invoke(worker.clj:576)
>   at clojure.lang.LazySeq.sval(LazySeq.java:40)
>   at clojure.lang.LazySeq.seq(LazySeq.java:49)
>   at clojure.lang.RT.seq(RT.java:507)
>   at clojure.core$seq__4128.invoke(core.clj:137)
>   at clojure.core$dorun.invoke(core.clj:3009)
>   at clojure.core$doall.invoke(core.clj:3025)
>   at 
> org.apache.storm.daemon.worker$run_worker_shutdown_hooks.invoke(worker.clj:574)
>   at 
> org.apache.storm.daemon.worker$fn__8555$exec_fn__2466__auto__$reify__8557$shutdown_STAR___8577.invoke(worker.clj:691)
>   at 
> org.apache.storm.daemon.worker$fn__8555$exec_fn__2466__auto__$reify$reify__8603.shutdown(worker.clj:704)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
>   at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313)
>   at 
> org.apache.storm.process_simulator$kill_process.invoke(process_simulator.clj:46)
>   at 
> org.apache.storm.daemon.supervisor$shutdown_worker.invoke(supervisor.clj:286)
>   at 
> org.apache.storm.daemon.supervisor$fn__9307$exec_fn__2466__auto__$reify__9332.shutdown_all_workers(supervisor.clj:852)
>   at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>   at 
> sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>   at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>   at java.lang.reflect.Method.invoke(Method.java:497)
>   at clojure.lang.Reflector.invokeMatchingMethod(Reflector.java:93)
>   at clojure.lang.Reflector.invokeNoArgInstanceMember(Reflector.java:313)
>   at 
> org.apache.storm.testing$kill_local_storm_cluster.invoke(testing.clj:199)
>   at org.apache.storm.LocalCluster$_shutdown.invoke(LocalCluster.clj:66)
>   at org.apache.storm.LocalCluster.shutdown(Unknown Source)
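
The EOFException in the trace above matches what standard Java deserialization does when it is handed an empty byte[]. The following is a minimal JDK-only sketch of that behavior. It does not call Storm, and it assumes (as the trace suggests, but the thread does not confirm) that Utils.javaDeserialize wraps plain Java serialization. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

/** Plain JDK serialization demo; it does not call any Storm code. */
final class EmptyDeserializeSketch {

    /** Serializes a value with standard Java serialization. */
    static byte[] serialize(Object value) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns true if deserializing the bytes fails with EOFException. */
    static boolean failsWithEof(byte[] data) {
        try (ObjectInputStream in =
                 new ObjectInputStream(new ByteArrayInputStream(data))) {
            in.readObject();
            return false;
        } catch (EOFException e) {
            // An empty stream has no serialization header to read.
            return true;
        } catch (IOException | ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(failsWithEof(new byte[0]));          // empty payload
        System.out.println(failsWithEof(serialize("a hook")));  // valid payload
    }
}
```

If the worker really does receive a zero-length payload for the hook at shutdown, this is the failure mode one would expect to see.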



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: When to use MemoryMapState while performing a persistentAggregate in Trident?

2016-10-26 Thread P. Taylor Goetz
Storm has support for a Redis-backed map state:

https://github.com/apache/storm/blob/master/external/storm-redis/src/main/java/org/apache/storm/redis/trident/state/RedisMapState.java
 


-Taylor

> On Oct 26, 2016, at 5:17 AM, Dinesh Babu K G  wrote:
> 
> Thanks Arun. Does Storm recommend any specific in-memory store for 
> persistence? I see memcached given as an example in the Storm documentation 
> but no word about other stores.
> 
> On Wed, Oct 26, 2016 at 2:37 PM Arun Mahadevan wrote:
> 
> 
> MemoryMapState is more for testing and does not provide any persistence. It 
> uses a HashMap internally. If you want persistence you need to use the one 
> based on Redis or another store.
> 
> 
> 
> Thanks,
> 
> Arun
> 
> 
> 
> From: Dinesh Babu K G <dinesh@gmail.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Wednesday, October 26, 2016 at 2:31 PM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: When to use MemoryMapState while performing a persistentAggregate in 
> Trident?
> 
> 
> 
> Hi all,
> 
> 
> 
> I would like to understand when to use MemoryMapState vs. a state that 
> is based on an in-memory data store (like memcached, redis or aerospike) while 
> doing persistentAggregate() in Trident.
> 
> 
> 
> Are there any pros & cons between the two approaches?
> 
> 
> 
> Thanks,
> 
> Dinesh Babu K.G
> 





Re: java.lang.OutOfMemoryError: GC overhead limit exceeded in split bolt

2016-10-26 Thread P. Taylor Goetz
That topology doesn’t use reliable delivery, so without the `sleep()` there is 
nothing to throttle the spout. It will emit as fast as it can, which is faster 
than the bolts in the topology can process the tuples.

Rather than removing the sleep, try reducing it to something smaller, like 5.
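
To make the effect concrete, here is a toy JDK-only model of a producer that outpaces its consumer. It is not Storm code; the class name and the per-tick rates are hypothetical and chosen only for illustration.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model (not Storm code): a producer that outpaces its consumer. */
final class BacklogSketch {

    /**
     * Returns how many items are still queued after `ticks` steps when
     * `emitPerTick` items are enqueued and at most `processPerTick`
     * items are dequeued on every step.
     */
    static int backlogAfter(int ticks, int emitPerTick, int processPerTick) {
        Deque<Integer> queue = new ArrayDeque<>();
        for (int t = 0; t < ticks; t++) {
            for (int i = 0; i < emitPerTick; i++) {
                queue.addLast(t);          // spout emits without waiting
            }
            for (int i = 0; i < processPerTick && !queue.isEmpty(); i++) {
                queue.removeFirst();       // bolt drains what it can
            }
        }
        return queue.size();
    }

    public static void main(String[] args) {
        // Hypothetical rates: 10 emitted per tick against a capacity of 4.
        System.out.println(backlogAfter(1000, 10, 4));
        // Throttled to the consumer's capacity, nothing accumulates.
        System.out.println(backlogAfter(1000, 4, 4));
    }
}
```

With nothing bounding the queue, the backlog grows linearly with time, which is how a worker heap ends up full of pending tuples.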

-Taylor

> On Oct 25, 2016, at 10:51 PM, Junguk Cho  wrote:
> 
> Hi, All.
> 
> I am evaluating Storm performance with the WordCount example.
> 
> In the spout, I removed "Utils.sleep(100)" in RandomSentenceSpout.java, 
> so it emits sentences very fast.
> Whenever I run WordCountTopology, the "split" bolt dies with the messages 
> below and is relaunched.
> 
> 2016-10-25 20:37:39.395 STDERR [INFO] java.lang.OutOfMemoryError: GC overhead 
> limit exceeded
> 2016-10-25 20:37:39.398 STDERR [INFO] Dumping heap to artifacts/heapdump ...
> 
> I increased worker.heap.memory.mb to 2GB.
> 
> The split bolt is evidently overloaded by the volume of input from the spout; 
> its execute logic is simple (splitting a sentence into words and sending them 
> to a count bolt).
> So I changed the number of split and count bolts while still using 1 spout, 
> but I still saw the same problem.
> 
> Any advice will be welcome.
> 
> Thanks,
> Junguk
> 
> 
> 





Re: Messages are not being delivered fast enough warning

2016-10-26 Thread P. Taylor Goetz
You can safely ignore that message, as it only relates to delivery of metrics 
information (i.e. not topology data). It has since been set to a DEBUG level 
message, but that change isn’t in an official release yet.

What it means is that the handler got more than one metrics message when it was 
only expecting one. In that case it will only take the last metrics message 
and discard the rest.
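
The "keep the last, drop the rest" behavior can be sketched in a few lines of plain Java. This is only an illustration of the idea, not the actual StormClientHandler code; the class and method names are hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

/** Illustration only: keep the newest message from a batch. */
final class LastMessageSketch {

    /** Returns the last element of the batch, or empty if there is none. */
    static <T> Optional<T> keepLast(List<T> received) {
        if (received.isEmpty()) {
            return Optional.empty();
        }
        // Everything before the final element is discarded.
        return Optional.of(received.get(received.size() - 1));
    }

    public static void main(String[] args) {
        List<String> batch = Arrays.asList("metrics-1", "metrics-2", "metrics-3");
        System.out.println(keepLast(batch).get());
    }
}
```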

-Taylor

> On Oct 25, 2016, at 4:00 AM, Chen Junfeng  wrote:
> 
> I found some "StormClientHandler [WARN] Messages are not being delivered fast 
> enough" messages in my log files. How can I discover which bolt throws this 
> warning, and how do I solve it?
> 
> Regards,
> Junfeng Chen





Re: Nimbus setting java.class.path to include javaagent

2016-10-07 Thread P. Taylor Goetz
3.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/clojure-1.7.0.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/disruptor-3.3.2.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/kryo-3.0.3.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/log4j-api-2.1.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/log4j-core-2.1.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/log4j-over-slf4j-1.6.6.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/log4j-slf4j-impl-2.1.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/minlog-1.3.0.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/objenesis-2.1.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/reflectasm-1.10.1.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/servlet-api-2.5.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/slf4j-api-1.7.7.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/storm-core-1.0.1.jar:/usr/local/Cellar/storm/1.0.1/libexec/lib/storm-rename-hack-1.0.1.jar
>  
> thanks for all of your help.
> brad
> 
> Subject: Re: Nimbus setting java.class.path to include javaagent
> From: ptgo...@gmail.com
> Date: Wed, 5 Oct 2016 15:33:47 -0400
> To: user@storm.apache.org
> 
> What you’re seeing is the ZooKeeper client running within Nimbus/Supervisor 
> just printing out its class path, which it inherits from Storm’s class path. 
> It’s not really sending that to ZooKeeper.
>  
> What’s weird is that the agent jar is sneaking onto the class path.
>  
> On your supervisor node, can you run `storm classpath` and see what it prints?
>  
> -Taylor
>  
>  
> On Oct 5, 2016, at 1:17 PM, Brad Rhodes  wrote:
>  
> Storm 0.9.6.  We are working on moving to 1.0.2 as we speak.
>  
> nimbus.childopts: "-Xmx1024m -DjmxRegistryPort=1098 -DjmxServerPort=1099 
> -javaagent:/opt/monitoring_agent/bobbrad.jar 
> -agentlib:jdwp=transport=dt_socket,server=y,address=8100,suspend=y 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.nimbus"
> 
> 
> supervisor.childopts: "-Xmx256m -DjmxRegistryPort=1098 -DjmxServerPort=1099 
> -javaagent:/opt/monitoring_agent/bobbrad2.jar 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.supervisor"
> 
> 
> worker.childopts: "-Xmx1024m 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.worker.%ID%"
>  
>  
> From: P. Taylor Goetz [mailto:ptgo...@gmail.com] 
> Sent: Tuesday, October 4, 2016 9:46 AM
> To: user@storm.apache.org
> Subject: Re: Nimbus setting java.class.path to include javaagent
>  
> Can you post your values for nimbus.childopts, supervisor.childopts, and 
> worker.childopts? Also what version of Storm are you on?
>  
> -Taylor
>  
> On Sep 28, 2016, at 6:06 PM, Brad Rhodes  wrote:
>  
> I have a situation where we have a javaagent jar file.  Someone has added a 
> number of classes to the corporate standard javaagent jar file.  Nimbus 
> appears to append -javaagent jar files to the -cp (classpath) for the 
> supervisor process.
>  
> When Nimbus starts up java is started with a command like: 
>  
> /usr/java/latest/bin/java -server -Dstorm.options= -Dstorm.home=/opt/storm 
> -Dstorm.log.dir=/opt/storm/logs 
> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= 
> -cp 
> /opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/clj-stacktrace-0.2.2.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/storm/lib/tools.macro-0.1.0.jar:/opt/storm/lib/commons-exec-1.1.jar:/opt/storm/lib/minlog-1.2.jar:/opt/storm/lib/joda-time-2.0.jar:/opt/storm/lib/ring-devel-0.3.11.jar:/opt/storm/lib/tools.cli-0.2.4.jar:/opt/storm/lib/logback-core-1.0.13.jar:/opt/storm/lib/tools.logging-0.2.3.jar:/opt/storm/lib/jline-2.11.jar:/opt/storm/lib/math.numeric-tower-0.0.1.jar:/opt/storm/lib/chill-java-0.3.5.jar:/opt/storm/lib/objenesis-1.2.jar:/opt/storm/lib/compojure-1.1.3.jar:/opt/storm/lib/clojure-1.5.1.jar:/opt/storm/lib/commons-codec-1.6.jar:/opt/storm/lib/commons-io-2.4.jar:/opt/storm/lib/clout-1.0.1.jar:/opt/storm/lib/log4j-over-slf4j-1.6.6.jar:/opt/storm/lib/commons-fileupload-1.2.1.jar:/opt/storm/lib/servlet-api-2.5.jar:/opt/storm/lib/reflectasm-1.07-shaded.jar:/opt/storm/lib/storm-core-0.9.6.jar:/opt/storm/lib/jgrapht-core-0.9.0.jar:/opt/storm/lib/ring-servlet-0.3.11.jar:/opt/storm/lib/core.incubator-0.1.0.jar:/opt/storm/lib/snakeyaml-1.11.jar:/opt/storm/lib/disruptor-2.10.4.jar:/opt/storm/lib/commons-lang-2.5.jar:/opt/storm/lib/carbonite-1.4.0.jar:/opt/storm/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/lib/asm-4.0.jar:/opt/storm/lib/jetty-6.1.26.jar:/opt/storm/lib/commons-logging-1.1.3.jar:/opt/storm/lib/json-simple-1.1.jar:/opt/storm/lib/slf4j-api-1.7.5.jar:/opt/storm/lib/logback-classic-1.0.13.jar:/opt/storm/lib/jetty-util-6.1.26.jar:/opt/s

Re: Storm Bug? Cannot connect to cluster

2016-10-06 Thread P. Taylor Goetz
I don't think it's a bug on your part. I tested it and found that if only 
nimbus.seeds is set, the `storm list` command will attempt to connect to 
localhost.

-Taylor

> On Oct 6, 2016, at 6:16 PM, Joaquin Menchaca  wrote:
> 
> It might be a bug on my part, trying a test, office Internet is slooow, so 
> have to step outside of office to do the test... 
> 
> - joaquin
> 
> PS - Not on the topic of writing bugs... how could I file bugs myself?  I 
> wanted to make a PR for some doc bugs, but don't quite know how to PR at a 
> particular tag...  
> 
>> On Thu, Oct 6, 2016 at 1:43 PM, P. Taylor Goetz  wrote:
>> This is a bug. The `storm list` command implementation doesn’t understand 
>> the “nimbus.seeds” configuration setting.
>> 
>> Can you try the following?
>> 
>> Add a nimbus.host entry in your storm.yaml file with ONE of your nimbus 
>> hostnames.
>> 
>> Let me know if that works, and I’ll file a JIRA to get this fixed.
>> 
>> -Taylor
>> 
>> 
>>> On Oct 6, 2016, at 3:31 PM, Joaquin Menchaca  wrote:
>>> 
>>> I don't get it, it seems to pick up the correct configuration... but then 
>>> ignores the nimbus.seeds and uses localhost.
>>> 
>>> # cat /templates/storm.yaml 
>>> storm.zookeeper.servers:
>>> - "ip-10-110-20-8.us-west-2.compute.internal"
>>> 
>>> nimbus.seeds: ["ip-10-110-20-7.us-west-2.compute.internal", 
>>> "ip-10-110-20-146.us-west-2.compute.internal"]
>>> 
>>> # storm --config /templates/storm.yaml list
>>> Running: /usr/lib/jvm/java-8-oracle/bin/java -client -Ddaemon.name= 
>>> -Dstorm.options= -Dstorm.home=/usr/share/storm 
>>> -Dstorm.log.dir=/usr/share/storm/logs 
>>> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib 
>>> -Dstorm.conf.file=/templates/storm.yaml -cp 
>>> /usr/share/storm/lib/asm-4.0.jar:/usr/share/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/share/storm/lib/log4j-core-2.1.jar:/usr/share/storm/lib/kryo-2.21.jar:/usr/share/storm/lib/servlet-api-2.5.jar:/usr/share/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/share/storm/lib/slf4j-api-1.7.7.jar:/usr/share/storm/lib/clojure-1.6.0.jar:/usr/share/storm/lib/storm-core-0.10.0.jar:/usr/share/storm/lib/reflectasm-1.07-shaded.jar:/usr/share/storm/lib/hadoop-auth-2.4.0.jar:/usr/share/storm/lib/log4j-api-2.1.jar:/usr/share/storm/lib/minlog-1.2.jar:/usr/share/storm/lib/disruptor-2.10.4.jar:/templates:/usr/share/storm/bin
>>>  backtype.storm.command.list
>>> ...
>>> 2264 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as 
>>> user: 
>>> ...
>>> Caused by: java.net.ConnectException: Connection refused
>>> ...
>>> 
>>>> On Thu, Oct 6, 2016 at 12:15 PM, P. Taylor Goetz  wrote:
>>>> For some reason it is trying to connect to localhost:
>>>> 
>>>> 2293 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as 
>>>> user:
>>>> 
>>>> Do you have a ~/.storm/storm.yaml file that might be getting picked up?
>>>> 
>>>> What are the contents of the storm.yaml you are using in your commands?
>>>> 
>>>> -Taylor
>>>> 
>>>> 
>>>>> On Oct 6, 2016, at 2:45 PM, Joaquin Menchaca  
>>>>> wrote:
>>>>> 
>>>>> I have only the nimbus.seeds configured, do I need anything else 
>>>>> configured to get this to work?
>>>>> 
>>>>> $ storm --config /templates/storm.yaml localconfvalue nimbus.seeds
>>>>> nimbus.seeds: [ip-10-110-20-7.us-west-2.compute.internal 
>>>>> ip-10-110-20-146.us-west-2.compute.internal]
>>>>> 
>>>>> $ nc -vz ip-10-110-20-7.us-west-2.compute.internal 6627   
>>>>> Connection to ip-10-110-20-7.us-west-2.compute.internal 6627 port [tcp/*] 
>>>>> succeeded!
>>>>> $ nc -vz ip-10-110-20-146.us-west-2.compute.internal 6627
>>>>> Connection to ip-10-110-20-146.us-west-2.compute.internal 6627 port 
>>>>> [tcp/*] succeeded!
>>>>> 
>>>>> $ storm --config /templates/storm.yaml list   
>>>>> Running: /usr/lib/jvm/java-8-oracle/bin/java -client -Ddaemon.name= 
>>>>> -Dstorm.options= -Dstorm.home=/usr/share/storm 
>>>>> -Dstorm.log.dir=/usr/share/storm/logs 
>>>>> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib 
>>>>> -Dstorm.conf.file=/templates/storm.yaml -cp 
>>

Re: Storm 1.0.2 - when does Storm schedule additional workers?

2016-10-06 Thread P. Taylor Goetz
Hi Dominik,

Storm will not (currently) allocate additional workers to a topology unless you 
tell it to using the rebalance command. It will start out with the number of 
workers you specify with either Config.setNumWorkers() or the topology.workers 
configuration key.

The supervisor.slots.ports configuration controls how many worker slots are 
available on a given supervisor node. The default is 4 slots, so with the 
defaults and 2 supervisor nodes you will have a total of 8 slots. All 
of those slots will be unused (no workers running) until you submit a topology. 
If you submit a topology with topology.workers: 4, Storm will allocate 4 workers 
to the topology, and you will be left with 4 remaining open slots.
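
The slot arithmetic above can be written out as a small plain-Java sketch. It only reproduces the counting; it does not read a storm.yaml or talk to a cluster, and the class and method names are hypothetical.

```java
/** Back-of-the-envelope slot accounting; not Storm scheduler code. */
final class SlotSketch {

    /** Total worker slots across all supervisor nodes. */
    static int totalSlots(int supervisorNodes, int slotsPerSupervisor) {
        return supervisorNodes * slotsPerSupervisor;
    }

    /** Slots left after each topology takes its requested workers. */
    static int openSlots(int totalSlots, int... workersPerTopology) {
        int used = 0;
        for (int workers : workersPerTopology) {
            used += workers;
        }
        // The count of open slots cannot go below zero.
        return Math.max(0, totalSlots - used);
    }

    public static void main(String[] args) {
        int total = totalSlots(2, 4);              // 2 nodes, 4 slots each
        System.out.println(total);                 // 8
        System.out.println(openSlots(total));      // 8, nothing submitted yet
        System.out.println(openSlots(total, 4));   // 4, one topology with 4 workers
    }
}
```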

Hope this helps.

-Taylor


> On Oct 6, 2016, at 4:31 PM, Dominik Safaric  wrote:
> 
> Hi everyone,
> 
> I’ve been curious about the following - under what conditions does Storm 
> schedule additional workers for a running topology, and how does that relate 
> to the supervisor.slots.ports configuration value(s)?
> 
> Thanks in advance,
> Dominik





Re: Storm Bug? Cannot connect to cluster

2016-10-06 Thread P. Taylor Goetz
This is a bug. The `storm list` command implementation doesn’t understand the 
“nimbus.seeds” configuration setting.

Can you try the following?

Add a nimbus.host entry in your storm.yaml file with ONE of your nimbus 
hostnames.

Let me know if that works, and I’ll file a JIRA to get this fixed.

-Taylor


> On Oct 6, 2016, at 3:31 PM, Joaquin Menchaca  wrote:
> 
> I don't get it, it seems to pick up the correct configuration... but then 
> ignores the nimbus.seeds and uses localhost.
> 
> # cat /templates/storm.yaml
> storm.zookeeper.servers:
> - "ip-10-110-20-8.us-west-2.compute.internal"
> 
> nimbus.seeds: ["ip-10-110-20-7.us-west-2.compute.internal", 
> "ip-10-110-20-146.us-west-2.compute.internal"]
> 
> # storm --config /templates/storm.yaml list
> Running: /usr/lib/jvm/java-8-oracle/bin/java -client -Ddaemon.name= 
> -Dstorm.options= -Dstorm.home=/usr/share/storm 
> -Dstorm.log.dir=/usr/share/storm/logs 
> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib 
> -Dstorm.conf.file=/templates/storm.yaml -cp 
> /usr/share/storm/lib/asm-4.0.jar:/usr/share/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/share/storm/lib/log4j-core-2.1.jar:/usr/share/storm/lib/kryo-2.21.jar:/usr/share/storm/lib/servlet-api-2.5.jar:/usr/share/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/share/storm/lib/slf4j-api-1.7.7.jar:/usr/share/storm/lib/clojure-1.6.0.jar:/usr/share/storm/lib/storm-core-0.10.0.jar:/usr/share/storm/lib/reflectasm-1.07-shaded.jar:/usr/share/storm/lib/hadoop-auth-2.4.0.jar:/usr/share/storm/lib/log4j-api-2.1.jar:/usr/share/storm/lib/minlog-1.2.jar:/usr/share/storm/lib/disruptor-2.10.4.jar:/templates:/usr/share/storm/bin
>  backtype.storm.command.list
> ...
> 2264 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as user:
> ...
> Caused by: java.net.ConnectException: Connection refused
> ...
> 
> On Thu, Oct 6, 2016 at 12:15 PM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> For some reason it is trying to connect to localhost:
> 
> 2293 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as user:
> 
> Do you have a ~/.storm/storm.yaml file that might be getting picked up?
> 
> What are the contents of the storm.yaml you are using in your commands?
> 
> -Taylor
> 
> 
>> On Oct 6, 2016, at 2:45 PM, Joaquin Menchaca <jmench...@gobalto.com> wrote:
>> 
>> I have only the nimbus.seeds configured, do I need anything else configured 
>> to get this to work?
>> 
>> $ storm --config /templates/storm.yaml localconfvalue nimbus.seeds
>> nimbus.seeds: [ip-10-110-20-7.us-west-2.compute.internal 
>> ip-10-110-20-146.us-west-2.compute.internal]
>> 
>> $ nc -vz ip-10-110-20-7.us-west-2.compute.internal 6627
>> Connection to ip-10-110-20-7.us-west-2.compute.internal 6627 port [tcp/*] 
>> succeeded!
>> $ nc -vz ip-10-110-20-146.us-west-2.compute.internal 6627
>> Connection to ip-10-110-20-146.us-west-2.compute.internal 6627 port [tcp/*] 
>> succeeded!
>> 
>> $ storm --config /templates/storm.yaml list
>> Running: /usr/lib/jvm/java-8-oracle/bin/java -client -Ddaemon.name= 
>> -Dstorm.options= -Dstorm.home=/usr/share/storm 
>> -Dstorm.log.dir=/usr/share/storm/logs 
>> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib 
>> -Dstorm.conf.file=/templates/storm.yaml -cp 
>> /usr/share/storm/lib/asm-4.0.jar:/usr/share/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/share/storm/lib/log4j-core-2.1.jar:/usr/share/storm/lib/kryo-2.21.jar:/usr/share/storm/lib/servlet-api-2.5.jar:/usr/share/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/share/storm/lib/slf4j-api-1.7.7.jar:/usr/share/storm/lib/clojure-1.6.0.jar:/usr/share/storm/lib/storm-core-0.10.0.jar:/usr/share/storm/lib/reflectasm-1.07-shaded.jar:/usr/share/storm/lib/hadoop-auth-2.4.0.jar:/usr/share/storm/lib/log4j-api-2.1.jar:/usr/share/storm/lib/minlog-1.2.jar:/usr/share/storm/lib/disruptor-2.10.4.jar:/templates:/usr/share/storm/bin
>>  backtype.storm.command.list
>> 1283 [main] INFO  b.s.u.Utils - Using defaults.yaml from resources
>> 2263 [main] INFO  b.s.u.Utils - Using defaults.yaml from resources
>> 2293 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as 
>> user:
>> 2293 [main] INFO  b.s.u.Utils - Using defaults.yaml from resources
>> 2348 [main] INFO  b.s.u.StormBoundedExponentialBackoffRetry - The 
>> baseSleepTimeMs [2000] the maxSleepTimeMs [6] the maxRetries [5]
>> Exception in thread "

Re: Storm Bug? Cannot connect to cluster

2016-10-06 Thread P. Taylor Goetz
For some reason it is trying to connect to localhost:

2293 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as user:

Do you have a ~/.storm/storm.yaml file that might be getting picked up?

What are the contents of the storm.yaml you are using in your commands?

-Taylor


> On Oct 6, 2016, at 2:45 PM, Joaquin Menchaca  wrote:
> 
> I have only the nimbus.seeds configured, do I need anything else configured 
> to get this to work?
> 
> $ storm --config /templates/storm.yaml localconfvalue nimbus.seeds
> nimbus.seeds: [ip-10-110-20-7.us-west-2.compute.internal 
> ip-10-110-20-146.us-west-2.compute.internal]
> 
> $ nc -vz ip-10-110-20-7.us-west-2.compute.internal 6627
> Connection to ip-10-110-20-7.us-west-2.compute.internal 6627 port [tcp/*] 
> succeeded!
> $ nc -vz ip-10-110-20-146.us-west-2.compute.internal 6627
> Connection to ip-10-110-20-146.us-west-2.compute.internal 6627 port [tcp/*] 
> succeeded!
> 
> $ storm --config /templates/storm.yaml list
> Running: /usr/lib/jvm/java-8-oracle/bin/java -client -Ddaemon.name= 
> -Dstorm.options= -Dstorm.home=/usr/share/storm 
> -Dstorm.log.dir=/usr/share/storm/logs 
> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib 
> -Dstorm.conf.file=/templates/storm.yaml -cp 
> /usr/share/storm/lib/asm-4.0.jar:/usr/share/storm/lib/log4j-slf4j-impl-2.1.jar:/usr/share/storm/lib/log4j-core-2.1.jar:/usr/share/storm/lib/kryo-2.21.jar:/usr/share/storm/lib/servlet-api-2.5.jar:/usr/share/storm/lib/log4j-over-slf4j-1.6.6.jar:/usr/share/storm/lib/slf4j-api-1.7.7.jar:/usr/share/storm/lib/clojure-1.6.0.jar:/usr/share/storm/lib/storm-core-0.10.0.jar:/usr/share/storm/lib/reflectasm-1.07-shaded.jar:/usr/share/storm/lib/hadoop-auth-2.4.0.jar:/usr/share/storm/lib/log4j-api-2.1.jar:/usr/share/storm/lib/minlog-1.2.jar:/usr/share/storm/lib/disruptor-2.10.4.jar:/templates:/usr/share/storm/bin
>  backtype.storm.command.list
> 1283 [main] INFO  b.s.u.Utils - Using defaults.yaml from resources
> 2263 [main] INFO  b.s.u.Utils - Using defaults.yaml from resources
> 2293 [main] INFO  b.s.thrift - Connecting to Nimbus at localhost:6627 as user:
> 2293 [main] INFO  b.s.u.Utils - Using defaults.yaml from resources
> 2348 [main] INFO  b.s.u.StormBoundedExponentialBackoffRetry - The 
> baseSleepTimeMs [2000] the maxSleepTimeMs [6] the maxRetries [5]
> Exception in thread "main" java.lang.RuntimeException: 
> org.apache.thrift7.transport.TTransportException: java.net.ConnectException: 
> Connection refused
> at 
> backtype.storm.security.auth.TBackoffConnect.retryNext(TBackoffConnect.java:59)
> at 
> backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:51)
> at 
> backtype.storm.security.auth.ThriftClient.reconnect(ThriftClient.java:103)
> at backtype.storm.security.auth.ThriftClient.<init>(ThriftClient.java:72)
> at backtype.storm.utils.NimbusClient.<init>(NimbusClient.java:69)
> at backtype.storm.thrift$nimbus_client_and_conn.invoke(thrift.clj:75)
> at backtype.storm.thrift$nimbus_client_and_conn.invoke(thrift.clj:72)
> at backtype.storm.command.list$_main.invoke(list.clj:22)
> at clojure.lang.AFn.applyToHelper(AFn.java:152)
> at clojure.lang.AFn.applyTo(AFn.java:144)
> at backtype.storm.command.list.main(Unknown Source)
> Caused by: org.apache.thrift7.transport.TTransportException: 
> java.net.ConnectException: Connection refused
> at org.apache.thrift7.transport.TSocket.open(TSocket.java:187)
> at 
> org.apache.thrift7.transport.TFramedTransport.open(TFramedTransport.java:81)
> at 
> backtype.storm.security.auth.SimpleTransportPlugin.connect(SimpleTransportPlugin.java:103)
> at 
> backtype.storm.security.auth.TBackoffConnect.doConnectWithRetry(TBackoffConnect.java:48)
> ... 9 more
> Caused by: java.net.ConnectException: Connection refused
> at java.net.PlainSocketImpl.socketConnect(Native Method)
> at 
> java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
> at 
> java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
> at 
> java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
> at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
> at java.net.Socket.connect(Socket.java:589)
> at org.apache.thrift7.transport.TSocket.open(TSocket.java:182)
> ... 12 more
> 





Re: Nimbus setting java.class.path to include javaagent

2016-10-05 Thread P. Taylor Goetz
What you’re seeing is the ZooKeeper client running within Nimbus/Supervisor just 
printing out its class path, which it inherits from Storm’s class path. It’s 
not really sending that to ZooKeeper.

What’s weird is that the agent jar is sneaking onto the class path.

On your supervisor node, can you run `storm classpath` and see what it prints?

-Taylor


> On Oct 5, 2016, at 1:17 PM, Brad Rhodes  wrote:
> 
> Storm 0.9.6.  We are working on moving to 1.0.2 as we speak.
> 
> nimbus.childopts: "-Xmx1024m -DjmxRegistryPort=1098 -DjmxServerPort=1099 
> -javaagent:/opt/monitoring_agent/bobbrad.jar 
> -agentlib:jdwp=transport=dt_socket,server=y,address=8100,suspend=y 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.nimbus"
> 
> 
> supervisor.childopts: "-Xmx256m -DjmxRegistryPort=1098 -DjmxServerPort=1099 
> -javaagent:/opt/monitoring_agent/bobbrad2.jar 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.supervisor"
> 
> 
> worker.childopts: "-Xmx1024m 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.worker.%ID%"
> 
> 
> From: P. Taylor Goetz [mailto:ptgo...@gmail.com]
> Sent: Tuesday, October 4, 2016 9:46 AM
> To: user@storm.apache.org
> Subject: Re: Nimbus setting java.class.path to include javaagent
> 
> Can you post your values for nimbus.childopts, supervisor.childopts, and 
> worker.childopts? Also what version of Storm are you on?
> 
> -Taylor
> 
>> On Sep 28, 2016, at 6:06 PM, Brad Rhodes <bradr...@hotmail.com> wrote:
>> 
>> I have a situation where we have a javaagent jar file.  Someone has added a 
>> number of classes to the corporate standard javaagent jar file.  Nimbus 
>> appears to append -javaagent jar files to the -cp (classpath) for the 
>> supervisor process.
>> 
>> When Nimbus starts up java is started with a command like:
>> 
>> /usr/java/latest/bin/java -server -Dstorm.options= -Dstorm.home=/opt/storm 
>> -Dstorm.log.dir=/opt/storm/logs 
>> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib 
>> -Dstorm.conf.file= -cp 
>> /opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/clj-stacktrace-0.2.2.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/storm/lib/tools.macro-0.1.0.jar:/opt/storm/lib/commons-exec-1.1.jar:/opt/storm/lib/minlog-1.2.jar:/opt/storm/lib/joda-time-2.0.jar:/opt/storm/lib/ring-devel-0.3.11.jar:/opt/storm/lib/tools.cli-0.2.4.jar:/opt/storm/lib/logback-core-1.0.13.jar:/opt/storm/lib/tools.logging-0.2.3.jar:/opt/storm/lib/jline-2.11.jar:/opt/storm/lib/math.numeric-tower-0.0.1.jar:/opt/storm/lib/chill-java-0.3.5.jar:/opt/storm/lib/objenesis-1.2.jar:/opt/storm/lib/compojure-1.1.3.jar:/opt/storm/lib/clojure-1.5.1.jar:/opt/storm/lib/commons-codec-1.6.jar:/opt/storm/lib/commons-io-2.4.jar:/opt/storm/lib/clout-1.0.1.jar:/opt/storm/lib/log4j-over-slf4j-1.6.6.jar:/opt/storm/lib/commons-fileupload-1.2.1.jar:/opt/storm/lib/servlet-api-2.5.jar:/opt/storm/lib/reflectasm-1.07-shaded.jar:/opt/storm/lib/storm-core-0.9.6.jar:/opt/storm/lib/jgrapht-core-0.9.0.jar:/opt/storm/lib/ring-servlet-0.3.11.jar:/opt/storm/lib/core.incubator-0.1.0.jar:/opt/storm/lib/snakeyaml-1.11.jar:/opt/storm/lib/disruptor-2.10.4.jar:/opt/storm/lib/commons-lang-2.5.jar:/opt/storm/lib/carbonite-1.4.0.jar:/opt/storm/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/lib/asm-4.0.jar:/opt/storm/lib/jetty-6.1.26.jar:/opt/storm/lib/commons-logging-1.1.3.jar:/opt/storm/lib/json-simple-1.1.jar:/opt/storm/lib/slf4j-api-1.7.5.jar:/opt/storm/lib/logback-classic-1.0.13.jar:/opt/storm/lib/jetty-util-6.1.26.jar:/opt/storm/lib/kryo-2.21.jar:/opt/storm/lib/ring-core-1.1.5.jar:/opt/storm/conf
>>  -Xmx1024m -DjmxRegistryPort=1098 -DjmxServerPort=1099 
>> -javaagent:/opt/monitoring_agent/bobbrad.jar 
>> -agentlib:jdwp=transport=dt_socket,server=y,address=8100,suspend=y 
>> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
>> -Dapplication.name=storm.test.lodgingamenity.nimbus 
>> -Dlogfile.name=nimbus.log 
>> -Dlogback.configurationFile=/opt/storm/logback/cluster.xml 
>> backtype.storm.daemon.nimbus
>> 
>> Nimbus at some point sends the following to Zookeeper:
>> 016-09-28T18:50:45.627+ o.a.s.z.ZooKeeper [INFO] Client 
>> environment:java.class.path=/opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/clj-stacktrace-0.2.2.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/stor

Re: Question on Storm Spout time

2016-10-04 Thread P. Taylor Goetz
With a spout parallelism of 2, and topology.max.spout.pending of 5, you will 
have a total of 10 tuples in flight. Your mergeBolt is taking almost 30 seconds 
to process each tuple. While that bolt is processing a tuple, any additional 
tuples routed to an instance of that bolt have to wait in an internal buffer. 
That wait time will contribute to the overall complete latency.
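
A rough plain-Java sketch of that reasoning follows. The 2 x 5 = 10 in-flight figure comes from the settings above; the queueing estimate additionally assumes, hypothetically, that all pending tuples wait on a single mergeBolt executor that handles them one at a time (the thread does not state the bolt's parallelism), and it ignores transfer and acking overhead. The class and method names are made up for illustration.

```java
/** Rough queueing estimate; not a model of Storm's internals. */
final class CompleteLatencySketch {

    /** Upper bound on un-acked tuples across all spout executors. */
    static int maxTuplesInFlight(int spoutParallelism, int maxSpoutPending) {
        return spoutParallelism * maxSpoutPending;
    }

    /**
     * Seconds until the tuple at the given 1-based queue position is done,
     * assuming one executor processes tuples strictly one after another.
     */
    static double secondsUntilDone(int queuePosition, double processSeconds) {
        return queuePosition * processSeconds;
    }

    public static void main(String[] args) {
        int inFlight = maxTuplesInFlight(2, 5);
        System.out.println("tuples in flight: " + inFlight);
        // Each queued tuple waits for every tuple ahead of it.
        for (int position = 1; position <= inFlight; position++) {
            System.out.println("position " + position + ": "
                    + secondsUntilDone(position, 30.0) + " s");
        }
    }
}
```

Under those assumptions the last of ten queued tuples completes roughly ten times later than the bolt's own process latency would suggest, which is the kind of gap between process latency and complete latency described here.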

-Taylor


> On Sep 29, 2016, at 12:47 PM, Suma Cherukuri  wrote:
> 
> Hi,
> I am using Storm to do file concatenations on S3. The question is regarding 
> the complete latency of the spout. There is a huge difference between the 
> bolt process latencies and the spout complete latencies. Can anyone please 
> help me understand what is causing this behavior in Storm?
> 
> Below are the storm configurations:
> 
> topology.executor.receive.buffer.size: 128
> topology.executor.send.buffer.size: 128
> topology.receiver.buffer.size: 8
> topology.transfer.buffer.size: 32
> topology.max.spout.pending: 5
> topology.message.timeout.secs: 600
> topology.spout.wait.strategy: "backtype.storm.spout.SleepSpoutWaitStrategy"
> 
> Please find the attached screenshot for the latencies.
> 
> Thanks
> Suma Cherukuri
> 
> 





Re: Nimbus setting java.class.path to include javaagent

2016-10-04 Thread P. Taylor Goetz
Can you post your values for nimbus.childopts, supervisor.childopts, and 
worker.childopts? Also what version of Storm are you on?

-Taylor

> On Sep 28, 2016, at 6:06 PM, Brad Rhodes  wrote:
> 
> I have a situation where we have a javaagent jar file.  Someone has added a 
> number of classes to the corporate standard javaagent jar file.  Nimbus 
> appears to append -javaagent jar files to the -cp (classpath) for the 
> supervisor process.
> 
> When Nimbus starts up java is started with a command like:
> 
> /usr/java/latest/bin/java -server -Dstorm.options= -Dstorm.home=/opt/storm 
> -Dstorm.log.dir=/opt/storm/logs 
> -Djava.library.path=/usr/local/lib:/opt/local/lib:/usr/lib -Dstorm.conf.file= 
> -cp 
> /opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/clj-stacktrace-0.2.2.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/storm/lib/tools.macro-0.1.0.jar:/opt/storm/lib/commons-exec-1.1.jar:/opt/storm/lib/minlog-1.2.jar:/opt/storm/lib/joda-time-2.0.jar:/opt/storm/lib/ring-devel-0.3.11.jar:/opt/storm/lib/tools.cli-0.2.4.jar:/opt/storm/lib/logback-core-1.0.13.jar:/opt/storm/lib/tools.logging-0.2.3.jar:/opt/storm/lib/jline-2.11.jar:/opt/storm/lib/math.numeric-tower-0.0.1.jar:/opt/storm/lib/chill-java-0.3.5.jar:/opt/storm/lib/objenesis-1.2.jar:/opt/storm/lib/compojure-1.1.3.jar:/opt/storm/lib/clojure-1.5.1.jar:/opt/storm/lib/commons-codec-1.6.jar:/opt/storm/lib/commons-io-2.4.jar:/opt/storm/lib/clout-1.0.1.jar:/opt/storm/lib/log4j-over-slf4j-1.6.6.jar:/opt/storm/lib/commons-fileupload-1.2.1.jar:/opt/storm/lib/servlet-api-2.5.jar:/opt/storm/lib/reflectasm-1.07-shaded.jar:/opt/storm/lib/storm-core-0.9.6.jar:/opt/storm/lib/jgrapht-core-0.9.0.jar:/opt/storm/lib/ring-servlet-0.3.11.jar:/opt/storm/lib/core.incubator-0.1.0.jar:/opt/storm/lib/snakeyaml-1.11.jar:/opt/storm/lib/disruptor-2.10.4.jar:/opt/storm/lib/commons-lang-2.5.jar:/opt/storm/lib/carbonite-1.4.0.jar:/opt/storm/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/lib/asm-4.0.jar:/opt/storm/lib/jetty-6.1.26.jar:/opt/storm/lib/commons-logging-1.1.3.jar:/opt/storm/lib/json-simple-1.1.jar:/opt/storm/lib/slf4j-api-1.7.5.jar:/opt/storm/lib/logback-classic-1.0.13.jar:/opt/storm/lib/jetty-util-6.1.26.jar:/opt/storm/lib/kryo-2.21.jar:/opt/storm/lib/ring-core-1.1.5.jar:/opt/storm/conf
>  -Xmx1024m -DjmxRegistryPort=1098 -DjmxServerPort=1099 
> -javaagent:/opt/monitoring_agent/bobbrad.jar 
> -agentlib:jdwp=transport=dt_socket,server=y,address=8100,suspend=y 
> -Dmonitoring.agent.properties=/opt/monitoring_agent/monitoring.properties 
> -Dapplication.name=storm.test.lodgingamenity.nimbus -Dlogfile.name=nimbus.log 
> -Dlogback.configurationFile=/opt/storm/logback/cluster.xml 
> backtype.storm.daemon.nimbus
> 
> Nimbus at some point sends the following to Zookeeper:
> 016-09-28T18:50:45.627+ o.a.s.z.ZooKeeper [INFO] Client 
> environment:java.class.path=/opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/clj-stacktrace-0.2.2.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/storm/lib/tools.macro-0.1.0.jar:/opt/storm/lib/commons-exec-1.1.jar:/opt/storm/lib/minlog-1.2.jar:/opt/storm/lib/joda-time-2.0.jar:/opt/storm/lib/ring-devel-0.3.11.jar:/opt/storm/lib/tools.cli-0.2.4.jar:/opt/storm/lib/logback-core-1.0.13.jar:/opt/storm/lib/tools.logging-0.2.3.jar:/opt/storm/lib/jline-2.11.jar:/opt/storm/lib/math.numeric-tower-0.0.1.jar:/opt/storm/lib/chill-java-0.3.5.jar:/opt/storm/lib/objenesis-1.2.jar:/opt/storm/lib/compojure-1.1.3.jar:/opt/storm/lib/clojure-1.5.1.jar:/opt/storm/lib/commons-codec-1.6.jar:/opt/storm/lib/commons-io-2.4.jar:/opt/storm/lib/clout-1.0.1.jar:/opt/storm/lib/log4j-over-slf4j-1.6.6.jar:/opt/storm/lib/commons-fileupload-1.2.1.jar:/opt/storm/lib/servlet-api-2.5.jar:/opt/storm/lib/reflectasm-1.07-shaded.jar:/opt/storm/lib/storm-core-0.9.6.jar:/opt/storm/lib/jgrapht-core-0.9.0.jar:/opt/storm/lib/ring-servlet-0.3.11.jar:/opt/storm/lib/core.incubator-0.1.0.jar:/opt/storm/lib/snakeyaml-1.11.jar:/opt/storm/lib/disruptor-2.10.4.jar:/opt/storm/lib/commons-lang-2.5.jar:/opt/storm/lib/carbonite-1.4.0.jar:/opt/storm/lib/ring-jetty-adapter-0.3.11.jar:/opt/storm/lib/asm-4.0.jar:/opt/storm/lib/jetty-6.1.26.jar:/opt/storm/lib/commons-logging-1.1.3.jar:/opt/storm/lib/json-simple-1.1.jar:/opt/storm/lib/slf4j-api-1.7.5.jar:/opt/storm/lib/logback-classic-1.0.13.jar:/opt/storm/lib/jetty-util-6.1.26.jar:/opt/storm/lib/kryo-2.21.jar:/opt/storm/lib/ring-core-1.1.5.jar:/opt/storm/conf:/opt/monitoring_agent/bobbrad.jar
> 
> and
> Server 
> environment:java.class.path=/opt/storm/lib/hiccup-0.3.6.jar:/opt/storm/lib/clj-stacktrace-0.2.2.jar:/opt/storm/lib/clj-time-0.4.1.jar:/opt/storm/lib/tools.macro-0.1.0.jar:/opt/storm/lib/commons-exec-1.1.jar:/opt/storm/lib/minlog-1.2.jar:/opt/storm/lib/joda-time-2.0.jar:/opt/storm/lib/ring-devel-0.3.11.jar:/opt/storm/lib/tools.cli-0.2.4.jar:/opt/storm/lib/logback-core-1.0.13.jar:/opt/storm/lib/tools.logging-0.2.3.jar:/opt/storm/lib/jline-2.11.jar:/opt/storm/lib/math.numeric-tower-0.0.1.jar:/opt/storm/lib/chill-java-0.3.5.jar:/opt/storm/lib/objenesis-1.2.jar:/opt/storm/lib/c

Re: What determines the topology.acker.executors parameter value?

2016-10-04 Thread P. Taylor Goetz
Hi Dominik,

For the case you describe, I don’t think you’d need to increase the number of 
ackers. But you may want to experiment with increasing it to see if it makes a 
difference in performance.

-Taylor

> On Oct 3, 2016, at 5:09 PM, Dominik Safaric  wrote:
> 
> Hi Everyone,
> 
> I’ve been curious about the following: what determines the value of the 
> topology.acker.executors parameter of the Storm configuration?
> 
> By default, it is equal to the number of workers. However, with a single 
> worker that runs, for example, a spout with increased parallelism, should 
> the number of acker executors be increased as well?
> 
> What is your experience with it?
> 
> Dominik



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: StormMqtt, MqttPublishFunction does not work on SSL

2016-10-04 Thread P. Taylor Goetz
Can you provide some more information about how you configure SSL and the 
contents of your keystore and truststore files?

Off the top of my head, it sounds like a certificate issue.

-Taylor

> On Sep 30, 2016, at 1:27 AM, Ye Minkg  wrote:
> 
> I was trying to use StormMqtt with an SSL connection.
> 
> But the connection failed with a
> "Connection refused: no further information" error.
> 
> It only happens when I use both a 'client keystore' and a 'CA truststore'.
> 
> The connection succeeds when I use only the 'CA truststore'.
> 
> Has anyone faced this problem?
> Can anyone help me?
> 





[ANNOUNCE] Apache Storm 0.10.2 Released

2016-09-14 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 0.10.2.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2016/09/14/storm0102-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 0.10.2

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v0.10.2/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




Re: How will storm replay the tuple tree?

2016-09-13 Thread P. Taylor Goetz
Hi Cheney,

Replays happen at the spout level. So if there is a failure at any point in the 
tuple tree (the tuple tree being the anchored emits, unanchored emits don’t 
count), the original spout tuple will be replayed. So the replayed tuple will 
traverse the topology again, including unanchored points.

If an unanchored tuple fails downstream, it will not trigger a replay.
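
The two scenarios from the question can be pictured with a toy model in plain Java. This is not Storm code: the class and method names below are made up for illustration, and the model captures only the rule described above, namely that a failure replays the spout tuple if and only if the failed tuple was reached through anchored emits.

```java
import java.util.HashSet;
import java.util.Set;

/** Toy model (hypothetical, not Storm code) of which failures trigger a spout replay. */
class TupleTreeSketch {
    // Ids of tuples that belong to the tuple tree of one spout tuple.
    private final Set<String> tracked = new HashSet<>();
    private int replays = 0;

    /** The spout tuple is the root of the tree. */
    TupleTreeSketch(String spoutTupleId) {
        tracked.add(spoutTupleId);
    }

    /** A bolt emits a child tuple; it joins the tree only if anchored to a tracked parent. */
    void emit(String parentId, String childId, boolean anchored) {
        if (anchored && tracked.contains(parentId)) {
            tracked.add(childId);
        }
    }

    /** Fails a tuple; returns true if the spout tuple would be replayed. */
    boolean fail(String tupleId) {
        if (!tracked.contains(tupleId)) {
            return false; // unanchored tuple: the failure is not reported to the spout
        }
        replays++;
        return true;
    }

    int replays() {
        return replays;
    }
}
```

With an anchored emit from BoltA, a failure in BoltB reports back to the spout, and the replayed spout tuple passes through both BoltA and BoltB again. With an unanchored emit from BoltA, the same failure goes unnoticed.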

Hope this helps.

-Taylor


> On Sep 13, 2016, at 4:42 AM, Cheney Chen  wrote:
> 
> Hi there,
> 
> We're using storm 1.0.1, and I'm checking through 
> http://storm.apache.org/releases/1.0.1/Guaranteeing-message-processing.html 
> 
> 
> Got questions for below two scenarios.
> Assume topology: S (spout) --> BoltA --> BoltB
> 1. S: anchored emit, BoltA: anchored emit
> Suppose BoltB processing failed w/ ack, what will the replay be, will it 
> execute both BoltA and BoltB or only failed BoltB processing?
> 
> 2. S: anchored emit, BoltA: unanchored emit
> Suppose BoltB processing failed w/ ack, replay will not happen, correct?
> 
> --
> Regards,
> Qili Chen (Cheney)
> 
> E-mail: tbcql1...@gmail.com 
> MP: (+1) 4086217503





Re: What's the default value of topology.max.spout.pending

2016-09-12 Thread P. Taylor Goetz
The default value of null means Storm will not limit the number of pending 
tuples.
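
As a rough sketch of what that means, the check below models the decision of whether a spout task may be asked for another tuple. The class and method names are hypothetical and this is not Storm's actual implementation; it assumes the limit counts tuples that were emitted with a message id and have not yet been acked or failed.

```java
/** Hypothetical model of the topology.max.spout.pending check, not Storm's own code. */
class SpoutPendingSketch {
    /**
     * Returns true if the spout may be asked for another tuple.
     * A null maxPending models the default "topology.max.spout.pending: null".
     */
    static boolean mayCallNextTuple(Integer maxPending, int pending) {
        return maxPending == null || pending < maxPending;
    }
}
```

With the null default the check always passes, so only the spout's own pacing limits how many tuples are in flight.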

-Taylor

> On Sep 12, 2016, at 7:23 AM, Cheney Chen  wrote:
> 
> Hi there,
> 
> We're using storm 1.0.1
> I'd like to know what's the default value of topology.max.spout.pending, 
> which is "topology.max.spout.pending: null" in defaults.yaml.
> 
> --
> Regards,
> Qili Chen (Cheney)





Re: Netty Configurations in 1.0.2 still valid?

2016-09-08 Thread P. Taylor Goetz
I would recommend removing that from your configuration and allowing the defaults 
to take effect. The defaults are sane, and should only be overridden when 
necessary.

Here are the current defaults as of 1.0.2:

storm.messaging.netty.server_worker_threads: 1
storm.messaging.netty.client_worker_threads: 1
storm.messaging.netty.buffer_size: 5242880 #5MB buffer
storm.messaging.netty.max_retries: 300
storm.messaging.netty.max_wait_ms: 1000
storm.messaging.netty.min_wait_ms: 100
storm.messaging.netty.transfer.batch.size: 262144
storm.messaging.netty.socket.backlog: 500

-Taylor


> On Sep 8, 2016, at 2:25 PM, Joaquin Menchaca  wrote:
> 
> I forgot where I got this from, but in Storm 0.10.0, I used to do this:
> 
> # netty transport
> storm.messaging.transport: "backtype.storm.messaging.netty.Context"
> storm.messaging.netty.buffer_size: 16384
> storm.messaging.netty.max_retries: 10
> storm.messaging.netty.min_wait_ms: 1000
> storm.messaging.netty.max_wait_ms: 5000
> 
> Are these valid in Storm 1.0.2?
> 
> --
> 
> Thus a victorious army wins first and then seeks battle; a defeated army fights first and then seeks victory.





Re: doc confusion - nimbus.seed config

2016-09-07 Thread P. Taylor Goetz
Yes.

> On Sep 7, 2016, at 6:33 PM, Joaquin Menchaca  wrote:
> 
> The docs say:
> 
> nimbus.seeds: ["111.222.333.44"]
> 
> Does this mean we can:
> nimbus.seeds: ["192.168.51.5","192.168.51.6"]
> 
> Where each seed refers to a nimbus server?
> 
> -- 
> 
> Thus a victorious army wins first and then seeks battle; a defeated army fights first and then seeks victory.


[ANNOUNCE] Apache Storm 0.9.7 Released

2016-09-07 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 0.9.7.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2016/09/07/storm097-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 0.9.7

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v0.9.7/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




Re: Storm UI generated logviewer links problematic

2016-09-06 Thread P. Taylor Goetz
You can override DNS by setting storm.local.hostname in the config yaml on each 
node. Use the public IP/hostname and that should fix your problem.

-Taylor


> On Sep 6, 2016, at 2:47 PM, Joaquin Menchaca  wrote:
> 
> This is a big issue with how the Storm UI generates the URLs to access the Storm 
> logviewer.  I need to have some control over how the Storm UI generates the links, 
> such as using routes or custom names to represent the supervisors.
> 
> In my use case: 1) secure VPC, only accessible through port 22 from the admin VPC, 2) 
> Storm UI available through an ELB which is secured to only use public DNS, 3) 
> internal DNS different than public DNS.
> 
> As I understand, storm ui fetches the supervisor names from zookeeper, and 
> supervisors register to zookeeper using some name it generates (hostname? 
> reverse DNS lookup?)
> 
> The problem is that the DNS names are not usable, even if I park a 
> reverse-proxy, I cannot route to URLs that ONLY make sense to internal DNS.





Re: [SURVEY] What version of Storm are you using?

2016-08-17 Thread P. Taylor Goetz
+1

Any and all feedback is always welcome.

-Taylor

> On Aug 17, 2016, at 9:22 PM, Jungtaek Lim  wrote:
> 
> off-topic:
> Andrew and Erik, please feel free to post to dev@ whenever you hit an 
> issue in streamparse, storm-mesos, etc. that might be caused by 
> upstream, so that the Storm dev community can help if possible.
> 
> On Thu, Aug 18, 2016 at 10:07 AM, Andrew Montalenti wrote:
>> Erik, re: this issue, "blocked by us dealing with mysterious Storm Worker 
>> process heartbeat failures", do you have any topo's @ Groupon using 
>> ShellSpout (multi-lang)?
>> 
>> If so, the patch/resolution in STORM-1928 (released in 1.0.2, see JIRA for 
>> details) may be helpful.
>> 
>> 
>>> On Aug 17, 2016 6:07 PM, "Jungtaek Lim"  wrote:
>>> Side note for users waiting for a stable version of 1.x: 1.0.2 is worth trying 
>>> out since it's well tested and fixes various critical bugs users have hit.
>>> 
>>> - Jungtaek Lim (HeartSaVioR)
>>> 
>>> On Thu, Aug 18, 2016 at 6:58 AM, Erik Weathers wrote:
>>>> Groupon is using 0.9.6.
>>>> 
>>>> Switching to 0.10.0 or 1.0+ will take us more time largely because of the 
>>>> "logback to log4j 2" change in 0.10.  We have a bunch of internal teams 
>>>> using a library for logging which is based on logback, and it's definitely 
>>>> going to be herding cats to get them to upgrade their topologies.
>>>> 
>>>> We are also responsible for the storm-mesos framework, which we will 
>>>> hopefully come back to next month to add support for 1.0+.   Spending 
>>>> time on that is currently blocked by us dealing with mysterious Storm 
>>>> Worker process heartbeat failures.
>>>> 
>>>> - Erik
>>>> 
>>>>> On Wed, Aug 17, 2016 at 2:47 PM, Joaquin Menchaca  
>>>>> wrote:
>>>>> 0.10.0


[SURVEY] What version of Storm are you using?

2016-08-17 Thread P. Taylor Goetz
On the Storm developer list, there are a number of discussions regarding ending 
support for various older versions of Storm. In order to make an informed 
decision I’d like to get an idea of what versions the user community is 
actively using. I’d like to ask the user community to answer the following 
questions so we can best determine which version lines we should continue to 
support, and which ones can be EOL’ed.

1. What version of Storm are you currently using?

2. If you are not on the most recent version, what is preventing you from 
upgrading?


Thanks in advance.

-Taylor




[ANNOUNCE] Apache Storm 1.0.2 Released

2016-08-10 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.2.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2016/08/10/storm102-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.2

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.0.2/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




Re: Storm unique strengths

2016-06-02 Thread P. Taylor Goetz
There are a few things to keep in mind when evaluating Heron and Storm:

First is performance. Twitter benchmarked Heron against a very old, pre-Apache 
version of Storm (back when the transport layer was based on 0mq), so their 
claims of performance improvements over Storm are likely significantly 
overblown. There have been an enormous number of performance improvements since 
then, and the Storm 1.0 release likely erases most of the performance gain 
claimed by the Heron project.

Second, despite their claims, Heron is not API compatible with the latest 
release of Apache Storm. It may be somewhat compatible with the 0.9.x series of 
releases, but 0.10.x is likely to have some compatibility issues (I haven’t 
tested this out so I don’t know for sure), and it’s certainly not compatible 
with 1.0.

Finally, let's look at a few things that Storm has that Heron does not. Off the 
top of my head I can think of:

* End-to-end security (Kerberos, etc.), including secure integration with other 
Apache Hadoop projects like ZooKeeper, HDFS, HBase, etc.
* Trident API (microbatching, exactly-once processing, etc.)
* Distributed Remote Procedure Calls (DRPC)
* Built-in windowing support
* State management (stateful bolts with automatic checkpointing)
* Distributed Cache API
* Kafka integration (though I believe this is coming)
* Integration with HDFS, Hive, HBase, Cassandra, Solr, Elasticsearch, Redis, 
MongoDB, JDBC, MQTT, and Azure Event Hubs.
* Scheduler framework independence (Heron requires Apache Mesos)
* Partial key groupings
* Declarative topology wiring (i.e. Flux)

Is Heron a drop-in replacement for Storm? Probably not.

-Taylor

> On Jun 2, 2016, at 9:27 AM, leon_mcl...@tutanota.com wrote:
> 
> Hi Marc,
> 
> I had come across Heron a couple of weeks ago. It was indeed quite 
> interesting. Thanks for the hint.
> 
> Regards
> Leon
> 
> 
> 1. Jun 2016 11:47 by m.r...@f1-outsourcing.eu:
> 
> 
> Maybe also take into account the new heron
> 
> https://blog.twitter.com/2016/open-sourcing-twitter-heron 
> 
> 
> 
> - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -.
> F1 Outsourcing Development Sp. z o.o.
> Poland
> 
> t: +48 (0)124466845
> f: +48 (0)124466843
> e: m...@f1-outsourcing.eu 
> 
> 
> -Original Message-
> From: leon_mcl...@tutanota.com 
> [mailto:leon_mcl...@tutanota.com]
> Sent: Wednesday, June 1, 2016 11:44
> To: User
> Cc: aaron.doss...@target.com 
> Subject: Re: Storm unique strengths
> 
> Hi Aaron,
> 
> thank you very much for the link. I found it quite insightful. It is one
> of the few benchmarks i have encountered where Storm comes out on top in
> terms of latency, although the at-most once trade-off is quite harsh.
> 
> Regards
> Leon
> 
> 31. May 2016 15:37 by aaron.doss...@target.com:
> 
> 
> 
> Hi Leon,
> 
> This isn’t an advocacy piece per se, but this analysis by several
> members of the Storm community may be helpful. For a particular use case
> you can compare performance and then assess whether the features,
> user-friendliness, or API of a particular framework is worth switching
> to.
> 
> https://yahooeng.tumblr.com/post/135321837876/benchmarking-streaming-computation-engines-at
> 
> 
> From: "leon_mcl...@tutanota.com" <leon_mcl...@tutanota.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Monday, May 30, 2016 at 3:28 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Storm unique strengths
> 
> 
> Hi Storm team,
> 
> there are a lot of online comparisons between Storm and other Data
> Stream Management Systems, yet few of them originate from Storm
> committers/advocates.
> I am trying to identify the aspects that Storm possesses, which
> make it stand out among its direct competitors. Currently there is
> significant competition from Apache Flink, although less so from Spark
> due to its seconds latency restriction.
> 
> From my experience Storm offers a unique support for DSLs, as well
> as a very flexible concept of Spouts and Bolts. Other aspects however
> seem to have been improved upon by Flink in greater part.
> 
> Would you be able to direct me to resources that argue more towards
> Storm's case?
> 
> Thanks in advance.
> 
> Leon





Re: Storm 1.0.0 upgrade Serialization issue

2016-05-11 Thread P. Taylor Goetz
I'm okay with a quick turnaround release for this fix. We've got two valid 
reports of it, and more will follow quickly as users continue to upgrade.

-Taylor

> On May 11, 2016, at 11:00 PM, Jungtaek Lim  wrote:
> 
> KB, 
> 
> Submitted pull request: https://github.com/apache/storm/pull/1412 for 1.x
> 
> Since Storm 1.0.1 was released on May 6, I feel we may want to gather more 
> bugfixes for the next version before starting the release process.
> But if we think it's critical or even a blocker, we could initiate discussion 
> for the next release immediately.
> 
> Hopefully we can release the next version within a month.
> 
> Thanks,
> Jungtaek Lim (HeartSaVioR)
> 
> On Wed, May 11, 2016 at 11:34 PM, KB wrote:
>> Hi Jungtaek,
>> 
>> Thanks for providing the snapshot build with this fix. I have verified the 
>> fix and it is working fine.
>> 
>> Please let me know when I can expect the release with the fix.
>> 
>> Once again thanks a lot for looking into this.
>> 
>>> On Wed, May 11, 2016 at 8:07 AM, Jungtaek Lim  wrote:
>>> Yes sure, here is storm-core jar 1.0.2 snapshot which just applies 
>>> STORM-1773 patch into Storm 1.0.1.
>>> http://people.apache.org/~kabhwan/storm-core-1.0.2-SNAPSHOT.jar
>>> It just changes the version of commons-io from 2.4 to 2.5.
>>> 
>>> Please let me know if this works so that I can submit a pull request.
>>> 
>>> Thanks in advance!
>>> Jungtaek Lim (HeartSaVioR)
>>> 
>>> On Wed, May 11, 2016 at 12:54 AM, KB wrote:
>>>> Thanks for this update. Actually our topology is built on multiple 
>>>> wrappers on top of Storm. I'll try to create a simple topology to reproduce 
>>>> the problem. Meanwhile, would it be possible to create a snapshot release 
>>>> with the fix? I'll test it and let you know.
>>>> 
>>>> Regards,
>>>> 
>>>>> On Tue, May 10, 2016 at 9:39 AM, Jungtaek Lim  wrote:
>>>>> Filed: https://issues.apache.org/jira/browse/STORM-1773
>>>>> 
>>>>> KB, 
>>>>> could you share a sample topology which hits the serialization issue? I would 
>>>>> like to check whether the patch helps resolve it or not.
>>>>> 
>>>>> On Tue, May 10, 2016 at 12:14 PM, Jungtaek Lim wrote:
>>>>>> Samuel and KB,
>>>>>> 
>>>>>> I think Storm 1.x hits the bug in commons-io (IO-368). I'll file an 
>>>>>> issue.
>>>>>> 
>>>>>> Thanks for reporting.
>>>>>> 
>>>>>> Best Regards,
>>>>>> Jungtaek Lim (HeartSaVioR)
>>>>>> 
>>>>>> On Tue, May 10, 2016 at 1:33 AM, KB wrote:
>>>>>>> All,
>>>>>>> 
>>>>>>> This problem persists in release 1.0.1 as well. I would appreciate it if 
>>>>>>> someone could help fix this issue.
>>>>>>> 
>>>>>>> Thanks a lot !!!
>>>>>>> 
>>>>>>>> On Wed, May 4, 2016 at 7:41 PM, KB  wrote:
>>>>>>>> Thanks for your reply Samuel.
>>>>>>>> 
>>>>>>>> I have set up a very simple topology and am not using ObjectMapper or any 
>>>>>>>> other Jackson classes, although we are using the Jackson libraries
>>>>>>>> Jackson-core-2.6.2
>>>>>>>> Jackson-databind-2.4.5
>>>>>>>> 
>>>>>>>> and these versions have not changed between Storm version 0.9 and 1.0.0. 
>>>>>>>> Please let me know if you make any progress on this issue.
>>>>>>>> 
>>>>>>>> Meanwhile, would it help if we raise a JIRA issue to track the problem?
>>>>>>>> 
>>>>>>>> Please advise.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> 
>>>>>>>>> On Tue, May 3, 2016 at 10:17 PM,  wrote:
>>>>>>>>> Hi,
>>>>>>>>> 
>>>>>>>>> we had a similar issue (see 
>>>>>>>>> https://mail-archives.apache.org/mod_mbox/storm-user/201604.mbox/%3C645fd70cb0874be0ac1f1e41a0f9393b%40SG001741.corproot.net%3E
>>>>>>>>> ). So far, we have not been able to solve it, but we currently have 
>>>>>>>>> a suspicion that it might be related to the Jackson ObjectMapper we 
>>>>>>>>> use. Can I check whether you also use that?
>>>>>>>>> 
>>>>>>>>> With kind regards
>>>>>>>>> 
>>>>>>>>> Samuel
>>>>>>>>> 
>>>>>>>>> From: KB [mailto:jam.develo...@gmail.com] 
>>>>>>>>> Sent: Tuesday, May 3, 2016 18:43
>>>>>>>>> To: user@storm.apache.org
>>>>>>>>> Subject: Storm 1.0.0 upgrade Serialization issue
>>>>>>>>> 
>>>>>>>>> Hello,
>>>>>>>>> 
>>>>>>>>> We have recently upgraded to Storm 1.0.0. Our system was in 
>>>>>>>>> production for a long time with Storm 0.9.
>>>>>>>>> 
>>>>>>>>> Our topology is not getting loaded with this upgrade. It was working 
>>>>>>>>> fine with 0.9.
>>>>>>>>> 
>>>>>>>>> I am getting the following error:
>>>>>>>>> 
>>>>>>>>> ---
>>>>>>>>> 
>>>>>>>>> 119662 [Thread-11] ERROR o.a.s.d.worker - Error on initialization of 
>>>>>>>>> server mk-worker
>>>>>>>>> 
>>>>>>>>> java.lang.RuntimeException: java.lang.ClassNotFoundException: boolean
>>>>>>>>> 
>>>>>>>>>   at org.apache.storm.utils.Utils.javaDeserialize(Utils.java:181) 
>>>>>>>>> ~[storm-core-1.0.0.jar:1.0.0]
>>>>>>>>> 
>>>>>>>>>   at 
>>>>>>>>> org.apache.sto

[ANNOUNCE] Apache Storm 1.0.1 Released

2016-05-06 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.1.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2016/05/06/storm101-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.1

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.0.1/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




[ANNOUNCE] Apache Storm 0.10.1 Released

2016-05-06 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 0.10.1.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2016/05/05/storm0101-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 0.10.1

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v0.10.1/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




Re: Spout Questions

2016-05-02 Thread P. Taylor Goetz
nextTuple(), ack(), and fail() are all called by the same thread. nextTuple() 
should be fast, so you probably only want to emit one or a handful of tuples. 
Emitting a huge number of tuples in the nextTuple() method is what’s causing 
your problem.
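
A minimal sketch of that idea in plain Java follows. It does not use the Storm API: Emitter is a hypothetical stand-in for the spout's collector, and offer() stands for whatever fetches data (for example, a separate thread reading from Redis). The point is only that each nextTuple() call emits at most one buffered message and then returns, so the same thread is free to run ack() and fail() in between.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical stand-in for the collector a spout emits to. */
interface Emitter {
    void emit(String message);
}

/** Sketch of a spout whose nextTuple() emits at most one buffered message per call. */
class BoundedEmitSpoutSketch {
    private final BlockingQueue<String> buffer = new LinkedBlockingQueue<>();
    private final Emitter emitter;

    BoundedEmitSpoutSketch(Emitter emitter) {
        this.emitter = emitter;
    }

    /** Called by the fetching side to hand over a message. */
    void offer(String message) {
        buffer.offer(message);
    }

    /** Never blocks: emits one message if available, then returns. */
    void nextTuple() {
        String message = buffer.poll();
        if (message != null) {
            emitter.emit(message);
        }
    }
}
```

Whether to emit one tuple or a small batch per call is a tuning choice; the sketch uses one to keep the example short.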

-Taylor

> On May 2, 2016, at 9:08 AM, Adrien Carreira  wrote:
> 
> Hi there,
> 
> I don't know if I'm in the right place, but let's try.
> 
> I'm building a topology, and I have a spout plugged into Redis.
> 
> My question is: when the topology is active, why isn't the nextTuple() method 
> called while the ack() method is being called?
> 
> Meaning, I get about 10k ack() calls without nextTuple() being called...
> 
> So what is going on is: nextTuple() is called to emit 3k messages and stops, 
> then ack() is called to ack all the messages without nextTuple() being called 
> to refeed the topology.
> 
> What can be the problem?
> 
> 
> Thanks for your feedback and sorry for my bad English.
> 





Re: thread safe output collector

2016-04-28 Thread P. Taylor Goetz
I was partly right. In earlier versions (pre-1.0), OutputCollector was not 
thread safe (due to shuffle grouping). 1.0 introduced a new shuffle grouping 
implementation that is thread-safe, in addition to a load-aware shuffle 
grouping. I had missed the fact that the new shuffle grouping is thread safe.

-Taylor

> On Apr 28, 2016, at 1:48 PM, Steven Lewis  wrote:
> 
> We tried implementing our own thread safe output collector but it seems 
> ridiculous for such a concurrent system. Why don’t they implement it in core 
> Storm? They can have the current one, and then build a separate one called 
> Concurrent_OutputCollector or some such. I was hoping they fixed that in 
> version 1.0 and I missed it.
> 
> From: Stephen Powis <spo...@salesforce.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Thursday, April 28, 2016 at 10:58 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Re: thread safe output collector
> 
> So the Spout documentation (assuming it's correct...) here 
> (http://storm.apache.org/releases/current/Concepts.html#spouts) mentions 
> this:
> 
> "The main method on spouts is nextTuple. nextTuple either emits a new tuple 
> into the topology or simply returns if there are no new tuples to emit. It is 
> imperative that nextTuple does not block for any spout implementation, 
> because Storm calls all the spout methods on the same thread."
> 
> When developing a custom spout we interpreted it to mean that any "real work" 
> done by a spout should be done in a separate thread, and decided on the 
> following pattern which seems somewhat relevant to what you are trying to do 
> in your bolts.
> 
> On Spout prepare, we create a concurrent/thread safe queue.  We then create a 
> new Thread passing it a reference to our thread safe queue.  This thread 
> handles finding new data that needs to be emitted.  When that thread finds 
> data, it adds it to the shared queue.  When the spout's nextTuple() method is 
> called, it looks for data on the shared queue and emits it.
> 
> I imagine doing async processing in a bolt using one or more threads could 
> work with a similar pattern.  On prepare you setup your thread(s) with 
> references to a shared queue.  The bolt passes work to be completed to the 
> thread(s), the thread(s) communicate back to the bolt the result via a shared 
> queue.  Add in the concept of tick tuples to ensure your bolt checks for 
> completed work on a regular basis?
> 
> Is there a better way to do this?
> 
> On Thu, Apr 28, 2016 at 11:22 AM, Julien Nioche 
> <lists.digitalpeb...@gmail.com> wrote:
> Thanks for the clarification
> 
> On 28 April 2016 at 15:12, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> The documentation is wrong. See:
> 
> https://issues.apache.org/jira/browse/STORM-841
> 
> At some point it looks like the change made there got reverted. I will reopen 
> it to make sure the documentation is corrected.
> 
> OutputCollector is NOT thread-safe.
> 
> -Taylor
> 
>> On Apr 28, 2016, at 9:06 AM, Stephen Powis <spo...@salesforce.com> wrote:
>> 
>> "Its perfectly fine to launch new threads in bolts that do processing 
>> asynchronously. OutputCollector 
>> <http://storm.apache.org/releases/current/javadocs/org/apache/storm/task/OutputCollector.html>
>>  is thread-safe and can be called at any time."
>> 
>> 
>> From the docs for 0.9.6: 
>> http://storm.apache.org/releases/0.9.6/Concepts.html#bolts
>> 
>> On Thu, Apr 28, 2016 at 9:03 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
>> IIRC there was discussion about making it thread safe, but I don't believe 
>> it was implemented.
>> 
>> -Taylor
>> 
>> On Apr 28, 2016, at 3:52 AM, Julien Nioche <lists.digitalpeb...@gmail.com> wrote:
>> 
>>> Hi Stephen
>>> 
>>> I asked the same question in February but did not get a reply
>>> 
>>> https://mail-archives.apache.org/mod_mbox/storm-user/201602.mbox/%3cca+-fm0urpf3fuerozywpzmxu-kdbgf-zj3wbyr8evsaqjc6...@mail.gmail.com%3E
>>> 
>>> Anyone who c

Re: thread safe output collector

2016-04-28 Thread P. Taylor Goetz
The documentation is wrong. See:

https://issues.apache.org/jira/browse/STORM-841

At some point it looks like the change made there got reverted. I will reopen 
it to make sure the documentation is corrected.

OutputCollector is NOT thread-safe.
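
If you do emit from your own threads, one common workaround is to serialize access yourself. The sketch below is plain Java rather than Storm code: UnsafeCollector is a hypothetical stand-in for a collector that is not thread-safe, and the wrapper simply takes a lock on it around every emit. In a real bolt, every code path that touches the collector (including acks and emits made from execute()) would presumably need to take the same lock.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a collector that is not thread-safe. */
class UnsafeCollector {
    private final List<String> emitted = new ArrayList<>();

    void emit(String value) {
        emitted.add(value);
    }

    int size() {
        return emitted.size();
    }
}

/** Serializes emits from worker threads by locking on the shared collector. */
class SynchronizedEmitSketch {
    private final UnsafeCollector collector;

    SynchronizedEmitSketch(UnsafeCollector collector) {
        this.collector = collector;
    }

    void emitFromAnyThread(String value) {
        synchronized (collector) {
            collector.emit(value);
        }
    }

    /** Starts the given number of workers, each emitting perThread values; returns the total. */
    static int demo(int threads, int perThread) {
        UnsafeCollector collector = new UnsafeCollector();
        SynchronizedEmitSketch bolt = new SynchronizedEmitSketch(collector);
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threads; t++) {
            Thread worker = new Thread(() -> {
                for (int i = 0; i < perThread; i++) {
                    bolt.emitFromAnyThread("v" + i);
                }
            });
            workers.add(worker);
            worker.start();
        }
        try {
            for (Thread worker : workers) {
                worker.join(); // wait for all emits before reading the count
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return collector.size();
    }
}
```

The alternative is to have worker threads put results on a concurrent queue and let the bolt's own thread drain it and emit, which avoids the lock at the cost of some latency.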

-Taylor

> On Apr 28, 2016, at 9:06 AM, Stephen Powis  wrote:
> 
> "Its perfectly fine to launch new threads in bolts that do processing 
> asynchronously. OutputCollector 
> <http://storm.apache.org/releases/current/javadocs/org/apache/storm/task/OutputCollector.html>
>  is thread-safe and can be called at any time."
> 
> 
> 
> From the docs for 0.9.6: 
> http://storm.apache.org/releases/0.9.6/Concepts.html#bolts
> 
> On Thu, Apr 28, 2016 at 9:03 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
> IIRC there was discussion about making it thread safe, but I don't believe it 
> was implemented.
> 
> -Taylor
> 
> On Apr 28, 2016, at 3:52 AM, Julien Nioche <lists.digitalpeb...@gmail.com> wrote:
> 
>> Hi Stephen
>> 
>> I asked the same question in February but did not get a reply
>> 
>> https://mail-archives.apache.org/mod_mbox/storm-user/201602.mbox/%3cca+-fm0urpf3fuerozywpzmxu-kdbgf-zj3wbyr8evsaqjc6...@mail.gmail.com%3E
>> 
>> Anyone who could confirm this?
>> 
>> Thanks
>> 
>> On 27 April 2016 at 14:05, Steven Lewis <steven.le...@walmart.com> wrote:
>> I have conflicting information, and have not checked personally but has the 
>> output collector finally been made thread safe for emitting in version 1.0 
>> or 0.10? I know it was a huge problem in 0.9.5 when trying to do threading 
>> in a bolt for async future calls and emitting once it returns.
>> 
>> 
>> 
>> 
>> --
>> 
>> Open Source Solutions for Text Engineering
>> 
>> http://www.digitalpebble.com
>> http://digitalpebble.blogspot.com/
>> #digitalpebble <http://twitter.com/digitalpebble>
> 





Re: thread safe output collector

2016-04-28 Thread P. Taylor Goetz
IIRC there was discussion about making it thread safe, but I don't believe it 
was implemented.

-Taylor

> On Apr 28, 2016, at 3:52 AM, Julien Nioche  
> wrote:
> 
> Hi Stephen
> 
> I asked the same question in February but did not get a reply
> 
> https://mail-archives.apache.org/mod_mbox/storm-user/201602.mbox/%3cca+-fm0urpf3fuerozywpzmxu-kdbgf-zj3wbyr8evsaqjc6...@mail.gmail.com%3E
> 
> Anyone who could confirm this?
> 
> Thanks
> 
>> On 27 April 2016 at 14:05, Steven Lewis  wrote:
>> I have conflicting information, and have not checked personally but has the 
>> output collector finally been made thread safe for emitting in version 1.0 
>> or 0.10? I know it was a huge problem in 0.9.5 when trying to do threading 
>> in a bolt for async future calls and emitting once it returns.
>> 
> 
> 
> 
> -- 
> 
> Open Source Solutions for Text Engineering
> 
> http://www.digitalpebble.com
> http://digitalpebble.blogspot.com/
> #digitalpebble


[ANNOUNCE] Apache Storm 1.0 Released

2016-04-12 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 1.0.0.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2016/04/12/storm100-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 1.0.0

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v1.0.0/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




Re: Next version of storm?

2016-03-30 Thread P. Taylor Goetz
Tid,

That fix will be included in the 1.0 release. That release will likely happen 
in the next week or so. This is a major release and the dev community is 
working hard to make sure this is a solid release that's ready for users (i.e., 
not beta). We tend to set a bar for releases, not strict dates. When we reach 
that bar, we release.

Thank you and the entire Storm user community for your patience. I can assure 
you we are close.

-Taylor

> On Mar 30, 2016, at 6:57 PM, Tech Id  wrote:
> 
> 
> STORM-971 is another good fix we would like to have.
> 
> We would appreciate it if someone could tell us when the next version of Storm 
> is scheduled.
> 
> 
> Thanks for helping,
> Regards
> Tid
> 
>> On Sat, Mar 12, 2016 at 6:23 AM, Craig Charleton  
>> wrote:
>> Mr. Goetz,
>> 
>> I was unable to find the answer to this question online. I apologize if I 
>> missed it. I was wondering what changes/improvements will be present around 
>> stateful bolts in 1.0?
>> 
>> Many Thanks!
>> 
>> 
>> 
>> > On Mar 11, 2016, at 11:04 PM, P. Taylor Goetz  wrote:
>> >
>> > 1.0 should be released this month. We will likely update 0.10.x as well.
>> >
>> > 1.0 is a big release, and we want to make sure it's solid.
>> >
>> > -Taylor
>> >
>> >
>> >
>> >> On Mar 11, 2016, at 10:47 PM, Tech Id  wrote:
>> >>
>> >> Hi,
>> >>
>> >> We need to have the fix for:
>> >>STORM-1207: Added flux support for IWindowedBolt
>> >>
>> >> It's been ~5 months since Storm was last released (Oct 2015) and ~1000 
>> >> commits have gone into it since then.
>> >> If someone can tell me when the newer version of Storm will be 
>> >> released, it would be a great help.
>> >>
>> >> Appreciate your time,
>> >> Regards
>> >> Tid
> 


Re: Next version of storm?

2016-03-11 Thread P. Taylor Goetz
1.0 should be released this month. We will likely update 0.10.x as well.

1.0 is a big release, and we want to make sure it's solid.

-Taylor



> On Mar 11, 2016, at 10:47 PM, Tech Id  wrote:
> 
> Hi,
> 
> We need to have the fix for:
> STORM-1207: Added flux support for IWindowedBolt
> 
> It's been ~5 months since Storm was last released (Oct 2015) and ~1000 commits 
> have gone into it since then.
> If someone can tell me when the newer version of Storm will be released, 
> it would be a great help.
> 
> Appreciate your time,
> Regards
> Tid


Re: New Concurrent modification exception's after storm 0.10.0

2016-03-03 Thread P. Taylor Goetz
Hi Stephen,

Can you provide a stack trace that indicates where this is occurring?

-Taylor


> On Mar 2, 2016, at 1:49 PM, Stephen Powis  wrote:
> 
> Hey!
> 
> Did anything change between storm 0.9.5 and 0.10.0 regarding 
> ConcurrentModificationExceptions and how they are detected?  We've had a 
> topology running for the last 6months or so and never saw this exception.
> 
> After upgrading to Storm 0.10.x which didn't require any changes to our 
> topology/bolt/business logic, we're now seeing these intermittently and have 
> been struggling to see where we've gone wrong -- We don't seem to be 
> modifying values in the emitted tuples anywhere after emitting.
> 
> Thanks!
> Stephen





Re: version 1.0?

2016-02-17 Thread P. Taylor Goetz
Hi Maciek,

We’re getting close. Hopefully in the next 2-3 weeks.

If you’re interested in tracking progress, look at this JIRA ticket:

https://issues.apache.org/jira/browse/STORM-1491 


We will likely release once all those issues have been resolved.


-Taylor


> On Feb 17, 2016, at 2:23 PM, Maciek Próchniak  wrote:
> 
> Hi,
> 
> Is there any timeline for 1.0 release?
> We're evaluating Storm (together with Flink) for our client and it'd be great 
> for us to have sliding window support.
> Guess we could use version built from sources for some time - but we still 
> need some estimates on 1.0 availability.
> 
> thanks,
> maciek
> 





Re: Storm + HDFS

2016-02-03 Thread P. Taylor Goetz
Assuming you have git and maven installed:

git clone git@github.com:apache/storm.git
cd storm
git checkout -b 1.x origin/1.x-branch
mvn install -DskipTests

That third step checks out the 1.x-branch branch which is the base for the 
upcoming 1.0 release.

You can then include the storm-hdfs dependency in your project:


<dependency>
    <groupId>org.apache.storm</groupId>
    <artifactId>storm-hdfs</artifactId>
    <version>1.0.0-SNAPSHOT</version>
</dependency>


You can find more information on using the spout and other HDFS components here:

https://github.com/apache/storm/tree/1.x-branch/external/storm-hdfs#hdfs-spout 


-Taylor

> On Feb 3, 2016, at 2:54 PM, K Zharas  wrote:
> 
> Oh ok. Can you please give me an idea of how I can do it manually? I'm quite a 
> beginner :)
> 
> On Thu, Feb 4, 2016 at 3:43 AM, Parth Brahmbhatt  > wrote:
> Storm-hdfs spout is not yet published in maven. You will have to checkout 
> storm locally and build it to make it available for development.
> 
> From: K Zharas <kgzha...@gmail.com>
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Wednesday, February 3, 2016 at 11:41 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Re: Storm + HDFS
> 
> Yes, looks like it is. But, I have added dependencies required by storm-hdfs 
> as stated in a guide.
> 
> On Thu, Feb 4, 2016 at 3:33 AM, Nick R. Katsipoulakis  > wrote:
> Well,
> 
> those errors look like a problem with the way you build your jar file.
> Please make sure that you build your jar with the proper Storm Maven 
> dependency.
> 
> Cheers,
> Nick
> 
> On Wed, Feb 3, 2016 at 2:31 PM, K Zharas  > wrote:
> It throws an error that the packages do not exist. I have also tried changing 
> org.apache to backtype; I still got an error, but only for storm.hdfs.spout. 
> Btw, I use Storm-0.10.0 and Hadoop-2.7.1
> 
>package org.apache.storm does not exist
>package org.apache.storm does not exist
>package org.apache.storm.generated does not exist
>package org.apache.storm.metric does not exist
>package org.apache.storm.topology does not exist
>package org.apache.storm.utils does not exist
>package org.apache.storm.utils does not exist
>package org.apache.storm.hdfs.spout does not exist
>package org.apache.storm.hdfs.spout does not exist
>package org.apache.storm.topology.base does not exist
>package org.apache.storm.topology does not exist
>package org.apache.storm.tuple does not exist
>package org.apache.storm.task does not exist
> 
> On Wed, Feb 3, 2016 at 8:57 PM, Matthias J. Sax  > wrote:
> Storm does provide HdfsSpout and HdfsBolt already. Just use those,
> instead of writing your own spout/bolt:
> 
> https://github.com/apache/storm/tree/master/external/storm-hdfs 
> 
> 
> -Matthias
> 
> 
> On 02/03/2016 12:34 PM, K Zharas wrote:
> > Can anyone help to create a Spout which reads a file from HDFS?
> > I have tried with the code below, but it is not working.
> >
> > public void nextTuple() {
> >   Path pt=new Path("hdfs://localhost:50070/user/BCpredict.txt");
> >   FileSystem fs = FileSystem.get(new Configuration());
> >   BufferedReader br = new BufferedReader(new
> > InputStreamReader(fs.open(pt)));
> >   String line = br.readLine();
> >   while (line != null){
> >  System.out.println(line);
> >  line=br.readLine();
> >  _collector.emit(new Values(line));
> >   }
> > }
> >
> > On Tue, Feb 2, 2016 at 1:19 PM, K Zharas  > 
> > >> wrote:
> >
> > Hi.
> >
> > I have a project I'm currently working on. The idea is to implement
> > "scikit-learn" into Storm and integrate it with HDFS.
> >
> > I've already implemented "scikit-learn". But, currently I'm using a
> > text file to read and write. However, I need to use HDFS, but
> > finding it hard to integrate with HDFS.
> >
> > Here is the link to github
> >  > >. (I only included
> > files that I used, not whole project)
> >
> > Basically, I have a few questions, if you don't mind answering them:
> > 1) How to use HDFS to read and write?
> > 2) Is my "scikit-learn" implementation correct?
> > 3) How to create a Storm project? (Currently working in "storm-starter")
> >
> > These questions may sound a bit silly, but I really can't find a
> > proper solution.
> >
> > Thank you for your attention to this matter.
> > Sincerely, Zharas.
> >
> >
> >
> >
> > --
> > Best regards,
> > Zharas
> 
> 
> 
> 
> --
> Best regards,
> Zharas
> 
> 
> 
> --
> Nick R. Katsipou

Re: Acking of anchor tuple list decreases throughput?

2016-01-30 Thread P. Taylor Goetz
Interesting conversation.

The back pressure mechanism in 1.0 should help.

Do you guys have environments that you could test that in?

Better yet, do you have code to share?

-Taylor

> On Jan 30, 2016, at 9:05 PM, hokiege...@gmail.com wrote:
> 
> Hey Kashyap,
> 
> Excellent points, especially regarding compression. I've thought about trying 
> compression, and your results indicate that's worth a shot.
> 
> Also, I concur on fields grouping, especially with a dramatic fan-out 
> followed by a fan-in, which is what I am currently working with.
> 
> Sure glad I started this thread today because both you and Nick have shared 
> lots of excellent thoughts--much appreciated, and thanks to you both!
> 
> --John
> 
> Sent from my iPhone
> 
>> On Jan 30, 2016, at 7:34 PM, Kashyap Mhaisekar  wrote:
>> 
>> John, Nick
>> I don't have direct answers but here is one test I did based on which I 
>> concluded that tuple size does matter.
>> My use case was like this -
>> Spout S emits a number X (say 1 or 100 or 1024 etc) -> Bolt A (which 
>> generates a string of X KB and emits it 200 times) -> Bolt C (which 
>> just prints the length of the string). All are shuffle grouped, with no 
>> limit on max spout pending.
>> 
>> As you notice, this is a pretty straight topology with really nothing much 
>> in this except emitting out Strings of varying sizes.
>> 
>> As the size increases, I notice that the throughput (number of acks on the 
>> spout divided by total time taken) decreases. The test was done on one machine 
>> so that the network can be ruled out. The only things in play here are the LMAX 
>> disruptor and Kryo (de)serialization.
>> 
>> Another test: if Bolt C was fields grouped on X, then I see that the 
>> performance drops much further, probably because all the deserialization is 
>> being done on one instance of the bolt and also because the queues fill up.
>> 
>> That being said, when I compressed the emits from Bolt A (using Snappy 
>> compression), I saw that the throughput increased drastically. I interpret 
>> this as the reduction in size due to compression having improved throughput.
>> 
>> Unfortunately, I did not check VisualVM at the time.
>> 
>> Hope this helps.
>> 
>> Thanks
>> Kashyap
>>> On Sat, Jan 30, 2016 at 4:54 PM, John Yost  wrote:
>>> Also, I am wondering if this issue is actually fixed in 0.10.0: 
>>> https://issues.apache.org/jira/browse/STORM-292  What do you guys think?
>>> 
>>> --John
>>> 
 On Sat, Jan 30, 2016 at 5:53 PM, John Yost  wrote:
 Hi Kashyap,
 
 Question--what percentage of time is spent in Kryo deserialization and how 
 much in LMAX disruptor?
 
 --John
 
> On Sat, Jan 30, 2016 at 5:18 PM, Kashyap Mhaisekar  
> wrote:
> That is right. But for decently well-written code, the disruptor is almost 
> always the CPU hog. That said, on the issue of emits taking time, we 
> found that the size of the emitted object matters. Kryo times for serialization 
> and deserialization increase with size.
> 
> But does size have a correlation with disruptor showing up big time in 
> profiling?
> 
> Thanks
> Kashyap
> 
> Kashyap, 
> 
> It is only expected to see the Disruptor dominating CPU time. It is the 
> object responsible for sending/receiving tuples (at least when you have 
> tuples produced by one executor thread for another executor thread on the 
> same machine). Therefore, it is expected to see Disruptor having 
> something like ~80% of the time. 
> 
> A nice experiment to check my statement above is to create a Bolt that 
> for every tuple it receives, it performs a random CPU task (like nested 
> for loops) and it emits a tuple only after receiving X number of tuples, 
> where X > 1. Then, I expect that you will see the percentage of CPU time 
> for the Disruptor object to drop.
> 
> Cheers,
> Nick
> 
>> On Sat, Jan 30, 2016 at 3:40 PM, Kashyap Mhaisekar  
>> wrote:
>> John, Nick
>> Thanks for broaching this topic. In my case, 1 tuple from spout gives 
>> out 200 more tuples. I too see the same class listed in VisualVM 
>> profiling... And tried bringing this down... I reduced parallelism 
>> hints, played with buffers, changed lmax strategies, changed max spout 
>> pending... Nothing seems to have an impact
>> 
>> Any ideas on what could be done for this?
>> 
>> Thanks
>> Kashyap
>> 
>> Hello John, 
>> 
>> First off, let us agree on your definition of throughput. Do you define 
>> throughput as the average number of tuples each of your last bolts 
>> (sinks) emit per second? If yes, then OK. Otherwise, please provide us 
>> with more details.
>> 
>> Going back to the BlockingWaitStrategy observation you have, it (most 
>> probably) means that since you are producing a large number of tuples 
>> (15-20 tuples) the outgoing Disruptor queue gets 

Re: Custom implementation of ISpoutWaitStrategy

2015-12-22 Thread P. Taylor Goetz
If the call to a spout's `nextTuple()` method does not emit anything, Storm 
will call the `emptyEmit()` method with the number of times `nextTuple()` has 
consecutively failed to emit anything (`streak` will be reset to 0 if the spout 
emits something).
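
As a rough illustration of how `streak` might be used, here is a sketch of a backoff strategy that waits longer the more consecutive empty calls it sees. The `WaitStrategy` interface below is a stand-in that mirrors only the two methods named in this thread (it is not imported from Storm), and the config keys `backoff.base.ms` and `backoff.max.ms` are hypothetical.

```java
import java.util.Map;

// Stand-in mirroring the prepare/emptyEmit methods discussed in this thread.
interface WaitStrategy {
    void prepare(Map<String, Object> conf);

    void emptyEmit(long streak);
}

class BackoffWaitStrategy implements WaitStrategy {
    private long baseMillis = 1;
    private long maxMillis = 100;

    @Override
    public void prepare(Map<String, Object> conf) {
        // Hypothetical config keys; fall back to the defaults when absent.
        Object base = conf.get("backoff.base.ms");
        Object max = conf.get("backoff.max.ms");
        if (base instanceof Number) baseMillis = Math.max(0, ((Number) base).longValue());
        if (max instanceof Number) maxMillis = Math.max(0, ((Number) max).longValue());
    }

    // Doubles the wait for each consecutive empty nextTuple() call, capped at maxMillis.
    long sleepMillisFor(long streak) {
        long shift = Math.min(Math.max(streak, 0), 20); // bound the shift to avoid overflow
        return Math.min(maxMillis, baseMillis << shift);
    }

    @Override
    public void emptyEmit(long streak) {
        long millis = sleepMillisFor(streak);
        if (millis <= 0) return;
        try {
            Thread.sleep(millis);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }
}
```

Because `streak` resets once the spout emits again, a strategy like this would back off only while the source stays idle.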


-Taylor

> On Dec 22, 2015, at 6:03 AM, Denis DEBARBIEUX  wrote:
> 
> Dear all,
> 
> I would like to implement my own SpoutWaitStrategy, so I have to implement
> ISpoutWaitStrategy and its prepare and emptyEmit methods.
> 
> But, I do not understand the meaning of the parameter 'streak'. How
> should I use it?
> 
> Thanks for your help.
> 
> Denis
> 
> 





Storm at Hadoop Summit Europe

2015-12-15 Thread P. Taylor Goetz
Voting for sessions for Hadoop Summit Europe ends today. There are a number of 
Storm-related sessions that have been proposed. Below are the ones I found, but 
I didn’t do an exhaustive search. If there are others I missed, feel free to 
tack onto this thread.

If you have a chance, cast a vote (you can vote more than once) for some Storm 
sessions so we have a good presence at the conference.

From Device to Data Center to Insights: Architectural Considerations for the 
Internet of Anything 


Running Storm with 5 9's availability 


The Future of Apache Storm 


Practical Complex Event Processing with Storm 


Finding Outliers with Spark and Storm: Guide to Keeping Your Sanity 


Querying the internet of things: streaming SQL on Kafka/Samza and Storm/Trident 


Streaming SQL on Storm 


-Taylor




Re: Fan in problem: virtually all time spent in network I/O wait

2015-12-09 Thread P. Taylor Goetz
Hi John,

I think it *may* make sense, but without more details like code/sample data, it 
is hard to say.

Whenever you use a fields grouping, key distribution can come into play and 
affect scaling.
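
To make the key-distribution point concrete, here is a small sketch. It assumes a simplified routing model (hash of the grouping key modulo the number of target tasks), which only approximates what a fields grouping does; the class and the sample keys are made up for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class FieldsGroupingSkew {
    // Assumed routing model: non-negative hash of the key modulo the task count.
    static int taskFor(String key, int numTasks) {
        return Math.floorMod(key.hashCode(), numTasks);
    }

    // Counts how many tuples land on each task for the given stream of keys.
    static int[] distribute(List<String> keys, int numTasks) {
        int[] counts = new int[numTasks];
        for (String key : keys) {
            counts[taskFor(key, numTasks)]++;
        }
        return counts;
    }

    // 90 tuples share one hot key; the remaining 10 each have a distinct key.
    static List<String> sampleKeys() {
        List<String> keys = new ArrayList<>();
        for (int i = 0; i < 90; i++) keys.add("hot-key");
        for (int i = 0; i < 10; i++) keys.add("key-" + i);
        return keys;
    }

    public static void main(String[] args) {
        // One task receives at least 90 of the 100 tuples, however many tasks exist.
        System.out.println(Arrays.toString(distribute(sampleKeys(), 5)));
    }
}
```

With a skewed key set like this, adding more target tasks does not help, because every tuple with the hot key still goes to the same task.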

-Taylor

> On Dec 9, 2015, at 9:31 PM, John Yost  wrote:
> 
> Hi Everyone,
> 
> I have a large fan in within my topology where I go from 1000 Bolt A 
> executors to 50 Bolt B executors via fieldsGrouping.  When I profile via 
> jvisualvm, it shows that the Bolt A thread spends 99% of its time in the 
> com.lmax.disruptor.BlockingWaitStrategy.waitFor method.
> 
> The topology details are as follows:
> 
> 200 workers
> 20 KafkaSpout executors
> 1000 Bolt A executors
> 50  Bolt B executors
> 
> fieldsGrouping from Bolt A -> Bolt B because I am caching in Bolt B, building 
> up large Key/Value pairs for HFile import into HBase.
> 
> I am thinking that adding an extra bolt between Bolt A and Bolt B, where I do a 
> localOrShuffleGrouping to go from 1000 -> 200 locally followed by a 
> fieldsGrouping to go from 200 -> 50, will lessen network I/O wait time.
> 
> Please confirm if this makes sense or if there are any other better ideas.
> 
> Thanks
> 
> --John


Re: Multithreading in Bolt vs more Bolts: tradeoffs?

2015-11-23 Thread P. Taylor Goetz
I think it depends on what exactly you're doing in those bolts.

In other words, we'd need more detail, like code or pseudo code. Otherwise we 
are guessing.

-Taylor

> On Nov 23, 2015, at 4:37 PM, John Yost  wrote:
> 
> Hi Everyone,
> 
> I have a large fan out in my topology 20 spouts -> 1000 boltA executors 
> followed by a large fan-in (1000 boltA executors to 200 boltB executors). The 
> performance of the fan-in is really bad. I am wondering if it would make more 
> sense to instead do multithreading in boltA so that I go 20 KS -> 200 boltA 
> -> 200 boltB.
> 
> Any opinions/ideas/comments would be greatly appreciated.
> 
> Thanks! :)
> 
> --John


Re: Compiling and executing storm

2015-11-13 Thread P. Taylor Goetz
That will happen if you don’t have a GPG key setup, but the error is benign — 
the distribution archive will be created, it just won’t be signed. Or you can 
skip it as you’ve done.

-Taylor

> On Nov 13, 2015, at 2:50 PM, Rodrigo Valladares  
> wrote:
> 
> That worked when I skipped gpg authentication:
> mvn package -Dgpg.skip=true
> 
> When I used mvn package I got the following error:
> 
> [INFO] --- maven-gpg-plugin:1.6:sign (default) @ apache-storm-bin ---
> 
> gpg: no default secret key: No secret key
> 
> 
> gpg: signing failed: No secret key
> 
> 
> [ERROR] Failed to execute goal 
> org.apache.maven.plugins:maven-gpg-plugin:1.6:sign (default) on project 
> apache-storm-bin: Exit code: 2 -> [Help 1]
> 
> Thanks
> 
> 
> 2015-11-13 12:16 GMT-06:00 Parth Brahmbhatt  >:
> See if the following works for you.
> 
> cd $STORM/storm-dist/binary
> mvn package
> cd target && tar xvf apache-storm-$version.tar.gz
> cd apache-storm-$version
> bin/storm nimbus
> 
> Thanks
> Parth
> 
> From: Rodrigo Valladares
> Reply-To: "user@storm.apache.org" <user@storm.apache.org>
> Date: Friday, November 13, 2015 at 11:01 AM
> To: "user@storm.apache.org" <user@storm.apache.org>
> Subject: Compiling and executing storm
> 
> Hello
> 
> I am a newbie to Storm and I want to start looking at and hacking on the source 
> code. I downloaded the release version and ran it on my cluster. I am 
> having a bit of a hard time compiling the code from source and then setting up 
> the cluster from my own compiled version. When I execute the bin/storm nimbus 
> script I get the following message:
> 
> The storm client can only be run from within a release. You appear to be 
> trying to run the client from a checkout of Storm's source code.
> 
> You can download a Storm release at http://storm-project.net/downloads.html 
> 
> Have any of you compiled storm from source?
> Can you give me any tips on how to do that.
> 
> 
> 
> Thank you
> Rodrigo Valladares Cotta
> Master's Student, Computer Science
> University of Nebraska-Lincoln
> 
> 
> 
> --
> Rodrigo Valladares Cotta
> Master's Student, Computer Science
> University of Nebraska-Lincoln





[ANNOUNCE] Apache Storm 0.10.0 Released

2015-11-05 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 0.10.0.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2015/11/05/storm0100-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 0.10.0

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v0.10.0/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




[ANNOUNCE] Apache Storm 0.9.6 Released

2015-11-05 Thread P. Taylor Goetz
The Apache Storm community is pleased to announce the release of Apache Storm 
version 0.9.6.

Storm is a distributed, fault-tolerant, and high-performance realtime 
computation system that provides strong guarantees on the processing of data. 
You can read more about Storm on the project website:

http://storm.apache.org

Downloads of source and binary distributions are listed in our download
section:

http://storm.apache.org/downloads.html

You can read more about this release in the following blog post:

https://storm.apache.org/2015/11/05/storm096-released.html

Distribution artifacts are available in Maven Central at the following 
coordinates:

groupId: org.apache.storm
artifactId: storm-core
version: 0.9.6

The full list of changes is available here[1]. Please let us know [2] if you 
encounter any problems.

Regards,

The Apache Storm Team

[1]: https://github.com/apache/storm/blob/v0.9.6/CHANGELOG.md
[2]: https://issues.apache.org/jira/browse/STORM




Re: Storm 0.9.6

2015-11-03 Thread P. Taylor Goetz
Hi Praj,

The 0.9.6 tag was created when the corresponding release candidate was created. 
That release candidate is now being voted upon on the dev@ mailing list and can 
be released once the requisite 3 +1 votes from committers have been cast. Once 
that happens, the artifacts will be released from staging.

Github creates a “release” anytime a tag is created. But official Apache Storm 
releases require a great deal more scrutiny and process than simply creating a 
tag.

-Taylor

> On Nov 2, 2015, at 11:44 AM, Prajwal Tuladhar  wrote:
> 
> Hi,
> 
> It seems like the v0.9.6 [1] tag was created a few days ago but I can't find that 
> release in Maven Central. Is there an ETA for when it will be published to Maven 
> Central?
> 
> Thanks.
> 
> [1] https://github.com/apache/storm/tree/v0.9.6 
> 
> 
> --
> --
> Cheers,
> Praj





Re: Storm vs Spark Streaming Tech Evaluation

2015-10-20 Thread P. Taylor Goetz
Hi Satish,

Great series of blog posts. Thanks for sharing!

-Taylor

> On Oct 19, 2015, at 4:24 AM, Satish Mittal  wrote:
> 
> Hi All,
> 
> The data platform team at Inmobi recently performed an extensive evaluation 
> exercise in the process of finalizing the real-time Stream processing stack 
> as the choice of our platform.
> 
> We have captured all the details of our evaluation as the following series of 
> 4 blogs which have been published at Inmobi technology site:
> 
> 1) Introduction; Identifying stream processing use-cases at Inmobi; 
> Identifying potential Technology Candidates. In the interest of time, we 
> limited the
> http://technology.inmobi.com/blog/real-time-stream-processing-at-inmobi-part-1
>  
> 
> 
> 2) Detailed overview of Storm and Spark Streaming platforms
> http://technology.inmobi.com/blog/real-time-stream-processing-at-inmobi-part-2
>  
> 
> 
> 3) Identify and define various important evaluation criteria
> http://technology.inmobi.com/blog/real-time-stream-processing-at-inmobi-part-3
>  
> 
> 
> 4) Detailed findings on various evaluation criteria, evaluation summary along 
> with the final recommendation.
> http://technology.inmobi.com/blog/real-time-stream-processing-at-inmobi-part-4
>  
> 
> 
> We hope that this analysis would be useful in general to anyone who is 
> starting to explore the world of real-time stream processing and decide upon 
> a particular tech stack.
> 
> Please go through the blogs and let us know your thoughts!
> 
> Regards,
> Satish
> 
> 
> 
> 
> 





Re: Trident: fork DAG without partitioning?

2015-09-24 Thread P. Taylor Goetz
Hi Grant,

I think the fluent API is tripping you up a bit, but what you are trying to do 
is possible.

Stream rootStream = …;

Stream streamA = rootStream.each(new Fields(…), filterA);
streamA.localOrShuffle()
 .each(new Fields(…), eachA, new Fields(…));

Stream streamB = rootStream.each(new Fields(…), new Negate(filterA));
streamB.localOrShuffle()
 .each(new Fields(…), eachB, new Fields(…));


In streamB I used the Negate filter — all that does is invert an existing 
filter. That way you don’t have to write two filters that are just the opposite 
of one another.
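
For anyone curious how such an inverting filter behaves, here is a stand-alone sketch. `KeepFilter`, `NegateFilter` and `ForkDemo` are hypothetical names written for this example, not Trident classes; the point is only that a filter and its negation split a stream into two disjoint halves.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a Trident-style filter.
interface KeepFilter<T> {
    boolean isKeep(T tuple);
}

// Inverts the decision of the wrapped filter.
class NegateFilter<T> implements KeepFilter<T> {
    private final KeepFilter<T> delegate;

    NegateFilter(KeepFilter<T> delegate) {
        this.delegate = delegate;
    }

    @Override
    public boolean isKeep(T tuple) {
        return !delegate.isKeep(tuple);
    }
}

class ForkDemo {
    // Returns the tuples the filter keeps, in their original order.
    static <T> List<T> apply(List<T> input, KeepFilter<T> filter) {
        List<T> kept = new ArrayList<>();
        for (T tuple : input) {
            if (filter.isKeep(tuple)) kept.add(tuple);
        }
        return kept;
    }

    public static void main(String[] args) {
        KeepFilter<Integer> filterA = n -> n % 2 == 0;
        List<Integer> root = Arrays.asList(1, 2, 3, 4, 5);
        System.out.println(apply(root, filterA));                     // [2, 4]
        System.out.println(apply(root, new NegateFilter<>(filterA))); // [1, 3, 5]
    }
}
```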

-Taylor


> On Sep 24, 2015, at 12:35 PM, Grant Overby (groverby)  
> wrote:
> 
> I have a trident topology where a portion of the DAG looks like this:
> 
> 
>   partition - - - filterA - - - eachA
> /
> stream - - - -
> \
>   partition - - - filterB - - - eachB
> 
> 
> 
> I believe with the above DAG, each tuple will be sent down both sides of the 
> fork. Approximately half will be filtered out by each filter, thus forcing 
> half my tuples to cross the partition only to get dropped afterwards.
> 
> Can the DAG be constructed like the following? If so, how do I define the 
> topology?
> 
>   filterA - - - partition - - - eachA
> /
> stream - - - -
> \
>   filterB - - - partition - - - eachB
> 
> 
> 
> 
> The following doesn’t appear to work. filterA and filterB don’t receive 
> tuples.
> 
> Stream stream = …;
> 
> stream.parallelismHint(N);
> 
> stream
> .each(new Fields(…), filterA)
> .localOrShuffle()
> .each(new Fields(…), eachA, new Fields(…))
> ;
> 
> stream
> .each(new Fields(…), filterB)
> .localOrShuffle()
> .each(new Fields(…), eachB, new Fields(…))
> ;
> 
> 
> 
> 
> 
> 
> 
> Grant Overby
> Software Engineer
> Cisco.com 
> grove...@cisco.com 
> Mobile: 865 724 4910
> 
> 
>  Think before you print.
> 
> 
> 
> 





Re: Storm 0.10.0 release

2015-09-24 Thread P. Taylor Goetz
That release was cancelled because it did not include some important bug fixes, 
so it has not been officially released.

I will delete the tags so it does not appear to have been released.

-Taylor

> On Sep 23, 2015, at 6:17 PM, Prajwal Tuladhar  wrote:
> 
> And weirdly, that version is not available in maven central 
> https://search.maven.org/#search|gav|1|g%3A%22org.apache.storm%22%20AND%20a%3A%22storm-core%22
>  
> 
> 
> On Wed, Sep 23, 2015 at 9:04 AM, Ziemer, Tom  > wrote:
> Hi,
> 
> 
> 
> I just saw that storm 0.10.0 was released on github on Sep. 11.
> 
> https://github.com/apache/storm/releases 
> 
> 
> 
> Does anybody know when we can expect the binaries to be available?
> 
> 
> 
> Thanks,
> 
> Tom
> 
> 
> 
> 
> --
> --
> Cheers,
> Praj





Re: Clarification regarding packaging of storm-core jar

2015-07-29 Thread P. Taylor Goetz
Hi Richards,

The transitive dependencies that are now included in the storm-core jar file 
have their package names “relocated” so users can include their own versions of 
those dependencies without encountering dependency conflicts.

The best practice is to include your own version of those dependencies in your 
topology jar. Then if you need to upgrade, you simply repackage your topology 
jar.

An alternate option is to use the storm-bundled version of the dependencies. To 
do that you simply modify the import statements in your code to reference the 
relocated package (they will be prefixed with “org.apache.storm”). With this 
approach, your topology jar will be smaller, but you will be dependent on the 
specific versions bundled with storm, so if you upgrade storm later, you may 
need to recompile/repackage your topology code.
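
When both a bundled (relocated) copy and your own copy of a library can be on the classpath, it may help to check which jar a given class is actually loaded from. The following is only a sketch using plain JDK APIs; the class name `WhereFromSketch` and its method are hypothetical and not part of Storm. Pass it a fully qualified class name, for example one of your own dependency classes or its relocated counterpart.

```java
import java.security.CodeSource;

final class WhereFromSketch {

    /**
     * Returns the jar or directory a class was loaded from, or a note
     * when the JVM does not expose a location (e.g. JDK core classes).
     */
    static String locationOf(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        if (source == null || source.getLocation() == null) {
            return "(bootstrap or unknown)";
        }
        return source.getLocation().toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Defaults to a JDK class so the sketch runs without extra jars.
        String className = args.length > 0 ? args[0] : "java.lang.String";
        System.out.println(className + " <- " + locationOf(Class.forName(className)));
    }
}
```

Running this inside the same classpath as the topology (for instance from a small test) shows whether a class resolves to your topology jar or to the storm-core jar.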

-Taylor

> On Jul 29, 2015, at 8:21 AM, Richards Peter  wrote:
> 
> Hi,
> 
> I was using storm-0.8.2 so far in my project. I am starting to evaluate 
> storm-0.9.*. I found a difference in the way storm-core(0.9.4) /storm(0.8.2) 
> jar has been packaged. I am not an expert in this packaging concept. So I 
> thought of taking some advice from the experts.
> 
> In storm 0.8.* jar, I find the transitive dependencies maintained separately 
> in $STORM_HOME/lib directory, i.e. the dependencies such as http-core, 
> commons-io, etc. are maintained separately in the lib directory. Storm jar 
> had only the classes related to storm. With such a design we were able to 
> upgrade these dependencies when we had to integrate another third party 
> component in our topology (by replacing older versions of jar in storm lib 
> with newer versions used by the third party component).
> 
> In storm 0.9.4, I find that some of the transitive dependencies of storm are 
> packaged within storm-core jar file. If we encounter a situation when we have 
> to upgrade an individual transitive dependency how should I go forward?
> 
> Could you please share your thoughts about the same?
> 
> Thanks,
> Richards Peter.
> 





Re: Disable_replay_timeouts

2015-07-27 Thread P. Taylor Goetz
Yes, you can use the “topology.enable.message.timeouts” config property:

topology.enable.message.timeouts: false
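
Since the question was about doing this from a Java app: the topology configuration is a map of string keys to values, so the same key can be set in code before the topology is submitted. A minimal sketch, assuming a plain `HashMap` as a stand-in for Storm's config object so that it runs without Storm on the classpath; the class and method names here are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

final class DisableTimeoutsSketch {

    // Key copied from the reply above.
    static final String ENABLE_MESSAGE_TIMEOUTS = "topology.enable.message.timeouts";

    /** Builds a topology config map with message timeouts switched off. */
    static Map<String, Object> buildConf() {
        // Assumption: in a real app this would be Storm's own Config object,
        // which is used like a Map and passed along at topology submission.
        Map<String, Object> conf = new HashMap<>();
        conf.put(ENABLE_MESSAGE_TIMEOUTS, false);
        return conf;
    }

    public static void main(String[] args) {
        System.out.println(buildConf());
    }
}
```

The setting applies per topology, so other topologies on the same cluster keep their own timeout behavior.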

-Taylor

> On Jul 27, 2015, at 9:05 AM, Ajay Chander  wrote:
> 
> Hi Everyone,
> 
> Is there any way to explicitly disable replay timeouts for a specific 
> topology from my Java app?
> 
> Thank you,
> Ajay
> 





Re: Does Storm Rebalance during scale down?

2015-07-09 Thread P. Taylor Goetz
Yes, if you remove a supervisor node Nimbus will detect it and reassign the 
work to other supervisor nodes in the cluster.

-Taylor

> On Jul 9, 2015, at 12:18 PM, Dillian Murphey  wrote:
> 
> If a node is removed, will Storm rebalance among the remaining nodes? I know 
> in case of scaling up we need to call the rebalance.
> 
> 
> thanks!
> 





Re: State in Storm

2015-07-07 Thread P. Taylor Goetz
Hi Thilina,

1. With Storm’s Core API (spouts/bolts), if you are storing state in an 
in-memory structure such as a HashMap, that state will be lost in the event of 
a worker crash. It is up to you to provide the logic for persisting state and 
restoring it in the event of a failure; Storm will not do it for you. Storm’s 
Trident API, on the other hand, has an abstraction for persistent state [1], and 
there are implementations for various backing stores like HBase, Redis, and 
others. In the storm-hbase module there is an example of a persistent word 
count.

2. Storm uses ZooKeeper for things like Nimbus assignments and 
worker/supervisor heartbeats. The Trident API also uses ZooKeeper for tracking 
transactions. Other components may use ZooKeeper as well. The Kafka spout, for 
example, uses ZooKeeper for tracking offsets.
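
To illustrate point 1, here is a rough sketch of what "providing the logic yourself" could look like for a word count: checkpoint the in-memory map to an external store and reload it when the component starts again. Everything here is hypothetical (the `StateStore` interface, the in-memory store that stands in for HBase/Redis, and the class names); it is not Storm or Trident API, just the shape of the idea.

```java
import java.util.HashMap;
import java.util.Map;

final class WordCountStateSketch {

    /** Hypothetical stand-in for an external store (HBase, Redis, ...). */
    interface StateStore {
        void save(Map<String, Long> snapshot);
        Map<String, Long> load();
    }

    /** In-memory store, used only so the sketch runs on its own. */
    static final class InMemoryStore implements StateStore {
        private Map<String, Long> saved = new HashMap<>();
        public void save(Map<String, Long> snapshot) { saved = new HashMap<>(snapshot); }
        public Map<String, Long> load() { return new HashMap<>(saved); }
    }

    private final StateStore store;
    private final Map<String, Long> counts;

    WordCountStateSketch(StateStore store) {
        this.store = store;
        this.counts = store.load(); // restore whatever was last checkpointed
    }

    void count(String word) { counts.merge(word, 1L, Long::sum); }

    void checkpoint() { store.save(counts); }

    long get(String word) { return counts.getOrDefault(word, 0L); }

    public static void main(String[] args) {
        StateStore store = new InMemoryStore();
        WordCountStateSketch first = new WordCountStateSketch(store);
        first.count("storm");
        first.count("storm");
        first.checkpoint();
        first.count("storm"); // never checkpointed, so lost if the worker dies here

        // Simulate a restart: a new instance only sees the checkpointed counts.
        WordCountStateSketch restarted = new WordCountStateSketch(store);
        System.out.println(restarted.get("storm"));
    }
}
```

Note that this sketch does nothing about replayed tuples being counted twice; handling that is part of what the Trident state abstraction mentioned above is for.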

-Taylor


[1] https://storm.apache.org/documentation/Trident-state.html


> On Jul 7, 2015, at 2:16 AM, Thilina Rathnayake  wrote:
> 
> Hi All,
> 
> I am a newbie to Storm and have a few questions regarding state in Storm. I
> would be really grateful if someone could help me out with the following.
> 
> * Consider the word count topology in `storm-starter`. We have a map called
>   `counts` in it. Does Storm store `counts` in ZooKeeper or anywhere else so
>   that if the worker dies we are able to recover the data inside `counts`?
> 
> * What state information of supervisors/workers is stored in ZooKeeper?
> 
> Thanks in advance !
> 
> Regards,
> Thilina





Re: HdfsBolt

2015-06-22 Thread P. Taylor Goetz
There might be others here that are using Storm with CDH, but since Cloudera 
Manager is closed source/proprietary, you may well be better off asking the 
question on a Cloudera forum for specific details.

That being said, the community here is likely willing to help, but you’ll need 
to get past the Cloudera-specific pieces and down to the open source parts. If 
you know the HDFS URL for the cluster, you should be able to connect.

What version of Storm are you using? When you say “trying to write to hadoop” 
do you mean HDFS, or something else? Are you using the storm-hdfs component 
that ships with Apache Storm?

The more details you can provide the better the community will be able to help 
you.

-Taylor

On Jun 22, 2015, at 8:06 PM, Ajay Chander  wrote:

> Hi Everyone,
> 
> I am trying to write data into Hadoop from my Storm topology. For this 
> communication to happen through hostnames, I have enabled a couple of 
> properties in Cloudera Manager, namely "dfs.client.use.datanode.hostname" = true 
> and "dfs.datanode.use.datanode.hostname" = true. How do I make Storm aware of 
> those two properties? When my topology is running, it takes those properties 
> as false by default. How do I override them in my HdfsBolt?
> 
> Any help is highly appreciated.
> 
> Thank you,
> Ajay





[CVE-2015-3188] Apache Storm remote code execution vulnerability

2015-06-19 Thread P. Taylor Goetz
CVE-2015-3188: Apache Storm remote code execution vulnerability

Severity: Important

Vendor:
The Apache Software Foundation

Versions Affected:
Apache Storm 0.10.0-beta

Description:
The UI daemon in Apache Storm 0.10.0-beta allows remote users to run 
arbitrary code as the user running the web server. With kerberos 
authentication this could allow impersonation of arbitrary users on other 
systems, including HDFS and HBase.

Mitigation:
0.10.0-beta users should upgrade to 0.10.0-beta1 or disable the Storm UI
daemon.

Apache Storm 0.10.0-beta1 artifacts are available for immediate download here:

http://www.us.apache.org/dist/storm/apache-storm-0.10.0-beta1/

Credit:
This issue was discovered by Bobby Evans of the Apache Storm PMC

References:
https://github.com/apache/storm/blob/v0.10.0-beta1/SECURITY.md
https://github.com/apache/storm/blob/v0.10.0-beta1/STORM-UI-REST-API.md

P. Taylor Goetz




Re: Using FLUX and multiple streams to the same bolt

2015-06-17 Thread P. Taylor Goetz
JIRA:

https://issues.apache.org/jira/browse/STORM-873

On Jun 17, 2015, at 11:38 PM, P. Taylor Goetz  wrote:

> Romeo,
> 
> I have a fix (see below). Should be included in the next release (beta or 
> final). I will follow up with a JIRA ID for tracking.
> 
> -- TOPOLOGY DETAILS --
> Topology Name: diamond-topology
> --- SPOUTS ---
> spout-1 [1] (backtype.storm.testing.TestWordSpout)
> --- BOLTS ---
> A [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
> B [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
> C [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
> D [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
> --- STREAMS ---
> spout-1 --FIELDS--> A
> A --SHUFFLE--> B
> A --SHUFFLE--> C
> C --SHUFFLE--> D
> B --SHUFFLE--> D
> --
> 
> Thanks again for reporting this, and helping out with beta testing.
> 
> - Taylor
> 
> 
> On Jun 17, 2015, at 4:54 PM, P. Taylor Goetz  wrote:
> 
>> Hi Romeo,
>> 
>> Thanks for reporting that. It’s a bug, and your approach for a fix is 
>> correct.
>> 
>> If you’d like, feel free to open a JIRA and optionally a pull request for a 
>> fix. Otherwise, I can take care of it.
>> 
>> -Taylor
>> 
>> On Jun 17, 2015, at 4:07 PM, Romeo Nocon  wrote:
>> 
>>> Hi,
>>> 
>>> I'm testing migrating over a topology I have to flux.  The
>>> 
>>> spout:
>>>   - id: "spout"
>>> 
>>> bolts:
>>>   - id: "bolt_A"
>>>     className: "com.blah.boltA"
>>>     parallelism: 1
>>>   - id: "bolt_B"
>>>     className: "com.blah.boltB"
>>>     parallelism: 1
>>>   - id: "bolt_C"
>>>     className: "com.blah.boltC"
>>>     parallelism: 1
>>>   - id: "bolt_D"
>>>     className: "com.blah.boltD"
>>>     parallelism: 1
>>> 
>>> streams:
>>>   - name: ""
>>>     from: "spout"
>>>     to: "bolt_A"
>>>     grouping:
>>>       type: SHUFFLE
>>>   - name: "A-->B"
>>>     from: "bolt_A"
>>>     to: "bolt_B"
>>>     grouping:
>>>       streamId: "forB"
>>>   - name: "A-->C"
>>>     from: "bolt_A"
>>>     to: "bolt_C"
>>>     grouping:
>>>       streamId: "forC"
>>>   - name: "B-->D"
>>>     from: "bolt_B"
>>>     to: "bolt_D"
>>>   - name: "C-->D"
>>>     from: "bolt_C"
>>>     to: "bolt_D"
>>> 
>>> It builds something like below (imagine the arrow from A-> B, A-> C,
>>> B->D, and C->D)
>>> -
>>>                   Bolt_B
>>> Spout -> Bolt_A            -> Bolt_D
>>>                   Bolt_C
>>> -
>>> 
>>> I get an error below in FLUX.
>>> 
>>> Exception in thread "main" java.lang.IllegalArgumentException: Bolt has already been declared for id bolt_D
>>>     at backtype.storm.topology.TopologyBuilder.validateUnusedId(TopologyBuilder.java:212)
>>>     at backtype.storm.topology.TopologyBuilder.setBolt(TopologyBuilder.java:139)
>>>     at org.apache.storm.flux.FluxBuilder.buildStreamDefinitions(FluxBuilder.java:158)
>>>     at org.apache.storm.flux.FluxBuilder.buildTopology(FluxBuilder.java:94)
>>>     at org.apache.storm.flux.Flux.runCli(Flux.java:153)
>>>     at org.apache.storm.flux.Flux.main(Flux.java:98)
>>> 
>>> Looking at the buildStreamDefinitions code in the FluxBuilder it
>>> iterates through each of the defined streams then calls the
>>> appropriate
>>> 
>>>   builder.setBolt(stream.getTo()...).
>>> 
>>> Since I have two streams going to Bolt_D it ends up getting the error
>>> above.  Does someone have a patch or fix out there already?
>>> 
>>> A possible fix is to cache the BoltDeclarer by getTo() id then skip
>>> the builder.setBolt method so the code can continue setting the
>>> different types of groupings on the rest of streams.  Just a thought.
>>> 
>>> Thanks,
>>> Romeo
>> 
> 





Re: Using FLUX and multiple streams to the same bolt

2015-06-17 Thread P. Taylor Goetz
Romeo,

I have a fix (see below). Should be included in the next release (beta or 
final). I will follow up with a JIRA ID for tracking.

-- TOPOLOGY DETAILS --
Topology Name: diamond-topology
--- SPOUTS ---
spout-1 [1] (backtype.storm.testing.TestWordSpout)
--- BOLTS ---
A [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
B [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
C [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
D [1] (org.apache.storm.flux.wrappers.bolts.LogInfoBolt)
--- STREAMS ---
spout-1 --FIELDS--> A
A --SHUFFLE--> B
A --SHUFFLE--> C
C --SHUFFLE--> D
B --SHUFFLE--> D
--

Thanks again for reporting this, and helping out with beta testing.

- Taylor


On Jun 17, 2015, at 4:54 PM, P. Taylor Goetz  wrote:

> Hi Romeo,
> 
> Thanks for reporting that. It’s a bug, and your approach for a fix is correct.
> 
> If you’d like, feel free to open a JIRA and optionally a pull request for a 
> fix. Otherwise, I can take care of it.
> 
> -Taylor
> 
> On Jun 17, 2015, at 4:07 PM, Romeo Nocon  wrote:
> 
>> Hi,
>> 
>> I'm testing migrating over a topology I have to flux.  The
>> 
>> spout:
>>   - id: "spout"
>> 
>> bolts:
>>   - id: "bolt_A"
>>     className: "com.blah.boltA"
>>     parallelism: 1
>>   - id: "bolt_B"
>>     className: "com.blah.boltB"
>>     parallelism: 1
>>   - id: "bolt_C"
>>     className: "com.blah.boltC"
>>     parallelism: 1
>>   - id: "bolt_D"
>>     className: "com.blah.boltD"
>>     parallelism: 1
>> 
>> streams:
>>   - name: ""
>>     from: "spout"
>>     to: "bolt_A"
>>     grouping:
>>       type: SHUFFLE
>>   - name: "A-->B"
>>     from: "bolt_A"
>>     to: "bolt_B"
>>     grouping:
>>       streamId: "forB"
>>   - name: "A-->C"
>>     from: "bolt_A"
>>     to: "bolt_C"
>>     grouping:
>>       streamId: "forC"
>>   - name: "B-->D"
>>     from: "bolt_B"
>>     to: "bolt_D"
>>   - name: "C-->D"
>>     from: "bolt_C"
>>     to: "bolt_D"
>> 
>> It builds something like below (imagine the arrow from A-> B, A-> C,
>> B->D, and C->D)
>> -
>>                   Bolt_B
>> Spout -> Bolt_A            -> Bolt_D
>>                   Bolt_C
>> -
>> 
>> I get an error below in FLUX.
>> 
>> Exception in thread "main" java.lang.IllegalArgumentException: Bolt has already been declared for id bolt_D
>>     at backtype.storm.topology.TopologyBuilder.validateUnusedId(TopologyBuilder.java:212)
>>     at backtype.storm.topology.TopologyBuilder.setBolt(TopologyBuilder.java:139)
>>     at org.apache.storm.flux.FluxBuilder.buildStreamDefinitions(FluxBuilder.java:158)
>>     at org.apache.storm.flux.FluxBuilder.buildTopology(FluxBuilder.java:94)
>>     at org.apache.storm.flux.Flux.runCli(Flux.java:153)
>>     at org.apache.storm.flux.Flux.main(Flux.java:98)
>> 
>> Looking at the buildStreamDefinitions code in the FluxBuilder it
>> iterates through each of the defined streams then calls the
>> appropriate
>> 
>>builder.setBolt(stream.getTo()...).
>> 
>> Since I have two streams going to Bolt_D it ends up getting the error
>> above.  Does someone have a patch or fix out there already?
>> 
>> A possible fix is to cache the BoltDeclarer by getTo() id then skip
>> the builder.setBolt method so the code can continue setting the
>> different types of groupings on the rest of streams.  Just a thought.
>> 
>> Thanks,
>> Romeo
> 





Re: stumped with local-mode stack overflow error

2015-06-17 Thread P. Taylor Goetz
Strange...

What version of storm?

It looks like you are on OS X with Java 1.8. Have you tried forcing Java 1.7?

How are you running in local mode?

The more details you can give, the better. Code to reproduce the issue would be 
ideal.

-Taylor


> On Jun 17, 2015, at 3:08 PM, David Maldonado  wrote:
> 
> Hello all, 
> 
> I’m having a nasty issue in local-mode only where I’m getting a stack 
> overflow error when jdk libraries are getting copied over to the supervisor, 
> any help would be greatly appreciated!
> 
> David Maldonado
> Evident.io
> 
> 4744 [main] INFO  backtype.storm.daemon.supervisor - Starting supervisor with 
> id a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16 at host 
> ip-192-168-1-107.us-west-2.compute.internal
> 4916 [main] INFO  backtype.storm.daemon.nimbus - Received topology submission 
> for test with conf {"topology.max.task.parallelism" nil, 
> "topology.acker.executors" nil, "topology.kryo.register" nil, 
> "topology.kryo.decorators" (), "topology.name" "test", "storm.id" 
> "test-1-1434562496", "topology.debug" true}
> 4960 [main] INFO  backtype.storm.daemon.nimbus - Activating test: 
> test-1-1434562496
> 5295 [main] INFO  backtype.storm.scheduler.EvenScheduler - Available slots: 
> (["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1028] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1029] 
> ["237d2e07-f432-4438-859f-93afa8b8bd6a" 1024] 
> ["237d2e07-f432-4438-859f-93afa8b8bd6a" 1025] 
> ["237d2e07-f432-4438-859f-93afa8b8bd6a" 1026])
> 5373 [main] INFO  backtype.storm.daemon.nimbus - Setting new assignment for 
> topology id test-1-1434562496: 
> #backtype.storm.daemon.common.Assignment{:master-code-dir 
> "/var/folders/wz/92ljhht90f35lylfk36k47mrgn/T//b8725cee-b13f-4148-9f2b-a14c4e6a2c37/nimbus/stormdist/test-1-1434562496",
>  :node->host {"a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 
> "ip-192-168-1-107.us-west-2.compute.internal"}, :executor->node+port {[2 2] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [579 579] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [580 580] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1157 1157] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1 1] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [995 996] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [3 4] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1027 1028] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [35 36] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1059 1060] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [67 68] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1091 1092] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [99 100] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1123 1124] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [131 132] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1155 1156] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [163 164] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [195 196] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [227 228] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [259 260] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [291 292] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [323 324] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [355 356] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [387 388] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [419 420] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [451 452] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [483 484] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [515 516] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [547 548] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [611 612] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [643 644] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [675 676] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [707 708] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [739 740] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [771 772] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [803 804] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [835 836] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [867 868] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [899 900] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [931 932] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [963 964] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [997 998] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [5 6] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1029 1030] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [37 38] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1061 1062] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [69 70] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1093 1094] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [101 102] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [1125 1126] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027], [133 134] 
> ["a0fa6bb2-cfa8-4c09-9b83-7d176c18bb16" 1027

Re: Using FLUX and multiple streams to the same bolt

2015-06-17 Thread P. Taylor Goetz
Hi Romeo,

Thanks for reporting that. It’s a bug, and your approach for a fix is correct.

If you’d like, feel free to open a JIRA and optionally a pull request for a 
fix. Otherwise, I can take care of it.
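
For anyone following along, the approach Romeo describes (cache the declarer per target bolt id, and only call setBolt the first time that id is seen) can be sketched roughly as below. This is not the actual Flux patch: `Builder` and `Declarer` are hypothetical stand-ins for TopologyBuilder and BoltDeclarer, written so the example runs on its own, and every stream is treated as a shuffle grouping for simplicity.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class DeclarerCacheSketch {

    /** Hypothetical stand-in for a bolt declarer; it only records its sources. */
    static final class Declarer {
        final List<String> sources = new ArrayList<>();
        void shuffleGrouping(String fromId) { sources.add(fromId); }
    }

    /** Hypothetical stand-in for the builder: a second setBolt for one id fails. */
    static final class Builder {
        private final Set<String> declared = new HashSet<>();
        Declarer setBolt(String id) {
            if (!declared.add(id)) {
                throw new IllegalArgumentException("Bolt has already been declared for id " + id);
            }
            return new Declarer();
        }
    }

    /** Each stream is {from, to}. Declares every target once, then adds one grouping per stream. */
    static Map<String, Declarer> wire(Builder builder, String[][] streams) {
        Map<String, Declarer> declarers = new LinkedHashMap<>();
        for (String[] stream : streams) {
            String from = stream[0];
            String to = stream[1];
            // setBolt is only invoked when no declarer is cached for this target id.
            Declarer declarer = declarers.computeIfAbsent(to, builder::setBolt);
            declarer.shuffleGrouping(from);
        }
        return declarers;
    }

    public static void main(String[] args) {
        String[][] diamond = {
            {"spout", "bolt_A"}, {"bolt_A", "bolt_B"}, {"bolt_A", "bolt_C"},
            {"bolt_B", "bolt_D"}, {"bolt_C", "bolt_D"}
        };
        Map<String, Declarer> wired = wire(new Builder(), diamond);
        System.out.println("bolt_D receives from " + wired.get("bolt_D").sources);
    }
}
```

Without the cache, the second stream into bolt_D would hit the same "already been declared" exception shown in the stack trace.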

-Taylor

On Jun 17, 2015, at 4:07 PM, Romeo Nocon  wrote:

> Hi,
> 
> I'm testing migrating over a topology I have to flux.  The
> 
> spout:
>   - id: "spout"
> 
> bolts:
>   - id: "bolt_A"
>     className: "com.blah.boltA"
>     parallelism: 1
>   - id: "bolt_B"
>     className: "com.blah.boltB"
>     parallelism: 1
>   - id: "bolt_C"
>     className: "com.blah.boltC"
>     parallelism: 1
>   - id: "bolt_D"
>     className: "com.blah.boltD"
>     parallelism: 1
> 
> streams:
>   - name: ""
>     from: "spout"
>     to: "bolt_A"
>     grouping:
>       type: SHUFFLE
>   - name: "A-->B"
>     from: "bolt_A"
>     to: "bolt_B"
>     grouping:
>       streamId: "forB"
>   - name: "A-->C"
>     from: "bolt_A"
>     to: "bolt_C"
>     grouping:
>       streamId: "forC"
>   - name: "B-->D"
>     from: "bolt_B"
>     to: "bolt_D"
>   - name: "C-->D"
>     from: "bolt_C"
>     to: "bolt_D"
> 
> It builds something like below (imagine the arrow from A-> B, A-> C,
> B->D, and C->D)
> -
>                   Bolt_B
> Spout -> Bolt_A            -> Bolt_D
>                   Bolt_C
> -
> 
> I get an error below in FLUX.
> 
> Exception in thread "main" java.lang.IllegalArgumentException: Bolt has already been declared for id bolt_D
>     at backtype.storm.topology.TopologyBuilder.validateUnusedId(TopologyBuilder.java:212)
>     at backtype.storm.topology.TopologyBuilder.setBolt(TopologyBuilder.java:139)
>     at org.apache.storm.flux.FluxBuilder.buildStreamDefinitions(FluxBuilder.java:158)
>     at org.apache.storm.flux.FluxBuilder.buildTopology(FluxBuilder.java:94)
>     at org.apache.storm.flux.Flux.runCli(Flux.java:153)
>     at org.apache.storm.flux.Flux.main(Flux.java:98)
> 
> Looking at the buildStreamDefinitions code in the FluxBuilder it
> iterates through each of the defined streams then calls the
> appropriate
> 
> builder.setBolt(stream.getTo()...).
> 
> Since I have two streams going to Bolt_D it ends up getting the error
> above.  Does someone have a patch or fix out there already?
> 
> A possible fix is to cache the BoltDeclarer by getTo() id then skip
> the builder.setBolt method so the code can continue setting the
> different types of groupings on the rest of streams.  Just a thought.
> 
> Thanks,
> Romeo





Re: [ANNOUNCE] Apache Storm 0.10.0-beta Released

2015-06-16 Thread P. Taylor Goetz
It is in the 0.10.0-beta release. It was originally committed to the 0.10.x 
branch, then applied to the 0.9.x branch.

-Taylor

On Jun 16, 2015, at 2:29 PM, Binh Nguyen Van  wrote:

> Hi,
> 
> This ticket is also scheduled for 0.10.0, but I do not see it in the change log 
> of this beta. Do we still plan to have it in 0.10?
> 
> Regards
> -Binh
> 
> On Tue, Jun 16, 2015 at 5:13 AM, Richards Peter  wrote:
> Hi,
> 
> Thank you for making this release available.
> 
> Could you please confirm whether the item STORM-130 in Full Change Log 
> (Supervisor getting killed due to java.io.FileNotFoundException: File 
> '../stormconf.ser' does not exist) is a code merge from storm 0.9.5 or a fix 
> after storm 0.9.5?
> 
> Thanks,
> Richards Peter.
> 




