-1

I think we ought to include STORM-3059, since it's a regression compared to
1.2.1, and it prevents the spout from working at all in certain
configurations.

2018-05-06 15:54 GMT+02:00 Alexandre Vermeerbergen <[email protected]
>:

> Hello Stig,
>
> Yes you are right: "mvn clean install -DskipTests" solved the build issue.
>
> Yes another excellent news : I have rebuilt my topologies with the
> modified storm-kafka-client.jar from your pull request (which shows
> 1.2.3-SNAPSHOP as its version), the topologie which had its spout
> crashing will the NullPointerException is repaired !
>
> I will gladly vote a [+1] (still non binding) if STORM-3059 could be
> part of Storm 1.2.2 !
>
> Best regards,
> Alexandre Vermeerbergen
>
> 2018-05-06 13:35 GMT+02:00 Stig Rohde Døssing <[email protected]>:
> > Alexandre,
> >
> > Could you try running just
> >
> > mvn clean install -DskipTests
> >
> > I think for some reason one of the dependencies isn't being built as it
> > should by Maven.
> >
> > 2018-05-06 12:59 GMT+02:00 Alexandre Vermeerbergen <
> [email protected]
> >>:
> >
> >> Hello Stig,
> >>
> >> Thanks, I followed your instructions, but got a build failure:
> >>
> >> [INFO] --- maven-dependency-plugin:2.8:unpack (unpack) @ storm-core ---
> >> [INFO] Configured Artifact: org.apache.storm:multilang-
> >> ruby:1.2.3-SNAPSHOT:jar
> >> Downloading: http://repository.apache.org/snapshots/org/apache/storm/
> >> multilang-ruby/1.2.3-SNAPSHOT/maven-metadata.xml
> >> Downloading: https://clojars.org/repo/org/apache/storm/multilang-ruby/1
> .
> >> 2.3-SNAPSHOT/maven-metadata.xml
> >> Downloading: https://clojars.org/repo/org/apache/storm/multilang-ruby/1
> .
> >> 2.3-SNAPSHOT/multilang-ruby-1.2.3-SNAPSHOT.jar
> >> Downloading: http://repository.apache.org/snapshots/org/apache/storm/
> >> multilang-ruby/1.2.3-SNAPSHOT/multilang-ruby-1.2.3-SNAPSHOT.jar
> >> [INFO] ------------------------------------------------------------
> >> ------------
> >> [INFO] Reactor Summary:
> >> [INFO]
> >> [INFO] Storm .............................................. SUCCESS [
> >> 4.279 s]
> >> [INFO] maven-shade-clojure-transformer .................... SUCCESS [
> >> 4.568 s]
> >> [INFO] storm-maven-plugins ................................ SUCCESS [
> >> 3.992 s]
> >> [INFO] Storm Core ......................................... FAILURE
> >> [03:12 min]
> >> [INFO] storm-kafka-client ................................. SKIPPED
> >> [INFO] ------------------------------------------------------------
> >> ------------
> >> [INFO] BUILD FAILURE
> >> [INFO] ------------------------------------------------------------
> >> ------------
> >> [INFO] Total time: 03:28 min
> >> [INFO] Finished at: 2018-05-06T12:27:58+02:00
> >> [INFO] Final Memory: 86M/1460M
> >> [INFO] ------------------------------------------------------------
> >> ------------
> >> [ERROR] Failed to execute goal
> >> org.apache.maven.plugins:maven-dependency-plugin:2.8:unpack (unpack)
> >> on project storm-core: Unable to find artifact. Could not find
> >> artifact org.apache.storm:multilang-ruby:jar:1.2.3-SNAPSHOT in clojars
> >> (https://clojars.org/repo/)
> >> [ERROR]
> >> [ERROR] Try downloading the file manually from the project website.
> >> [ERROR]
> >> [ERROR] Then, install it using the command:
> >> [ERROR] mvn install:install-file -DgroupId=org.apache.storm
> >> -DartifactId=multilang-ruby -Dversion=1.2.3-SNAPSHOT -Dpackaging=jar
> >> -Dfile=/path/to/file
> >> [ERROR]
> >> [ERROR] Alternatively, if you host your own repository you can deploy
> >> the file there:
> >> [ERROR] mvn deploy:deploy-file -DgroupId=org.apache.storm
> >> -DartifactId=multilang-ruby -Dversion=1.2.3-SNAPSHOT -Dpackaging=jar
> >> -Dfile=/path/to/file -Durl=[url] -DrepositoryId=[id]
> >> [ERROR]
> >> [ERROR]
> >> [ERROR] org.apache.storm:multilang-ruby:jar:1.2.3-SNAPSHOT
> >> [ERROR]
> >> [ERROR] from the specified remote repositories:
> >> [ERROR] central (http://repo1.maven.org/maven2/, releases=true,
> >> snapshots=false),
> >> [ERROR] clojars (https://clojars.org/repo/, releases=true,
> >> snapshots=true),
> >> [ERROR] apache.snapshots (http://repository.apache.org/snapshots,
> >> releases=false, snapshots=true)
> >> [ERROR] -> [Help 1]
> >>
> >> I guess something's missing in the instruction or in the dependency
> >> declaration file ?
> >>
> >> Best regards,
> >> Alexandre
> >>
> >>
> >> 2018-05-06 12:15 GMT+02:00 Stig Rohde Døssing <[email protected]>:
> >> > Start by cloning the Storm repository:
> >> >
> >> > git clone https://github.com/apache/storm.git
> >> >
> >> > cd into the directory containing the Storm code, then fetch the branch
> >> > corresponding to the PR
> >> >
> >> > git fetch origin pull/2663/head:STORM-3059-1.x
> >> >
> >> > In this case 2663 is the PR number of the PR you want to fetch (from
> the
> >> > url https://github.com/apache/storm/pull/2663), and STORM-3059-1.x is
> >> the
> >> > name of the branch you want to create locally that will point to the
> >> > commits from the PR. Then you just checkout the branch you created.
> >> >
> >> > git checkout STORM-3059-1.x
> >> >
> >> > At this point you can build storm-kafka-client with
> >> >
> >> > mvn clean install -DskipTests -pl external/storm-kafka-client -am
> >> >
> >> > 2018-05-06 11:38 GMT+02:00 Alexandre Vermeerbergen <
> >> [email protected]
> >> >>:
> >> >
> >> >> Hello Stig,
> >> >>
> >> >> Yes I can try your fix very quickly if you have a binary artifact
> >> >> (storm-kafka-client.jar, I guess) which I could download.
> >> >> Or "copy paste" instructions that I could use to build it (I'm sorry
> :
> >> >> I tend to be slow at understanding how to retrieve specific pull
> >> >> requests to build artifacts).
> >> >>
> >> >> Best regards,
> >> >> Alexandre
> >> >>
> >> >> 2018-05-06 10:51 GMT+02:00 Stig Rohde Døssing <
> [email protected]>:
> >> >> > I put up what I believe should be a fix at
> >> >> > https://github.com/apache/storm/pull/2663, would you be willing to
> >> try
> >> >> it
> >> >> > out?
> >> >> >
> >> >> > Regarding killing the entire worker, you are right that it can be
> >> >> overkill
> >> >> > in some cases, but there's a tradeoff you have to make. Heron runs
> >> each
> >> >> > component (spout/bolt) in independent JVMs, so if e.g. the spout
> >> crashes
> >> >> > there then only the JVM hosting that spout will crash and restart.
> >> They
> >> >> pay
> >> >> > for it by having to communicate between JVMs more, since they never
> >> have
> >> >> > situations where a spout can send a tuple to a bolt without having
> to
> >> >> > serialize it and go between processes.
> >> >> >
> >> >> > 2018-05-06 10:38 GMT+02:00 Alexandre Vermeerbergen <
> >> >> [email protected]
> >> >> >>:
> >> >> >
> >> >> >> Hello Stig,
> >> >> >>
> >> >> >> Thank you very much for your very fast answer and for opening
> >> >> >> https://issues.apache.org/jira/browse/STORM-3059.
> >> >> >>
> >> >> >> Regarding my generic concern that Kafka Spout exceptions shouldn't
> >> >> >> kill it's worker process, I am still concerned by the scope of the
> >> >> >> "kill/recovery".
> >> >> >> Indeed, a worker process generally not only hosts spouts, but also
> >> >> bolts.
> >> >> >> The fact that a spout occasional crash leads to the killing of
> >> >> >> everything else running on the same worker process seems overkill
> (no
> >> >> >> pun intended) to me.
> >> >> >>
> >> >> >> To give an analogy with a web application server, it's like if we
> >> >> >> would agree that an exception thrown by a servlet could lead to a
> >> kill
> >> >> >> of the application server's container process. Yeah with a
> cluster of
> >> >> >> containers and a good load balancer in front this could be OK in
> >> >> >> production, but yet... I still feel this overkill.
> >> >> >>
> >> >> >> Back to my precise issues, is there possibility to have
> >> >> >> https://issues.apache.org/jira/browse/STORM-3059 in Storm 1.2.2
> >> final
> >> >> >> ?
> >> >> >>
> >> >> >> Best regards,
> >> >> >> Alexandre Vermeerbergen
> >> >> >>
> >> >> >>
> >> >> >>
> >> >> >> 2018-05-06 9:58 GMT+02:00 Stig Rohde Døssing <
> [email protected]
> >> >:
> >> >> >> > The exception is caused by the fix in STORM-2994, the new code
> >> should
> >> >> >> only
> >> >> >> > run in AT_LEAST_ONCE mode, not in the others.
> >> >> >> >
> >> >> >> > Have raised https://issues.apache.org/jira/browse/STORM-3059 to
> >> fix
> >> >> it.
> >> >> >> >
> >> >> >> > I disagree that the spout should catch and swallow
> >> unknown/unexpected
> >> >> >> > exceptions. Storm is designed to be fail-fast, and to restart
> >> >> processes
> >> >> >> > when they error out unexpectedly. I don't think the spout would
> >> work
> >> >> any
> >> >> >> > better if it caught and ignored these exceptions.
> >> >> >> >
> >> >> >> > 2018-05-06 9:30 GMT+02:00 Alexandre Vermeerbergen <
> >> >> >> [email protected]>
> >> >> >> > :
> >> >> >> >
> >> >> >> >> Hello All,
> >> >> >> >>
> >> >> >> >> [ ] -1 Do not release this package because storm-kafka-client
> >> spout
> >> >> >> >> glitches can crash workers, leading to degraded performances.
> >> >> >> >>
> >> >> >> >> I have downloaded the binary artifacts of this storm 1.2.0rc2,
> >> copied
> >> >> >> >> the binaries of storm-kafka-client, flux and flux-wrapper from
> the
> >> >> >> >> Nessus staging repository quoted by Taylor, and ran tests on a
> >> >> >> >> relatively modest configuration (1 VM for Nimbus, 1 VM for a
> >> >> >> >> Supervisor node, 1 VM a Zookeeper node) with Java 8 update 172
> on
> >> >> >> >> CentOS 7. We use storm-kafka-client with Kafka 0.10.2.0 libs
> >> against
> >> >> a
> >> >> >> >> large cluster of Kafka Brokers at version 1.0.1. We have ~15
> >> >> >> >> topologies running on this setup.
> >> >> >> >>
> >> >> >> >> The first glitch I noticed is a that, unlike with Storm 1.2.0
> >> which
> >> >> we
> >> >> >> >> use in production, we had deserialization exceptions in of our
> our
> >> >> >> >> Spout:
> >> >> >> >> - With Storm 1.2.0, these exceptions were somehow "swallowed",
> and
> >> >> >> >> this spout would consume nothing and not crashing its worker
> >> anyway
> >> >> >> >> - With Storm 1.2.2rc2, these exceptions showed up, with a
> crash of
> >> >> the
> >> >> >> >> Spout's worker process. Storm restarts the worker, and then the
> >> same
> >> >> >> >> crash occurs not long afte. All this leads to many Netty
> errors in
> >> >> all
> >> >> >> >> our topologies' logs, which clearly means bad performances (CPU
> >> load
> >> >> >> >> quite high on the Supervisor VM)
> >> >> >> >> After fixing the cause of the deserialization exception (which
> >> was in
> >> >> >> >> our code), this issue disappeared.
> >> >> >> >>
> >> >> >> >> The second glitch which I just noticed is that we have a
> another
> >> >> >> >> "situation" which is similar but yet a little bit different :
> on
> >> >> >> >> another of our topologies, we have a spout throwing the
> following
> >> >> >> >> exception, also leading to its worker's crash:
> >> >> >> >>
> >> >> >> >> 2018-05-06 06:55:49.560 o.a.s.k.s.KafkaSpout
> >> >> >> >> Thread-6-eventKafkaSpout-executor[3 3] [INFO] Initialization
> >> >> complete
> >> >> >> >> 2018-05-06 06:55:49.636 o.a.s.util Thread-6-eventKafkaSpout-
> >> >> executor[3
> >> >> >> >> 3] [ERROR] Async loop died!
> >> >> >> >> java.lang.NullPointerException: null
> >> >> >> >>         at org.apache.storm.kafka.spout.
> >> KafkaSpout.emitOrRetryTuple(
> >> >> >> >> KafkaSpout.java:507)
> >> >> >> >> ~[stormjar.jar:?]
> >> >> >> >>         at org.apache.storm.kafka.spout.KafkaSpout.
> >> >> >> >> emitIfWaitingNotEmitted(KafkaSpout.java:440)
> >> >> >> >> ~[stormjar.jar:?]
> >> >> >> >>         at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(
> >> >> >> >> KafkaSpout.java:308)
> >> >> >> >> ~[stormjar.jar:?]
> >> >> >> >>         at org.apache.storm.daemon.
> executor$fn__10727$fn__10742$
> >> >> >> >> fn__10773.invoke(executor.clj:654)
> >> >> >> >> ~[storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at org.apache.storm.util$async_
> >> loop$fn__553.invoke(util.clj:
> >> >> >> 484)
> >> >> >> >> [storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at clojure.lang.AFn.run(AFn.java:22)
> >> [clojure-1.7.0.jar:?]
> >> >> >> >>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
> >> >> >> >> 2018-05-06 06:55:49.641 o.a.s.d.executor
> >> >> >> >> Thread-6-eventKafkaSpout-executor[3 3] [ERROR]
> >> >> >> >> java.lang.NullPointerException: null
> >> >> >> >>         at org.apache.storm.kafka.spout.
> >> KafkaSpout.emitOrRetryTuple(
> >> >> >> >> KafkaSpout.java:507)
> >> >> >> >> ~[stormjar.jar:?]
> >> >> >> >>         at org.apache.storm.kafka.spout.KafkaSpout.
> >> >> >> >> emitIfWaitingNotEmitted(KafkaSpout.java:440)
> >> >> >> >> ~[stormjar.jar:?]
> >> >> >> >>         at org.apache.storm.kafka.spout.KafkaSpout.nextTuple(
> >> >> >> >> KafkaSpout.java:308)
> >> >> >> >> ~[stormjar.jar:?]
> >> >> >> >>         at org.apache.storm.daemon.
> executor$fn__10727$fn__10742$
> >> >> >> >> fn__10773.invoke(executor.clj:654)
> >> >> >> >> ~[storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at org.apache.storm.util$async_
> >> loop$fn__553.invoke(util.clj:
> >> >> >> 484)
> >> >> >> >> [storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at clojure.lang.AFn.run(AFn.java:22)
> >> [clojure-1.7.0.jar:?]
> >> >> >> >>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
> >> >> >> >> 2018-05-06 06:55:49.702 o.a.s.util Thread-6-eventKafkaSpout-
> >> >> executor[3
> >> >> >> >> 3] [ERROR] Halting process: ("Worker died")
> >> >> >> >> java.lang.RuntimeException: ("Worker died")
> >> >> >> >>         at org.apache.storm.util$exit_
> >> process_BANG_.doInvoke(util.
> >> >> >> clj:341)
> >> >> >> >> [storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at clojure.lang.RestFn.invoke(RestFn.java:423)
> >> >> >> >> [clojure-1.7.0.jar:?]
> >> >> >> >>         at org.apache.storm.daemon.worker$fn__11404$fn__11405.
> >> >> >> >> invoke(worker.clj:792)
> >> >> >> >> [storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at org.apache.storm.daemon.
> executor$mk_executor_data$fn__
> >> >> >> >> 10612$fn__10613.invoke(executor.clj:281)
> >> >> >> >> [storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at org.apache.storm.util$async_
> >> loop$fn__553.invoke(util.clj:
> >> >> >> 494)
> >> >> >> >> [storm-core-1.2.2.jar:1.2.2]
> >> >> >> >>         at clojure.lang.AFn.run(AFn.java:22)
> >> [clojure-1.7.0.jar:?]
> >> >> >> >>         at java.lang.Thread.run(Thread.java:748) [?:1.8.0_172]
> >> >> >> >> 2018-05-06 06:55:49.706 o.a.s.d.worker Thread-15 [INFO]
> Shutting
> >> down
> >> >> >> >> worker metricAggregation_ec2-34-248-249-45-eu-west-1-compute-
> >> >> >> >> amazonaws-com_defaultStormTopic-106-1525589734
> >> >> >> >> 871ced6c-14c4-4a59-a774-579bf357314f 6714
> >> >> >> >> 2018-05-06 06:55:49.707 o.a.s.d.worker Thread-15 [INFO]
> >> Terminating
> >> >> >> >> messaging context
> >> >> >> >> 2018-05-06 06:55:49.707 o.a.s.d.worker Thread-15 [INFO]
> Shutting
> >> down
> >> >> >> >> executors
> >> >> >> >>
> >> >> >> >> This exception looks like
> >> >> >> >> https://issues.apache.org/jira/browse/STORM-3032, but I can't
> >> tell
> >> >> for
> >> >> >> >> sure, except that the line number of the exception corresponds
> to
> >> >> this
> >> >> >> >> line of KafkaSpout.java:
> >> >> >> >>
> >> >> >> >>                 offsetManagers.get(tp).
> >> >> addToEmitMsgs(msgId.offset());
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> To sum up, I am voting [-1] on this 1.2.0rc2, because I have
> the
> >> >> >> >> feeling that exceptions in Kafka Spout should be gracefully
> caught
> >> >> and
> >> >> >> >> never lead to work crashes. I understand that the root cause of
> >> these
> >> >> >> >> exceptions can come from the specific code we have in our
> >> topologies,
> >> >> >> >> and for the 1st case I was glad to see it because it was an
> easy
> >> fix
> >> >> >> >> on our side, but nevertheless on a production system, one can
> have
> >> >> >> >> sometimes exceptions and the performance pain of workers crash
> is
> >> >> >> >> simply not affordable in production.
> >> >> >> >>
> >> >> >> >> I don't know if a JIRA already exists on this generic issue
> that
> >> >> kafka
> >> >> >> >> spout exceptions should be gracefully catched (and maybe lead
> to
> >> >> >> >> "Failed" tuples, so that there would be a tracking anyway? or
> at
> >> >> least
> >> >> >> >> a log message in Storm UI ?)
> >> >> >> >>
> >> >> >> >> Please note that this JIRA would differ from STORM-3032 because
> >> >> >> >> STORM-3032 seems to be specific to one case, where as in my
> >> opinion
> >> >> >> >> the crash issue is more generic - as my two different cases
> show.
> >> >> >> >>
> >> >> >> >> Best regards,
> >> >> >> >> Alexandre Vermeerbergen
> >> >> >> >>
> >> >> >> >>
> >> >> >> >> 2018-05-03 19:18 GMT+02:00 P. Taylor Goetz <[email protected]
> >:
> >> >> >> >> > CORRECTION: The Nexus staging repository for this rc is:
> >> >> >> >> >
> >> >> >> >> > https://repository.apache.org/content/repositories/
> >> >> >> orgapachestorm-1064
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >> > On May 3, 2018, at 11:42 AM, P. Taylor Goetz <
> [email protected]
> >> >
> >> >> >> wrote:
> >> >> >> >> >
> >> >> >> >> > This is a call to vote on releasing Apache Storm 1.2.2 (rc2)
> >> >> >> >> >
> >> >> >> >> > Full list of changes in this release:
> >> >> >> >> >
> >> >> >> >> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.
> >> >> >> >> 2.2-rc2/RELEASE_NOTES.html
> >> >> >> >> >
> >> >> >> >> > The tag/commit to be voted upon is v1.2.2:
> >> >> >> >> >
> >> >> >> >> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=tree;
> h=
> >> >> >> >> 7cb19fb3befa65e5ff9e5e02f38e16de865982a9;hb=
> >> >> >> e001672cf0ea59fe6989b563fb6bbb
> >> >> >> >> 450fe8e7e5
> >> >> >> >> >
> >> >> >> >> > The source archive being voted upon can be found here:
> >> >> >> >> >
> >> >> >> >> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.
> >> >> >> >> 2.2-rc2/apache-storm-1.2.2-src.tar.gz
> >> >> >> >> >
> >> >> >> >> > Other release files, signatures and digests can be found
> here:
> >> >> >> >> >
> >> >> >> >> > https://dist.apache.org/repos/dist/dev/storm/apache-storm-1.
> >> >> 2.2-rc2/
> >> >> >> >> >
> >> >> >> >> > The release artifacts are signed with the following key:
> >> >> >> >> >
> >> >> >> >> > https://git-wip-us.apache.org/repos/asf?p=storm.git;a=blob_
> >> >> >> >> plain;f=KEYS;hb=22b832708295fa2c15c4f3c70ac0d2bc6fded4bd
> >> >> >> >> >
> >> >> >> >> > The Nexus staging repository for this release is:
> >> >> >> >> >
> >> >> >> >> > https://repository.apache.org/content/repositories/
> >> >> >> orgapachestorm-1062
> >> >> >> >> >
> >> >> >> >> > Please vote on releasing this package as Apache Storm 1.2.2.
> >> >> >> >> >
> >> >> >> >> > When voting, please list the actions taken to verify the
> >> release.
> >> >> >> >> >
> >> >> >> >> > This vote will be open for 72 hours or until at least 3 PMC
> >> members
> >> >> >> vote
> >> >> >> >> +1.
> >> >> >> >> >
> >> >> >> >> > [ ] +1 Release this package as Apache Storm 1.2.2
> >> >> >> >> > [ ]  0 No opinion
> >> >> >> >> > [ ] -1 Do not release this package because...
> >> >> >> >> >
> >> >> >> >> > Thanks to everyone who contributed to this release.
> >> >> >> >> >
> >> >> >> >> > -Taylor
> >> >> >> >> >
> >> >> >> >> >
> >> >> >> >>
> >> >> >>
> >> >>
> >>
>

Reply via email to