Since having a separate module for each of the connectors would cause a lot of
bloat, it may be good to club them into one.

-roshan


On 3/2/17, 10:39 AM, "Sree V" <sree_at_ch...@yahoo.com.INVALID> wrote:

    +1 to separate main binaries, connectors and samples/examples.
     
    Thanking you.
    With Regards
    Sree 
    
        On Thursday, March 2, 2017 8:02 AM, Satish Duggana <satish.dugg...@gmail.com> wrote:
     
    
     Agree that such huge binaries may not be acceptable. We should really think
    about the options proposed earlier, like excluding some of the external
    connectors from the binary and updating the documentation accordingly.
    
    Thanks,
    Satish.
    
    On Thu, Mar 2, 2017 at 4:41 PM, Jungtaek Lim <kabh...@gmail.com> wrote:
    
    > Adding my observation to my last mail:
    > I just gave STORM-2249 a try (against the master branch) and compared before
    > vs. after.
    > It adds more than 300 MB, and the final archive is more than 550 MB. I
    > guess it would be similar for the 1.x branch.
    >
    > Before ----------
    > -rw-r--r--  1 jlim  staff  3.3K  3  2 19:41 apache-storm-2.0.0-SNAPSHOT.pom
    > -rw-r--r--  1 jlim  staff  473B  3  2 19:41 apache-storm-2.0.0-SNAPSHOT.pom.asc
    > -rw-r--r--  1 jlim  staff  264M  3  2 19:41 apache-storm-2.0.0-SNAPSHOT.tar.gz
    > -rw-r--r--  1 jlim  staff  473B  3  2 19:41 apache-storm-2.0.0-SNAPSHOT.tar.gz.asc
    > -rw-r--r--  1 jlim  staff  264M  3  2 19:41 apache-storm-2.0.0-SNAPSHOT.zip
    > -rw-r--r--  1 jlim  staff  473B  3  2 19:41 apache-storm-2.0.0-SNAPSHOT.zip.asc
    >
    > After -----------
    > -rw-r--r--  1 jlim  staff  3.3K  3  2 19:46 apache-storm-2.0.0-SNAPSHOT.pom
    > -rw-r--r--  1 jlim  staff  473B  3  2 19:46 apache-storm-2.0.0-SNAPSHOT.pom.asc
    > -rw-r--r--  1 jlim  staff  564M  3  2 19:46 apache-storm-2.0.0-SNAPSHOT.tar.gz
    > -rw-r--r--  1 jlim  staff  473B  3  2 19:46 apache-storm-2.0.0-SNAPSHOT.tar.gz.asc
    > -rw-r--r--  1 jlim  staff  565M  3  2 19:46 apache-storm-2.0.0-SNAPSHOT.zip
    > -rw-r--r--  1 jlim  staff  473B  3  2 19:47 apache-storm-2.0.0-SNAPSHOT.zip.asc
    >
    > While 264M is already rather large for me, 564M is not something I can accept.
    > (Binary dist. of Flink 1.2.0 is 127M, Spark 2.1.0 is 195M, Kafka 0.10.2 is
    > 37M.)
    >
    > Btw, we're including the source code of the examples, and "mvn clean package"
    > will work for every example module.
    >
    > On Thu, Mar 2, 2017 at 11:29 AM, Jungtaek Lim <kabh...@gmail.com> wrote:
    >
    > > I guess it might be a good time to think about why we add all connectors
    > > to the binary distribution.
    > >
    > > Spark and Flink don't include them in the binary dist. They even moved some
    > > or most of the connectors out of the repo and have been maintaining them in
    > > Apache Bahir. (Personally this is something I'm in favor of. We have lots
    > > of connectors and many of them are outdated - a clear example is
    > > storm-elasticsearch.)
    > >
    > > If we assume users are online, then we don't even need to think about users
    > > touching the binary dist. version of the connectors. Users have been including
    > > them via build tools' dependency management, or, starting with 1.1.0, users
    > > can include them via the '--artifacts' option.
    > >
    > > I was also just one of the users of Storm, and I haven't used them directly.
    > > How much worse does the UX get when we remove connectors from the binary
    > > dist? Including them only helps some users who are not connected to the
    > > internet, and IMHO that's a rare case.
    > >
    > > I would like to see the opposite approach: removing all connectors (or
    > > just keeping storm-kafka/storm-kafka-client and a few other preferred ones)
    > > and their relevant examples from the binary dist.
    > >
    > > What do you think about it?
    > >
    > > - Jungtaek Lim (HeartSaVioR)
    > >
    > >
    > > On Thu, Mar 2, 2017 at 10:23 AM, Roshan Naik <ros...@hortonworks.com> wrote:
    > >
    > > Once all of the shaded examples are included, the size will go up further.
    > > But as they are currently not part of the tar.gz, something else is the
    > > culprit for the bloat.
    > >
    > > Below is a comparative listing of the 1.0.3 vs 1.1.0 binary releases,
    > > showing files that are larger than 2MB.
    > >
    > > @Jungtaek Lim: I am thinking, since the code for the examples can be easily
    > > viewed online, it would be valuable to have the executable topologies made
    > > available to the user as part of the binary release, rather than have them
    > > figure out how to build them correctly before trying them out.
    > >
    > >
    > >
    > > -roshan
    > >
    > >
    > >
    > >
    > >
    > > ➜  apache-storm-1.0.3 >  find . -type f -size +4096 -exec ls -lh {} \;
    > > -rw-r--r--@ 1 roshan  staff    70M Feb  7 12:33 ./examples/storm-starter/storm-starter-topologies-1.0.3.jar
    > > -rwxr-xr-x@ 1 roshan  staff    65M Feb  7 12:30 ./external/flux/flux-examples-1.0.3.jar
    > > -rwxr-xr-x@ 1 roshan  staff  3.5M Feb  7 12:32 ./external/sql/storm-sql-core/calcite-core-1.4.0-incubating.jar
    > > -rwxr-xr-x@ 1 roshan  staff  2.1M Feb  7 12:32 ./external/sql/storm-sql-core/guava-16.0.1.jar
    > > -rwxr-xr-x@ 1 roshan  staff  7.3M Feb  7 12:30 ./external/storm-eventhubs/storm-eventhubs-1.0.3-jar-with-dependencies.jar
    > > -rwxr-xr-x@ 1 roshan  staff  5.6M Feb  7 12:33 ./external/storm-jms/storm-jms-examples-1.0.3-jar-with-dependencies.jar
    > > -rwxr-xr-x@ 1 roshan  staff  9.9M Feb  7 12:33 ./external/storm-mqtt/storm-mqtt-examples-1.0.3.jar
    > > -rw-r--r--@ 1 roshan  staff  3.7M Nov  4 10:02 ./lib/clojure-1.7.0.jar
    > > -rw-r--r--@ 1 roshan  staff    19M Feb  7 12:26 ./lib/storm-core-1.0.3.jar
    > > -rw-r--r--@ 1 roshan  staff  2.4M Feb  7 12:26 ./lib/storm-rename-hack-1.0.3.jar
    > >
    > >
    > >
    > > ➜  apache-storm-1.1.0 >  find . -type f -size +4096 -exec ls -lh {} \;
    > > -rwxr-xr-x@ 1 roshan  staff  8.0M Feb 24 12:23 ./examples/storm-pmml-examples/storm-pmml-examples-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff    60M Feb 24 12:20 ./examples/storm-starter/storm-starter-topologies-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff    66M Feb 24 12:11 ./external/flux/flux-examples-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff  4.0M Feb 24 12:16 ./external/sql/storm-sql-core/calcite-core-1.11.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff  2.1M Feb 24 12:16 ./external/sql/storm-sql-core/guava-16.0.1.jar
    > > -rwxr-xr-x@ 1 roshan  staff  4.0M Feb 24 12:12 ./external/sql/storm-sql-runtime/calcite-core-1.11.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff  2.1M Feb 24 12:12 ./external/sql/storm-sql-runtime/guava-16.0.1.jar
    > > -rwxr-xr-x@ 1 roshan  staff    78M Feb 24 12:18 ./external/storm-druid/storm-druid-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff  7.3M Feb 24 12:11 ./external/storm-eventhubs/storm-eventhubs-1.1.0-jar-with-dependencies.jar
    > > -rwxr-xr-x@ 1 roshan  staff  5.6M Feb 24 12:20 ./external/storm-jms/storm-jms-examples-1.1.0-jar-with-dependencies.jar
    > > -rwxr-xr-x@ 1 roshan  staff  6.7M Feb 24 12:18 ./external/storm-submit-tools/storm-submit-tools-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff  3.7M Nov  4 10:02 ./lib/clojure-1.7.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff    20M Feb 24 12:07 ./lib/storm-core-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff  2.4M Feb 24 12:07 ./lib/storm-rename-hack-1.1.0.jar
    > > -rwxr-xr-x@ 1 roshan  staff    18M Feb 24 12:19 ./toollib/storm-kafka-monitor-1.1.0.jar
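
For anyone who wants to repeat the comparison above on an unpacked release, here is a minimal Python sketch of the same idea. It is only an illustration, not part of the Storm build: the function names `large_files` and `size_by_top_level_dir` are made up for this example. Note that `find -size +4096` counts 512-byte blocks, so the threshold used in the listings is about 2 MiB, which is what the default `min_bytes` below assumes.

```python
import os
from collections import defaultdict

# 4096 blocks of 512 bytes, matching `find . -type f -size +4096`
DEFAULT_MIN_BYTES = 4096 * 512


def large_files(root, min_bytes=DEFAULT_MIN_BYTES):
    """Yield (relative_path, size_in_bytes) for regular files larger than min_bytes."""
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Like `find -type f`, skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            size = os.path.getsize(path)
            if size > min_bytes:
                yield os.path.relpath(path, root), size


def size_by_top_level_dir(root, min_bytes=DEFAULT_MIN_BYTES):
    """Sum the sizes of the large files per top-level directory (e.g. external, lib)."""
    totals = defaultdict(int)
    for rel_path, size in large_files(root, min_bytes):
        parts = rel_path.split(os.sep)
        # Files sitting directly in the root are grouped under ".".
        top = parts[0] if len(parts) > 1 else "."
        totals[top] += size
    return dict(totals)
```

Calling `size_by_top_level_dir("apache-storm-1.1.0")` on an unpacked distribution (the directory name here is just the one from the listing) would return a dict of byte totals per top-level directory, which makes it easier to see whether `external`, `examples` or `lib` accounts for most of the growth between two releases.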
    > >
    > > On 3/1/17, 4:43 PM, "Jungtaek Lim" <kabh...@gmail.com> wrote:
    > >
    > >
    > >
    > >    About STORM-2249, since examples are shading their dependencies, the
    > >    binary dist will grow much bigger. I've left some comments regarding that.
    > >
    > >    Btw, I have another view on this. Showing example code is more important
    > >    than just letting users execute some topologies. That's what the example
    > >    modules are for. We need to include the source as well. If we need to
    > >    pick one, source code would be better.
    > >
    > >
    > >
    > >    STORM-2343 seems worth adding to 1.1.0. I'm not familiar enough with
    > >    storm-kafka-client, so I'm not sure I can review it, but I'll try. I
    > >    don't feel it would drag out the release. Let's add it to the 1.1.0 epic.
    > >
    > >
    > >
    > >    Let's keep merges to a minimum before another RC vote. Addressing
    > >    STORM-2389 (and maybe STORM-2343) is enough for me. The others are not
    > >    that critical.
    > >
    > >
    > >
    > >    Thanks,
    > >
    > >    Jungtaek Lim (HeartSaVioR)
    > >
    > >
    > >
    > >
    > >
    > >    On Thu, Mar 2, 2017 at 7:00 AM, Hugo Da Cruz Louro <hlo...@hortonworks.com> wrote:
    > >
    > >
    > >
    > >    > Roshan, do this PR <https://github.com/apache/storm/pull/1831> and JIRA
    > >    > <https://issues.apache.org/jira/browse/STORM-2249> address the missing
    > >    > jars problem that you mentioned? I created them in December 2016, but
    > >    > there is an ongoing discussion about whether we should indeed put the
    > >    > jars in the examples location or not.
    > >
    > >    >
    > >
    > >    > On a different note, this storm-kafka-client/KafkaSpout PR
    > >    > <https://github.com/apache/storm/pull/1924> fixes a bug with the number
    > >    > of uncommitted offsets that is quite important. It is not a blocker, but
    > >    > it is quite critical. I am going to do one last review pass today. It
    > >    > would be good if we could have this PR included in the release. Can
    > >    > anyone else review it as well?
    > >
    > >    >
    > >
    > >    > Thanks,
    > >
    > >    > Hugo
    > >
    > >    >
    > >
    > >    > On Mar 1, 2017, at 9:14 AM, P. Taylor Goetz <ptgo...@gmail.com> wrote:
    > >
    > >    >
    > >
    > >    > Yeah, I don’t think the file size is a killer/blocker. It’s largely due
    > >    > to shaded examples, etc. But it’s something to keep an eye on. Our binary
    > >    > releases shouldn’t have to be that big.
    > >
    > >    >
    > >
    > >    > -Taylor
    > >
    > >    >
    > >
    > >    > On Mar 1, 2017, at 12:09 PM, Roshan Naik <ros...@hortonworks.com> wrote:
    > >
    > >    >
    > >
    > >    > Have filed JIRAs for the 3 issues mentioned. Not sure if we need a JIRA
    > >    > for the file size getting bloated by that much.
    > >    > Somebody more familiar with the matter may want to take that up?
    > >
    > >    > -roshan
    > >
    > >    >
    > >
    > >    >
    > >
    > >    > On 3/1/17, 8:13 AM, "P. Taylor Goetz" <ptgo...@gmail.com> wrote:
    > >
    > >    >
    > >
    > >    >  Thanks for bringing these up, Roshan. Feel free to file JIRA tickets for
    > >    > these issues and assign them to the “Release Apache Storm 1.1.0” epic so
    > >    > they can be tracked for this release.
    > >
    > >    >
    > >
    > >    >  -Taylor
    > >
    > >    >
    > >
    > >    > On Mar 1, 2017, at 9:27 AM, Roshan Naik <ros...@hortonworks.com> wrote:
    > >
    > >    >
    > >
    > >    > Found these additional issues:
    > >    >
    > >    > 1- BUG: Even if topology.eventlogger.executors=0, the event_logger bolt
    > >    > is instantiated … previously observed to cause ~10% degradation in perf
    > >    > even with logging disabled.
    > >    >
    > >    > 2- Missing Jars: The storm-*-examples jars are missing in the binary
    > >    > distro (other than storm-pmml-examples.jar, storm-jms-examples.jar &
    > >    > flux-examples.jar).
    > >    >
    > >    > 3- Minor: HdfsSpoutTopology example has not been moved into
    > >    > storm-hdfs-examples from storm-starter.
    > >    >
    > >    > Another side observation: the v1.0.3 tar.gz downloadable was 190MB. This
    > >    > v1.1.0 tar.gz downloadable is 297MB, even though some of the example
    > >    > topologies didn’t make it.
    > >
    > >    >
    > >
    > >    >
    > >
    > >    >
    > >
    > >    > -roshan
    > >
    > >
    > >    --
    > >
    > >    Name : Jungtaek Lim
    > >
    > >    Blog : http://medium.com/@heartsavior
    > >
    > >    Twitter : http://twitter.com/heartsavior
    > >
    > >    LinkedIn : http://www.linkedin.com/in/heartsavior
    > >
    > >
    > >
    >
    
       
