For the content in the packages:

Currently we have binary packages for three platforms: osx, centos and ubuntu. The file sizes are fairly close among them, so I am going to use ubuntu as the example.

There is a tar.gz file and an sh file for each platform. I am thinking we may keep the sh file only, since it is easier to use and contains the same data as the tar.gz file. This can save about 50% in size (from 470m * 6 to 470m * 3).

The file size is 470m:

-rw-r--r--@ 1 nwang staff 470508875 Feb 27 10:57 heron-v0.20.1-incubating-rc1-ubuntu14.04.tar.gz

I believe the package can be trimmed, but there aren't many obvious candidates:

drwxr-xr-x@ 14 nwang staff 448 Dec 31 1969 bin
drwxr-xr-x@ 16 nwang staff 512 Dec 31 1969 conf
drwxr-xr-x@  5 nwang staff 160 Mar  6 15:10 dist
drwxr-xr-x@ 16 nwang staff 512 Dec 31 1969 examples
drwxr-xr-x@ 11 nwang staff 352 Dec 31 1969 include
drwxr-xr-x@ 12 nwang staff 384 Mar  6 15:04 lib
-r-xr-xr-x@  1 nwang staff 285 Dec 31 1969 release.yaml

It is used to install a local environment that contains all necessary components:
- bin: binaries for tools like api server, tracker, ui, etc
- conf: configurations for different environments
- dist: heron core and its tar.gz file (194m, might be trim-able)
- examples: example topology jars
- include: c++ header files
- lib: c++ and java lib files

More detailed information is attached at the end of this email.

/dist/heron-core/heron-core and /lib are the biggest dirs, and they might have a few duplicated big files we can try to get rid of:

heron-downloader.jar          16m * 2
heron-metricscachemgr.jar    8.8m * 2
heron-binpacking-packing.jar 5.3m * 3
heron-roundrobin-packing.jar 5.3m * 2
scheduler jars                60m * 2
heron-localfs-statemgr.jar   5.3m * 2
heron-zookeeper-statemgr.jar 6.6m * 2

If we can get rid of heron-core.tar.gz and the duplicated files, we might be able to reduce the size to roughly 480m - 190m - 100m = 190m. The binary packages would then total about 190m * 3 = 570m, after some work.
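To confirm which of these jars are really byte-identical rather than just same-named, something like the following could be run against an unpacked package. This is only a sketch: the function names (file_digest, find_duplicates, reclaimable_bytes) are made up for illustration and are not part of Heron, and it assumes the package has already been extracted to a local directory.

```python
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def find_duplicates(root, min_size=1):
    """Group regular files under root by content hash.

    Returns {digest: [paths]} for every digest seen at least twice.
    Symlinks and files smaller than min_size bytes are skipped.
    """
    by_digest = defaultdict(list)
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path) or os.path.getsize(path) < min_size:
                continue
            by_digest[file_digest(path)].append(path)
    return {d: sorted(p) for d, p in by_digest.items() if len(p) > 1}


def reclaimable_bytes(duplicates):
    """Bytes saved if only one copy of each duplicated file were kept."""
    return sum(
        os.path.getsize(paths[0]) * (len(paths) - 1)
        for paths in duplicates.values()
    )
```

Calling find_duplicates on the extracted heron-v0.20.1-incubating-rc1-ubuntu14.04 directory and passing the result to reclaimable_bytes would give a measured number to compare against the ~100m estimate above. Hashing is used instead of comparing names and sizes because two jars with the same name are not guaranteed to have the same content.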
heron-v0.20.1-incubating-rc1-ubuntu14.04
├── [ 448] bin
│   ├── [2.1M] heron
│   ├── [2.1M] heron-admin
│   ├── [1.5K] heron-apiserver
│   ├── [1.5K] heron-apiserver.sh
│   ├── [1.2K] heron-downloader
│   ├── [ 931] heron-downloader-config
│   ├── [ 931] heron-downloader-config.sh
│   ├── [1.2K] heron-downloader.sh
│   ├── [2.2M] heron-explorer
│   ├── [ 42M] heron-nomad
│   ├── [2.5M] heron-tracker
│   └── [3.5M] heron-ui
├── [ 512] conf
│   ├── [ 384] aurora
│   │   ├── [1.1K] client.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [3.3K] heron.aurora
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [2.7K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [2.3K] statemgr.yaml
│   │   └── [1.1K] uploader.yaml
│   ├── [ 352] examples
│   │   ├── [1.3K] aurora_scheduler.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [1.3K] local_scheduler.yaml
│   │   ├── [1.5K] local_stateful.yaml
│   │   ├── [ 951] localfs_statemgr.yaml
│   │   ├── [1001] localfs_uploader.yaml
│   │   ├── [7.3K] metrics_sinks.yaml
│   │   └── [ 800] roundrobin_packing.yaml
│   ├── [2.3K] heron_tracker.yaml
│   ├── [ 352] kubernetes
│   │   ├── [ 979] client.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.3K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [1.5K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.7K] statemgr.yaml
│   │   └── [1.1K] uploader.yaml
│   ├── [ 384] local
│   │   ├── [ 984] client.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [2.2K] healthmgr.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [1.3K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.3K] statemgr.yaml
│   │   └── [1.2K] uploader.yaml
│   ├── [ 352] localzk
│   │   ├── [ 915] client.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [1.3K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [2.5K] statemgr.yaml
│   │   └── [1.2K] uploader.yaml
│   ├── [ 320] marathon
│   │   ├── [1.1K] client.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [1.5K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.3K] statemgr.yaml
│   │   └── [1.1K] uploader.yaml
│   ├── [ 320] mesos
│   │   ├── [ 800] client.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [2.0K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.3K] statemgr.yaml
│   │   └── [1.2K] uploader.yaml
│   ├── [ 416] nomad
│   │   ├── [1.1K] client.yaml
│   │   ├── [1.0K] cluster.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [1.8K] heron_nomad.sh
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [2.7K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.7K] statemgr.yaml
│   │   └── [1.1K] uploader.yaml
│   ├── [ 384] sandbox
│   │   ├── [ 984] client.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [2.2K] healthmgr.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [1.3K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.3K] statemgr.yaml
│   │   └── [1.2K] uploader.yaml
│   ├── [ 352] slurm
│   │   ├── [ 915] client.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [1.4K] scheduler.yaml
│   │   ├── [1.2K] slurm.sh
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.2K] statemgr.yaml
│   │   └── [1.2K] uploader.yaml
│   ├── [ 512] standalone
│   │   ├── [ 984] client.yaml
│   │   ├── [1.0K] cluster.yaml
│   │   ├── [1.1K] downloader.yaml
│   │   ├── [ 12K] heron_internals.yaml
│   │   ├── [1.8K] heron_nomad.sh
│   │   ├── [ 845] inventory.yaml
│   │   ├── [6.4K] metrics_sinks.yaml
│   │   ├── [1.0K] packing.yaml
│   │   ├── [  96] resources
│   │   │   └── [1015] master.hcl
│   │   ├── [2.5K] scheduler.yaml
│   │   ├── [1.6K] stateful.yaml
│   │   ├── [1.7K] statemgr.yaml
│   │   ├── [ 256] templates
│   │   │   ├── [1.5K] apiserver.template.hcl
│   │   │   ├── [1.6K] heron_tools.template.hcl
│   │   │   ├── [2.4K] scheduler.template.yaml
│   │   │   ├── [1.1K] slave.template.hcl
│   │   │   ├── [1.7K] statemgr.template.yaml
│   │   │   └── [1.0K] uploader.template.yaml
│   │   └── [1.0K] uploader.yaml
│   ├── [ 160] test
│   │   ├── [ 11K] test_heron_internals.yaml
│   │   ├── [8.1K] test_metrics_sinks.yaml
│   │   └── [ 816] test_override.yaml
│   └── [ 384] yarn
│       ├── [ 915] client.yaml
│       ├── [1.1K] downloader.yaml
│       ├── [2.2K] healthmgr.yaml
│       ├── [ 12K] heron_internals.yaml
│       ├── [6.4K] metrics_sinks.yaml
│       ├── [1.0K] packing.yaml
│       ├── [1.6K] scheduler.yaml
│       ├── [1.6K] stateful.yaml
│       ├── [1.2K] statemgr.yaml
│       └── [1.1K] uploader.yaml
├── [ 160] dist
│   ├── [ 160] heron-core
│   │   ├── [ 160] heron-core
│   │   │   ├── [ 384] bin
│   │   │   │   ├── [6.3M] heron-cpp-instance
│   │   │   │   ├── [1.2K] heron-downloader
│   │   │   │   ├── [ 931] heron-downloader-config
│   │   │   │   ├── [ 931] heron-downloader-config.sh
│   │   │   │   ├── [1.2K] heron-downloader.sh
│   │   │   │   ├── [1.7M] heron-executor
│   │   │   │   ├── [1.8M] heron-python-instance
│   │   │   │   ├── [2.5M] heron-shell
│   │   │   │   ├── [7.4M] heron-stmgr
│   │   │   │   └── [7.7M] heron-tmaster
│   │   │   └── [ 416] lib
│   │   │       ├── [  96] ckptmgr
│   │   │       │   └── [5.4M] heron-ckptmgr.jar
│   │   │       ├── [  96] downloaders
│   │   │       │   └── [ 16M] heron-downloader.jar
│   │   │       ├── [  96] healthmgr
│   │   │       │   └── [ 32M] heron-healthmgr.jar
│   │   │       ├── [  96] instance
│   │   │       │   └── [3.3M] heron-instance.jar
│   │   │       ├── [  96] metricscachemgr
│   │   │       │   └── [8.8M] heron-metricscachemgr.jar
│   │   │       ├── [  96] metricsmgr
│   │   │       │   └── [6.9M] heron-metricsmgr.jar
│   │   │       ├── [ 128] packing
│   │   │       │   ├── [5.3M] heron-binpacking-packing.jar
│   │   │       │   └── [5.3M] heron-roundrobin-packing.jar
│   │   │       ├── [ 288] scheduler
│   │   │       │   ├── [ 16M] heron-kubernetes-scheduler.jar
│   │   │       │   ├── [7.5M] heron-local-scheduler.jar
│   │   │       │   ├── [7.5M] heron-marathon-scheduler.jar
│   │   │       │   ├── [9.4M] heron-mesos-scheduler.jar
│   │   │       │   ├── [ 13M] heron-nomad-scheduler.jar
│   │   │       │   ├── [7.3M] heron-scheduler.jar
│   │   │       │   └── [7.5M] heron-slurm-scheduler.jar
│   │   │       ├── [ 160] statefulstorage
│   │   │       │   ├── [ 18M] heron-dlog-statefulstorage.jar
│   │   │       │   ├── [5.0M] heron-hdfs-statefulstorage.jar
│   │   │       │   └── [5.0M] heron-localfs-statefulstorage.jar
│   │   │       └── [ 128] statemgr
│   │   │           ├── [5.3M] heron-localfs-statemgr.jar
│   │   │           └── [6.6M] heron-zookeeper-statemgr.jar
│   │   └── [ 285] release.yaml
│   └── [186M] heron-core.tar.gz
├── [ 512] examples
│   ├── [2.0K] fibonacci.yaml
│   ├── [3.3M] heron-api-examples.jar
│   ├── [4.4M] heron-eco-examples.jar
│   ├── [1.6K] heron-stateful-windowing.yaml
│   ├── [1.2K] heron-stateful-word-count.yaml
│   ├── [ 12M] heron-streamlet-examples.jar
│   ├── [ 31M] heron-streamlet-scala-examples.jar
│   ├── [2.1K] heron_fibonacci.yaml
│   ├── [1.9K] heron_windowing.yaml
│   ├── [2.0K] heron_wordcount.yaml
│   ├── [ 860] sample.properties
│   ├── [1.8K] simple_windowing.yaml
│   ├── [2.0K] simple_wordcount.yaml
│   └── [4.4M] storm-eco-examples.jar
├── [ 352] include
│   ├── [ 288] bolt
│   │   ├── [2.0K] base-basic-bolt.h
│   │   ├── [1.3K] base-rich-bolt.h
│   │   ├── [2.2K] basic-bolt-collector.h
│   │   ├── [2.8K] ibasic-bolt.h
│   │   ├── [3.0K] ibolt-output-collector.h
│   │   ├── [4.3K] ibolt.h
│   │   └── [1.2K] irich-bolt.h
│   ├── [  96] config
│   │   └── [ 12K] config.h
│   ├── [ 160] exceptions
│   │   ├── [1.2K] already-alive-exception.h
│   │   ├── [1.3K] invalid-topology-exception.h
│   │   └── [1.3K] serialization-exception.h
│   ├── [ 320] metric
│   │   ├── [1.4K] assignable-metric.h
│   │   ├── [1.5K] count-metric.h
│   │   ├── [1.2K] imetric.h
│   │   ├── [1.6K] imetrics-registrar.h
│   │   ├── [1.3K] imulti-metric.h
│   │   ├── [1.6K] mean-metric.h
│   │   ├── [1.9K] multi-count-metric.h
│   │   └── [1.9K] multi-mean-metric.h
│   ├── [ 192] serializer
│   │   ├── [1.6K] cereal-serializer.h
│   │   ├── [1.9K] ipluggable-serializer.h
│   │   ├── [1.5K] string-serializer.h
│   │   └── [6.2K] tuple-serializer-utils.h
│   ├── [ 192] spout
│   │   ├── [1.4K] base-rich-spout.h
│   │   ├── [1.3K] irich-spout.h
│   │   ├── [2.4K] ispout-output-collector.h
│   │   └── [5.6K] ispout.h
│   ├── [ 192] topology
│   │   ├── [1.4K] base-rich-spout.h
│   │   ├── [1.3K] irich-spout.h
│   │   ├── [2.4K] ispout-output-collector.h
│   │   └── [5.6K] ispout.h
│   ├── [ 128] tuple
│   │   ├── [2.9K] fields.h
│   │   └── [2.8K] tuple.h
│   └── [  96] utils
│       └── [2.4K] utils.h
├── [ 384] lib
│   ├── [ 224] api
│   │   ├── [ 43K] api-scala.jar
│   │   ├── [ 46M] heron-apiserver.jar
│   │   ├── [110K] libapi-java.jar
│   │   ├── [ 60K] libcxx-api.a
│   │   └── [ 61K] libcxx-api.pic.a
│   ├── [  96] downloaders
│   │   └── [ 16M] heron-downloader.jar
│   ├── [  96] metricscachemgr
│   │   └── [8.8M] heron-metricscachemgr.jar
│   ├── [ 128] packing
│   │   ├── [5.3M] heron-binpacking-packing.jar
│   │   └── [5.3M] heron-roundrobin-packing.jar
│   ├── [ 416] scheduler
│   │   ├── [7.5M] heron-aurora-scheduler.jar
│   │   ├── [5.3M] heron-binpacking-packing.jar
│   │   ├── [ 16M] heron-kubernetes-scheduler.jar
│   │   ├── [7.5M] heron-local-scheduler.jar
│   │   ├── [7.5M] heron-marathon-scheduler.jar
│   │   ├── [9.4M] heron-mesos-scheduler.jar
│   │   ├── [ 13M] heron-nomad-scheduler.jar
│   │   ├── [5.3M] heron-roundrobin-packing.jar
│   │   ├── [7.3M] heron-scheduler.jar
│   │   ├── [7.5M] heron-slurm-scheduler.jar
│   │   └── [9.0M] heron-yarn-scheduler.jar
│   ├── [  96] simulator
│   │   └── [ 45K] libsimulator-java.jar
│   ├── [ 128] statemgr
│   │   ├── [5.3M] heron-localfs-statemgr.jar
│   │   └── [6.6M] heron-zookeeper-statemgr.jar
│   ├── [ 160] third_party
│   │   ├── [1.3M] libprotobuf_java.jar
│   │   ├── [ 29K] slf4j-api-1.7.7.jar
│   │   └── [7.7K] slf4j-jdk14-1.7.7.jar
│   └── [ 320] uploader
│       ├── [ 16M] heron-dlog-uploader.jar
│       ├── [4.8M] heron-gcs-uploader.jar
│       ├── [2.5M] heron-hdfs-uploader.jar
│       ├── [3.9M] heron-http-uploader.jar
│       ├── [2.5M] heron-localfs-uploader.jar
│       ├── [2.5M] heron-null-uploader.jar
│       ├── [7.2M] heron-s3-uploader.jar
│       └── [2.5M] heron-scp-uploader.jar
└── [ 285] release.yaml

53 directories, 253 files

On Sun, Mar 10, 2019 at 11:54 PM Ning Wang <[email protected]> wrote:

> Thanks Dave!
>
> Yeah. I will get a list of content in packages.
>
> For the docker image, I think it should be ok.
> Let me try to publish it to Apache docker hub and see if there is
> anything missing.
>
> On Fri, Mar 8, 2019 at 2:14 PM Dave Fisher <[email protected]> wrote:
>
>> Hi -
>>
>> > On Mar 8, 2019, at 1:58 PM, Ning Wang <[email protected]> wrote:
>> >
>> > Hi,
>> >
>> > I have been trying to release Heron 0.20.1 (being distracted time by
>> > time though) and the most recent question I am having is where to put
>> > the binary packages.
>> >
>> > The binary packages are (when we were doing releases on github):
>> > - tar.gz packages for osx, centos and ubuntu, each one includes all
>> >   modules like core, lib, tools, etc.
>> > - .sh packages for the three platforms. which is an installer for the
>> >   modules in the tar.gz packages.
>> > - docker image (dockerhub, not github most of the times)
>>
>> Let’s discuss the components in each binary package and how big that
>> they really are and need to be.
>>
>> > Currently each package is more than 400MB.
>>
>> When packages of this size are released from dev to the release area it
>> requires replication to the Apache Mirror system. When the size exceeds
>> an aggregate of 1GB then Infra needs to manually handle things to avoid
>> impacting the mirrors. (There are 250 projects using the mirrors.)
>>
>> > I was trying to understand the Apache rules and my impression was that
>> > these package should be on dist.apache.org like the src packages (I
>> > might be wrong about the rules though) and it looks like Apache Storm
>> > has a binary package in their release.
>>
>> Make the case for Heron without comparison to other projects.
>>
>> There is a place to make Apache Docker releases on docker hub. Let’s
>> figure out this if it is a valid distribution that Heron could make.
>>
>> > However it seems Apache infra has a byte limit of 500MB for each
>> > release. I guess it means that the binary packages are not "required"
>> > to be on Apache infra?
>>
>> Yes and no. Let’s discuss the packages first.
>>
>> > The binary packages are convenient for users. So I think they should
>> > be included in release. The question is where should we put them?
>> >
>> > So far it looks like the options are:
>> > - ask for an exception and publish them to dist.apache.org. It seems
>> >   like Apache infra guys don't suggest this solution.
>> > - publish only src package to dist.apache.org and publish the binary
>> >   packages on github (or is there any other suggestion?). This is
>> >   convenient for us and there is no problem so far (we have binary
>> >   packages for all the previous releases and github hasn't
>> >   complained). The question about this option is more like if this is
>> >   acceptable (or ever better suggested) by Apache?
>> >
>> > What do you think about the two options above and any other options
>> > we should consider?
>>
>> Let’s discuss the packages.
>>
>> Next we will also need to discuss the website.
>>
>> > Thanks.
>> > --ning
