I thought we only included the libraries from the SystemDS binary in the Python package; if so, the hadoop-* libraries are not new additions. Unfortunately, test.pypi doesn't allow packages larger than 100 MB, which means we won't be able to dry-run our Python releases. I would be a little more comfortable with a clearer explanation of why the Python package size increased by 2x since the last release.
Regards,
Arnab.

On Tue, Jun 21, 2022 at 6:55 PM Janardhan <janard...@apache.org> wrote:
> Hi,
>
> The PyPI package is a little more than 100 MB, compared to ~56 MB for 2.2.1.
>
> -- Libraries added in the present release (sizes after unzip):
>
>  70K Jun 21 15:08 commons-compiler-3.0.16.jar
> 601K Jun 21 15:08 commons-compress-1.19.jar
> 193K Jun 21 15:08 commons-text-1.6.jar
>  19M Jun 21 15:08 hadoop-client-api-3.3.1.jar
>  31M Jun 21 15:08 hadoop-client-runtime-3.3.1.jar
> 5.3M Jun 21 15:08 hadoop-hdfs-client-3.3.1.jar
> 1.5M Jun 21 15:08 htrace-core4-4.1.0-incubating.jar
> 126K Jun 21 15:08 re2j-1.1.jar
> 192K Jun 21 15:08 stax2-api-4.2.1.jar
> 511K Jun 21 15:08 woodstox-core-5.3.0.jar
>
> Let us see if there is some optimization we can do?
>
> Best,
> Janardhan
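
P.S. To make the size comparison concrete, a quick way to see which jars drive the increase is to list the largest entries inside the built wheel. This is only a minimal sketch using the standard library; the wheel path below is hypothetical and should be replaced with whatever `python setup.py bdist_wheel` actually produces for the release candidate.

    # list_jar_sizes.py -- sketch: rank the .jar entries bundled in a wheel by size
    import sys
    import zipfile

    # Hypothetical path; point this at the actual release-candidate wheel.
    wheel_path = sys.argv[1] if len(sys.argv) > 1 else "dist/systemds-3.0.0-py3-none-any.whl"

    with zipfile.ZipFile(wheel_path) as wheel:
        jars = [(info.file_size, info.filename)
                for info in wheel.infolist()
                if info.filename.endswith(".jar")]

    # Print the bundled jars, largest first, in MB.
    for size, name in sorted(jars, reverse=True):
        print(f"{size / (1024 * 1024):8.1f} MB  {name}")

    print(f"Total jar payload: {sum(s for s, _ in jars) / (1024 * 1024):.1f} MB")

Comparing that output between the 2.2.1 wheel and the current candidate should show directly whether the hadoop-client-* jars (roughly 50 MB of the listing above) account for the doubling, and whether they were already present before.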