Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-22 Thread Xingbo Huang
Thanks for the feedback, everyone. I will proceed if there is no objection. Best, Xingbo Till Rohrmann wrote on Mon, Mar 22, 2021 at 5:30 PM: > If there is no other way, then I would say let's go with splitting the > modules. This is already better than keeping the Flink binaries bundled > with every

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-22 Thread Till Rohrmann
If there is no other way, then I would say let's go with splitting the modules. This is already better than keeping the Flink binaries bundled with every Python/platform package. Cheers, Till On Mon, Mar 22, 2021 at 8:28 AM Xingbo Huang wrote: > When we **pip install** a wheel package, it just

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-22 Thread Xingbo Huang
When we **pip install** a wheel package, it just unpacks the wheel package and installs its dependencies[1]. There is no way to download things from an external website during installation. It works differently from the source package where we could download something in the setup.py. This is
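For illustration, a minimal sketch of how a thin `apache-flink` package could pull in the jars through an ordinary dependency, since pip resolves declared dependencies when it installs a wheel. This is not the actual PyFlink `setup.py`: the helper name `build_setup_kwargs`, the exact version pin, and the version string `1.13.0` are assumptions made for the example.

```python
# Hypothetical sketch, not the real PyFlink setup.py.
def build_setup_kwargs(version: str) -> dict:
    """Return setup() keyword arguments for a thin 'apache-flink' package."""
    return {
        "name": "apache-flink",
        "version": version,
        # pip installs declared dependencies alongside the wheel, so the jar
        # files can arrive via a second package instead of a download step.
        # Pinning to the exact same version is an assumption of this sketch.
        "install_requires": [f"apache-flink-libraries=={version}"],
    }


if __name__ == "__main__":
    # In a real setup.py these kwargs would be passed to setuptools.setup().
    print(build_setup_kwargs("1.13.0")["install_requires"])
```

With this layout no network access beyond the package index itself is needed at install time, which is the constraint raised in the message above.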

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-19 Thread Till Rohrmann
I think that we should try to reduce the size of the packages by either splitting them or by having another means to retrieve the Java binaries. Cheers, Till On Fri, Mar 19, 2021 at 2:58 AM Xingbo Huang wrote: > Hi Till, > > The package size of tensorflow[1] is also very big (about 300MB+).

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-18 Thread Xingbo Huang
Hi Till, The package size of tensorflow[1] is also very big (about 300MB+). However, it does not try to solve the problem, but expands the space limit in PyPI frequently whenever the project space is full. We could also choose this option. According to our current release frequency, we probably

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-17 Thread Till Rohrmann
How do other projects solve this problem? Cheers, Till On Wed, Mar 17, 2021 at 3:45 AM Xingbo Huang wrote: > Hi Chesnay, > > Yes, in most cases, we can indeed download the required jars in `setup.py`, > which is also the solution I originally thought of for reducing the size of > wheel packages.

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Xingbo Huang
Hi Chesnay, Yes, in most cases, we can indeed download the required jars in `setup.py`, which is also the solution I originally thought of for reducing the size of wheel packages. However, I'm afraid that it will not work in scenarios where accessing the external network is not possible, which is very

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Chesnay Schepler
This proposed apache-flink-libraries package would just contain the binary, right? And effectively be unusable to the Python audience on its own. Essentially we are just abusing PyPI for shipping a Java binary. Is there no way for us to download the jars when the Python package is being

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Dian Fu
Yes, the size of the .whl file in PyFlink will also be about 3MB if we split the package. Currently the package is big because we bundle the jar files in it. > On Mar 16, 2021, at 8:13 PM, Chesnay Schepler wrote: > > key difference being that the beam .whl files are 3 MB large, aka 60x smaller. > > On

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Chesnay Schepler
key difference being that the beam .whl files are 3 MB large, aka 60x smaller. On 3/16/2021 1:06 PM, Dian Fu wrote: Hi Chesnay, We will publish binary packages separately for: 1) Python 3.5 / 3.6 / 3.7 / 3.8 (since 1.12) 2) Linux / Mac Besides, there is also a source

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Dian Fu
Hi Chesnay, We will publish binary packages separately for: 1) Python 3.5 / 3.6 / 3.7 / 3.8 (since 1.12) 2) Linux / Mac Besides, there is also a source package which is used when none of the above binary packages is usable, e.g. for Windows users. PS: publishing multiple
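For illustration, the release matrix described above can be enumerated with a short sketch. It reads "(since 1.12)" as applying to Python 3.8 only, which is an assumption; the function name `release_artifacts` and the artifact labels are made up for the example and are not real wheel tags.

```python
from itertools import product


def release_artifacts(python_versions, platforms):
    """List one binary package per (Python version, platform) plus one source package."""
    wheels = [
        f"py{version}-{platform}"  # illustrative label, not a real wheel tag
        for version, platform in product(python_versions, platforms)
    ]
    return wheels + ["source"]


platforms = ["linux", "mac"]
artifacts_1_11 = release_artifacts(["3.5", "3.6", "3.7"], platforms)
artifacts_1_12 = release_artifacts(["3.5", "3.6", "3.7", "3.8"], platforms)

print(len(artifacts_1_11))  # 7
print(len(artifacts_1_12))  # 9
```

Under that reading, the counts line up with the 7 packages mentioned for release-1.11 and the 9 packages listed for 1.12.2 elsewhere in the thread.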

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Xintong Song
And it's not only uploaded to PyPI, but the ASF mirrors as well. https://dist.apache.org/repos/dist/release/flink/flink-1.12.2/python/ Thank you~ Xintong Song On Tue, Mar 16, 2021 at 7:41 PM Xintong Song wrote: > Actually, I think it's 9 packages, not 7. > > Check here for the 1.12.2

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Xintong Song
Actually, I think it's 9 packages, not 7. Check here for the 1.12.2 packages. https://pypi.org/project/apache-flink/#files Thank you~ Xintong Song On Tue, Mar 16, 2021 at 7:08 PM Chesnay Schepler wrote: > Am I reading this correctly that we publish 7 different artifacts just > for python?

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-16 Thread Chesnay Schepler
Am I reading this correctly that we publish 7 different artifacts just for Python? What does the release matrix look like? On 3/16/2021 3:45 AM, Dian Fu wrote: Hi Xingbo, Thanks a lot for bringing up this discussion. Actually the size limit already became an issue when releasing 1.11.3

Re: [DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-15 Thread Dian Fu
Hi Xingbo, Thanks a lot for bringing up this discussion. Actually the size limit already became an issue when releasing 1.11.3 and 1.12.1. It blocked us from publishing PyFlink packages to PyPI during the release as there was not enough space left (PS: already published the packages after

[DISCUSS] Split PyFlink packages into two packages: apache-flink and apache-flink-libraries

2021-03-11 Thread Xingbo Huang
Hi everyone, Since release-1.11, PyFlink has introduced Cython support, and we release 7 packages (for different platforms and Python versions) to PyPI for each release. The size of each package is more than 200MB, as we need to bundle the jar files into the package. The entire project
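For illustration, a rough sketch of the per-release storage arithmetic behind the proposal. The 7 packages and the 200MB lower bound come from the message above, and the roughly 3MB wheel size after splitting comes from a reply in the thread. That the jars would be uploaded once per release in a single `apache-flink-libraries` package of about 200MB is an assumption of this sketch, as are the function and variable names.

```python
def release_footprint_mb(package_count: int, size_per_package_mb: int) -> int:
    """Total PyPI space, in MB, used by one release of same-sized packages."""
    return package_count * size_per_package_mb


# Current layout: every platform/Python package bundles the jar files.
bundled_mb = release_footprint_mb(7, 200)

# Split layout (assumption): thin packages of about 3MB each, plus one
# apache-flink-libraries package of about 200MB that holds the jars.
split_mb = release_footprint_mb(7, 3) + release_footprint_mb(1, 200)

print(bundled_mb)  # 1400
print(split_mb)    # 221
```

Because 200MB is a lower bound, the bundled figure is a minimum; the sketch only shows the order of magnitude of the saving.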