On Thu, Jun 10, 2021 at 3:00 AM Ismaël Mejía <ieme...@gmail.com> wrote:
>
> As a follow up on this with the merge of
> https://github.com/apache/beam/pull/14832 Beam will be producing python
> wheels for AARCH64 starting on Beam 2.32.0!

Nice.

> Also due to the recent version updates (grpc, protobuf and arrow) we should
> be pretty close to fully support it without extra compilation.
> Seems like the only missing piece is cython
> https://github.com/cython/cython/issues/3892

Cython already supports ARM. This is just about providing pre-built wheels
for installing Cython (which aren't necessarily needed).

> Now the next important step would be to make the docker images multi-arch.
> That would be a great contribution if someone is motivated.
>
>
> On Thu, Jan 28, 2021 at 1:47 AM Robert Bradshaw <rober...@google.com> wrote:
>>
>> Cython supports ARM64. The issue here is that we don't have a C++ compiler
>> (It's looking for 'cc') available in the container (and grpc, and possibly
>> others, don't have wheel files for this platform). I wonder if apt-get
>> install build-essential would be sufficient.
>>
>> On Wed, Jan 27, 2021 at 2:22 PM Ismaël Mejía <ieme...@gmail.com> wrote:
>>>
>>> Nice to see the interest, I also suppose that devs on Apple macbooks with
>>> the new M1 processor will soon request this feature.
>>>
>>> I ran today some pipelines on ARM64 on classic runners relatively easy
>>> which was expected. We will have issues however for the Java 8 SDK harness
>>> because the parent image openjdk:8 is not supported yet for ARM64.
>>>
>>> I tried to setup a python dev environment and found the first issue. It
>>> looks like gRPC does not support arm64 yet [1][2] or am I misreading it?
>>>
>>> $ pip install -r build-requirements.txt
>>>
>>> Collecting grpcio-tools==1.30.0
>>> Downloading grpcio-tools-1.30.0.tar.gz (2.1 MB)
>>> |████████████████████████████████| 2.1 MB 21.7 MB/s
>>> ERROR: Command errored out with exit status 1:
>>> command: /home/ubuntu/.virtualenvs/beam-dev/bin/python3 -c
>>> 'import sys, setuptools, tokenize; sys.argv[0] =
>>> '"'"'/tmp/pip-install-3lhad2qc/grpcio-tools_d3562157df5c41db9110e4ccd165c87e/setup.py'"'"';
>>> __file__='"'"'/tmp/pip-install-3lhad2qc/grpcio-tools_d3562157df5c41db9110e4ccd165c87e/setup.py'"'"';f=getattr(tokenize,
>>> '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"',
>>> '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))'
>>> egg_info --egg-base /tmp/pip-pip-egg-info-km8agjf4
>>> cwd: /tmp/pip-install-3lhad2qc/grpcio-tools_d3562157df5c41db9110e4ccd165c87e/
>>> Complete output (11 lines):
>>> Traceback (most recent call last):
>>> File "<string>", line 1, in <module>
>>> File "/tmp/pip-install-3lhad2qc/grpcio-tools_d3562157df5c41db9110e4ccd165c87e/setup.py", line 112, in <module>
>>> if check_linker_need_libatomic():
>>> File "/tmp/pip-install-3lhad2qc/grpcio-tools_d3562157df5c41db9110e4ccd165c87e/setup.py", line 73, in check_linker_need_libatomic
>>> cc_test = subprocess.Popen(['cc', '-x', 'c++', '-std=c++11', '-'],
>>> File "/usr/lib/python3.8/subprocess.py", line 854, in __init__
>>> self._execute_child(args, executable, preexec_fn, close_fds,
>>> File "/usr/lib/python3.8/subprocess.py", line 1702, in _execute_child
>>> raise child_exception_type(errno_num, err_msg, err_filename)
>>> FileNotFoundError: [Errno 2] No such file or directory: 'cc'
>>> ----------------------------------------
>>> WARNING: Discarding
>>> https://files.pythonhosted.org/packages/da/3c/bed275484f6cc262b5de6ceaae36798c60d7904cdd05dc79cc830b880687/grpcio-tools-1.30.0.tar.gz#sha256=7878adb93b0c1941eb2e0bed60719f38cda2ae5568bc0bcaa701f457e719a329
>>> (from
>>> https://pypi.org/simple/grpcio-tools/). Command errored out with
>>> exit status 1: python setup.py egg_info Check the logs for full
>>> command output.
>>> ERROR: Could not find a version that satisfies the requirement
>>> grpcio-tools==1.30.0
>>> ERROR: No matching distribution found for grpcio-tools==1.30.0
>>>
>>> [1] https://pypi.org/project/grpcio-tools/#files
>>> [2] https://github.com/grpc/grpc/issues/21283
>>>
>>> I can imagine also that we will have some struggles with the python harness
>>> and all of its dependencies. Does cython already support ARM64?
>>>
>>> I went and filed some JIRAs to keep track of this:
>>>
>>> BEAM-11703 Support apache-beam python install on ARM64
>>> BEAM-11704 Support Beam docker images on ARM64
>>>
>>>
>>> On Tue, Jan 26, 2021 at 8:48 PM Robert Burke <rob...@frantil.com> wrote:
>>> >
>>> > I believe so.
>>> >
>>> > The Go SDK requires in most instances for a user to Register their DoFns
>>> > at package init time, linked to the type/functions fully qualified path
>>> > as determined by Go, which is consistent across architectures, at least
>>> > with the standard toolchain.
>>> >
>>> > Those strings are used to look things up on distributed workers,
>>> > regardless of the architecture.
>>> >
>>> >
>>> >
>>> > On Tue, Jan 26, 2021, 11:33 AM Robert Bradshaw <rober...@google.com>
>>> > wrote:
>>> >>
>>> >> Cool. Are DoFn (et al) references compatible across cross-compiled
>>> >> binaries?
>>> >>
>>> >> On Tue, Jan 26, 2021 at 11:23 AM Robert Burke <rob...@frantil.com> wrote:
>>> >>>
>>> >>> Go cross compilation is as simple as setting the right flag env
>>> >>> variables [1], but can be as complicated as requiring a cross compiling
>>> >>> GCC instance installed if CGO[2] is necessary. I think we're probably
>>> >>> clear on just needing the flag though for the various Boot executables.
>>> >>>
>>> >>> For go pipelines we'd need to update the shared runner code to support
>>> >>> selecting the cross compiled worker binary environment. I believe it's
>>> >>> hard set to amd64 linux at present, but that's a separate issue.
>>> >>>
>>> >>> [1] https://golangcookbook.com/chapters/running/cross-compiling/
>>> >>> [2] https://golang.org/cmd/cgo/
>>> >>>
>>> >>> On Tue, Jan 26, 2021, 10:25 AM Robert Bradshaw <rober...@google.com>
>>> >>> wrote:
>>> >>>>
>>> >>>> +1
>>> >>>>
>>> >>>> I don't think it would be that hard to build and release arm-based
>>> >>>> docker images. (Perhaps just a matter of changing the docker file to
>>> >>>> depend on a different base, and doing some cross-compile. That would
>>> >>>> suss out whether we're inadvertently taking on any incompatible
>>> >>>> dependencies.)
>>> >>>>
>>> >>>> Theoretically, if one does that and manually specifies the container,
>>> >>>> it could just work for Python (assuming no wheel files are specified
>>> >>>> as manual dependencies). For Java, if one builds/deploys an uberjar
>>> >>>> (on a different architecture), there may be issues in any transitive
>>> >>>> dependency that has JNI code (us or users). I'd imagine this issue is
>>> >>>> common to and being explored by many of the other Java big data
>>> >>>> systems in use; it'd be interesting to know what solutions are out
>>> >>>> there.
>>> >>>>
>>> >>>> For go, the executable is uploaded directly into the container. We'd
>>> >>>> probably have to do something fancier like cross-compiling the
>>> >>>> executable (and making sure the UserFn references, which I think are
>>> >>>> just pointers into the binary, still work if the launcher is one
>>> >>>> architecture and the workers another).
>>> >>>>
>>> >>>> Definitely worth exploring.
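
The "flag env variables" mentioned for Go cross compilation can be sketched as follows. This is a minimal illustration, not how Beam's build is wired: GOOS, GOARCH and CGO_ENABLED are the standard Go toolchain variables, while the helper names cross_compile_env and build_command, and any package path or output name passed to them, are hypothetical.

```python
import os
from typing import Dict, List, Mapping, Optional


def cross_compile_env(
    goos: str,
    goarch: str,
    base_env: Optional[Mapping[str, str]] = None,
    cgo: bool = False,
) -> Dict[str, str]:
    """Return a copy of the environment with Go cross-compilation variables set.

    Hypothetical helper: only the variable names come from the Go toolchain.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["GOOS"] = goos        # target operating system, e.g. "linux"
    env["GOARCH"] = goarch    # target architecture, e.g. "arm64"
    # With cgo disabled, no cross-compiling C compiler is needed.
    env["CGO_ENABLED"] = "1" if cgo else "0"
    return env


def build_command(package: str, output: str) -> List[str]:
    """Return the `go build` argument list for one package (hypothetical helper)."""
    return ["go", "build", "-o", output, package]
```

Passing the result as env= to subprocess.run(build_command(...), env=env, check=True) would run the build for the chosen target, assuming a Go toolchain is installed. Setting cgo=True corresponds to the harder case in the thread, where a cross-compiling GCC would also have to be present.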
>>> >>>>
>>> >>>> On Tue, Jan 26, 2021 at 10:09 AM Ismaël Mejía <ieme...@gmail.com>
>>> >>>> wrote:
>>> >>>>>
>>> >>>>> I stumbled today on this user request:
>>> >>>>> BEAM-10982 Wheel support for linux aarch64
>>> >>>>>
>>> >>>>> It made me wonder if with the advent of ARM64 processors not only in
>>> >>>>> the client but server side (Graviton and others) if it is worth that
>>> >>>>> we start to think about having support for this architecture on the
>>> >>>>> python installers and in the docker images. It seems that for the
>>> >>>>> latter it should not be that difficult given that our parent images
>>> >>>>> are already multi-arch.
>>> >>>>>
>>> >>>>> Are there some possible issues or binary/platform specific
>>> >>>>> dependencies that impede us from doing this?
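
The pip failure quoted earlier in the thread comes down to a source build invoking 'cc' on a machine that has no compiler installed. A rough preflight check along those lines might look like the sketch below. The function names are hypothetical and this is not part of Beam or grpcio; it only uses shutil.which and platform.machine from the Python standard library.

```python
import platform
import shutil
from typing import Callable, Iterable, Optional


def find_c_compiler(
    candidates: Iterable[str] = ("cc",),
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate found on PATH, else None.

    The default looks only for 'cc', the name the quoted traceback
    failed to find.
    """
    for name in candidates:
        path = which(name)
        if path:
            return path
    return None


def is_arm64(machine: Optional[str] = None) -> bool:
    """True for the machine names commonly reported on 64-bit ARM."""
    name = platform.machine() if machine is None else machine
    return name.lower() in ("aarch64", "arm64")


def preflight_message(
    machine: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> str:
    """Summarize whether source builds of Python packages can be attempted."""
    compiler = find_c_compiler(which=which)
    if compiler:
        return f"compiler found at {compiler}; source builds can be attempted"
    if is_arm64(machine):
        return "no C compiler on PATH; packages without ARM64 wheels will fail to build"
    return "no C compiler on PATH; only packages with pre-built wheels will install"
```

Running preflight_message() before pip install would flag the missing compiler up front instead of surfacing it as a FileNotFoundError deep inside a package's setup.py. Whether installing build-essential alone is sufficient, as wondered in the thread, is not something this check can confirm.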