pitrou commented on code in PR #37785:
URL: https://github.com/apache/arrow/pull/37785#discussion_r1340285870
##########
go/parquet/internal/utils/_lib/README.md:
##########
@@ -0,0 +1,148 @@
+<!---
+ Licensed to the Apache Software Foundation (ASF) under one
+ or more contributor license agreements. See the NOTICE file
+ distributed with this work for additional information
+ regarding copyright ownership. The ASF licenses this file
+ to you under the Apache License, Version 2.0 (the
+ "License"); you may not use this file except in compliance
+ with the License. You may obtain a copy of the License at
+
+ http://www.apache.org/licenses/LICENSE-2.0
+
+ Unless required by applicable law or agreed to in writing,
+ software distributed under the License is distributed on an
+ "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ KIND, either express or implied. See the License for the
+ specific language governing permissions and limitations
+ under the License.
+-->
+
+# SIMD Bit Packing Implementation
+
+Go doesn't have any SIMD intrinsics so for some low-level optimizations we can
+leverage auto-vectorization and the fact that Go lets you specify the body of a
+function in assembly to benefit from SIMD.
+
+In here we have implementations using SIMD intrinsics for AVX (amd64) and NEON (arm64).
+
+## Generating the Go assembly
+
+c2goasm and asm2plan9s are two projects which can be used in conjunction to generate
+compatible Go assembly from C assembly.
+
+First the tools need to be installed:
+
+```bash
+go install github.com/klauspost/asmfmt/cmd/asmfmt@latest
+go install github.com/minio/asm2plan9s@latest
+go install github.com/minio/c2goasm@latest
+```
+
+### Generating for amd64
+
+The Makefile in the directory above will work for amd64. `make assembly` will compile
+the c sources and then call `c2goasm` to generate the Go assembly for amd64
+architectures.
+
+### Generating for arm64
+
+Unfortunately there are some caveats for arm64. c2goasm / asm2plan9s doesn't fully
+support arm64 correctly.
However, proper assembly can be created with some slight
+manipulation of the result.
+
+The Makefile has the NEON flags for compiling the assembly by using
+`make _lib/bit_packing_neon.s` and `make _lib/unpack_bool_neon.s` to generate the
+raw assembly sources.
+
+Before calling `c2goasm` there's a few things that need to be modified in the assembly:
+
+* x86-64 assembly uses `#` for comments while arm64 assembly uses `//` for comments.
+ `c2goasm` assumes `#` for comments and splits lines based on them. For most lines
+ this isn't an issue, but for any constants this is important and will need to have
+ the comment character converted from `//` to `#`.
+* A `word` for x86-64 is 16 bits, a `double` word is 32 bits, and a `quad` is 64 bits.
+ For arm64, a `word` is 32 bits. This means that constants in the assembly need to be
+ modified. `c2goasm` and `asm2plan9s` expect the x86-64 meaning for the sizes, so
+ usage of `.word ######` needs to be converted to `.long #####` before running
+ `c2goasm`
+ * Because of this change in bits, `MOVQ` instructions will also be converted to

##########
Review Comment:
```suggestion
* Because of this change in bits, `MOVQ` instructions will also be converted to
```
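
The two manual rewrites the README describes for the raw arm64 assembly (switching the comment marker from `//` to `#` on constant lines, and turning `.word` into `.long`) could probably be scripted. Below is a minimal sketch of that idea, not something taken from this PR: the function name `rewriteLine` is hypothetical, and the list of directives treated as "constant lines" is an assumption that would need checking against the actual compiler output.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"regexp"
	"strings"
)

// dataDirective matches lines that begin with a constant directive.
// ASSUMPTION: this directive list is illustrative, not exhaustive.
var dataDirective = regexp.MustCompile(`^\s*\.(byte|short|hword|word|long|quad)\s`)

// wordDirective captures the whitespace around a leading `.word` so it
// can be preserved when the directive is renamed.
var wordDirective = regexp.MustCompile(`^(\s*)\.word(\s)`)

// rewriteLine (hypothetical helper) applies the two README rewrites to a
// single line of arm64 assembly. Lines that are not constants are
// returned unchanged.
func rewriteLine(line string) string {
	if !dataDirective.MatchString(line) {
		return line
	}
	// An arm64 `.word` is 32 bits, which is `.long` in x86-64 terms.
	line = wordDirective.ReplaceAllString(line, "${1}.long${2}")
	// Use the `#` comment marker that c2goasm splits on.
	if i := strings.Index(line, "//"); i >= 0 {
		line = line[:i] + "#" + line[i+2:]
	}
	return line
}

func main() {
	// Filter: read raw assembly on stdin, write the adjusted form to stdout.
	sc := bufio.NewScanner(os.Stdin)
	for sc.Scan() {
		fmt.Println(rewriteLine(sc.Text()))
	}
	if err := sc.Err(); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
}
```

Used as a filter, this would sit between the `make _lib/bit_packing_neon.s` step and the `c2goasm` call. It deliberately leaves instruction lines alone, since the README says the comment marker only matters for constants.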
