zeroshade commented on code in PR #37785:
URL: https://github.com/apache/arrow/pull/37785#discussion_r1342924432


##########
go/parquet/internal/utils/_lib/README.md:
##########
@@ -0,0 +1,148 @@
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements.  See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership.  The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License.  You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied.  See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# SIMD Bit Packing Implementation
+
+Go doesn't have any SIMD intrinsics so for some low-level optimizations we can 
+leverage auto-vectorization and the fact that Go lets you specify the body of a
+function in assembly to benefit from SIMD.
+
+In here we have implementations using SIMD intrinsics for AVX (amd64) and NEON (arm64).
+
+## Generating the Go assembly
+
+c2goasm and asm2plan9s are two projects which can be used in conjunction to generate
+compatible Go assembly from C assembly.
+
+First the tools need to be installed:
+
+```bash
+go install github.com/klauspost/asmfmt/cmd/asmfmt@latest
+go install github.com/minio/asm2plan9s@latest
+go install github.com/minio/c2goasm@latest
+```
+
+### Generating for amd64
+
+The Makefile in the directory above will work for amd64. `make assembly` will compile
+the c sources and then call `c2goasm` to generate the Go assembly for amd64 
+architectures.
+
+### Generating for arm64
+
+Unfortunately there are some caveats for arm64. c2goasm / asm2plan9s doesn't fully
+support arm64 correctly. However, proper assembly can be created with some slight
+manipulation of the result.
+
+The Makefile has the NEON flags for compiling the assembly by using 
+`make _lib/bit_packing_neon.s` and `make _lib/unpack_bool_neon.s` to generate the
+raw assembly sources. 
+
+Before calling `c2goasm` there's a few things that need to be modified in the assembly:

Review Comment:
   I did it with a series of regexes, but until I write some automation, yes, they are done by hand.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
