kou commented on code in PR #28:
URL: https://github.com/apache/arrow-dotnet/pull/28#discussion_r2312688041

##########
docs/index.md:
##########
@@ -0,0 +1,151 @@
+---
+_layout: landing
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements. See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership. The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied. See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Apache Arrow .NET
+
+An implementation of Arrow targeting .NET.
+
+See our current [feature matrix](https://github.com/apache/arrow/blob/main/docs/source/status.rst)
+for currently available features.
+
+# Implementation

Review Comment:
   ```suggestion
   ## Implementation
   ```

##########
docs/index.md:
##########
@@ -0,0 +1,151 @@
+---
+_layout: landing
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements. See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership. The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied. See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Apache Arrow .NET
+
+An implementation of Arrow targeting .NET.
+
+See our current [feature matrix](https://github.com/apache/arrow/blob/main/docs/source/status.rst)
+for currently available features.
+
+# Implementation
+
+- Arrow specification 1.0.0. (Support for reading 0.11+.)
+- C# 11
+- .NET Standard 2.0, .NET 6.0, .NET 8.0 and .NET Framework 4.6.2
+- Asynchronous I/O
+- Uses modern .NET runtime features such as **Span<T>**, **Memory<T>**, **MemoryManager<T>**, and **System.Buffers** primitives for memory allocation, memory storage, and fast serialization.
+- Uses **Acyclic Visitor Pattern** for array types and arrays to facilitate serialization, record batch traversal, and format growth.
+
+# Known Issues
+
+- Cannot read Arrow files containing tensors.
+- Cannot easily modify allocation strategy without implementing a custom memory pool. All allocations are currently 64-byte aligned and padded to 8-bytes.
+- Default memory allocation strategy uses an over-allocation strategy with pointer fixing, which results in significant memory overhead for small buffers. A buffer that requires a single byte for storage may be backed by an allocation of up to 64-bytes to satisfy alignment requirements.
+- There are currently few builder APIs available for specific array types. Arrays must be built manually with an arrow buffer builder abstraction.
+- FlatBuffer code generation is not included in the build process.
+- Serialization implementation does not perform exhaustive validation checks during deserialization in every scenario.
+- Throws exceptions with vague, inconsistent, or non-localized messages in many situations
+- Throws exceptions that are non-specific to the Arrow implementation in some circumstances where it probably should (eg. does not throw ArrowException exceptions)
+- Lack of code documentation
+- Lack of usage examples
+
+# Usage

Review Comment:
   ```suggestion
   ## Usage
   ```

##########
docs/index.md:
##########
@@ -0,0 +1,151 @@
+---
+_layout: landing
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements. See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership. The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied. See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Apache Arrow .NET
+
+An implementation of Arrow targeting .NET.
+
+See our current [feature matrix](https://github.com/apache/arrow/blob/main/docs/source/status.rst)
+for currently available features.
+
+# Implementation
+
+- Arrow specification 1.0.0. (Support for reading 0.11+.)
+- C# 11
+- .NET Standard 2.0, .NET 6.0, .NET 8.0 and .NET Framework 4.6.2
+- Asynchronous I/O
+- Uses modern .NET runtime features such as **Span<T>**, **Memory<T>**, **MemoryManager<T>**, and **System.Buffers** primitives for memory allocation, memory storage, and fast serialization.
+- Uses **Acyclic Visitor Pattern** for array types and arrays to facilitate serialization, record batch traversal, and format growth.
+
+# Known Issues
+
+- Cannot read Arrow files containing tensors.
+- Cannot easily modify allocation strategy without implementing a custom memory pool. All allocations are currently 64-byte aligned and padded to 8-bytes.
+- Default memory allocation strategy uses an over-allocation strategy with pointer fixing, which results in significant memory overhead for small buffers. A buffer that requires a single byte for storage may be backed by an allocation of up to 64-bytes to satisfy alignment requirements.
+- There are currently few builder APIs available for specific array types. Arrays must be built manually with an arrow buffer builder abstraction.
+- FlatBuffer code generation is not included in the build process.
+- Serialization implementation does not perform exhaustive validation checks during deserialization in every scenario.
+- Throws exceptions with vague, inconsistent, or non-localized messages in many situations
+- Throws exceptions that are non-specific to the Arrow implementation in some circumstances where it probably should (eg. does not throw ArrowException exceptions)
+- Lack of code documentation
+- Lack of usage examples
+
+# Usage
+
+Example demonstrating reading [RecordBatches](xref:Apache.Arrow.RecordBatch) from an Arrow IPC file using an
+[ArrowFileReader](xref:Apache.Arrow.Ipc.ArrowFileReader):
+
+    using System.Diagnostics;
+    using System.IO;
+    using System.Threading.Tasks;
+    using Apache.Arrow;
+    using Apache.Arrow.Ipc;
+
+    public static async Task<RecordBatch> ReadArrowAsync(string filename)
+    {
+        using (var stream = File.OpenRead(filename))
+        using (var reader = new ArrowFileReader(stream))
+        {
+            var recordBatch = await reader.ReadNextRecordBatchAsync();
+            Debug.WriteLine("Read record batch with {0} column(s)", recordBatch.ColumnCount);
+            return recordBatch;
+        }
+    }
+
+
+# Status

Review Comment:
   ```suggestion
   ## Status
   ```
   
   We need to add one more heading level for sub sections.

##########
docs/index.md:
##########
@@ -0,0 +1,151 @@
+---
+_layout: landing
+---
+<!---
+  Licensed to the Apache Software Foundation (ASF) under one
+  or more contributor license agreements. See the NOTICE file
+  distributed with this work for additional information
+  regarding copyright ownership. The ASF licenses this file
+  to you under the Apache License, Version 2.0 (the
+  "License"); you may not use this file except in compliance
+  with the License. You may obtain a copy of the License at
+
+    http://www.apache.org/licenses/LICENSE-2.0
+
+  Unless required by applicable law or agreed to in writing,
+  software distributed under the License is distributed on an
+  "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+  KIND, either express or implied. See the License for the
+  specific language governing permissions and limitations
+  under the License.
+-->
+
+# Apache Arrow .NET
+
+An implementation of Arrow targeting .NET.
+
+See our current [feature matrix](https://github.com/apache/arrow/blob/main/docs/source/status.rst)
+for currently available features.
+
+# Implementation
+
+- Arrow specification 1.0.0. (Support for reading 0.11+.)
+- C# 11
+- .NET Standard 2.0, .NET 6.0, .NET 8.0 and .NET Framework 4.6.2
+- Asynchronous I/O
+- Uses modern .NET runtime features such as **Span<T>**, **Memory<T>**, **MemoryManager<T>**, and **System.Buffers** primitives for memory allocation, memory storage, and fast serialization.
+- Uses **Acyclic Visitor Pattern** for array types and arrays to facilitate serialization, record batch traversal, and format growth.
+
+# Known Issues

Review Comment:
   ```suggestion
   ## Known Issues
   ```

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]
