jorgecarleitao commented on a change in pull request #8630:
URL: https://github.com/apache/arrow/pull/8630#discussion_r521392544



##########
File path: rust/arrow/src/array/transform/mod.rs
##########
@@ -0,0 +1,532 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+use std::{io::Write, mem::size_of, sync::Arc};
+
+use crate::{buffer::MutableBuffer, datatypes::DataType, util::bit_util};
+
+use super::{ArrayData, ArrayDataRef};
+
+mod boolean;
+mod list;
+mod primitive;
+mod utils;
+mod variable_size;
+
+type ExtendNullBits<'a> = Box<Fn(&mut _MutableArrayData, usize, usize) -> () + 'a>;
+// function that extends `[start..start+len]` to the mutable array.
+// this is dynamic because different data_types influence how buffers and childs are extended.
+type Extend<'a> = Box<Fn(&mut _MutableArrayData, usize, usize) -> () + 'a>;

Review comment:
       Unfortunately, that would only work for primitive buffers. For string 
arrays, extending an array data requires a complex operation that is 
fundamentally different from extending a single buffer. For nested types, the 
operation is recursive on the child data.
   
   This is fundamentally a dynamic operation: we only know what to do once we 
see which `DataType` the user wants to build an `ArrayData` from. The 
`Builders` use a similar approach: they use `dyn Builder` for the same 
reason.
   
   The builders have an extra complexity because their input type is not 
uniform: their API supports extending from a `&[T]` (e.g. `i32` or `i16`), 
which is why they need to be implemented via a dynamic type, each 
implementation of which has methods for its own type. In `MutableArrayData`, 
the only "thing" that we extend from is an `ArrayData`, which has a uniform 
(Rust) type but requires different behavior depending on its `data_type`; 
hence a function pointer per data type.
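
To illustrate the "function pointer per data type" idea, here is a minimal, self-contained sketch. It is not the code in this PR: `SimpleArrayData`, `SimpleMutableData`, and the two-variant `DataType` are hypothetical stand-ins for the real `ArrayData`, `_MutableArrayData`, and `DataType`, and null bitmaps and nested types are ignored. The shape of `build_extend` (one closure chosen at runtime from the data type) is the only part meant to mirror the discussion above.

```rust
// Hypothetical, heavily simplified stand-ins for the real Arrow types.
#[derive(Debug)]
enum DataType {
    Int32,
    Utf8,
}

// One uniform Rust type for the source, whatever its logical data type.
struct SimpleArrayData {
    data_type: DataType,
    offsets: Vec<i32>, // only meaningful for Utf8
    values: Vec<u8>,   // raw value bytes
}

#[derive(Default)]
struct SimpleMutableData {
    offsets: Vec<i32>,
    values: Vec<u8>,
}

// Same shape as `Extend` in the diff: extends `[start..start+len]`.
type Extend<'a> = Box<dyn Fn(&mut SimpleMutableData, usize, usize) + 'a>;

// Picks the extend function once, based on the source's data type.
fn build_extend(array: &SimpleArrayData) -> Extend<'_> {
    match array.data_type {
        // Primitive: a plain byte-range copy of a single buffer.
        DataType::Int32 => Box::new(
            move |mutable: &mut SimpleMutableData, start: usize, len: usize| {
                let size = std::mem::size_of::<i32>();
                let bytes = &array.values[start * size..(start + len) * size];
                mutable.values.extend_from_slice(bytes);
            },
        ),
        // Strings: offsets must be rebased onto the destination, and the
        // value bytes come from a range derived from the source offsets.
        DataType::Utf8 => Box::new(
            move |mutable: &mut SimpleMutableData, start: usize, len: usize| {
                if mutable.offsets.is_empty() {
                    mutable.offsets.push(0);
                }
                let mut last = *mutable.offsets.last().unwrap();
                for i in start..start + len {
                    last += array.offsets[i + 1] - array.offsets[i];
                    mutable.offsets.push(last);
                }
                let begin = array.offsets[start] as usize;
                let end = array.offsets[start + len] as usize;
                mutable.values.extend_from_slice(&array.values[begin..end]);
            },
        ),
    }
}
```

Both closures have the same signature, so the caller can hold them in one `Vec<Extend>` and call them without knowing which data type is behind each. A statically typed helper over a single buffer would cover only the first arm.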





