alamb commented on a change in pull request #1782:
URL: https://github.com/apache/arrow-datafusion/pull/1782#discussion_r803685458



##########
File path: datafusion/src/row/bitmap.rs
##########
@@ -0,0 +1,194 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! General utilities for null bit section handling based on 
[arrow::util::bit_util]

Review comment:
       What do you think about directly using bit util?
   
   https://github.com/apache/arrow-rs/blob/master/arrow/src/util/bit_util.rs
   
   I feel the proliferation of bit manipulation code is somewhat repetitive

##########
File path: datafusion/src/lib.rs
##########
@@ -223,6 +223,8 @@ pub use arrow;
 pub use parquet;
 
 pub(crate) mod field_util;
+#[allow(dead_code)]

Review comment:
       Rather than `dead_code` how about we add a `feat(experimental)` that is 
only enabled when we are iterating on this code?
   
   That way we will not impose any compile time overhead on other users of 
DataFusion who won't get benefit from this code (yet). I am thinking to try and 
avoid undoing the progress that is being made on #348 
   
   
   
   

##########
File path: datafusion/src/row/mod.rs
##########
@@ -0,0 +1,334 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! An implementation of Row backed by raw bytes
+//!
+//! Each tuple consists of up to three parts: [null bit set] [values] [var 
length data]
+//!
+//! The null bit set is used for null tracking and is aligned to 1-byte. It 
stores
+//! one bit per field.
+//!
+//! In the region of the values, we store the fields in the order they are 
defined in the schema.

Review comment:
       One classic database technique to pack such data into a single block 
(rather than a separate variable length area) is to preallocate the page (e.g. 
32K or something) and then write rows into the front of the page, but filling 
the variable length area from the *back*. 

##########
File path: datafusion/src/row/mod.rs
##########
@@ -0,0 +1,334 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! An implementation of Row backed by raw bytes
+//!
+//! Each tuple consists of up to three parts: [null bit set] [values] [var 
length data]
+//!
+//! The null bit set is used for null tracking and is aligned to 1-byte. It 
stores
+//! one bit per field.
+//!
+//! In the region of the values, we store the fields in the order they are 
defined in the schema.

Review comment:
       This describes a variable length tuple -- so if you pack a bunch of 
`Row`s together in memory it will not be possible to find some arbitrary 
location quickly
   
   For example, finding where `Row3 `starts in the following picture needs to 
scan  all columns of Row1 and Row2
   
   ```
   ┌─────────────────────────────────┐
   │              Row1               │
   ├──────────┬──────────────────────┤
   │   Row2   │         Row3         │
   ├──────────┴─┬────────────────────┤
   │    Row4    │        Row5        │
   └────────────┴────────────────────┘
   ```
   
   
   The benefit of this strategy is that tuple construction will be very fast 
and memory usage optimal
   
   There are other strategies that have a fixed width tuples, as discussed on 
https://github.com/apache/arrow-datafusion/issues/1708 that have benefits 
(though are likely not as memory optimal)
   
   I think getting optimal performance will come from being able to vectorize 
many operations for which fixed sized tuples are compelling:
   
   ```
   ┌───────────┬──┐               ┌─────────────┐
   │     Row1  │  ├──────┐        │             │
   ├───────────┼──┤      │ ┌──────┼▶            │
   │     Row2  │  │──────┼┐├──────┼─▶           │
   ├───────────┼──┤      │││      │             │
   │     Row3  │  │──────┼┼┤      │  variable   │
   ├───────────┼──┤      └┼┼─────▶│ length area │
   │     Row4  │  │───────┼┘      │             │
   ├───────────┼──┤       └──────▶│             │
   │     Row5  │  │───────────┐   │             │
   └───────────┴──┘           └───┼───────▶     │
                                  └─────────────┘
   ```
   
   Maybe I can find some time this weekend to play around with some ideas

##########
File path: datafusion/src/row/mod.rs
##########
@@ -0,0 +1,334 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! An implementation of Row backed by raw bytes
+//!
+//! Each tuple consists of up to three parts: [null bit set] [values] [var 
length data]
+//!
+//! The null bit set is used for null tracking and is aligned to 1-byte. It 
stores
+//! one bit per field.
+//!
+//! In the region of the values, we store the fields in the order they are 
defined in the schema.
+//! - For fixed-length, sequential access fields, we store them directly.
+//!       E.g., 4 bytes for int and 1 byte for bool.
+//! - For fixed-length, update often fields, we store one 8-byte word per 
field.

Review comment:
       I don't really see a notion of "update often" appearing in this code. 
Maybe it is future work

##########
File path: datafusion/src/row/reader.rs
##########
@@ -0,0 +1,441 @@
+// Licensed to the Apache Software Foundation (ASF) under one
+// or more contributor license agreements.  See the NOTICE file
+// distributed with this work for additional information
+// regarding copyright ownership.  The ASF licenses this file
+// to you under the Apache License, Version 2.0 (the
+// "License"); you may not use this file except in compliance
+// with the License.  You may obtain a copy of the License at
+//
+//   http://www.apache.org/licenses/LICENSE-2.0
+//
+// Unless required by applicable law or agreed to in writing,
+// software distributed under the License is distributed on an
+// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+// KIND, either express or implied.  See the License for the
+// specific language governing permissions and limitations
+// under the License.
+
+//! Accessing row from raw bytes

Review comment:
       Here is one for consieration:
   
   ```text
                                                                                
                           
    Row Layout                                                                  
                           
                                                                                
                           
   ┌────────────────┬──────────────────────────┬───────────────────────┐        
┌───────────────────────┐  
   │Validity Bitmask│    Fixed Width Field     │ Variable Width Field  │   ...  
│     vardata area      │  
   │ (byte aligned) │   (native type width)    │(len + vardata offset) │        
│   (variable length)   │  
   └────────────────┴──────────────────────────┴───────────────────────┘        
└───────────────────────┘  
                                                                                
                           
                                                                                
                           
                                                                                
                           
    For example, given the schema (Int8, Float32, Utf8, Utf8)                   
                           
                                                                                
                           
    Encoding the tuple (1, NULL, "FooBar", "baz")                               
                           
                                                                                
                           
    Requires 35 bytes as shown                                                  
                           
   
┌────────┬────────┬──────────────┬──────────────────────┬──────────────────────┬───────────────────────┐
   │0b000110│  0x01  │  0x00000000  │0x00000000  0x00000006│0x00000006  
0x00000003│       FooBarbaz       │
   
└────────┴────────┴──────────────┴──────────────────────┴──────────────────────┴───────────────────────┘
   0        1         2             10                     18                   
  26                     35
                                                                                
                           
    Validity    Int8  Float32 Field       Utf8 Field 1         Utf8 Field 2     
     Variable length       
      Mask     Field    (4 bytes)           Offset: 0            Offset: 6      
          area             
    (1 byte)  (1 byte)                       Size: 6              Size: 3       
        (9 bytes)          
                                            (8 bytes)            (8 bytes)      
                           
   ```
   
   Also attaching the monopic file in case anyone finds that useful: 
   
[drawing.zip](https://github.com/apache/arrow-datafusion/files/8041744/drawing.zip)
   




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


Reply via email to