Jefffrey commented on code in PR #9000:
URL: https://github.com/apache/arrow-rs/pull/9000#discussion_r2643573644


##########
arrow-row/src/lib.rs:
##########
@@ -1951,11 +1956,26 @@ unsafe fn decode_column(
 
                         let child_array =
                             unsafe { converter.convert_raw(&mut sparse_data, validate_utf8) }?;
+
+                        // track bytes consumed for rows that belong to this field
+                        for (row_idx, child_row) in field_rows.iter() {
+                            let remaining_len = sparse_data[*row_idx].len();
+                            bytes_consumed[*row_idx] = 1 + child_row.len() - remaining_len;
+                        }

Review Comment:
   ```suggestion
                           // ensure we advance past consumed bytes in rows
                           for (row_idx, child_row) in field_rows.iter() {
                               let remaining_len = sparse_data[*row_idx].len();
                               let consumed_length = 1 + child_row.len() - remaining_len;
                               rows[*row_idx] = &rows[*row_idx][consumed_length..];
                           }
   ```
   
   Thoughts on inlining it like this? That would remove the need for a separate `bytes_consumed` vec.
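
For illustration only, here is a standalone sketch of the inlined approach outside of arrow-rs. The helper name `advance_rows`, its signature, and the `remaining` parameter (standing in for `sparse_data` after decoding) are hypothetical, and the leading `1` is taken from the diff and treated as a one-byte prefix, which is an assumption.

```rust
/// Hypothetical helper (not part of arrow-rs): advance each row slice past the
/// bytes its child decoder consumed, without an intermediate `bytes_consumed` vec.
///
/// `field_rows` pairs a row index with the child bytes handed to the decoder.
/// `remaining` holds what is left of each child slice after decoding and is
/// indexed by row index, mirroring `sparse_data` in the diff.
fn advance_rows<'a>(
    rows: &mut [&'a [u8]],
    field_rows: &[(usize, &'a [u8])],
    remaining: &[&'a [u8]],
) {
    for (row_idx, child_row) in field_rows.iter() {
        let remaining_len = remaining[*row_idx].len();
        // Assumed layout: one prefix byte, then the child bytes that were decoded.
        let consumed_length = 1 + child_row.len() - remaining_len;
        // Re-slice in place so later columns start after the consumed bytes.
        rows[*row_idx] = &rows[*row_idx][consumed_length..];
    }
}
```

The trade-off is that `rows` is mutated inside the per-field loop rather than once at the end, so this only works if nothing later in the loop needs the original row start.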



##########
arrow-row/src/lib.rs:
##########
@@ -1930,6 +1929,12 @@ unsafe fn decode_column(
                         let child_array =
                             unsafe { converter.convert_raw(&mut child_data, validate_utf8) }?;
 
+                        // track bytes consumed by comparing original and remaining lengths
+                        for (i, (row_idx, child_row)) in field_rows.iter().enumerate() {
+                            let remaining_len = child_data[i].len();
+                            bytes_consumed[*row_idx] = 1 + child_row.len() - remaining_len;
+                        }

Review Comment:
   ```suggestion
                           for ((row_idx, original_bytes), remaining_bytes) in
                               field_rows.iter().zip(child_data)
                           {
                               bytes_consumed[*row_idx] =
                                   1 + original_bytes.len() - remaining_bytes.len();
                           }
   ```
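
As a rough standalone sketch of the `zip` form (not arrow-rs code): the function name `bytes_consumed_per_row` and the `num_rows` parameter are hypothetical, `child_data` is borrowed here rather than consumed, and the leading `1` is again assumed to be a one-byte prefix.

```rust
/// Hypothetical helper (not part of arrow-rs): compute bytes consumed per row
/// by pairing each original child slice with what remains after decoding.
///
/// `field_rows` and `child_data` are assumed to be in the same order, so
/// `zip` replaces the manual `enumerate` index into `child_data`.
fn bytes_consumed_per_row(
    num_rows: usize,
    field_rows: &[(usize, &[u8])],
    child_data: &[&[u8]],
) -> Vec<usize> {
    // Rows that do not belong to this field stay at 0.
    let mut bytes_consumed = vec![0usize; num_rows];
    for ((row_idx, original_bytes), remaining_bytes) in field_rows.iter().zip(child_data) {
        // Assumed layout: one prefix byte, then the decoded part of the child row.
        bytes_consumed[*row_idx] = 1 + original_bytes.len() - remaining_bytes.len();
    }
    bytes_consumed
}
```

Note that `zip` stops at the shorter iterator, so a length mismatch between `field_rows` and `child_data` is silently ignored instead of panicking on an out-of-bounds index.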



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.