alamb commented on code in PR #6643:
URL: https://github.com/apache/arrow-rs/pull/6643#discussion_r1889141293


##########
arrow-json/src/reader/mod.rs:
##########
@@ -170,12 +170,30 @@ mod struct_array;
 mod tape;
 mod timestamp_array;
 
+/// Specifies what is considered valid JSON when parsing StructArrays.

Review Comment:
   Do you have a link to documentation / prior art for this type of JSON 
encoding of structs? Maybe to trino / presto docs that describe it in more full 
detail that we can add to this doc?



##########
arrow-json/src/reader/struct_array.rs:
##########
@@ -71,42 +75,70 @@ impl ArrayDecoder for StructArrayDecoder {
             .then(|| BooleanBufferBuilder::new(pos.len()));
 
         for (row, p) in pos.iter().enumerate() {

Review Comment:
   Performance wise, this code now has another conditional in the hot loop 
(each row)
   
   ```rust
           for (row, p) in pos.iter().enumerate() {
            if self.struct_parse_mode { 
              ..
           }
         }
   ```
   
   I suspect the code would be fast (and easier to understand / cleaner) if you 
hoisted the check, something like
   ```rust
            if self.struct_parse_mode { 
               for (row, p) in pos.iter().enumerate() {
                ..
               }
             else {
                      for (row, p) in pos.iter().enumerate() {
                ..
               }
         }
   ```



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@arrow.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

Reply via email to