Re: [I] [C++][IPC] Allow a more fine-grained column selection for reading data with nested fields [arrow]

via GitHub Mon, 04 May 2026 21:12:45 -0700


romankarlstetter commented on issue #37358:
URL: https://github.com/apache/arrow/issues/37358#issuecomment-4376477509


   One use case in which we use struct is the following: 
   
   We use Arrow IPC to store time series data. We also store aggregations of 
these time series; for that, we derive a schema from the original one and store 
the actual aggregations as nested fields. Example:
   
   Schema:
   `| TS | col1 | col2 | ... |`
   
   Derived Schema for storing aggregated data:
   ```
   | TS | COUNT |        col1      |       col2       | ... |
   |    |       | max | min | mean | max | min | mean | ... |
   ```
   
   In this case, `col1` and `col2` are of type `struct` with the nested fields 
`max`, `min` and `mean`.
   
   For certain queries, we're only interested in, say, the max aggregations, so 
we only want to load `TS`, `col1.max` and `col2.max`. The resulting schema of 
the returned record batches would then be the following:
   
   ```
   | TS | col1 | col2 |
   |    | max  | max  |
   ```
   
   Currently, only "full" struct columns can be specified via 
`IpcReadOptions::included_fields`.
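   
   To make the requested selection concrete, here is a minimal, Arrow-free 
sketch of the projection described above. Everything in it is hypothetical and 
for illustration only: `Field`, `Project`, `Describe`, `ExampleSchema` and the 
dotted path syntax (`col1.max`) are not part of the Arrow API. As far as I can 
tell, the existing `included_fields` option is a list of top-level field 
indices, which is why it cannot express this today.
   
   ```cpp
   #include <cassert>
   #include <cstddef>
   #include <string>
   #include <vector>

   // Hypothetical stand-in for a schema field; this is NOT arrow::Field.
   struct Field {
     std::string name;
     std::vector<Field> children;  // empty for non-struct fields
   };

   Field Leaf(const std::string& name) { return Field{name, {}}; }

   Field StructOf(const std::string& name, const std::vector<std::string>& kids) {
     Field f{name, {}};
     for (const auto& k : kids) f.children.push_back(Leaf(k));
     return f;
   }

   // The derived schema from the example: | TS | COUNT | col1{...} | col2{...} |
   std::vector<Field> ExampleSchema() {
     const std::vector<std::string> aggs = {"max", "min", "mean"};
     return {Leaf("TS"), Leaf("COUNT"), StructOf("col1", aggs), StructOf("col2", aggs)};
   }

   // Keeps a field entirely if its dotted path is listed, or keeps only the
   // selected children if some listed path goes through it.
   std::vector<Field> Project(const std::vector<Field>& fields,
                              const std::vector<std::string>& paths,
                              const std::string& prefix = "") {
     std::vector<Field> out;
     for (const Field& f : fields) {
       const std::string full = prefix.empty() ? f.name : prefix + "." + f.name;
       bool whole = false;
       bool partial = false;
       for (const std::string& p : paths) {
         if (p == full) {
           whole = true;
         } else if (p.compare(0, full.size() + 1, full + ".") == 0) {
           partial = true;
         }
       }
       if (whole) {
         out.push_back(f);
       } else if (partial) {
         std::vector<Field> kids = Project(f.children, paths, full);
         // Drop the struct if none of the requested children exist.
         if (!kids.empty()) out.push_back(Field{f.name, kids});
       }
     }
     return out;
   }

   // Renders a schema as e.g. "TS,col1{max},col2{max}" for easy inspection.
   std::string Describe(const std::vector<Field>& fields) {
     std::string s;
     for (std::size_t i = 0; i < fields.size(); ++i) {
       if (i > 0) s += ",";
       s += fields[i].name;
       if (!fields[i].children.empty()) {
         s += "{" + Describe(fields[i].children) + "}";
       }
     }
     return s;
   }
   ```
   
   With this sketch, projecting `ExampleSchema()` onto `TS`, `col1.max` and 
`col2.max` keeps `col1` and `col2` as structs but with only their `max` child, 
whereas selecting just `col1` keeps the full struct, which corresponds to what 
a top-level selection gives today.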


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

