I am using C++/Arrow to filter large Parquet files, and this works. The 
current proof-of-concept implementation relies on nested for loops, which I 
would like to replace with built-in filter/take functions. Alternatively, I 
would welcome recommendations on how to produce arrays of indices or booleans 
that can be used to filter rows (for example with take).

The input (data) array/column type is MapArray[key:String, 
value:StructArray[id:String, …]] 

The input filter has the form {filter_key: "some string", filter_ids: ["aaa", 
"bee", "see", ..]}
  - Here filter_key and filter_ids are matched against the contents of the 
    input MapArray

The output I’m looking for is either an array of booleans or an array of 
indices into the input array, marking the rows that match the input filter.
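
To make the request concrete, here is a minimal sketch of the matching logic in 
plain C++, without Arrow. It assumes a row matches when at least one of its map 
entries has a key equal to filter_key and an id contained in filter_ids; that 
rule is an assumption for illustration. FlatMapColumn, MatchRows and 
MaskToIndices are hypothetical names. The struct mimics how a map column is 
laid out: one offsets vector plus flattened keys and ids, which in Arrow would 
roughly correspond to the map's offsets, MapArray::keys(), and the id child of 
MapArray::items().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_set>
#include <vector>

// Hypothetical stand-in for the flattened layout of a
// Map<String, Struct<id: String, ...>> column (not an Arrow type).
struct FlatMapColumn {
  // offsets.size() == rows + 1; row r owns entries [offsets[r], offsets[r + 1]).
  std::vector<std::int32_t> offsets;
  std::vector<std::string> keys;  // one key per map entry
  std::vector<std::string> ids;   // the struct's "id" field, one per map entry
};

// Returns one flag per row. Assumed rule: a row matches if any of its
// entries has key == filter_key and id contained in filter_ids.
std::vector<bool> MatchRows(const FlatMapColumn& col,
                            const std::string& filter_key,
                            const std::vector<std::string>& filter_ids) {
  assert(!col.offsets.empty());
  assert(col.keys.size() == col.ids.size());

  // Hash set so each id lookup is O(1) on average.
  const std::unordered_set<std::string> wanted(filter_ids.begin(),
                                               filter_ids.end());
  const std::size_t num_rows = col.offsets.size() - 1;
  std::vector<bool> mask(num_rows, false);

  for (std::size_t row = 0; row < num_rows; ++row) {
    const auto begin = static_cast<std::size_t>(col.offsets[row]);
    const auto end = static_cast<std::size_t>(col.offsets[row + 1]);
    for (std::size_t i = begin; i < end; ++i) {
      if (col.keys[i] == filter_key && wanted.count(col.ids[i]) > 0) {
        mask[row] = true;
        break;  // one matching entry is enough for this row
      }
    }
  }
  return mask;
}

// Converts the boolean mask into the row indices that are set.
std::vector<std::int64_t> MaskToIndices(const std::vector<bool>& mask) {
  std::vector<std::int64_t> indices;
  for (std::size_t row = 0; row < mask.size(); ++row) {
    if (mask[row]) indices.push_back(static_cast<std::int64_t>(row));
  }
  return indices;
}
```

Each map entry is visited at most once, so the cost is linear in the number of 
entries. A boolean mask like this is the shape of input that 
arrow::compute::Filter expects, and the index list is the shape that 
arrow::compute::Take expects, once converted to Arrow arrays.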

Thank you
