paleolimbot commented on PR #813:
URL: https://github.com/apache/sedona-db/pull/813#issuecomment-4468735315

   Possibly? I know it seems like a lot of comments but what I'm getting at is:
   
   ```rust
   let mut scratch = Vec::new();
   for raster in rasters {
     let data_slice = raster.band(0)?.contiguous_data(&mut scratch)?;
     do_stuff(data_slice);
   }
   ```
   
   Or on the day we need to track memory:
   
   ```rust
   let mut max_capacity = 0;
   for raster in rasters {
     max_capacity = max_capacity.max(raster.band(0)?.contiguous_data_alloc_size());
   }
   
   if max_capacity > config_options.raster.max_scratch_alloc {
     // error
   }
   
   // If this ends up being a lot, frequently, we may also want to use a
   // MemoryReservation to track it.
   // Probably this helps contiguous_data be faster since in theory there is
   // only one heap allocation.
   // You could probably use some of the unsafe modifiers to also ensure that
   // there are no bounds checks in your materializer slowing things down.
   let mut scratch = Vec::with_capacity(max_capacity);
   for raster in rasters {
     let data_slice = raster.band(0)?.contiguous_data(&mut scratch)?;
     do_stuff(data_slice);
   }
   ```
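
   To make the pattern above concrete, here is a minimal self-contained sketch. The `Band` struct and its row-based storage are hypothetical stand-ins for illustration only (not the actual raster types in this PR); only the method names mirror the snippets above, and error handling is omitted.

   ```rust
   /// Hypothetical band whose rows are stored separately (illustration only).
   struct Band {
       rows: Vec<Vec<u8>>,
   }

   impl Band {
       /// Number of bytes a contiguous copy of this band would need.
       fn contiguous_data_alloc_size(&self) -> usize {
           self.rows.iter().map(|row| row.len()).sum()
       }

       /// Copy the rows into `scratch`, reusing its allocation, and return
       /// the contiguous bytes as a slice borrowed from `scratch`.
       fn contiguous_data<'a>(&self, scratch: &'a mut Vec<u8>) -> &'a [u8] {
           scratch.clear(); // keeps the capacity, drops the old contents
           for row in &self.rows {
               scratch.extend_from_slice(row);
           }
           scratch.as_slice()
       }
   }

   /// Sum every byte of every band using a single scratch buffer.
   /// Returns the total and whether the scratch buffer had to reallocate.
   fn sum_all_bands(bands: &[Band]) -> (u64, bool) {
       // First pass: find the largest buffer any band will need.
       let max_capacity = bands
           .iter()
           .map(Band::contiguous_data_alloc_size)
           .max()
           .unwrap_or(0);

       // One heap allocation up front, reused for every band.
       let mut scratch = Vec::with_capacity(max_capacity);
       let initial_capacity = scratch.capacity();

       let mut total = 0u64;
       for band in bands {
           let data_slice = band.contiguous_data(&mut scratch);
           total += data_slice.iter().map(|&b| u64::from(b)).sum::<u64>();
       }
       (total, scratch.capacity() != initial_capacity)
   }
   ```

   Because the returned slice borrows from `scratch`, the borrow checker makes sure it is dropped before the next iteration overwrites the buffer.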
   
   I'm not sure exactly how loading works yet but you might want something 
similar (with the hiccup that you need some number of async workers doing IO so 
you also need the same number of scratch buffers).
   
   (I have the same comment for pretty much everything that heap allocates in a 
loop, which is why, for example, our geometry writers write WKB directly into 
the Arrow output instead of returning `Vec<u8>`.)
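
   As a rough sketch of that last point: the writer below appends into a caller-provided buffer instead of returning a fresh `Vec<u8>` per geometry. The function names are hypothetical, and a plain `Vec<u8>` plus an offsets vector stands in for an Arrow binary builder here; this is not the actual writer code.

   ```rust
   /// Append the little-endian WKB encoding of a 2D point to `out`.
   fn write_point_wkb(x: f64, y: f64, out: &mut Vec<u8>) {
       out.push(1); // byte order marker: 1 = little-endian
       out.extend_from_slice(&1u32.to_le_bytes()); // geometry type 1 = Point
       out.extend_from_slice(&x.to_le_bytes());
       out.extend_from_slice(&y.to_le_bytes());
   }

   /// Encode all points into one values buffer plus offsets (the same shape
   /// as an Arrow binary array), with no per-item heap allocation.
   fn encode_points(points: &[(f64, f64)]) -> (Vec<u8>, Vec<i32>) {
       // A WKB point is 1 + 4 + 8 + 8 = 21 bytes.
       let mut values = Vec::with_capacity(points.len() * 21);
       let mut offsets = Vec::with_capacity(points.len() + 1);
       offsets.push(0);
       for &(x, y) in points {
           write_point_wkb(x, y, &mut values);
           offsets.push(values.len() as i32);
       }
       (values, offsets)
   }
   ```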

