Tudyx commented on issue #1760:
URL: https://github.com/apache/arrow-rs/issues/1760#issuecomment-1141513104

   Thanks a lot for your response, it helped me a lot to better understand how to 
deal with the `arrow` format.
   
   - Concerning the workload, it can be quite big. In my spare time I'm currently 
working on a port of the `PyTorch` `dataloader` to Rust. I've implemented all 
the base functionality. I want to play with datasets from 
[Hugging Face](https://huggingface.co/datasets), which hosts a ton of `arrow` 
datasets, to do more advanced tests with my library. The typical workflow is to 
process some contiguous rows at a time, so I think slicing is an important 
operation.
   The idea is to offer an option for either loading the dataset into RAM or using 
an `arrow` memory map, depending on the size of the dataset.
   
   - About the data representation, I have a small question that may sound 
stupid. When you say that `Arrow` also supports zero-copy slicing of arrays, 
something which cannot be performed with `Vec`, isn't taking a slice of a vector 
also zero-copy slicing? Like in this example:
   ```rust
   let vector = vec!["foo", "bar", "bax"];
   let slice = &vector[1..2];
   assert_eq!(slice[0], "bar");
   ```
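   
   To make the question concrete, here is a minimal sketch, using only the 
standard library, that checks whether the slice points into the vector's own 
buffer. `is_view_into` is a hypothetical helper name made up for this 
illustration, not something from `arrow` or `std`. It only shows that a 
borrowed slice does not copy the elements; it does not say anything about 
how `Arrow` slicing works internally.
   ```rust
   /// Hypothetical helper: returns true if `slice` lies entirely inside the
   /// memory range occupied by `buffer` (i.e. it is a view, not a copy).
   fn is_view_into<T>(buffer: &[T], slice: &[T]) -> bool {
       let outer = buffer.as_ptr_range();
       let inner = slice.as_ptr_range();
       outer.start <= inner.start && inner.end <= outer.end
   }

   fn main() {
       let vector = vec!["foo", "bar", "bax"];

       // Borrowing a sub-range: no elements are copied.
       let slice = &vector[1..2];
       assert!(is_view_into(&vector, slice));
       assert_eq!(slice[0], "bar");

       // For contrast, `to_vec` allocates a new buffer and copies into it.
       let copy = slice.to_vec();
       assert!(!is_view_into(&vector, &copy));
   }
   ```
   Note that `slice` here borrows from `vector`, so it cannot outlive it.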
   
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
