suremarc commented on issue #3922:
URL: https://github.com/apache/arrow-rs/issues/3922#issuecomment-1482076248

   Thank you for the speedy reply. It sounds like this feature is not a good 
fit for Parquet, unfortunately. 
   
   > That being said, it is possible to just decode the last n rows using 
[`RowSelection`](https://docs.rs/parquet/latest/parquet/arrow/arrow_reader/struct.RowSelection.html).
 This means if the DataFusion query optimiser could be taught to push this 
down, it should work without requiring any additional functionality in the 
parquet crate.
   
   This makes sense, but I think I should have been more specific — the queries 
I was testing also involved filtering, e.g. `SELECT * FROM table WHERE 
attribute = 'value' ORDER BY field DESC LIMIT n`. Unless I am misunderstanding, 
I do not think it is possible to select the last N rows subject to a predicate 
with a `RowSelection`. 
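   
   A minimal sketch of the distinction, in plain Rust with no dependency on the 
parquet crate. The `Run` enum, `last_n`, and `last_n_matching` are hypothetical 
stand-ins made up for illustration (they are not the real `RowSelection` API), 
and the rows are assumed to already be ordered by the sort field:

```rust
/// A run of rows to skip or select. Hypothetical stand-in loosely modelled on
/// the skip/select idea behind `RowSelection`; not the parquet crate's type.
#[derive(Debug, PartialEq)]
enum Run {
    Skip(usize),
    Select(usize),
}

/// Last `n` rows with no predicate: the selection depends only on the row
/// count, so it can be built without reading any data.
fn last_n(total_rows: usize, n: usize) -> Vec<Run> {
    let n = n.min(total_rows);
    vec![Run::Skip(total_rows - n), Run::Select(n)]
}

/// Last `n` rows matching a predicate: the row indices to keep are only
/// known after the predicate has been evaluated against the column values.
fn last_n_matching(values: &[&str], wanted: &str, n: usize) -> Vec<usize> {
    let mut hits: Vec<usize> = values
        .iter()
        .enumerate()
        .rev() // walk from the end of the (already sorted) rows
        .filter(|(_, v)| **v == wanted)
        .take(n)
        .map(|(i, _)| i)
        .collect();
    hits.reverse(); // return indices in ascending order
    hits
}

fn main() {
    // Toy `attribute` column; the two trailing rows do not match 'a'.
    let attribute = ["a", "b", "a", "b", "b", "a", "b", "b"];
    println!("{:?}", last_n(attribute.len(), 2));
    println!("{:?}", last_n_matching(&attribute, "a", 2));
}
```

   In this toy data the last two rows overall are rows 6 and 7, while the last 
two rows matching `'a'` are rows 2 and 5, so a selection computed from the row 
count alone would pick the wrong rows.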
   
   I am starting to think that Parquet and DataFusion may not be ideal for my 
company's use case. We are interested in their analytical capabilities, but our 
existing products support queries of the form described above (essentially, 
filter + limit + sort ascending/descending on time only). Do you think it would 
still be worth opening an issue on DataFusion about this?


