zhuqi-lucas commented on PR #16711:
URL: https://github.com/apache/datafusion/pull/16711#issuecomment-3052805272

   > > The adaptive selection will help Q30 and Q31 from previous PR result:
   > > [apache/arrow-rs#7454 
(comment)](https://github.com/apache/arrow-rs/pull/7454#issuecomment-2840770864)
   > 
   > @zhuqi-lucas my largest concern about all the previous adaptive row 
selection work was the software engineering / keeping the complexity under 
control.
   > 
   > I have been dreaming about this -- I am thinking it might be time to 
create an internal RowSelection representation like
   > 
   > ```rust
   > enum InternalRowSelector {
   >   /// Skip the next n rows
   >   Skip(usize),
   >   /// Decode and output the next n rows
   >   Decode(usize),
   >    /// Decode the next filter.len() rows and apply the filter before 
outputting the rows
   >   DecodeAndFilter(Arc<BooleanArray>),
   > }
   > ```
   > 
   > Then the actual RecordReader would take a Vec` or something
   > 
   > To stage / keep things of reasonable complexity we could make one PR that 
refactors to
   > 
   > ```rust
   > enum InternalRowSelector {
   >   /// Skip the next n rows
   >   Skip(usize),
   >   /// Decode and output the next n rows
   >   Decode(usize),
   >    // NO DecodeAndFilter variant
   > }
   > ```
   > 
   > And then add the DecodeAndFilter as a follow on PR
   > 
   > Maybe I'll try and make a PR
   
   Thank you @alamb , i agree, even the adaptive selection ratio is 
experimented by random selection in my PR, i can't find a good way to do it 
clearly.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org
For additional commands, e-mail: github-h...@datafusion.apache.org

Reply via email to