viirya commented on PR #12082:
URL: https://github.com/apache/datafusion/pull/12082#issuecomment-2319412657

   > > > If there is a left streamed row with join key (1), then from the right side we are going to have joined buffered batches, where the range shows which indices share the same join key.
   > > > For example
   > > > ```
   > > > Streamed data        Buffered data
   > > > [1]               -> [0, 1, 1], [1, 1, 2]
   > > > ```
   > > > 
   > > > Should have ranges `[1..3], [0..2]`
   > > 
   > > 
   > > I don't quite understand the question.
   > > You have `[0, 1, 1]` as buffered indices for the same streamed row? Why do you have the same buffered row id `1` twice?
   > 
   > Thanks @viirya, these are not indices, they are raw data. Let me rephrase.
   > 
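   > If I have a left table
   > 
   > | a  | b  |
   > |----|----|
   > | 10 | 20 |
   > 
   > and a right table
   > 
   > | a  | b  |
   > |----|----|
   > | 5  | 20 |
   > | 10 | 20 |
   > | 10 | 21 |
   > | 10 | 21 |
   > | 10 | 22 |
   > | 15 | 22 |
   > 
   > and the join key is `a` and the filter is on column `b`.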
   > 
   > In `freeze_streamed` I can observe the right table comes as 3 batches
   > 
   > 1. Batch 1: `join_array` [10], range 1..3, which is correct, as row numbers 1 and 2 relate to join key 10.
   > 2. Batch 2: `join_array` [10], range 0..2, which is correct, as row numbers 0 and 1 relate to join key 10.
   > 3. Batch 3: `join_array` [15], range 0..1, which is weird: why is this batch associated?
   
   Could you let me know how you cut the 6 buffered rows into the 3 batches?
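
To make the ranges in the first quoted example concrete, here is a minimal sketch, assuming each buffered batch is already sorted by the join key. `key_range` and `ranges_for_key` are hypothetical helpers written for illustration only; they are not DataFusion functions and do not reflect how `freeze_streamed` is implemented.

```rust
use std::ops::Range;

/// Hypothetical helper (not a DataFusion API): for one buffered batch whose
/// join keys are sorted ascending, return the half-open range of row indices
/// whose join key equals `key`. The range is empty if the key is absent.
fn key_range(batch: &[i64], key: i64) -> Range<usize> {
    // First index whose key is >= `key`.
    let start = batch.partition_point(|&k| k < key);
    // First index whose key is > `key`.
    let end = batch.partition_point(|&k| k <= key);
    start..end
}

/// Hypothetical helper: compute the matching range in every buffered batch.
fn ranges_for_key(batches: &[Vec<i64>], key: i64) -> Vec<Range<usize>> {
    batches.iter().map(|b| key_range(b, key)).collect()
}
```

For the buffered data `[0, 1, 1], [1, 1, 2]` and streamed key `1`, `ranges_for_key` gives `[1..3, 0..2]`, matching the ranges quoted above. Under this sketch, a batch holding only key 15 yields an empty range (`0..0`) for key 10.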
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: github-unsubscr...@datafusion.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

