YjyJeff commented on PR #12189:
URL: https://github.com/apache/datafusion/pull/12189#issuecomment-2313977282

   > Is this bug visible/reproducible somehow? If so, could you add a 
regression test?
   
   @crepererum  This bug is triggered by our customized `TableScan` operator 
that does not support `FusedStream` contract. In our implementation, when the 
stream is finished, calling the `poll_next` method again will cause panic.  
   
   It is simple to mimic the above situation. Firstly, modify the 
[MemoryStream](https://github.com/apache/datafusion/blob/main/datafusion/physical-plan/src/memory.rs#L281)
 such that it does not support `FusedStream` contract with following code: 
   
   ```rust
       fn poll_next(
           mut self: std::pin::Pin<&mut Self>,
           _: &mut Context<'_>,
       ) -> Poll<Option<Self::Item>> {
           Poll::Ready(if self.index < self.data.len() {
               self.index += 1;
               let batch = &self.data[self.index - 1];
   
               // return just the columns requested
               let batch = match self.projection.as_ref() {
                   Some(columns) => batch.project(columns)?,
                   None => batch.clone(),
               };
   
               Some(Ok(batch))
           } else if self.index == self.data.len() {
               self.index += 1;
               None
           } else {
               panic!("Exhausted stream is polled again")
           })
       }
   ```
   Secondly, run the `datafusion-cli` binary with following sqls:
   ```SQL
   CREATE TABLE wtf AS VALUES (4, 'arrow'), (99, 'datafusion');
   
   select t0.column1, t1.column1, t0.column2, t1.column2 from wtf t0 left join 
wtf t1 on t0.column1 > t1.column1;
   ```
   Then, we will hit the `panic` statement. 
   
   Currently, the `ExecutionPlan` trait only requires the `execute` method to 
return the struct that implements the `Stream` trait. According to the 
definition of the 
[poll_next](https://docs.rs/futures/latest/futures/stream/trait.Stream.html#tymethod.poll_next)
 method, panic can happens when polling the exhausted stream


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to