mbutrovich opened a new pull request, #21663: URL: https://github.com/apache/datafusion/pull/21663
## Which issue does this PR close?

- Closes #21662.

## Rationale for this change

Miri detects a Stacked Borrows violation in `PushDecoderStreamState::transition`. A nested `async` block captures `&mut self` as a single opaque mutable reference. At the `.await` on `get_byte_ranges`, the future yields, and the `Unique` tag on the borrow stack is invalidated by a `SharedReadOnly` retag. When the future resumes, `push_ranges` attempts a two-phase retag through the now-invalidated tag.

This was found by [Apache DataFusion Comet](https://github.com/apache/datafusion-comet), which runs Miri in CI.

## What changes are included in this PR?

Remove the nested `async` block in the `NeedsData` arm of `PushDecoderStreamState::transition` and inline the IO (`get_byte_ranges`) and CPU (`push_ranges`) operations as separate statements. Since `transition` is already an `async fn`, the `.await` works directly in the loop body. Without the nested block, the compiler can split the borrows of `self.reader` and `self.decoder` into disjoint field borrows, keeping the borrow stack valid across the yield point.

Also removes the now-unused `parquet::errors::ParquetError` import.

## Are these changes tested?

Covered by existing parquet reader tests. The original violation was caught by Miri, which DataFusion does not currently run in CI.

## Are there any user-facing changes?

No.
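For readers who want to see the shape of the change, here is a minimal, std-only sketch of the two code shapes described above. It is not the DataFusion code: `Reader`, `Decoder`, `State`, `YieldOnce`, `block_on`, and `run` are hypothetical stand-ins, and the signatures of `get_byte_ranges` and `push_ranges` are simplified assumptions rather than the parquet crate's API. The sketch only contrasts the nested `async` block with the inlined statements; it is not claimed to reproduce the Miri report.

```rust
use std::future::Future;
use std::ops::Range;
use std::pin::{pin, Pin};
use std::sync::Arc;
use std::task::{Context, Poll, Wake, Waker};

/// Hypothetical future that returns `Pending` once, so callers really suspend.
struct YieldOnce(bool);

impl Future for YieldOnce {
    type Output = ();
    fn poll(mut self: Pin<&mut Self>, cx: &mut Context<'_>) -> Poll<()> {
        if self.0 {
            return Poll::Ready(());
        }
        self.0 = true;
        cx.waker().wake_by_ref();
        Poll::Pending
    }
}

/// Stand-in for the IO side (assumed, simplified signature).
struct Reader {
    source: Vec<u8>,
}

impl Reader {
    async fn get_byte_ranges(&mut self, ranges: &[Range<usize>]) -> Result<Vec<Vec<u8>>, String> {
        YieldOnce(false).await; // the yield point
        ranges
            .iter()
            .map(|r| {
                self.source
                    .get(r.clone())
                    .map(|s| s.to_vec())
                    .ok_or_else(|| format!("range {r:?} out of bounds"))
            })
            .collect()
    }
}

/// Stand-in for the CPU side (assumed, simplified signature).
struct Decoder {
    buffered: Vec<u8>,
}

impl Decoder {
    fn push_ranges(&mut self, data: Vec<Vec<u8>>) {
        for chunk in data {
            self.buffered.extend(chunk);
        }
    }
}

struct State {
    reader: Reader,
    decoder: Decoder,
}

impl State {
    /// Shape before the change: both fields are used inside one nested `async` block.
    async fn transition_nested(&mut self, ranges: &[Range<usize>]) -> Result<(), String> {
        let step = async {
            let data = self.reader.get_byte_ranges(ranges).await?;
            self.decoder.push_ranges(data);
            Ok::<(), String>(())
        };
        step.await
    }

    /// Shape after the change: IO and CPU steps are separate statements in the `async fn`.
    async fn transition_inlined(&mut self, ranges: &[Range<usize>]) -> Result<(), String> {
        let data = self.reader.get_byte_ranges(ranges).await?;
        self.decoder.push_ranges(data);
        Ok(())
    }
}

/// Waker that does nothing; `block_on` below simply polls in a loop.
struct NoopWaker;

impl Wake for NoopWaker {
    fn wake(self: Arc<Self>) {}
}

/// Tiny busy-polling executor so the sketch needs no async runtime crate.
fn block_on<F: Future>(fut: F) -> F::Output {
    let waker = Waker::from(Arc::new(NoopWaker));
    let mut cx = Context::from_waker(&waker);
    let mut fut = pin!(fut);
    loop {
        if let Poll::Ready(out) = fut.as_mut().poll(&mut cx) {
            return out;
        }
    }
}

/// Drives one transition and returns the bytes the decoder buffered.
fn run(inlined: bool) -> Result<Vec<u8>, String> {
    let mut state = State {
        reader: Reader { source: (0u8..10).collect() },
        decoder: Decoder { buffered: Vec::new() },
    };
    let ranges = vec![0..2, 5..8];
    if inlined {
        block_on(state.transition_inlined(&ranges))?;
    } else {
        block_on(state.transition_nested(&ranges))?;
    }
    Ok(state.decoder.buffered)
}
```

Calling `run(true)` or `run(false)` from a `main` function or a test exercises either shape. Both produce the same buffered bytes in ordinary execution, which is consistent with the PR's statement that there are no user-facing changes; per the PR, the difference in the real code was observed under Miri.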
