gitmodimo commented on PR #47129: URL: https://github.com/apache/arrow/pull/47129#issuecomment-3116725340
@zanmato1984 Consider the following sequence: https://github.com/apache/arrow/blob/25f8f008061b978137a2c1e5bc934d07fae56e3e/cpp/src/arrow/dataset/dataset_writer.cc#L682-L687

1. Initial conditions:
   - `rows_in_flight_throttle.current_value_=0`
   - `rows_in_flight_throttle.backpressure_=Finished`
2. `Thread_1`: In DoWriteRecordBatch L.683 [Acquire](https://github.com/apache/arrow/blob/25f8f008061b978137a2c1e5bc934d07fae56e3e/cpp/src/arrow/dataset/dataset_writer.cc#L683). No backpressure is applied, so:
   - `rows_in_flight_throttle.current_value_ +=100` - value 100
   - `rows_in_flight_throttle.backpressure_=Finished` - no change
   - `dir_queue->StartWrite` - write_1 starts in `Thread_2`
3. `Thread_1`: Again in DoWriteRecordBatch L.683 [Acquire](https://github.com/apache/arrow/blob/25f8f008061b978137a2c1e5bc934d07fae56e3e/cpp/src/arrow/dataset/dataset_writer.cc#L683). Backpressure is applied and `rows_in_flight_throttle.current_value_` is _not_ incremented:
   - `rows_in_flight_throttle.current_value_ +=0` - value 100
   - `rows_in_flight_throttle.backpressure_=Pending`
4. `Thread_2` finishes write_1 and issues `rows_in_flight_throttle.Release`:
   - `rows_in_flight_throttle.current_value_ -=100` - value 0
   - `rows_in_flight_throttle.backpressure_=Finished`
5. `Thread_1`: On the next line of DoWriteRecordBatch, L.684 [!backpressure.is_finished()](https://github.com/apache/arrow/blob/25f8f008061b978137a2c1e5bc934d07fae56e3e/cpp/src/arrow/dataset/dataset_writer.cc#L684). The future's internal state was already set to Finished by the previous step, so `StartWrite` issues write_2, although `current_value_` was never incremented for it.
6. write_2 finishes and issues `rows_in_flight_throttle.Release`:
   - `rows_in_flight_throttle.current_value_ -=100` - integer underflow locks backpressure forever
   - `rows_in_flight_throttle.backpressure_=Finished`
7. No write is ever scheduled again.

Steps 3 and 5 are not atomic, so step 4 can run between them.
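To make the arithmetic in the sequence concrete, here is a minimal single-threaded replay of that interleaving. `ToyThrottle` and `ReplayInterleaving` are hypothetical names, and the limit of 100 rows is an assumed value; this is a simplified model of the behaviour described above, not the actual throttle in `dataset_writer.cc`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified model of the throttle described above.
// It is NOT Arrow's implementation; names and logic are illustrative only.
struct ToyThrottle {
  std::uint64_t max_value;
  std::uint64_t current_value;
  bool backpressure_finished;  // stands in for the state of the future

  explicit ToyThrottle(std::uint64_t max)
      : max_value(max), current_value(0), backpressure_finished(true) {}

  // Returns true when the rows were counted, false when backpressure applies.
  bool Acquire(std::uint64_t rows) {
    if (current_value + rows > max_value) {
      backpressure_finished = false;  // Pending; the counter is NOT incremented
      return false;
    }
    current_value += rows;
    return true;
  }

  void Release(std::uint64_t rows) {
    current_value -= rows;         // unsigned arithmetic wraps below zero
    backpressure_finished = true;  // mark the pending future as Finished
  }
};

// Replays steps 1-6 in the order listed above, on a single thread.
inline ToyThrottle ReplayInterleaving() {
  ToyThrottle throttle(100);  // step 1: value 0, Finished (limit is assumed)

  bool acquired_1 = throttle.Acquire(100);  // step 2: value 100, write_1 starts
  assert(acquired_1);
  (void)acquired_1;

  bool acquired_2 = throttle.Acquire(100);  // step 3: Pending, value stays 100
  assert(!acquired_2);
  (void)acquired_2;

  throttle.Release(100);  // step 4: write_1 done, value 0, Finished

  // step 5: Thread_1 checks the state only now, sees Finished, and starts
  // write_2 even though its 100 rows were never added to the counter.
  bool starts_write_2 = throttle.backpressure_finished;

  if (starts_write_2) {
    throttle.Release(100);  // step 6: write_2 done, counter wraps around
  }
  return throttle;
}
```

In this model, the throttle returned by `ReplayInterleaving()` has `current_value` equal to 2^64 - 100, so a later `Acquire(50)` is refused and the state stays Pending, which corresponds to step 7.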
