afs commented on PR #3753: URL: https://github.com/apache/jena/pull/3753#issuecomment-3904732930
`StageGeneratorGeneric` needs your patch - thanks (took v2, and put `if ( input.hasNext )` at the start). PR revised. > I occasionally got failing query cancel tests on a 14 core machine - before and with your PR. What sort of machine is that? I have a Dell PC with `12th Gen Intel® Core™ i7-12700 × 20` - 4 P cores, 16 E cores. Each P core has its own L2 cache and groups of 4 E cores have an L2 cache. I have seen these errors once in several months of a lot of Jena builds while removing the deprecated code and other cleaning up for Jena6. The CI on Jenkins and github has it quite often when the build systems are busy. --- Looks like you have found the cause of both the `jena-integeration-tests` and `jena-geosparql` test failures! It's `QueryIterAbortable` for `jena-integeration-tests` which inherits from `QueryIterPlainWrapper`. However ... :smile: ... this shows another problem. It's not just missing the iterator and failing to close it. The code can concurrently manipulate the iterator. `requestCancel` is on another thread. The `synchronized` makes `close` / `requestCancel` safe. `hasNextBinding` / `moveToNextBinding` / `close` can manipulate the iterator safely as they are all on the same thread. What Rust might call the owning thread. `requestCancel` / `hasNextBinding` / `moveToNextBinding` is a mixture of threads. While closing due to cancellation, `*NextBinding` may be active and there is no protection. In Rust, this code pattern would not even compile! So I think using `synchronized` will fix the build crashes but a complete fix needs to only perform the `close` on the query thread. Adding `synchronized` to `*NextBinding` is very heavy. As using `synchroized` for `close` / `requestCancel` is step forward, let's put it in and see how the CI reacts. In parallel, I'll raise an issue to discuss what to do with a permanent fix which I fear may impact other iterators. One solution is a volatile so the cancel signal is put onto the query execution thread. I'm currently at the stage of wondering why the whole system does not crash more often! This is a common state when working on concurrency bugs. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected] --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
