Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


Yicong-Huang commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4009017800

   Sounds good thanks. let's close the ticket.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


gaogaotiantian commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4008902180

   #54279 was merged last week


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


Yicong-Huang commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4008883926

   @gaogaotiantian I reported the issue. and the hang was observed on this run. 
https://github.com/Yicong-Huang/spark/actions/runs/22196593939, which was 2 
weeks ago and after #54279. Can you double check if this is fixed?


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


godeomt commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4008544292

   Thank you for your warm words and encouragement! I will definitely try 
again. Closing this PR now.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


godeomt closed pull request #54636: [SPARK-55620][PYTHON][CONNECT] Fix deadlock 
in test_connect_session d…
URL: https://github.com/apache/spark/pull/54636


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


gaogaotiantian commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4008528124

   @godeomt , no worries. deadlock is often a relatively difficult type of 
issue to find the root cause. It normally requires some expertise in the 
domain. It's very common that even the experienced engineers have the wrong 
thoughts. The reason I asked about LLM is just to confirm that this could 
potentially be some hallucination from AI (it looks like one).
   
   I believe the that specific flakiness issue was already resolved. Feel free 
to observe the CI results to confirm that. Also thank you for taking interests 
in improving spark. Even though this PR probably won't be merged, you'll still 
have a lot of chances in the future to become a contributor!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


godeomt commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4008456021

   Hi, thank you so much for the detailed review and for pointing out the 
technical flaws in my reasoning!
   To answer your questions honestly:
   
   - Regarding PR #54279 and reproduction: To be completely transparent, I 
missed PR #54279. I spent a lot of time trying to reproduce the hanging issue 
locally in various ways, but I couldn't get it to fail at all. Because I 
couldn't reproduce the bug after multiple attempts, I eventually turned to an 
LLM for help to understand the potential root cause based on the old CI logs.
   
   - Regarding the root cause: You are absolutely right. If Python tries to 
spawn a thread during finalization, it raises a RuntimeError rather than 
deadlocking. My assumption (and the AI's hallucination) about 
self._channel.close() spawning a new thread and causing an infinite hang was 
technically incorrect. I lacked the deep domain knowledge to properly verify 
the AI's theory.
   
   I really appreciate you taking the time to explain why the theory was 
flawed. It is a huge learning experience for me as a first-time contributor. If 
PR #54279 already addresses this flakiness, I am more than happy to close this 
PR. Thank you again for your patience and guidance!


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]



Re: [PR] [SPARK-55620][PYTHON][CONNECT] Fix deadlock in test_connect_session d… [spark]

2026-03-05 Thread via GitHub


gaogaotiantian commented on PR #54636:
URL: https://github.com/apache/spark/pull/54636#issuecomment-4007803652

   I have some doubts about the root cause.
   
   First of all, did you observe any test hanging after 
https://github.com/apache/spark/pull/54279 ? Why do you believe 
`self._channel.close()` will spawn a *new* thread to clean up the ongoing 
gRPCs, instead of using the existing thread? If a python process is trying to 
spawn a new thread during finalization phase, it should raise an exception, 
instead of block forever.
   
   If you don't mind me asking, did you come up with the theory by yourself, or 
with the help of LLM? It might help with our discussion.
   
   Thank you for taking a look at the issues.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


-
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]