zero323 commented on a change in pull request #34892:
URL: https://github.com/apache/spark/pull/34892#discussion_r770905121



##########
File path: python/pyspark/context.py
##########
@@ -288,9 +287,11 @@ def _do_init(
 
         # Create a single Accumulator in Java that we'll send all our updates through;
         # they will be passed back to us through a TCP server
+        assert self._gateway is not None

Review comment:
       > Why are the asserts needed, out of curiosity
   
   Long story short ‒ `_gateway`, `_jvm`, etc. are class variables that may or may not have been initialized
   
   
https://github.com/apache/spark/blob/6e45b04db48008fa033b09df983d3bd1c4f790ea/python/pyspark/context.py#L155-L158
   
   so all of them are defined as `Optional[_]`.
   
   At runtime we blindly assume that we are dealing with an active context and that these are set, but that's not clear to the type checker ‒ these assertions serve as `mypy` hints: "hey, I know this stuff is not `None` here".
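   
   Below is a minimal, self-contained sketch of how such an `assert` narrows an `Optional` attribute for `mypy`. It is not the actual `SparkContext` code; the `Gateway` class and its method are made up for illustration:
   
   ```python
   from typing import ClassVar, Optional
   
   
   class Gateway:
       """Stand-in for py4j's JavaGateway; purely illustrative."""
   
       def launch(self) -> str:
           return "launched"
   
   
   class Context:
       # Class variable that may or may not have been initialized yet,
       # hence the Optional annotation (mirrors _gateway / _jvm above).
       _gateway: ClassVar[Optional[Gateway]] = None
   
       def _do_init(self) -> str:
           # Without the assert, mypy reports something like:
           #   Item "None" of "Optional[Gateway]" has no attribute "launch"
           assert self._gateway is not None  # narrows Optional[Gateway] to Gateway
           return self._gateway.launch()
   ```
   
   Running `mypy` with and without the `assert` shows the difference; at runtime the `assert` also fails loudly with an `AssertionError` if the assumption is ever violated.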
   
   There are other ways of handling this (`cast`s, `ignore`s; we even discussed helper methods), but this seems to be the least intrusive approach and the one closest to our actual expectations.
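   
   For comparison, here is a rough sketch of the alternatives mentioned above, again with made-up `Gateway`/`Context` classes rather than the code adopted here:
   
   ```python
   from typing import Optional, cast
   
   
   class Gateway:
       def launch(self) -> str:
           return "launched"
   
   
   class Context:
       _gateway: Optional[Gateway] = None
   
       def with_cast(self) -> str:
           # cast() satisfies mypy but is a no-op at runtime: if _gateway is
           # still None, this raises AttributeError instead of failing clearly.
           return cast(Gateway, self._gateway).launch()
   
       def with_ignore(self) -> str:
           # Suppresses the error entirely, which can hide real mistakes too.
           return self._gateway.launch()  # type: ignore[union-attr]
   
       def with_assert(self) -> str:
           # Documents the expectation, narrows the type for mypy, and fails
           # loudly with AssertionError if the assumption is ever wrong.
           assert self._gateway is not None
           return self._gateway.launch()
   ```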
   





