HyukjinKwon commented on issue #24958: [SPARK-28153][PYTHON] Use 
AtomicReference at InputFileBlockHolder (to support input_file_name with Python 
UDF)
URL: https://github.com/apache/spark/pull/24958#issuecomment-509637062
 
 
   @brkyvz, we're not rushing. We're not ignoring any issues or holes that were 
actually found, nor merging it without discussion. Also, using a thread local 
isn't a horrible approach, although it may be less preferable depending on the 
case: it avoids having one place that multiple tasks access, so the tasks can 
run in parallel separately.
   
   I get that the suggestion makes sense too, but adding a new way isn't 
necessarily safe. New holes can always pop up. One conservative approach is 
usually to keep the code with fewer changes (for instance, for bug 
compatibility).
   
   In addition, it's rather a general design issue, not one specific to this 
code path. For instance, the code below:
   
   
https://github.com/apache/spark/blob/5264164a67df498b73facae207eda12ee133be7d/sql/core/src/main/scala/org/apache/spark/sql/execution/streaming/continuous/EpochTracker.scala#L26-L31
   
   has almost the same issue as SPARK-28153: the parent thread updates the 
current epoch but the child thread (the Python writer thread) cannot read it. 
Ideally we should also identify the other places that need fixing.
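
To make the parent/child visibility problem concrete, here is a minimal sketch in plain Java. It is not the Spark code: the class `ThreadLocalVisibility`, the fields `plainEpoch` and `sharedEpoch`, and the method `observeFromChild` are hypothetical names chosen for illustration. It assumes a value of `-1` means "not set", and contrasts a plain `ThreadLocal` with an `InheritableThreadLocal` that holds an `AtomicReference`, which is one possible reading of the approach named in the PR title.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical illustration only; none of these names come from Spark.
class ThreadLocalVisibility {
    // Plain thread-local: every thread has its own independent value.
    static final ThreadLocal<Long> plainEpoch = ThreadLocal.withInitial(() -> -1L);

    // Inheritable thread-local holding a mutable box. A child thread copies the
    // parent's reference when it is constructed, so both share the same box.
    static final InheritableThreadLocal<AtomicReference<Long>> sharedEpoch =
        new InheritableThreadLocal<AtomicReference<Long>>() {
            @Override
            protected AtomicReference<Long> initialValue() {
                return new AtomicReference<>(-1L);
            }
        };

    // Returns {value the child read from plainEpoch, value it read from sharedEpoch}.
    static long[] observeFromChild() {
        // The parent must create its box BEFORE the child is constructed,
        // otherwise there is nothing for the child to inherit.
        sharedEpoch.get();

        long[] seen = new long[2];
        CountDownLatch updated = new CountDownLatch(1);
        Thread child = new Thread(() -> {
            try {
                updated.await(); // wait until the parent has written its updates
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            seen[0] = plainEpoch.get();        // child's own copy, never updated
            seen[1] = sharedEpoch.get().get(); // shared box, sees the parent's write
        });
        child.start();

        // Parent updates both holders after the child already exists.
        plainEpoch.set(7L);
        sharedEpoch.get().set(7L);
        updated.countDown();

        try {
            child.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen;
    }

    public static void main(String[] args) {
        long[] seen = observeFromChild();
        System.out.println("plain=" + seen[0] + " shared=" + seen[1]);
    }
}
```

In this sketch the child never observes the parent's update through the plain thread local, but does observe it through the shared box. Each parent thread still gets its own box, so separate tasks do not contend on a single global location.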
   
   

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services
