GitHub user avi8tr commented on the issue:
https://github.com/apache/spark/pull/16782
This patch does not solve the problem for PySpark users, because every ML
stage in the pipeline is also not thread-safe at creation time due to this
same wrapper. Note that the wrapper does two separate things: it enforces
keyword-only arguments, and it passes the kwargs in an unsafe manner outside
the call to the wrapped method. We can fix this by simply omitting the
wrapper's second (apparently unneeded) feature. Another benefit of this
omission is that wrapped functions do not need to be modified to use the
wrapper (although the ML methods that have already been modified to depend on
the input_kwargs introduced by the defective wrapper must be switched back to
using named arguments). This would also fix the bug in Pipeline where the
__init__ method's modifications to stages are lost. The following minimal
code, similar to Pipeline, illustrates this approach:
`from functools import wraps


def keyword_only(func):
    """
    A decorator that forces keyword arguments in the wrapped method
    """
    @wraps(func)
    def wrapper(*args, **kwargs):
        # args[0] is self, so any further positional argument is rejected
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." %
                            func.__name__)
        return func(*args, **kwargs)
    return wrapper


class Mytest:
    @keyword_only
    def __init__(self, stages=None):
        """
        __init__(self, stages=None)
        """
        self.setParams(stages=stages)

    @keyword_only
    def setParams(self, stages=None):
        """
        setParams(self, stages=None)
        Sets params for Pipeline.
        """
        if stages is None:
            stages = []
        return self._set(stages=stages)

    def _set(self, **kwargs):
        for key, value in kwargs.items():
            print('kwargs contains ' + key + ": " + str(value))


if __name__ == "__main__":
    print()
    print('zero arguments')
    baz = Mytest()

    print()
    print('initParams')
    foo = Mytest(stages='initParams')

    print()
    print('setParams')
    bar = Mytest()
    bar.setParams(stages='setParams')

    print()
    print('nonKeyword arguments')
    try:
        bar = Mytest('nokeywords')
    except Exception as e:
        print('Exception: ' + e.args[0])

    print()
    print('initParams with unexpected parameter')
    try:
        bat = Mytest(stages='initParams', unexpectedParameter='foo')
    except Exception as e:
        print('Exception: ' + e.args[0])
`
the output of which is:
`zero arguments
kwargs contains stages: []

initParams
kwargs contains stages: initParams

setParams
kwargs contains stages: []
kwargs contains stages: setParams

nonKeyword arguments
Exception: Method __init__ forces keyword arguments.

initParams with unexpected parameter
Exception: __init__() got an unexpected keyword argument 'unexpectedParameter'
`
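
For contrast, here is a minimal sketch of the failure mode described above.
It assumes the defective wrapper stashes the kwargs on a shared attribute of
the wrapper itself (called _input_kwargs here) and that the wrapped method
reads them back from that attribute. The names shared_kwargs_keyword_only,
Stage, configure and run_demo are hypothetical and are not Spark code. Two
events force the interleaving, so the result does not depend on timing:

```python
import threading
from functools import wraps


def shared_kwargs_keyword_only(func):
    """Hypothetical stand-in for the criticized wrapper (an assumption,
    not Spark's code): it stores kwargs on the wrapper function itself."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        if len(args) > 1:
            raise TypeError("Method %s forces keyword arguments." %
                            func.__name__)
        # One slot shared by every instance and every thread.
        wrapper._input_kwargs = kwargs
        return func(*args, **kwargs)
    return wrapper


class Stage:
    def __init__(self, entered=None, resume=None):
        # Optional events used only to pause this instance mid-call.
        self._entered = entered
        self._resume = resume

    @shared_kwargs_keyword_only
    def configure(self, name=None):
        if self._entered is not None:
            self._entered.set()            # signal: we are inside the body
            self._resume.wait(timeout=5)   # pause until the other call is done
        # Read the kwargs back from the shared slot instead of the argument.
        self.seen = self.configure._input_kwargs["name"]
        self.passed = name


def run_demo():
    entered, resume = threading.Event(), threading.Event()
    a = Stage(entered, resume)
    b = Stage()
    t = threading.Thread(target=a.configure, kwargs={"name": "A"})
    t.start()
    entered.wait(timeout=5)    # a.configure is now paused inside its body
    b.configure(name="B")      # overwrites the shared slot
    resume.set()
    t.join()
    return a.passed, a.seen


if __name__ == "__main__":
    print(run_demo())  # ('A', 'B')
```

The first call was given name='A' but reads back 'B' from the shared slot.
The keyword_only wrapper shown earlier only forwards the arguments through
func(*args, **kwargs), so there is no shared slot for another thread to
overwrite.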