[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r53061 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ Review comment: > It's easy to add the extension when that case is explicitly supported Seems no. Suppose there're some 3rd party libraries depends on spark. And 3rd party estimators may include an evaluator param (e.g. for supervising in training early stop) . We need to support them. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530060435 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ Review comment: > The Validators class will directly take Estimator and Evaluator, and the Transformer will be part of the pipeline Estimator. Should the Transformer params be part of the pipeline params? A pyspark param value can be estimator/transformer/evaluator. They're all legal. Although currently pyspark does not have the case "transformer" to be a param value, but, allow it here is to provide extensibility. ``` if (isinstance(v, Estimator) and not ( isinstance(v, _ValidatorParams) or isinstance(v, OneVsRest)) ) or isinstance(v, Transformer) or \ isinstance(Evaluator): ``` The logic try to keep equivalent with jvm side logic `isInstanceOf[DefaultParamsWritable] && !isInstanceOf[MLWritable]` This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530056747 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ Review comment: > It would be clearer if we can create an interface similar to DefaultParamsWritable Pyspark already has this interface, the issue is the existing pyspark estimator/transformer did not inherit `DefaultParamsWritable`, change them to inherit `DefaultParamsWritable` may cause breaking changes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530055236 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] Review comment: `jsonParamMap` ==> `jsonParamList`: sounds good. `jsonEstimatorParamMaps`, the name `xxxMaps` actually means `xxxMapList`. Similar to existing codes. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530055236 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] Review comment: oh. there's a mistake here. I will update. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r530055236 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] Review comment: they're maps in "json" style (can be directly serialized by json.dumps()) . So named like jsonXXX. It is similar to other existing codes This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r529109498 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ +isinstance(Evaluator): +relative_path = f'epm_{p.name}{numParamsNotJson}' +param_path = os.path.join(path, relative_path) +numParamsNotJson += 1 +v.save(param_path) +jsonParam['value'] = relative_path +jsonParam['isJson'] = False +elif isinstance(v, MLWritable): +raise RuntimeError( +"ValidatorSharedReadWrite.saveImpl does not handle parameters of type: " +"MLWritable that are not Estimaor/Evaluator/Transformer, and if parameter is estimator," +"it cannot be Validator or OneVsRest") Review comment: In JVM side, the logic is only support `DefaultParamsWritable` instance. Pyspark side we cannot use it, but we can equivalently use: Support Estimaor/Evaluator/Transformer type but excludes Validator or OneVsRest. These types in JVM side are inherits `DefaultParamsWritable`. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r529100588 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ +isinstance(Evaluator): +relative_path = f'epm_{p.name}{numParamsNotJson}' +param_path = os.path.join(path, relative_path) +numParamsNotJson += 1 +v.save(param_path) +jsonParam['value'] = relative_path +jsonParam['isJson'] = False +elif isinstance(v, MLWritable): +raise RuntimeError( +"ValidatorSharedReadWrite.saveImpl does not handle parameters of type: " +"MLWritable that are not Estimaor/Evaluator/Transformer, and if parameter is estimator," +"it cannot be Validator or OneVsRest") +else: +jsonParam['value'] = v +jsonParam['isJson'] = True +jsonParamMap.append(jsonParam) +jsonEstimatorParamMaps.append(jsonParamMap) + +skipParams = ['estimator', 'evaluator', 'estimatorParamMaps'] + +jsonParams = {} +for p, v in instance._paramMap.items(): +if p.name not in skipParams: +jsonParams[p.name] = v + +jsonParams['estimatorParamMaps'] = jsonEstimatorParamMaps + +DefaultParamsWriter.saveMetadata(instance, path, sc, extraMetadata, jsonParams) +evaluatorPath = os.path.join(path, 'evaluator') +instance.getEvaluator().save(evaluatorPath) +estimatorPath = os.path.join(path, 'estimator') +instance.getEstimator().save(estimatorPath) + +@staticmethod +def load(path, sc, metadata): +evaluatorPath = os.path.join(path, 'evaluator') +evaluator = DefaultParamsReader.loadParamsInstance(evaluatorPath, sc) +estimatorPath = os.path.join(path, 'estimator') +estimator = DefaultParamsReader.loadParamsInstance(estimatorPath, sc) + +uidToParams = MetaAlgorithmReadWrite.getUidMap(estimator) Review comment: This map: `MetaAlgorithmReadWrite.getUidMap` used to find the parent of each param in `estimatorParamMaps` e.g. CrossValidator on pipeline, and pipeline include [transformer1, transformer2, estimator1], and CrossValidator want to tune on params : [transformer1.param1, transformer2.param1, estimator1.params1] https://github.com/apache/spark/pull/30471/files#r529101510 Then the getUidMap is used for: providing uid, get the corresponding transformer / estimator which is the parent of the param. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r529101510 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} Review comment: When we save a param, We will first save: * parent (uid of estimator/transformer) * param name This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r529100588 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ +isinstance(Evaluator): +relative_path = f'epm_{p.name}{numParamsNotJson}' +param_path = os.path.join(path, relative_path) +numParamsNotJson += 1 +v.save(param_path) +jsonParam['value'] = relative_path +jsonParam['isJson'] = False +elif isinstance(v, MLWritable): +raise RuntimeError( +"ValidatorSharedReadWrite.saveImpl does not handle parameters of type: " +"MLWritable that are not Estimaor/Evaluator/Transformer, and if parameter is estimator," +"it cannot be Validator or OneVsRest") +else: +jsonParam['value'] = v +jsonParam['isJson'] = True +jsonParamMap.append(jsonParam) +jsonEstimatorParamMaps.append(jsonParamMap) + +skipParams = ['estimator', 'evaluator', 'estimatorParamMaps'] + +jsonParams = {} +for p, v in instance._paramMap.items(): +if p.name not in skipParams: +jsonParams[p.name] = v + +jsonParams['estimatorParamMaps'] = jsonEstimatorParamMaps + +DefaultParamsWriter.saveMetadata(instance, path, sc, extraMetadata, jsonParams) +evaluatorPath = os.path.join(path, 'evaluator') +instance.getEvaluator().save(evaluatorPath) +estimatorPath = os.path.join(path, 'estimator') +instance.getEstimator().save(estimatorPath) + +@staticmethod +def load(path, sc, metadata): +evaluatorPath = os.path.join(path, 'evaluator') +evaluator = DefaultParamsReader.loadParamsInstance(evaluatorPath, sc) +estimatorPath = os.path.join(path, 'estimator') +estimator = DefaultParamsReader.loadParamsInstance(estimatorPath, sc) + +uidToParams = MetaAlgorithmReadWrite.getUidMap(estimator) Review comment: This map: `MetaAlgorithmReadWrite.getUidMap` used to find the parent of each param in `estimatorParamMaps` e.g. CrossValidator on pipeline, and pipeline include [transformer1, transformer2, estimator1], and CrossValidator want to tune on params : [transformer1.param1, transformer2.param1, estimator1.params1] Then the getUidMap is used for: providing uid, get the corresponding transformer / estimator which is the parent of the param. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r529100588 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ +isinstance(Evaluator): +relative_path = f'epm_{p.name}{numParamsNotJson}' +param_path = os.path.join(path, relative_path) +numParamsNotJson += 1 +v.save(param_path) +jsonParam['value'] = relative_path +jsonParam['isJson'] = False +elif isinstance(v, MLWritable): +raise RuntimeError( +"ValidatorSharedReadWrite.saveImpl does not handle parameters of type: " +"MLWritable that are not Estimaor/Evaluator/Transformer, and if parameter is estimator," +"it cannot be Validator or OneVsRest") +else: +jsonParam['value'] = v +jsonParam['isJson'] = True +jsonParamMap.append(jsonParam) +jsonEstimatorParamMaps.append(jsonParamMap) + +skipParams = ['estimator', 'evaluator', 'estimatorParamMaps'] + +jsonParams = {} +for p, v in instance._paramMap.items(): +if p.name not in skipParams: +jsonParams[p.name] = v + +jsonParams['estimatorParamMaps'] = jsonEstimatorParamMaps + +DefaultParamsWriter.saveMetadata(instance, path, sc, extraMetadata, jsonParams) +evaluatorPath = os.path.join(path, 'evaluator') +instance.getEvaluator().save(evaluatorPath) +estimatorPath = os.path.join(path, 'estimator') +instance.getEstimator().save(estimatorPath) + +@staticmethod +def load(path, sc, metadata): +evaluatorPath = os.path.join(path, 'evaluator') +evaluator = DefaultParamsReader.loadParamsInstance(evaluatorPath, sc) +estimatorPath = os.path.join(path, 'estimator') +estimator = DefaultParamsReader.loadParamsInstance(estimatorPath, sc) + +uidToParams = MetaAlgorithmReadWrite.getUidMap(estimator) Review comment: This map: `MetaAlgorithmReadWrite.getUidMap` used to find the parent of each param in `estimatorParamMaps` e.g. CrossValidator on pipeline, and pipeline include [transformer1, transformer2, estimator1], and CrossValidator want to tune on params : [transformer1.params1, transformer2.params1, estimator1] Then the getUidMap is used for: providing uid, get the corresponding transformer / estimator which is the parent of the param. This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org
[GitHub] [spark] WeichenXu123 commented on a change in pull request #30471: [WIP][SPARK-33520][ML] make CrossValidator/TrainValidateSplit support Python backend estimator/model
WeichenXu123 commented on a change in pull request #30471: URL: https://github.com/apache/spark/pull/30471#discussion_r529099004 ## File path: python/pyspark/ml/tuning.py ## @@ -207,6 +210,205 @@ def _to_java_impl(self): return java_estimator, java_epms, java_evaluator +class _ValidatorSharedReadWrite: + +@staticmethod +def saveImpl(path, instance, sc, extraMetadata=None): +from pyspark.ml.classification import OneVsRest +numParamsNotJson = 0 +jsonEstimatorParamMaps = [] +for paramMap in instance.getEstimatorParamMaps(): +jsonParamMap = [] +for p, v in paramMap.items(): +jsonParam = {'parent': p.parent, 'name': p.name} +if (isinstance(v, Estimator) and not ( +isinstance(v, _ValidatorParams) or +isinstance(v, OneVsRest)) +) or isinstance(v, Transformer) or \ +isinstance(Evaluator): +relative_path = f'epm_{p.name}{numParamsNotJson}' +param_path = os.path.join(path, relative_path) +numParamsNotJson += 1 +v.save(param_path) +jsonParam['value'] = relative_path +jsonParam['isJson'] = False Review comment: For the case: CrossValidator on estimator of OneVsRest, and need to turning the parameter `OneVsRest.classifier`, then EstimatorParamMaps will include a param (OneVsRest.classifier -> A_Classifier) We need to consider how to saving Classifier type param value. This is what the code do: * if it is normal param, we save it as Json format and `isJson` field is True. * otherwise set `isJson` field False, and save it by object.save(). This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: us...@infra.apache.org - To unsubscribe, e-mail: reviews-unsubscr...@spark.apache.org For additional commands, e-mail: reviews-h...@spark.apache.org