[jira] [Updated] (SPARK-28365) Set default locale param for StopWordsRemover to en_US if system default locale isn't in available locales in JVM

2019-07-14 Thread Liang-Chi Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-28365:

Summary: Set default locale param for StopWordsRemover to en_US if system 
default locale isn't in available locales in JVM  (was: Set default locale for 
StopWordsRemover tests to prevent invalid locale error during test)

> Set default locale param for StopWordsRemover to en_US if system default 
> locale isn't in available locales in JVM
> -
>
> Key: SPARK-28365
> URL: https://issues.apache.org/jira/browse/SPARK-28365
> Project: Spark
> Issue Type: Test
> Components: ML, PySpark
> Affects Versions: 3.0.0
> Reporter: Liang-Chi Hsieh
> Priority: Minor
>
> My local default locale is not among the available locales reported by 
> {{Locale}}, so when I ran the Python tests locally, the 
> {{StopWordsRemover}}-related test failed with an error like:
> {code}
> Traceback (most recent call last):
>   File "/spark-1/python/pyspark/ml/tests/test_feature.py", line 87, in test_stopwordsremover
>     stopWordRemover = StopWordsRemover(inputCol="input", outputCol="output")
>   File "/spark-1/python/pyspark/__init__.py", line 111, in wrapper
>     return func(self, **kwargs)
>   File "/spark-1/python/pyspark/ml/feature.py", line 2646, in __init__
>     self.uid)
>   File "/spark-1/python/pyspark/ml/wrapper.py", line 67, in _new_java_obj
>     return java_obj(*java_args)
>   File "/spark-1/python/lib/py4j-0.10.8.1-src.zip/py4j/java_gateway.py", line 1554, in __call__
>     answer, self._gateway_client, None, self._fqn)
>   File "/spark-1/python/pyspark/sql/utils.py", line 93, in deco
>     raise converted
> pyspark.sql.utils.IllegalArgumentException: 'StopWordsRemover_4598673ee802 parameter locale given invalid value en_TW.'
> {code}
> As per [~hyukjin.kwon]'s advice, instead of setting a locale just to make the 
> test pass, it is better to fall back to a workable locale when the system 
> default locale cannot be found among the JVM's available locales. Otherwise, 
> users have to manually change the system locale or access the private 
> property {{_jvm}} in PySpark.
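
The fallback proposed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the actual Spark change: the function name `pick_default_locale` and the sample locale list are hypothetical, and the real check would run against the locales the JVM reports as available. The `en_US` fallback is taken from the issue title.

```python
from typing import Iterable

# Assumed fallback value, taken from the issue title.
FALLBACK_LOCALE = "en_US"


def pick_default_locale(
    system_default: str,
    available: Iterable[str],
    fallback: str = FALLBACK_LOCALE,
) -> str:
    """Return the system default locale if it is available, else the fallback.

    Hypothetical helper for illustration; it is not part of Spark or PySpark.
    """
    return system_default if system_default in set(available) else fallback


# Hypothetical sample of available locales; a real JVM reports many more.
sample_available = ["en_US", "zh_TW", "ja_JP"]

print(pick_default_locale("en_TW", sample_available))  # en_US
print(pick_default_locale("zh_TW", sample_available))  # zh_TW
```

With this kind of selection, a system default such as `en_TW` that is not in the available list no longer produces an invalid default for the `locale` parameter, while a system default that is available is used unchanged.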



--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Updated] (SPARK-28365) Set default locale param for StopWordsRemover to en_US if system default locale isn't in available locales in JVM

2019-07-14 Thread Liang-Chi Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-28365:

Priority: Major  (was: Minor)

> Set default locale param for StopWordsRemover to en_US if system default 
> locale isn't in available locales in JVM
> -
>
> Key: SPARK-28365
> URL: https://issues.apache.org/jira/browse/SPARK-28365
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 3.0.0
> Reporter: Liang-Chi Hsieh
> Priority: Major
>






[jira] [Updated] (SPARK-28365) Set default locale param for StopWordsRemover to en_US if system default locale isn't in available locales in JVM

2019-07-14 Thread Liang-Chi Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-28365:

Component/s: (was: PySpark)

> Set default locale param for StopWordsRemover to en_US if system default 
> locale isn't in available locales in JVM
> -
>
> Key: SPARK-28365
> URL: https://issues.apache.org/jira/browse/SPARK-28365
> Project: Spark
> Issue Type: Bug
> Components: ML
> Affects Versions: 3.0.0
> Reporter: Liang-Chi Hsieh
> Priority: Major
>






[jira] [Updated] (SPARK-28365) Set default locale param for StopWordsRemover to en_US if system default locale isn't in available locales in JVM

2019-07-14 Thread Liang-Chi Hsieh (JIRA)


 [ 
https://issues.apache.org/jira/browse/SPARK-28365?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Liang-Chi Hsieh updated SPARK-28365:

Issue Type: Bug  (was: Test)

> Set default locale param for StopWordsRemover to en_US if system default 
> locale isn't in available locales in JVM
> -
>
> Key: SPARK-28365
> URL: https://issues.apache.org/jira/browse/SPARK-28365
> Project: Spark
> Issue Type: Bug
> Components: ML, PySpark
> Affects Versions: 3.0.0
> Reporter: Liang-Chi Hsieh
> Priority: Minor
>


