[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader
[ https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634821#comment-16634821 ]

Kotaro Terada commented on ORC-410:
---

Thank you for committing this, Owen!

> Fix a locale-dependent test in TestCsvReader
>
>
> Key: ORC-410
> URL: https://issues.apache.org/jira/browse/ORC-410
> Project: ORC
> Issue Type: Bug
> Reporter: Kotaro Terada
> Assignee: Kotaro Terada
> Priority: Major
> Fix For: 1.5.4, 1.6.0
>
>
> {{testCustomTimestampFormat}} in {{TestCsvReader}} fails in some environments
> because the test is locale-dependent.
> In this test, we try to parse a DateTime string (such as '21 Mar 2018
> 12:23:34') with a given timestamp format. The problem is that English month
> abbreviations (such as 'Mar') are locale-dependent. When the locale of Java
> Virtual Machine is a locale where the language is English (e.g., en_US and
> en_GB), this test passes without any problems. However, when the locale of
> JVM is a locale where the language is non-English (e.g., ja_JP and zh_CN),
> the test fails as follows.
> {noformat}
> [INFO] Running org.apache.orc.tools.convert.TestCsvReader
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.237 s <<< FAILURE! - in org.apache.orc.tools.convert.TestCsvReader
> [ERROR] testCustomTimestampFormat(org.apache.orc.tools.convert.TestCsvReader) Time elapsed: 0.143 s <<< ERROR!
> org.threeten.bp.format.DateTimeParseException: Text '21 Mar 2018 12:23:34' could not be parsed at index 3
>     at org.apache.orc.tools.convert.TestCsvReader.testCustomTimestampFormat(TestCsvReader.java:189)
> {noformat}

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader
[ https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16634219#comment-16634219 ]

ASF GitHub Bot commented on ORC-410:

Github user asfgit closed the pull request at:

    https://github.com/apache/orc/pull/314
[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader
[ https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631478#comment-16631478 ]

ASF GitHub Bot commented on ORC-410:

GitHub user kotarot opened a pull request:

    https://github.com/apache/orc/pull/314

    ORC-410: Fix a locale-dependent test in TestCsvReader

## Problem

`testCustomTimestampFormat` in `TestCsvReader` fails in some environments because the test is locale-dependent.

In this test, we try to parse a DateTime string (such as '21 Mar 2018 12:23:34') with a given timestamp format. The problem is that English month abbreviations (such as 'Mar') are only recognized when the locale used for parsing is an English one. When the JVM's locale has English as its language (e.g., en_US and en_GB), this test passes without any problems. However, when the JVM's locale has a non-English language (e.g., ja_JP and zh_CN), the test fails as follows.

```
[INFO] Running org.apache.orc.tools.convert.TestCsvReader
[ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.237 s <<< FAILURE! - in org.apache.orc.tools.convert.TestCsvReader
[ERROR] testCustomTimestampFormat(org.apache.orc.tools.convert.TestCsvReader) Time elapsed: 0.143 s <<< ERROR!
org.threeten.bp.format.DateTimeParseException: Text '21 Mar 2018 12:23:34' could not be parsed at index 3
    at org.apache.orc.tools.convert.TestCsvReader.testCustomTimestampFormat(TestCsvReader.java:189)
```

## Solution

There are two ways to fix this problem by updating the test:

(1) Make this test locale-independent.
(2) Set the locale to en_US in this test.

(1) is better, but it is not easy to construct a DateTime string that can be parsed successfully in all existing locales. Thus, I adopt (2) and modify the test.
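For readers who want to reproduce the failure outside the ORC test suite, the following is a minimal sketch rather than the change made in the pull request. It assumes `java.time` in place of the `org.threeten.bp` backport named in the stack trace, and the pattern `d MMM yyyy HH:mm:ss` is a guess that fits the sample string, not necessarily the format used in `TestCsvReader`. The class and method names are hypothetical. It pins the locale on the formatter itself, which is one way to realize option (2).

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper class; not part of ORC.
class LocaleDependentParse {
    // Assumed pattern that fits '21 Mar 2018 12:23:34'.
    static final String PATTERN = "d MMM yyyy HH:mm:ss";
    static final String TEXT = "21 Mar 2018 12:23:34";

    // Parses the text with month names resolved in the given locale.
    static LocalDateTime parse(String text, Locale locale) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(PATTERN, locale);
        return LocalDateTime.parse(text, formatter);
    }

    // Returns true if the text parses under the given locale, false otherwise.
    static boolean canParse(String text, Locale locale) {
        try {
            parse(text, locale);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // English locale: the abbreviation 'Mar' is recognized.
        System.out.println("en_US: " + canParse(TEXT, Locale.US));
        // Japanese and Chinese locales: 'Mar' is not a known month name.
        System.out.println("ja_JP: " + canParse(TEXT, Locale.JAPAN));
        System.out.println("zh_CN: " + canParse(TEXT, Locale.CHINA));
    }
}
```

Passing the locale to the formatter avoids changing global JVM state. A test that instead calls `Locale.setDefault(Locale.US)` would normally save the previous default and restore it afterwards so that other tests are unaffected.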
You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/kotarot/orc ORC-410

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/orc/pull/314.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #314

commit 43ae8b80783c2af8e155c2fbbfb724bf86b9a5f2
Author: Kotaro Terada
Date: 2018-09-27T03:45:31Z

    Fix a locale-dependent test in TestCsvReader
[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader
[ https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16631462#comment-16631462 ]

Kotaro Terada commented on ORC-410:
---

Please assign this Jira to me.