[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader

2018-10-01 Thread Kotaro Terada (JIRA)


[ 
https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634821#comment-16634821
 ] 

Kotaro Terada commented on ORC-410:
---

Thank you for committing this, Owen!

> Fix a locale-dependent test in TestCsvReader
> 
>
> Key: ORC-410
> URL: https://issues.apache.org/jira/browse/ORC-410
> Project: ORC
> Issue Type: Bug
> Reporter: Kotaro Terada
> Assignee: Kotaro Terada
> Priority: Major
> Fix For: 1.5.4, 1.6.0
>
>
> {{testCustomTimestampFormat}} in {{TestCsvReader}} fails in some environments 
> because the test is locale-dependent.
> In this test, we parse a DateTime string (such as '21 Mar 2018 12:23:34') 
> with a given timestamp format. The problem is that month abbreviations are 
> locale-dependent: the English abbreviation 'Mar' is recognized only when the 
> formatter uses an English locale. When the default locale of the JVM is 
> English (e.g., en_US or en_GB), the test passes. However, when the default 
> locale of the JVM is non-English (e.g., ja_JP or zh_CN), the test fails as 
> follows.
> {noformat}
> [INFO] Running org.apache.orc.tools.convert.TestCsvReader
> [ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.237 s <<< FAILURE! - in org.apache.orc.tools.convert.TestCsvReader
> [ERROR] testCustomTimestampFormat(org.apache.orc.tools.convert.TestCsvReader)  Time elapsed: 0.143 s  <<< ERROR!
> org.threeten.bp.format.DateTimeParseException: Text '21 Mar 2018 12:23:34' could not be parsed at index 3
>     at org.apache.orc.tools.convert.TestCsvReader.testCustomTimestampFormat(TestCsvReader.java:189)
> {noformat}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)


[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader

2018-10-01 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16634219#comment-16634219
 ] 

ASF GitHub Bot commented on ORC-410:


GitHub user asfgit closed the pull request at:

https://github.com/apache/orc/pull/314




[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader

2018-09-28 Thread ASF GitHub Bot (JIRA)


[ 
https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631478#comment-16631478
 ] 

ASF GitHub Bot commented on ORC-410:


GitHub user kotarot opened a pull request:

https://github.com/apache/orc/pull/314

ORC-410: Fix a locale-dependent test in TestCsvReader

## Problem

`testCustomTimestampFormat` in `TestCsvReader` fails in some environments 
because the test is locale-dependent.

In this test, we parse a DateTime string (such as '21 Mar 2018 12:23:34') 
with a given timestamp format. The problem is that month abbreviations are 
locale-dependent: the English abbreviation 'Mar' is recognized only when the 
formatter uses an English locale. When the default locale of the JVM is 
English (e.g., en_US or en_GB), the test passes. However, when the default 
locale of the JVM is non-English (e.g., ja_JP or zh_CN), the test fails as 
follows.

```
[INFO] Running org.apache.orc.tools.convert.TestCsvReader
[ERROR] Tests run: 5, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 0.237 s <<< FAILURE! - in org.apache.orc.tools.convert.TestCsvReader
[ERROR] testCustomTimestampFormat(org.apache.orc.tools.convert.TestCsvReader)  Time elapsed: 0.143 s  <<< ERROR!
org.threeten.bp.format.DateTimeParseException: Text '21 Mar 2018 12:23:34' could not be parsed at index 3
    at org.apache.orc.tools.convert.TestCsvReader.testCustomTimestampFormat(TestCsvReader.java:189)
```
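
The failure can be reproduced outside the test suite with a minimal sketch 
like the one below. It rests on some assumptions: it uses `java.time` from the 
JDK in place of `org.threeten.bp` (the ThreeTen backport mirrors that API), the 
pattern `dd MMM yyyy HH:mm:ss` is a guess at a format matching the sample 
string and not necessarily the one used in `TestCsvReader`, and the class and 
method names are hypothetical.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Locale;

// Hypothetical helper class, not part of ORC.
class LocaleParseSketch {

    // Returns true if `text` parses with `pattern` when the formatter
    // is bound to the given locale, false if parsing fails.
    static boolean parses(String text, String pattern, Locale locale) {
        DateTimeFormatter formatter = DateTimeFormatter.ofPattern(pattern, locale);
        try {
            LocalDateTime.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            // For a non-English locale the month text "Mar" is not recognized.
            return false;
        }
    }

    public static void main(String[] args) {
        String text = "21 Mar 2018 12:23:34";
        String pattern = "dd MMM yyyy HH:mm:ss"; // assumed pattern
        System.out.println("en_US: " + parses(text, pattern, Locale.US));
        System.out.println("ja_JP: " + parses(text, pattern, Locale.JAPAN));
    }
}
```

With an English locale the month text `Mar` matches the `MMM` field. With 
`ja_JP` the formatter expects a Japanese short month name, so parsing stops at 
index 3, where `Mar` begins.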

## Solution

There are two ways to fix this problem by updating the test:
(1) Make the test locale-independent.
(2) Set the locale to en_US in the test.

(1) is better, but it is not easy to construct a DateTime string that can 
be parsed successfully in every existing locale.
Thus, I adopted (2) and modified the test.
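
As an illustration of approach (2), here is a minimal sketch of pinning the 
locale around a parse and restoring it afterwards. It is not the code in this 
pull request: it uses `java.time` from the JDK in place of `org.threeten.bp`, 
the pattern is assumed, and `DefaultLocaleSketch` and its methods are 
hypothetical names.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Locale;

// Hypothetical helper class, not part of ORC.
class DefaultLocaleSketch {

    // Stands in for code under test that builds its formatter without an
    // explicit locale and therefore depends on the JVM default FORMAT locale.
    static LocalDateTime parseWithDefaultLocale(String text, String pattern) {
        return LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern));
    }

    // Temporarily switches the default FORMAT locale, runs the parse,
    // and restores the previous locale even if parsing throws.
    static LocalDateTime parseUnderLocale(Locale locale, String text, String pattern) {
        Locale saved = Locale.getDefault(Locale.Category.FORMAT);
        Locale.setDefault(Locale.Category.FORMAT, locale);
        try {
            return parseWithDefaultLocale(text, pattern);
        } finally {
            Locale.setDefault(Locale.Category.FORMAT, saved);
        }
    }

    public static void main(String[] args) {
        // Assumed pattern; the sample string comes from the issue description.
        LocalDateTime parsed =
            parseUnderLocale(Locale.US, "21 Mar 2018 12:23:34", "dd MMM yyyy HH:mm:ss");
        System.out.println(parsed);
    }
}
```

Restoring the previous locale in a `finally` block keeps the change from 
leaking into other tests that run in the same JVM.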

You can merge this pull request into a Git repository by running:

$ git pull https://github.com/kotarot/orc ORC-410

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/orc/pull/314.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #314


commit 43ae8b80783c2af8e155c2fbbfb724bf86b9a5f2
Author: Kotaro Terada 
Date:   2018-09-27T03:45:31Z

Fix a locale-dependent test in TestCsvReader






[jira] [Commented] (ORC-410) Fix a locale-dependent test in TestCsvReader

2018-09-28 Thread Kotaro Terada (JIRA)


[ 
https://issues.apache.org/jira/browse/ORC-410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16631462#comment-16631462
 ] 

Kotaro Terada commented on ORC-410:
---

Please assign this Jira to me.
