[
https://issues.apache.org/jira/browse/PHOENIX-1142?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14301993#comment-14301993
]
James Taylor commented on PHOENIX-1142:
---------------------------------------
Let's let [~gabriel.reid] review too. Here's some additional concerns:
- The SimpleDateFormat is not thread safe for parsing, so you can't store it in
a local like this in DateUtil (but maybe I'm missing something?):
{code}
+ private static DateTimeParser defaultDateTimeParser;
{code}
- There's a facility to set the date timezone and date format on a config
property which might occur at connection time. This code seems to disable that?
{code}
diff --git
phoenix-core/src/main/java/org/apache/phoenix/parse/ToDateParseNode.java
phoenix-core/src/main/java/org/apache/phoenix/parse/ToDateParseNode.java
index 46bca63..ab1d76f 100644
--- phoenix-core/src/main/java/org/apache/phoenix/parse/ToDateParseNode.java
+++ phoenix-core/src/main/java/org/apache/phoenix/parse/ToDateParseNode.java
@@ -38,10 +38,10 @@ public class ToDateParseNode extends FunctionParseNode {
@Override
public FunctionExpression create(List<Expression> children,
StatementContext context) throws SQLException {
- Format dateParser;
+ DateUtil.DateTimeParser dateParser;
String dateFormat = (String) ((LiteralExpression)
children.get(1)).getValue();
String timeZoneId = (String) ((LiteralExpression)
children.get(2)).getValue();
- TimeZone parserTimeZone = context.getDateFormatTimeZone();
+ TimeZone parserTimeZone = null;
if (dateFormat == null) {
dateFormat = context.getDateFormat();
}
{code}
- It appears that the default date parsing is being set as a static variable
(JVM-wide), while ConnectionQueryServicesImpl is a per cluster object. This
doesn't seem to be correct. I think we should consider where the DateTimeParser
should live/be scoped to:
{code}
@@ -249,6 +238,8 @@ public class ConnectionQueryServicesImpl extends
DelegateQueryServices implement
.maximumSize(MAX_TABLE_STATS_CACHE_ENTRIES)
.expireAfterWrite(halfStatsUpdateFreq, TimeUnit.MILLISECONDS)
.build();
+ // Initialize DateUtil DefaultDatetimeParser
+ DateUtil.init(this.config);
}
{code}
- In general, we use the QueryServices.getProps() method to access property
values rather than go through Configuration. The reason is that we've seen this
impact performance because Configuration does locking while ReadOnlyProps does
not. The locking is not necessary, because the Configuration is not changing
in-place.
- What will the impact be of changing the DATE_TIME_PARSER (now
ISO_DATE_TIME_PARSER) on CSV bulk loading? We append the ' ' literal because we
use a YYYY-MM-DD HH:MM:SS.zzz format (which isn't ISO standard, I believe, as
ISO wants a 'T' instead).
{code}
+ private static final DateTimeFormatter ISO_DATE_TIME_PARSER = new
DateTimeFormatterBuilder()
.append(ISODateTimeFormat.dateParser())
.appendOptional(new DateTimeFormatterBuilder()
- .appendLiteral(' ')
+ .appendLiteral(' ').toParser())
+ .appendOptional(new DateTimeFormatterBuilder()
.append(ISODateTimeFormat.timeParser()).toParser())
.toFormatter()
+ .withZoneUTC()
.withChronology(ISOChronology.getInstanceUTC());
{code}
> Improve CsvBulkLoadTool to parse different Date formats
> -------------------------------------------------------
>
> Key: PHOENIX-1142
> URL: https://issues.apache.org/jira/browse/PHOENIX-1142
> Project: Phoenix
> Issue Type: Improvement
> Reporter: David Kjerrumgaard
> Assignee: Jeffrey Zhong
> Attachments: PHOENIX-1142.patch
>
>
> Currently, the CsvBulkLoadTool uses a single 'default' format to parse dates
> from the source CSV file. This can sometimes cause issues when the date
> fields don't match this default format.
> Would it be possible to add a method for specifying / overriding the
> 'default' date format?
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)