[ 
https://issues.apache.org/jira/browse/NIFI-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16367764#comment-16367764
 ] 

ASF GitHub Bot commented on NIFI-4882:
--------------------------------------

Github user derekstraka commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2473#discussion_r168847826
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/AbstractCSVRecordReader.java ---
    @@ -0,0 +1,110 @@
    +package org.apache.nifi.csv;
    +
    +
    +import org.apache.nifi.logging.ComponentLog;
    +import org.apache.nifi.serialization.RecordReader;
    +import org.apache.nifi.serialization.record.DataType;
    +import org.apache.nifi.serialization.record.RecordSchema;
    +import org.apache.nifi.serialization.record.util.DataTypeUtils;
    +import org.apache.nifi.serialization.record.RecordFieldType;
    +import java.text.DateFormat;
    +import java.util.function.Supplier;
    +
    +abstract public class AbstractCSVRecordReader implements RecordReader {
    +
    +    protected final ComponentLog logger;
    +    protected final boolean hasHeader;
    +    protected final boolean ignoreHeader;
    +
    +    protected final Supplier<DateFormat> LAZY_DATE_FORMAT;
    +    protected final Supplier<DateFormat> LAZY_TIME_FORMAT;
    +    protected final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
    +
    +    protected final String dateFormat;
    +    protected final String timeFormat;
    +    protected final String timestampFormat;
    +
    +    protected final RecordSchema schema;
    +
    +    AbstractCSVRecordReader(final ComponentLog logger, final RecordSchema schema, final boolean hasHeader, final boolean ignoreHeader,
    +                            final String dateFormat, final String timeFormat, final String timestampFormat) {
    +        this.logger = logger;
    +        this.schema = schema;
    +        this.hasHeader = hasHeader;
    +        this.ignoreHeader = ignoreHeader;
    +        this.dateFormat = (dateFormat == null) ? RecordFieldType.DATE.getDefaultFormat() : dateFormat;
    +        this.timeFormat = (timeFormat == null) ? RecordFieldType.TIME.getDefaultFormat(): timeFormat;
    +        this.timestampFormat = (timestampFormat == null) ? RecordFieldType.TIMESTAMP.getDefaultFormat(): timestampFormat;
    +
    +        final DateFormat df = (dateFormat == null) ? null : DataTypeUtils.getDateFormat(dateFormat);
    +        final DateFormat tf = (timeFormat == null) ? null : DataTypeUtils.getDateFormat(timeFormat);
    +        final DateFormat tsf = (timestampFormat == null) ? null : DataTypeUtils.getDateFormat(timestampFormat);
    --- End diff --
    
    I didn't actually check that. It was a copy-paste from the original
    source. I'll validate it and report what I find.


> CSVRecordReader should utilize specified date/time/timestamp format at its convertSimpleIfPossible method
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4882
>                 URL: https://issues.apache.org/jira/browse/NIFI-4882
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>            Reporter: Koji Kawamura
>            Assignee: Derek Straka
>            Priority: Major
>
> The CSVRecordReader.convertSimpleIfPossible method is used by ValidateRecord.
> The method does not coerce a value to the target schema field type if the raw
> string representation in the input CSV file is not compatible with that type.
> The type compatibility check is implemented as follows, but it does not use
> the user-specified date/time/timestamp format:
> {code}
>                 // This will return 'false' for input '01/01/1900' when user specified custom format 'MM/dd/YYYY'
>                 if (DataTypeUtils.isCompatibleDataType(trimmed, dataType)) {
>                     // The LAZY_DATE_FORMAT should be used to check compatibility, too.
>                     return DataTypeUtils.convertType(trimmed, dataType, LAZY_DATE_FORMAT, LAZY_TIME_FORMAT, LAZY_TIMESTAMP_FORMAT, fieldName);
>                 } else {
>                     return value;
>                 }
> {code}
> If the input date strings use a format other than the default format
> 'yyyy-MM-dd', the ValidateRecord processor cannot validate the input records.
> JacksonCSVRecordReader has methods identical to those of CSVRecordReader.
> The two classes should share an abstract base class.
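
The behaviour described above can be illustrated without any NiFi classes. The following is a minimal sketch, not the change made in the pull request: it assumes a hypothetical helper, isDateCompatible, that checks a raw CSV value against a caller-supplied pattern and falls back to the default 'yyyy-MM-dd' only when no pattern is given. The class and method names are invented for illustration. The example pattern uses lowercase 'yyyy' (calendar year), because uppercase 'YYYY' means week year in java.text.SimpleDateFormat.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;

// Hypothetical helper, not part of NiFi: a format-aware compatibility check.
final class DateCompatibilitySketch {

    // Default date pattern named in the issue description.
    static final String DEFAULT_DATE_PATTERN = "yyyy-MM-dd";

    // Returns true only if the whole trimmed value parses under the pattern.
    // A null pattern means "no custom format configured", so the default is used.
    static boolean isDateCompatible(final String value, final String pattern) {
        if (value == null) {
            return false;
        }
        final String trimmed = value.trim();
        final String effective = (pattern == null) ? DEFAULT_DATE_PATTERN : pattern;

        // SimpleDateFormat is not thread-safe, so a fresh instance is created per call.
        final SimpleDateFormat format = new SimpleDateFormat(effective);
        format.setLenient(false); // reject impossible dates such as 02/30/1900

        final ParsePosition position = new ParsePosition(0);
        final boolean parsed = format.parse(trimmed, position) != null;

        // Require that parsing consumed the entire string, not just a prefix.
        return parsed && position.getIndex() == trimmed.length();
    }

    public static void main(final String[] args) {
        // Checked against the default pattern only, as in the reported bug.
        System.out.println(isDateCompatible("01/01/1900", null));
        // Checked against the user-specified pattern.
        System.out.println(isDateCompatible("01/01/1900", "MM/dd/yyyy"));
    }
}
```

With a check of this shape, '01/01/1900' is rejected under the default pattern but accepted once the custom pattern is passed in, which is the distinction the issue asks the compatibility check to respect.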



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
