GitHub user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2473#discussion_r168883276
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/main/java/org/apache/nifi/csv/AbstractCSVRecordReader.java ---
    @@ -0,0 +1,140 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one or more
    + * contributor license agreements.  See the NOTICE file distributed with
    + * this work for additional information regarding copyright ownership.
    + * The ASF licenses this file to You under the Apache License, Version 2.0
    + * (the "License"); you may not use this file except in compliance with
    + * the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +
    +package org.apache.nifi.csv;
    +
    +
    +import org.apache.nifi.logging.ComponentLog;
    +import org.apache.nifi.serialization.RecordReader;
    +import org.apache.nifi.serialization.record.DataType;
    +import org.apache.nifi.serialization.record.RecordSchema;
    +import org.apache.nifi.serialization.record.util.DataTypeUtils;
    +import org.apache.nifi.serialization.record.RecordFieldType;
    +import java.text.DateFormat;
    +import java.util.function.Supplier;
    +
    +abstract public class AbstractCSVRecordReader implements RecordReader {
    +
    +    protected final ComponentLog logger;
    +    protected final boolean hasHeader;
    +    protected final boolean ignoreHeader;
    +
    +    protected final Supplier<DateFormat> LAZY_DATE_FORMAT;
    +    protected final Supplier<DateFormat> LAZY_TIME_FORMAT;
    +    protected final Supplier<DateFormat> LAZY_TIMESTAMP_FORMAT;
    +
    +    protected final String dateFormat;
    +    protected final String timeFormat;
    +    protected final String timestampFormat;
    +
    +    protected final RecordSchema schema;
    +
    +    AbstractCSVRecordReader(final ComponentLog logger, final RecordSchema schema, final boolean hasHeader, final boolean ignoreHeader,
    +                            final String dateFormat, final String timeFormat, final String timestampFormat) {
    +        this.logger = logger;
    +        this.schema = schema;
    +        this.hasHeader = hasHeader;
    +        this.ignoreHeader = ignoreHeader;
    +
    +        if (dateFormat == null || dateFormat.isEmpty()) {
    +            this.dateFormat = RecordFieldType.DATE.getDefaultFormat();
    +            LAZY_DATE_FORMAT = () -> null;
    --- End diff --
    
    @derekstraka Thanks for updating this. I'm wondering whether this is correct. If I understand correctly, `this.dateFormat` is used to check compatibility, but the actual data conversion is done with `LAZY_DATE_FORMAT`.
    And [DataTypeUtils will try to convert a String value to a Long if this Supplier returns null](https://github.com/apache/nifi/blob/master/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java#L511).
    
    If so, when `dateFormat` is null or empty and the input value is '2018-02-17', would the result be a NumberFormatException? The unit test covers the default 'yyyy-MM-dd' format and the custom format case, but it does not cover the case where a null/empty format is specified. Would you add that one, too?
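
    To illustrate the concern, here is a minimal, self-contained sketch. It is not the NiFi code: `NullFormatSketch` and `toDate` are hypothetical names, and the method is only a simplified stand-in that assumes the behavior described above, namely that a Supplier returning null causes the String to be parsed as a Long.

```java
import java.text.DateFormat;
import java.text.ParseException;
import java.util.Date;
import java.util.function.Supplier;

class NullFormatSketch {

    // Hypothetical stand-in for the String-to-Date conversion (an assumption,
    // not the real DataTypeUtils code): when the supplier yields no format,
    // the value is treated as epoch milliseconds.
    static Date toDate(final String value, final Supplier<DateFormat> format) {
        final DateFormat dateFormat = format.get();
        if (dateFormat == null) {
            // Long.parseLong throws NumberFormatException for non-numeric input.
            return new Date(Long.parseLong(value));
        }
        try {
            return dateFormat.parse(value);
        } catch (final ParseException e) {
            throw new IllegalArgumentException("Cannot parse date: " + value, e);
        }
    }

    public static void main(final String[] args) {
        // Mirrors the diff: the supplier returns null when no format was given.
        final Supplier<DateFormat> lazyDateFormat = () -> null;
        try {
            toDate("2018-02-17", lazyDateFormat);
            System.out.println("converted");
        } catch (final NumberFormatException e) {
            System.out.println("NumberFormatException: " + e.getMessage());
        }
    }
}
```

    Under that assumption, a formatted date string passes the compatibility check against the default pattern but then fails during conversion, which is the case the missing unit test would exercise.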
    
    Also, I wonder how we should treat data in Unix epoch representation. Looking at the DataTypeUtils.isDateTypeCompatible method, it actually checks whether a [String is compatible with Integer when format is null](https://github.com/apache/nifi/blob/master/nifi-commons/nifi-record/src/main/java/org/apache/nifi/serialization/record/util/DataTypeUtils.java#L535).
    
    Based on the above observation, I think we should set `this.dateFormat` to `null` if the `dateFormat` argument is null or empty, instead of using the default format, so that the value can be treated as a numeric representation of a date.
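
    A rough sketch of what I mean, with the caveat that everything in it is illustrative: `normalizeFormat` and `isDateCompatible` are hypothetical helpers, and the digits-only test is only an assumed simplification of the numeric check in DataTypeUtils.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;

class NullableFormatSketch {

    // Proposed normalization: a null or empty argument becomes null
    // instead of falling back to a default pattern.
    static String normalizeFormat(final String format) {
        return (format == null || format.isEmpty()) ? null : format;
    }

    // Hypothetical, simplified compatibility check (an assumption, not the
    // real isDateTypeCompatible): without a format the value must be numeric,
    // otherwise it must parse strictly with the given pattern.
    static boolean isDateCompatible(final String value, final String format) {
        if (format == null) {
            return value.matches("-?\\d+");
        }
        final SimpleDateFormat sdf = new SimpleDateFormat(format);
        sdf.setLenient(false);
        try {
            sdf.parse(value);
            return true;
        } catch (final ParseException e) {
            return false;
        }
    }

    public static void main(final String[] args) {
        // Empty format: only a numeric (epoch) value is accepted.
        System.out.println(isDateCompatible("1518825600000", normalizeFormat("")));
        System.out.println(isDateCompatible("2018-02-17", normalizeFormat("")));
        // Explicit format: the formatted string is accepted.
        System.out.println(isDateCompatible("2018-02-17", normalizeFormat("yyyy-MM-dd")));
    }
}
```

    With this shape, the compatibility check and the conversion would agree: a null format means numeric input on both sides.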
    
    What do you think?

