[ 
https://issues.apache.org/jira/browse/NIFI-4882?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16401689#comment-16401689
 ] 

ASF GitHub Bot commented on NIFI-4882:
--------------------------------------

Github user ijokarumawak commented on a diff in the pull request:

    https://github.com/apache/nifi/pull/2473#discussion_r175000719
  
    --- Diff: nifi-nar-bundles/nifi-standard-services/nifi-record-serialization-services-bundle/nifi-record-serialization-services/src/test/java/org/apache/nifi/csv/TestCSVRecordReader.java ---
    @@ -106,6 +117,195 @@ public void testDate() throws IOException, MalformedRecordException {
             }
         }
     
    +    @Test
    +    public void testDateNoCoersionSuccess() throws IOException, MalformedRecordException {
    +        final String text = "date\n11/30/1983";
    +
    +        final List<RecordField> fields = new ArrayList<>();
    +        fields.add(new RecordField("date", RecordFieldType.DATE.getDataType()));
    +        final RecordSchema schema = new SimpleRecordSchema(fields);
    +
    +        try (final InputStream bais = new ByteArrayInputStream(text.getBytes());
    +             final CSVRecordReader reader = new CSVRecordReader(bais, Mockito.mock(ComponentLog.class), schema, format, true, false,
    +                     "MM/dd/yyyy", RecordFieldType.TIME.getDefaultFormat(), RecordFieldType.TIMESTAMP.getDefaultFormat(), "UTF-8")) {
    +
    +            final Record record = reader.nextRecord(false, false);
    +            final java.sql.Date date = (Date) record.getValue("date");
    +            final Calendar calendar = Calendar.getInstance(TimeZone.getTimeZone("gmt"));
    +            calendar.setTimeInMillis(date.getTime());
    +
    +            assertEquals(1983, calendar.get(Calendar.YEAR));
    +            assertEquals(10, calendar.get(Calendar.MONTH));
    +            assertEquals(30, calendar.get(Calendar.DAY_OF_MONTH));
    +        }
    +    }
    +
    +    @Test
    +    public void testDateNoCoersionFailure() throws IOException, MalformedRecordException {
    +        final String text = "date\n11/30/1983";
    +
    +        final List<RecordField> fields = new ArrayList<>();
    +        fields.add(new RecordField("date", RecordFieldType.DATE.getDataType()));
    +        final RecordSchema schema = new SimpleRecordSchema(fields);
    +
    +        try (final InputStream bais = new ByteArrayInputStream(text.getBytes());
    +             final CSVRecordReader reader = new CSVRecordReader(bais, Mockito.mock(ComponentLog.class), schema, format, true, false,
    +                     "MM-dd-yyyy", RecordFieldType.TIME.getDefaultFormat(), RecordFieldType.TIMESTAMP.getDefaultFormat(), "UTF-8")) {
    +
    +            final Record record = reader.nextRecord(false, false);
    +            assertEquals("11/30/1983", (String)record.getValue("date"));
    --- End diff --
    
    An assertion message would be helpful here, e.g. "When values are not in the
    expected format, return String as it is".
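
To illustrate the suggestion, here is a minimal, self-contained sketch of an assertion that carries that message. It is not the PR's code: the `assertEquals` helper below is a local stand-in that only mirrors the parameter order of JUnit 4's `assertEquals(String message, Object expected, Object actual)`, and the class name and the `value` variable are hypothetical placeholders for the test's `record.getValue("date")`.

```java
import java.util.Objects;

class AssertionMessageSketch {

    // The reviewer's suggested wording, used verbatim as the failure message.
    static final String MESSAGE =
            "When values are not in the expected format, return String as it is";

    // Local stand-in (an assumption for this sketch) with the same parameter
    // order as JUnit 4's assertEquals(String message, Object expected, Object actual).
    static void assertEquals(String message, Object expected, Object actual) {
        if (!Objects.equals(expected, actual)) {
            throw new AssertionError(message + " expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for record.getValue("date") when the configured
        // date format does not match the input.
        Object value = "11/30/1983";
        assertEquals(MESSAGE, "11/30/1983", value);
        System.out.println("assertion passed");
    }
}
```

In the actual test the same message would simply be passed as the first argument to the existing `assertEquals` call, so a failure states the intended behavior.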


> CSVRecordReader should utilize specified date/time/timestamp format at its 
> convertSimpleIfPossible method
> ---------------------------------------------------------------------------------------------------------
>
>                 Key: NIFI-4882
>                 URL: https://issues.apache.org/jira/browse/NIFI-4882
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>            Reporter: Koji Kawamura
>            Assignee: Derek Straka
>            Priority: Major
>
> The CSVRecordReader.convertSimpleIfPossible method is used by ValidateRecord. The 
> method does not coerce a value to the target schema field type if the raw 
> string representation in the input CSV file is not compatible with that type.
> The type compatibility check is implemented as follows, but it does not use the 
> user-specified date/time/timestamp format:
> {code}
>                 // This will return 'false' for input '01/01/1900' when user specified custom format 'MM/dd/yyyy'
>                 if (DataTypeUtils.isCompatibleDataType(trimmed, dataType)) {
>                     // The LAZY_DATE_FORMAT should be used to check compatibility, too.
>                     return DataTypeUtils.convertType(trimmed, dataType, LAZY_DATE_FORMAT, LAZY_TIME_FORMAT, LAZY_TIMESTAMP_FORMAT, fieldName);
>                 } else {
>                     return value;
>                 }
> {code}
> If input date strings use a format other than the default 'yyyy-MM-dd', 
> the ValidateRecord processor cannot validate the input records.
> JacksonCSVRecordReader has methods identical to those of CSVRecordReader. The two 
> classes should share an abstract base class.
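
The behavior the issue asks for can be sketched with plain JDK classes. This is only an illustration under stated assumptions, not NiFi's implementation: the class and method names are hypothetical, `java.text.SimpleDateFormat` stands in for whatever NiFi uses internally, and only the DATE case is covered. The idea is that the compatibility check uses the same user-specified pattern as the conversion, and a value that does not match is returned unchanged as a String.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.TimeZone;

class FormatAwareConversionSketch {

    // Hypothetical helper: parse strictly with the given pattern and
    // return null unless the whole input matches it.
    static Date parseStrict(String value, String pattern) {
        SimpleDateFormat format = new SimpleDateFormat(pattern);
        format.setLenient(false);
        format.setTimeZone(TimeZone.getTimeZone("GMT"));
        ParsePosition position = new ParsePosition(0);
        Date parsed = format.parse(value, position);
        boolean consumedAll = position.getIndex() == value.length();
        return (parsed != null && consumedAll) ? parsed : null;
    }

    // Compatibility is decided with the same pattern that is used for conversion.
    static boolean isCompatible(String value, String pattern) {
        return parseStrict(value, pattern) != null;
    }

    // Convert when the value matches the pattern; otherwise return the raw String as it is.
    static Object convertIfPossible(String value, String pattern) {
        Date parsed = parseStrict(value, pattern);
        return parsed != null ? parsed : value;
    }

    public static void main(String[] args) {
        // With the default pattern the value from the issue is not compatible...
        System.out.println(isCompatible("01/01/1900", "yyyy-MM-dd"));
        // ...but it is compatible with the user-specified pattern.
        System.out.println(isCompatible("01/01/1900", "MM/dd/yyyy"));
        // A mismatching pattern leaves the value untouched, as in testDateNoCoersionFailure.
        System.out.println(convertIfPossible("11/30/1983", "MM-dd-yyyy"));
    }
}
```

Checking that the parse consumed the entire input is a deliberate choice in this sketch, because `SimpleDateFormat.parse` would otherwise accept a valid prefix followed by trailing text.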



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
