[jira] [Commented] (NIFI-6134) InferAvroSchema does not honour Analysis sampling.

Matt Burgess (JIRA) Tue, 26 Mar 2019 08:43:18 -0700


    [ 
https://issues.apache.org/jira/browse/NIFI-6134?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16801861#comment-16801861
 ]


Matt Burgess commented on NIFI-6134:
------------------------------------

The documentation for the Number Of Records To Analyze property states that it 
"only applies to JSON content type". Strangely enough, though, judging from the 
Kite code, CSV schema inference appears to be hard-coded to 25 rows (whether the 
property is set to 10 or not). I don't think the Kite SDK is being maintained, 
so unfortunately I think this has to be considered "works as designed". You may 
have better luck with the CSVReader schema inference available as of NiFi 1.9.0.

> InferAvroSchema does not honour Analysis sampling.
> --------------------------------------------------
>
>                 Key: NIFI-6134
>                 URL: https://issues.apache.org/jira/browse/NIFI-6134
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Core Framework
>    Affects Versions: 1.9.0
>         Environment: Windows
>            Reporter: Steven Fister
>            Priority: Critical
>         Attachments: image-2019-03-20-18-04-50-595.png
>
>
> When using InferAvroSchema with the inferred.avro.schema setting, it still 
> samples only 10 records even when the sample size is set to 25 or 250. In my 
> sample I skip the first line, which is a header that has blanks in it. The 
> module only samples the remaining 9 lines after the first line is skipped in 
> the CSV file.
> !image-2019-03-20-18-04-50-595.png!



