[
https://issues.apache.org/jira/browse/NIFI-5552?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16643496#comment-16643496
]
ASF GitHub Bot commented on NIFI-5552:
--------------------------------------
Github user mattyb149 commented on the issue:
https://github.com/apache/nifi/pull/2966
I'm a little leery of automatically normalizing the names. In
[NIFI-4612](https://issues.apache.org/jira/browse/NIFI-4612) we added the
ability to disable name validation for the AvroSchemaRegistry, so there could
be Avro-illegal field names but they'd be preserved. I think that came up for
things like PutDatabaseRecord where the field names needed to be preserved but
were not Avro-valid. If we automatically normalize them, there might be
unintended (and undesired) consequences for things like PutParquet and
PutDatabaseRecord.
Perhaps instead of (or in addition to) normalization, we could provide an
option to "disable schema validation"? It wouldn't technically be
Avro-specific, but it would be useful in the meantime, while we are so tightly
coupled to Avro schemas.
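To make the concern concrete, here is a minimal, self-contained sketch. It is not NiFi or Avro code: the class `AvroNameSketch` and its methods are hypothetical, and the `normalize` rule (replace each illegal character with `_`) is only one assumed way normalization could work. The name pattern follows the Avro specification's rule that a name starts with `[A-Za-z_]` and continues with `[A-Za-z0-9_]`.

```java
import java.util.regex.Pattern;

// Hypothetical illustration only; neither NiFi nor Avro ships this class.
public class AvroNameSketch {

    // Avro spec: names start with [A-Za-z_] and then contain only [A-Za-z0-9_].
    private static final Pattern AVRO_NAME = Pattern.compile("[A-Za-z_][A-Za-z0-9_]*");

    // Returns true if the field name would pass Avro's name rule.
    static boolean isValidAvroName(String name) {
        return name != null && AVRO_NAME.matcher(name).matches();
    }

    // Assumed normalization: replace every illegal character with '_',
    // and prefix '_' when the result is empty or starts with a digit.
    static String normalize(String name) {
        String replaced = name.replaceAll("[^A-Za-z0-9_]", "_");
        if (replaced.isEmpty() || Character.isDigit(replaced.charAt(0))) {
            replaced = "_" + replaced;
        }
        return replaced;
    }

    public static void main(String[] args) {
        // A quoted header as in NIFI-5552, plus two names a database column might use.
        String[] headers = {"\"Flood\"", "Flood", "unit-price", "unit price"};
        for (String header : headers) {
            System.out.println(header
                    + " valid=" + isValidAvroName(header)
                    + " normalized=" + normalize(header));
        }
    }
}
```

Under this assumed rule, `unit-price` and `unit price` both normalize to `unit_price`, so two distinct source columns would collide, and a downstream processor that matches on the original name (PutDatabaseRecord, for example) would no longer find it. An option to skip validation avoids rewriting the names at all.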
> CSVReader Can't Derive Schema from Quoted Headers
> -------------------------------------------------
>
> Key: NIFI-5552
> URL: https://issues.apache.org/jira/browse/NIFI-5552
> Project: Apache NiFi
> Issue Type: Bug
> Components: Extensions
> Affects Versions: 1.7.1
> Reporter: Shawn Weeks
> Assignee: Pierre Villard
> Priority: Minor
>
> When deriving the schema from a CSV file header, NiFi is unable to generate a
> valid schema if the header columns are double-quoted, even though the
> CSVReader is set to handle quotes. Using the nile.csv sample file from
> https://people.sc.fsu.edu/~jburkardt/data/csv/csv.html results in an "Illegal
> initial character" exception in the Avro schema generator. In this specific
> case the header did not contain any spaces or special characters, though it
> was case sensitive.
> {code:java}
> org.apache.avro.SchemaParseException: Illegal initial character: "Flood"
> at org.apache.avro.Schema.validateName(Schema.java:1147)
> at org.apache.avro.Schema.access$200(Schema.java:81)
> at org.apache.avro.Schema$Field.<init>(Schema.java:403)
> at org.apache.avro.Schema$Field.<init>(Schema.java:423)
> at org.apache.avro.Schema$Field.<init>(Schema.java:415)
> at org.apache.nifi.avro.AvroTypeUtil.buildAvroField(AvroTypeUtil.java:123)
> at org.apache.nifi.avro.AvroTypeUtil.buildAvroSchema(AvroTypeUtil.java:114)
> at org.apache.nifi.avro.AvroTypeUtil.extractAvroSchema(AvroTypeUtil.java:94)
> at org.apache.nifi.schema.access.WriteAvroSchemaAttributeStrategy.getAttributes(WriteAvroSchemaAttributeStrategy.java:58)
> at org.apache.nifi.json.WriteJsonResult.writeRecord(WriteJsonResult.java:137)
> at org.apache.nifi.serialization.AbstractRecordSetWriter.write(AbstractRecordSetWriter.java:59)
> at org.apache.nifi.processors.standard.AbstractRecordProcessor$1.process(AbstractRecordProcessor.java:122)
> at org.apache.nifi.controller.repository.StandardProcessSession.write(StandardProcessSession.java:2885)
> at org.apache.nifi.processors.standard.AbstractRecordProcessor.onTrigger(AbstractRecordProcessor.java:109)
> at org.apache.nifi.processor.AbstractProcessor.onTrigger(AbstractProcessor.java:27)
> at org.apache.nifi.controller.StandardProcessorNode.onTrigger(StandardProcessorNode.java:1165)
> at org.apache.nifi.controller.tasks.ConnectableTask.invoke(ConnectableTask.java:203)
> at org.apache.nifi.controller.scheduling.TimerDrivenSchedulingAgent$1.run(TimerDrivenSchedulingAgent.java:117)
> at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
> at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
> at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
> at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
> at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
> at java.lang.Thread.run(Thread.java:748){code}
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)