[ 
https://issues.apache.org/jira/browse/NIFI-5781?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Matt Burgess updated NIFI-5781:
-------------------------------
    Fix Version/s: 1.9.0

> Incorrect schema for provenance events in SiteToSiteProvenanceReportingTask
> ---------------------------------------------------------------------------
>
>                 Key: NIFI-5781
>                 URL: https://issues.apache.org/jira/browse/NIFI-5781
>             Project: Apache NiFi
>          Issue Type: Bug
>          Components: Extensions
>    Affects Versions: 1.7.0, 1.8.0, 1.7.1
>            Reporter: Pierre Villard
>            Assignee: Pierre Villard
>            Priority: Major
>             Fix For: 1.9.0
>
>
> The current schema does not allow null values for fields such as "details",
> "remoteIdentifier", "alternateIdentifier" and others. This Jira is to make
> the schema more flexible and allow null values in those fields.
> Until then, a null value in one of these fields causes an error like:
> {noformat}
> 2018-11-01 14:59:30,551 ERROR [Timer-Driven Process Thread-2] 
> o.a.n.r.SiteToSiteProvenanceReportingTask 
> SiteToSiteProvenanceReportingTask[id=0751c46f-0163-1000-7d33-f276e8654728] 
> Error running task 
> SiteToSiteProvenanceReportingTask[id=0751c46f-0163-1000-7d33-f276e8654728] 
> due to org.apache.avro.file.DataFileWriter$AppendWriteException: 
> java.lang.NullPointerException: null of string in field details of 
> nifi.provenanceEvent{noformat}
> +*Workaround*+: configure the writer with a custom schema, such as the one
> below, instead of inheriting the record schema.
> {noformat}
> {
>   "namespace": "nifi",
>   "name": "provenanceEvent",
>   "type": "record",
>   "fields": [
>     { "name": "eventId", "type": "string" },
>     { "name": "eventOrdinal", "type": "long" },
>     { "name": "eventType", "type": "string" },
>     { "name": "timestampMillis", "type": "long" },
>     { "name": "durationMillis", "type": "long" },
>     { "name": "lineageStart", "type": { "type": "long", "logicalType": "timestamp-millis" } },
>     { "name": "details", "type": ["null", "string"] },
>     { "name": "componentId", "type": ["null", "string"] },
>     { "name": "componentType", "type": ["null", "string"] },
>     { "name": "componentName", "type": ["null", "string"] },
>     { "name": "processGroupId", "type": ["null", "string"] },
>     { "name": "processGroupName", "type": ["null", "string"] },
>     { "name": "entityId", "type": ["null", "string"] },
>     { "name": "entityType", "type": ["null", "string"] },
>     { "name": "entitySize", "type": ["null", "long"] },
>     { "name": "previousEntitySize", "type": ["null", "long"] },
>     { "name": "updatedAttributes", "type": { "type": "map", "values": "string" } },
>     { "name": "previousAttributes", "type": { "type": "map", "values": "string" } },
>     { "name": "actorHostname", "type": ["null", "string"] },
>     { "name": "contentURI", "type": ["null", "string"] },
>     { "name": "previousContentURI", "type": ["null", "string"] },
>     { "name": "parentIds", "type": { "type": "array", "items": "string" } },
>     { "name": "childIds", "type": { "type": "array", "items": "string" } },
>     { "name": "platform", "type": "string" },
>     { "name": "application", "type": "string" },
>     { "name": "remoteIdentifier", "type": ["null", "string"] },
>     { "name": "alternateIdentifier", "type": ["null", "string"] },
>     { "name": "transitUri", "type": ["null", "string"] }
>   ]
> }{noformat}
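
The difference between the failing schema and the workaround comes down to whether a field's type is a union that includes "null". The sketch below illustrates that rule in plain Python without the Avro library. It is not NiFi or Avro code: the helpers `accepts_null` and `null_violations` are hypothetical names, the schema is an abbreviated copy of the workaround schema, and the "strict" variant is an assumption inferred from the error message ("null of string in field details").

```python
import json

# Abbreviated copy of the workaround schema (three fields only).
SCHEMA_JSON = """
{
  "namespace": "nifi",
  "name": "provenanceEvent",
  "type": "record",
  "fields": [
    { "name": "eventId", "type": "string" },
    { "name": "details", "type": ["null", "string"] },
    { "name": "entitySize", "type": ["null", "long"] }
  ]
}
"""


def accepts_null(avro_type):
    """Return True if an Avro type (as parsed JSON) admits a null value."""
    if avro_type == "null":
        return True
    # A JSON array denotes an Avro union; null is allowed if any branch allows it.
    if isinstance(avro_type, list):
        return any(accepts_null(branch) for branch in avro_type)
    # Object form, e.g. {"type": "long", "logicalType": "timestamp-millis"}.
    if isinstance(avro_type, dict):
        return accepts_null(avro_type.get("type"))
    return False


def null_violations(schema, record):
    """Names of fields that are None (or missing) in record but whose type forbids null."""
    return [
        field["name"]
        for field in schema["fields"]
        if record.get(field["name"]) is None and not accepts_null(field["type"])
    ]


schema = json.loads(SCHEMA_JSON)
event = {"eventId": "abc", "details": None, "entitySize": None}

# Workaround schema: "details" is ["null", "string"], so a null value passes.
print(null_violations(schema, event))   # prints []

# Assumed strict variant: "details" typed as a plain "string".
strict = json.loads(SCHEMA_JSON)
strict["fields"][1]["type"] = "string"
print(null_violations(strict, event))   # prints ['details']
```

Running the same check over the full workaround schema would flag only fields such as "eventId", "platform" and "application" when they are null, since those remain plain non-union types.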



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
