Pierre Villard created NIFI-5781:
------------------------------------
Summary: Incorrect schema for provenance events in SiteToSiteProvenanceReportingTask
Key: NIFI-5781
URL: https://issues.apache.org/jira/browse/NIFI-5781
Project: Apache NiFi
Issue Type: Bug
Components: Extensions
Affects Versions: 1.7.1, 1.8.0, 1.7.0
Reporter: Pierre Villard
Assignee: Pierre Villard
The current schema does not allow null values for the "details", "remoteIdentifier"
and "alternateIdentifier" fields.
When one of these fields is null, the task fails with an error like:
{noformat}
2018-11-01 14:59:30,551 ERROR [Timer-Driven Process Thread-2] o.a.n.r.SiteToSiteProvenanceReportingTask SiteToSiteProvenanceReportingTask[id=0751c46f-0163-1000-7d33-f276e8654728] Error running task SiteToSiteProvenanceReportingTask[id=0751c46f-0163-1000-7d33-f276e8654728] due to org.apache.avro.file.DataFileWriter$AppendWriteException: java.lang.NullPointerException: null of string in field details of nifi.provenanceEvent
{noformat}
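The failure mode can be illustrated without Avro itself. The following is a minimal Python sketch, not the Avro or NiFi implementation: `accepts_null` and `null_violations` are hypothetical helpers, and the two-field schemas and the sample event are made up for illustration. It relies only on the Avro rule that a field accepts null when its declared type is "null" or a union containing "null".

```python
def accepts_null(avro_type):
    """Return True if an Avro type declaration permits a null value."""
    if isinstance(avro_type, list):   # union, e.g. ["null", "string"]
        return any(accepts_null(t) for t in avro_type)
    if isinstance(avro_type, dict):   # complex type, e.g. {"type": "map", ...}
        return accepts_null(avro_type.get("type"))
    return avro_type == "null"


def null_violations(fields, record):
    """Names of fields whose value is None although the type forbids null."""
    return [
        f["name"]
        for f in fields
        if record.get(f["name"]) is None and not accepts_null(f["type"])
    ]


# Hypothetical event with no details, mirroring the stack trace above.
event = {"eventId": "1", "details": None}

strict = [{"name": "eventId", "type": "string"},
          {"name": "details", "type": "string"}]
relaxed = [{"name": "eventId", "type": "string"},
           {"name": "details", "type": ["null", "string"]}]

print(null_violations(strict, event))   # ['details']
print(null_violations(relaxed, event))  # []
```

With the plain "string" type the null "details" value is flagged, which corresponds to the "null of string in field details" message; with the union type it is accepted.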
+*Workaround*+: specify a writer with a custom schema, such as the one below,
instead of inheriting the record schema.
{noformat}
{
  "namespace": "nifi",
  "name": "provenanceEvent",
  "type": "record",
  "fields": [
    { "name": "eventId", "type": "string" },
    { "name": "eventOrdinal", "type": "long" },
    { "name": "eventType", "type": "string" },
    { "name": "timestampMillis", "type": "long" },
    { "name": "durationMillis", "type": "long" },
    { "name": "lineageStart", "type": { "type": "long", "logicalType": "timestamp-millis" } },
    { "name": "details", "type": ["null", "string"] },
    { "name": "componentId", "type": "string" },
    { "name": "componentType", "type": "string" },
    { "name": "componentName", "type": "string" },
    { "name": "processGroupId", "type": "string" },
    { "name": "processGroupName", "type": "string" },
    { "name": "entityId", "type": "string" },
    { "name": "entityType", "type": "string" },
    { "name": "entitySize", "type": ["null", "long"] },
    { "name": "previousEntitySize", "type": ["null", "long"] },
    { "name": "updatedAttributes", "type": { "type": "map", "values": "string" } },
    { "name": "previousAttributes", "type": { "type": "map", "values": "string" } },
    { "name": "actorHostname", "type": "string" },
    { "name": "contentURI", "type": "string" },
    { "name": "previousContentURI", "type": "string" },
    { "name": "parentIds", "type": { "type": "array", "items": "string" } },
    { "name": "childIds", "type": { "type": "array", "items": "string" } },
    { "name": "platform", "type": "string" },
    { "name": "application", "type": "string" },
    { "name": "remoteIdentifier", "type": ["null", "string"] },
    { "name": "alternateIdentifier", "type": ["null", "string"] },
    { "name": "transitUri", "type": ["null", "string"] }
  ]
}
{noformat}
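Before using a custom schema as a workaround, it may help to see which fields still reject null. The sketch below is only an illustration using the Python standard library: `fields_rejecting_null` is a hypothetical helper, and `SCHEMA_EXCERPT` is a shortened copy of the schema above, not the full definition.

```python
import json


def fields_rejecting_null(schema_json):
    """List the field names of an Avro record schema that do not accept null."""
    def accepts_null(avro_type):
        if isinstance(avro_type, list):   # union, e.g. ["null", "string"]
            return any(accepts_null(t) for t in avro_type)
        if isinstance(avro_type, dict):   # complex type such as map or array
            return accepts_null(avro_type.get("type"))
        return avro_type == "null"

    schema = json.loads(schema_json)
    return [f["name"] for f in schema["fields"] if not accepts_null(f["type"])]


# Excerpt of the workaround schema; the remaining fields are omitted for brevity.
SCHEMA_EXCERPT = """
{
  "namespace": "nifi",
  "name": "provenanceEvent",
  "type": "record",
  "fields": [
    { "name": "eventId", "type": "string" },
    { "name": "details", "type": ["null", "string"] },
    { "name": "entitySize", "type": ["null", "long"] },
    { "name": "contentURI", "type": "string" },
    { "name": "remoteIdentifier", "type": ["null", "string"] }
  ]
}
"""

print(fields_rejecting_null(SCHEMA_EXCERPT))  # ['eventId', 'contentURI']
```

Any field reported this way would presumably trigger the same kind of error if the reporting task ever produced a null value for it, so the list is a reasonable starting point when adjusting the schema.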