Hi Dominique,

Thank you for your interest in NiFi and Atlas integration. I have some experience with that, and I actually wrote the NiFi reporting task.

I have two things in mind that could be related to your situation.

One is NIFI-4971, which is under review now. It fixes a lineage reporting issue that occurs when the 'complete path' strategy is used. If you are using 'complete path', I'd recommend trying 'simple path' to see if that's the case.

The other one is Atlas not being able to consume messages from the Kafka topic fast enough. This happens when lots of messages are sent to the Atlas hook topic from NiFi. It is seen particularly when many different files are written to or retrieved from a file system and NiFi reports them, as those entities are reported individually.

The following command can be helpful to see how Atlas consumes messages. If the LAG value is large, that many messages are still waiting to be consumed and processed by Atlas.

# Sometimes Atlas consumer is not catching up and entities are not created even if NiFi reported as expected
$KAFKA_HOME/bin/kafka-consumer-groups.sh --bootstrap-server server:port --describe --group atlas
GROUP  TOPIC       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG   OWNER
atlas  ATLAS_HOOK  0          24944           31897           6953

Thanks,
Koji

On Thu, Apr 26, 2018 at 6:50 PM, Dominique De Vito <[email protected]> wrote:
> Hi,
>
> I have defined a simple pipeline in Nifi:
>
> GetFile => some processor doing a dummy transformation => PublishInKafka
>
> ...............with Atlas integration for lineage purposes
>
> Versions:
> -- Atlas 0.8.0 (Stack : HDP 2.6.4)
> -- Nifi 1.5.0
>
> and I have put some (dummy) file into the input directory, and it went up to
> the end of the pipeline.
>
> Results:
>
> * a "nifi_flow" entity and a "nifi_flow_path" entity were defined in Atlas
> <= good
>
> * PROBLEM_1: the "nifi_flow_path" entity has no input, neither output.
>
> But I see in the Nifi logs a trace stating that Nifi has sent a
> "ENTITY_PARTIAL_UPDATE" json to Atlas HOOK topic, with correct input and
> output.
>
> So, something looks like broken in Nifi<=>Atlas link, or within Atlas.
>
> * PROBLEM_2 (but Atlas related): when I use the GUI, Atlas says it can't
> found the "nifi_flow" entity while it's available through the REST api:
>
> 2018-04-24 05:48:14,317 ERROR - [pool-2-thread-5 -
> 3076c14e-9bb4-44a7-8299-d56476f3ec89:] ~ graph rollback due to exception
> AtlasBaseException:Instance nifi_flow with unique attribute
> {qualifiedName=76d4acd9-0162-1000-257a-7393e17b3a16@mycluster5} does not
> exist (GraphTransactionInterceptor:73)
>
> ============>
>
> So my questions:
>
> 1) Did anyone meet such problems ?
>
> 2) Does anyone have had some (good) experience integrating Nifi with Atlas ?
>
> Thanks.
>
> Dominique
>
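
A rough sketch of the lag check described in the reply, for anyone who wants to automate it: the Python snippet below parses the text printed by kafka-consumer-groups.sh --describe and reads the LAG column per topic and partition. The function name parse_consumer_lag and the SAMPLE text are illustrative assumptions, not part of NiFi, Atlas, or Kafka. Columns are looked up by header name because the column set differs between Kafka versions.

```python
def parse_consumer_lag(describe_output):
    """Return {(topic, partition): lag} from `kafka-consumer-groups.sh --describe` text.

    Hypothetical helper: it only relies on the TOPIC, PARTITION and LAG
    header names being present in the output.
    """
    lags = {}
    columns = None
    for line in describe_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The header row tells us where each column sits.
        if "TOPIC" in fields and "PARTITION" in fields and "LAG" in fields:
            columns = {name: index for index, name in enumerate(fields)}
            continue
        if columns is None:
            continue  # ignore anything printed before the header
        needed = max(columns["TOPIC"], columns["PARTITION"], columns["LAG"])
        if len(fields) <= needed:
            continue  # row too short to contain a LAG value
        try:
            partition = int(fields[columns["PARTITION"]])
            lag = int(fields[columns["LAG"]])
        except ValueError:
            continue  # non-numeric values, e.g. "-" when nothing is committed
        lags[(fields[columns["TOPIC"]], partition)] = lag
    return lags


# Sample text copied from the output quoted in the reply above.
SAMPLE = """\
GROUP  TOPIC       PARTITION  CURRENT-OFFSET  LOG-END-OFFSET  LAG   OWNER
atlas  ATLAS_HOOK  0          24944           31897           6953
"""

lags = parse_consumer_lag(SAMPLE)
total_lag = sum(lags.values())
print(lags, total_lag)
```

Running the command periodically and comparing total_lag between runs shows whether Atlas is catching up (lag shrinking) or falling further behind (lag growing).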
