James Medel created NIFI-7411:
---------------------------------

             Summary: Integrates NiFi with H2O Driverless AI MOJO Scoring 
Pipeline (Java Runtime) To Do ML Inference
                 Key: NIFI-7411
                 URL: https://issues.apache.org/jira/browse/NIFI-7411
             Project: Apache NiFi
          Issue Type: New Feature
          Components: Extensions
    Affects Versions: 1.12.0
         Environment: Mac OS X Mojave 10.14.6
            Reporter: James Medel


*NiFi and H2O Driverless AI Integration* via Custom NiFi Processor:

Integrates NiFi with H2O Driverless AI by using Driverless AI's MOJO Scoring 
Pipeline (in Java Runtime) and NiFi's Custom Processor. This processor executes 
the MOJO Scoring Pipeline to do batch scoring or real-time scoring for one or 
more predicted labels on tabular data in the incoming flow file content. If the 
tabular data is one row, then the MOJO does real-time scoring. If the tabular 
data is multiple rows, then the MOJO does batch scoring. I would like to 
contribute my processor to NiFi as a new feature.

*1 Custom Processor* created for NiFi:

*ExecuteMojoScoringRecord* - Executes H2O Driverless AI's MOJO Scoring Pipeline 
in Java Runtime to do batch scoring or real-time scoring on a frame of data 
within each incoming flow file. It requires the user to add *mojo2-runtime.jar* 
filepath into *MOJO2 Runtime JAR Directory* ** property to dynamically modify 
the classpath. It also requires the user to add the *pipeline.mojo* filepath 
into the *Pipeline MOJO Filepath* property. This property is used in the 
onTrigger() method to get the pipeline.mojo filepath, so we can pass it into the
MojoPipeline.loadFrom(pipelineMojoPath) to instantiate our MojoPipeline model. 
Then the record read in with Record Reader and the model are passed into a 
predict() method to make predictions on the test data within the record. Inside 
the predict() method, I use MojoFrameBuilder and MojoRowBuilder with the 
recordMap to build an input MojoFrame. Then I use the model's transform(input 
MojoFrame) method to make the predictions on the input and store them into an 
output MojoFrame. I iterate through the MojoFrame by row and column to store 
each key value pair prediction into the predictedRecordMap. I then convert the 
predictedRecordMap to predictedRecord and return the record back to onTrigger 
to write the record to the flow file content using RecordSetWriter. We keep 
writing predicted Records to the flow file content until there are no more 
records to write. Then we reach near end of onTrigger() and the flow file is 
either transferred on relationship failure, success or original to the next 
connection.
 
*Hydraulic System Condition Monitoring* Data used in NiFi Flow:
 
The sensor test data I used in this integration comes from UCI ML Repo: 
Condition Monitoring for Hydraulic Systems. I was able to predict the hydraulic 
cooling condition through NiFi and H2O Integration described above. This use 
case is hydraulic system predictive maintenance.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

Reply via email to