James Medel created NIFI-7411:
---------------------------------
Summary: Integrates NiFi with H2O Driverless AI MOJO Scoring
Pipeline (Java Runtime) To Do ML Inference
Key: NIFI-7411
URL: https://issues.apache.org/jira/browse/NIFI-7411
Project: Apache NiFi
Issue Type: New Feature
Components: Extensions
Affects Versions: 1.12.0
Environment: Mac OS X Mojave 10.14.6
Reporter: James Medel
*NiFi and H2O Driverless AI Integration* via Custom NiFi Processor:
Integrates NiFi with H2O Driverless AI by using Driverless AI's MOJO Scoring
Pipeline (in Java Runtime) and NiFi's Custom Processor. This processor executes
the MOJO Scoring Pipeline to do batch scoring or real-time scoring for one or
more predicted labels on tabular data in the incoming flow file content. If the
tabular data is one row, then the MOJO does real-time scoring. If the tabular
data is multiple rows, then the MOJO does batch scoring. I would like to
contribute my processor to NiFi as a new feature.
*1 Custom Processor* created for NiFi:
*ExecuteMojoScoringRecord* - Executes H2O Driverless AI's MOJO Scoring Pipeline
in Java Runtime to do batch scoring or real-time scoring on a frame of data
within each incoming flow file. The user must enter the *mojo2-runtime.jar*
filepath in the *MOJO2 Runtime JAR Directory* property, which is used to
dynamically modify the classpath. The user must also enter the *pipeline.mojo*
filepath in the *Pipeline MOJO Filepath* property. The onTrigger() method reads
this second property to get the pipeline.mojo filepath and passes it to
MojoPipeline.loadFrom(pipelineMojoPath) to instantiate the MojoPipeline model.
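
A minimal sketch of the two steps just described, using only JDK classes: it
builds a class loader over the JARs in a directory, then calls
MojoPipeline.loadFrom(pipelineMojoPath) through that loader. MojoRuntimeLoader
and its method names are hypothetical and are not taken from the contributed
processor, and the fully qualified name ai.h2o.mojos.runtime.MojoPipeline is an
assumption about the mojo2-runtime package layout. In a real processor, NiFi can
handle the classpath step itself when the property descriptor is built with
dynamicallyModifiesClasspath(true), so this only illustrates the underlying idea.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.net.URLClassLoader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper for illustration; not part of the contributed processor. */
final class MojoRuntimeLoader {

    /** Returns a URL for every *.jar file directly inside jarDir, sorted by path. */
    static URL[] jarUrls(File jarDir) {
        File[] jars = jarDir.listFiles((dir, name) -> name.endsWith(".jar"));
        if (jars == null) {
            throw new IllegalArgumentException("Not a readable directory: " + jarDir);
        }
        Arrays.sort(jars);
        List<URL> urls = new ArrayList<>();
        for (File jar : jars) {
            try {
                urls.add(jar.toURI().toURL());
            } catch (MalformedURLException e) {
                throw new UncheckedIOException(e);
            }
        }
        return urls.toArray(new URL[0]);
    }

    /** Builds a class loader that sees the JARs in jarDir in addition to the parent's classes. */
    static URLClassLoader classLoaderFor(File jarDir, ClassLoader parent) {
        return new URLClassLoader(jarUrls(jarDir), parent);
    }

    /**
     * Calls the static MojoPipeline.loadFrom(String) reflectively through the given loader.
     * Assumption: the class lives at ai.h2o.mojos.runtime.MojoPipeline.
     */
    static Object loadPipeline(ClassLoader loader, String pipelineMojoPath)
            throws ReflectiveOperationException {
        Class<?> pipelineClass = Class.forName("ai.h2o.mojos.runtime.MojoPipeline", true, loader);
        return pipelineClass.getMethod("loadFrom", String.class).invoke(null, pipelineMojoPath);
    }
}
```

Reflection is used here only so the sketch compiles without mojo2-runtime.jar on
the compile-time classpath; code that is compiled against the runtime can call
MojoPipeline.loadFrom(pipelineMojoPath) directly.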
Each record read by the Record Reader is then passed, together with the model,
into a predict() method that makes predictions on the test data within the
record. Inside predict(), I use MojoFrameBuilder and MojoRowBuilder with the
recordMap to build an input MojoFrame. I then call the model's transform()
method on the input MojoFrame to make the predictions and store them in an
output MojoFrame. I iterate through the output MojoFrame by row and column to
store each key-value pair prediction in the predictedRecordMap. I then convert
the predictedRecordMap to a predictedRecord and return it to onTrigger(), which
writes the record to the flow file content using a RecordSetWriter. Predicted
records are written to the flow file content until there are no more records to
write. Near the end of onTrigger(), the flow file is transferred to the failure,
success, or original relationship, which leads to the next connection.
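
The row-and-column walk over the output frame can be sketched as below.
OutputFrame is a hypothetical, column-oriented stand-in so the example compiles
without mojo2-runtime.jar; it is not the real MojoFrame class, and
PredictionMapper is likewise an illustrative name rather than code from the
processor. The sketch returns one map per output row, so a single-row frame
(real-time scoring) and a multi-row frame (batch scoring) go through the same
code path.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a column-oriented output frame; not the real MojoFrame. */
final class OutputFrame {
    final String[] columnNames;
    final String[][] columns; // columns[c][r] holds the value of column c at row r

    OutputFrame(String[] columnNames, String[][] columns) {
        if (columnNames.length != columns.length) {
            throw new IllegalArgumentException("One name is required per column");
        }
        this.columnNames = columnNames;
        this.columns = columns;
    }

    int rowCount() {
        return columns.length == 0 ? 0 : columns[0].length;
    }
}

/** Hypothetical helper mirroring the "iterate by row and column" step. */
final class PredictionMapper {

    /** Builds one ordered column-name-to-value map per row of the frame. */
    static List<Map<String, Object>> toRecordMaps(OutputFrame frame) {
        List<Map<String, Object>> rows = new ArrayList<>();
        for (int r = 0; r < frame.rowCount(); r++) {
            // LinkedHashMap keeps the prediction columns in frame order.
            Map<String, Object> predictedRecordMap = new LinkedHashMap<>();
            for (int c = 0; c < frame.columnNames.length; c++) {
                predictedRecordMap.put(frame.columnNames[c], frame.columns[c][r]);
            }
            rows.add(predictedRecordMap);
        }
        return rows;
    }
}
```

For example, a frame with made-up columns label.a = {"0.9", "0.2"} and
label.b = {"0.1", "0.8"} yields two maps, the first being
{label.a=0.9, label.b=0.1}. Each map would then be converted to a record and
handed to the RecordSetWriter.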
*Hydraulic System Condition Monitoring* Data used in NiFi Flow:
The sensor test data I used in this integration comes from the UCI ML Repo
dataset Condition Monitoring for Hydraulic Systems. Using the NiFi and H2O
integration described above, I was able to predict the hydraulic cooling
condition. The use case is predictive maintenance for hydraulic systems.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)