james94 commented on pull request #4242: URL: https://github.com/apache/nifi/pull/4242#issuecomment-622609665
@pvillard31 To answer your first question, "**does the current implementation implies that all fields of the input record must be used for the prediction?**": No, not all fields of the input record need to be used for the prediction. Going back to your example, if we pass features `A,B,C` to the MOJO Pipeline, it filters out the features it doesn't need. The MOJO Pipeline ignores feature `C` and makes the prediction for label `D` based on features `A,B`, so users don't have to remove fields manually.

To answer your second question, "**what will be the name of the field for the prediction, is there a way to specify/force the name?**": The MOJO Pipeline already has the prediction field name(s). When the MOJO Pipeline is built in Driverless AI, part of the metadata it is given is the predicted field name(s). In the processor's `predict()` method, I use the MojoPipeline model to make the prediction on the input test data and then convert the resulting MojoFrame into a predictedRecordMap. This hash map contains key-value pairs of one or more predicted field names and field values, so the resulting predictedRecord also holds those predicted field names and values. When the user configures the CSVRecordSetWriter, they can choose "Inherit Record Schema" for Schema Access Strategy to get the predicted field names from the predictedRecord.

I also have a GitHub repo with 2 NiFi templates and some example data for using the **ExecuteMojoScoringRecord** processor in a Hydraulic System Predictive Maintenance use case. Since this processor uses a Driverless AI MOJO Scoring Pipeline, the user will need a Driverless AI License Key to use the processor.

https://github.com/james94/Hydraulic-System-Predictive-Maintenance

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected]
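
For readers following the comment above, the flow it describes (ignore unneeded input fields, score, then build a map of predicted field names to values) can be sketched in plain Java. This is only an illustration under stated assumptions: it does not use the Driverless AI MOJO runtime or the NiFi record API, and the class `PredictedRecordSketch`, the methods `selectFeatures` and `predict`, and the `scorer` function are hypothetical stand-ins, not code from the pull request.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

public class PredictedRecordSketch {

    // Keep only the fields the model declares as features; any extra
    // input fields (such as C in the comment's example) are ignored.
    static Map<String, Object> selectFeatures(Map<String, Object> record,
                                              List<String> featureNames) {
        Map<String, Object> selected = new LinkedHashMap<>();
        for (String name : featureNames) {
            if (record.containsKey(name)) {
                selected.put(name, record.get(name));
            }
        }
        return selected;
    }

    // Hypothetical stand-in for scoring: 'scorer' plays the role of the
    // model, and 'predictedNames' plays the role of the predicted field
    // names carried in the model's metadata.
    static Map<String, Object> predict(Map<String, Object> record,
                                       List<String> featureNames,
                                       List<String> predictedNames,
                                       Function<Map<String, Object>, List<Object>> scorer) {
        Map<String, Object> features = selectFeatures(record, featureNames);
        List<Object> values = scorer.apply(features);

        // Pair each predicted field name with its predicted value.
        Map<String, Object> predictedRecordMap = new LinkedHashMap<>();
        for (int i = 0; i < predictedNames.size(); i++) {
            predictedRecordMap.put(predictedNames.get(i), values.get(i));
        }
        return predictedRecordMap;
    }

    public static void main(String[] args) {
        Map<String, Object> record = new LinkedHashMap<>();
        record.put("A", 1.0);
        record.put("B", 2.0);
        record.put("C", 99.0); // not a model feature, so it is ignored

        // Toy scorer (an assumption for the demo): D = A + B.
        Map<String, Object> predicted = predict(
                record,
                List.of("A", "B"),
                List.of("D"),
                f -> List.of((Double) f.get("A") + (Double) f.get("B")));

        System.out.println(predicted); // prints {D=3.0}
    }
}
```

The predicted field name `D` comes from the name list passed alongside the scorer rather than from the caller's input record, which mirrors the point in the comment that the names come from the model's metadata and not from processor configuration.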
