Github user ijokarumawak commented on the issue:
https://github.com/apache/nifi/pull/2686
@mans2singh I have been thinking about how to use this processor in a
practical NiFi data flow. The processor can certainly use a deep learning model
for classification or regression, but the current approach of writing only the
evaluation result into the FlowFile content may not be useful in real data flows.
Let's say a user wants to route incoming data into different branches of a
data flow to be processed differently. The most basic use case would be binary
classification: if a given record is predicted as class A, then do something,
such as sending an alert. To produce a meaningful alert, we need to carry the
original data along. Rewriting the FlowFile content with only the result makes
it difficult to tie the original data to the prediction result, which makes it
hard to construct the subsequent flow.
After considering real use cases further, I started feeling this is more of
an Enrich or Lookup pattern.
```
# Original dataset
Record1
Record2
Record3
# Convert the original dataset into feature vectors to apply a model, while
keeping the original data to preserve the relationships.
Record1, Feature Vector1
Record2, Feature Vector2
Record3, Feature Vector3
# Then we can further enrich records with prediction results
Record1, Result1 (A:0.9, B:0.1)
Record2, Result2 (A:0.85, B:0.15)
Record3, Result3 (A:0.05, B:0.95)
# Once we have such a FlowFile, we can filter the dataset based on the
prediction
# Route class A into flow branch A
Record1
Record2
# Route class B into branch B
Record3
# Then we can produce some meaningful report using original information
Send an alert based on Record3
```
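To make the pattern above concrete, here is a minimal, self-contained sketch of the enrich-then-route steps. The class names, the toy `predict` rule, and the `Enriched` pairing are all illustrative stand-ins I made up for this comment, not NiFi APIs or code from this PR; in a real flow the routing would be done by record-aware processors rather than plain Java lists.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of the enrich-then-route pattern described above.
public class EnrichRouteSketch {

    // A record paired with its per-class prediction scores.
    record Enriched(String record, Map<String, Double> scores) {}

    // Stand-in for a trained model: scores each record for classes A and B.
    static Map<String, Double> predict(String rec) {
        // Toy rule: records ending in "3" look like class B.
        return rec.endsWith("3")
                ? Map.of("A", 0.05, "B", 0.95)
                : Map.of("A", 0.9, "B", 0.1);
    }

    // Enrich every record with its prediction, keeping the original data.
    static List<Enriched> enrich(List<String> records) {
        return records.stream()
                .map(r -> new Enriched(r, predict(r)))
                .collect(Collectors.toList());
    }

    // Route: keep only the original records predicted as the given class.
    static List<String> route(List<Enriched> enriched, String clazz) {
        return enriched.stream()
                .filter(e -> e.scores().entrySet().stream()
                        .max(Map.Entry.comparingByValue())
                        .orElseThrow().getKey().equals(clazz))
                .map(Enriched::record)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Enriched> enriched = enrich(List.of("Record1", "Record2", "Record3"));
        System.out.println("Branch A: " + route(enriched, "A"));
        System.out.println("Branch B: " + route(enriched, "B"));
    }
}
```

The point is that the original record survives every step, so branch B can still report on `Record3` itself, not just on a bare prediction score.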
Have you ever looked at the LookupRecord processor and the RecordLookupService
controller service? I think we can do more interesting things if we implement
this as a RecordLookupService.
What do you think?
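As a rough sketch of what that could look like: the `Lookup` interface below is a stand-in I wrote to mirror the general shape of a lookup service (coordinates in, optional value out); it is not the real `org.apache.nifi.lookup` API, and the toy classifier stands in for an actual model.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Stand-in interface mirroring the shape of a lookup service:
// a map of coordinates goes in, an optional value comes out.
public class PredictionLookupSketch {

    interface Lookup<T> {
        Set<String> getRequiredKeys();
        Optional<T> lookup(Map<String, Object> coordinates);
    }

    // A "model lookup": the coordinates carry the feature vector,
    // and the returned value is the predicted class label.
    static class ModelLookup implements Lookup<String> {
        @Override
        public Set<String> getRequiredKeys() {
            return Set.of("features");
        }

        @Override
        public Optional<String> lookup(Map<String, Object> coordinates) {
            Object raw = coordinates.get("features");
            if (!(raw instanceof double[] features)) {
                return Optional.empty();
            }
            // Toy binary classifier standing in for a real model:
            // a positive feature sum means class A, otherwise class B.
            double sum = 0;
            for (double f : features) {
                sum += f;
            }
            return Optional.of(sum > 0 ? "A" : "B");
        }
    }

    public static void main(String[] args) {
        ModelLookup lookup = new ModelLookup();
        System.out.println(lookup.lookup(Map.of("features", new double[]{0.4, 0.3})));  // Optional[A]
        System.out.println(lookup.lookup(Map.of("features", new double[]{-1.0, 0.2}))); // Optional[B]
    }
}
```

Exposing the model this way would let LookupRecord handle the enrichment and keep the original record intact, instead of the processor overwriting the FlowFile content.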
---