Github user ijokarumawak commented on the issue:
https://github.com/apache/nifi/pull/2686
@mans2singh I have been thinking about how to use this processor in a
practical NiFi data flow. The processor can certainly use a deep learning model
for classification or regression, but the current approach of writing only the
evaluation result into the FlowFile content may not be useful in real data flows.
Let's say a user wants to route incoming data into different branches of a
data flow to be processed differently. The most basic use case would be binary
classification: if a given record is predicted as class A, then do something,
such as sending an alert. To produce a meaningful alert, we need to carry the
original data along. Rewriting the FlowFile content with only the result makes
it difficult to tie the original data to the prediction result, which makes it
hard to construct the subsequent flow.
After considering real use cases further, I started feeling this is more of
an Enrich or Lookup pattern.
```
# Original dataset
Record1
Record2
Record3
# Convert the original dataset into feature vectors to apply a model, while
keeping the original data to preserve the relationships.
Record1, Feature Vector1
Record2, Feature Vector2
Record3, Feature Vector3
# Then we can further enrich records with prediction results
Record1, Result1 (A:0.9, B:0.1)
Record2, Result2 (A:0.85, B:0.15)
Record3, Result3 (A:0.05, B:0.95)
# Once we have such a FlowFile, we can filter the dataset based on the
prediction
# Route class A into flow branch A
Record1
Record2
# Route class B into branch B
Record3
# Then we can produce some meaningful report using original information
Send an alert based on Record3
```
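To make the pattern above concrete, here is a minimal, self-contained sketch of the enrich-then-route steps. The class names, the toy `predict` rule, and the `Enriched` pairing are all illustrative stand-ins I made up for this comment, not NiFi APIs or code from this PR; in a real flow the routing would be done by record-aware processors rather than plain Java lists.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Illustrative sketch of the enrich-then-route pattern described above.
public class EnrichRouteSketch {

    // A record paired with its per-class prediction scores.
    record Enriched(String record, Map<String, Double> scores) {}

    // Stand-in for a trained model: scores each record for classes A and B.
    static Map<String, Double> predict(String rec) {
        // Toy rule: records ending in "3" look like class B.
        return rec.endsWith("3")
                ? Map.of("A", 0.05, "B", 0.95)
                : Map.of("A", 0.9, "B", 0.1);
    }

    // Enrich every record with its prediction, keeping the original data.
    static List<Enriched> enrich(List<String> records) {
        return records.stream()
                .map(r -> new Enriched(r, predict(r)))
                .collect(Collectors.toList());
    }

    // Route: keep only the original records predicted as the given class.
    static List<String> route(List<Enriched> enriched, String clazz) {
        return enriched.stream()
                .filter(e -> e.scores().entrySet().stream()
                        .max(Map.Entry.comparingByValue())
                        .orElseThrow().getKey().equals(clazz))
                .map(Enriched::record)
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Enriched> enriched = enrich(List.of("Record1", "Record2", "Record3"));
        System.out.println("Branch A: " + route(enriched, "A"));
        System.out.println("Branch B: " + route(enriched, "B"));
    }
}
```

The point is that the original record survives every step, so branch B can still report on `Record3` itself, not just on a bare prediction score.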
Have you ever looked at the LookupRecord processor and the RecordLookupService
controller service? I think we can do more interesting things if we implement
this as a RecordLookupService.
What do you think?
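As a rough sketch of what that could look like: the `Lookup` interface below is a stand-in I wrote to mirror the general shape of a lookup service (coordinates in, optional value out); it is not the real `org.apache.nifi.lookup` API, and the toy classifier stands in for an actual model.

```java
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Stand-in interface mirroring the shape of a lookup service:
// a map of coordinates goes in, an optional value comes out.
public class PredictionLookupSketch {

    interface Lookup<T> {
        Set<String> getRequiredKeys();
        Optional<T> lookup(Map<String, Object> coordinates);
    }

    // A "model lookup": the coordinates carry the feature vector,
    // and the returned value is the predicted class label.
    static class ModelLookup implements Lookup<String> {
        @Override
        public Set<String> getRequiredKeys() {
            return Set.of("features");
        }

        @Override
        public Optional<String> lookup(Map<String, Object> coordinates) {
            Object raw = coordinates.get("features");
            if (!(raw instanceof double[] features)) {
                return Optional.empty();
            }
            // Toy binary classifier standing in for a real model:
            // a positive feature sum means class A, otherwise class B.
            double sum = 0;
            for (double f : features) {
                sum += f;
            }
            return Optional.of(sum > 0 ? "A" : "B");
        }
    }

    public static void main(String[] args) {
        ModelLookup lookup = new ModelLookup();
        System.out.println(lookup.lookup(Map.of("features", new double[]{0.4, 0.3})));  // Optional[A]
        System.out.println(lookup.lookup(Map.of("features", new double[]{-1.0, 0.2}))); // Optional[B]
    }
}
```

Exposing the model this way would let LookupRecord handle the enrichment and keep the original record intact, instead of the processor overwriting the FlowFile content.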
---