Hi Shankha,

The fuzzy hash processors operate on the content of the flowfile. You would 
first use a processor to ingest the “data file” content. This could be 
something like GetFile, GetHDFS, GetSFTP, InvokeHTTP, etc. depending on the 
source of the file. Once that step is done, the flowfile content will contain 
the data file bytes. If you want to perform the fuzzy hash calculation on the 
entire data file content, you can connect the success relationship from the 
ingest processor directly to FuzzyHashContent, and the resulting flowfile will 
contain an attribute with the calculated hash value. If you want to perform the 
calculation over only specific parts of the data file, first route the flowfile 
through a processor that reduces the content to those parts, for example 
EvaluateJsonPath, EvaluateXPath, or ReplaceText, and then connect that 
processor to FuzzyHashContent.
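
To make the "specific parts" case concrete, here is a rough Python sketch of 
what the content-manipulation step accomplishes, outside of NiFi. It is only an 
illustration: select_fields and the sample record are hypothetical names I made 
up for this example, not NiFi APIs. In a real flow the extraction is done by a 
processor such as EvaluateJsonPath, and the hash itself is computed by 
FuzzyHashContent on whatever bytes are left in the flowfile content.

```python
import json


def select_fields(record_bytes, fields):
    """Return new content holding only the named top-level JSON fields.

    Hypothetical helper: it stands in for the processor that rewrites the
    flowfile content before the hashing step.
    """
    record = json.loads(record_bytes.decode("utf-8"))
    # Keep only the requested fields; fields absent from the record are skipped.
    subset = {name: record[name] for name in fields if name in record}
    # Sorted keys and compact separators give a stable byte representation,
    # so the same field values always produce the same content to hash.
    return json.dumps(subset, sort_keys=True, separators=(",", ":")).encode("utf-8")


# Sample "data file" content (made up for this example).
original = b'{"id": 17, "name": "Alice", "comment": "free text to compare", "ts": "2017-10-06"}'

# Only these bytes would reach the hashing step.
reduced = select_fields(original, ["name", "comment"])
print(reduced.decode("utf-8"))  # {"comment":"free text to compare","name":"Alice"}
```

The point of the sketch is the ordering: the content is narrowed first, and 
the hash is calculated afterwards over the narrowed content only.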

You can see an example flow that uses these processors on slide 21 of a 
presentation [1] that André Fucs de Miranda and I gave recently, and André has 
published the flow XML here [2].

[1] https://github.com/alopresto/slides/blob/master/dws_sydney_2017/the_power_of_intelligent_flows.pdf
[2] https://github.com/fluenda/dataworks_summit_iot_botnet

Andy LoPresto
[email protected]
PGP Fingerprint: 70EC B3E5 98A6 5A3F D3C4  BACE 3C6E F65B 2F7D EF69

> On Oct 6, 2017, at 4:27 AM, shankhamajumdar <[email protected]> 
> wrote:
> 
> Hi,
> 
> I want to implement fuzzy logic on some fields in a data file using NiFi. I
> am trying to use  FuzzyHashContent/CompareFuzzyHash processor but not sure
> how to implement the flow. Can you please provide me an example?
> 
> Regards,
> Shankha
> 
> 
> 
> --
> Sent from: http://apache-nifi-developer-list.39713.n7.nabble.com/
