Hello Vyshali,
Below you can find a Python code example for hashing the fourth column of a
CSV file using the ExecuteScript processor.
If you hash a field using SHA-256, the length of the field changes: a SHA-256
digest is 256 bits long, so hexdigest() returns a 64-character hex string
regardless of the input length.
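To make the length change concrete, here is a small sketch you can run in a plain Python interpreter outside NiFi. The sample value is made up for illustration.

```python
import hashlib

value = "jane.doe@example.com"  # made-up sample field value
digest = hashlib.sha256(value.encode("ascii")).hexdigest()

# The hex digest has a fixed length of 64 characters (256 bits / 4 bits per hex digit).
print(len(value), len(digest))
```

So a column that held short values before hashing will hold 64-character values afterwards.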
import hashlib
import java.io
from org.apache.commons.io import IOUtils
from java.nio.charset import StandardCharsets
from org.apache.nifi.processor.io import StreamCallback


def hashField(text):
    # Return the SHA-256 hex digest (64 characters) of the given text.
    # latin-1 matches the charset used to read the file below.
    return hashlib.sha256(text.encode('latin-1')).hexdigest()


class convertStream(StreamCallback):
    def __init__(self):
        pass

    def process(self, inputStream, outputStream):
        text = IOUtils.toString(inputStream, StandardCharsets.ISO_8859_1)
        output = []
        for line in text.splitlines():
            l = line.split(';')
            # Replace the fourth column with the hash of its lowercased value.
            l[3] = hashField(l[3].lower())
            # Append an extra column: <hash>_<first column>_<second column>.
            l.append(l[3] + "_" + l[0] + "_" + l[1])
            output.append(';'.join(l))
        out = '\n'.join(output)
        outputStream.write(out.encode('latin-1'))


# "session" and "REL_SUCCESS" are provided by the ExecuteScript processor.
flowfile = session.get()
if flowfile is not None:
    flowfile = session.write(flowfile, convertStream())
    # Rename the flow file: text before the first '.' plus '_hashed'.
    flowfile = session.putAttribute(flowfile, "filename",
        flowfile.getAttribute('filename').split('.')[0] + '_hashed')
    session.transfer(flowfile, REL_SUCCESS)
    session.commit()
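If you want to check the per-line logic before deploying the script to NiFi, a sketch like the following can be run in a plain Python interpreter. The function name hash_fourth_column and the sample line are made up for illustration; the steps mirror what process() does for each line, and the value is encoded with latin-1, the charset the script reads and writes the file with.

```python
import hashlib


def hash_fourth_column(line, sep=";"):
    """Hash the fourth field of one CSV line and append a combined key."""
    fields = line.split(sep)
    # Lowercase the fourth field, then replace it with its SHA-256 hex digest.
    fields[3] = hashlib.sha256(fields[3].lower().encode("latin-1")).hexdigest()
    # Append an extra field: <hash>_<first field>_<second field>.
    fields.append(fields[3] + "_" + fields[0] + "_" + fields[1])
    return sep.join(fields)


# Made-up sample line with five semicolon-separated fields.
sample = "1;Jane;Doe;Jane.Doe@Example.com;Berlin"
result = hash_fourth_column(sample)
print(result)
```

The separator and the number of columns stay as they were, apart from the one appended column; only the content and length of the fourth column change.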
Regards,
Chris
On Fri, Oct 20, 2017 at 7:19 PM, Vyshali <[email protected]> wrote:
> Hi Chris,
>
> Thanks for the suggestion. Should I write the code in Python or some other
> language to hash the data using the ExecuteScript processor? If so, will the
> format of the data be retained after hashing?
> Please provide some clarity on that.
>
> Thanks,
> Vyshali