Hi Devs
I am running a NiFi 1.8 cluster; each node has 128 GB of RAM. I need to load the
contents of a file, which is around 5 GB in size, into a Key/Value cache.
The file I want to load is produced by another company so the format it comes
in is not negotiable. The file contains thousands of lines in the following
format:-
<index value1>:{<property1 name>: <property1 value>, <property2 name>:<property2 value>}
<index value2>:{<property1 name>: <property1 value>, <property2 name>:<property2 value>}
<index value3>:{<property1 name>: <property1 value>, <property2 name>:<property2 value>}
I want the index value to become the Key and everything beyond the colon to
become the value.
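To make the intended split concrete, here is a minimal Python sketch. It is
plain Python rather than NiFi code, the function name parse_line is made up for
illustration, and it assumes one record per line with no colon inside the index
value:

```python
def parse_line(line):
    """Split one record into (key, value) on the first colon only.

    Later colons, such as those inside the {...} part, stay in the value.
    """
    # Drop the trailing CR/LF, then split at the first colon.
    key, sep, value = line.rstrip("\r\n").partition(":")
    if not sep:
        # No colon at all: the line does not match the expected format.
        raise ValueError("no colon found in record: %r" % line)
    return key.strip(), value.strip()


# Example record shaped like the format above (the values are invented).
key, value = parse_line("idx1:{colour: red, size: 10}\r\n")
```

Here key would be "idx1" and value would be "{colour: red, size: 10}".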
What would be the most efficient way of reading the file and parsing it to
load into a cache? I thought of reading in the file, using a split content on
CR/LF, and then splitting each line on the first colon. I have noticed that in
1.8 there are some CSV and JSON Readers (controller services). Would these be a
better way of doing this? The problem I can see is that the file isn't quite a
CSV and it isn't quite a JSON array file.
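As a rough illustration of the read, split on CR/LF, split on first colon idea,
here is a small Python sketch that streams the input line by line, so the whole
5 GB file never has to sit in memory at once. It is not NiFi code: the names
iter_records and load_cache are hypothetical, and a plain dict stands in for
the real Key/Value cache.

```python
import io


def iter_records(stream):
    """Yield (key, value) pairs one line at a time from a text stream."""
    for line in stream:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines
        key, sep, value = line.partition(":")  # first colon only
        if sep:
            yield key.strip(), value.strip()
        # Lines without any colon are silently skipped in this sketch.


def load_cache(stream, cache):
    """Put every record from the stream into the cache; return the count."""
    count = 0
    for key, value in iter_records(stream):
        cache[key] = value
        count += 1
    return count


# Tiny in-memory stand-in for the real file (the values are invented).
sample = io.StringIO("a1:{p1: x, p2: y}\r\nb2:{p1: z, p2: w}\r\n")
cache = {}
loaded = load_cache(sample, cache)
```

With a real file, the stream would be the opened file object instead of the
StringIO sample, and the dict would be replaced by whatever client the cache
provides.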
Many thanks
Dave