MapRecord[] seems to be the theme of the day.
I am looking at replacing our existing Redis enrichment solution with one based
on Elasticsearch. One key advantage is that the response from ES is a JSON
object, instead of a string as with our current solution. So the lookup result
does not require further parsing before it can be built into the record. I have,
however, encountered one possible bug.
The use case I have is where I want to look up an enrichment record based on
the fields in the existing record, but I do not know which of a fixed selection
of fields will have a usable match in each case, so I have to try each one in
turn until I get a match. I have boiled this problem down to the essentials
with a trivial example. I have created an index with the following entries,
where the key value is also used as the document ID.
[
{"key":"one","value":1},
{"key":"two","value":2},
{"key":"three","value":3},
{"key":"four","value":4}
]
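For reference, here is a minimal Python sketch of how an index like this could be loaded through the Elasticsearch bulk API, with each key reused as the document "_id". The index name "enrichment_test" and the helper name are assumptions for illustration, not what I actually used. The script only builds the NDJSON request body; it does not send anything.

```python
import json

# Hypothetical index name, for illustration only.
INDEX_NAME = "enrichment_test"

ENTRIES = [
    {"key": "one", "value": 1},
    {"key": "two", "value": 2},
    {"key": "three", "value": 3},
    {"key": "four", "value": 4},
]


def build_bulk_body(entries, index_name=INDEX_NAME):
    """Build an NDJSON body for the _bulk endpoint.

    Each entry becomes an action line followed by a source line,
    and the entry's "key" is reused as the document _id.
    """
    lines = []
    for entry in entries:
        action = {"index": {"_index": index_name, "_id": entry["key"]}}
        lines.append(json.dumps(action))
        lines.append(json.dumps(entry))
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_bulk_body(ENTRIES), end="")
```

The output could then be posted to the cluster's _bulk endpoint with a content type of application/x-ndjson.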
And a test file with the following content.
[
{"name":"bob","role":"user","hands":"two","Enrichment":{}},
{"name":"bill","role":"dog","legs":"four","Enrichment":{}}
]
My flow consists of two LookupRecord processors using a single
ElasticsearchLookupService which points back to my test index. There are no
schemas defined and I use JSON readers and writers with the "infer" and
"inherit" schema strategies.
The first lookup uses the Result RecordPath of "/Enrichment/limbs" and a single
lookup parameter of "key" : "/hands". The second lookup uses the same result
path but the lookup parameter is "key" : "/legs". Both processors route to
matched or unmatched. The test file is given to the first lookup. The unmatched
relationship goes to the second lookup.
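To make the intent concrete, here is a small Python model of the behaviour I expect from the chained lookups. It is only a sketch, not NiFi code: the in-memory INDEX dict stands in for the Elasticsearch index, and lookup_chain is a hypothetical helper that tries each candidate field in turn and writes the first match to /Enrichment/limbs as a nested object.

```python
# In-memory stand-in for the test index, keyed by document ID.
INDEX = {
    "one": {"key": "one", "value": 1},
    "two": {"key": "two", "value": 2},
    "three": {"key": "three", "value": 3},
    "four": {"key": "four", "value": 4},
}

RECORDS = [
    {"name": "bob", "role": "user", "hands": "two", "Enrichment": {}},
    {"name": "bill", "role": "dog", "legs": "four", "Enrichment": {}},
]


def lookup_chain(record, fields, result_field="limbs"):
    """Try each candidate field in turn; the first match wins.

    Returns (record, matched). On a match the looked-up document is
    stored as a nested object under Enrichment/<result_field>.
    """
    for field in fields:
        hit = INDEX.get(record.get(field))
        if hit is not None:
            enriched = dict(record)
            enriched["Enrichment"] = {
                **record.get("Enrichment", {}),
                result_field: dict(hit),
            }
            return enriched, True
    # No candidate field produced a match: route to "unmatched".
    return record, False


if __name__ == "__main__":
    for rec in RECORDS:
        print(lookup_chain(rec, ["hands", "legs"]))
```

In this model both Bob and Bill end up with a nested object under "limbs", which is what I would expect from the flow regardless of which processor made the match.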
On the matched side of the first processor I get the expected result that Bob
has 2 limbs.
[{"name":"bob","role":"user","hands":"two","Enrichment":{"limbs":{"key":"two","value":2}},"legs":null}]
But on the matched side of the second processor, the lookup value is encoded as
a MapRecord array string, which requires further parsing and so removes an
advantage of using ES.
[{"name":"bill","role":"dog","hands":null,"Enrichment":{"limbs":"MapRecord[{key=four, value=4}]"},"legs":"four"}]
This result is position dependent. Reversing the order of the lookups sees Bill
being correctly enriched but Bob gets a MapRecord string. Changing the result
record path of the second lookup (e.g. to "/Enrichment/limbs2") makes the
problem go away, so I might be able to use this as a workaround, but it will
involve coalescing the results further down the line.
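As a rough illustration of what that coalescing step would have to do, here is a Python sketch. The function name and the "limbs2" field are assumptions taken from the workaround above; in a real flow this would be done with a record processor rather than a script.

```python
def coalesce_limbs(record, candidates=("limbs", "limbs2"), target="limbs"):
    """Collapse several per-lookup result fields into a single field.

    The first candidate under "Enrichment" that is present and not null
    is kept as <target>; the other candidate fields are dropped.
    """
    enrichment = dict(record.get("Enrichment") or {})
    chosen = next(
        (enrichment[c] for c in candidates if enrichment.get(c) is not None),
        None,
    )
    for c in candidates:
        enrichment.pop(c, None)
    if chosen is not None:
        enrichment[target] = chosen
    return {**record, "Enrichment": enrichment}


if __name__ == "__main__":
    bill = {
        "name": "bill",
        "role": "dog",
        "legs": "four",
        "Enrichment": {"limbs": None, "limbs2": {"key": "four", "value": 4}},
    }
    print(coalesce_limbs(bill))
```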
Is this a bug, or have I missed something? This is on a Docker cluster I
have spun up from scratch using NiFi 1.23.2 and ES 8.11.1, although I noticed
the problem on earlier systems. I have not tried this against any other
databases.
Steve Hindmarch
Application Specialist