Something like this worked for me: def topValuesMap = [:].withDefault{ [:].withDefault{ 0 } } def tallyMap = [:].withDefault{ 0 } def tally tally = { Map json, String prefix -> json.each { k, v -> String key = prefix + k if (v instanceof List) { tallyMap[key] += 1 v.each{ tally(it, key + '.') } } else { def val = v?.toString().trim() if (v) { tallyMap[key] += 1 topValuesMap[key][v] += 1 if (v instanceof Map) tally(v, key + '.') } } } }
def root = new JsonSlurper().parse(inputStream) root.each { // each allows json to be a list, not needed if always a map tally(it, '') } println tallyMap println topValuesMap.collectEntries{ k, m -> [(k), m.sort{ _, v -> -v.value }] }.take(10) <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> Virus-free.www.avast.com <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> <#DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> On Fri, May 12, 2023 at 8:27 PM James McMahon <jsmcmah...@gmail.com> wrote: > Thank you for the response, Paul. I will integrate and try these > suggestions within my Groovy code that runs in a nifi ExecuteScript > processor. I'll be working on this once again tonight. > > The map topValuesMap is intended to capture this: for each key identified > in the json, cross-tabulate for each value associated with that key how > many times it occurs. After the json is fully processed for key, sort the > resulting map and retain only the top ten values found in the json. If a > set has a lastName key, the topValuesMap that results might look something > like this after all keys have been cross-tabulated: > > ["lastName": ["Smith" : 1023, "Jones" : 976, "Chang": 899, "Doe": 511, > ...], > "address.street": [.....], > . > . > . > "a final key": [.....] > ] > > Each key would have ten values in its value map, unless it cross-tabulates > to less than ten in total, in which case it will be sorted by count value > and all values accepted. > Again, many thanks. > Jim > > On Fri, May 12, 2023 at 2:19 AM Paul King <pa...@asert.com.au> wrote: > >> I am not 100% sure what you are trying to capture in topValuesMap but for >> tallyMap you probably want something like: >> >> def tallyMap = [:].withDefault{ 0 } >> def tally >> tally = { Map json, String prefix -> >> json.each { k, v -> >> if (v instanceof List) { >> tallyMap[prefix + k] += 1 >> v.each{ tally(it, "$prefix${k}.") } >> } else if (v?.toString().trim()) { >> tallyMap[prefix + k] += 1 >> if (v instanceof Map) tally(v, "$prefix${k}.") >> } >> } >> } >> >> def root = new JsonSlurper().parse(inputStream) >> def initialPrefix = '' >> tally(root, initialPrefix) >> println tallyMap >> >> Output: >> [name:1, age:1, address:1, address.street:1, address.city:1, >> address.state:1, address.zip:1, phoneNumbers:1, phoneNumbers.type:2, >> phoneNumbers.number:2] >> >> >> >> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> >> Virus-free.www.avast.com >> <https://www.avast.com/sig-email?utm_medium=email&utm_source=link&utm_campaign=sig-email&utm_content=webmail> >> <#m_227840230113638997_m_3707991732434518544_DAB4FAD8-2DD7-40BB-A1B8-4E2AA1F9FDF2> >> >> On Fri, May 12, 2023 at 10:38 AM James McMahon <jsmcmah...@gmail.com> >> wrote: >> >>> I have this incoming json: { "name": "John Doe", "age": 42, "address": { >>> "street": "123 Main St", "city": "Anytown", "state": "CA", "zip": "12345" >>> }, "phoneNumbers": [ { "type": "home", "number": "555-1234" }, { "type": >>> "work", "number": "555-5678" } ] } I wish to tally all the keys in this >>> json in a map that gives me the key name as its key, and a count of the >>> number of times the key occurs in the json as its value. For this example, >>> the keys I expect in my output should include name, age, address, >>> address.street, address.city, address.state, address.zip, phoneNumbers, >>> phoneNumbers.type, and phoneNumbers.number. But I do not get that. Instead, >>> I get this for the list of fields: triage.json.fields >>> name,age,address,phoneNumbers And I get this for my tally count by key: >>> triage.json.tallyMap [name:1, age:1, address:1, phoneNumbers:1] >>> >>> I am close, but not quite there. I don't capture all the keys. Here is >>> my code. How must I modify this to get the result I require? import >>> groovy.json.JsonSlurper import org.apache.commons.io.IOUtils import >>> java.nio.charset.StandardCharsets def keys = [] def tallyMap = [:] def >>> topValuesMap = [:] def ff = session.get() if (!ff) return try { >>> session.read(ff, { inputStream -> def json = new >>> JsonSlurper().parseText(IOUtils.toString(inputStream, >>> StandardCharsets.UTF_8)) json.each { k, v -> if (v != null && >>> !v.toString().trim().isEmpty()) { tallyMap[k] = tallyMap.containsKey(k) ? >>> tallyMap[k] + 1 : 1 if (topValuesMap.containsKey(k)) { def valuesMap = >>> topValuesMap[k] valuesMap[v] = valuesMap.containsKey(v) ? valuesMap[v] + 1 >>> : 1 topValuesMap[k] = valuesMap } else { topValuesMap[k] = [:].withDefault{ >>> 0 }.plus([v: 1]) } } } } as InputStreamCallback) keys = >>> tallyMap.keySet().toList() def tallyMapString = tallyMap.collectEntries { >>> k, v -> [(k): v] }.toString() def topValuesMapString = >>> topValuesMap.collectEntries { k, v -> [(k): v.sort{ -it.value }.take(10)] >>> }.toString() ff = session.putAttribute(ff, 'triage.json.fields', >>> keys.join(",")) ff = session.putAttribute(ff, 'triage.json.tallyMap', >>> tallyMapString) ff = session.putAttribute(ff, 'triage.json.topValuesMap', >>> topValuesMapString) session.transfer(ff, REL_SUCCESS) } catch (Exception e) >>> { log.error('Error processing json fields', e) session.transfer(ff, >>> REL_FAILURE) } >>> >>