Thanks Matt. The majority of messages are < 10 KB; however, I am seeing some that are > 10 MB in their raw flat-file format, before they are transformed into their more verbose JSON, so the maximum size reaching the processor is possibly 2-3x that. Before reaching this processor, all messages are well-formed JSON using the UTF-8 charset. The script does a bunch of date formatting and other formatting that is too complex for the JOLT processor.
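
One way to test the size and charset assumptions above before the script parses anything is a pre-check along the lines of the sketch below. It is only an illustration: `FlowFilePrecheck`, `classify`, and the 30 MB limit (roughly 3x the largest raw file) are hypothetical names and values, not part of NiFi or Groovy, and the check only catches bad UTF-8 bytes, not JSON that is structurally broken.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper for illustration; not part of NiFi or Groovy.
class FlowFilePrecheck {

    // Assumed limit: about 3x the largest raw flat file (10 MB) seen so far.
    static final long DEFAULT_MAX_BYTES = 30L * 1024 * 1024;

    /** Returns true only if the bytes decode as strict UTF-8. */
    static boolean isValidUtf8(byte[] content) {
        // REPORT makes the decoder throw instead of silently substituting U+FFFD.
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            decoder.decode(ByteBuffer.wrap(content));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }

    /** Labels a payload as "oversized", "malformed", or "ok". */
    static String classify(byte[] content, long maxBytes) {
        if (content.length > maxBytes) {
            return "oversized";
        }
        if (!isValidUtf8(content)) {
            return "malformed";
        }
        return "ok";
    }

    static String classify(byte[] content) {
        return classify(content, DEFAULT_MAX_BYTES);
    }

    public static void main(String[] args) {
        byte[] good = "{\"date\":\"2018-05-01\"}".getBytes(StandardCharsets.UTF_8);
        // 0xC3 opens a two-byte sequence, but '}' is not a continuation byte.
        byte[] bad = {'{', (byte) 0xC3, '}'};
        System.out.println(classify(good));
        System.out.println(classify(bad));
    }
}
```

Routing anything labeled "oversized" or "malformed" to a separate relationship, instead of handing it to the JSON parser, would at least show whether the hangs line up with large or mangled messages.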

> On May 1, 2018, at 7:14 PM, Matt Burgess <[email protected]> wrote:
>
> Timothy,
>
> I haven't seen anything that can cause this to hang, in the Groovy
> source code it might seem to "hang" [1] if there's a crazy large
> input; how big are your flow files going into the ExecuteScript
> processor? If size is not the issue, then perhaps there's an
> assumption about character sets that causes a problem, etc. Are your
> input files well-formed JSON, and if so, are they ROUSs? (Rodents Of
> Unusual Size)? Are they encoded with unicode or other character sets?
>
> I'll try to reproduce this locally and get to the bottom of it, if
> it's a Groovy bug that has been fixed we can upgrade the version, if
> it's a bug that isn't fixed then hopefully there's a workaround with
> ValidateRecord. Lastly, may I ask what your script is doing to the
> flow files? Perhaps there are existing processors and/or techniques we
> could use to get it done...
>
> Regards,
> Matt
>
> [1]
> https://github.com/apache/groovy/blob/GROOVY_2_4_5/subprojects/groovy-json/src/main/java/groovy/json/internal/JsonParserCharArray.java#L108
>
>
> On Tue, May 1, 2018 at 2:54 PM, Timothy Tschampel
> <[email protected]> wrote:
>> I have a flow which periodically hangs and messages begin to queue behind an
>> ExecuteScript component using groovy. Once this happens the component can’t
>> be stopped or restarted. Restarting alone does not move things along; only
>> a restart after purging the content/flow file repositories seems to help.
>> I’m not seeing any errors in the logs. Thread dumps show the same
>> “ScriptXX.run” running at the same place
>> (groovy.json.internal.JsonParserCharArray.decodeJsonObject(JsonParserCharArray.java:108))
>> and possibly several others running in slightly different paths originated
>> from groovy.json.JsonSlurper.parseText(JsonSlurper.java:205). Is there
>> anything special about the groovy json parser with regards to configuring
>> the processor or using within a Nifi flow? I have attached 2 dumps of this
>> happening at different times. I have a suspicion that possibly a mangled
>> message is causing a problem, but I would have expected some sort of error
>> in this case.
