Hello. I have a task to parse dates out of incoming raw content. Of course
the date patterns can take any number of forms: YYYY-MM-DD,
YYYY/MM/DD, YYYYMMDD, MMDDYYYY, etc. I can build myself a robust
regex to match a broad set of such patterns in the raw data, but I wonder
if there is a project or library available for Groovy that already offers
this?
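
To make the starting point concrete, here is a rough first cut of the kind of regex I mean. It is only a sketch: it covers just the four layouts listed above, and the names DATE_CANDIDATE and findDateCandidates are placeholders of my own, not from any library.

```groovy
import java.util.regex.Pattern

// Hypothetical first-cut pattern: four digits, a '-' or '/' separator, two
// digits, a separator, two digits; or a bare run of exactly eight digits.
// The lookarounds stop it from matching inside a longer digit run.
def DATE_CANDIDATE = Pattern.compile(
    '(?<!\\d)(?:\\d{4}[-/]\\d{2}[-/]\\d{2}|\\d{8})(?!\\d)')

// Return every substring of the line that looks like a date candidate.
// No validation happens here; '99999999' would also be returned.
def findDateCandidates = { String line ->
    line.findAll(DATE_CANDIDATE)
}
```

This deliberately over-matches: any eight-digit number is a candidate, so a later validation step would have to reject the ones that are not real dates.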

Assuming I get pattern matches parsed out of my raw data, I will have a
collection of strings representing year-month-days in a variety of formats.
I'd then like to normalize them to a standard form so that I can sort and
compare them. I intend to identify the range of dates in the raw data as a
sorted Groovy list.
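
For the normalization step, one option I am considering is the JDK's own java.time classes, which are callable from Groovy. The sketch below assumes Java 8 or later and tries a fixed list of formats in order. The names FORMATS, toLocalDate and sortedDates are placeholders, and trying YYYYMMDD before MMDDYYYY is an assumption on my part: an eight-digit string that is valid both ways will be read as YYYYMMDD.

```groovy
import java.time.LocalDate
import java.time.format.DateTimeFormatter
import java.time.format.DateTimeParseException
import java.time.format.ResolverStyle

// Candidate formats, tried in this order. STRICT resolution rejects
// impossible dates such as February 30 instead of adjusting them.
def FORMATS = ['uuuu-MM-dd', 'uuuu/MM/dd', 'uuuuMMdd', 'MMdduuuu'].collect {
    DateTimeFormatter.ofPattern(it).withResolverStyle(ResolverStyle.STRICT)
}

// Return the first successful parse, or null if no format fits.
def toLocalDate = { String raw ->
    for (fmt in FORMATS) {
        try {
            return LocalDate.parse(raw, fmt)
        } catch (DateTimeParseException ignored) {
            // not this format; try the next one
        }
    }
    return null
}

// Normalize, drop anything unparseable, de-duplicate, sort ascending.
def sortedDates = { Collection<String> rawDates ->
    rawDates.collect { toLocalDate(it) }.findAll { it != null }.unique().sort()
}
```

With a sorted list in hand, the range of dates would simply be its first() and last() elements.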

I anticipate I will miss many pattern variations with my initial cut at
this. I do have one thing going for me: as I test through volumes of raw
data, I'll be able to improve the pattern net I cast to catch an
ever-improving percentage of year-month-day expressions.

I intend to write a Groovy script that will run from an Apache NiFi
ExecuteScript processor. I'll read in my data flowfile content using a
buffered reader so I can handle flowfiles that may be large.
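
Roughly, the streaming part might look like the sketch below. The helper scanStream is a placeholder name of my own, and UTF-8 content is an assumption. The NiFi wiring is shown only in comments, since it depends on the session and REL_SUCCESS bindings that ExecuteScript supplies to the script.

```groovy
import java.nio.charset.StandardCharsets

// Hypothetical helper: reads the content line by line through a
// BufferedReader and collects whatever the supplied per-line matcher
// returns, so the whole flowfile never has to sit in memory at once.
def scanStream = { InputStream input, Closure<List<String>> matchLine ->
    def hits = []
    def reader = new BufferedReader(
        new InputStreamReader(input, StandardCharsets.UTF_8))
    // eachLine closes the reader when it finishes
    reader.eachLine { line ->
        hits.addAll(matchLine(line))
    }
    hits
}

// Inside ExecuteScript the wiring would be along these lines:
//
//   import org.apache.nifi.processor.io.InputStreamCallback
//   def flowFile = session.get()
//   if (!flowFile) return
//   def found = []
//   session.read(flowFile, { inputStream ->
//       found = scanStream(inputStream, { line -> line.findAll(~/\d{8}/) })
//   } as InputStreamCallback)
//   session.transfer(flowFile, REL_SUCCESS)
```

Note that a date split across a line break would be missed by a line-at-a-time scan; I am assuming that does not occur in my data.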

Any recommendations or suggestions?
