[
https://issues.apache.org/jira/browse/NIFI-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290817#comment-17290817
]
Arek Burdach commented on NIFI-8161:
------------------------------------
Could anybody take a look at this change? [~exceptionfactory] [~mtien] maybe? It
is not a very controversial change. We have successfully deployed it in
production (about 50 flows integrating different sources in different formats),
and it required about 5 changes in expressions, mainly because of incorrect
format usage (e.g. `mm` was used twice instead of `MM`) and some slight
differences in how the time zone format is interpreted. Overall, this change
caused about a 30% reduction in CPU load on our NiFi instances.
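
To illustrate the `mm` versus `MM` mix-up mentioned above, here is a minimal sketch in plain Java rather than NiFi Expression Language. The class name, helper method, and timestamp are made up for the example. It only shows that lowercase `mm` means minute-of-hour and uppercase `MM` means month-of-year; it does not claim to show how the migration surfaced such mistakes in real flows.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class PatternLetterCase {
    // Hypothetical helper: formats one fixed timestamp with the given pattern.
    static String format(String pattern) {
        LocalDateTime ts = LocalDateTime.of(2021, 2, 25, 10, 37, 0);
        return DateTimeFormatter.ofPattern(pattern).format(ts);
    }

    public static void main(String[] args) {
        // "mm" in the month position prints the minutes (37) instead of the month.
        System.out.println(format("yyyy-mm-dd HH:mm")); // 2021-37-25 10:37
        // "MM" prints the month as intended.
        System.out.println(format("yyyy-MM-dd HH:mm")); // 2021-02-25 10:37
    }
}
```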
> NiFi EL: migration from SimpleDateFormat to DateTimeFormatter
> -------------------------------------------------------------
>
> Key: NIFI-8161
> URL: https://issues.apache.org/jira/browse/NIFI-8161
> Project: Apache NiFi
> Issue Type: Improvement
> Reporter: Arek Burdach
> Priority: Major
> Labels: perfomance, pull-request-available
> Time Spent: 20m
> Remaining Estimate: 0h
>
> In NiFi Expression Language we commonly use SimpleDateFormat. It is an
> old, inefficient approach: because SimpleDateFormat is mutable, it needs to
> be recreated for each operation. It is also lax in some places where
> users would like it to be stricter. In my opinion, a better approach would be
> to use the Java 8 DateTimeFormatter instead.
> I've done some benchmarks that you can check on your own in
> FormatEvaluatorBenchmark. Results on my 8-core, i7-1065G7:
> - before change (SimpleDateFormat): 11.230 ± 5.407 us/op
> - after switching to DateTimeFormatter API: 4.747 ± 0.426 us/op
> - after also preparing the formatter up front for literal formats: 2.025 ±
> 0.055 us/op
> This change is not 100% transparent, so some changes might be necessary in
> users' code. Most of the differences are visible in the modifications I've
> made to the TestQuery tests:
> - the backtick (`) is no longer supported for escaping extra characters;
> only the single quote is supported
> - "repeated" syntax such as "dd" for days strictly checks that two digits
> were provided; anyone who needs laxer parsing must use the single "d"
> syntax
> [update]
> After switching to lenient mode, parsing is compatible with SimpleDateFormat
> for the second point ("repeated" syntax).
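
For readers unfamiliar with the strict versus lenient behaviour described in the quoted issue, the following is a rough sketch using the plain java.time API. It assumes lenient parsing is enabled with `DateTimeFormatterBuilder.parseLenient()`; the actual NiFi change may be wired differently. The class and helper names are hypothetical. Because DateTimeFormatter is immutable and thread-safe, both formatters are built once and reused, which is the property the issue relies on for the performance gain.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;

public class LenientDayParsing {
    // Default formatter: "MM" and "dd" each expect exactly two digits when parsing.
    static final DateTimeFormatter STRICT = DateTimeFormatter.ofPattern("yyyy-MM-dd");

    // Assumption: lenient parsing switched on through the builder before the pattern.
    static final DateTimeFormatter LENIENT = new DateTimeFormatterBuilder()
            .parseLenient()
            .appendPattern("yyyy-MM-dd")
            .toFormatter();

    // Hypothetical helper: reports whether the text parses as a LocalDate.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            LocalDate.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        String singleDigits = "2021-2-5";
        System.out.println("strict:  " + parses(STRICT, singleDigits));
        System.out.println("lenient: " + parses(LENIENT, singleDigits));
    }
}
```

With single-digit month and day values, the default formatter should reject the input while the lenient one should accept it, which mirrors the SimpleDateFormat-compatible behaviour mentioned in the update.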
--
This message was sent by Atlassian Jira
(v8.3.4#803005)