[ 
https://issues.apache.org/jira/browse/NIFI-8161?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17290817#comment-17290817
 ] 

Arek Burdach commented on NIFI-8161:
------------------------------------

Could anybody take a look at this change? [~exceptionfactory] [~mtien] maybe? It 
is not a very controversial change. We have successfully deployed it in 
production (about 50 flows integrating different sources in different formats) 
and it required about 5 changes in expressions, mainly because of incorrect 
format patterns (e.g. `mm` was used twice instead of `MM`) and some slight 
differences in how the time zone format is interpreted. Overall, this change 
caused about a 30% reduction in CPU load on our NiFi instances. 
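
To illustrate the `mm` versus `MM` mix-up mentioned above, here is a minimal sketch in plain Java. It is not NiFi code: the class and method names are hypothetical, and the sample pattern and timestamp are assumptions chosen for the example. It shows one plausible way such a pattern behaves differently under the two APIs: SimpleDateFormat quietly accepts the input and leaves the month at its default, while a DateTimeFormatter built with ofPattern rejects it.

```java
import java.text.ParseException;
import java.text.SimpleDateFormat;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical demo class, not part of NiFi.
final class PatternLetterDemo {
    private static final TimeZone UTC = TimeZone.getTimeZone("UTC");

    // Parse with SimpleDateFormat, then print the result with a known-good pattern.
    static String parseLegacy(String pattern, String text) throws ParseException {
        SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ROOT);
        in.setTimeZone(UTC);
        Date parsed = in.parse(text);
        SimpleDateFormat out = new SimpleDateFormat("yyyy-MM-dd HH:mm", Locale.ROOT);
        out.setTimeZone(UTC);
        return out.format(parsed);
    }

    // Returns true if java.time accepts the text for the given pattern.
    static boolean parsesWithJavaTime(String pattern, String text) {
        try {
            LocalDateTime.parse(text, DateTimeFormatter.ofPattern(pattern, Locale.ROOT));
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) throws ParseException {
        String wrong = "yyyy-mm-dd HH:mm"; // "mm" (minute) used where "MM" (month) was meant
        String text = "2021-02-25 10:07";

        // The month is never set, so the legacy API falls back to January.
        System.out.println(parseLegacy(wrong, text));        // 2021-01-25 10:07
        // java.time sees two different minute values and no month, and rejects the input.
        System.out.println(parsesWithJavaTime(wrong, text)); // false
        System.out.println(parsesWithJavaTime("yyyy-MM-dd HH:mm", text)); // true
    }
}
```

Under this reading, expressions with such patterns only appeared to work before the migration and need the pattern corrected afterwards.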

> NiFi EL: migration from SimpleDateFormat to DateTimeFormatter
> -------------------------------------------------------------
>
>                 Key: NIFI-8161
>                 URL: https://issues.apache.org/jira/browse/NIFI-8161
>             Project: Apache NiFi
>          Issue Type: Improvement
>            Reporter: Arek Burdach
>            Priority: Major
>              Labels: perfomance, pull-request-available
>          Time Spent: 20m
>  Remaining Estimate: 0h
>
> In NiFi Expression Language we commonly use SimpleDateFormat. It is an 
> old, inefficient approach: because SimpleDateFormat is mutable, it needs to 
> be recreated for each operation. Also, this format is lax in some places 
> where a user would like it to be stricter. In my opinion, a better approach 
> would be to use the Java 8 DateTimeFormatter instead.
> I've done some benchmarks that you can run on your own in 
> FormatEvaluatorBenchmark. Results on my 8-core i7-1065G7:
> - before the change (SimpleDateFormat): 11.230 ± 5.407 us/op
> - after switching to the DateTimeFormatter API: 4.747 ± 0.426 us/op
> - after preparing the formatter in advance for literal formats: 2.025 ± 
> 0.055 us/op
> This change is not 100% transparent, so some changes might be necessary in 
> users' code. Most of the differences are visible in the modifications that 
> I've made to the TestQuery tests:
> - the backtick (`) is no longer supported for escaping extra characters; 
> only the single quote is supported
> - "repeated" syntax like "dd" for days strictly checks that two digits were 
> provided; anyone who needs laxer parsing has to use the single "d" syntax
> [update]
> After switching to lenient mode, parsing is compatible with SimpleDateFormat 
> for the second point ("repeated" syntax)
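
The points in the quoted description about reusing a formatter and about strict versus lenient handling of "dd" can be sketched in plain Java as follows. This is only an illustration under stated assumptions, not the code from the pull request: the FormatterCache class and its methods are hypothetical, and "lenient mode" is assumed here to mean DateTimeFormatterBuilder.parseLenient(), which may differ from what the actual change does.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeFormatterBuilder;
import java.time.format.DateTimeParseException;
import java.util.Locale;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical helper, not part of NiFi.
final class FormatterCache {
    // DateTimeFormatter is immutable and thread-safe, so one instance per
    // pattern can be built once and shared, unlike SimpleDateFormat.
    private static final Map<String, DateTimeFormatter> LENIENT = new ConcurrentHashMap<>();

    // Default behaviour of ofPattern: "dd" requires exactly two digits.
    static DateTimeFormatter defaultMode(String pattern) {
        return DateTimeFormatter.ofPattern(pattern, Locale.ROOT);
    }

    // Assumed meaning of "lenient mode": fixed-width fields accept fewer digits.
    static DateTimeFormatter lenient(String pattern) {
        return LENIENT.computeIfAbsent(pattern, p -> new DateTimeFormatterBuilder()
                .parseLenient()
                .appendPattern(p)
                .toFormatter(Locale.ROOT));
    }

    // Returns true if the text parses as a LocalDate with the given formatter.
    static boolean parses(DateTimeFormatter formatter, String text) {
        try {
            LocalDate.parse(text, formatter);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(parses(defaultMode("yyyy-MM-dd"), "2021-02-5")); // false
        System.out.println(parses(defaultMode("yyyy-MM-d"), "2021-02-5"));  // true
        System.out.println(parses(lenient("yyyy-MM-dd"), "2021-02-5"));     // true
        // The cached formatter is reused rather than rebuilt on every call.
        System.out.println(lenient("yyyy-MM-dd") == lenient("yyyy-MM-dd")); // true
    }
}
```

Caching by pattern string is one simple way to avoid rebuilding a formatter for a literal format on every evaluation; whether the pull request prepares formatters this way is not stated in the description.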



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
