MaxGekk commented on issue #24309: [SPARK-27398][SQL] Refactoring of CreateJacksonParser.getStreamDecoder
URL: https://github.com/apache/spark/pull/24309#issuecomment-480593091
 
 
   I re-ran the JSON benchmark, and unfortunately it shows a performance regression of up to 2x. For example, the last benchmark:
   Before:
   ```
   Json files in the per-line mode:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   Text read                                          7537           7556          26          6.6         150.7       1.0X
   Schema inferring                                  27875          28306         499          1.8         557.5       0.3X
   Parsing without charset                           26030          26083          67          1.9         520.6       0.3X
   Parsing with UTF-8                                37115          37480         392          1.3         742.3       0.2X
   ```
   After:
   ```
   Json files in the per-line mode:          Best Time(ms)   Avg Time(ms)   Stdev(ms)    Rate(M/s)   Per Row(ns)   Relative
   ------------------------------------------------------------------------------------------------------------------------
   Text read                                          7435           7457          27          6.7         148.7       1.0X
   Schema inferring                                  28307          28378          70          1.8         566.1       0.3X
   Parsing without charset                           25104          25197          89          2.0         502.1       0.3X
   Parsing with UTF-8                                66200          66402         216          0.8        1324.0       0.1X
   ```
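
To make the size of the regression explicit, here is a rough sketch that computes the after/before ratio of the Best Time(ms) column for each case. The numbers are copied from the two tables above; the `slowdown` helper and the dictionary names are only illustrative and are not part of the Spark benchmark code.

```python
# Best Time (ms) per benchmark case, copied from the "Before" and "After" tables.
before = {
    "Text read": 7537,
    "Schema inferring": 27875,
    "Parsing without charset": 26030,
    "Parsing with UTF-8": 37115,
}
after = {
    "Text read": 7435,
    "Schema inferring": 28307,
    "Parsing without charset": 25104,
    "Parsing with UTF-8": 66200,
}


def slowdown(before_ms: float, after_ms: float) -> float:
    """Hypothetical helper: ratio > 1.0 means the 'after' run is slower."""
    return after_ms / before_ms


ratios = {name: slowdown(before[name], after[name]) for name in before}

for name, ratio in ratios.items():
    print(f"{name:<26}{ratio:.2f}x")
```

With these numbers, only the "Parsing with UTF-8" case changes noticeably (roughly 1.8x slower); the other three cases stay within a few percent of their previous times.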
   I am closing the PR.
