Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/22293 )

Change subject: IMPALA-13627: Handle legacy Hive timezone conversion
......................................................................


Patch Set 14:

(1 comment)

http://gerrit.cloudera.org:8080/#/c/22293/11/be/src/benchmarks/convert-timestamp-benchmark.cc
File be/src/benchmarks/convert-timestamp-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/22293/11/be/src/benchmarks/convert-timestamp-benchmark.cc@115
PS11, Line 115:                   (sec split (old))               30.1       31 
    31.7         1X         1X         1X
              :                         (day split)                526      532 
     534      17.5X      17.2X      16.9X
              :
              : FromUnixTimeNanos:         Function  iters/ms   10%ile   50%ile 
  90%ile     10%ile     50%ile     90%ile
              :                                                                 
         (relative) (relative) (relative)
              : 
---------------------------------------------------------------------------------------------------------
              :                   (sec split (old))               31.2       32 
    32.4         1X         1X         1X
              :                   (sec split (new))               16.3     16.5 
    16.6     0.523X     0.517X     0.513X
              :
              : FromSubsecondUnixTime:     Function  iters/ms   10%ile   50%ile 
  90%ile     10%ile     50%ile     90%ile
              :                                                                 
         (relative) (relative) (relative)
> Done
Thanks for the investigation! It seems that I messed up my own optimization by 
mistake during IMPALA-9385, but my guess is that it only affects the benchmark, 
not actual execution.

I assumed that it should make things faster as the flag check is replaced by a 
pointer comparison with nullptr. I should have also updated the benchmark to 
use UTCPTR(==nullptr) instead of &TimezoneDatabase::GetUtcTimezone(). Nearly 
all code uses UTCPTR instead of GetUtcTimezone(), so that would better reflect 
how this function is actually used.

Another thing that may affect the performance of this benchmark is that 
TimestampValue::FromUnixTime() in timestamp-value.cc, not in inline.h or .h, so 
the compiler may not inline it (inlining would allow removing the branch 
completely when called with UTCPTR). It is a very simple function that looks 
like a good candidate for inlining.

Note that there is an interesting quirk about cctz's UTC handling: conversion 
using UTC (and other timezones where only one offset exists) is surprisingly 
slow, often slower than timezones with DST changes. This was one of the 
motivations behind using nullptr for UTC wherever possible.



--
To view, visit http://gerrit.cloudera.org:8080/22293
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I1271ed1da0b74366ab8315e7ec2d4ee47111e067
Gerrit-Change-Number: 22293
Gerrit-PatchSet: 14
Gerrit-Owner: Michael Smith <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Reviewer: Michael Smith <[email protected]>
Gerrit-Reviewer: Riza Suminto <[email protected]>
Gerrit-Comment-Date: Wed, 12 Feb 2025 09:44:47 +0000
Gerrit-HasComments: Yes

Reply via email to