[Impala-ASF-CR] IMPALA-7417: Speed up sub-second unix time->TimestampValue conversions

Csaba Ringhofer (Code Review) Tue, 28 Aug 2018 03:03:02 -0700

Csaba Ringhofer has posted comments on this change. ( 
http://gerrit.cloudera.org:8080/11183 )


Change subject: IMPALA-7417: Speed up sub-second unix time->TimestampValue 
conversions
......................................................................


Patch Set 12:

(15 comments)

http://gerrit.cloudera.org:8080/#/c/11183/11//COMMIT_MSG
Commit Message:

http://gerrit.cloudera.org:8080/#/c/11183/11//COMMIT_MSG@10
PS11, Line 10: (which is split do date_ and time_ similarly to
> nit: please wrap at 70 characters, here and below.
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/benchmarks/convert-timestamp-benchmark.cc
File be/src/benchmarks/convert-timestamp-benchmark.cc:

http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/benchmarks/convert-timestamp-benchmark.cc@578
PS11, Line 578: boost::posix_time::nanoseconds(t
> It probably wouldn't make a difference but maybe you should call boost::pos
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/benchmarks/convert-timestamp-benchmark.cc@631
PS11, Line 631:     cctz_optimized_unix_time_to_utc_ptime(unix_time_whole);
> The original implementation had:
That constant was not accessible from here and I thought that the results would 
be the same, so why not use NANOS_PER_SEC. It turned that I was wrong and the 
result is slightly different in some cases (+- 1 nanosec in analytic average 
tests).


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/benchmarks/convert-timestamp-benchmark.cc@635
PS11, Line 635:
              : TimestampValue from_subsecond_unix_time_new(
              :     const double& unix_time) {
              :   const double ONE_BILLIONTH = 0.000000001;
              :   int64_t unix_time_whole = unix_time;
              :   int64_t nanos = (unix_time - unix_time_whole) / ONE_BILLIONTH;
              :
> The implementation of TimestampValue::FromSubsecondUnixTime() was changed i
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/benchmarks/convert-timestamp-benchmark.cc@923
PS11, Line 923:
              :   // Benchmark FromSubsecondUnixTime before and after 
IMPALA-7417.
> Are the rounding rules still different?
No, it should be the same in patch set 12.


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/benchmarks/convert-timestamp-benchmark.cc@932
PS11, Line 932:
> This benchmark and the previous bm_utc_from_unix_time_nanos benchmark measu
I added the test originally to check if the rounding has added any regression. 
As the rounding was replaced with truncation this is no longer relevant. I 
would still keep the test because the results are different (5.57 vs 3.97 
improvement) and the the benchmark turned out to be useful for catching 
correctness issues.

So I would prefer to keep the test but I can remove it if you want to keep this 
file more compact.


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc
File be/src/runtime/timestamp-test.cc:

http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc@235
PS11, Line 235: The fraction part of 'nanos' can e
> You mean "'nanos' can express sub-nanoseconds"?
I hope that it is clearer with the new comment.


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc@240
PS11, Line 240: ue::
> Maybe use MILLIS_PER_SEC instead?
My impression is that 1000 is more readable than MILLIS_PER_SEC, while 
NANOS_PER_SEC is more readable than 1000*1000*1000 (and 1000000000 is by far 
the worst), so this inconsistency is intentional.


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc@282
PS11, Line 282: p
> nit: extra space
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc@280
PS11, Line 280: / Checks that all sub-second From*UnixTime gives the same 
result and that the result
              : // is the same as 'expected'.
              : // If 'expected' is nullptr then the result expected to be 
invalid (out of range).
              : void TestFromSubSecondFunctions(int64_t seconds, int millis, 
const cha
> Why do we need this extra test? And how is this 'opposite sign'? Please cla
I have extended the test to check the same timestamp with different seconds / 
millis, so that > 1000 millis are also tested.


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc@284
PS11, Line 284: Ti
> nit: missing space after 'if'
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-test.cc@848
PS11, Line 848:
> How about adding some extra tests for the micros/nanos unix time conversion
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-value.h@189
PS11, Line 189:  !time.is_negative()
> Instead this, you could use !time.is_negative()
Done


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-value.h@190
PS11, Line 190: time.total_nanoseconds() < NANOS_PER_DAY
> maybe use (time.hours() < 24) ?
hours() is a confusing function in my opinion - I think that it should be 
called total_hours(), because it can be >= 24. (seconds() and minutes() are < 
60). The description at 
https://www.boost.org/doc/libs/1_55_0/doc/html/date_time/posix_time.html 
doesn't help too much to clear the confusion.

I did a quick benchmark and it would have the same speed, (apparently gcc can 
replace A/ConstB < ConstC with A < ConstC*ConstB for integers), but I would 
prefer to keep it as it is.


http://gerrit.cloudera.org:8080/#/c/11183/11/be/src/runtime/timestamp-value.h@328
PS11, Line 328: unix_time
> unix_time_ticks
Done



--
To view, visit http://gerrit.cloudera.org:8080/11183
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-MessageType: comment
Gerrit-Change-Id: I572b5876b979ddae58165bd40d5b008ce9d7a4aa
Gerrit-Change-Number: 11183
Gerrit-PatchSet: 12
Gerrit-Owner: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Attila Jeges <[email protected]>
Gerrit-Reviewer: Csaba Ringhofer <[email protected]>
Gerrit-Reviewer: Impala Public Jenkins <[email protected]>
Gerrit-Comment-Date: Tue, 28 Aug 2018 10:02:18 +0000
Gerrit-HasComments: Yes

[Impala-ASF-CR] IMPALA-7417: Speed up sub-second unix time->TimestampValue conversions

Reply via email to