Matthew Jacobs has posted comments on this change.

Change subject: IMPALA-5137: Support Kudu UNIXTIME_MICROS as Impala TIMESTAMP
......................................................................


Patch Set 5:

(14 comments)

http://gerrit.cloudera.org:8080/#/c/6526/5//COMMIT_MSG
Commit Message:

Line 28: of converting to STRING, so this provides some decent
> I assume an exhaustive run has passed?
Not yet, too hard to do without the kudu changes being in because I have to 
have a patched Kudu. That should go in soon though, then I'll update the change 
w/ a new toolchain build and submit an exhaustive run.


http://gerrit.cloudera.org:8080/#/c/6526/5/be/src/exec/kudu-scanner.cc
File be/src/exec/kudu-scanner.cc:

Line 185:     Tuple* kudu_tuple = const_cast<Tuple*>(reinterpret_cast<const 
Tuple*>(
> how come this expression changed so much?
Not sure exactly what you mean.

In order to get the timestamp w/ padding, we can't use the regular 
KuduScanBatch::RowPtr API (what we had before). Kudu is providing us with this 
'direct_data' mechanism (i.e. 'raw mode') that just returns a const uint8_t* to 
the beginning of the row batch memory.

I guess there's no reason that this code does the const_cast and 
reinterpret_cast in a different order -- if that's what you meant? I can swap 
their order.


Line 191:     // int64) have 8 bytes of padding following the timestamp. 
Because this padding is
> in l149 you're saying it's still a todo...
I had to edit the kudu client to always turn this functionality on because I'm 
waiting for a change from David to get in to Kudu. That should happen today, so 
I'll be able to update the Kudu client and then set the option on l149.


http://gerrit.cloudera.org:8080/#/c/6526/5/be/src/exec/kudu-table-sink.cc
File be/src/exec/kudu-table-sink.cc:

PS5, Line 304: E
> also print value?
Done


http://gerrit.cloudera.org:8080/#/c/6526/5/be/src/runtime/timestamp-value.h
File be/src/runtime/timestamp-value.h:

Line 32: #include "gutil/walltime.h"
> i'm assuming you're including this because you're inlining some function bo
Done


Line 107:   static TimestampValue FromUnixTimeMicros(time_t unix_time, int64_t 
micros) {
> add a comment
Done


Line 190:     DCHECK(unix_time != NULL);
> switch to nullptr throughout this file
Done


Line 193:     tm temp_tm = boost::posix_time::to_tm(temp);
> what about to_time_t, or does that require a newer version of boost?
I'd prefer to keep this consistent with the ToUnixTime() code (see l224), in 
case there could be subtle differences between the conversions. While that may 
be better for performance, it should be removed when we can move to a 64-bit 
timestamp type.


Line 214:     *unix_time_micros = std::min(MAX_UNIXTIME_MICROS, 
*unix_time_micros);
> should this return false when you go above the max?
Given that this is converting from a TimestampValue which cannot represent 
timestamps above the 9999-12-31 23:59:59.999999999, this shouldn't be able to 
go above the MAX_UNIXTIME_MICROS+1 after the rounding. I can add a DCHECK and a 
comment.


http://gerrit.cloudera.org:8080/#/c/6526/5/testdata/workloads/functional-query/queries/QueryTest/kudu-overflow-ts.test
File 
testdata/workloads/functional-query/queries/QueryTest/kudu-overflow-ts.test:

Line 1: ====
> hm, this is basically the same as the other new .test file. leave todo to f
added a TODO in the .py file


http://gerrit.cloudera.org:8080/#/c/6526/5/testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test
File testdata/workloads/functional-query/queries/QueryTest/kudu_insert.test:

Line 6:    ts timestamp)
> no 'null' clause here?
Done


PS5, Line 53: 999999
> so this is truncation behavior?
Yes, that was the intention of this test case. I'll add a bit more to the 
comment.


Line 183: ---- DML_RESULTS: tdata
> how come this only shows 2 rows?
that's just what this test case includes, see the query above "... where id < 
2". More rows are added in subsequent queries.


Line 323: create table allkeytypes (i1 tinyint, i2 smallint, i3 int, i4 bigint, 
name string,
> add a timestamp key col here?
I can't yet, timestamps aren't supported in the range partition ddl yet. That 
requires more FE work and that'll be a future change.


-- 
To view, visit http://gerrit.cloudera.org:8080/6526
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: Iae6ccfffb79118a9036fb2227dba3a55356c896d
Gerrit-PatchSet: 5
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Dan Hecht <dhe...@cloudera.com>
Gerrit-Reviewer: David Ribeiro Alves <davidral...@gmail.com>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Marcel Kornacker <mar...@cloudera.com>
Gerrit-Reviewer: Matthew Jacobs <m...@cloudera.com>
Gerrit-Reviewer: Thomas Tauber-Marshall <tmarsh...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to