Taras Bobrovytsky has posted comments on this change.

Change subject: IMPALA-4351,IMPALA-4353: [qgen] randomly generate INSERT 
statements
......................................................................


Patch Set 4:

(10 comments)

http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

PS4, Line 126:  (
was this completely broken before?


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/funcs.py
File tests/comparison/funcs.py:

Line 461: class CastFunc(Func):
Don't we already have a cast function in the query generator somewhere?


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

Line 799:     return data_type_class.__name__.upper()
We are not updating this like for Postgres above. It looks like this part of 
the code is never being used and is becoming stale. Maybe we should think about 
addressing this somehow?


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/query_profile.py
File tests/comparison/query_profile.py:

Line 84:         'INSERT_VALUES_ROWS': (1, 10)}
It might be a good idea to increase this (maybe later on) so that we sometimes 
get large values (like 1000).


PS4, Line 643: col1, col2, ..
if not all columns that the table contains are in this list, then NULLs will be 
inserted, right?


PS4, Line 647: partial'
Just curious, what is the reason to write 'partial' first (before the the ==), 
instead of second?


PS4, Line 653: table.updatable_columns
We will always be biased towards inserting from the beginning of the table. How 
about shuffling first: .extend(shuffle(table.updatable_columns)[:additional])

table.updatable_columns returns a copy, so shuffling shouldn't cause problems.


PS4, Line 654: else
There is a lot of code duplication here. This can be simplified:

additional_column_count = randint(0, len(...))
if additional_column_count == 0 and len(cilumns_to_insert) == 0:
  additional_column_count = 1
coolumns_to_insert.extend(...)


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/statement_generator.py
File tests/comparison/statement_generator.py:

Line 93:       # expressions will be implicitly casted in the databases. This 
requires less work
Maybe we should add a TODO jira to make it possible for the query generator to 
be able to deal with implicit casts? We might be losing some coverage here.


Line 131:         values_row.append(val)
Might be interesting to sometimes insert 2 identical rows. Or 2 rows with the 
same primary key value.


-- 
To view, visit http://gerrit.cloudera.org:8080/5486
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I842b41f0eed07ab30ec76d8fc3cdd5affb525af6
Gerrit-PatchSet: 4
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to