Michael Brown has posted comments on this change.

Change subject: IMPALA-4351,IMPALA-4353: [qgen] randomly generate INSERT 
statements
......................................................................


Patch Set 6:

(10 comments)

Taras, please see patch set 6 for new changes.

http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/db_connection.py
File tests/comparison/db_connection.py:

PS4, Line 126:  (
> was this completely broken before?
Yeah, this check was cleary broken, but I tended to give it happy-path migrated 
tables, so it didn't matter for my use.


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/funcs.py
File tests/comparison/funcs.py:

Line 461: class CastFunc(Func):
> Don't we already have a cast function in the query generator somewhere?
We only have a CastAsChar function, which exists for this reason:

"This is added so that query generation can assume that any return type can be 
produced by an aggregate or analytic with only one level of nesting."

The CastAsChar SqlWriter implementation basically writes either an Int or Char 
base type as a string.

There is no first-class cast function with many signatures to choose from for 
casting.


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/model_translator.py
File tests/comparison/model_translator.py:

Line 799:     return data_type_class.__name__.upper()
> We are not updating this like for Postgres above. It looks like this part o
It's basically the same case as with Oracle: the code exists and maybe worked 
years ago, but probably doesn't very well anymore. We are certainly not doing 
regular runs. We've also let slip through all kinds of code (say, in 
db_connection) that doesn't work and doesn't get regular testing.

With Oracle at least, I can see us keeping it around for when we want to do 
more testing comparing analytic functions, for example. Has there ever been a 
need for MySQL comparison since preferring PostgreSQL? Maybe we just cut the 
MySQL code in a follow-on patch.


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/query_profile.py
File tests/comparison/query_profile.py:

Line 84:         'INSERT_VALUES_ROWS': (1, 10)}
> It might be a good idea to increase this (maybe later on) so that we someti
I agree. But it makes the statements incredibly long, and without serialization 
as artifacts, and replay, it's hard for someone to deal with such a long 
statement, for example when copy/pasting. IMPALA-4732 will deal with this and 
related concerns.


PS4, Line 643: col1, col2, ..
> if not all columns that the table contains are in this list, then NULLs wil
yes


PS4, Line 647: partial'
> Just curious, what is the reason to write 'partial' first (before the the =
Old habit related to C, when you could mistake = for == in comparison. If you 
use the literal on the left, you can't mistakenly use =, since you can't assign 
to a literal. Do you want me to change it? I don't believe there's anything 
wrong with this from a style standpoint.


PS4, Line 653: maining_columns[:additi
> We will always be biased towards inserting from the beginning of the table.
Good catch. shuffle() acts on the parameter and doesn't have a return value, so 
my use doesn't match yours, but the concept was kept.


PS4, Line 654: shuf
> There is a lot of code duplication here. This can be simplified:
Removed duplication.


http://gerrit.cloudera.org:8080/#/c/5486/4/tests/comparison/statement_generator.py
File tests/comparison/statement_generator.py:

Line 93:       # expressions will be implicitly casted in the databases. This 
requires less work
> Maybe we should add a TODO jira to make it possible for the query generator
Done


Line 131:         values_row.append(val)
> Might be interesting to sometimes insert 2 identical rows. Or 2 rows with t
I think this can be more effective once we increase the number of rows 
inserted. IMPALA-4732


-- 
To view, visit http://gerrit.cloudera.org:8080/5486
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I842b41f0eed07ab30ec76d8fc3cdd5affb525af6
Gerrit-PatchSet: 6
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: Michael Brown <[email protected]>
Gerrit-Reviewer: David Knupp <[email protected]>
Gerrit-Reviewer: Michael Brown <[email protected]>
Gerrit-Reviewer: Taras Bobrovytsky <[email protected]>
Gerrit-HasComments: Yes

Reply via email to