David Knupp has posted comments on this change.

Change subject: IMPALA-4735: Upgrade pytest in python env to version 2.9.2.
......................................................................


Patch Set 3:

(7 comments)

http://gerrit.cloudera.org:8080/#/c/5640/3//COMMIT_MSG
Commit Message:

PS3, Line 7: 2.9.2
> Why not 3.0.5?
I was going to add a comment about this to the JIRA.

Essentially, I couldn't find the workaround/fix to get rid of this 
3.0.5-specific warning, and I thought an incremental upgrade would at least 
solve some of our issues:

   WP1 None Module already imported so can not be re-written: random_plugin
   WP1 None Module already imported so can not be re-written: xdist
   WP1 None Module already imported so can not be re-written: xdist
   WP1 None Module already imported so can not be re-written: xdist


Line 13: In addition to bumping the version of pytest (and related modules),
> It might make more sense to split these changes up.
I can do that if you feel strongly about it.


Line 15: showed up from pytest re: the imported TestMatrix and TestDimension
> Could you put in the commit message the sed script (or whatever) you used t
Done


Line 24: Tested by doing a standard (non-exhaustive) test run on centos 6.4
> Presumably that checks that the tests that run do not fail. How did you ver
OK, so this seemingly innocuous question opened up (what seems to me) a really 
weird can of worms.

First, it is possible to to do a dry-run of sorts if you specify the 
--collect-only option. This will show exactly which tests would have been 
executed, taking into consideration all test vectors.

Since run-tests.py doesn't seem to honor --collect-only, you have to use 
impala-py.test directly, which means you have to explicitly pass in the test 
directories that run-tests.py would have typically white-listed out for you. 
E.g., this is the command for getting an exhaustive dry-run:

  $ impala-py.test --exploration_strategy=exhaustive --collect-only failure/ 
query_test/ stress/ aux_query_tests/ shell/ hs2/ catalog_service/ metadata/ 
data_errors/ statestore/ unittests/


To test, I ran it first from asf-gerrit-master branch (pytest v2.7.2), and then 
my dev branch (pytest v2.9.2), each time blowing away the python virtual 
environment and bootstrapping a new one. If I grep for "collected" in each 
file, the number was always the same -- 20881 items...

  $ grep collected ~/tmp/pytest_logs/pytest*exhaustive*log
  /home/dknupp/tmp/pytest_logs/pytest272_exhaustive_collect2.log:collecting ... 
collected 20881 items
  /home/dknupp/tmp/pytest_logs/pytest272_exhaustive_collect.log:collecting ... 
collected 20881 items
  /home/dknupp/tmp/pytest_logs/pytest292_exhaustive_collect2.log:collecting ... 
collected 20881 items
  /home/dknupp/tmp/pytest_logs/pytest292_exhaustive_collect.log:collecting ... 
collected 20881 items


However, I was surprised to learn that this number tended to vary when 
collecting tests with the core exploration_strategy -- even when comparing 
successive runs on the same branch. I narrowed it down to a single test -- 
query_test::test_cancellation::TestCancellationSerial::test_cancel_insert.

If I log output from the following command 10 times (regardless of version of 
pytest we use) in quick succession:

  $ impala-py.test --exploration_strategy=core --collect-only 
query_test/test_cancellation.py


...and again grep for the collected items, the result varies between 51 and 57 
items:

  $ grep collected ~/tmp/pytest_logs/pytest272_test_cancellation_collect*
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect0.log:collecting
 ... collected 51 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect1.log:collecting
 ... collected 52 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect2.log:collecting
 ... collected 53 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect3.log:collecting
 ... collected 54 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect4.log:collecting
 ... collected 53 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect5.log:collecting
 ... collected 55 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect6.log:collecting
 ... collected 53 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect7.log:collecting
 ... collected 57 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect8.log:collecting
 ... collected 53 items
  
/home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect9.log:collecting
 ... collected 52 items


Compare this to the result if we try another test module. In this case, the 
result is consistent every time:

  $ grep collected ~/tmp/pytest_logs/pytest292_test_queries_collect*
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect0.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect1.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect2.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect3.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect4.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect5.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect6.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect7.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect8.log:collecting 
... collected 38 items
  /home/dknupp/tmp/pytest_logs/pytest292_test_queries_collect9.log:collecting 
... collected 38 items


If we cat the output of two of the log files, we can see that from run to run, 
it's actually using different vectors:

  $ cat /home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect6.log | 
grep test_cancel_insert
      <Function "test_cancel_insert[table_format: text/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: CTAS | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: select count(l_returnflag) pk from lineitem]">
      <Function "test_cancel_insert[table_format: seq/gzip/block | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: parquet/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: CTAS | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: select count(l_returnflag) pk from lineitem]">
      <Function "test_cancel_insert[table_format: text/gzip/block | 
exec_option: {'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: parquet/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">


In the run immediately following, it uses:

  $ cat /home/dknupp/tmp/pytest_logs/pytest272_test_cancellation_collect7.log | 
grep test_cancel_insert
      <Function "test_cancel_insert[table_format: avro/snap/block | 
exec_option: {'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: text/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: CTAS | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: select l_returnflag from lineitem]">
      <Function "test_cancel_insert[table_format: seq/snap/block | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: avro/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: seq/gzip/block | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: text/gzip/block | 
exec_option: {'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: parquet/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: CTAS | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: select count(l_returnflag) pk from lineitem]">
      <Function "test_cancel_insert[table_format: parquet/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: SELECT | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: compute stats lineitem]">
      <Function "test_cancel_insert[table_format: text/none | exec_option: 
{'disable_codegen': False, 'abort_on_error': 1, 
'exec_single_node_rows_threshold': 0, 'batch_size': 0, 'num_nodes': 0} | 
query_type: CTAS | max_block_mgr_memory: 0 | cancel_delay: 3 | action: None | 
query: select count(l_returnflag) pk from lineitem]">


I also find that actually running the tests, whether or not with run-tests.py 
or pytest directly, produces different results for this one test (although the 
same number of PASSED tests is always reported as 48.)

  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest272_runtests_tests0.log | 
wc -l
  6
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest272_runtests_tests1.log | 
wc -l
  6
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest272_runtests_tests2.log | 
wc -l
  8
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest272_runtests_tests3.log | 
wc -l
  9
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest272_runtests_tests4.log | 
wc -l
  7

  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest292_runtests_tests0.log | 
wc -l
  3
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest292_runtests_tests1.log | 
wc -l
  6
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest292_runtests_tests2.log | 
wc -l
  6
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest292_runtests_tests3.log | 
wc -l
  2
  $ grep test_cancel_insert ~/tmp/pytest_logs/pytest292_runtests_tests4.log | 
wc -l
  6


Kinda not sure what to make of this yet.


http://gerrit.cloudera.org:8080/#/c/5640/3/tests/run-tests.py
File tests/run-tests.py:

PS3, Line 101: we need to account for
> Why? What bad thing could happen otherwise?
I moved this line, and I think the context was lost. I've edited the comment to 
hopefully be more clear. Essentially though, this is specifically required by 
IMPALA-4510 (and generally related to the contortions stemming from 
IMPALA-4417.)

The essential reason is that at various stages of our test run, we need to 
alternately filter and/or augment the list of pytest args. If we're filtering 
args, we need to make sure we don't filter away ones that we need. If we 
separate them, we can do this by checking each arg against the pytest config 
values, e.g.:

this will be retained:

   (Pdb) pytest.config.getvalue('metastore_server')
   'localhost:9083'
   

but this would be (incorrectly) filtered out:

   (Pdb) pytest.config.getvalue('metastore_server=localhost:9083')
   *** ValueError: no option named 'metastore_server=localhost:9083'

I agree that this is all regrettably ugly.


PS3, Line 103: random
> "arbitrary", not "random"
Done


PS3, Line 143: Because of the way our repo is organized.
> This doesn't explain a lot as it stands. I'm also not sure if it is a conti
Thanks. Hopefully the amended version is more clear.


-- 
To view, visit http://gerrit.cloudera.org:8080/5640
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings

Gerrit-MessageType: comment
Gerrit-Change-Id: I40d129e0e63ca5bee126bac6ac923abb3c7e0a67
Gerrit-PatchSet: 3
Gerrit-Project: Impala-ASF
Gerrit-Branch: master
Gerrit-Owner: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: David Knupp <dkn...@cloudera.com>
Gerrit-Reviewer: Jim Apple <jbapple-imp...@apache.org>
Gerrit-Reviewer: Lars Volker <l...@cloudera.com>
Gerrit-Reviewer: Michael Brown <mi...@cloudera.com>
Gerrit-HasComments: Yes

Reply via email to