potiuk commented on issue #23538:
URL: https://github.com/apache/airflow/issues/23538#issuecomment-1130737710

   > @potiuk Currently we check only if the versions are right and docker is 
running. We don't check in code if the container started is healthy. Do we need 
to check that before running command and error out in case if it is unhealthy?
   
   No need, really. The health checks are already done in docker-compose AND in 
the "entrypoint_ci.sh" when integrations are enabled. The `cassandra` case is 
the result of the first check. The main point there is to detect whether the 
failure was caused by a docker-compose health check failure (the first check) 
and to print a better (red) error message in that case. You can simulate the 
case by modifying the health check so that it always fails (for example here: 
https://github.com/apache/airflow/blob/f721529decf4116d927d1f109beb389bdd605a0c/scripts/ci/docker-compose/integration-cassandra.yml#L28
).
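
   A minimal sketch of that detection in Python, assuming the captured 
docker-compose output contains the word "unhealthy" when a health check fails 
(the exact wording differs between Compose versions). The name 
`explain_failure` and the message text are hypothetical, chosen only for 
illustration:

```python
import re
from typing import Optional

RED = "\033[31m"
RESET = "\033[0m"

# Assumption: Compose mentions "unhealthy" somewhere in its output when a
# container fails its health check; the precise message varies by version.
UNHEALTHY_PATTERN = re.compile(r"unhealthy", re.IGNORECASE)


def explain_failure(returncode: int, output: str) -> Optional[str]:
    """Return a red message if the failure looks like a health check failure."""
    if returncode == 0:
        return None
    if UNHEALTHY_PATTERN.search(output):
        return (
            f"{RED}ERROR: an integration container failed its "
            f"docker-compose health check.{RESET}"
        )
    # Some other kind of failure: leave it to the normal error handling.
    return None
```

   The caller would print the returned message (if any) before exiting with 
the original return code.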
   
   > @potiuk Can you point me to the place where the current output of tests is 
logged?
   
   A few places. This is where it gets complicated (and we need to make it simpler):
   
   * The logs are captured to a file: 
https://github.com/apache/airflow/blob/f721529decf4116d927d1f109beb389bdd605a0c/scripts/ci/testing/ci_run_airflow_testing.sh#L49
 (>"${JOB_LOG}" 2>&1). Each parallel run has its own log file.
   
   * Then we have a monitoring loop that reads the last few lines from the log 
of each parallel task and prints them: 
https://github.com/apache/airflow/blob/f721529decf4116d927d1f109beb389bdd605a0c/scripts/ci/libraries/_parallel.sh#L60
   
   * Finally, when all of the jobs have finished, their logs are printed using 
"GitHub group output" so that we can see them in "folded" groups. The groups 
are green or red depending on whether the particular job succeeded or 
failed.
   
https://github.com/apache/airflow/blob/f721529decf4116d927d1f109beb389bdd605a0c/scripts/ci/libraries/_parallel.sh#L133
   
https://github.com/apache/airflow/blob/f721529decf4116d927d1f109beb389bdd605a0c/scripts/ci/libraries/_parallel.sh#L148
   
   * We also print a summary at the end and exit with an error if any of the 
parallel jobs failed.
   
https://github.com/apache/airflow/blob/f721529decf4116d927d1f109beb389bdd605a0c/scripts/ci/libraries/_parallel.sh#L165
   
   This is complex, and I think we can make it simpler (and nicer) by using 
Python to manage those parallel runs:
   
   1) There is no need to save the output to files. We can just capture the 
output as usual and print it at the end from the captured variables. It will 
take a bit of memory, but it should be fine, I think.
   
   2) We do not really need to print the last few lines. We only do it to show 
progress, but it is not THAT useful. Instead, we should show a simple progress 
indicator, for example how many lines of output each job has written so far, 
and MAYBE, in the case of tests, parse the last few lines and try to guess the 
"percent of progress". For example here: 
https://github.com/apache/airflow/runs/6498181414?check_suite_focus=true#step:10:240
   
   ```
     ### The last 2 lines for Providers process: /tmp/tmp.aRrLkYXmmw/tests/Providers/stdout ###
     tests/providers/amazon/aws/hooks/test_ec2.py .............               [  3%]
     tests/providers/amazon/aws/hooks/test_eks.py ...............
   ```
   
   If we read the last 90-100 characters, we should be able to find the 
"[  3%]" with a regexp and simply print that :). With Python we can do this a 
lot more easily than with Bash.
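
   As a rough sketch of that idea (the function name `guess_progress` and the 
100-character default are assumptions for illustration, not existing code), 
the percentage could be extracted like this:

```python
import re
from typing import Optional

# pytest ends some lines with a marker such as "[  3%]" or "[100%]".
PROGRESS_PATTERN = re.compile(r"\[\s*(\d{1,3})%\]")


def guess_progress(output: str, tail_chars: int = 100) -> Optional[int]:
    """Return the last percentage found in the tail of the output, if any."""
    matches = PROGRESS_PATTERN.findall(output[-tail_chars:])
    return int(matches[-1]) if matches else None


sample = (
    "tests/providers/amazon/aws/hooks/test_ec2.py .............               [  3%]\n"
    "tests/providers/amazon/aws/hooks/test_eks.py ...............\n"
)
print(guess_progress(sample))  # prints 3
```

   If the line currently being written is longer than the tail we read, the 
marker is missed and the function returns `None`, so the caller should keep 
the previously known value in that case.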
   
   3) It's really useful to get the foldable outputs after all jobs have 
finished and to color them green or red depending on status. Maybe we can also 
figure out some improvements there, but those could be done later.
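
   Points 1) and 3) could be sketched together roughly as below. This is only 
an illustration under some assumptions: the names `run_job` and `run_all` are 
hypothetical, jobs are plain command lists, and ANSI colors are assumed to 
render in the group titles. The `::group::` / `::endgroup::` lines are the 
GitHub Actions workflow commands for foldable log groups.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List

GREEN = "\033[32m"
RED = "\033[31m"
RESET = "\033[0m"


def run_job(command: List[str]) -> subprocess.CompletedProcess:
    """Run one job, keeping stdout and stderr together in memory."""
    return subprocess.run(
        command,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        text=True,
        check=False,
    )


def run_all(jobs: Dict[str, List[str]]) -> List[str]:
    """Run all jobs in parallel, print folded output, return failed job names."""
    with ThreadPoolExecutor(max_workers=max(1, len(jobs))) as executor:
        futures = {name: executor.submit(run_job, cmd) for name, cmd in jobs.items()}
        results = {name: future.result() for name, future in futures.items()}

    failed = [name for name, result in results.items() if result.returncode != 0]

    # One foldable group per job, colored by status.
    for name, result in results.items():
        color = RED if name in failed else GREEN
        print(f"::group::{color}{name}{RESET}")
        print(result.stdout, end="")
        print("::endgroup::")

    # Summary at the end.
    for name in results:
        status = f"{RED}FAILED{RESET}" if name in failed else f"{GREEN}OK{RESET}"
        print(f"{name}: {status}")
    return failed
```

   A caller would then finish with something like 
`sys.exit(1 if run_all(jobs) else 0)`, so the step fails when any parallel 
job failed.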
   
   
   > Could i check by using tqdm to show the progress? @potiuk
   
   Rich has a fantastic, customizable progress bar: 
https://rich.readthedocs.io/en/stable/progress.html. No need to use tqdm. 
However, the problem is that a progress bar in CI output does not work the way 
it does in a terminal. We cannot really get such a nice, dynamic progress bar 
there, so we should think more about printing progress as a percentage from 
time to time.
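
   One possible shape for such "from time to time" reporting, as a sketch 
only: a background loop prints one plain status line at a fixed interval 
instead of redrawing a bar. The names `format_progress` and 
`print_progress_periodically`, and the idea of reporting line counts per job, 
are assumptions for illustration.

```python
import threading
from typing import Callable, Dict


def format_progress(line_counts: Dict[str, int]) -> str:
    """Build one plain status line, e.g. 'Core: 12 lines, Providers: 240 lines'."""
    return ", ".join(
        f"{name}: {count} lines" for name, count in sorted(line_counts.items())
    )


def print_progress_periodically(
    get_line_counts: Callable[[], Dict[str, int]],
    stop_event: threading.Event,
    interval: float = 10.0,
) -> None:
    """Print a status line every `interval` seconds until `stop_event` is set."""
    # Event.wait() returns False on timeout and True once the event is set.
    while not stop_event.wait(interval):
        print(format_progress(get_line_counts()), flush=True)
```

   The loop would run in a `threading.Thread` while the jobs execute, and the 
main thread would call `stop_event.set()` once all jobs have finished.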
   
   
   
   
   
   
   

