potiuk commented on code in PR #24871:
URL: https://github.com/apache/airflow/pull/24871#discussion_r916339053


##########
dev/breeze/src/airflow_breeze/commands/ci_commands.py:
##########
@@ -114,11 +127,68 @@ def resource_check(verbose: bool, dry_run: bool):
     check_docker_resources(shell_params.airflow_image_name, verbose=verbose, 
dry_run=dry_run)
 
 
+HOME_DIR = Path(os.path.expanduser('~')).resolve()
+
+DIRECTORIES_TO_FIX = [
+    AIRFLOW_SOURCES_ROOT,
+    HOME_DIR / ".aws",
+    HOME_DIR / ".azure",
+    HOME_DIR / ".config/gcloud",
+    HOME_DIR / ".docker",
+    AIRFLOW_SOURCES_ROOT,
+]
+
+
+def fix_ownership_for_file(file: Path, dry_run: bool, verbose: bool):
+    get_console().print(f"[info]Fixing ownership of {file}")
+    result = run_command(
+        ["sudo", "chown", f"{os.getuid}:{os.getgid()}", str(file.resolve())],
+        check=False,
+        stderr=subprocess.STDOUT,
+        dry_run=dry_run,
+        verbose=verbose,
+    )
+    if result.returncode != 0:
+        get_console().print(f"[warning]Could not fix ownership for {file}: 
{result.stdout}")
+
+
+def fix_ownership_for_path(path: Path, dry_run: bool, verbose: bool):
+    if path.is_dir():
+        for p in Path(path).rglob('*'):
+            if p.owner == 'root':
+                fix_ownership_for_file(p, dry_run=dry_run, verbose=verbose)
+    else:
+        if path.owner == 'root':
+            fix_ownership_for_file(path, dry_run=dry_run, verbose=verbose)
+
+
+def fix_ownership_without_docker(dry_run: bool, verbose: bool):
+    for directory_to_fix in DIRECTORIES_TO_FIX:
+        fix_ownership_for_path(directory_to_fix, dry_run=dry_run, 
verbose=verbose)
+
+
 @main.command(name="fix-ownership", help="Fix ownership of source files to be 
same as host user.")
[email protected](
+    '--use-sudo',
+    is_flag=True,
+    help="Use sudo instead of docker image to fix the ownership. You need to 
be a `sudoer` to run it",
+    envvar='USE_SUDO',
+)
 @option_github_repository
 @option_verbose
 @option_dry_run
-def fix_ownership(github_repository: str, verbose: bool, dry_run: bool):
+def fix_ownership(github_repository: str, use_sudo: bool, verbose: bool, 
dry_run: bool):
+    system = platform.system().lower()
+    if system != 'linux':
+        get_console().print(
+            f"[warning]You should only need to run fix-ownership on Linux and 
your system is {system}"

Review Comment:
   BTW. Just to give a little context on why we need the `fix-ownership` in CI 
at all.
   
   The "root" (pun intended!) problenm here is that we are re-using workers 
sometimes. This means that when we "leave" the job, we should do some cleanup. 
I would prefer (and this is what I normally do) to  always start from scratch 
on CI. I call it "TRUE CLEAN STATE" of the CI worker. This is what GitHub 
public runners do - you always get empty, pristine, clean workspace when you 
start your job. I would very, very much prefer it - we already had some cases 
where left-overs from previous jobs (like those root-owned fles) created 
problem for the next worker (when they tried to checkout latest branch, they 
failed as they could not delete the root-owned files).
   
   But in our case currently we use Self-hosted runners which are 
often-reused-AWS instances. until we move to K8S, we cannot really do much 
about it (unless we want to loose extra time on spinning the machines) and we 
do everything possible to clean-up the machines after the job is complete, so 
that the next job gets a "clean state" - this "fix-ownership" is part of the 
cleanup. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to