asf-tooling commented on issue #1214:
URL: 
https://github.com/apache/tooling-trusted-releases/issues/1214#issuecomment-4407475410

   <!-- gofannon-issue-triage-bot v2 -->
   
   **Automated triage** — analyzed at `main@751c2146`
   
   **Type:** `question`  •  **Classification:** `no_action`  •  **Confidence:** `high`
   **Application domain(s):** `automated_checks`
   
   ### Summary
   This issue reports that the license checks flag `.gitignore` and
   `META-INF/services` files. However, @sbp (maintainer) has already explained in
   the discussion that this is working as designed: when a `.rat-excludes` file is
   present in the archive, ATR hands exclusion control to the project, so the
   project must add patterns such as `**/.gitignore` and `**/META-INF/services/*`
   to its `.rat-excludes` file. @sbp confirmed that after doing so, the
   problematic files no longer appear in ATR's check output. The user's remaining
   issues appear to stem from running RAT locally with different invocation
   parameters than ATR uses.
   
   ### Where this lives in the code today
   
   #### `atr/tasks/checks/rat.py`: `_synchronous_core_excludes_source` (lines 538-560)
   _currently does this_
   Shows the priority logic: an archive `.rat-excludes` file takes precedence
   over policy excludes, and when neither is present only the defaults apply.
   This is the behavior @sbp explained.
   
   ```python
   def _synchronous_core_excludes_source(
       archive_excludes_path: str | None, policy_excludes: list[str], archive_dir: str, scratch_dir: str
   ) -> tuple[str, str | None]:
       # Determine excludes_source and effective excludes file
       excludes_source: str
       effective_excludes_path: str | None
   
       if archive_excludes_path is not None:
           excludes_source = "archive"
           effective_excludes_path = os.path.join(archive_dir, archive_excludes_path)
           log.info(f"Using archive {_RAT_EXCLUDES_FILENAME}: {archive_excludes_path}")
       elif policy_excludes:
           excludes_source = "policy"
           policy_excludes_file = os.path.join(scratch_dir, _POLICY_EXCLUDES_FILENAME)
           with open(policy_excludes_file, "w") as f:
               f.write("\n".join(policy_excludes))
           effective_excludes_path = policy_excludes_file
           log.info(f"Using policy excludes written to: {policy_excludes_file}")
       else:
           excludes_source = "none"
           effective_excludes_path = None
           log.info("No excludes: using defaults only")
       return excludes_source, effective_excludes_path
   ```
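
The consequence that matters for this issue is the branch order. The following is a minimal sketch of that order only, using a hypothetical stand-in named `pick_excludes_source` rather than the real function (it skips the file writing and logging). It illustrates that policy excludes are not consulted at all once the archive ships its own `.rat-excludes`.

```python
from __future__ import annotations


def pick_excludes_source(
    archive_excludes_path: str | None, policy_excludes: list[str]
) -> str:
    """Hypothetical stand-in that mirrors only the branch order quoted above."""
    if archive_excludes_path is not None:
        # An archive-provided file wins, even if policy excludes are configured.
        return "archive"
    if policy_excludes:
        return "policy"
    return "none"


# Policy excludes are ignored when the archive ships its own excludes file.
print(pick_excludes_source(".rat-excludes", ["**/.gitignore"]))
print(pick_excludes_source(None, ["**/.gitignore"]))
print(pick_excludes_source(None, []))
```

Under this reading, a project that ships `.rat-excludes` has to list every pattern it needs in that file, because nothing from the policy configuration is merged in.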
   
   #### `atr/tasks/checks/license.py`: `_headers_check_core_logic_should_check` (lines 570-584)
   _currently does this_
   The lightweight license checker only checks files that match
   `INCLUDED_PATTERNS` (source code extensions). Files such as `.gitignore` (no
   extension match) and service loader files would not be checked on this path,
   which confirms that these files are only flagged via the RAT path.
   
   ```python
   def _headers_check_core_logic_should_check(filepath: str) -> bool:
       """Determine whether a file should be checked for license headers."""
       if filepath.endswith(constants.GENERATED_FILE_SUFFIXES):
           return False
   
       ext = _get_file_extension(filepath)
       if ext is None:
           return False
   
       # Then check if the file matches any of our included patterns
       for pattern in INCLUDED_PATTERNS:
           if re.search(pattern, filepath, re.IGNORECASE):
               return True
   
       return False
   ```
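
To make that reasoning concrete, here is a rough sketch of the same filter. The values of `GENERATED_FILE_SUFFIXES` and `INCLUDED_PATTERNS` and the helper `get_file_extension` are hypothetical stand-ins, since the real definitions are not quoted in this comment. The actual ATR values may differ.

```python
import os
import re

# Hypothetical stand-ins; the real ATR constants are not shown in this comment.
GENERATED_FILE_SUFFIXES = (".min.js",)
INCLUDED_PATTERNS = [r"\.(java|py|js|xml)$"]


def get_file_extension(filepath):
    """Return the lowercased extension without the dot, or None if absent."""
    _, ext = os.path.splitext(os.path.basename(filepath))
    return ext[1:].lower() if ext else None


def should_check(filepath):
    """Mirror the quoted control flow using the stand-in values above."""
    if filepath.endswith(GENERATED_FILE_SUFFIXES):
        return False
    if get_file_extension(filepath) is None:
        return False
    return any(re.search(p, filepath, re.IGNORECASE) for p in INCLUDED_PATTERNS)


for name in [
    "src/Main.java",
    ".gitignore",
    "src/main/resources/META-INF/services/com.example.Plugin",
]:
    print(name, should_check(name))
```

With these assumed patterns, only `src/Main.java` is selected. `.gitignore` has no extension as far as `os.path.splitext` is concerned, and the service loader file has an extension that matches none of the patterns.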
   
   ### Proposed approach
   No code change is needed. The system is working as designed. @sbp confirmed
   in the discussion that adding `**/.gitignore` and `**/META-INF/services/*` to
   the project's `.rat-excludes` file resolves the issue once the modified archive
   is uploaded to ATR. The user's remaining confusion stems from running RAT
   locally with different invocation parameters than ATR uses. The issue can be
   closed once the user confirms that their upload works correctly, or if no
   further response arrives.
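
For projects that want to script the change, the following is a small sketch of a helper that appends the two patterns named above to an existing `.rat-excludes` file when they are missing. The function name `add_missing_patterns` and the `COMMON_PATTERNS` list are illustrative assumptions and not part of ATR or RAT.

```python
from pathlib import Path

# The two patterns discussed in this issue; extend as the project requires.
COMMON_PATTERNS = ["**/.gitignore", "**/META-INF/services/*"]


def add_missing_patterns(excludes_file, patterns=COMMON_PATTERNS):
    """Append patterns not already listed and return the ones that were added."""
    path = Path(excludes_file)
    existing = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing}
    missing = [p for p in patterns if p not in present]
    if missing:
        # Rewrite the file with the original lines followed by the new patterns.
        path.write_text("\n".join(existing + missing) + "\n")
    return missing
```

Calling `add_missing_patterns(".rat-excludes")` before building the archive returns the list of patterns it appended, or an empty list when the file already contains both.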
   
   ### Open questions
   - Whether the user will upload the modified archives to ATR to verify that
     the fix works end-to-end
   - Whether there is a UX improvement opportunity to document (or suggest in
     the UI) common `.rat-excludes` patterns such as `**/.gitignore` and
     `**/META-INF/services/*` when a `.rat-excludes` file is present but these
     common patterns are missing
   
   _The agent reviewed this issue and is not proposing patches in this run. 
Review the existing-code citations and open questions above before deciding 
next steps._
   
   ### Files examined
   - `atr/tasks/checks/license.py`
   - `atr/tasks/checks/rat.py`
   - `tests/unit/test_checks_license.py`
   - `atr/get/checks.py`
   - `tests/unit/test_checks_rat.py`
   - `atr/storage/readers/checks.py`
   - `atr/classify.py`
   
   ### Related issues
   This issue appears related to: #878.
   
   _Both concern overly strict license/file checks that should be configurable 
or skipped for certain file types_
   
   ---
   *Draft from a triage agent. A human reviewer should validate before merging 
any change. The agent did not run tests or verify diffs apply.*


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

