asf-tooling commented on issue #1149:
URL:
https://github.com/apache/tooling-trusted-releases/issues/1149#issuecomment-4409769912
<!-- gofannon-issue-triage-bot v2 -->
**Automated triage** — analyzed at `main@2da7807a`
**Type:** `new_feature` • **Classification:** `actionable` •
**Confidence:** `high`
**Application domain(s):** `sbom_analysis`, `shared_infrastructure`
### Summary
Issue #1149 requests adding a logging framework to the SBOM CLI
(atr/sbom/cli.py) so that when SBOM library functions are invoked from CLI they
log to the console, but when called from the ATR server (e.g., background tasks
via atr/tasks/sbom.py) they route through the existing structured logging
infrastructure. Currently, atr/sbom/cli.py uses raw print() statements
exclusively for all output and has no logging integration. The existing
atr/log.py and atr/loggers.py modules provide the structlog-based logging
framework that should be reused.
### Where this lives in the code today
#### `atr/sbom/cli.py` — `main` (lines 176-190)
_needs modification_
CLI entry point that needs logging initialization before dispatching
commands.
```python
def main() -> None: # noqa: C901
if len(sys.argv) < 3:
print("Usage: python -m atr.sbom <COMMAND> <SBOM PATH>")
print(
"COMMAND can be one of: license, merge, missing, osv, "
"outdated, patch-ntia, patch-vuln, scores, validate-cli,
validate-py, where"
)
sys.exit(1)
path = pathlib.Path(sys.argv[2])
bundle = path_to_bundle(path)
if not bundle:
raise RuntimeError("Could not load bundle")
match sys.argv[1]:
case "license":
command_license(bundle)
```
#### `atr/sbom/cli.py` — `command_license` (lines 37-57)
_needs modification_
Example command function using print() that should use logging instead.
```python
def command_license(bundle: models.bundle.Bundle) -> None:
_, warnings, errors = check(bundle.bom)
if warnings:
print("WARNINGS (Category B):")
for warning in warnings:
version_str = f" {warning.component_version}" if
warning.component_version else ""
scope_str = f" [scope: {warning.scope}]" if warning.scope else ""
print(f" - {warning.component_name}{version_str}:
{warning.license_expression}{scope_str}")
print()
if errors:
print("ERRORS (Category X):")
for error in errors:
version_str = f" {error.component_version}" if
error.component_version else ""
scope_str = f" [scope: {error.scope}]" if error.scope else ""
unknown_suffix = " (Category X due to unknown license
identifiers)" if error.any_unknown else ""
name_str = f"{error.component_name}{version_str}"
license_str =
f"{error.license_expression}{scope_str}{unknown_suffix}"
print(f" - {name_str}: {license_str}")
print()
if (not warnings) and (not errors):
print("All licenses are approved (Category A)")
```
#### `atr/loggers.py` — `configure_structlog` (lines 32-42)
_extension point_
Existing structlog configuration that should be reused by CLI logging setup.
```python
def configure_structlog(shared_processors:
Sequence[structlog.types.Processor]) -> None:
structlog.configure(
processors=[
*shared_processors,
structlog.stdlib.ProcessorFormatter.wrap_for_formatter,
],
wrapper_class=structlog.stdlib.BoundLogger,
context_class=dict,
logger_factory=structlog.stdlib.LoggerFactory(),
cache_logger_on_first_use=True,
)
```
#### `atr/loggers.py` — `shared_processors` (lines 94-103)
_extension point_
Shared processor chain that CLI logging should also use for consistency.
```python
def shared_processors() -> list[structlog.types.Processor]:
return [
structlog.contextvars.merge_contextvars,
structlog.stdlib.add_log_level,
structlog.stdlib.add_logger_name,
structlog.stdlib.PositionalArgumentsFormatter(),
structlog.processors.TimeStamper(fmt="iso"),
structlog.processors.StackInfoRenderer(),
structlog.processors.UnicodeDecoder(),
]
```
#### `atr/loggers.py` — `create_output_formatter` (lines 59-69)
_extension point_
Formatter factory to reuse for CLI console output.
```python
def create_output_formatter(
shared_processors: Sequence[structlog.types.Processor],
renderer: structlog.types.Processor,
) -> structlog.stdlib.ProcessorFormatter:
return structlog.stdlib.ProcessorFormatter(
processors=[
structlog.stdlib.ProcessorFormatter.remove_processors_meta,
renderer,
],
foreign_pre_chain=list(shared_processors),
)
```
#### `atr/log.py` — `info` (lines 182-183)
_extension point_
Convenience function that SBOM functions should use instead of print() for
diagnostic/result logging.
```python
def info(msg: str, **kwargs) -> None:
_event(logging.INFO, msg, **kwargs)
```
#### `atr/server.py` — `_app_setup_logging` (lines 380-399)
_currently does this_
Server's logging setup pattern that the CLI setup should mirror in a lighter
form.
```python
def _app_setup_logging(app: base.QuartApp, config_mode: config.Mode,
app_config: type[config.AppConfig]) -> None:
"""Setup application logging with structlog and queue-based handlers."""
import logging.handlers
import structlog
import atr.loggers as loggers
shared_processors = loggers.shared_processors()
# Output handler: pretty console for dev (Debug and Allow Tests), JSON
for non-dev (Docker, etc.)
output_handler = logging.StreamHandler(sys.stderr)
use_json_output = app_config.LOG_JSON or (not
config.is_dev_environment())
if use_json_output:
# JSON output should include rendered exceptions
output_handler.setFormatter(loggers.create_json_formatter(shared_processors))
else:
renderer: structlog.types.Processor =
structlog.dev.ConsoleRenderer(colors=True)
# Queue-based logging for thread safety
output_handler.setFormatter(loggers.create_output_formatter(shared_processors,
renderer))
```
### Where new code would go
- `atr/sbom/cli.py` — before main function
Add a setup_cli_logging() function that initializes structlog with console
output for standalone CLI use.
### Proposed approach
Add a `setup_cli_logging()` function to `atr/sbom/cli.py` that configures
structlog and stdlib logging with a console renderer on stderr, mirroring the
server's setup but without file handlers or queue listeners. Call this from
`main()` before dispatching commands. Then replace `print()` calls in the
command functions with `atr.log.info()` (for results) and
`atr.log.warning()`/`atr.log.error()` (for problems). When these same
underlying SBOM library functions are invoked from the ATR server (via
`atr/tasks/sbom.py`), the server's existing structlog configuration handles
routing to the appropriate log files. The key insight is that structlog's
stdlib integration means the same `atr.log.*` calls work in both contexts —
only the handler configuration differs, and that's set up at application/CLI
startup time.
For user-facing structured output (like license check results), keep using
`print()` to stdout so CLI consumers can parse it. Use `atr.log` for
operational/diagnostic messages (errors loading files, scan progress, etc.)
which go to stderr via the logging framework. This separation allows piping CLI
output while still getting diagnostics.
### Suggested patches
#### `atr/sbom/cli.py`
Add CLI logging initialization and replace diagnostic print() calls with
structured logging.
````diff
--- a/atr/sbom/cli.py
+++ b/atr/sbom/cli.py
@@ -16,10 +16,13 @@
# under the License.
import asyncio
+import logging
import pathlib
import sys
+import structlog
import yyjson
+
from cyclonedx.model.bom import Bom
from cyclonedx.output import make_outputter
from cyclonedx.schema import OutputFormat
@@ -31,6 +34,26 @@ from .cyclonedx import validate_cli, validate_py
from .licenses import check
from .sbomqs import total_score
from .tool import plugin_outdated_version
from .utilities import bundle_to_ntia_patch, bundle_to_vuln_patch,
patch_to_data, path_to_bundle
+import atr.log as log
+
+
+def setup_cli_logging(level: int = logging.INFO) -> None:
+ """Configure logging for standalone CLI invocation.
+
+ When SBOM functions are called from the ATR server, the server's
+ structlog configuration is already active. This function provides
+ an equivalent console-oriented setup for direct CLI use.
+ """
+ import atr.loggers as loggers
+
+ shared = loggers.shared_processors()
+ loggers.configure_structlog(shared)
+
+ handler = logging.StreamHandler(sys.stderr)
+ renderer: structlog.types.Processor =
structlog.dev.ConsoleRenderer(colors=sys.stderr.isatty())
+ handler.setFormatter(loggers.create_output_formatter(shared, renderer))
+ logging.basicConfig(level=level, handlers=[handler], force=True)
+
def command_license(bundle: models.bundle.Bundle) -> None:
_, warnings, errors = check(bundle.bom)
@@ -167,12 +190,14 @@ def command_where(bundle: models.bundle.Bundle) ->
None:
def main() -> None: # noqa: C901
if len(sys.argv) < 3:
print("Usage: python -m atr.sbom <COMMAND> <SBOM PATH>")
print(
"COMMAND can be one of: license, merge, missing, osv, "
"outdated, patch-ntia, patch-vuln, scores, validate-cli,
validate-py, where"
)
sys.exit(1)
+
+ setup_cli_logging()
path = pathlib.Path(sys.argv[2])
bundle = path_to_bundle(path)
if not bundle:
- raise RuntimeError("Could not load bundle")
+ log.error("Could not load bundle", path=str(path))
+ sys.exit(1)
match sys.argv[1]:
case "license":
command_license(bundle)
````
### Open questions
- Should user-facing output (license check results, validation output)
remain as print() to stdout, or should it all go through the logging framework?
The diff above only converts diagnostic messages; a follow-up could convert
result output too.
- Should the SBOM library functions themselves (in atr/sbom/licenses.py,
atr/sbom/cyclonedx.py, etc.) also add logging calls for errors and progress?
The issue mentions 'where the functions are invoked from', suggesting the
library layer should log too.
- Should a --verbose or --debug CLI flag be added to control the log level
in CLI mode?
- Are there existing callers of the SBOM library from server tasks
(atr/tasks/sbom.py) that currently swallow errors which logging would surface?
### Files examined
- `atr/sbom/cli.py`
- `atr/log.py`
- `atr/loggers.py`
- `tests/e2e/sbom/test_post.py`
- `atr/server.py`
- `scripts/keys_import.py`
- `atr/config.py`
- `atr/util.py`
---
*Draft from a triage agent. A human reviewer should validate before merging
any change. The agent did not run tests or verify diffs apply.*
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]