GitHub user weiqingy edited a discussion: [Feature] Per-Event-Type Configurable 
Log Levels for Event Log

GitHub issue: https://github.com/apache/flink-agents/issues/541

## Motivation

The event log captures every event flowing through an agent for debugging, 
auditing, and observability. Today the only filtering mechanism is 
`EventFilter`, a binary accept/reject predicate. This makes it impossible to:

- Log some event types at full detail while keeping others concise.
- Reduce log volume in production without losing visibility into critical event 
types.
- Adjust verbosity for a single event type at job submission time without 
re-specifying an entire filter.

This design introduces **per-event-type configurable log levels** so that 
operators can independently control the verbosity of each event type.

## Log Levels

There are three log levels, ordered from least to most verbose:

| Level | Behavior |
|---|---|
| `OFF` | Event is not logged at all. |
| `STANDARD` | Event is logged. Details may be omitted to keep logs concise (see [Truncation Strategy](#truncation-strategy-standard-level)). |
| `VERBOSE` | Event is logged with full detail. Nothing is omitted. |

The default level for all event types is `STANDARD` with truncation active 
(default `max-length` of 4096 characters). This means `STANDARD` and `VERBOSE` 
have distinct behaviors out of the box.
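
As a minimal sketch, the three levels and their defaults could be modeled as an ordered enum. The names `EventLogLevel`, `parse`, `DEFAULT_LEVEL`, and `DEFAULT_MAX_LENGTH` are hypothetical, and case-insensitive parsing is an assumption that this design does not specify.

```python
from enum import IntEnum


class EventLogLevel(IntEnum):
    """Hypothetical level enum; a larger value means more verbose."""

    OFF = 0
    STANDARD = 1
    VERBOSE = 2

    @classmethod
    def parse(cls, text: str) -> "EventLogLevel":
        # Assumption: config values are matched case-insensitively.
        try:
            return cls[text.strip().upper()]
        except KeyError:
            raise ValueError(f"Unknown event log level: {text!r}") from None


# Defaults stated in this design.
DEFAULT_LEVEL = EventLogLevel.STANDARD
DEFAULT_MAX_LENGTH = 4096
```

Using an ordered enum keeps comparisons such as `level > EventLogLevel.OFF` readable, which is convenient if more levels are added later.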

### Why Default to STANDARD with Active Truncation

| Approach | Out-of-the-box Behavior | Backward Compatible? | Semantic Clarity |
|---|---|---|---|
| **STANDARD + active truncation (chosen)** | Events logged with long fields truncated automatically. | No. Existing logs may be truncated after upgrade. | High. STANDARD and VERBOSE are immediately distinct. |
| VERBOSE (no truncation) | All events logged in full, identical to today. | Yes. Zero behavior change. | Medium. Users must opt in to STANDARD and set max-length to see benefits. |
| STANDARD + max-length=0 | All events logged in full, identical to today. | Yes. Zero behavior change. | Low. STANDARD and VERBOSE are identical until max-length is modified. |

We chose active truncation because:

- **Semantic clarity**: `STANDARD` and `VERBOSE` mean different things from day 
one. No configuration required to see the distinction.
- **Simple opt-in path**: Operators who need full detail for specific event 
types simply set those types to `VERBOSE`. No need to understand or configure 
`max-length`.
- **Practical benefit by default**: AI agent events frequently contain very 
long LLM responses (10K+ characters) and tool outputs. Truncation at 4096 
characters keeps logs usable for monitoring without excessive disk usage.
- **Backward compatibility trade-off**: Existing users will see truncated logs 
after upgrade. This is mitigated by setting `event-log.level: VERBOSE` to 
restore the previous behavior, or setting specific event types to `VERBOSE` for 
full detail where needed.

## Configuration

### Config Key Pattern

Per-event-type settings use the pattern:

```
event-log.<EVENT_TYPE>.<property>
```

The event type appears in the middle, and the property name appears at the 
tail. This structure groups all settings for a given event type together and 
allows future per-type properties (e.g., routing events to different logger 
destinations) without restructuring the key namespace. This follows standard 
hierarchical logger configuration conventions.

**Future extensibility example:**

```yaml
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.max-length: 8192  # future
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.logger: kafka     # future
```

### Event Type Names in Config Keys

Config keys use the **fully-qualified class name** of the event type. This 
avoids ambiguity when different packages define event classes with the same 
simple name.

```
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level=VERBOSE
event-log.org.apache.flink.agents.api.InputEvent.level=OFF
```
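
To illustrate the key pattern, here is a small sketch of how a key might be split into its event type and property. The function `parse_key` and the `ParsedKey` tuple are hypothetical names, not part of the proposed API. The sketch treats the last dot-separated segment as the property, so the reserved key `event-log.standard.max-length` would need special-casing in a real implementation.

```python
from typing import NamedTuple, Optional

PREFIX = "event-log."


class ParsedKey(NamedTuple):
    event_type: str  # "" denotes the root (no event type segment)
    prop: str


def parse_key(key: str) -> Optional[ParsedKey]:
    """Split 'event-log.<EVENT_TYPE>.<property>' into its two parts."""
    if not key.startswith(PREFIX):
        return None
    rest = key[len(PREFIX):]
    # The property is the final segment; everything before it is the type.
    event_type, _, prop = rest.rpartition(".")
    return ParsedKey(event_type, prop)


parsed = parse_key("event-log.org.apache.flink.agents.api.InputEvent.level")
print(parsed.event_type, parsed.prop)
```

Because the property sits at the tail, the root key `event-log.level` parses to an empty event type, which matches its role as the global default.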

### Hierarchical Inheritance

Log level resolution follows **hierarchy inheritance**. The dot-separated event 
type name defines a natural hierarchy. When an event type has no exact config 
match, the level is inherited from the nearest configured ancestor.

The root config key `event-log.level` serves as the global default — no special 
`default` keyword is needed.

**Resolution order** (most specific wins):

1. **Exact match**: 
`event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level`
2. **Parent package**: `event-log.org.apache.flink.agents.api.event.level`
3. **Grandparent package**: `event-log.org.apache.flink.agents.api.level`
4. ... _(walks up the hierarchy)_
5. **Root**: `event-log.level`
6. **Built-in default**: `STANDARD` (if `event-log.level` is not configured)

**Example**: Given these event types:

```
org.apache.flink.agents.api.InputEvent
org.apache.flink.agents.api.OutputEvent
org.apache.flink.agents.api.event.ChatRequestEvent
org.apache.flink.agents.api.event.ToolRequestEvent
```

And this config:

```yaml
event-log.level: STANDARD                                                    # root default
event-log.org.apache.flink.agents.api.event.level: OFF                       # package-level
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE  # exact type
```

Resolution:

| Event Type | Resolved Level | Why |
|---|---|---|
| `...api.event.ChatRequestEvent` | `VERBOSE` | Exact match |
| `...api.event.ToolRequestEvent` | `OFF` | No exact match → inherits from `...api.event` |
| `...api.InputEvent` | `STANDARD` | No exact match, no `...api` key → inherits from root |
| `...api.OutputEvent` | `STANDARD` | Same as above |
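
The resolution order can be sketched as a walk up the dot-separated segments. This is an illustrative model only: `resolve_level` is a hypothetical name, and the config is represented as a flat string-to-string mapping, which may differ from how the real configuration object is accessed.

```python
from typing import Mapping

BUILT_IN_DEFAULT = "STANDARD"


def resolve_level(event_type: str, config: Mapping[str, str]) -> str:
    """Return the most specific configured level for an event type."""
    segments = event_type.split(".")
    # Try the exact type first, then each ancestor, dropping one
    # trailing segment per iteration.
    while segments:
        key = "event-log." + ".".join(segments) + ".level"
        if key in config:
            return config[key]
        segments.pop()
    # Fall back to the root key, then to the built-in default.
    return config.get("event-log.level", BUILT_IN_DEFAULT)


config = {
    "event-log.level": "STANDARD",
    "event-log.org.apache.flink.agents.api.event.level": "OFF",
    "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level": "VERBOSE",
}
for name in (
    "org.apache.flink.agents.api.event.ChatRequestEvent",
    "org.apache.flink.agents.api.event.ToolRequestEvent",
    "org.apache.flink.agents.api.InputEvent",
):
    print(name, "->", resolve_level(name, config))
```

The same function works unchanged for Python type strings such as `flink_agents.api.events.event.OutputEvent`, since it only relies on dot-separated segments.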

### Complete Config Key Reference

| Config Key | Type | Default | Description |
|---|---|---|---|
| `event-log.level` | String (`OFF` / `STANDARD` / `VERBOSE`) | `STANDARD` | Root default log level for all event types. |
| `event-log.<EVENT_TYPE>.level` | String (`OFF` / `STANDARD` / `VERBOSE`) | _(inherits from parent in hierarchy)_ | Log level for a specific event type or package. |
| `event-log.standard.max-length` | Integer | `4096` | Maximum character length for serialized event content at `STANDARD` level. Set to `0` to disable truncation. Has no effect at `VERBOSE` level. |

### Configuration Examples

**Config file:**

```yaml
# Root default: log everything at STANDARD
event-log.level: STANDARD

# Java events: use Java FQCN
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ChatResponseEvent.level: VERBOSE
event-log.org.apache.flink.agents.api.event.ContextRetrievalRequestEvent.level: OFF
event-log.org.apache.flink.agents.api.event.ContextRetrievalResponseEvent.level: OFF

# Python events: use Python module path (the event type string from PythonEvent)
event-log.flink_agents.api.events.event.OutputEvent.level: VERBOSE
event-log.my_module.MyCustomEvent.level: OFF

# Truncation is active by default (max-length: 4096).
# To increase the limit:
# event-log.standard.max-length: 8192
```

The config key uses whatever type string appears in the event log's `eventType` 
field. For Java events, that's the Java FQCN (e.g., 
`org.apache.flink.agents.api.event.ChatRequestEvent`). For Python events, 
that's the Python module path (e.g., 
`flink_agents.api.events.event.OutputEvent`). The hierarchy inheritance works 
the same way for both — it walks up the dot-separated segments.

**Known limitations of the current model:**

- **Same logical event requires two config keys**: Java `OutputEvent` and 
Python `OutputEvent` are the same concept, but they have different type strings 
(`org.apache.flink.agents.api.OutputEvent` vs 
`flink_agents.api.events.event.OutputEvent`). There is no single config key 
that covers both.
- **Package-level config doesn't cross languages**: 
`event-log.org.apache.flink.agents.api.event.level: OFF` silences all Java 
events in that package, but equivalent Python events are unaffected.
- **No common ancestor below root**: Java hierarchies start with 
`org.apache...`, Python with `flink_agents...`. The only shared ancestor is the 
root `event-log.level`, which is too broad for targeted control.

These limitations are acceptable for the initial release because most jobs 
today are either pure Java or pure Python. See [Migration to 
Language-Independent Events](#migration-to-language-independent-events) for how 
these limitations are resolved when events become language-independent.

**Override at job submission time:**

```bash
# A shared config.yaml defines defaults for all jobs.
# Override just one event type for debugging a specific job run.
# Other per-type levels from config.yaml are preserved because
# each type has its own independent config key.
flink run ... \
  -Devent-log.org.apache.flink.agents.api.event.ChatRequestEvent.level=VERBOSE
```
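
The reason other per-type levels survive an override is that each type has its own key, so the merge is key-wise. The sketch below models that with plain dictionaries. It is not Flink's actual configuration loader, and `apply_overrides` is a hypothetical helper.

```python
from typing import Dict, Mapping


def apply_overrides(
    file_config: Mapping[str, str], cli_overrides: Mapping[str, str]
) -> Dict[str, str]:
    """Key-wise merge: command-line values win, untouched keys are kept."""
    merged = dict(file_config)
    merged.update(cli_overrides)
    return merged


file_config = {
    "event-log.level": "STANDARD",
    "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level": "OFF",
    "event-log.org.apache.flink.agents.api.event.ChatResponseEvent.level": "VERBOSE",
}
cli_overrides = {
    "event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level": "VERBOSE",
}
effective = apply_overrides(file_config, cli_overrides)
print(effective)
```

Only the `ChatRequestEvent` level changes; the `ChatResponseEvent` level and the root default come through from the shared file.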

## Truncation Strategy (STANDARD Level)

At `STANDARD` level, events may be truncated to stay within the 
`event-log.standard.max-length` limit (default: 4096 characters). Truncation 
**never** applies at `VERBOSE` level.

Setting `event-log.standard.max-length` to `0` disables truncation, making 
`STANDARD` behave identically to `VERBOSE`, except for the `logLevel` value 
recorded with each event.

### What Gets Truncated

Truncation targets the content-heavy parts of the serialized event:

1. **Long string fields** — String values exceeding an internal threshold are 
shortened and suffixed with `"... [truncated]"`. This most commonly affects LLM 
response text, tool call arguments, and tool response bodies.
2. **Large arrays/lists** — Arrays with more elements than an internal 
threshold are trimmed, with a trailing marker indicating how many elements were 
removed.
3. **Deep nesting** — Object structures nested beyond an internal depth 
threshold are replaced with a placeholder.

The specific thresholds for each strategy are implementation details that may 
be tuned over time. The semantic contract is: at `STANDARD` level, details 
might be omitted to keep logs concise.

### What Does NOT Get Truncated

Structural and identifying fields are always preserved in full:

- `eventType`, `id`, `attributes`, `timestamp`
- Top-level scalar fields (model name, request IDs, status flags)
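
The three strategies and the preserved fields can be sketched as a recursive pass over the event before serialization. The thresholds, the placeholder strings other than `"... [truncated]"`, and the function names are all hypothetical. The design leaves the real thresholds as implementation details and does not specify how they are derived from `event-log.standard.max-length`, so this sketch does not model that link.

```python
from typing import Any

# Hypothetical thresholds, deliberately small for demonstration.
MAX_STRING = 64
MAX_ITEMS = 3
MAX_DEPTH = 2
PRESERVED = {"eventType", "id", "attributes", "timestamp"}


def truncate(value: Any, depth: int) -> Any:
    if isinstance(value, str):
        # Strategy 1: shorten long strings and mark them.
        if len(value) > MAX_STRING:
            return value[:MAX_STRING] + "... [truncated]"
        return value
    if isinstance(value, (dict, list)) and depth >= MAX_DEPTH:
        # Strategy 3: replace overly deep structures with a placeholder.
        return "[nested content omitted]"
    if isinstance(value, list):
        # Strategy 2: trim large lists and record how many were dropped.
        kept = [truncate(v, depth + 1) for v in value[:MAX_ITEMS]]
        dropped = len(value) - len(kept)
        if dropped > 0:
            kept.append(f"... [{dropped} more elements truncated]")
        return kept
    if isinstance(value, dict):
        return {k: truncate(v, depth + 1) for k, v in value.items()}
    return value  # numbers, booleans, None


def truncate_event(event: dict) -> dict:
    # Identifying fields pass through untouched; other fields are truncated.
    return {
        k: (v if k in PRESERVED else truncate(v, 1)) for k, v in event.items()
    }
```

Short top-level scalars such as a model name fall under the string threshold and are therefore kept as-is in this sketch.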

### Truncation Guarantees and Limitations

- **Approximate, not exact**: The character limit is a best-effort cap. Actual 
serialized output may slightly exceed the configured limit due to JSON escaping 
and structural overhead. Strict enforcement would require double-serialization, 
which is not worth the cost for a logging feature.
- **Truncated content is not independently parseable**: A truncated string 
field may contain partial JSON or incomplete text. Consumers needing complete 
structured content from a specific event type should configure that type at 
`VERBOSE` level.

## Event Log Record Schema

This section describes the JSON schema of each record written to the event log 
file. Two new top-level fields, `logLevel` and `eventType`, are added. Users 
and downstream tools that parse event log files should be aware of these 
additions.

A record logged at `VERBOSE` level:

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "logLevel": "VERBOSE",
  "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
  "event": {
    "eventType": "org.apache.flink.agents.api.event.ChatRequestEvent",
    "id": "...",
    "attributes": {},
    "model": "gpt-4",
    "messages": [...]
  }
}
```

At `STANDARD` level with truncation applied:

```json
{
  "timestamp": "2024-01-15T10:30:00Z",
  "logLevel": "STANDARD",
  "eventType": "org.apache.flink.agents.api.event.ChatResponseEvent",
  "event": {
    "eventType": "org.apache.flink.agents.api.event.ChatResponseEvent",
    "id": "...",
    "attributes": {},
    "response": "The beginning of a very long LLM response... [truncated]"
  }
}
```

The `eventType` field is emitted at the top level (alongside `timestamp`) for 
convenient downstream filtering without needing to parse into the `event` 
object.

Old records without `logLevel` or top-level `eventType` are deserialized 
correctly, defaulting to `STANDARD`.
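
A reader that tolerates old records might look like the sketch below. Defaulting `logLevel` to `STANDARD` comes from this design. Recovering the top-level `eventType` from the nested `event` object is an assumption, and `read_record` is a hypothetical name.

```python
import json


def read_record(line: str) -> dict:
    """Parse one event log line, filling in fields that old records lack."""
    record = json.loads(line)
    event = record.get("event", {})
    # Old records carry no level; the design defaults them to STANDARD.
    record.setdefault("logLevel", "STANDARD")
    # Assumption: fall back to the type stored inside the event object.
    record.setdefault("eventType", event.get("eventType"))
    return record


old_line = '{"timestamp": "2024-01-15T10:30:00Z", "event": {"eventType": "a.B", "id": "1"}}'
print(read_record(old_line)["logLevel"])
```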

## Interaction with EventFilter

The existing `EventFilter` mechanism continues to work. Log level and event 
filter compose with AND semantics — both must pass for an event to be logged:

| `EventFilter.accept()` | Log Level | Event logged? |
|---|---|---|
| `true` | `STANDARD` or `VERBOSE` | Yes |
| `true` | `OFF` | No |
| `false` | any | No |

The `EventFilter` is evaluated first. If the filter rejects, the level is not 
consulted. Any `EventFilter` configured today continues to work unchanged.
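
The AND composition and the evaluation order can be sketched as follows. The `accept` callable stands in for `EventFilter.accept()`, and `resolve` stands in for level resolution. Both parameter names and the function `should_log` are hypothetical.

```python
from typing import Callable


def should_log(
    event: dict,
    accept: Callable[[dict], bool],
    resolve: Callable[[str], str],
) -> bool:
    # The filter runs first; when it rejects, the level is never resolved.
    if not accept(event):
        return False
    return resolve(event["eventType"]) != "OFF"


event = {"eventType": "org.apache.flink.agents.api.InputEvent"}
print(should_log(event, lambda e: True, lambda t: "STANDARD"))
```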

## Validation

On logger initialization, configured event type names are validated against 
known event classes. Unrecognized names produce a warning log:

```
WARN - Configured event log level for 'org.apache.flink.agents.api.event.ChatRequstEvent'
but no matching event class was found. Check for typos in the config key.
```

This catches typos without failing the job. Custom event types not in the 
built-in registry trigger the warning but still function correctly at runtime.
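
A sketch of the validation pass is shown below. It assumes the set of known event type names is available as strings. It also assumes that package-level keys are accepted when they are a prefix of at least one known type, which this design does not state explicitly. `validate_keys` and `RESERVED` are hypothetical names.

```python
import logging
from typing import Iterable, List

logger = logging.getLogger("event-log")

# Keys that are not event type names and should be skipped.
RESERVED = {"event-log.level", "event-log.standard.max-length"}


def validate_keys(config_keys: Iterable[str], known_types: Iterable[str]) -> List[str]:
    """Warn about unrecognized names; never raise, so the job keeps running."""
    known = set(known_types)
    unknown = []
    for key in config_keys:
        if key in RESERVED:
            continue
        if not (key.startswith("event-log.") and key.endswith(".level")):
            continue
        name = key[len("event-log."):-len(".level")]
        # Assumption: a package prefix of a known type counts as valid.
        if name in known or any(t.startswith(name + ".") for t in known):
            continue
        logger.warning(
            "Configured event log level for '%s' but no matching event "
            "class was found. Check for typos in the config key.",
            name,
        )
        unknown.append(name)
    return unknown
```

Returning the unknown names as well as logging them makes the check easy to unit-test without inspecting log output.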

## Observability

When truncation is active (`event-log.standard.max-length > 0`), a counter 
metric `eventLogTruncatedEvents` is incremented each time an event is 
truncated. This helps operators decide whether to increase the length limit or 
switch specific event types to `VERBOSE`.

## Backward Compatibility

- Default log level is `STANDARD` with `max-length=4096`. This is a **behavior 
change** from today — events at `STANDARD` level may be truncated. To restore 
previous behavior, set `event-log.level: VERBOSE` or 
`event-log.standard.max-length: 0`.
- JSON records without `logLevel` or top-level `eventType` fields deserialize 
correctly, defaulting to `STANDARD`.
- Existing `EventFilter` configurations continue to work unchanged.
- No existing config keys are renamed or removed.

## Migration to Language-Independent Events

_(from reviewer feedback, cc @wenjin272)_

There is ongoing discussion about changing events to language-independent JSON 
objects to simplify custom event definitions, especially for cross-language use 
cases where users currently need to define the same event type in both Java and 
Python.

### Current Model (This Design)

Config keys use the event's type string as-is — Java FQCNs for Java events, 
Python module paths for Python events:

```yaml
event-log.org.apache.flink.agents.api.event.ChatRequestEvent.level: VERBOSE  # Java
event-log.flink_agents.api.events.event.OutputEvent.level: VERBOSE           # Python
```

This has known limitations in mixed-language jobs (see [Configuration 
Examples](#configuration-examples)), but is acceptable for the initial release 
since most jobs today are likely either pure Java or pure Python. (Feedback 
welcome.) 

### Future Model (Language-Independent Events)

If events become plain JSON with a user-chosen type string (e.g., 
`"ChatRequestEvent"`, `"OutputEvent"`), the config keys simplify and the 
cross-language limitations disappear:

```yaml
event-log.ChatRequestEvent.level: VERBOSE    # one key covers both Java and Python
event-log.OutputEvent.level: VERBOSE         # no language-specific namespace
```

### Migration Plan

When language-independent events are adopted:

1. **Event type strings change**: The `eventType` field in log records would 
change from FQCNs/module paths to plain type strings. Config keys follow 
automatically since they are based on the `eventType` value.
2. **Deprecation period**: During migration, the system recognizes both old 
FQCN-style keys and new plain-string keys. If both are configured for the same 
event, the new key takes precedence. A warning is logged for deprecated 
FQCN-style keys.
3. **Hierarchy inheritance adapts**: With plain type strings that may not 
contain dots, hierarchy inheritance becomes less relevant. The root 
`event-log.level` still serves as the global default. If the community adopts a 
naming convention with dots (e.g., `chat.request`, `tool.response`), hierarchy 
inheritance continues to work.
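
The precedence rule for the deprecation period could look like the sketch below. It assumes the system can map a new plain type string to its legacy FQCN or module-path names; that mapping, the function name, and the wording of the warning are all hypothetical. Hierarchy inheritance is left out for brevity.

```python
import logging
from typing import Mapping, Optional, Sequence

logger = logging.getLogger("event-log")


def level_during_migration(
    new_type: str, legacy_types: Sequence[str], config: Mapping[str, str]
) -> Optional[str]:
    """Prefer the new plain-string key; fall back to deprecated legacy keys."""
    new_key = f"event-log.{new_type}.level"
    legacy_keys = [f"event-log.{t}.level" for t in legacy_types]
    configured_legacy = [k for k in legacy_keys if k in config]
    # Every configured legacy key triggers a deprecation warning.
    for key in configured_legacy:
        logger.warning("Config key '%s' is deprecated; use '%s' instead.", key, new_key)
    if new_key in config:
        return config[new_key]
    if configured_legacy:
        return config[configured_legacy[0]]
    return None  # caller falls back to the root default
```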

### Design Decision

This design targets the current model (Java FQCNs + Python module paths) for 
the initial implementation. The `event-log.<TYPE>.level` config key pattern and 
hierarchy inheritance mechanism are compatible with both the current and future 
models — only the type strings that users write in config files would change 
during migration.



GitHub link: https://github.com/apache/flink-agents/discussions/552
