Copilot commented on code in PR #13737:
URL: https://github.com/apache/skywalking/pull/13737#discussion_r2912090135


##########
docs/en/concepts-and-designs/lal.md:
##########
@@ -145,6 +145,89 @@ filter {
 Extractors aim to extract metadata from the logs. The metadata can be a 
service name, a service instance name, an
 endpoint name, or even a trace ID, all of which can be associated with the 
existing traces and metrics.
 
+#### Local variables (`def`)
+
+You can use `def` to declare local variables in the extractor (or at the 
filter level). This is useful when an
+expression is reused multiple times, or when you want to break a long chain 
into readable steps.
+
+The syntax is:
+
+```
+def variableName = expression
+def variableName = expression as TypeName
+```
+
+The variable type is inferred from the initializer expression at compile time. 
`def` is not limited to JSON — it works
+with any value access expression whose type is resolvable on the classpath, 
including protobuf getter chains,
+`log.*` fields, and Gson JSON method chains. Subsequent method calls on the 
variable are validated at compile time
+against the inferred type.
+
+You can optionally add an explicit `as` type cast to narrow the variable type. 
The cast type can be a built-in
+type (`String`, `Long`, `Integer`, `Boolean`) or a fully qualified class name:
+
+```
+def value = someExpression as com.example.MyType
+```
+
+This is useful when the compiler infers a type that is too general (e.g., 
`Object` from a generic API return)
+and you know the concrete runtime type. The cast tells the compiler which type 
to use for subsequent method chain
+validation. Note that `as` performs a Java cast — it does **not** convert 
between types. For JSON conversion, use
+`toJson()` or `toJsonArray()` instead.
+
+The FQCN must be resolvable on the classpath at compile time. If the class is 
not found, the OAP server will fail
+to start.
+
+Two built-in conversion functions are provided for JSON interoperability:
+
+- `toJson(expr)` — converts a value to a Gson `JsonObject`. Works with JSON 
strings, `Map`, and protobuf `Struct`.
+- `toJsonArray(expr)` — converts a value to a Gson `JsonArray`. Works with 
JSON array strings.
+

Review Comment:
   The docs state `toJson(expr)` “Works with JSON strings, `Map`, and protobuf 
`Struct`”, but the current runtime implementation only supports `String` and 
`Struct` (and returns null for `Map`). Please either add Map conversion support 
in `LalRuntimeHelper` or update this section so users don’t rely on unsupported 
behavior.



##########
oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/LALBlockCodegen.java:
##########
@@ -1140,6 +1172,283 @@ static String generateExtraLogAccess(
         return prevVar;
     }
 
+    // ==================== Def statement codegen ====================
+
+    static void generateDefStatement(final StringBuilder sb,
+                                      final LALScriptModel.DefStatement def,
+                                      final LALClassGenerator.GenCtx genCtx) {
+        final LALScriptModel.ValueAccess init = def.getInitializer();
+        final String varName = def.getVarName();
+        final String javaVar = "_d" + genCtx.localVarCounter++;
+
+        // Determine type and generate initializer expression
+        Class<?> resolvedType;
+        final StringBuilder initExpr = new StringBuilder();
+
+        if (init.getFunctionCallName() != null
+                && BUILTIN_FUNCTIONS.containsKey(init.getFunctionCallName())) {
+            // Built-in function: toJson(...), toJsonArray(...)
+            final Object[] info = 
BUILTIN_FUNCTIONS.get(init.getFunctionCallName());
+            final String helperMethod = (String) info[0];
+            resolvedType = (Class<?>) info[1];
+
+            initExpr.append(helperMethod).append("(");
+            if (!init.getFunctionCallArgs().isEmpty()) {
+                generateValueAccess(initExpr,
+                    init.getFunctionCallArgs().get(0).getValue(), genCtx);
+            } else {
+                initExpr.append("null");
+            }
+            initExpr.append(")");
+        } else {

Review Comment:
   `generateDefStatement` ignores additional arguments for built-ins like 
`toJson(...)`/`toJsonArray(...)` (it only emits the first arg, otherwise 
`null`). If a script accidentally passes 2+ args, it will compile but behave 
unexpectedly. Consider validating the arity for built-in functions and failing 
compilation with a clear error when the arg count doesn’t match the expected 
signature (exactly 1).



##########
oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/rt/LalRuntimeHelper.java:
##########
@@ -180,6 +188,115 @@ public String tagValue(final String key) {
         return "";
     }
 
+    // ==================== JSON conversion (for def variables) 
====================
+    //
+    // toJsonObject: converts to JsonObject. Overloaded for Struct, String, 
Object.
+    // toJsonArray:  converts to JsonArray.  Overloaded for String, Object.
+    //
+    // Primary use case: extract JWT claims from envoy ALS filter_metadata:
+    //   def jwt = toJson(parsed?.commonProperties?.metadata
+    //       ?.filterMetadataMap?.get("envoy.filters.http.jwt_authn"))
+    //   tag 'email': 
jwt?.getAsJsonObject("payload")?.get("email")?.getAsString()
+
+    /**
+     * Converts a protobuf {@link Struct} to a Gson {@link JsonObject}.
+     * Recursively converts nested Struct/Value/ListValue to Gson types.
+     */
+    public JsonObject toJsonObject(final Struct struct) {
+        if (struct == null) {
+            return null;
+        }
+        return structToJsonObject(struct);
+    }
+
+    /**
+     * Parses a JSON string into a {@link JsonObject}.
+     */
+    public JsonObject toJsonObject(final String s) {
+        if (s == null || s.isEmpty()) {
+            return null;
+        }
+        final JsonElement el = JsonParser.parseString(s);
+        return el.isJsonObject() ? el.getAsJsonObject() : null;
+    }
+
+    /**
+     * Fallback for {@code Map.get()} erasure — dispatches to typed overloads.
+     */
+    public JsonObject toJsonObject(final Object obj) {
+        if (obj == null) {
+            return null;
+        }
+        if (obj instanceof Struct) {
+            return toJsonObject((Struct) obj);
+        }
+        if (obj instanceof String) {
+            return toJsonObject((String) obj);
+        }
+        return null;
+    }
+
+    /**
+     * Parses a JSON string into a {@link JsonArray}.
+     */
+    public JsonArray toJsonArray(final String s) {
+        if (s == null || s.isEmpty()) {
+            return null;
+        }
+        final JsonElement el = JsonParser.parseString(s);
+        return el.isJsonArray() ? el.getAsJsonArray() : null;
+    }

Review Comment:
   `toJsonObject(String)`/`toJsonArray(String)` call 
`JsonParser.parseString(...)` without handling `JsonSyntaxException`. Since 
these functions may be applied to untrusted log fields, a single malformed JSON 
value would throw and can abort the whole rule execution. Consider catching 
Gson parse exceptions and returning `null` (or an empty JsonObject/JsonArray) 
consistently with the helper’s other “best-effort” conversions (e.g., 
`parseTimestamp` returns 0 on failure).



##########
oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/rt/LalRuntimeHelper.java:
##########
@@ -180,6 +188,115 @@ public String tagValue(final String key) {
         return "";
     }
 
+    // ==================== JSON conversion (for def variables) 
====================
+    //
+    // toJsonObject: converts to JsonObject. Overloaded for Struct, String, 
Object.
+    // toJsonArray:  converts to JsonArray.  Overloaded for String, Object.
+    //
+    // Primary use case: extract JWT claims from envoy ALS filter_metadata:
+    //   def jwt = toJson(parsed?.commonProperties?.metadata
+    //       ?.filterMetadataMap?.get("envoy.filters.http.jwt_authn"))
+    //   tag 'email': 
jwt?.getAsJsonObject("payload")?.get("email")?.getAsString()
+
+    /**
+     * Converts a protobuf {@link Struct} to a Gson {@link JsonObject}.
+     * Recursively converts nested Struct/Value/ListValue to Gson types.
+     */
+    public JsonObject toJsonObject(final Struct struct) {
+        if (struct == null) {
+            return null;
+        }
+        return structToJsonObject(struct);
+    }
+
+    /**
+     * Parses a JSON string into a {@link JsonObject}.
+     */
+    public JsonObject toJsonObject(final String s) {
+        if (s == null || s.isEmpty()) {
+            return null;
+        }
+        final JsonElement el = JsonParser.parseString(s);
+        return el.isJsonObject() ? el.getAsJsonObject() : null;
+    }
+
+    /**
+     * Fallback for {@code Map.get()} erasure — dispatches to typed overloads.
+     */
+    public JsonObject toJsonObject(final Object obj) {
+        if (obj == null) {
+            return null;
+        }
+        if (obj instanceof Struct) {
+            return toJsonObject((Struct) obj);
+        }
+        if (obj instanceof String) {
+            return toJsonObject((String) obj);
+        }
+        return null;
+    }

Review Comment:
   `toJsonObject(Object)` is documented/advertised (and the PR description 
mentions) support for `Map`, but the implementation only dispatches `Struct` 
and `String`. This means `toJson(parsed.someMap)` (from JSON/YAML parser 
output) will silently return null, breaking the feature for Map-backed inputs. 
Add a `Map` overload and/or handle `obj instanceof Map` in the dispatcher with 
a recursive Map/List -> Gson conversion similar to the Struct conversion.



##########
docs/en/concepts-and-designs/lal.md:
##########
@@ -145,6 +145,89 @@ filter {
 Extractors aim to extract metadata from the logs. The metadata can be a 
service name, a service instance name, an
 endpoint name, or even a trace ID, all of which can be associated with the 
existing traces and metrics.
 
+#### Local variables (`def`)
+
+You can use `def` to declare local variables in the extractor (or at the 
filter level). This is useful when an
+expression is reused multiple times, or when you want to break a long chain 
into readable steps.
+
+The syntax is:
+
+```
+def variableName = expression
+def variableName = expression as TypeName
+```
+
+The variable type is inferred from the initializer expression at compile time. 
`def` is not limited to JSON — it works
+with any value access expression whose type is resolvable on the classpath, 
including protobuf getter chains,
+`log.*` fields, and Gson JSON method chains. Subsequent method calls on the 
variable are validated at compile time
+against the inferred type.

Review Comment:
   The docs say `def` can be declared “at the filter level”, but in the current 
codegen local vars are scoped to the generated Java method: filter-level `def` 
lives in `execute()`, while `extractor {}`/`sink {}` are emitted as separate 
private methods with their own `localVars` map. This means a filter-level `def` 
can’t be referenced inside `extractor {}`/`sink {}`. Please clarify the 
intended scoping rules here (or adjust codegen if filter-level defs are meant 
to be visible inside nested blocks).



##########
oap-server/analyzer/log-analyzer/src/test/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/LALExpressionExecutionTest.java:
##########
@@ -318,6 +330,30 @@ private void executeAndAssert(
         }
     }
 
+    /**
+     * Reads a field from the output builder if present, otherwise falls back
+     * to the proto value. The output builder (e.g. LogBuilder) stores
+     * extracted fields that haven't been flushed to LogData proto yet.
+     */
+    private static String getOutputOrProto(final Object outputObj,
+                                            final String fieldName,
+                                            final String protoValue) {
+        if (outputObj == null) {
+            return protoValue;
+        }
+        try {
+            final Field f = outputObj.getClass().getDeclaredField(fieldName);
+            f.setAccessible(true);
+            final Object val = f.get(outputObj);
+            if (val != null) {
+                return val.toString();
+            }
+        } catch (NoSuchFieldException | IllegalAccessException ignored) {
+            // Field not on this output type — fall back to proto
+        }
+        return protoValue;
+    }

Review Comment:
   `getOutputOrProto` assumes standard fields exist on the output builder with 
names like `service`/`serviceInstance`, but extractor codegen can target output 
types that use different standard-field names (e.g., `setServiceName` -> 
`serviceName`, `setServiceInstanceName` -> `serviceInstanceName`). For those 
builders, this helper will fall back to `ctx.log()` and may miss asserting 
extracted values. Consider checking for the known aliases 
(serviceName/serviceInstanceName) and/or calling getters/setters reflectively 
(or reusing `FIELD_TYPE_SETTER_CANDIDATES`) instead of hardcoding field names.



##########
oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/LALBlockCodegen.java:
##########
@@ -1140,6 +1172,283 @@ static String generateExtraLogAccess(
         return prevVar;
     }
 
+    // ==================== Def statement codegen ====================
+
+    static void generateDefStatement(final StringBuilder sb,
+                                      final LALScriptModel.DefStatement def,
+                                      final LALClassGenerator.GenCtx genCtx) {
+        final LALScriptModel.ValueAccess init = def.getInitializer();
+        final String varName = def.getVarName();
+        final String javaVar = "_d" + genCtx.localVarCounter++;
+
+        // Determine type and generate initializer expression
+        Class<?> resolvedType;
+        final StringBuilder initExpr = new StringBuilder();
+
+        if (init.getFunctionCallName() != null
+                && BUILTIN_FUNCTIONS.containsKey(init.getFunctionCallName())) {
+            // Built-in function: toJson(...), toJsonArray(...)
+            final Object[] info = 
BUILTIN_FUNCTIONS.get(init.getFunctionCallName());
+            final String helperMethod = (String) info[0];
+            resolvedType = (Class<?>) info[1];
+
+            initExpr.append(helperMethod).append("(");
+            if (!init.getFunctionCallArgs().isEmpty()) {
+                generateValueAccess(initExpr,
+                    init.getFunctionCallArgs().get(0).getValue(), genCtx);
+            } else {
+                initExpr.append("null");
+            }
+            initExpr.append(")");
+        } else {
+            // General value access — type inferred from lastResolvedType
+            generateValueAccess(initExpr, init, genCtx);
+            resolvedType = genCtx.lastResolvedType != null
+                ? genCtx.lastResolvedType : Object.class;
+            // Box primitive types for local variable declarations
+            if (resolvedType.isPrimitive()) {
+                final String boxName = 
LALCodegenHelper.boxTypeName(resolvedType);
+                if (boxName != null) {
+                    try {
+                        resolvedType = Class.forName("java.lang." + boxName);
+                    } catch (ClassNotFoundException ignored) {
+                        // keep primitive
+                    }
+                }
+            }
+        }
+
+        // Apply explicit type cast if specified (e.g., "as 
com.example.MyType")
+        final String castType = def.getCastType();
+        if (castType != null && !castType.isEmpty()) {
+            // Resolve the cast type — primitive wrapper names are handled,
+            // anything else is treated as a FQCN
+            final Class<?> castClass = resolveDefCastType(castType);
+            if (castClass != null) {
+                resolvedType = castClass;
+            }
+        }
+
+        // Register in local vars for later reference
+        genCtx.localVars.put(varName,
+            new LALClassGenerator.LocalVarInfo(javaVar, resolvedType));
+
+        // Emit declaration (placed at method top via localVarDecls)
+        genCtx.localVarDecls.append("  ").append(resolvedType.getName())
+            .append(" ").append(javaVar).append(";\n");
+        genCtx.localVarLvtVars.add(new String[]{
+            javaVar, "L" + resolvedType.getName().replace('.', '/') + ";"
+        });
+
+        // Emit assignment in body (at the point where def appears)
+        sb.append("  ").append(javaVar).append(" = ");
+        if (castType != null && !castType.isEmpty()) {
+            sb.append("(").append(resolvedType.getName()).append(") ");
+        }
+        sb.append(initExpr).append(";\n");
+    }
+
+    /**
+     * Resolves a cast type string to a {@link Class}.
+     * Handles the four built-in type names ({@code String}, {@code Long},
+     * {@code Integer}, {@code Boolean}) and fully qualified class names.
+     */
+    private static Class<?> resolveDefCastType(final String castType) {
+        switch (castType) {
+            case "String":
+                return String.class;
+            case "Long":
+                return Long.class;
+            case "Integer":
+                return Integer.class;
+            case "Boolean":
+                return Boolean.class;
+            default:
+                try {
+                    return Class.forName(castType);
+                } catch (ClassNotFoundException e) {
+                    throw new IllegalArgumentException(
+                        "def cast type not found on classpath: " + castType, 
e);
+                }
+        }
+    }
+
+    // ==================== Def variable chain codegen ====================
+
+    /**
+     * Generates typed method-chain access on a def variable.
+     * Uses reflection to resolve each method/field call and track types.
+     *
+     * @param sb output buffer
+     * @param localVar the def variable info (java var name + resolved type)
+     * @param chain the chain segments after the variable name
+     * @param genCtx codegen context
+     */
+    static void generateDefVarChain(
+            final StringBuilder sb,
+            final LALClassGenerator.LocalVarInfo localVar,
+            final List<LALScriptModel.ValueAccessSegment> chain,
+            final LALClassGenerator.GenCtx genCtx) {
+        if (chain.isEmpty()) {
+            sb.append(localVar.javaVarName);
+            genCtx.lastResolvedType = localVar.resolvedType;
+            return;
+        }
+
+        String prevExpr = localVar.javaVarName;
+        Class<?> currentType = localVar.resolvedType;
+        boolean canBeNull = true;
+
+        for (int i = 0; i < chain.size(); i++) {
+            final LALScriptModel.ValueAccessSegment seg = chain.get(i);
+            final boolean isLast = i == chain.size() - 1;
+
+            if (seg instanceof LALScriptModel.MethodSegment) {
+                final LALScriptModel.MethodSegment ms =
+                    (LALScriptModel.MethodSegment) seg;
+                final String methodName = ms.getName();
+
+                // Resolve method on currentType via reflection
+                final java.lang.reflect.Method method =
+                    resolveMethod(currentType, methodName, ms.getArguments());
+                if (method == null) {
+                    throw new IllegalArgumentException(
+                        "Cannot resolve method " + currentType.getSimpleName()
+                            + "." + methodName + "() in def variable chain");
+                }
+                final Class<?> returnType = method.getReturnType();
+                final String args = generateMethodArgs(ms.getArguments());
+
+                if (ms.isSafeNav() && canBeNull) {
+                    if (isLast && returnType.isPrimitive()) {
+                        // Primitive return with null guard
+                        final String boxName =
+                            LALCodegenHelper.boxTypeName(returnType);
+                        prevExpr = "(" + prevExpr + " == null ? null : "
+                            + boxName + ".valueOf(" + prevExpr + "."
+                            + methodName + "(" + args + ")))";
+                        currentType = returnType;
+                    } else {
+                        prevExpr = "(" + prevExpr + " == null ? null : "
+                            + prevExpr + "." + methodName + "(" + args + "))";
+                        currentType = returnType;
+                        canBeNull = true;
+                    }
+                } else {
+                    prevExpr = prevExpr + "." + methodName + "(" + args + ")";
+                    currentType = returnType;
+                    canBeNull = !returnType.isPrimitive();
+                }

Review Comment:
   `generateDefVarChain` builds nested ternary null-guards by reusing 
`prevExpr` multiple times (e.g., `(<prevExpr> == null ? null : 
<prevExpr>.m())`). When `prevExpr` itself already contains method calls (from 
earlier segments), this duplicates evaluation and can exponentially grow the 
generated source for long safe-nav chains. Consider emitting temporary 
variables for intermediate results (similar to `_tN` proto caching) to avoid 
repeated evaluation and keep generated code size predictable.



##########
oap-server/analyzer/log-analyzer/src/main/java/org/apache/skywalking/oap/log/analyzer/v2/compiler/LALBlockCodegen.java:
##########
@@ -1175,23 +1484,21 @@ static void generateProcessRegistryCall(
 
     static String appendMethodSegment(final String current,
                                        final LALScriptModel.MethodSegment ms) {
+        final String mn = ms.getName();
+        final String args = ms.getArguments().isEmpty()
+            ? "" : generateMethodArgs(ms.getArguments());
         if (ms.isSafeNav()) {
-            final String mn = ms.getName();
+            // Special-cased helpers for common safe-nav methods on Object
             if ("toString".equals(mn)) {
                 return "h.toString(" + current + ")";
             } else if ("trim".equals(mn)) {
                 return "h.trim(" + current + ")";
-            } else {
-                throw new IllegalArgumentException(
-                    "Unsupported safe-nav method: ?." + mn + "()");
             }
+            // General safe-nav: null guard with ternary
+            return "(" + current + " == null ? null : "
+                + current + "." + mn + "(" + args + "))";
         } else {
-            if (ms.getArguments().isEmpty()) {
-                return current + "." + ms.getName() + "()";
-            } else {
-                return current + "." + ms.getName() + "("
-                    + generateMethodArgs(ms.getArguments()) + ")";
-            }
+            return current + "." + mn + "(" + args + ")";
         }
     }

Review Comment:
   `appendMethodSegment` now emits a generic safe-nav ternary for any method 
(`(current == null ? null : current.mn(args))`). This will generate 
uncompilable Java when the method returns a primitive (e.g., `?.length()` -> 
`int`, `?.endsWith()` -> `boolean`) because the ternary mixes `null` with a 
primitive. Since `appendMethodSegment` doesn’t have type info to box 
primitives, it may be safer to keep the previous strict allowlist 
(toString/trim) or route other safe-nav method calls through typed resolution 
(like `generateDefVarChain`) so primitives can be boxed appropriately.



##########
oap-server/analyzer/log-analyzer/CLAUDE.md:
##########
@@ -105,6 +110,68 @@ All generated methods include a `LocalVariableTable` 
attribute for debugger/deco
 
 LVT entries are added via `PrivateMethod` inner class which carries both 
source code and variable descriptors.
 
+## Local Variables (`def`)
+
+The `def` keyword declares local variables in the extractor (or filter level). 
The grammar rule:
+
+```
+defStatement: DEF IDENTIFIER ASSIGN valueAccess typeCast? ;
+```
+
+The optional `typeCast` supports built-in types (`String`, `Long`, `Integer`, 
`Boolean`) and
+fully qualified class names (`as com.example.MyType`). The FQCN is resolved 
via `Class.forName()`
+at compile time. If not found, compilation fails with 
`IllegalArgumentException`.
+
+### Type inference
+
+The variable type is inferred from the initializer expression:
+
+| Initializer | Inferred type | Generated code |
+|---|---|---|
+| `toJson(expr)` | `JsonObject` | `h.toJsonObject(expr)` |
+| `toJsonArray(expr)` | `JsonArray` | `h.toJsonArray(expr)` |
+| General value access | Last resolved type via reflection | Standard value 
access codegen |
+
+Built-in functions are registered in `BUILTIN_FUNCTIONS` map in 
`LALBlockCodegen`.
+
+### Chained def variables
+
+A `def` variable can be initialized from another `def` variable's method chain:
+
+```
+def jwt = 
toJson(parsed?.commonProperties?.metadata?.filterMetadataMap?.get("envoy.filters.http.jwt_authn"))
+def payload = jwt?.getAsJsonObject("payload")
+tag 'email': payload?.get("email")?.getAsString()
+```
+
+The general value access path in `generateDefStatement()` recognizes `jwt` as 
a def variable,
+delegates to `generateDefVarChain()` which uses reflection to resolve the 
chain, and
+`genCtx.lastResolvedType` captures the resolved type (`JsonObject` in this 
case).
+
+### Runtime helpers
+
+`LalRuntimeHelper` provides `toJsonObject()` and `toJsonArray()` overloads:
+
+| Method | Input type | Conversion |
+|---|---|---|
+| `toJsonObject(String)` | JSON string | 
`JsonParser.parseString().getAsJsonObject()` |
+| `toJsonObject(Map)` | Map (from JSON/YAML parser) | Recursive Gson 
conversion |
+| `toJsonObject(Struct)` | Protobuf `Struct` | Recursive field conversion 
preserving nested structures |
+| `toJsonObject(Object)` | Any (fallback) | Delegates to above based on 
runtime type |
+| `toJsonArray(String)` | JSON array string | 
`JsonParser.parseString().getAsJsonArray()` |
+| `toJsonArray(Object)` | Any (fallback) | String fallback |
+

Review Comment:
   This documentation claims `toJsonObject(Map)` exists and that `toJson()` 
works with Map inputs, but `LalRuntimeHelper` currently only supports `String` 
and protobuf `Struct` (via `toJsonObject(Object)` dispatch). Either implement 
Map support in `LalRuntimeHelper` or adjust this doc to match actual behavior 
to avoid misleading future contributors.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to