phanikumv commented on code in PR #68725:
URL: https://github.com/apache/airflow/pull/68725#discussion_r3440384218


##########
java-sdk/sdk/src/main/kotlin/org/apache/airflow/sdk/execution/Logger.kt:
##########
@@ -34,52 +34,112 @@ import java.util.concurrent.ConcurrentLinkedDeque
 import kotlin.reflect.KClass
 import kotlin.time.Clock
 
-enum class Level { ERROR, DEBUG, }
+// Adapted from Python logging.
+enum class Level(
+  val value: Short,
+) {
+  CRITICAL(50),
+  ERROR(40),
+  WARNING(30),
+  INFO(20),
+  DEBUG(10),
+  NOTSET(0),
+}
+
+private object LevelParser {
+  val levels = Level.entries.map { it.toString().uppercase() to it }.toMap()
+
+  fun parse(s: String?) = levels[s?.uppercase()]
+
+  fun parseNamed(s: String?): Map<String, Level> {
+    if (s == null) return emptyMap()
+    return buildMap {
+      s.split(Regex("""[\s,]+""")).forEach {
+        val parts = it.split(Regex("""\s*=\s*"""), 2)
+        val level = parse(parts[1])
+        if (level != null) put(parts[0], level)

Review Comment:
   The happy path works fine, but malformed input crashes the parser.
   
   Say a deployment sets:
   
   AIRFLOW__LOGGING__NAMESPACE_LEVELS="botocore=debug,"
   
   A trailing comma is extremely common when people build up a comma-separated list. Now trace it:
   
   Step 1 — split "botocore=debug," on [\s,]+:
   
   ["botocore=debug", ""]
   
   The trailing comma produces an empty-string token at the end.
   
   Step 2 — the loop reaches the "" token and splits it on =:
   
   "".split(Regex("""\s*=\s*"""), 2)  ->  [""]      // size 1, no "=" to split on
   
   Step 3 — parse(parts[1]) reads index 1 of a 1-element list:
   
   java.lang.IndexOutOfBoundsException: Index 1 out of bounds for length 1
   
   The same thing happens for "botocore" (a bare name, no =), " botocore=debug" (leading space), ",botocore=debug" (leading comma), or the env var set to "".
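
The trace above can be reproduced with a minimal Kotlin sketch. It reuses the two regexes from `parseNamed` in the diff; the helper names `tokenParts` and `wouldCrash` are hypothetical and exist only for illustration. Kotlin's `split` keeps leading and trailing empty strings, which is why separators at either end of the value yield a one-element `parts` list.

```kotlin
// Same separators as parseNamed in the diff above.
val TOKEN_SEPARATOR = Regex("""[\s,]+""")
val KEY_VALUE_SEPARATOR = Regex("""\s*=\s*""")

// Hypothetical helper: for each token, the `parts` list that parseNamed would index into.
fun tokenParts(raw: String): List<List<String>> =
  raw.split(TOKEN_SEPARATOR).map { it.split(KEY_VALUE_SEPARATOR, 2) }

// Hypothetical helper: true when some token has no second part,
// which is exactly the case where parts[1] throws IndexOutOfBoundsException.
fun wouldCrash(raw: String): Boolean = tokenParts(raw).any { it.size < 2 }
```

Calling `wouldCrash` on each of the inputs listed above returns true, while a well-formed value such as "botocore=debug" returns false.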
   
   The example value in [config.yml](https://github.com/apache/airflow/blob/main/airflow-core/src/airflow/config_templates/config.yml#L852) for this exact option is:
   
   example: "sqlalchemy=INFO sqlalchemy.engine=DEBUG, botocor"
   
   That last token, botocor, has no =level, so a user who copies the documented example verbatim hits the crash.
   
   The Python parser (structlog.py) has the same parsing gap, but it fails loudly with a ValueError at config time that names the bad value.
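
One possible way to close the gap on the Kotlin side, offered as a sketch and not as the fix this PR should adopt: drop empty tokens, and fail with a message that names the offending entry. The function name `parseNamedStrict`, the choice of `IllegalArgumentException`, and the decision to reject unknown level names (the code in the diff silently skips them) are all assumptions. The `Level` enum is copied from the diff so the snippet is self-contained.

```kotlin
// Mirrors the enum in the diff above so the sketch compiles on its own.
enum class Level(val value: Short) {
  CRITICAL(50), ERROR(40), WARNING(30), INFO(20), DEBUG(10), NOTSET(0),
}

val levelsByName: Map<String, Level> = Level.entries.associateBy { it.name }

// Hypothetical stricter variant of parseNamed; name and error policy are assumptions.
fun parseNamedStrict(s: String?): Map<String, Level> {
  if (s.isNullOrBlank()) return emptyMap()
  return buildMap {
    s.split(Regex("""[\s,]+"""))
      .filter { it.isNotEmpty() } // tokens produced by leading/trailing separators
      .forEach { token ->
        val parts = token.split("=", limit = 2)
        // getOrNull avoids the out-of-bounds read when the token has no "=".
        val level = parts.getOrNull(1)?.let { levelsByName[it.uppercase()] }
        if (parts[0].isEmpty() || level == null) {
          throw IllegalArgumentException(
            "Invalid namespace level entry '$token' in '$s'; expected name=LEVEL",
          )
        }
        put(parts[0], level)
      }
  }
}
```

With this sketch, "botocore=debug," parses to a single entry, and the documented example value raises an error that mentions botocor instead of an index out of bounds.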


