vivek807 opened a new issue, #19513:
URL: https://github.com/apache/druid/issues/19513

   ### Description
   
   The product uses a regular expression with a worst-case computational 
complexity that is inefficient and possibly exponential.
   
   ## Summary
   A DATASOURCE WRITE user can hang Overlord worker threads indefinitely with a 
single sampler request, degrading or denying service on the cluster control 
plane.
   
   ## Root cause
   > Line numbers pinned to 
`druid-31.0.2@230605ec33db326c37154a03bcc4edfccc40203b`.
   
   
`processing/src/main/java/org/apache/druid/data/input/impl/RegexInputFormat.java:50-60`:
   
   ```java
   public RegexInputFormat(
       @JsonProperty("pattern") String pattern,
       @JsonProperty("listDelimiter") @Nullable String listDelimiter,
       @JsonProperty("columns") @Nullable List<String> columns
   )
   {
     this.pattern = pattern;
     this.listDelimiter = listDelimiter;
     this.columns = columns;
     this.compiledPatternSupplier = Suppliers.memoize(() -> 
Pattern.compile(pattern));
   }
   ```
   
   `RegexInputFormat` compiles the `@JsonProperty` `pattern` with no complexity 
or length limit and applies `Matcher.matches()` to each line. The sampler runs 
in the Overlord JVM (`CliOverlord.java:460`). `TimedShutoffInputSourceReader` 
only checks the volatile `closed` flag at iterator boundaries (`:89-101`), so it 
cannot interrupt an in-progress `Matcher.matches()`. The attacker also controls 
`timeoutMs` and can set it to 0.
   
   **Exploit scenario (static hypothesis — unverified):**
   A DATASOURCE WRITE user POSTs a sampler spec with 
`InlineInputSource.data='aaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaX'`, 
`RegexInputFormat.pattern='^(.*a){20}$'`, and `timeoutMs=0`. The Overlord thread 
enters catastrophic backtracking and never returns to the iterator boundary. A 
few concurrent requests exhaust the Jetty thread pool.
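   
   The backtracking growth can be observed outside Druid with a scaled-down 
version of the pattern. The sketch below counts `charAt` calls as a rough proxy 
for engine work. It uses `{6}` instead of `{20}` so that it terminates quickly. 
`BacktrackingGrowth`, `CountingCharSequence`, and `readsToReject` are 
hypothetical names, and this does not verify the Druid code path itself.
   
   ```java
   import java.util.regex.Pattern;
   
   // Standalone illustration; does not touch any Druid classes.
   final class BacktrackingGrowth {
     /** CharSequence that counts charAt calls made by the regex engine. */
     static final class CountingCharSequence implements CharSequence {
       private final String inner;
       long reads = 0;
   
       CountingCharSequence(String inner) { this.inner = inner; }
   
       @Override
       public char charAt(int index) { reads++; return inner.charAt(index); }
   
       @Override
       public int length() { return inner.length(); }
   
       @Override
       public CharSequence subSequence(int start, int end) {
         return inner.subSequence(start, end);
       }
   
       @Override
       public String toString() { return inner; }
     }
   
     /** Character reads needed to reject n 'a' characters followed by 'X'. */
     static long readsToReject(Pattern pattern, int n) {
       StringBuilder sb = new StringBuilder();
       for (int i = 0; i < n; i++) {
         sb.append('a');
       }
       sb.append('X');
       CountingCharSequence input = new CountingCharSequence(sb.toString());
       if (pattern.matcher(input).matches()) {
         throw new IllegalStateException("input was expected not to match");
       }
       return input.reads;
     }
   
     public static void main(String[] args) {
       // Scaled-down repetition count so every run finishes quickly.
       Pattern pattern = Pattern.compile("^(.*a){6}$");
       for (int n = 8; n <= 24; n += 4) {
         System.out.println("n=" + n + " reads=" + readsToReject(pattern, n));
       }
     }
   }
   ```
   
   The read count should rise much faster than the input length, which is the 
behaviour the scenario relies on at the larger repetition count.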
   
   ---


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

