bobbai00 opened a new pull request, #4261: URL: https://github.com/apache/texera/pull/4261
### What changes were proposed in this PR?

Fixed two issues in `WordCloudOpDesc.scala`:

1. **Regex escaping bug**: The pattern `r'\\w'` in the Scala triple-quoted string produced `r'\\w'` in Python, which matches a literal backslash followed by `w`, not word characters (`\w`). This caused `str.contains` to return `False` for all normal text, filtering out every row and producing the error: *"text column does not contain words or contains only nulls."* Fixed by changing to `r'\w'`.
2. **Duplicate statement**: Removed a duplicate `Map(...)` line in `getOutputSchemas` (dead code).

### Any related issues, documentation, discussions?

N/A

### How was this PR tested?

Manually verified the generated Python code produces the correct regex pattern `r'\w'`.

### Was this PR authored or co-authored using generative AI tooling?

Generated-by: Claude Code (Claude Opus 4.6)
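The difference between the two patterns can be reproduced in isolation. The following is a minimal sketch using only Python's standard `re` module, not the operator's actual generated code: the sample rows and the `keep_rows` helper are hypothetical, and it assumes the row filter behaves like a regex search over each non-null value, as `str.contains` is described as doing above.

```python
import re

# Pattern as emitted before the fix: inside a Python raw string, r'\\w' is
# the regex "a literal backslash followed by the letter w".
buggy = r'\\w'
# Pattern after the fix: the regex class \w (any word character).
fixed = r'\w'

# Hypothetical sample rows, including an empty string and a null.
rows = ["hello world", "data 123", "", None]


def keep_rows(pattern, values):
    """Keep non-null values in which the pattern matches somewhere."""
    return [v for v in values if v is not None and re.search(pattern, v)]


buggy_kept = keep_rows(buggy, rows)
fixed_kept = keep_rows(fixed, rows)

print(buggy_kept)  # no ordinary text contains a backslash followed by 'w'
print(fixed_kept)  # rows containing at least one word character

# The buggy pattern only matches text that really contains "\w".
print(bool(re.search(buggy, r"C:\west")))
```

With the buggy pattern every ordinary row is dropped, which is consistent with the "does not contain words" error described in the PR; with the fixed pattern only the empty and null rows are dropped.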
