Max Moroz created SPARK-16409:
---------------------------------
Summary: regexp_extract with optional groups causes NPE
Key: SPARK-16409
URL: https://issues.apache.org/jira/browse/SPARK-16409
Project: Spark
Issue Type: Bug
Components: Spark Core
Affects Versions: 2.0.0
Reporter: Max Moroz
Priority: Critical
df.select(F.regexp_extract('s', r'(a+)(b)?(c)', 2)).collect()
causes an NPE. Worse, in a large program the NPE does not occur immediately;
the program works fine until some unpredictable (and inconsistent) later
moment when (presumably) the invalid memory access occurs, and then it fails.
As a result, this took several hours to debug.
Suggestion: either fill the group with null, or raise an exception immediately
after examining the argument, with a message saying that optional groups are
not allowed.
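To illustrate the first suggestion, here is a minimal sketch in plain Python. It uses the standard `re` module rather than Spark's own implementation, so it only shows the intended semantics and is not the actual fix. The helper name `extract_group` is hypothetical. The point is that an optional group that did not participate in the match yields a null value (`None` here) that the caller can handle, instead of failing later.

```python
import re


def extract_group(s, pattern, idx):
    """Hypothetical helper: return group `idx` of the first match, or None.

    None is returned both when the pattern does not match at all and when
    the requested (optional) group did not participate in the match.
    """
    m = re.search(pattern, s)
    if m is None:
        return None  # pattern did not match the input at all
    return m.group(idx)  # None if the optional group was skipped


pattern = r"(a+)(b)?(c)"
print(extract_group("aac", pattern, 2))   # optional group (b)? skipped -> None
print(extract_group("aabc", pattern, 2))  # optional group matched -> b
```

The second suggestion (rejecting optional groups up front) would need to inspect the pattern itself, which is harder to do reliably, so it is not sketched here.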