Max Moroz created SPARK-16409:
---------------------------------

             Summary: regexp_extract with optional groups causes NPE
                 Key: SPARK-16409
                 URL: https://issues.apache.org/jira/browse/SPARK-16409
             Project: Spark
          Issue Type: Bug
          Components: Spark Core
    Affects Versions: 2.0.0
            Reporter: Max Moroz
            Priority: Critical


from pyspark.sql import functions as F  # df is any DataFrame with a string column 's'
df.select(F.regexp_extract('s', r'(a+)(b)?(c)', 2)).collect()

causes a NullPointerException (NPE). Worse, in a large program the NPE does 
not occur immediately: the job runs fine until some unpredictable (and 
inconsistent) later moment when, presumably, the invalid memory access occurs, 
and only then does it fail. Because of this, the problem took several hours to 
debug.
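
The likely trigger can be illustrated outside Spark. As a rough analogy only 
(Spark evaluates the pattern on the JVM, so Python's re module is used here 
purely as a stand-in, and the input string 'aaac' is an assumed example), an 
optional group that does not participate in a match yields a null value rather 
than an empty string:

```python
import re

# Same pattern as in the report; group 2, "(b)?", is optional.
pattern = re.compile(r'(a+)(b)?(c)')

# 'aaac' matches the pattern, but contains no 'b',
# so group 2 never participates in the match.
match = pattern.search('aaac')
groups = match.groups()

# Group 2 comes back as None, not as an empty string.
print(groups)
```

Any code that treats the extracted group as a guaranteed string would then 
fail on rows like this one, while rows that do contain the optional part 
would work.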

Suggestion: either fill the group with null, or raise an exception immediately 
after examining the argument, with a message saying that optional groups are 
not allowed.
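
A minimal sketch of the first option, written in plain Python rather than 
against Spark itself. The helper name extract_group is hypothetical and is not 
part of any Spark API; it only shows what "fill the group with null" could 
look like for a single value:

```python
import re
from typing import Optional


def extract_group(text: str, pattern: str, idx: int) -> Optional[str]:
    """Hypothetical helper: return group `idx` of the first match, or None.

    None is returned both when the pattern does not match at all and when
    the requested group is optional and did not participate in the match.
    """
    match = re.search(pattern, text)
    if match is None:
        return None
    # match.group() already yields None for a non-participating group,
    # so the null is passed through instead of being dereferenced.
    return match.group(idx)


# Usage with the pattern from the report (input strings are assumed examples).
with_b = extract_group('aabc', r'(a+)(b)?(c)', 2)
without_b = extract_group('aaac', r'(a+)(b)?(c)', 2)
print(with_b, without_b)
```

Returning a null keeps "group absent" distinguishable from "group matched an 
empty string", at the cost of requiring callers to handle the null.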



