Adam Hooper created ARROW-12670:
-----------------------------------

             Summary: extract_regex gives bizarre behavior after nulls or non-matches
                 Key: ARROW-12670
                 URL: https://issues.apache.org/jira/browse/ARROW-12670
             Project: Apache Arrow
          Issue Type: Bug
          Components: C++, Python
    Affects Versions: 4.0.0
            Reporter: Adam Hooper


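After a non-match, the *subsequent* string fails to match as well. In the output below, "b" correctly yields null, but "c" is marked valid with an empty capture, and "d" captures "a" rather than "d".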

{code}
>>> import pyarrow as pa
>>> import pyarrow.compute
>>> pa.compute.extract_regex(pa.array(["a", "b", "c", "d"]), pattern="(?P<x>[^b])")
<pyarrow.lib.StructArray object at 0x7f80de956640>
-- is_valid:
  [
    true,
    false,
    true,
    true
  ]
-- child 0 type: string
  [
    "a",
    "",
    "",
    "a"
  ]
{code}
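
For comparison, here is a minimal pure-Python sketch of the result one would presumably expect from the call above. It is not how Arrow implements extract_regex. The helper name expected_extract is hypothetical, and it assumes unanchored "search" semantics and that nulls and non-matches both map to a null struct.

```python
import re


def expected_extract(values, pattern):
    """Reference behavior (an assumption, not Arrow's code):
    one dict of named groups per input, None for nulls and non-matches."""
    regex = re.compile(pattern)
    out = []
    for value in values:
        # Each element is matched independently; no state carries over
        # from a previous null or non-match.
        match = regex.search(value) if value is not None else None
        out.append(match.groupdict() if match else None)
    return out


result = expected_extract(["a", "b", "c", "d"], "(?P<x>[^b])")
print(result)
# [{'x': 'a'}, None, {'x': 'c'}, {'x': 'd'}]
```

Under these assumptions, only "b" should come back null, and "c" and "d" should each capture themselves.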


