[
https://issues.apache.org/jira/browse/BEAM-7018?focusedWorklogId=284304&page=com.atlassian.jira.plugin.system.issuetabpanels:worklog-tabpanel#worklog-284304
]
ASF GitHub Bot logged work on BEAM-7018:
----------------------------------------
Author: ASF GitHub Bot
Created on: 29/Jul/19 13:53
Start Date: 29/Jul/19 13:53
Worklog Time Spent: 10m
Work Description: mszb commented on issue #8859: [BEAM-7018] Added Regex
transform for PythonSDK
URL: https://github.com/apache/beam/pull/8859#issuecomment-516001151
okay.
> Someone using Regex.find_iter might expect match objects, just as in
Python, so I'd avoid that naming. You might need to use re.finditer to
implement Regex.find_all though, due to Regex.findall's variant signature.
> On Mon, Jul 29, 2019 at 3:40 PM Shoaib Zafar ***@***.***> wrote:
> Thanks for the feedback @robertwb <https://github.com/robertwb>. This
> approach seems good, I'll update the code! Only one question though: I think
> we could create Regex.find_all, which just maps re.findall, and put the
> approach you mentioned above in the Regex.find_iter method. The reason is
> that re.findall returns a list of the captured groups
> (re.findall("a(b*)", "abb ax abbb") >> ['bb', '', 'bbb']), whereas in the
> above example we are going to return a list of group(0):
> ["abb", "a", "abbb"]. Your thoughts?
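For reference, the difference discussed above can be reproduced in plain Python with the standard `re` module. This is only a minimal sketch, not the code from the pull request: no Beam is involved, and `find_all` is a hypothetical helper name used to illustrate how `re.finditer` could give a uniform return shape.

```python
import re

PATTERN = r"a(b*)"
TEXT = "abb ax abbb"

# With exactly one capturing group, re.findall returns that group's text
# for each match rather than the whole match.
groups_only = re.findall(PATTERN, TEXT)

# re.finditer yields match objects, so group(0) (the whole match) is
# always available regardless of how many groups the pattern has.
whole_matches = [m.group(0) for m in re.finditer(PATTERN, TEXT)]


def find_all(pattern, text, group=0):
    """Hypothetical helper: return the chosen group of every match.

    Built on re.finditer so the result is always a flat list of strings.
    """
    return [m.group(group) for m in re.finditer(pattern, text)]


print(groups_only)    # ['bb', '', 'bbb']
print(whole_matches)  # ['abb', 'a', 'abbb']
```

Calling `find_all(PATTERN, TEXT, group=1)` in this sketch gives the same list as `re.findall` does for this single-group pattern.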
Issue Time Tracking
-------------------
Worklog Id: (was: 284304)
Time Spent: 10h 10m (was: 10h)
> Regex transform for Python SDK
> ------------------------------
>
> Key: BEAM-7018
> URL: https://issues.apache.org/jira/browse/BEAM-7018
> Project: Beam
> Issue Type: New Feature
> Components: sdk-py-core
> Reporter: Rose Nguyen
> Assignee: Shehzaad Nakhoda
> Priority: Minor
> Time Spent: 10h 10m
> Remaining Estimate: 0h
>
> PTransforms to use Regular Expressions to process elements in a PCollection.
> It should offer the same API as its Java counterpart:
> [https://github.com/apache/beam/blob/11a977b8b26eff2274d706541127c19dc93131a2/sdks/java/core/src/main/java/org/apache/beam/sdk/transforms/Regex.java]