[jira] [Commented] (SPARK-23081) Add colRegex API to PySpark

Darrell Taylor (JIRA) Tue, 18 Sep 2018 03:26:06 -0700


    [ 
https://issues.apache.org/jira/browse/SPARK-23081?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16618874#comment-16618874
 ]


Darrell Taylor commented on SPARK-23081:
----------------------------------------

I tend to agree: I'm unsure why this was added, as it's easily done in 
PySpark.  But my main reason for commenting is that the implementation feels 
incorrect.  I'm unable to chain functions together and need to hold a 
reference to the DataFrame in a variable.  e.g.

```
spark.table('xyz').colRegex('foobar').printSchema()
```
This feels like the natural way to use it, but I have to do it in two steps:
```
df=spark.table('xyz')
df.select(df.colRegex('foobar')).printSchema()
```
I don't think any of the other DataFrame functions work like this?
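
As a rough sketch of a workaround (not part of the PySpark API), the two-step 
pattern can be wrapped in a small helper so the DataFrame is only named once.  
The helper name `select_regex` is hypothetical; it relies only on the 
`colRegex` and `select` methods already shown above.
```python
def select_regex(df, pattern):
    """Select the columns of `df` whose names match `pattern`.

    Hypothetical helper: it simply wraps the two-step form
    df.select(df.colRegex(pattern)) so the caller does not need
    to bind the DataFrame to a variable first.
    """
    return df.select(df.colRegex(pattern))


# Usage (assumes an active SparkSession named `spark` and a table 'xyz'):
#   select_regex(spark.table('xyz'), 'foobar').printSchema()
#
# On Spark 3.0 or later, DataFrame.transform should allow the same helper
# to be used inside a method chain:
#   spark.table('xyz').transform(lambda d: select_regex(d, 'foobar')).printSchema()
```
This keeps `colRegex` returning a Column, as it does today, and only hides 
the intermediate variable.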

> Add colRegex API to PySpark
> ---------------------------
>
>                 Key: SPARK-23081
>                 URL: https://issues.apache.org/jira/browse/SPARK-23081
>             Project: Spark
>          Issue Type: Improvement
>          Components: PySpark
>    Affects Versions: 2.3.0
>            Reporter: Xiao Li
>            Assignee: Huaxin Gao
>            Priority: Major
>             Fix For: 2.3.0
>
>




--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]