GitHub user felixcheung commented on a diff in the pull request:
https://github.com/apache/spark/pull/20390#discussion_r163791518
--- Diff: python/pyspark/sql/dataframe.py ---
@@ -819,6 +819,29 @@ def columns(self):
"""
return [f.name for f in self.schema.fields]
+ @since(2.4)
+ def colRegex(self, colName):
+ """
+ Selects column based on the column name specified as a regex and return it
+ as :class:`Column`.
+
+ :param colName: string, column name specified as a regex.
+
+ >>> df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)])
+ >>> df.select(df.colRegex("`(_1)?+.+`")).show()
--- End diff --
nit: picking the default column name `_1` is perhaps a bit obscure?
How about we name the columns explicitly in the `createDataFrame` call on the line above?
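
To illustrate why explicit names might read better, here is a rough pure-Python sketch (not Spark code) of what the quoted pattern appears to select. It assumes the regex is matched against each full column name, and it emulates the Java possessive quantifier `?+` with a lookahead plus backreference, since older Python versions lack possessive quantifiers. `select_by_regex` and `all_but` are hypothetical helper names, not part of PySpark.

```python
import re


def select_by_regex(columns, pattern):
    """Hypothetical helper: keep the column names that fully match `pattern`."""
    compiled = re.compile(pattern)
    return [name for name in columns if compiled.fullmatch(name)]


def all_but(name):
    """Build a Python pattern that behaves like the Java regex `(name)?+.+`.

    The lookahead captures the optional prefix greedily and the backreference
    consumes it. Python does not backtrack into a lookahead, so this mimics
    the possessive `?+`: an exact match on `name` leaves nothing for `.+`.
    """
    return r"(?=((?:%s)?))\1.+" % re.escape(name)


# Default names Spark assigns when createDataFrame gets no schema.
default_names = ["_1", "_2"]
# Assumed explicit names, chosen here only for illustration.
explicit_names = ["Col1", "Col2"]

print(select_by_regex(default_names, all_but("_1")))     # ['_2']
print(select_by_regex(explicit_names, all_but("Col1")))  # ['Col2']
```

In the doctest itself this would presumably amount to passing a list of names such as `["Col1", "Col2"]` as the second argument to `createDataFrame` and writing the pattern as `(Col1)?+.+`, so a reader can see at a glance which column is being excluded.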
---