Github user felixcheung commented on a diff in the pull request:

https://github.com/apache/spark/pull/20390#discussion_r163791518

--- Diff: python/pyspark/sql/dataframe.py ---
@@ -819,6 +819,29 @@ def columns(self):
         """
         return [f.name for f in self.schema.fields]

+    @since(2.4)
+    def colRegex(self, colName):
+        """
+        Selects column based on the column name specified as a regex and return it
+        as :class:`Column`.
+
+        :param colName: string, column name specified as a regex.
+
+        >>> df = spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)])
+        >>> df.select(df.colRegex("`(_1)?+.+`")).show()
--- End diff --

nit: perhaps a bit obscure to pick the default column name of `_1`? how about we name the columns in the line above?
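
To illustrate what I mean, here is a rough plain-Python sketch (no Spark needed) of which columns the pattern would pick once the columns have explicit names. The names `Col1` and `Col2` are hypothetical, and I'm assuming the regex is matched against the whole column name. Possessive quantifiers like `?+` only exist in Python's `re` from 3.11 on, so the sketch uses a negative lookahead that should accept the same names: any non-empty name other than exactly `Col1`.

```python
import re

# Hypothetical column names, as if the doctest had passed an explicit list:
#   spark.createDataFrame([("a", 1), ("b", 2), ("c", 3)], ["Col1", "Col2"])
columns = ["Col1", "Col2"]

# Stand-in for the possessive pattern (Col1)?+.+ from the doctest.
# Assumption: it accepts every non-empty name except exactly "Col1".
pattern = re.compile(r"(?!Col1$).+")

# Keep only the names that the pattern matches in full.
selected = [name for name in columns if pattern.fullmatch(name)]
print(selected)
```

With named columns it is easier for a reader to see that the example drops the first column and keeps the second.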
---