dnskr commented on a change in pull request #33740:
URL: https://github.com/apache/spark/pull/33740#discussion_r689081057
##########
File path: docs/configuration.md
##########
@@ -659,6 +659,16 @@ Apart from these, the following properties are also available, and may be useful
</td>
<td>2.1.2</td>
</tr>
+<tr>
+ <td><code>spark.redaction.string.regex</code></td>
+ <td>(none)</td>
+ <td>
+ Regex to decide which parts of strings produced by Spark contain sensitive
+ information. When this regex matches a string part, that string part is replaced by a
+ dummy value. This is currently used to redact the output of SQL explain commands.
Review comment:
Yes, you are right. It looks like the usage of `STRING_REDACTION_PATTERN`
(`spark.redaction.string.regex`) was removed in
[commit#2831571](https://github.com/apache/spark/commit/2831571) on Dec 19,
2017. So it is no longer used in the source code, other than as the fallback
value for the `spark.sql.redaction.string.regex` property.
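
To make the fallback relationship concrete, here is a minimal sketch in plain Python. It is not Spark code: the dict-based `conf`, the function names, and the `<redacted>` placeholder are all made up for illustration. It only shows the lookup order I am describing, where the SQL property wins if set and the non-SQL property is consulted only as a fallback.

```python
import re

SQL_KEY = "spark.sql.redaction.string.regex"
CORE_KEY = "spark.redaction.string.regex"
PLACEHOLDER = "<redacted>"  # illustrative dummy value, not necessarily Spark's


def effective_sql_redaction_regex(conf):
    """Return the SQL-level regex, falling back to the core-level one.

    `conf` is a hypothetical stand-in for a Spark conf: a plain dict
    holding only the properties the user has explicitly set.
    """
    if SQL_KEY in conf:
        return conf[SQL_KEY]
    return conf.get(CORE_KEY)  # None when neither property is set


def redact(conf, text):
    """Replace every match of the effective regex with the placeholder."""
    pattern = effective_sql_redaction_regex(conf)
    if pattern is None:
        return text
    return re.sub(pattern, PLACEHOLDER, text)
```

In this sketch, setting only `spark.redaction.string.regex` still affects the SQL output, but nothing outside the SQL path ever reads it, which is why documenting it as a standalone property could mislead users.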
I found the mention of this property in the `spark.sql.redaction.string.regex`
description confusing because it gives the impression that
`spark.redaction.string.regex` is needed to redact sensitive data in non-SQL
places. It might confuse others as well.
Would it be a good idea to remove it from the `spark.sql.redaction.string.regex`
description to avoid this misunderstanding?
Also, in the source code we could mark `spark.redaction.string.regex` as
`Deprecated`, or just add a note that it is not supposed to be set by users.
What do you think?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]