Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/21347#discussion_r189057734
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -125,6 +125,19 @@ private[spark] class Instrumentation[E <: Estimator[_]] private (
     log(compact(render(name -> value)))
   }
+  def logNamedValue(name: String, value: Array[String]): Unit = {
+    log(compact(render(name -> value.toSeq)))
--- End diff ---
I see, so you're pointing out that our current approach is inconsistent: we
JSONify array values (for clustering) but not scalar values (Long, etc.)? I
don't have a strong sense of which is best, but if we have to pick one, I'd
suggest not JSONifying the values: JSONifying produces slightly longer strings
to log (extra quotes around each value), and skipping it keeps the log format
stable relative to Spark 2.3.
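
For concreteness, here is a minimal standalone json4s sketch (outside the
Instrumentation class; the parameter names are made up for illustration) of
the encodings in question:

import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods.{compact, render}

object NamedValueEncodings {
  def main(args: Array[String]): Unit = {
    // Array value, as in the diff above: the Seq renders as a JSON array.
    println(compact(render("myArrayParam" -> Array("a", "b").toSeq)))
    // {"myArrayParam":["a","b"]}

    // Scalar value today: the Long renders as a bare JSON number.
    println(compact(render("myLongParam" -> 100L)))
    // {"myLongParam":100}

    // JSONifying the scalar first adds the extra quotes mentioned above.
    println(compact(render("myLongParam" -> compact(render(100L)))))
    // {"myLongParam":"100"}
  }
}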
---