Github user jkbradley commented on a diff in the pull request:
https://github.com/apache/spark/pull/21347#discussion_r189057734
--- Diff: mllib/src/main/scala/org/apache/spark/ml/util/Instrumentation.scala ---
@@ -125,6 +125,19 @@ private[spark] class Instrumentation[E <: Estimator[_]] private (
     log(compact(render(name -> value)))
   }
+  def logNamedValue(name: String, value: Array[String]): Unit = {
+    log(compact(render(name -> value.toSeq)))
--- End diff ---
I see, so you're pointing out that our current approach is inconsistent: we
JSONify array values (for clustering) but not scalar values (Long, etc.)? I
don't have a strong sense of which is best, but if we have to pick one, I'd
suggest not JSONifying the values: JSONifying produces slightly longer strings
to log (extra quotes around each value), and skipping it keeps the log format
stable relative to Spark 2.3.
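
For concreteness, here is a minimal standalone json4s sketch (outside the
Instrumentation class; the parameter names are made up for illustration) of
the encodings in question:

import org.json4s.JsonDSL._
import org.json4s.jackson.JsonMethods.{compact, render}

object NamedValueEncodings {
  def main(args: Array[String]): Unit = {
    // Array value, as in the diff above: the Seq renders as a JSON array.
    println(compact(render("myArrayParam" -> Array("a", "b").toSeq)))
    // {"myArrayParam":["a","b"]}

    // Scalar value today: the Long renders as a bare JSON number.
    println(compact(render("myLongParam" -> 100L)))
    // {"myLongParam":100}

    // JSONifying the scalar first adds the extra quotes mentioned above.
    println(compact(render("myLongParam" -> compact(render(100L)))))
    // {"myLongParam":"100"}
  }
}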
---