Bago Amirbekian created SPARK-24747:
---------------------------------------

             Summary: Make spark.ml.util.Instrumentation class more flexible
                 Key: SPARK-24747
                 URL: https://issues.apache.org/jira/browse/SPARK-24747
             Project: Spark
          Issue Type: Improvement
          Components: ML
    Affects Versions: 2.3.1
            Reporter: Bago Amirbekian


The Instrumentation class (an internal, private class) is somewhat 
limited by its current APIs. The class requires that an estimator and a dataset 
be passed to the constructor, which limits how it can be used. Furthermore, the 
current APIs make it hard to intercept failures and record anything related to 
those failures.

The following changes could make the Instrumentation class easier to work with. 
All of these changes are to private APIs and should not be visible to users.
{code}
// New no-argument constructor:
Instrumentation()

// New API to log what used to be the constructor arguments.
logTrainingContext(estimator: Estimator[_], dataset: Dataset[_]): Unit

// New API to record a failure.
logFailure(e: Throwable): Unit

// Log success with no arguments
logSuccess(): Unit

// Log result model explicitly instead of passing to logSuccess
logModel(model: Model[_]): Unit

// On the companion object
Instrumentation.instrumented[T](body: (Instrumentation => T)): T

// The above API will allow us to write instrumented methods more clearly and
// handle logging success and failure automatically:
def someMethod(...): T = instrumented { instr =>
  instr.logNamedValue(name, value)
  // more code here
  instr.logModel(model)
}

{code}
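
To make the intended control flow concrete, here is a minimal, self-contained sketch of how an `instrumented` helper along these lines might behave. It is not Spark's implementation: `SketchInstrumentation` is a hypothetical stand-in that records messages in memory instead of using Spark's logger, the message strings are invented, and rethrowing the exception after `logFailure` is an assumption about the desired behavior.

```scala
import scala.collection.mutable.ArrayBuffer
import scala.util.control.NonFatal

// Hypothetical stand-in for the proposed class. It keeps log lines in memory
// so the example runs without any Spark dependency.
class SketchInstrumentation {
  val messages: ArrayBuffer[String] = ArrayBuffer.empty[String]

  def logNamedValue(name: String, value: String): Unit = {
    messages += s"$name=$value"
  }

  def logSuccess(): Unit = {
    messages += "training finished"
  }

  def logFailure(e: Throwable): Unit = {
    messages += s"training failed: ${e.getClass.getName}: ${e.getMessage}"
  }
}

object SketchInstrumentation {
  // Runs `body` with a fresh instance. Logs success when `body` returns,
  // logs the failure and rethrows when `body` throws a non-fatal exception.
  def instrumented[T](body: SketchInstrumentation => T): T = {
    val instr = new SketchInstrumentation()
    try {
      val result = body(instr)
      instr.logSuccess()
      result
    } catch {
      case NonFatal(e) =>
        instr.logFailure(e)
        throw e
    }
  }
}

object SketchDemo {
  // A method written in the proposed style: no explicit success or failure
  // logging inside the body.
  def countFeatures(features: Seq[Double]): Int =
    SketchInstrumentation.instrumented { instr =>
      instr.logNamedValue("numFeatures", features.length.toString)
      features.length
    }
}
```

Because the wrapper owns the try/catch, callers such as `SketchDemo.countFeatures` only log the values they care about, and a failure is recorded even when the body throws before reaching its last line.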



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)