[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742161#comment-14742161 ]

Bertrand Dechoux commented on SPARK-9720:
-----------------------------------------

The pull request can be merged.

> spark.ml Identifiable types should have UID in toString methods
> ---------------------------------------------------------------
>
>                 Key: SPARK-9720
>                 URL: https://issues.apache.org/jira/browse/SPARK-9720
>             Project: Spark
>          Issue Type: Improvement
>          Components: ML
>            Reporter: Joseph K. Bradley
>            Assignee: Bertrand Dechoux
>            Priority: Minor
>              Labels: starter
>
> It would be nice to include the UID (instance name) in toString methods.
> That's the default behavior for Identifiable, but some types override the
> default toString and do not include the UID.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14680520#comment-14680520 ]

Joseph K. Bradley commented on SPARK-9720:
------------------------------------------

Oh sorry! I shouldn't have said "print."
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679481#comment-14679481 ]

Sean Cho commented on SPARK-9720:
---------------------------------

I was fooled by the word "print". I read the description as saying that the toString method should "print" the UID. I suppose it was meant to be "return the UID". ;)

> It would be nice to print the UID (instance name) in toString methods.
> That's the default behavior for Identifiable, but some types override the
> default toString and do not print the UID.
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14679228#comment-14679228 ]

Apache Spark commented on SPARK-9720:
-------------------------------------

User 'BertrandDechoux' has created a pull request for this issue:
https://github.com/apache/spark/pull/8062
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662514#comment-14662514 ]

Joseph K. Bradley commented on SPARK-9720:
------------------------------------------

I like the proposal, but I don't think we should break APIs...which unfortunately means we will need to stick with encouragement instead of enforcement. Would you mind sending a PR to update those classes with issues?
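The "encouragement" route described above can be illustrated with a minimal sketch: toString stays overridable, and each class that overrides it is updated to mention its UID. The trait below is a simplified stand-in for Spark's Identifiable; the ToyForestModel classes and the "(uid=...)" message format are hypothetical and are not taken from the pull request.

```scala
// Simplified stand-in for org.apache.spark.ml.util.Identifiable (assumption:
// only the parts relevant to toString are reproduced here).
trait Identifiable {
  val uid: String
  override def toString: String = uid
}

// Hypothetical model as it might look before the fix: the override drops the UID.
class ToyForestModelBefore(override val uid: String, val numTrees: Int)
    extends Identifiable {
  override def toString: String = s"ToyForestModel with $numTrees trees"
}

// Hypothetical model after the fix: the override still adds detail,
// but now includes the UID so the instance can be identified.
class ToyForestModelAfter(override val uid: String, val numTrees: Int)
    extends Identifiable {
  override def toString: String = s"ToyForestModel (uid=$uid) with $numTrees trees"
}

object EncouragementDemo {
  def main(args: Array[String]): Unit = {
    println(new ToyForestModelBefore("rfc_01", 20))
    println(new ToyForestModelAfter("rfc_01", 20))
  }
}
```

Nothing in the compiler enforces this convention, which is the trade-off accepted here in exchange for not breaking existing subclasses.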
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662333#comment-14662333 ]

Bertrand Dechoux commented on SPARK-9720:
-----------------------------------------

I could take care of it. Here is the list (only in spark.ml):
* DecisionTreeClassificationModel
* DecisionTreeRegressionModel
* GBTClassificationModel
* GBTRegressionModel
* NaiveBayesModel
* RFormula
* RFormulaModel
* RandomForestClassificationModel
* RandomForestRegressionModel

The first question is whether we want to enforce that "identifiable types should be identifiable by their toString". It does make sense. The follow-up question is whether we can introduce a potentially API-breaking change in order to do it.

If the answer is yes, the easy way would be to make Identifiable.toString final and compose it with an overridable suffix that is empty by default:

    private[spark] trait Identifiable {

      /**
       * An immutable unique ID for the object and its derivatives.
       */
      val uid: String

      def toStringSuffix: String = ""

      override final def toString: String = uid + toStringSuffix
    }

Is there a committer who could validate this proposal?
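To show how a subclass would use the proposed hook, here is a minimal standalone sketch. It drops the private[spark] modifier so that it compiles outside the Spark package, and ToyTreeModel with its numNodes field is a hypothetical class introduced only for illustration.

```scala
// Standalone copy of the proposed trait (assumption: private[spark] is
// omitted so the sketch compiles outside the org.apache.spark package).
trait Identifiable {
  /** An immutable unique ID for the object and its derivatives. */
  val uid: String

  /** Extra detail appended after the UID; empty unless a subclass overrides it. */
  def toStringSuffix: String = ""

  /** Final, so every subclass is guaranteed to start its toString with the UID. */
  override final def toString: String = uid + toStringSuffix
}

// Hypothetical model: it customizes the suffix instead of overriding toString.
class ToyTreeModel(override val uid: String, val numNodes: Int) extends Identifiable {
  override def toStringSuffix: String = s" (numNodes=$numNodes)"
}

object SuffixDemo {
  def main(args: Array[String]): Unit = {
    println(new ToyTreeModel("dtc_4f2a", 7)) // prints "dtc_4f2a (numNodes=7)"
  }
}
```

With this design, any existing subclass that overrides toString directly would stop compiling, because a final member cannot be overridden. That is the API break the proposal asks about.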
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662253#comment-14662253 ]

Joseph K. Bradley commented on SPARK-9720:
------------------------------------------

Good point about the default toString. The main problem is from a few classes overriding toString. I'll state that in the description.

> It would be nice to print the UID (instance name) in toString methods.
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14662248#comment-14662248 ]

Bertrand Dechoux commented on SPARK-9720:
-----------------------------------------

I might be misunderstanding, but isn't that already the case on the master branch?

https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/util/Identifiable.scala

    trait Identifiable {
      val uid: String
      override def toString: String = uid
    }

And many Identifiables have a default constructor that uses Identifiable.randomUID("keyword") for the uid.

https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/evaluation/BinaryClassificationEvaluator.scala

Do you have specific counterexamples?
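The default behavior described above can be sketched in a few lines. This is a simplified stand-in, not the Spark source: the exact suffix scheme in randomUID (the prefix plus the last 12 characters of a random UUID) is an assumption, and ToyEvaluator is a hypothetical class.

```scala
import java.util.UUID

// Simplified stand-in for Spark's Identifiable: toString defaults to the UID.
trait Identifiable {
  val uid: String
  override def toString: String = uid
}

object Identifiable {
  /** Builds a UID from a prefix and a random suffix (suffix scheme is an assumption). */
  def randomUID(prefix: String): String =
    prefix + "_" + UUID.randomUUID().toString.takeRight(12)
}

// Hypothetical evaluator: the no-argument constructor picks a random UID,
// and toString is inherited unchanged, so it returns that UID.
class ToyEvaluator(override val uid: String) extends Identifiable {
  def this() = this(Identifiable.randomUID("toyEval"))
}

object DefaultToStringDemo {
  def main(args: Array[String]): Unit = {
    val evaluator = new ToyEvaluator()
    println(evaluator) // something like "toyEval_" followed by 12 hex characters
  }
}
```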
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661218#comment-14661218 ]

Joseph K. Bradley commented on SPARK-9720:
------------------------------------------

Say you have a big Pipeline, run it, and get a failure saying some model type MyModel failed at point X. You may have multiple instances of MyModel in the Pipeline, and you will have no idea which of those instances caused the failure. It'd be nice to know which one, and the UID provides that.
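A toy version of this scenario, assuming nothing about Spark's real Pipeline API: ToyStage, its divisor parameter, and PipelineDemo.runAll are all hypothetical names. Two stages of the same type run in sequence, and the error message interpolates the failing stage's toString, which is its UID.

```scala
import scala.util.Try

// Simplified stand-in for Spark's Identifiable.
trait Identifiable {
  val uid: String
  override def toString: String = uid
}

// Hypothetical stage: divides its input, and fails if the divisor is zero.
class ToyStage(override val uid: String, val divisor: Int) extends Identifiable {
  def transform(x: Int): Int = {
    // The message uses toString, so it names this specific instance.
    require(divisor != 0, s"stage ${this} failed: divisor must be non-zero")
    x / divisor
  }
}

object PipelineDemo {
  /** Feeds the input through every stage in order, capturing any failure. */
  def runAll(stages: Seq[ToyStage], input: Int): Try[Int] =
    Try(stages.foldLeft(input)((acc, stage) => stage.transform(acc)))

  def main(args: Array[String]): Unit = {
    val stages = Seq(new ToyStage("toyStage_a1", 2), new ToyStage("toyStage_b2", 0))
    // The failure message mentions toyStage_b2, not just the class name.
    println(runAll(stages, 10))
  }
}
```

If ToyStage overrode toString without the UID, the same message would read identically for both instances, which is the ambiguity described above.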
[jira] [Commented] (SPARK-9720) spark.ml Identifiable types should have UID in toString methods
[ https://issues.apache.org/jira/browse/SPARK-9720?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14661195#comment-14661195 ]

Sean Cho commented on SPARK-9720:
---------------------------------

Can I ask why this is required and if it is a good idea?