[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223326#comment-15223326 ]

Jacek Laskowski commented on SPARK-13998:
-----------------------------------------

I don't need it personally, but I don't really like seeing all these non-{{UnaryTransformer}} transformers like {{HashingTF}} or {{StopWordsRemover}}. I'd like to give it a shot and get rid of the "anomaly". What is the JIRA for refactoring {{UnaryTransformer}} to support setting {{Attribute}}? Or perhaps [~yanboliang] wants to work on it? Please guide [~josephkb] / [~mlnick].

> HashingTF should extend UnaryTransformer
> ----------------------------------------
>
>                 Key: SPARK-13998
>                 URL: https://issues.apache.org/jira/browse/SPARK-13998
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>    Affects Versions: 2.0.0
>            Reporter: Jacek Laskowski
>            Priority: Minor
>
> Currently
> [HashingTF|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala#L37]
> extends {{Transformer with HasInputCol with HasOutputCol}}, but there is a
> helper
> [UnaryTransformer|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/Transformer.scala#L79-L80]
> abstract class for exactly this reason.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215743#comment-15215743 ]

Nick Pentreath commented on SPARK-13998:
----------------------------------------

It's more house-keeping, to group this under the related improvements to feature hashing for consistency. It seems it requires {{Attribute}} support in {{UnaryTransformer}} first, which is perhaps low priority. So for now we can leave this.
[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211935#comment-15211935 ]

Jacek Laskowski commented on SPARK-13998:
-----------------------------------------

[~mlnick] It's a simple refactoring, i.e. changing {{extends Transformer with HasInputCol with HasOutputCol}} to {{extends UnaryTransformer[...]}}. Is your moving the issue to a sub-task a nod to the change? (I'm concerned after having read the other comment from [~yanboliang] and the follow-up from [~josephkb].)
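The refactoring described in this comment can be illustrated with a minimal, self-contained sketch. It does not use Spark: {{Transformer}}, {{HasInputCol}}, {{HasOutputCol}} and {{UnaryTransformer}} below are simplified stand-ins written for this example, a "dataset" is just a sequence of maps, and {{ManualTermCounter}}, {{UnaryTermCounter}} and {{termFrequencies}} are hypothetical names. The real {{UnaryTransformer}} has a different signature (for instance an additional self-type parameter), and the sketch does not model the {{Attribute}} metadata on the output column that the other comments identify as the blocker.

```scala
// Simplified stand-ins, NOT the real Spark classes. A row is a map from
// column name to value, and a dataset is a sequence of rows.
object UnaryTransformerSketch {
  type Row = Map[String, Any]

  // Hypothetical minimal Transformer: maps a dataset to a dataset.
  trait Transformer {
    def transform(rows: Seq[Row]): Seq[Row]
  }

  // Hypothetical counterparts of the shared column-name params.
  trait HasInputCol { def inputCol: String }
  trait HasOutputCol { def outputCol: String }

  // The helper: it owns the column plumbing, so a subclass only has to
  // supply the function applied to each input value.
  abstract class UnaryTransformer[IN, OUT]
      extends Transformer with HasInputCol with HasOutputCol {
    protected def createTransformFunc: IN => OUT

    override def transform(rows: Seq[Row]): Seq[Row] = {
      val f = createTransformFunc
      rows.map(row => row + (outputCol -> f(row(inputCol).asInstanceOf[IN])))
    }
  }

  // Toy hashing term-frequency function (an assumption for illustration,
  // not Spark's hashing scheme): bucket index -> number of terms in it.
  def termFrequencies(terms: Seq[String], numFeatures: Int): Map[Int, Int] =
    terms
      .groupBy(t => java.lang.Math.floorMod(t.hashCode, numFeatures))
      .map { case (bucket, ts) => bucket -> ts.size }

  // "Before": extends Transformer with HasInputCol with HasOutputCol and
  // reads/writes the columns by hand.
  class ManualTermCounter(val inputCol: String, val outputCol: String, numFeatures: Int)
      extends Transformer with HasInputCol with HasOutputCol {
    override def transform(rows: Seq[Row]): Seq[Row] =
      rows.map { row =>
        val terms = row(inputCol).asInstanceOf[Seq[String]]
        row + (outputCol -> termFrequencies(terms, numFeatures))
      }
  }

  // "After": extends the unary helper and only provides the function.
  class UnaryTermCounter(val inputCol: String, val outputCol: String, numFeatures: Int)
      extends UnaryTransformer[Seq[String], Map[Int, Int]] {
    override protected def createTransformFunc: Seq[String] => Map[Int, Int] =
      terms => termFrequencies(terms, numFeatures)
  }

  def main(args: Array[String]): Unit = {
    val rows = Seq[Row](Map("words" -> Seq("spark", "ml", "spark")))
    val before = new ManualTermCounter("words", "tf", 16).transform(rows)
    val after = new UnaryTermCounter("words", "tf", 16).transform(rows)
    // Both variants apply the same function to the same column.
    println(before == after)
  }
}
```

In this toy version the two classes behave identically, which is why the change looks like a pure refactoring. The helper here has no hook for attaching metadata to the output column, which corresponds to the gap the thread says must be closed in {{UnaryTransformer}} first.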
[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201400#comment-15201400 ]

Yanbo Liang commented on SPARK-13998:
-------------------------------------

I think we should first refactor UnaryTransformer to support setting Attribute for the output column. If this makes sense, I can give it a try.
[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201236#comment-15201236 ]

Nick Pentreath commented on SPARK-13998:
----------------------------------------

[~jlaskowski] I've moved this to a sub-task under SPARK-13964, to group it with the other improvements to feature hashing.
[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer
[ https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201998#comment-15201998 ]

Joseph K. Bradley commented on SPARK-13998:
-------------------------------------------

I agree it'd require that. I don't think it's high priority for now though, unless someone has a need for UnaryTransformer supporting attributes right now.