[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-04-03 Thread Jacek Laskowski (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15223326#comment-15223326
 ] 

Jacek Laskowski commented on SPARK-13998:
-

I don't personally need it, but I don't really like seeing all these 
non-{{UnaryTransformer}} transformers like {{HashingTF}} or 
{{StopWordsRemover}}. I'd like to give it a shot and get rid of the 
"anomaly". What is the JIRA for refactoring {{UnaryTransformer}} to support 
setting {{Attribute}}? Or perhaps [~yanboliang] wants to work on it? 

[~josephkb] / [~mlnick], please advise.

> HashingTF should extend UnaryTransformer
> 
>
> Key: SPARK-13998
> URL: https://issues.apache.org/jira/browse/SPARK-13998
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Affects Versions: 2.0.0
> Reporter: Jacek Laskowski
> Priority: Minor
>
> Currently 
> [HashingTF|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/feature/HashingTF.scala#L37]
>  extends {{Transformer with HasInputCol with HasOutputCol}}, but there is a 
> helper 
> [UnaryTransformer|https://github.com/apache/spark/blob/master/mllib/src/main/scala/org/apache/spark/ml/Transformer.scala#L79-L80]
>  abstract class for exactly this purpose.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-03-29 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15215743#comment-15215743
 ] 

Nick Pentreath commented on SPARK-13998:


The move is mostly housekeeping: it groups this issue with the related 
improvements to feature hashing for consistency. It seems the change requires 
{{Attribute}} support in {{UnaryTransformer}} first, which is perhaps low 
priority. So for now we can leave this.




[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-03-25 Thread Jacek Laskowski (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15211935#comment-15211935
 ] 

Jacek Laskowski commented on SPARK-13998:
-

[~mlnick] It's a simple refactoring, i.e. changing {{extends Transformer with 
HasInputCol with HasOutputCol}} to {{extends UnaryTransformer[...]}}. Is your 
moving the issue to a sub-task a nod to the change? (I'm concerned after 
reading the other comment from [~yanboliang] and the follow-up from [~josephkb].)




[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-03-20 Thread Yanbo Liang (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201400#comment-15201400
 ] 

Yanbo Liang commented on SPARK-13998:
-

I think we should first refactor {{UnaryTransformer}} to support setting an 
{{Attribute}} for the output column. If this makes sense, I can give it a try.
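
To make the dependency concrete, here is a minimal, dependency-free Scala sketch of what such a refactoring could look like. It is a toy model under stated assumptions, not Spark code: {{Field}}, {{ToyUnaryTransformer}}, {{ToyHashingTF}}, the {{outputMetadata}} hook and the {{"num_attrs"}} key are all hypothetical names invented for illustration, and the hashing is a placeholder rather than the hash function Spark uses. The idea it shows is that the base class appends the output column on behalf of its subclasses, so a subclass that needs to attach attribute information (for a {{HashingTF}}-like transformer, the feature-vector size) needs some overridable hook for it.

```scala
// Toy stand-ins for illustration only; these are NOT Spark classes.
final case class Field(name: String, dataType: String,
                       metadata: Map[String, String] = Map.empty)

// Mirrors the general shape of a unary transformer: subclasses supply the
// per-value function and the output type; the base class appends the column.
abstract class ToyUnaryTransformer[IN, OUT](val inputCol: String, val outputCol: String) {
  protected def createTransformFunc: IN => OUT
  protected def outputDataType: String

  // Hypothetical hook: metadata to attach to the output column.
  // The empty default leaves existing subclasses unchanged.
  protected def outputMetadata: Map[String, String] = Map.empty

  def transformSchema(schema: Seq[Field]): Seq[Field] = {
    require(schema.exists(_.name == inputCol), s"Input column $inputCol does not exist.")
    require(!schema.exists(_.name == outputCol), s"Output column $outputCol already exists.")
    schema :+ Field(outputCol, outputDataType, outputMetadata)
  }

  def transformValue(value: IN): OUT = createTransformFunc(value)
}

// A HashingTF-like subclass that records the feature-vector size on the
// output column through the hook.
class ToyHashingTF(in: String, out: String, val numFeatures: Int)
    extends ToyUnaryTransformer[Seq[String], Map[Int, Double]](in, out) {

  // Placeholder hashing: bucket index -> term count (not Spark's hash function).
  override protected def createTransformFunc: Seq[String] => Map[Int, Double] =
    terms => terms
      .groupBy(t => java.lang.Math.floorMod(t.hashCode, numFeatures))
      .map { case (index, hits) => index -> hits.size.toDouble }

  override protected def outputDataType: String = "vector"

  override protected def outputMetadata: Map[String, String] =
    Map("num_attrs" -> numFeatures.toString)
}
```

With a hook of this kind, {{new ToyHashingTF("words", "features", 16).transformSchema(...)}} appends a {{features}} field that carries the vector size, while subclasses that do not override the hook behave as before. Whether the real change should take this shape is a design decision for the refactoring issue.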




[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-03-19 Thread Nick Pentreath (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201236#comment-15201236
 ] 

Nick Pentreath commented on SPARK-13998:


[~jlaskowski] I've moved this to a sub-task under SPARK-13964, to group it with 
the other improvements to feature hashing.




[jira] [Commented] (SPARK-13998) HashingTF should extend UnaryTransformer

2016-03-18 Thread Joseph K. Bradley (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13998?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15201998#comment-15201998
 ] 

Joseph K. Bradley commented on SPARK-13998:
---

I agree it'd require that. I don't think it's high priority for now though, 
unless someone has a need for {{UnaryTransformer}} supporting attributes right now.
