[jira] [Commented] (SPARK-13938) word2phrase feature created in ML

2016-03-21 Thread Steve Weng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205095#comment-15205095
 ] 

Steve Weng commented on SPARK-13938:


I looked it over already, but was hoping you had more details.




> word2phrase feature created in ML
> -
>
> Key: SPARK-13938
> URL: https://issues.apache.org/jira/browse/SPARK-13938
> Project: Spark
>  Issue Type: New Feature
>  Components: ML
>Reporter: Steve Weng
>Priority: Critical
>   Original Estimate: 840h
>  Remaining Estimate: 840h
>
> I implemented word2phrase (see http://arxiv.org/pdf/1310.4546.pdf), which 
> uses a trained model/estimator to rewrite a sentence so that certain 
> consecutive words are joined into a single token (e.g. "I went to 
> New York" becomes "I went to new_york").



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org



[jira] [Commented] (SPARK-13938) word2phrase feature created in ML

2016-03-21 Thread Steve Weng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15205026#comment-15205026
 ] 

Steve Weng commented on SPARK-13938:


Hey Sean, what determines whether a feature is appropriate for Spark?  The 
word2phrase algorithm seems fairly useful and is often provided alongside 
word2vec, which Spark does support.  (For example, Gensim offers phrase 
detection alongside word2vec.)

Cheers,
Steve







[jira] [Created] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Steve Weng (JIRA)
Steve Weng created SPARK-13938:
--

 Summary: word2phrase feature created in ML
 Key: SPARK-13938
 URL: https://issues.apache.org/jira/browse/SPARK-13938
 Project: Spark
  Issue Type: New Feature
  Components: ML
Reporter: Steve Weng
Priority: Critical


I implemented word2phrase (see http://arxiv.org/pdf/1310.4546.pdf), which 
uses a trained model/estimator to rewrite a sentence so that certain 
consecutive words are joined into a single token (e.g. "I went to New 
York" becomes "I went to new_york").
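
For readers unfamiliar with the technique, here is a minimal sketch of the idea in plain Python. It is not the code attached to this issue and does not use any Spark API. It assumes the bigram score given in the linked paper, score(a, b) = (count(a b) - delta) / (count(a) * count(b)), and joins any adjacent pair whose score exceeds a threshold. The function names, the toy corpus, the lowercasing tokenizer, and the values of delta and threshold are all illustrative assumptions.

```python
from collections import Counter


def tokenize(sentence):
    # Assumption: lowercase everything and split on whitespace.
    return sentence.lower().split()


def train_phrases(sentences, delta=1.0, threshold=0.1):
    """Fit step: return the set of bigrams scoring above `threshold`."""
    unigrams = Counter()
    bigrams = Counter()
    for words in sentences:
        unigrams.update(words)
        bigrams.update(zip(words, words[1:]))
    phrases = set()
    for (a, b), n_ab in bigrams.items():
        # `delta` discounts pairs that co-occur only a handful of times.
        score = (n_ab - delta) / (unigrams[a] * unigrams[b])
        if score > threshold:
            phrases.add((a, b))
    return phrases


def transform(words, phrases):
    """Transform step: greedily join learned bigrams, left to right."""
    out = []
    i = 0
    while i < len(words):
        if i + 1 < len(words) and (words[i], words[i + 1]) in phrases:
            out.append(words[i] + "_" + words[i + 1])
            i += 2
        else:
            out.append(words[i])
            i += 1
    return out


# Hypothetical toy corpus, chosen so that only "new york" recurs.
corpus = [
    "I went to New York",
    "New York is big",
    "I like New York",
    "she went home",
    "talk to me",
]
phrases = train_phrases([tokenize(s) for s in corpus])
print(" ".join(transform(tokenize("I went to New York"), phrases)))
# prints: i went to new_york
```

Because this sketch lowercases every token, its output differs in casing from the example in the description. The split into a fitting function and a transforming function mirrors the model/estimator wording above, but how the actual implementation is structured is not stated in this thread.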






[jira] [Commented] (SPARK-13938) word2phrase feature created in ML

2016-03-19 Thread Steve Weng (JIRA)

[ 
https://issues.apache.org/jira/browse/SPARK-13938?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198082#comment-15198082
 ] 

Steve Weng commented on SPARK-13938:


Looks good, thanks!


