[ 
https://issues.apache.org/jira/browse/SPARK-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009526#comment-15009526
 ] 

Joseph K. Bradley commented on SPARK-10935:
-------------------------------------------

[[email protected]]  [SPARK-7546] is meant to track an example of feature 
assembly, but it's been pending for a long time.  I'm hoping we can get it in 
during this release.

What a closure captures is a fundamental part of learning Spark.  That should 
really be documented in the Spark Core guides, such as 
[http://spark.apache.org/docs/latest/programming-guide.html#understanding-closures-a-nameclosureslinka].
If anything there is ambiguous, it'd be great if you could create a JIRA and 
comment on how to improve that section.
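
As a rough illustration of the closure point, here is a plain-Python sketch 
(no Spark involved; run_on_executor and driver_state are made-up names). 
Spark serializes the variables a closure references and ships a copy to each 
executor, so updates made on an executor do not reach the driver.  The pickle 
round trip below only simulates that shipping step:

```python
import pickle

def run_on_executor(task_state, partition):
    # Simulate shipping the closure: the "executor" works on a
    # deserialized copy of the captured variables, not the original.
    local_state = pickle.loads(pickle.dumps(task_state))
    for x in partition:
        local_state["counter"] += x
    return local_state["counter"]

# State that lives on the "driver" and is captured by the task.
driver_state = {"counter": 0}

# Two hypothetical partitions of the data.
partitions = [[1, 2], [3, 4]]

# Each "executor" updates only its own copy of the counter.
executor_results = [run_on_executor(driver_state, p) for p in partitions]

# Results have to be returned and combined explicitly.
total = sum(executor_results)

print(driver_state["counter"])  # the driver's copy is unchanged
print(total)
```

In real Spark code, the equivalent fix is to return values through an action 
such as reduce, or to use an accumulator, rather than mutating a captured 
variable.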

In general, it should be more efficient and safer (in terms of memory errors) 
to use DataFrames instead of RDDs.

> Avito Context Ad Clicks
> -----------------------
>
>                 Key: SPARK-10935
>                 URL: https://issues.apache.org/jira/browse/SPARK-10935
>             Project: Spark
>          Issue Type: Sub-task
>          Components: ML
>            Reporter: Xiangrui Meng
>
> From [[email protected]]:
> I would love to do Avito Context Ad Clicks - 
> https://www.kaggle.com/c/avito-context-ad-clicks - but it involves a lot of 
> feature engineering and preprocessing. I would love to split this with 
> somebody else if anybody is interested in working on this.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
