[ https://issues.apache.org/jira/browse/SPARK-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15009526#comment-15009526 ]
Joseph K. Bradley commented on SPARK-10935:
-------------------------------------------
[[email protected]] [SPARK-7546] is meant to track an example of feature
assembly, but it's been pending for a long time. I'm hoping we can get it in
during this release.
What a closure captures is a fundamental concept in learning Spark, so it
should really be documented within the Spark Core guides, such as
[http://spark.apache.org/docs/latest/programming-guide.html#understanding-closures-a-nameclosureslinka].
If that section is ambiguous, it'd be great if you could create a JIRA
describing how to improve it.
In general, it should be more efficient and safer (in terms of memory errors)
to use DataFrames instead of RDDs.
> Avito Context Ad Clicks
> -----------------------
>
> Key: SPARK-10935
> URL: https://issues.apache.org/jira/browse/SPARK-10935
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Xiangrui Meng
>
> From [[email protected]]:
> I would love to do Avito Context Ad Clicks -
> https://www.kaggle.com/c/avito-context-ad-clicks - but it involves a lot of
> feature engineering and preprocessing. I would love to split this with
> somebody else if anybody is interested in working on this.