[ https://issues.apache.org/jira/browse/SPARK-10935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15136053#comment-15136053 ]
Ruslan Dautkhanov commented on SPARK-10935:
-------------------------------------------

I noticed outer joins. Spark before 1.5 used a cartesian product to produce outer joins (SPARK-11111), which will not work well with larger datasets. This was fixed in 1.6.

> Avito Context Ad Clicks
> -----------------------
>
> Key: SPARK-10935
> URL: https://issues.apache.org/jira/browse/SPARK-10935
> Project: Spark
> Issue Type: Sub-task
> Components: ML
> Reporter: Xiangrui Meng
>
> From [~kpl...@gmail.com]:
> I would love to do Avito Context Ad Clicks - https://www.kaggle.com/c/avito-context-ad-clicks - but it involves a lot of
> feature engineering and preprocessing. I would love to split this with
> somebody else if anybody is interested in working on this.