[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226868#comment-14226868 ] Pat McDonough commented on SPARK-2192: -- Thanks [~srowen]. And yes, Xiangrui confirmed he just generated the data.

> Examples Data Not in Binary Distribution
>
> Key: SPARK-2192
> URL: https://issues.apache.org/jira/browse/SPARK-2192
> Project: Spark
> Issue Type: Bug
> Components: Build
> Affects Versions: 1.0.0
> Reporter: Pat McDonough
>
> The data used by examples is not packaged up with the binary distribution.
> The data subdirectory of spark should make its way into the distribution
> somewhere so the examples can use it.

--
This message was sent by Atlassian JIRA (v6.3.4#6332)

To unsubscribe, e-mail: issues-unsubscr...@spark.apache.org
For additional commands, e-mail: issues-h...@spark.apache.org
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226175#comment-14226175 ] Apache Spark commented on SPARK-2192: - User 'srowen' has created a pull request for this issue: https://github.com/apache/spark/pull/3480
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14226151#comment-14226151 ] Sean Owen commented on SPARK-2192: -- Oops, on further inspection I see that the file is not Movielens data, but merely in the same format. The comments do say this in MovieLensALS.scala. I'll cook up a PR to add the example data to the distro.
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14225036#comment-14225036 ] Pat McDonough commented on SPARK-2192: -- [~srowen] - I fully support that and agree that Movielens needs to be removed.
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14224436#comment-14224436 ] Sean Owen commented on SPARK-2192: -- Data files are now consolidated under "data/", and they are not in the binary distribution. It would be easy to add them, and seems like a reasonable thing to do. However, I'm not clear all of those data files can be distributed; MovieLens data for example isn't supposed to be AFAIK. In fact, I'm not clear it should be in the Spark repo even. Any support for me adding this to the distro, but removing examples based on things like Movielens that shouldn't be redistributed?
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14043847#comment-14043847 ] Pat McDonough commented on SPARK-2192: -- Based on a very quick and not thorough search, the only mention I found of those files came in the docs (bagel-programming-guide.md --> pagerank_data.txt). But you'll also note that SparkKMeans and SparkPageRank seem to work with those files. On Wed, Jun 25, 2014 at 10:52 AM, Henry Saputra (JIRA) -- This message was sent by Atlassian JIRA (v6.2#6252)
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14042938#comment-14042938 ] Henry Saputra commented on SPARK-2192: -- I think several tests already have the data in the main/resources. Do you have list of which ones missing?
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14038200#comment-14038200 ] Patrick Wendell commented on SPARK-2192: It might be good to have all the example data in src/main/resources.
[jira] [Commented] (SPARK-2192) Examples Data Not in Binary Distribution
[ https://issues.apache.org/jira/browse/SPARK-2192?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14037070#comment-14037070 ] Pat McDonough commented on SPARK-2192: -- [~pacoid] - thanks for pointing this out. I guess we'll have to fall back to using the data from src