[jira] [Resolved] (SPARK-2361) Decide whether to broadcast or serialize the weights directly in MLlib algorithms

2014-07-26 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2361?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2361. Resolution: Fixed Fix Version/s: 1.1.0 > Decide whether to broadcast or serialize the weight

[jira] [Assigned] (SPARK-2680) Lower spark.shuffle.memoryFraction to 0.2 by default

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2680: Assignee: Matei Zaharia > Lower spark.shuffle.memoryFraction to 0.2 by default > --

[jira] [Resolved] (SPARK-2680) Lower spark.shuffle.memoryFraction to 0.2 by default

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2680. Resolution: Fixed > Lower spark.shuffle.memoryFraction to 0.2 by default >

[jira] [Commented] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-26 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075536#comment-14075536 ] Ted Malaska commented on SPARK-2447: Added the first of many pull requests. Please feel fr

[jira] [Commented] (SPARK-2684) Update ExternalAppendOnlyMap to take an iterator as input

2014-07-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075526#comment-14075526 ] Apache Spark commented on SPARK-2684: User 'mateiz' has created a pull request for th

[jira] [Assigned] (SPARK-2684) Update ExternalAppendOnlyMap to take an iterator as input

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2684?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia reassigned SPARK-2684: Assignee: Matei Zaharia > Update ExternalAppendOnlyMap to take an iterator as input > -

[jira] [Commented] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075522#comment-14075522 ] Josh Rosen commented on SPARK-1170: Hi [~dwmclary] and [~prashant_], It looks like you

[jira] [Assigned] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1170: Assignee: Josh Rosen > Add histogram() to PySpark > -- > >

[jira] [Updated] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1170: Assignee: (was: Prashant Sharma) > Add histogram() to PySpark > -- > >

[jira] [Updated] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1170: Assignee: Prashant Sharma (was: Josh Rosen) > Add histogram() to PySpark > --

[jira] [Assigned] (SPARK-1170) Add histogram() to PySpark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1170?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1170: Assignee: Josh Rosen > Add histogram() to PySpark > -- > >

[jira] [Resolved] (SPARK-1207) Make python support for histograms

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1207?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1207. Resolution: Duplicate > Make python support for histograms > -- > >

[jira] [Resolved] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2601. Resolution: Fixed Fix Version/s: 1.1.0 > py4j.Py4JException on sc.pickleFile > -

[jira] [Commented] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075517#comment-14075517 ] Apache Spark commented on SPARK-1550: User 'JoshRosen' has created a pull request for

[jira] [Updated] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1550: Affects Version/s: 1.0.1 0.9.2 1.0.0 > Successive creatio

[jira] [Assigned] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1550: Assignee: Josh Rosen > Successive creation of spark context fails in pyspark, if the previous >

[jira] [Commented] (SPARK-1550) Successive creation of spark context fails in pyspark, if the previous initialization of spark context had failed.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1550?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075509#comment-14075509 ] Josh Rosen commented on SPARK-1550: Actually, there's still a similar problem in Spark

[jira] [Updated] (SPARK-2435) Add shutdown hook to bin/pyspark

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2435?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2435: Assignee: Josh Rosen > Add shutdown hook to bin/pyspark > > >

[jira] [Commented] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075498#comment-14075498 ] Apache Spark commented on SPARK-2601: User 'JoshRosen' has created a pull request for

[jira] [Updated] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2601: Affects Version/s: 1.1.0 > py4j.Py4JException on sc.pickleFile > --- >

[jira] [Resolved] (SPARK-2704) ConnectionManager threads should be named and daemon

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2704. Resolution: Fixed Fix Version/s: 1.1.0 > ConnectionManager threads should be named and d

[jira] [Updated] (SPARK-2523) For partitioned Hive tables, partition-specific ObjectInspectors should be used.

2014-07-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2523: Summary: For partitioned Hive tables, partition-specific ObjectInspectors should be used. (was: Potential

[jira] [Updated] (SPARK-2523) Potential Bugs if SerDe is not the identical among partitions and table

2014-07-26 Thread Yin Huai (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2523?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Yin Huai updated SPARK-2523: Issue Type: Bug (was: Improvement) > Potential Bugs if SerDe is not the identical among partitions and tab

[jira] [Assigned] (SPARK-2601) py4j.Py4JException on sc.pickleFile

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2601?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-2601: Assignee: Josh Rosen > py4j.Py4JException on sc.pickleFile > ---

[jira] [Resolved] (SPARK-2547) The clustering documentaion example provided for spark 0.9.1/docs is having a error

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2547. Resolution: Fixed Fix Version/s: 0.9.3 Target Version/s: (was: 0.9.2) > The cluste

[jira] [Resolved] (SPARK-717) Refactor Programming Guides in Documentation

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-717?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-717. Resolution: Fixed Target Version/s: 1.0.0 > Refactor Programming Guides in Documentation > ---

[jira] [Resolved] (SPARK-1036) .gitignore is overly aggressive

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1036?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1036. Resolution: Fixed Fix Version/s: 1.0.0 Assignee: Patrick Wendell Fixed by Patrick in

[jira] [Updated] (SPARK-2637) PEP8 Compliance pull request #1540

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-2637: Component/s: (was: Documentation) PySpark > PEP8 Compliance pull request #1540 > -

[jira] [Resolved] (SPARK-2637) PEP8 Compliance pull request #1540

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2637?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2637. Resolution: Won't Fix Closing this as 'wont fix' since we decided not to re-format code cloudpickle

[jira] [Resolved] (SPARK-2694) machine learning

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-2694. Resolution: Incomplete > machine learning > > > Key: SPARK-2694 >

[jira] [Resolved] (SPARK-661) Java unit tests don't seem to run with Maven

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-661?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-661. Resolution: Cannot Reproduce > Java unit tests don't seem to run with Maven > --

[jira] [Commented] (SPARK-606) Add mapSideCombine setting to Java API partitionBy() method.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075481#comment-14075481 ] Josh Rosen commented on SPARK-606: mapSideCombine was removed from partitionBy in https:

[jira] [Resolved] (SPARK-606) Add mapSideCombine setting to Java API partitionBy() method.

2014-07-26 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-606?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-606. Resolution: Won't Fix Fix Version/s: 0.8.0 Assignee: Reynold Xin > Add mapSideCombine se

[jira] [Commented] (SPARK-2704) ConnectionManager threads should be named and daemon

2014-07-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2704?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075478#comment-14075478 ] Apache Spark commented on SPARK-2704: User 'rxin' has created a pull request for this

[jira] [Created] (SPARK-2704) ConnectionManager threads should be named and daemon

2014-07-26 Thread Reynold Xin (JIRA)
Reynold Xin created SPARK-2704: Summary: ConnectionManager threads should be named and daemon Key: SPARK-2704 URL: https://issues.apache.org/jira/browse/SPARK-2704 Project: Spark Issue Type: Bu

[jira] [Created] (SPARK-2703) Make Tachyon related unit tests execute without deploying a Tachyon system locally.

2014-07-26 Thread Haoyuan Li (JIRA)
Haoyuan Li created SPARK-2703: Summary: Make Tachyon related unit tests execute without deploying a Tachyon system locally. Key: SPARK-2703 URL: https://issues.apache.org/jira/browse/SPARK-2703 Project:

[jira] [Created] (SPARK-2702) Upgrade Tachyon dependency to 0.5.0

2014-07-26 Thread Haoyuan Li (JIRA)
Haoyuan Li created SPARK-2702: Summary: Upgrade Tachyon dependency to 0.5.0 Key: SPARK-2702 URL: https://issues.apache.org/jira/browse/SPARK-2702 Project: Spark Issue Type: Improvement Affec

[jira] [Updated] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2279: Priority: Minor (was: Major) > JavaSparkContext should allow creation of EmptyRDD >

[jira] [Updated] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2279: Assignee: Bob Paulin > JavaSparkContext should allow creation of EmptyRDD > -

[jira] [Resolved] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2279. Resolution: Fixed Fix Version/s: 1.1.0 > JavaSparkContext should allow creation of Empty

[jira] [Created] (SPARK-2701) ConnectionManager throws out of "Could not find reference for received ack message xxx" exception

2014-07-26 Thread Guoqiang Li (JIRA)
Guoqiang Li created SPARK-2701: Summary: ConnectionManager throws out of "Could not find reference for received ack message xxx" exception Key: SPARK-2701 URL: https://issues.apache.org/jira/browse/SPARK-2701

[jira] [Commented] (SPARK-2447) Add common solution for sending upsert actions to HBase (put, deletes, and increment)

2014-07-26 Thread Ted Malaska (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2447?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075376#comment-14075376 ] Ted Malaska commented on SPARK-2447: Getting closer to the first pull request. 1. Add

[jira] [Commented] (SPARK-2700) Hidden files (such as .impala_insert_staging) should be filtered out by sqlContext.parquetFile

2014-07-26 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075309#comment-14075309 ] Sean Owen commented on SPARK-2700: (As a generic aside, yes, in general apps should neve

[jira] [Resolved] (SPARK-2652) Turning default configurations for PySpark

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2652?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2652. Resolution: Fixed > Turning default configurations for PySpark > --

[jira] [Resolved] (SPARK-2696) Reduce default spark.serializer.objectStreamReset

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2696. Resolution: Fixed Fix Version/s: 1.0.3 Target Version/s: 1.0.3 (was: 1.0.0) >

[jira] [Updated] (SPARK-2696) Reduce default spark.serializer.objectStreamReset

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2696: Assignee: Hossein Falaki > Reduce default spark.serializer.objectStreamReset > -

[jira] [Resolved] (SPARK-1458) Expose sc.version in PySpark

2014-07-26 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-1458. Resolution: Fixed Fix Version/s: 1.1.0 > Expose sc.version in PySpark >

[jira] [Commented] (SPARK-2674) Add date and time types to inferSchema

2014-07-26 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075284#comment-14075284 ] Apache Spark commented on SPARK-2674: User 'davies' has created a pull request for th