[jira] [Commented] (SPARK-2674) Add date and time types to inferSchema

2014-07-25 Thread Davies Liu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2674?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075272#comment-14075272 ] Davies Liu commented on SPARK-2674: --- Date and time in Python will be converted into java

[jira] [Comment Edited] (SPARK-2699) Improve compatibility with parquet file/table

2014-07-25 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075256#comment-14075256 ] Teng Qiu edited comment on SPARK-2699 at 7/26/14 4:17 AM: -- PR: ht

[jira] [Comment Edited] (SPARK-2699) Improve compatibility with parquet file/table

2014-07-25 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075256#comment-14075256 ] Teng Qiu edited comment on SPARK-2699 at 7/26/14 4:15 AM: -- added

[jira] [Commented] (SPARK-2699) Improve compatibility with parquet file/table

2014-07-25 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2699?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075256#comment-14075256 ] Teng Qiu commented on SPARK-2699: - added a config property: parquet.binarytype, default is

[jira] [Commented] (SPARK-2410) Thrift/JDBC Server

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075249#comment-14075249 ] Apache Spark commented on SPARK-2410: - User 'liancheng' has created a pull request for

[jira] [Commented] (SPARK-2700) Hidden files (such as .impala_insert_staging) should be filtered out by sqlContext.parquetFile

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075237#comment-14075237 ] Apache Spark commented on SPARK-2700: - User 'chutium' has created a pull request for t

[jira] [Updated] (SPARK-2677) BasicBlockFetchIterator#next can wait forever

2014-07-25 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2677: --- Fix Version/s: 1.0.3 > BasicBlockFetchIterator#next can wait forever > --

[jira] [Updated] (SPARK-2700) Hidden files (such as .impala_insert_staging) should be filtered out by sqlContext.parquetFile

2014-07-25 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Teng Qiu updated SPARK-2700: Description: when creating a table in impala, a hidden folder .impala_insert_staging will be created in th

[jira] [Resolved] (SPARK-2681) Spark can hang when fetching shuffle blocks

2014-07-25 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li resolved SPARK-2681. Resolution: Duplicate > Spark can hang when fetching shuffle blocks > -

[jira] [Updated] (SPARK-2677) BasicBlockFetchIterator#next can wait forever

2014-07-25 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2677: --- Priority: Blocker (was: Critical) > BasicBlockFetchIterator#next can wait forever >

[jira] [Updated] (SPARK-2677) BasicBlockFetchIterator#next can wait forever

2014-07-25 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2677: --- Component/s: Spark Core > BasicBlockFetchIterator#next can wait forever > ---

[jira] [Updated] (SPARK-2677) BasicBlockFetchIterator#next can wait forever

2014-07-25 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2677?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Guoqiang Li updated SPARK-2677: --- Fix Version/s: 1.1.0 > BasicBlockFetchIterator#next can wait forever > --

[jira] [Commented] (SPARK-2446) Add BinaryType support to Parquet I/O.

2014-07-25 Thread Teng Qiu (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075232#comment-14075232 ] Teng Qiu commented on SPARK-2446: - Hi, thanks for the advice, I created a ticket for this:

[jira] [Created] (SPARK-2700) Hidden files (such as .impala_insert_staging) should be filtered out by sqlContext.parquetFile

2014-07-25 Thread Teng Qiu (JIRA)
Teng Qiu created SPARK-2700: --- Summary: Hidden files (such as .impala_insert_staging) should be filtered out by sqlContext.parquetFile Key: SPARK-2700 URL: https://issues.apache.org/jira/browse/SPARK-2700 Pr

[jira] [Commented] (SPARK-2681) Spark can hang when fetching shuffle blocks

2014-07-25 Thread Kousuke Saruta (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075230#comment-14075230 ] Kousuke Saruta commented on SPARK-2681: --- Hey, I reported a similar issue. https://issu

[jira] [Created] (SPARK-2699) Improve compatibility with parquet file/table

2014-07-25 Thread Teng Qiu (JIRA)
Teng Qiu created SPARK-2699: --- Summary: Improve compatibility with parquet file/table Key: SPARK-2699 URL: https://issues.apache.org/jira/browse/SPARK-2699 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-1097) ConcurrentModificationException

2014-07-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075216#comment-14075216 ] Patrick Wendell commented on SPARK-1097: A follow up to this fix is in Spark 1.0.2

[jira] [Updated] (SPARK-1097) ConcurrentModificationException

2014-07-25 Thread Patrick Wendell (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Patrick Wendell updated SPARK-1097: --- Fix Version/s: 1.0.2 > ConcurrentModificationException > --- > >

[jira] [Commented] (SPARK-2010) Support for nested data in PySpark SQL

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075188#comment-14075188 ] Apache Spark commented on SPARK-2010: - User 'davies' has created a pull request for th

[jira] [Commented] (SPARK-1812) Support cross-building with Scala 2.11

2014-07-25 Thread Mark Hamstra (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075176#comment-14075176 ] Mark Hamstra commented on SPARK-1812: - FWIW scalatest can be pushed to 2.2.0 without a

[jira] [Commented] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075162#comment-14075162 ] Apache Spark commented on SPARK-2279: - User 'bobpaulin' has created a pull request for

[jira] [Commented] (SPARK-2279) JavaSparkContext should allow creation of EmptyRDD

2014-07-25 Thread Bob Paulin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2279?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075163#comment-14075163 ] Bob Paulin commented on SPARK-2279: --- Added pull request: https://github.com/apache/spark

[jira] [Commented] (SPARK-1812) Support cross-building with Scala 2.11

2014-07-25 Thread Anand Avati (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1812?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075161#comment-14075161 ] Anand Avati commented on SPARK-1812: FYI - I am working on Scala 2.11 support, ongoing

[jira] [Updated] (SPARK-2698) RDD page Spark Web UI bug

2014-07-25 Thread Hossein Falaki (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2698?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Hossein Falaki updated SPARK-2698: -- Attachment: spark ui.png > RDD page Spark Web UI bug > - > >

[jira] [Created] (SPARK-2698) RDD page Spark Web UI bug

2014-07-25 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-2698: - Summary: RDD page Spark Web UI bug Key: SPARK-2698 URL: https://issues.apache.org/jira/browse/SPARK-2698 Project: Spark Issue Type: Bug Component

[jira] [Created] (SPARK-2697) Source Scala and Python shell banners from a single place

2014-07-25 Thread Nicholas Chammas (JIRA)
Nicholas Chammas created SPARK-2697: --- Summary: Source Scala and Python shell banners from a single place Key: SPARK-2697 URL: https://issues.apache.org/jira/browse/SPARK-2697 Project: Spark

[jira] [Updated] (SPARK-1498) Spark can hang if pyspark tasks fail

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1498?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen updated SPARK-1498: -- Affects Version/s: 0.9.2 0.9.1 Fix Version/s: 1.0.0 This is still a prob

[jira] [Commented] (SPARK-1458) Expose sc.version in PySpark

2014-07-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075137#comment-14075137 ] Nicholas Chammas commented on SPARK-1458: - Derp, I guess we [overcomplicated|http

[jira] [Resolved] (SPARK-915) Tidy up the scripts

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-915?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-915. -- Resolution: Fixed Fix Version/s: 0.9.0 > Tidy up the scripts > --- > >

[jira] [Commented] (SPARK-1458) Expose sc.version in PySpark

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075126#comment-14075126 ] Apache Spark commented on SPARK-1458: - User 'JoshRosen' has created a pull request for

[jira] [Commented] (SPARK-2154) Worker goes down.

2014-07-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2154?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075123#comment-14075123 ] Nicholas Chammas commented on SPARK-2154: - Can we edit the title of this issue to

[jira] [Commented] (SPARK-2407) Implement SQL SUBSTR() directly in Catalyst

2014-07-25 Thread Nicholas Chammas (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2407?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075122#comment-14075122 ] Nicholas Chammas commented on SPARK-2407: - Minor point: Shouldn't the issue type f

[jira] [Commented] (SPARK-2696) Reduce default spark.serializer.objectStreamReset

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2696?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075117#comment-14075117 ] Apache Spark commented on SPARK-2696: - User 'falaki' has created a pull request for th

[jira] [Assigned] (SPARK-1458) Expose sc.version in PySpark

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1458?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen reassigned SPARK-1458: - Assignee: Josh Rosen > Expose sc.version in PySpark > > >

[jira] [Created] (SPARK-2696) Reduce default spark.serializer.objectStreamReset

2014-07-25 Thread Hossein Falaki (JIRA)
Hossein Falaki created SPARK-2696: - Summary: Reduce default spark.serializer.objectStreamReset Key: SPARK-2696 URL: https://issues.apache.org/jira/browse/SPARK-2696 Project: Spark Issue Type

[jira] [Commented] (SPARK-1011) MatrixFactorizationModel in pyspark throws serialization error

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075094#comment-14075094 ] Josh Rosen commented on SPARK-1011: --- Oh, and if you want to combine the actual vs. predi

[jira] [Reopened] (SPARK-2410) Thrift/JDBC Server

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust reopened SPARK-2410: - > Thrift/JDBC Server > -- > > Key: SPARK-2410 >

[jira] [Resolved] (SPARK-1011) MatrixFactorizationModel in pyspark throws serialization error

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1011?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1011. --- Resolution: Fixed Fix Version/s: 0.9.1 Assignee: Hossein Falaki This was fixed in Spa

[jira] [Resolved] (SPARK-2567) Resubmitted stage sometimes remains as active stage in the web UI

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2567. -- Resolution: Fixed Fix Version/s: 1.1.0 > Resubmitted stage sometimes remains as active s

[jira] [Updated] (SPARK-2567) Resubmitted stage sometimes remains as active stage in the web UI

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-2567: - Assignee: Kay Ousterhout > Resubmitted stage sometimes remains as active stage in the web UI > --

[jira] [Commented] (SPARK-2567) Resubmitted stage sometimes remains as active stage in the web UI

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2567?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14075035#comment-14075035 ] Matei Zaharia commented on SPARK-2567: -- I've merged this into 1.1 because the patch d

[jira] [Resolved] (SPARK-1726) Tasks that fail to serialize remain in active stages forever.

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1726?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-1726. -- Resolution: Fixed Fix Version/s: 1.1.0 > Tasks that fail to serialize remain in active s

[jira] [Resolved] (SPARK-1257) Endless running task when using pyspark with input file containing a long line

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1257?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1257. --- Resolution: Fixed Fix Version/s: 0.9.1 Assignee: Josh Rosen > Endless running task wh

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-25 Thread Brock Noland (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074992#comment-14074992 ] Brock Noland commented on SPARK-2420: - Sounds like a great plan to me. The Hive + Spar

[jira] [Commented] (SPARK-2680) Lower spark.shuffle.memoryFraction to 0.2 by default

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2680?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074972#comment-14074972 ] Apache Spark commented on SPARK-2680: - User 'mateiz' has created a pull request for th

[jira] [Resolved] (SPARK-2125) Add sorting flag to ShuffleManager, and implement it in HashShuffleManager

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2125?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2125. -- Resolution: Fixed Fix Version/s: 1.1.0 > Add sorting flag to ShuffleManager, and impleme

[jira] [Resolved] (SPARK-1394) calling system.platform on worker raises IOError

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-1394?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Josh Rosen resolved SPARK-1394. --- Resolution: Fixed Fix Version/s: 1.0.1 > calling system.platform on worker raises IOError > --

[jira] [Commented] (SPARK-2446) Add BinaryType support to Parquet I/O.

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2446?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074942#comment-14074942 ] Michael Armbrust commented on SPARK-2446: - Hi [~chutium], Thanks for reporting th

[jira] [Updated] (SPARK-2529) Clean the closure in foreach and foreachPartition

2014-07-25 Thread Tathagata Das (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Tathagata Das updated SPARK-2529: - Target Version/s: 1.1.0, 1.0.2 (was: 1.1.0, 1.0.3) > Clean the closure in foreach and foreachPar

[jira] [Created] (SPARK-2695) Figure out a good way to handle NullType columns.

2014-07-25 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2695: --- Summary: Figure out a good way to handle NullType columns. Key: SPARK-2695 URL: https://issues.apache.org/jira/browse/SPARK-2695 Project: Spark Issue Type: Improvement

[jira] [Commented] (SPARK-2314) RDD actions are only overridden in Scala, not java or python

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074840#comment-14074840 ] Apache Spark commented on SPARK-2314: - User 'staple' has created a pull request for th

[jira] [Resolved] (SPARK-2682) Javadoc generated from Scala source code is not in javadoc's index

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2682?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2682. -- Resolution: Fixed Fix Version/s: 1.1.0 > Javadoc generated from Scala source code is not

[jira] [Commented] (SPARK-2314) RDD actions are only overridden in Scala, not java or python

2014-07-25 Thread Aaron Staple (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2314?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074836#comment-14074836 ] Aaron Staple commented on SPARK-2314: - Hi, I added a PR that handles overriding coll

[jira] [Updated] (SPARK-2576) slave node throws NoClassDefFoundError $line11.$read$ when executing a Spark QL query on HDFS CSV file

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2576: Fix Version/s: 1.0.2 > slave node throws NoClassDefFoundError $line11.$read$ when executing

[jira] [Updated] (SPARK-2576) slave node throws NoClassDefFoundError $line11.$read$ when executing a Spark QL query on HDFS CSV file

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2576?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2576: Target Version/s: 1.1.0 (was: 1.1.0, 1.0.3) > slave node throws NoClassDefFoundError $line

[jira] [Updated] (SPARK-2010) Support for nested data in PySpark SQL

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2010?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust updated SPARK-2010: Assignee: Davies Liu (was: Kan Zhang) > Support for nested data in PySpark SQL > -

[jira] [Commented] (SPARK-2694) machine learning

2014-07-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2694?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074805#comment-14074805 ] Sean Owen commented on SPARK-2694: -- Erm, it looks like you have just described simple k-m

[jira] [Commented] (SPARK-2515) Hypothesis testing

2014-07-25 Thread Doris Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2515?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074802#comment-14074802 ] Doris Xin commented on SPARK-2515: -- A toString method sounds like a really good idea here

[jira] [Created] (SPARK-2694) machine learning

2014-07-25 Thread Akash (JIRA)
Akash created SPARK-2694: Summary: machine learning Key: SPARK-2694 URL: https://issues.apache.org/jira/browse/SPARK-2694 Project: Spark Issue Type: Documentation Components: MLlib Affe

[jira] [Resolved] (SPARK-2422) Spark SQL migration guide for Shark users

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2422?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2422. - Resolution: Fixed Fix Version/s: 1.1.0 This was fixed in: https://github.com/apach

[jira] [Resolved] (SPARK-2410) Thrift/JDBC Server

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2410?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2410. - Resolution: Fixed Fix Version/s: 1.1.0 > Thrift/JDBC Server > -- >

[jira] [Resolved] (SPARK-2459) the user should be able to configure the resources used by JDBC server

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2459?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Michael Armbrust resolved SPARK-2459. - Resolution: Fixed Fix Version/s: 1.1.0 > the user should be able to configure the

[jira] [Created] (SPARK-2693) Support for UDAF Hive Aggregates like PERCENTILE

2014-07-25 Thread Michael Armbrust (JIRA)
Michael Armbrust created SPARK-2693: --- Summary: Support for UDAF Hive Aggregates like PERCENTILE Key: SPARK-2693 URL: https://issues.apache.org/jira/browse/SPARK-2693 Project: Spark Issue Ty

[jira] [Commented] (SPARK-2692) Decision Tree API update

2014-07-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074769#comment-14074769 ] Joseph K. Bradley commented on SPARK-2692: -- https://github.com/apache/spark/pull/

[jira] [Comment Edited] (SPARK-2692) Decision Tree API update

2014-07-25 Thread Joseph K. Bradley (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2692?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074769#comment-14074769 ] Joseph K. Bradley edited comment on SPARK-2692 at 7/25/14 7:00 PM: -

[jira] [Created] (SPARK-2692) Decision Tree API update

2014-07-25 Thread Joseph K. Bradley (JIRA)
Joseph K. Bradley created SPARK-2692: Summary: Decision Tree API update Key: SPARK-2692 URL: https://issues.apache.org/jira/browse/SPARK-2692 Project: Spark Issue Type: Improvement

[jira] [Resolved] (SPARK-2683) unidoc failed because org.apache.spark.util.CallSite uses Java keywords as value names

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2683?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2683. -- Resolution: Fixed Fix Version/s: 1.1.0 > unidoc failed because org.apache.spark.util.Cal

[jira] [Commented] (SPARK-2660) Enable pretty-printing SchemaRDD Rows

2014-07-25 Thread Michael Armbrust (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2660?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074760#comment-14074760 ] Michael Armbrust commented on SPARK-2660: - One way to do this would be to add "def

[jira] [Commented] (SPARK-2691) Allow Spark on Mesos to be launched with Docker

2014-07-25 Thread Timothy Chen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2691?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074739#comment-14074739 ] Timothy Chen commented on SPARK-2691: - Please assign this to me, Thanks! > Allow Spar

[jira] [Created] (SPARK-2691) Allow Spark on Mesos to be launched with Docker

2014-07-25 Thread Timothy Chen (JIRA)
Timothy Chen created SPARK-2691: --- Summary: Allow Spark on Mesos to be launched with Docker Key: SPARK-2691 URL: https://issues.apache.org/jira/browse/SPARK-2691 Project: Spark Issue Type: Impro

[jira] [Created] (SPARK-2690) Make unidoc part of our test process

2014-07-25 Thread Yin Huai (JIRA)
Yin Huai created SPARK-2690: --- Summary: Make unidoc part of our test process Key: SPARK-2690 URL: https://issues.apache.org/jira/browse/SPARK-2690 Project: Spark Issue Type: Test Reporte

[jira] [Commented] (SPARK-2638) Improve concurrency of fetching Map outputs

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2638?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074710#comment-14074710 ] Josh Rosen commented on SPARK-2638: --- Hi Stephen, The goal of MapOutputTracker's synchro

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074700#comment-14074700 ] Xuefu Zhang commented on SPARK-2420: +1 on downgrading Guava in Spark, as majority of

[jira] [Commented] (SPARK-2316) StorageStatusListener should avoid O(blocks) operations

2014-07-25 Thread Shivaram Venkataraman (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2316?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074692#comment-14074692 ] Shivaram Venkataraman commented on SPARK-2316: -- On a related note, can we hav

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074673#comment-14074673 ] Matei Zaharia commented on SPARK-2620: -- The problem is that case class is compiled di

[jira] [Resolved] (SPARK-2689) Remove use of println in ActorHelper

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2689. -- Resolution: Fixed Fix Version/s: 1.1.0 > Remove use of println in ActorHelper >

[jira] [Created] (SPARK-2689) Remove use of println in ActorHelper

2014-07-25 Thread Matei Zaharia (JIRA)
Matei Zaharia created SPARK-2689: Summary: Remove use of println in ActorHelper Key: SPARK-2689 URL: https://issues.apache.org/jira/browse/SPARK-2689 Project: Spark Issue Type: Bug

[jira] [Commented] (SPARK-2689) Remove use of println in ActorHelper

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2689?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074664#comment-14074664 ] Matei Zaharia commented on SPARK-2689: -- Pull request: https://github.com/apache/spark

[jira] [Commented] (SPARK-2579) Reading from S3 returns an inconsistent number of items with Spark 0.9.1

2014-07-25 Thread Sean Owen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2579?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074657#comment-14074657 ] Sean Owen commented on SPARK-2579: -- I tried to reproduce this with Spark 1.0.1, and I was

[jira] [Commented] (SPARK-2420) Change Spark build to minimize library conflicts

2014-07-25 Thread Marcelo Vanzin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2420?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074596#comment-14074596 ] Marcelo Vanzin commented on SPARK-2420: --- After some brainstorming, the path of least

[jira] [Commented] (SPARK-2387) Remove the stage barrier for better resource utilization

2014-07-25 Thread Josh Rosen (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2387?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074521#comment-14074521 ] Josh Rosen commented on SPARK-2387: --- {quote} For example, in a push-style shuffle, the p

[jira] [Updated] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2014-07-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Xuefu Zhang updated SPARK-2688: --- Description: Suppose we want to do the following data processing: {code} rdd1 -> rdd2 -> rdd3

[jira] [Commented] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2014-07-25 Thread Xuefu Zhang (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2688?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074368#comment-14074368 ] Xuefu Zhang commented on SPARK-2688: cc: [~rxin] [~sandyr] > Need a way to run multip

[jira] [Created] (SPARK-2688) Need a way to run multiple data pipeline concurrently

2014-07-25 Thread Xuefu Zhang (JIRA)
Xuefu Zhang created SPARK-2688: -- Summary: Need a way to run multiple data pipeline concurrently Key: SPARK-2688 URL: https://issues.apache.org/jira/browse/SPARK-2688 Project: Spark Issue Type: I

[jira] [Commented] (SPARK-2547) The clustering documentaion example provided for spark 0.9.1/docs is having a error

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074282#comment-14074282 ] Apache Spark commented on SPARK-2547: - User 'yu-iskw' has created a pull request for t

[jira] [Commented] (SPARK-2547) The clustering documentaion example provided for spark 0.9.1/docs is having a error

2014-07-25 Thread Yu Ishikawa (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2547?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074278#comment-14074278 ] Yu Ishikawa commented on SPARK-2547: https://github.com/apache/spark/pull/1590 > The

[jira] [Commented] (SPARK-2687) after receving allocated containers,amClient should remove ContainerRequest.

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2687?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074234#comment-14074234 ] Apache Spark commented on SPARK-2687: - User 'lianhuiwang' has created a pull request f

[jira] [Created] (SPARK-2687) after receving allocated containers,amClient should remove ContainerRequest.

2014-07-25 Thread Lianhui Wang (JIRA)
Lianhui Wang created SPARK-2687: --- Summary: after receving allocated containers,amClient should remove ContainerRequest. Key: SPARK-2687 URL: https://issues.apache.org/jira/browse/SPARK-2687 Project: Spa

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-25 Thread Apache Spark (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074196#comment-14074196 ] Apache Spark commented on SPARK-2620: - User 'ash211' has created a pull request for th

[jira] [Commented] (SPARK-2620) case class cannot be used as key for reduce

2014-07-25 Thread Andrew Ash (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074195#comment-14074195 ] Andrew Ash commented on SPARK-2620: --- I attempted to write a unit test to demonstrate thi

[jira] [Resolved] (SPARK-2529) Clean the closure in foreach and foreachPartition

2014-07-25 Thread Reynold Xin (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2529?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Reynold Xin resolved SPARK-2529. Resolution: Fixed Fix Version/s: 1.0.2 1.1.0 Assignee: Reynold Xi

[jira] [Commented] (SPARK-2685) Update ExternalAppendOnlyMap to avoid buffer.remove()

2014-07-25 Thread Mridul Muralidharan (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2685?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074186#comment-14074186 ] Mridul Muralidharan commented on SPARK-2685: We moved to using java.util.Linke

[jira] [Commented] (SPARK-2681) Spark can hang when fetching shuffle blocks

2014-07-25 Thread Guoqiang Li (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2681?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14074182#comment-14074182 ] Guoqiang Li commented on SPARK-2681: It seems a deadlock occurred? {noformat}

[jira] [Resolved] (SPARK-2657) Use more compact data structures than ArrayBuffer in groupBy and cogroup

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2657?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2657. -- Resolution: Fixed > Use more compact data structures than ArrayBuffer in groupBy and cogroup >

[jira] [Resolved] (SPARK-2574) Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-2574?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia resolved SPARK-2574. -- Resolution: Fixed > Avoid allocating new ArrayBuffer in groupByKey's mergeCombiner > --

[jira] [Updated] (SPARK-993) Don't reuse Writable objects in HadoopRDDs by default

2014-07-25 Thread Matei Zaharia (JIRA)
[ https://issues.apache.org/jira/browse/SPARK-993?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ] Matei Zaharia updated SPARK-993: Summary: Don't reuse Writable objects in HadoopRDDs by default (was: Don't reuse Writable objects i