spark git commit: [SPARK-11195][CORE] Use correct classloader for TaskResultGetter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.5 4b8dc2556 -> f802b07ab [SPARK-11195][CORE] Use correct classloader for TaskResultGetter Make sure we are using the context classloader when deserializing failed TaskResults instead of the Spark classloader. The issue is that
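The idea behind this fix can be sketched as follows. This is a minimal illustration, not Spark's actual code: when deserializing a failed TaskResult, prefer the thread's context classloader (which can see user-added jars) over the defining classloader (which only sees Spark's own classes). The class and method names are this sketch's own.

```java
// Minimal sketch of the classloader preference behind SPARK-11195
// (LoaderUtil is a hypothetical helper, not a Spark class).
public class LoaderUtil {
    public static ClassLoader preferredClassLoader() {
        // The context classloader may carry user-added jars (e.g. via --jars);
        // fall back to this class's own loader only when it is unset.
        ClassLoader ctx = Thread.currentThread().getContextClassLoader();
        return ctx != null ? ctx : LoaderUtil.class.getClassLoader();
    }
}
```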

spark git commit: [SPARK-11773][SPARKR] Implement collection functions in SparkR.

2015-11-18 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 99c2e86de -> 8ac518c13 [SPARK-11773][SPARKR] Implement collection functions in SparkR. Author: Sun Rui Closes #9764 from sun-rui/SPARK-11773. (cherry picked from commit 224723e6a8b198ef45d6c5ca5d2f9c61188ada8f)

spark git commit: [SPARK-11773][SPARKR] Implement collection functions in SparkR.

2015-11-18 Thread shivaram
Repository: spark Updated Branches: refs/heads/master a97d6f3a5 -> 224723e6a [SPARK-11773][SPARKR] Implement collection functions in SparkR. Author: Sun Rui Closes #9764 from sun-rui/SPARK-11773. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11195][CORE] Use correct classloader for TaskResultGetter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 224723e6a -> 3cca5ffb3 [SPARK-11195][CORE] Use correct classloader for TaskResultGetter Make sure we are using the context classloader when deserializing failed TaskResults instead of the Spark classloader. The issue is that

spark git commit: [SPARK-11195][CORE] Use correct classloader for TaskResultGetter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 8ac518c13 -> a72abebcb [SPARK-11195][CORE] Use correct classloader for TaskResultGetter Make sure we are using the context classloader when deserializing failed TaskResults instead of the Spark classloader. The issue is that

spark git commit: [SPARK-11281][SPARKR] Add tests covering the issue.

2015-11-18 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 3bd13ee9d -> 99c2e86de [SPARK-11281][SPARKR] Add tests covering the issue. The goal of this PR is to add tests covering the issue to ensure that is was resolved by [SPARK-11086](https://issues.apache.org/jira/browse/SPARK-11086).

spark git commit: [SPARK-11804] [PYSPARK] Exception raise when using Jdbc predicates opt…

2015-11-18 Thread davies
Repository: spark Updated Branches: refs/heads/branch-1.6 48bfe3f89 -> 3bd13ee9d [SPARK-11804] [PYSPARK] Exception raise when using Jdbc predicates opt… …ion in PySpark Author: Jeff Zhang Closes #9791 from zjffdu/SPARK-11804. (cherry picked from commit

spark git commit: [SPARK-11804] [PYSPARK] Exception raise when using Jdbc predicates opt…

2015-11-18 Thread davies
Repository: spark Updated Branches: refs/heads/master 1429e0a2b -> 3a6807fdf [SPARK-11804] [PYSPARK] Exception raise when using Jdbc predicates opt… …ion in PySpark Author: Jeff Zhang Closes #9791 from zjffdu/SPARK-11804. Project:

spark git commit: [SPARK-11795][SQL] combine grouping attributes into a single NamedExpression

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 33b837333 -> dbf428c87 [SPARK-11795][SQL] combine grouping attributes into a single NamedExpression we use `ExpressionEncoder.tuple` to build the result encoder, which assumes the input encoder should point to a struct type field if

spark git commit: [SPARK-11803][SQL] fix Dataset self-join

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 a72abebcb -> ad2ebe4db [SPARK-11803][SQL] fix Dataset self-join When we resolve the join operator, we may change the output of right side if self-join is detected. So in `Dataset.joinWith`, we should resolve the join operator first,

spark git commit: [SPARK-11725][SQL] correctly handle null inputs for UDF

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master cffb899c4 -> 33b837333 [SPARK-11725][SQL] correctly handle null inputs for UDF If a user uses primitive parameters in a UDF, there is no way to do the null-check for primitive inputs, so we are assuming the primitive input is
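The mechanism can be sketched like this. This is a hypothetical illustration of the idea, not Spark's internal code: a UDF taking a primitive `int` can never observe null, so the engine should short-circuit a null input to a null output instead of passing some default value into the UDF.

```java
import java.util.function.Function;

// Hypothetical sketch of the null-handling idea in SPARK-11725.
public class NullSafeUdf {
    // Wrap a UDF so that null inputs produce null without invoking it.
    public static Function<Integer, Integer> nullSafe(Function<Integer, Integer> f) {
        return x -> x == null ? null : f.apply(x); // propagate null, skip the UDF
    }
}
```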

spark git commit: [SPARK-11725][SQL] correctly handle null inputs for UDF

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 ad2ebe4db -> 285792b6c [SPARK-11725][SQL] correctly handle null inputs for UDF If user use primitive parameters in UDF, there is no way for him to do the null-check for primitive inputs, so we are assuming the primitive input is

spark git commit: [SPARK-11803][SQL] fix Dataset self-join

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 3cca5ffb3 -> cffb899c4 [SPARK-11803][SQL] fix Dataset self-join When we resolve the join operator, we may change the output of right side if self-join is detected. So in `Dataset.joinWith`, we should resolve the join operator first, and

spark git commit: [MINOR][BUILD] Ignore ensime cache

2015-11-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/branch-1.6 7e90893e9 -> 656e1cc50 [MINOR][BUILD] Ignore ensime cache Using ENSIME, I often have `.ensime_cache` polluting my source tree. This PR simply adds the cache directory to `.gitignore` Author: Jakob Odersky Closes

spark git commit: [SPARK-11792] [SQL] [FOLLOW-UP] Change SizeEstimation to KnownSizeEstimation and make estimatedSize return Long instead of Option[Long]

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 90a7519da -> 6f99522d1 [SPARK-11792] [SQL] [FOLLOW-UP] Change SizeEstimation to KnownSizeEstimation and make estimatedSize return Long instead of Option[Long] https://issues.apache.org/jira/browse/SPARK-11792 The main changes include: *
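The interface change can be sketched as follows. This is an illustrative stand-in, not Spark's actual SizeEstimator: objects that know their own size expose `estimatedSize()` returning a plain `long`, replacing the earlier `Option[Long]`; the `estimate` helper and its fallback here are this sketch's own.

```java
// Hedged sketch of the KnownSizeEstimation shape from the follow-up.
public class SizeSketch {
    interface KnownSizeEstimation { long estimatedSize(); }

    static long estimate(Object obj, long fallback) {
        if (obj instanceof KnownSizeEstimation) {
            // Trust the object's own figure instead of walking its graph.
            return ((KnownSizeEstimation) obj).estimatedSize();
        }
        return fallback; // otherwise use a generic reflective estimate
    }
}
```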

spark git commit: [SPARK-4557][STREAMING] Spark Streaming foreachRDD Java API method should accept a VoidFunction<...>

2015-11-18 Thread tdas
Repository: spark Updated Branches: refs/heads/master 94624eacb -> 31921e0f0 [SPARK-4557][STREAMING] Spark Streaming foreachRDD Java API method should accept a VoidFunction<...> Currently streaming foreachRDD Java API uses a function prototype requiring a return value of null. This PR
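The shape of this API change can be sketched with simplified stand-ins for `org.apache.spark.api.java.function.Function` and `VoidFunction` (the interface names below are this sketch's own): the old signature forced a dummy `return null;`, while the new one returns void.

```java
// Simplified stand-ins for the interfaces involved in SPARK-4557.
public class ForeachSketch {
    interface OldFn<T> { Void call(T t); } // body had to end with `return null;`
    interface NewFn<T> { void call(T t); } // body is a plain side effect

    static int last = 0;

    // With the new shape, no dummy return value is needed.
    static final NewFn<Integer> record = x -> { last = x; };
}
```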

spark git commit: [SPARK-4557][STREAMING] Spark Streaming foreachRDD Java API method should accept a VoidFunction<...>

2015-11-18 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.6 899106cc6 -> c130b8626 [SPARK-4557][STREAMING] Spark Streaming foreachRDD Java API method should accept a VoidFunction<...> Currently streaming foreachRDD Java API uses a function prototype requiring a return value of null. This PR

spark git commit: [MINOR][BUILD] Ignore ensime cache

2015-11-18 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master dbf428c87 -> 90a7519da [MINOR][BUILD] Ignore ensime cache Using ENSIME, I often have `.ensime_cache` polluting my source tree. This PR simply adds the cache directory to `.gitignore` Author: Jakob Odersky Closes

spark git commit: [SPARK-11792] [SQL] [FOLLOW-UP] Change SizeEstimation to KnownSizeEstimation and make estimatedSize return Long instead of Option[Long]

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 656e1cc50 -> 26c17a515 [SPARK-11792] [SQL] [FOLLOW-UP] Change SizeEstimation to KnownSizeEstimation and make estimatedSize return Long instead of Option[Long] https://issues.apache.org/jira/browse/SPARK-11792 The main changes

spark git commit: [SPARK-11739][SQL] clear the instantiated SQLContext

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 26c17a515 -> 899106cc6 [SPARK-11739][SQL] clear the instantiated SQLContext Currently, if the first SQLContext is not removed after stopping SparkContext, a SQLContext could sit there forever. This patch makes this more robust.

spark git commit: [SPARK-11739][SQL] clear the instantiated SQLContext

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6f99522d1 -> 94624eacb [SPARK-11739][SQL] clear the instantiated SQLContext Currently, if the first SQLContext is not removed after stopping SparkContext, a SQLContext could sit there forever. This patch makes this more robust. Author:

spark git commit: [SPARK-11809] Switch the default Mesos mode to coarse-grained mode

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 31921e0f0 -> a416e41e2 [SPARK-11809] Switch the default Mesos mode to coarse-grained mode Based on my conversations with people, I believe the consensus is that the coarse-grained mode is more stable and easier to reason about. It is best

spark git commit: [SPARK-10745][CORE] Separate configs between shuffle and RPC

2015-11-18 Thread vanzin
Repository: spark Updated Branches: refs/heads/master a416e41e2 -> 7c5b64180 [SPARK-10745][CORE] Separate configs between shuffle and RPC [SPARK-6028](https://issues.apache.org/jira/browse/SPARK-6028) uses network module to implement RPC. However, there are some configurations named with

spark git commit: [SPARK-11809] Switch the default Mesos mode to coarse-grained mode

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 c130b8626 -> ad1445561 [SPARK-11809] Switch the default Mesos mode to coarse-grained mode Based on my conversations with people, I believe the consensus is that the coarse-grained mode is more stable and easier to reason about. It is

spark git commit: [SPARK-10745][CORE] Separate configs between shuffle and RPC

2015-11-18 Thread vanzin
Repository: spark Updated Branches: refs/heads/branch-1.6 ad1445561 -> 34a776798 [SPARK-10745][CORE] Separate configs between shuffle and RPC [SPARK-6028](https://issues.apache.org/jira/browse/SPARK-6028) uses network module to implement RPC. However, there are some configurations named with

spark git commit: rmse was wrongly calculated

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 34ded83ed -> 48bfe3f89 rmse was wrongly calculated It was multiplying with U instead of dividing by U Author: Viveka Kulharia Closes #9771 from vivkul/patch-1. (cherry picked from commit
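The corrected computation can be sketched as below: sum the squared errors, divide by the number of samples U, then take the root (the bug multiplied by U instead of dividing). This is a standalone sketch of the formula, not the ALS example's actual code.

```java
// Hedged sketch of a correct RMSE: sqrt( sum((p_i - a_i)^2) / U ).
public class Rmse {
    public static double rmse(double[] predicted, double[] actual) {
        double sumSq = 0.0;
        for (int i = 0; i < predicted.length; i++) {
            double d = predicted[i] - actual[i];
            sumSq += d * d;
        }
        return Math.sqrt(sumSq / predicted.length); // divide by U, not multiply
    }
}
```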

spark git commit: rmse was wrongly calculated

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/master 9631ca352 -> 1429e0a2b rmse was wrongly calculated It was multiplying with U instead of dividing by U Author: Viveka Kulharia Closes #9771 from vivkul/patch-1. Project:

spark git commit: rmse was wrongly calculated

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.5 0ed6d9cf3 -> 4b8dc2556 rmse was wrongly calculated It was multiplying with U instead of dividing by U Author: Viveka Kulharia Closes #9771 from vivkul/patch-1. (cherry picked from commit

spark git commit: rmse was wrongly calculated

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.4 073c89f06 -> e12fbd80c rmse was wrongly calculated It was multiplying with U instead of dividing by U Author: Viveka Kulharia Closes #9771 from vivkul/patch-1. (cherry picked from commit

spark git commit: [SPARK-10946][SQL] JDBC - Use Statement.executeUpdate instead of PreparedStatement.executeUpdate for DDLs

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 0eb82133f -> 5da7d4130 [SPARK-10946][SQL] JDBC - Use Statement.executeUpdate instead of PreparedStatement.executeUpdate for DDLs New changes with JDBCRDD Author: somideshmukh Closes #9733 from

spark git commit: [SPARK-11792][SQL] SizeEstimator cannot provide a good size estimation of UnsafeHashedRelations

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 8ac8374a6 -> 0eb82133f [SPARK-11792][SQL] SizeEstimator cannot provide a good size estimation of UnsafeHashedRelations https://issues.apache.org/jira/browse/SPARK-11792 Right now, SizeEstimator will "think" a small

spark git commit: [SPARK-11792][SQL] SizeEstimator cannot provide a good size estimation of UnsafeHashedRelations

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5e2b44474 -> 1714350bd [SPARK-11792][SQL] SizeEstimator cannot provide a good size estimation of UnsafeHashedRelations https://issues.apache.org/jira/browse/SPARK-11792 Right now, SizeEstimator will "think" a small UnsafeHashedRelation

spark git commit: [SPARK-10946][SQL] JDBC - Use Statement.executeUpdate instead of PreparedStatement.executeUpdate for DDLs

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/master 1714350bd -> b8f4379ba [SPARK-10946][SQL] JDBC - Use Statement.executeUpdate instead of PreparedStatement.executeUpdate for DDLs New changes with JDBCRDD Author: somideshmukh Closes #9733 from

spark git commit: [SPARK-6541] Sort executors by ID (numeric)

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 5da7d4130 -> 04938d929 [SPARK-6541] Sort executors by ID (numeric) "Force" the executor ID sort with Int. Author: Jean-Baptiste Onofré Closes #9165 from jbonofre/SPARK-6541. (cherry picked from commit
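The problem being fixed can be sketched simply: executor IDs are strings, so a lexicographic sort puts "10" before "2"; forcing a numeric sort restores the expected order. The helper below is illustrative, not the patch's code; placing non-numeric IDs such as "driver" first is a choice of this sketch.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Sketch of sorting executor IDs numerically (SPARK-6541's idea).
public class ExecutorSort {
    public static List<String> sortIds(List<String> ids) {
        List<String> out = new ArrayList<>(ids);
        out.sort(Comparator.comparingInt(id -> {
            try { return Integer.parseInt(id); }               // numeric IDs in order
            catch (NumberFormatException e) { return Integer.MIN_VALUE; } // e.g. "driver" first
        }));
        return out;
    }
}
```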

spark git commit: [SPARK-11652][CORE] Remote code execution with InvokerTransformer

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.6 04938d929 -> 34ded83ed [SPARK-11652][CORE] Remote code execution with InvokerTransformer Update to Commons Collections 3.2.2 to avoid any potential remote code execution vulnerability Author: Sean Owen Closes

spark git commit: [SPARK-11652][CORE] Remote code execution with InvokerTransformer

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/master e62820c85 -> 9631ca352 [SPARK-11652][CORE] Remote code execution with InvokerTransformer Update to Commons Collections 3.2.2 to avoid any potential remote code execution vulnerability Author: Sean Owen Closes #9731

spark git commit: [SPARK-11652][CORE] Remote code execution with InvokerTransformer

2015-11-18 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.5 f7a7230f3 -> 0ed6d9cf3 [SPARK-11652][CORE] Remote code execution with InvokerTransformer Update to Commons Collections 3.2.2 to avoid any potential remote code execution vulnerability Author: Sean Owen Closes

[1/3] spark git commit: [EXAMPLE][MINOR] Add missing awaitTermination in click stream example

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 e8390e1ab -> 92496b56d [EXAMPLE][MINOR] Add missing awaitTermination in click stream example Author: jerryshao Closes #9730 from jerryshao/clickstream-fix. (cherry picked from commit

[3/3] spark git commit: [SPARK-11692][SQL] Support for Parquet logical types, JSON and BSON (embedded types)

2015-11-18 Thread marmbrus
[SPARK-11692][SQL] Support for Parquet logical types, JSON and BSON (embedded types) Parquet supports some JSON and BSON datatypes. They are represented as binary for BSON and string (UTF-8) for JSON internally. I searched a bit and found Apache drill also supports both in this way,

spark git commit: [SPARK-11787][SQL] Improve Parquet scan performance when using flat schemas.

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 f383c5a55 -> cfdd8a1a3 [SPARK-11787][SQL] Improve Parquet scan performance when using flat schemas. This patch adds an alternate to the Parquet RecordReader from the parquet-mr project that is much faster for flat schemas. Instead of

spark git commit: [SPARK-11787][SQL] Improve Parquet scan performance when using flat schemas.

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master e61367b9f -> 6d0848b53 [SPARK-11787][SQL] Improve Parquet scan performance when using flat schemas. This patch adds an alternate to the Parquet RecordReader from the parquet-mr project that is much faster for flat schemas. Instead of

spark git commit: [SPARK-11839][ML] refactor save/write traits

2015-11-18 Thread jkbradley
Repository: spark Updated Branches: refs/heads/branch-1.6 08e819cb8 -> b0954b532 [SPARK-11839][ML] refactor save/write traits * add "ML" prefix to reader/writer/readable/writable to avoid name collision with java.util.* * define `DefaultParamsReadable/Writable` and use them to save some code

spark git commit: [SPARK-11839][ML] refactor save/write traits

2015-11-18 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 59a501359 -> e99d33920 [SPARK-11839][ML] refactor save/write traits * add "ML" prefix to reader/writer/readable/writable to avoid name collision with java.util.* * define `DefaultParamsReadable/Writable` and use them to save some code *

spark git commit: [SPARK-11833][SQL] Add Java tests for Kryo/Java Dataset encoders

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 b0954b532 -> f383c5a55 [SPARK-11833][SQL] Add Java tests for Kryo/Java Dataset encoders Also added some nicer error messages for incompatible types (private types and primitive types) for Kryo/Java encoder. Author: Reynold Xin

spark git commit: Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter"

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 6d0848b53 -> 9c0654d36 Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter" This reverts commit 54db79702513e11335c33bcf3a03c59e965e6f16. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter"

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 cfdd8a1a3 -> 19f4f26f3 Revert "[SPARK-11544][SQL] sqlContext doesn't use PathFilter" This reverts commit 54db79702513e11335c33bcf3a03c59e965e6f16. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit:

spark git commit: [SPARK-11816][ML] fix some style issue in ML/MLlib examples

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 19f4f26f3 -> 4b4a6bf5c [SPARK-11816][ML] fix some style issue in ML/MLlib examples jira: https://issues.apache.org/jira/browse/SPARK-11816 Currently I only fixed some obvious comment issues like // scalastyle:off println at the bottom.

spark git commit: [SPARK-11833][SQL] Add Java tests for Kryo/Java Dataset encoders

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master e99d33920 -> e61367b9f [SPARK-11833][SQL] Add Java tests for Kryo/Java Dataset encoders Also added some nicer error messages for incompatible types (private types and primitive types) for Kryo/Java encoder. Author: Reynold Xin

spark git commit: [SPARK-11636][SQL] Support classes defined in the REPL with Encoders

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 3cc188915 -> 08e819cb8 [SPARK-11636][SQL] Support classes defined in the REPL with Encoders Before this PR there were two things that would blow up if you called `df.as[MyClass]` if `MyClass` was defined in the REPL: - [x] Because

spark git commit: [SPARK-11636][SQL] Support classes defined in the REPL with Encoders

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 921900fd0 -> 59a501359 [SPARK-11636][SQL] Support classes defined in the REPL with Encoders Before this PR there were two things that would blow up if you called `df.as[MyClass]` if `MyClass` was defined in the REPL: - [x] Because

spark git commit: [SPARK-11649] Properly set Akka frame size in SparkListenerSuite test

2015-11-18 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.5 0439e32e2 -> 9957925e4 [SPARK-11649] Properly set Akka frame size in SparkListenerSuite test SparkListenerSuite's _"onTaskGettingResult() called when result fetched remotely"_ test was extremely slow (1 to 4 minutes to run) and

spark git commit: [SPARK-11816][ML] fix some style issue in ML/MLlib examples

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 9c0654d36 -> 67c75828f [SPARK-11816][ML] fix some style issue in ML/MLlib examples jira: https://issues.apache.org/jira/browse/SPARK-11816 Currently I only fixed some obvious comment issues like // scalastyle:off println at the bottom.

spark git commit: [SPARK-6787][ML] add read/write to estimators under ml.feature (1)

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 d9945bc46 -> dc1e23744 [SPARK-6787][ML] add read/write to estimators under ml.feature (1) Add read/write support to the following estimators under spark.ml: * CountVectorizer * IDF * MinMaxScaler * StandardScaler (a little awkward

spark git commit: [SPARK-6787][ML] add read/write to estimators under ml.feature (1)

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 5df08949f -> 7e987de17 [SPARK-6787][ML] add read/write to estimators under ml.feature (1) Add read/write support to the following estimators under spark.ml: * CountVectorizer * IDF * MinMaxScaler * StandardScaler (a little awkward because

[2/3] spark git commit: [SPARK-11717] Ignore R session and history files from git

2015-11-18 Thread marmbrus
[SPARK-11717] Ignore R session and history files from git see: https://issues.apache.org/jira/browse/SPARK-11717 SparkR generates R session data and history files under current directory. It might be useful to ignore these files even running SparkR on spark directory for test or development.

spark git commit: [SPARK-11544][SQL] sqlContext doesn't use PathFilter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 603a721c2 -> 54db79702 [SPARK-11544][SQL] sqlContext doesn't use PathFilter Apply the user supplied pathfilter while retrieving the files from fs. Author: Dilip Biswal Closes #9652 from dilipbiswal/spark-11544.

spark git commit: [SPARK-11544][SQL] sqlContext doesn't use PathFilter

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 464b2d421 -> e8390e1ab [SPARK-11544][SQL] sqlContext doesn't use PathFilter Apply the user supplied pathfilter while retrieving the files from fs. Author: Dilip Biswal Closes #9652 from dilipbiswal/spark-11544.

spark git commit: [SPARK-11791] Fix flaky test in BatchedWriteAheadLogSuite

2015-11-18 Thread tdas
Repository: spark Updated Branches: refs/heads/master a402c92c9 -> 921900fd0 [SPARK-11791] Fix flaky test in BatchedWriteAheadLogSuite stack trace of failure: ``` org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 62

spark git commit: [SPARK-11791] Fix flaky test in BatchedWriteAheadLogSuite

2015-11-18 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.6 6d6fe8085 -> 3cc188915 [SPARK-11791] Fix flaky test in BatchedWriteAheadLogSuite stack trace of failure: ``` org.scalatest.exceptions.TestFailedDueToTimeoutException: The code passed to eventually never returned normally. Attempted 62

spark git commit: [SPARK-10930] History "Stages" page "duration" can be confusing

2015-11-18 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 3a9851936 -> c07a50b86 [SPARK-10930] History "Stages" page "duration" can be confusing Author: Derek Dagit Closes #9051 from d2r/spark-10930-ui-max-task-dur. Project:

spark git commit: [SPARK-11649] Properly set Akka frame size in SparkListenerSuite test

2015-11-18 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.6 dc1e23744 -> 226de55ba [SPARK-11649] Properly set Akka frame size in SparkListenerSuite test SparkListenerSuite's _"onTaskGettingResult() called when result fetched remotely"_ test was extremely slow (1 to 4 minutes to run) and

spark git commit: [SPARK-11649] Properly set Akka frame size in SparkListenerSuite test

2015-11-18 Thread andrewor14
Repository: spark Updated Branches: refs/heads/master 7e987de17 -> 3a9851936 [SPARK-11649] Properly set Akka frame size in SparkListenerSuite test SparkListenerSuite's _"onTaskGettingResult() called when result fetched remotely"_ test was extremely slow (1 to 4 minutes to run) and recently

spark git commit: [SPARK-10930] History "Stages" page "duration" can be confusing

2015-11-18 Thread andrewor14
Repository: spark Updated Branches: refs/heads/branch-1.6 226de55ba -> 7b13003d4 [SPARK-10930] History "Stages" page "duration" can be confusing Author: Derek Dagit Closes #9051 from d2r/spark-10930-ui-max-task-dur. (cherry picked from commit

spark git commit: [SPARK-11495] Fix potential socket / file handle leaks that were found via static analysis

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master c07a50b86 -> 4b1171219 [SPARK-11495] Fix potential socket / file handle leaks that were found via static analysis The HP Fortify Open Source Review team (https://www.hpfod.com/open-source-review-project) reported a handful of potential

spark git commit: [SPARK-11495] Fix potential socket / file handle leaks that were found via static analysis

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 7b13003d4 -> 6913cfe4b [SPARK-11495] Fix potential socket / file handle leaks that were found via static analysis The HP Fortify Open Source Review team (https://www.hpfod.com/open-source-review-project) reported a handful of

spark git commit: [SPARK-11814][STREAMING] Add better default checkpoint duration

2015-11-18 Thread tdas
Repository: spark Updated Branches: refs/heads/branch-1.6 6913cfe4b -> 6d6fe8085 [SPARK-11814][STREAMING] Add better default checkpoint duration DStream checkpoint interval is by default set at max(10 second, batch interval). That's bad for large batch intervals where the checkpoint interval

spark git commit: [SPARK-11814][STREAMING] Add better default checkpoint duration

2015-11-18 Thread tdas
Repository: spark Updated Branches: refs/heads/master 4b1171219 -> a402c92c9 [SPARK-11814][STREAMING] Add better default checkpoint duration DStream checkpoint interval is by default set at max(10 second, batch interval). That's bad for large batch intervals where the checkpoint interval =
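One plausible "better default" matching the description can be sketched as follows: pick the smallest multiple of the batch interval that is at least 10 seconds, so the checkpoint interval always aligns with batch boundaries. The exact rule and constants here are assumptions of this sketch, not necessarily what the patch implements.

```java
// Hedged sketch of a batch-aligned default checkpoint interval.
public class CheckpointDefault {
    public static long defaultCheckpointMs(long batchMs) {
        long floorMs = 10_000L; // assumed 10-second floor from the description
        // Smallest multiple of batchMs that is >= floorMs (integer ceiling).
        long multiples = Math.max(1L, (floorMs + batchMs - 1) / batchMs);
        return multiples * batchMs;
    }
}
```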

spark git commit: [SPARK-11810][SQL] Java-based encoder for opaque types in Datasets.

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/master 54db79702 -> 5df08949f [SPARK-11810][SQL] Java-based encoder for opaque types in Datasets. This patch refactors the existing Kryo encoder expressions and adds support for Java serialization. Author: Reynold Xin

spark git commit: [SPARK-11810][SQL] Java-based encoder for opaque types in Datasets.

2015-11-18 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.6 f1aaf8c43 -> d9945bc46 [SPARK-11810][SQL] Java-based encoder for opaque types in Datasets. This patch refactors the existing Kryo encoder expressions and adds support for Java serialization. Author: Reynold Xin

spark git commit: [HOTFIX] Build break from backporting SPARK-11692

2015-11-18 Thread marmbrus
Repository: spark Updated Branches: refs/heads/branch-1.6 92496b56d -> f1aaf8c43 [HOTFIX] Build break from backporting SPARK-11692 Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/f1aaf8c4 Tree:

spark git commit: [SPARK-11614][SQL] serde parameters should be set only when all params are ready

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/branch-1.6 4b4a6bf5c -> 59eaec2d4 [SPARK-11614][SQL] serde parameters should be set only when all params are ready see HIVE-7975 and HIVE-12373 With changed semantic of setters in thrift objects in hive, setter should be called only after all

spark git commit: [SPARK-11614][SQL] serde parameters should be set only when all params are ready

2015-11-18 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 67c75828f -> fc3f77b42 [SPARK-11614][SQL] serde parameters should be set only when all params are ready see HIVE-7975 and HIVE-12373 With changed semantic of setters in thrift objects in hive, setter should be called only after all

svn commit: r1715102 [1/3] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2015-11-18 Thread pwendell
Author: pwendell Date: Thu Nov 19 06:28:44 2015 New Revision: 1715102 URL: http://svn.apache.org/viewvc?rev=1715102&view=rev Log: Adding SSE 2016 CFP close notice Added: spark/news/_posts/2015-11-19-spark-summit-east-2016-cfp-closing.md spark/site/news/spark-summit-east-2016-cfp-closing.html

svn commit: r1715102 [2/3] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2015-11-18 Thread pwendell
Modified: spark/site/news/spark-mailing-lists-moving-to-apache.html URL: http://svn.apache.org/viewvc/spark/site/news/spark-mailing-lists-moving-to-apache.html?rev=1715102&r1=1715101&r2=1715102&view=diff == ---

svn commit: r1715102 [3/3] - in /spark: news/_posts/ site/ site/graphx/ site/mllib/ site/news/ site/releases/ site/screencasts/ site/sql/ site/streaming/

2015-11-18 Thread pwendell
Modified: spark/site/releases/spark-release-1-3-0.html URL: http://svn.apache.org/viewvc/spark/site/releases/spark-release-1-3-0.html?rev=1715102&r1=1715101&r2=1715102&view=diff == --- spark/site/releases/spark-release-1-3-0.html

spark git commit: [SPARK-11339][SPARKR] Document the list of functions in R base package that are masked by functions with same name in SparkR

2015-11-18 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.6 6731dd668 -> eb1ba1e2e [SPARK-11339][SPARKR] Document the list of functions in R base package that are masked by functions with same name in SparkR Added tests for functions that are reported as masked, to make sure the base:: or

spark git commit: [SPARK-11339][SPARKR] Document the list of functions in R base package that are masked by functions with same name in SparkR

2015-11-18 Thread shivaram
Repository: spark Updated Branches: refs/heads/master d02d5b929 -> 1a93323c5 [SPARK-11339][SPARKR] Document the list of functions in R base package that are masked by functions with same name in SparkR Added tests for functions that are reported as masked, to make sure the base:: or stats::

spark git commit: [SPARK-11720][SQL][ML] Handle edge cases when count = 0 or 1 for Stats function

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 7c5b64180 -> 09ad9533d [SPARK-11720][SQL][ML] Handle edge cases when count = 0 or 1 for Stats function return Double.NaN for mean/average when count == 0 for all numeric types that is converted to Double, Decimal type continue to return

spark git commit: [SPARK-6790][ML] Add spark.ml LinearRegression import/export

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 39c8a995d -> bcc6813dd [SPARK-6790][ML] Add spark.ml LinearRegression import/export This replaces [https://github.com/apache/spark/pull/9656] with updates. fayeshine should be the main author when this PR is committed. CC: mengxr

spark git commit: [SPARK-6790][ML] Add spark.ml LinearRegression import/export

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 09ad9533d -> 045a4f045 [SPARK-6790][ML] Add spark.ml LinearRegression import/export This replaces [https://github.com/apache/spark/pull/9656] with updates. fayeshine should be the main author when this PR is committed. CC: mengxr

spark git commit: [SPARK-6789][ML] Add Readable, Writable support for spark.ml ALS, ALSModel

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 045a4f045 -> 2acdf10b1 [SPARK-6789][ML] Add Readable, Writable support for spark.ml ALS, ALSModel Also modifies DefaultParamsWriter.saveMetadata to take optional extra metadata. CC: mengxr yanboliang Author: Joseph K. Bradley

spark git commit: [SPARK-6789][ML] Add Readable, Writable support for spark.ml ALS, ALSModel

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 bcc6813dd -> 23b8c2256 [SPARK-6789][ML] Add Readable, Writable support for spark.ml ALS, ALSModel Also modifies DefaultParamsWriter.saveMetadata to take optional extra metadata. CC: mengxr yanboliang Author: Joseph K. Bradley

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master 2acdf10b1 -> e391abdf2 [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec has

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.4 e12fbd80c -> eda1ff4ee [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 23b8c2256 -> 18e308b84 [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.5 f802b07ab -> 0439e32e2 [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.3 1bfa00d54 -> 5278ef0f1 [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.2 4b6e24e25 -> 307f27e24 [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec

spark git commit: [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.1 19835ec1f -> 11ee9d191 [SPARK-11813][MLLIB] Avoid serialization of vocab in Word2Vec jira: https://issues.apache.org/jira/browse/SPARK-11813 I found the problem while training on a large corpus. Avoid serialization of vocab in Word2Vec
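The idea behind this fix — exclude a large but reconstructible vocabulary from serialization and rebuild it after deserialization (in the Scala code this would be done with a `@transient` field) — maps naturally onto Python's `__getstate__`/`__setstate__` hooks. A hypothetical sketch, not the MLlib code:

```python
import pickle

class Word2VecModel:
    """Sketch of excluding a large, derivable field from serialization."""

    def __init__(self, word_vectors):
        self.word_vectors = word_vectors   # the data that must be kept
        self.vocab = self._build_vocab()   # large index, derivable from word_vectors

    def _build_vocab(self):
        return {w: i for i, w in enumerate(sorted(self.word_vectors))}

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["vocab"]                 # do not serialize the vocab
        return state

    def __setstate__(self, state):
        self.__dict__.update(state)
        self.vocab = self._build_vocab()   # rebuild it after deserialization
```

Shipping only `word_vectors` and recomputing the index trades a little CPU on load for substantially smaller serialized tasks — the cost the reporter hit when training on a large corpus.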

spark git commit: [SPARK-11684][R][ML][DOC] Update SparkR glm API doc, user guide and example codes

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master e391abdf2 -> e222d7584 [SPARK-11684][R][ML][DOC] Update SparkR glm API doc, user guide and example codes This PR includes: * Update SparkR:::glm, SparkR:::summary API docs. * Update SparkR machine learning user guide and example codes to

spark git commit: [SPARK-11684][R][ML][DOC] Update SparkR glm API doc, user guide and example codes

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 18e308b84 -> 03c2d20dc [SPARK-11684][R][ML][DOC] Update SparkR glm API doc, user guide and example codes This PR includes: * Update SparkR:::glm, SparkR:::summary API docs. * Update SparkR machine learning user guide and example codes

spark git commit: [SPARK-11820][ML][PYSPARK] PySpark LiR & LoR should support weightCol

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/master e222d7584 -> 603a721c2 [SPARK-11820][ML][PYSPARK] PySpark LiR & LoR should support weightCol [SPARK-7685](https://issues.apache.org/jira/browse/SPARK-7685) and [SPARK-9642](https://issues.apache.org/jira/browse/SPARK-9642) have already

spark git commit: [SPARK-11820][ML][PYSPARK] PySpark LiR & LoR should support weightCol

2015-11-18 Thread meng
Repository: spark Updated Branches: refs/heads/branch-1.6 03c2d20dc -> 464b2d421 [SPARK-11820][ML][PYSPARK] PySpark LiR & LoR should support weightCol [SPARK-7685](https://issues.apache.org/jira/browse/SPARK-7685) and [SPARK-9642](https://issues.apache.org/jira/browse/SPARK-9642) have
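A weight column lets each row contribute proportionally to the loss. The closed-form weighted least-squares fit below is a toy Python illustration of what `weightCol` means for a linear model — it is not PySpark's API:

```python
def weighted_linear_fit(xs, ys, ws):
    """Closed-form weighted least squares for y = a*x + b; a toy stand-in
    for what a weight column contributes in linear regression."""
    sw = sum(ws)
    mx = sum(w * x for w, x in zip(ws, xs)) / sw   # weighted mean of x
    my = sum(w * y for w, y in zip(ws, ys)) / sw   # weighted mean of y
    cov = sum(w * (x - mx) * (y - my) for w, x, y in zip(ws, xs, ys))
    var = sum(w * (x - mx) ** 2 for w, x in zip(ws, xs))
    a = cov / var
    return a, my - a * mx
```

With all weights equal this reduces to ordinary least squares; raising one row's weight pulls the fitted line toward that row, which is the behavior the `weightCol` parameter exposes.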