spark git commit: [SPARK-9358][SQL] Code generation for UnsafeRow joiner.

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 712f5b7a9 - 03377d252 [SPARK-9358][SQL] Code generation for UnsafeRow joiner. This patch creates a code generated unsafe row concatenator that can be used to concatenate/join two UnsafeRows into a single UnsafeRow. Since it is inherently

spark git commit: [SPARK-9464][SQL] Property checks for UTF8String

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6996bd2e8 - 14f263448 [SPARK-9464][SQL] Property checks for UTF8String This PR is based on the original work by JoshRosen in #7780, which adds ScalaCheck property-based tests for UTF8String. Author: Josh Rosen joshro...@databricks.com

spark git commit: [SPARK-8264][SQL]add substring_index function

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 03377d252 - 6996bd2e8 [SPARK-8264][SQL]add substring_index function This PR is based on #7533 , thanks to zhichao-li Closes #7533 Author: zhichao.li zhichao...@intel.com Author: Davies Liu dav...@databricks.com Closes #7843 from
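For readers unfamiliar with `substring_index`, the following is a minimal pure-Python sketch of the semantics usually associated with the function of that name in MySQL and Hive. It is not the Spark implementation from this commit. The helper name, its signature, and the handling of an empty delimiter are assumptions made for illustration, and delimiters that overlap with themselves are not treated specially.

```python
def substring_index(s, delim, count):
    """Illustrative sketch only (not Spark's code).

    Returns the part of `s` before the `count`-th occurrence of `delim`
    when `count` is positive, or the part after the `count`-th occurrence
    counted from the right when `count` is negative.
    """
    # Assumption: an empty delimiter or a zero count yields an empty string.
    if not delim or count == 0:
        return ""
    parts = s.split(delim)
    if count > 0:
        # Keep the first `count` pieces, counted from the left.
        return delim.join(parts[:count])
    # Negative count: keep the last `abs(count)` pieces.
    return delim.join(parts[count:])


print(substring_index("www.apache.org", ".", 2))
print(substring_index("www.apache.org", ".", -2))
```

When `count` exceeds the number of delimiters present, this sketch returns the whole input string unchanged.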

spark git commit: [SPARK-9415][SQL] Throw AnalysisException when using MapType on Join and Aggregate

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 14f263448 - 3320b0ba2 [SPARK-9415][SQL] Throw AnalysisException when using MapType on Join and Aggregate JIRA: https://issues.apache.org/jira/browse/SPARK-9415 Following up #7787. We shouldn't use MapType as grouping keys and join keys

[1/2] spark git commit: [SPARK-9451] [SQL] Support entries larger than default page size in BytesToBytesMap integrate with ShuffleMemoryManager

2015-07-31 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master f51fd6fbb - 8cb415a4b http://git-wip-us.apache.org/repos/asf/spark/blob/8cb415a4/sql/core/src/test/scala/org/apache/spark/sql/execution/UnsafeFixedWidthAggregationMapSuite.scala

[2/2] spark git commit: [SPARK-9451] [SQL] Support entries larger than default page size in BytesToBytesMap integrate with ShuffleMemoryManager

2015-07-31 Thread joshrosen
[SPARK-9451] [SQL] Support entries larger than default page size in BytesToBytesMap integrate with ShuffleMemoryManager This patch adds support for entries larger than the default page size in BytesToBytesMap. These large rows are handled by allocating special overflow pages to hold

spark git commit: [SPARK-8936] [MLLIB] OnlineLDA document-topic Dirichlet hyperparameter optimization

2015-07-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 4d5a6e7b6 - f51fd6fbb [SPARK-8936] [MLLIB] OnlineLDA document-topic Dirichlet hyperparameter optimization Adds `alpha` (document-topic Dirichlet parameter) hyperparameter optimization to `OnlineLDAOptimizer` following Huang: Maximum

spark git commit: [SPARK-9318] [SPARK-9320] [SPARKR] Aliases for merge and summary functions on DataFrames

2015-07-31 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 8cb415a4b - 712f5b7a9 [SPARK-9318] [SPARK-9320] [SPARKR] Aliases for merge and summary functions on DataFrames This PR adds synonyms for `merge` and `summary` in the SparkR DataFrame API. cc shivaram Author: Hossein

spark git commit: [SPARK-9497] [SPARK-9509] [CORE] Use ask instead of askWithRetry

2015-07-31 Thread kayousterhout
Repository: spark Updated Branches: refs/heads/master fc0e57e5a - 04a49edfd [SPARK-9497] [SPARK-9509] [CORE] Use ask instead of askWithRetry `RpcEndpointRef.askWithRetry` throws `SparkException` rather than `TimeoutException`. Use ask to replace it because we don't need to retry here.

spark git commit: [SPARK-9053] [SPARKR] Fix spaces around parens, infix operators etc.

2015-07-31 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 6bba7509a - fc0e57e5a [SPARK-9053] [SPARKR] Fix spaces around parens, infix operators etc. ### JIRA [[SPARK-9053] Fix spaces around parens, infix operators etc. - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-9053) ### The Result

spark git commit: [SPARK-9500] add TernaryExpression to simplify ternary expressions

2015-07-31 Thread davies
Repository: spark Updated Branches: refs/heads/master a3a85d73d - 6bba7509a [SPARK-9500] add TernaryExpression to simplify ternary expressions There is a lot of duplicated code in ternary expressions; create a TernaryExpression for them to reduce it. cc chenghao-intel Author:

spark git commit: [SQL] address comments for to_date/trunc

2015-07-31 Thread davies
Repository: spark Updated Branches: refs/heads/master 27ae851ce - 0024da915 [SQL] address comments for to_date/trunc This PR addresses the comments in #7805 cc rxin Author: Davies Liu dav...@databricks.com Closes #7817 from davies/trunc and squashes the following commits: f729d5f [Davies

spark git commit: [SPARK-9446] Clear Active SparkContext in stop() method

2015-07-31 Thread srowen
Repository: spark Updated Branches: refs/heads/master 04a49edfd - 27ae851ce [SPARK-9446] Clear Active SparkContext in stop() method In thread 'stopped SparkContext remaining active' on mailing list, Andres observed the following in driver log: ``` 15/07/29 15:17:09 WARN

spark git commit: [SPARK-9446] Clear Active SparkContext in stop() method

2015-07-31 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.4 3d6a9214e - 5ad9f950c [SPARK-9446] Clear Active SparkContext in stop() method In thread 'stopped SparkContext remaining active' on mailing list, Andres observed the following in driver log: ``` 15/07/29 15:17:09 WARN

spark git commit: [SPARK-9056] [STREAMING] Rename configuration `spark.streaming.minRememberDuration` to `spark.streaming.fileStream.minRememberDuration`

2015-07-31 Thread tdas
Repository: spark Updated Branches: refs/heads/master 3c0d2e552 - 060c79aab [SPARK-9056] [STREAMING] Rename configuration `spark.streaming.minRememberDuration` to `spark.streaming.fileStream.minRememberDuration` Rename configuration `spark.streaming.minRememberDuration` to

spark git commit: [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated

2015-07-31 Thread srowen
Repository: spark Updated Branches: refs/heads/master 873ab0f96 - 6e5fd613e [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile. Fix

spark git commit: [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated

2015-07-31 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.4 5ad9f950c - b53ca247d [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile.

spark git commit: [SPARK-9202] capping maximum number of executor and driver information kept in Worker

2015-07-31 Thread srowen
Repository: spark Updated Branches: refs/heads/master a8340fa7d - c0686668a [SPARK-9202] capping maximum number of executor and driver information kept in Worker https://issues.apache.org/jira/browse/SPARK-9202 Author: CodingCat zhunans...@gmail.com Closes #7714 from CodingCat/SPARK-9202 and

spark git commit: [SPARK-9246] [MLLIB] DistributedLDAModel predict top docs per topic

2015-07-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master c0686668a - 3c0d2e552 [SPARK-9246] [MLLIB] DistributedLDAModel predict top docs per topic Add topDocumentsPerTopic to DistributedLDAModel. Add ScalaDoc and unit tests. Author: Meihua Wu meihu...@umich.edu Closes #7769 from

spark git commit: [SPARK-9490] [DOCS] [MLLIB] MLlib evaluation metrics guide example python code uses deprecated print statement

2015-07-31 Thread meng
Repository: spark Updated Branches: refs/heads/master 815c8245f - 873ab0f96 [SPARK-9490] [DOCS] [MLLIB] MLlib evaluation metrics guide example python code uses deprecated print statement Use print(x) not print x for Python 3 in eval examples CC sethah mengxr -- just wanted to close this out
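As a small illustration of the change described here: in Python 3, `print` is a function, so the statement form `print x` is a syntax error. The sketch below uses a hypothetical helper and a made-up metric value; neither is taken from the MLlib guide.

```python
def metric_line(name, value):
    # Hypothetical helper: formats a metric for display.
    return "%s = %s" % (name, value)


# Function-call form; the Python 2 statement form `print x` fails to parse in Python 3.
print(metric_line("Precision", 0.75))
```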

spark git commit: [SPARK-9466] [SQL] Increase two timeouts in CliSuite.

2015-07-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master fbef566a1 - 815c8245f [SPARK-9466] [SQL] Increase two timeouts in CliSuite. Hopefully this can resolve the flakiness of this suite. JIRA: https://issues.apache.org/jira/browse/SPARK-9466 Author: Yin Huai yh...@databricks.com Closes

spark git commit: [SPARK-9308] [ML] ml.NaiveBayesModel support predicting class probabilities

2015-07-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 060c79aab - fbef566a1 [SPARK-9308] [ML] ml.NaiveBayesModel support predicting class probabilities Make NaiveBayesModel support predicting class probabilities, inherit from ProbabilisticClassificationModel. Author: Yanbo Liang

spark git commit: [SPARK-9481] Add logLikelihood to LocalLDAModel

2015-07-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master d04634701 - a8340fa7d [SPARK-9481] Add logLikelihood to LocalLDAModel jkbradley Exposes `bound` (variational log likelihood bound) through public API as `logLikelihood`. Also adds unit tests, some DRYing of `LDASuite`, and includes unit

spark git commit: [SPARK-8640] [SQL] Enable Processing of Multiple Window Frames in a Single Window Operator

2015-07-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 0a1d2ca42 - 39ab199a3 [SPARK-8640] [SQL] Enable Processing of Multiple Window Frames in a Single Window Operator This PR enables the processing of multiple window frames in a single window operator. This should improve the performance of

spark git commit: [SPARK-9510] [SPARKR] Remaining SparkR style fixes

2015-07-31 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 6e5fd613e - 82f47b811 [SPARK-9510] [SPARKR] Remaining SparkR style fixes With the change in this patch, I get no more warnings from `./dev/lint-r` on my machine. Author: Shivaram Venkataraman shiva...@cs.berkeley.edu Closes #7834 from

spark git commit: [SPARK-9324] [SPARK-9322] [SPARK-9321] [SPARKR] Some aliases for R-like functions in DataFrames

2015-07-31 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 82f47b811 - 710c2b5dd [SPARK-9324] [SPARK-9322] [SPARK-9321] [SPARKR] Some aliases for R-like functions in DataFrames Adds following aliases: * unique (distinct) * rbind (unionAll): accepts many DataFrames * nrow (count) * ncol * dim *

spark git commit: [SPARK-9233] [SQL] Enable code-gen in window function unit tests

2015-07-31 Thread yhuai
Repository: spark Updated Branches: refs/heads/master 710c2b5dd - 3fc0cb920 [SPARK-9233] [SQL] Enable code-gen in window function unit tests Since code-gen is enabled by default, it is better to run window function tests with code-gen. https://issues.apache.org/jira/browse/SPARK-9233

spark git commit: [SPARK-9496][SQL]do not print the password in config

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 0244170b6 - a3a85d73d [SPARK-9496][SQL]do not print the password in config https://issues.apache.org/jira/browse/SPARK-9496 We had better not print the password in the log. Author: WangTaoTheTonic wangtao...@huawei.com Closes #7815 from

spark git commit: [SPARK-9214] [ML] [PySpark] support ml.NaiveBayes for Python

2015-07-31 Thread jkbradley
Repository: spark Updated Branches: refs/heads/master 4e5919bfb - 69b62f76f [SPARK-9214] [ML] [PySpark] support ml.NaiveBayes for Python support ml.NaiveBayes for Python Author: Yanbo Liang yblia...@gmail.com Closes #7568 from yanboliang/spark-9214 and squashes the following commits:

spark git commit: [SPARK-9152][SQL] Implement code generation for Like and RLike

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 69b62f76f - 0244170b6 [SPARK-9152][SQL] Implement code generation for Like and RLike JIRA: https://issues.apache.org/jira/browse/SPARK-9152 This PR implements code generation for `Like` and `RLike`. Author: Liang-Chi Hsieh

spark git commit: [SPARK-9496][SQL]do not print the password in config

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/branch-1.4 6e85064f4 - 3d6a9214e [SPARK-9496][SQL]do not print the password in config https://issues.apache.org/jira/browse/SPARK-9496 We had better not print the password in the log. Author: WangTaoTheTonic wangtao...@huawei.com Closes #7815 from

spark git commit: [SPARK-8271][SQL]string function: soundex

2015-07-31 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3fc0cb920 - 4d5a6e7b6 [SPARK-8271][SQL]string function: soundex This PR brings SQL function soundex(), see https://issues.apache.org/jira/browse/HIVE-9738 It's based on #7115 , thanks to HuJiayin Author: HuJiayin jiayin...@intel.com

spark git commit: [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated

2015-07-31 Thread srowen
Repository: spark Updated Branches: refs/heads/branch-1.3 f941482b0 - 047a61365 [SPARK-9507] [BUILD] Remove dependency reduced POM hack now that shade plugin is updated Update to shade plugin 2.4.1, which removes the need for the dependency-reduced-POM workaround and the 'release' profile.