Repository: spark
Updated Branches:
refs/heads/branch-1.4 e70be6987 -> b0e7c6633
[SPARK-7606] [SQL] [PySpark] add version to Python SQL API docs
Add version info for public Python SQL API.
cc rxin
Author: Davies Liu dav...@databricks.com
Closes #6295 from davies/versions and squashes the
Repository: spark
Updated Branches:
refs/heads/master 8ddcb25b3 -> 947ea1cf5
[SPARK-7753] [MLLIB] Update KernelDensity API
Update `KernelDensity` API to make it extensible to different kernels in the
future. `bandwidth` is used instead of `standardDeviation`. The static
`kernelDensity`
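The `bandwidth` parameter controls the width of the smoothing kernel. As a Spark-independent illustration of that role, here is a minimal Gaussian kernel density estimator (the function name and signature are illustrative, not the MLlib interface):

```python
import math

def gaussian_kde(samples, bandwidth, x):
    # Average of Gaussian bumps centered at each sample; `bandwidth` is
    # the kernel's standard deviation (the parameter the PR renames).
    norm = 1.0 / (math.sqrt(2.0 * math.pi) * bandwidth * len(samples))
    return norm * sum(math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples)
```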
Repository: spark
Updated Branches:
refs/heads/branch-1.4 b0e7c6633 -> 64762444e
[SPARK-7753] [MLLIB] Update KernelDensity API
Update `KernelDensity` API to make it extensible to different kernels in the
future. `bandwidth` is used instead of `standardDeviation`. The static
`kernelDensity`
Repository: spark
Updated Branches:
refs/heads/master 04940c497 -> 8ddcb25b3
[SPARK-7606] [SQL] [PySpark] add version to Python SQL API docs
Add version info for public Python SQL API.
cc rxin
Author: Davies Liu dav...@databricks.com
Closes #6295 from davies/versions and squashes the
Repository: spark
Updated Branches:
refs/heads/master 947ea1cf5 -> 1ee8eb431
[SPARK-7745] Change asserts to requires for user input checks in Spark Streaming
Assertions can be turned off. `require` throws an `IllegalArgumentException`
which makes more sense when it's a user-set variable.
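The same distinction exists in Python, where `assert` statements are stripped under `python -O`; an explicit exception keeps user-input validation always on. A minimal sketch of the pattern (the function name is made up for illustration):

```python
def set_batch_interval(seconds):
    # Like Scala's require(): always-on validation of a user-supplied
    # value, unlike assert, which can be disabled at run time.
    if seconds <= 0:
        raise ValueError(f"batch interval must be positive, got {seconds}")
    return seconds
```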
Repository: spark
Updated Branches:
refs/heads/branch-1.4 64762444e -> f08c6f319
[SPARK-7745] Change asserts to requires for user input checks in Spark Streaming
Assertions can be turned off. `require` throws an `IllegalArgumentException`
which makes more sense when it's a user-set variable.
Repository: spark
Updated Branches:
refs/heads/branch-1.4 21b150569 -> 0df461e08
[SPARK-6416] [DOCS] RDD.fold() requires the operator to be commutative
Document current limitation of rdd.fold.
This does not resolve SPARK-6416 but just documents the issue.
CC JoshRosen
Author: Sean Owen
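The limitation can be reproduced outside Spark. This sketch mimics `fold`'s two-level evaluation (fold each partition starting from the zero value, then fold the per-partition results); a non-commutative, non-associative operator yields partition-dependent answers:

```python
from functools import reduce

def partitioned_fold(partitions, zero, op):
    # RDD.fold-style evaluation: fold within each partition starting
    # from `zero`, then fold the per-partition results, again from `zero`.
    per_partition = [reduce(op, part, zero) for part in partitions]
    return reduce(op, per_partition, zero)
```

With addition the split does not matter (`[[1, 2], [3, 4]]` and `[[1, 2, 3, 4]]` both give 10), but with `op = lambda a, b: a * 2 + b` the two layouts give 18 and 26 respectively.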
Repository: spark
Updated Branches:
refs/heads/master 4b7ff3092 -> 6e5340269
[SPARK-6416] [DOCS] RDD.fold() requires the operator to be commutative
Document current limitation of rdd.fold.
This does not resolve SPARK-6416 but just documents the issue.
CC JoshRosen
Author: Sean Owen
Repository: spark
Updated Branches:
refs/heads/master 6e5340269 -> 699906e53
[SPARK-7394][SQL] Add Pandas style cast (astype)
Author: kaka1992 kaka_1...@163.com
Closes #6313 from kaka1992/astype and squashes the following commits:
73dfd0b [kaka1992] [SPARK-7394] Add Pandas style cast
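The Pandas-style spelling is an alias for an existing cast operation. A toy model of the aliasing pattern (the class and its methods are illustrative, not the DataFrame API):

```python
class Column:
    def __init__(self, values):
        self.values = values

    def cast(self, to_type):
        # Convert every value to the target type.
        return Column([to_type(v) for v in self.values])

    # Pandas-style alias: astype delegates to cast.
    astype = cast
```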
Repository: spark
Updated Branches:
refs/heads/branch-1.4 70d9839cf -> 21b150569
[SPARK-7787] [STREAMING] Fix serialization issue of SerializableAWSCredentials
Lack of default constructor causes deserialization to fail. This occurs only
when the AWS credentials are explicitly specified
Repository: spark
Updated Branches:
refs/heads/master 8730fbb47 -> 4b7ff3092
[SPARK-7787] [STREAMING] Fix serialization issue of SerializableAWSCredentials
Lack of default constructor causes deserialization to fail. This occurs only
when the AWS credentials are explicitly specified through
Repository: spark
Updated Branches:
refs/heads/master 13348e21b -> 8730fbb47
[SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables
When no partition columns can be found, we should have an empty
`PartitionSpec`, rather than a `PartitionSpec` with empty partition columns.
Repository: spark
Updated Branches:
refs/heads/branch-1.4 b97a8053a -> 70d9839cf
[SPARK-7749] [SQL] Fixes partition discovery for non-partitioned tables
When no partition columns can be found, we should have an empty
`PartitionSpec`, rather than a `PartitionSpec` with empty partition columns.
Repository: spark
Updated Branches:
refs/heads/branch-1.4 0df461e08 -> fec3041a6
[SPARK-7394][SQL] Add Pandas style cast (astype)
Author: kaka1992 kaka_1...@163.com
Closes #6313 from kaka1992/astype and squashes the following commits:
73dfd0b [kaka1992] [SPARK-7394] Add Pandas style cast
Repository: spark
Updated Branches:
refs/heads/master 30f3f556f -> 3d085
[SPARK-7478] [SQL] Added SQLContext.getOrCreate
Having a SQLContext singleton would make it easier for applications to use a
lazily instantiated single shared instance of SQLContext when needed. It would
avoid
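The pattern being added is a thread-safe, lazily created singleton. A language-neutral sketch of the idea in Python (a stand-in class, not the actual SQLContext implementation):

```python
import threading

class Context:
    _instance = None
    _lock = threading.Lock()

    @classmethod
    def get_or_create(cls):
        # Create the shared instance on first use; later calls reuse it,
        # so applications need not thread a context through their code.
        with cls._lock:
            if cls._instance is None:
                cls._instance = cls()
            return cls._instance
```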
Repository: spark
Updated Branches:
refs/heads/branch-1.4 e79ecc7dc -> e29b811ed
[SPARK-7585] [ML] [DOC] VectorIndexer user guide section
Added VectorIndexer section to ML user guide. Also added javaCategoryMaps()
method and Java unit test for it.
CC: mengxr
Author: Joseph K. Bradley
Repository: spark
Updated Branches:
refs/heads/branch-1.4 7e0912b1d -> 33e0e
[SPARK-7722] [STREAMING] Added Kinesis to style checker
Author: Tathagata Das tathagata.das1...@gmail.com
Closes #6325 from tdas/SPARK-7722 and squashes the following commits:
9ab35b2 [Tathagata Das] Fixed
Repository: spark
Updated Branches:
refs/heads/master cdc7c055c -> 311fab6f1
[SPARK-7722] [STREAMING] Added Kinesis to style checker
Author: Tathagata Das tathagata.das1...@gmail.com
Closes #6325 from tdas/SPARK-7722 and squashes the following commits:
9ab35b2 [Tathagata Das] Fixed styles in
Repository: spark
Updated Branches:
refs/heads/branch-1.4 e29b811ed -> 7e0912b1d
[SPARK-7498] [MLLIB] add varargs back to setDefault
We removed `varargs` due to Java compilation issues. That was a false alarm
because I didn't run `build/sbt clean`. So this PR reverts the changes.
jkbradley
Repository: spark
Updated Branches:
refs/heads/master 6d75ed7e5 -> cdc7c055c
[SPARK-7498] [MLLIB] add varargs back to setDefault
We removed `varargs` due to Java compilation issues. That was a false alarm
because I didn't run `build/sbt clean`. So this PR reverts the changes.
jkbradley
Repository: spark
Updated Branches:
refs/heads/branch-1.4 33e0e -> 96c82515b
[SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore
Author: Yin Huai yh...@databricks.com
Author: Cheng Lian l...@databricks.com
Closes #6285 from liancheng/spark-7763 and squashes the
Repository: spark
Updated Branches:
refs/heads/master 311fab6f1 -> 30f3f556f
[SPARK-7763] [SPARK-7616] [SQL] Persists partition columns into metastore
Author: Yin Huai yh...@databricks.com
Author: Cheng Lian l...@databricks.com
Closes #6285 from liancheng/spark-7763 and squashes the following
Repository: spark
Updated Branches:
refs/heads/branch-1.4 e597692ac -> c9a80fc40
[SPARK-7711] Add a startTime property to match the corresponding one in Scala
Author: Holden Karau hol...@pigscanfly.ca
Closes #6275 from holdenk/SPARK-771-startTime-is-missing-from-pyspark and
squashes the
Repository: spark
Updated Branches:
refs/heads/branch-1.4 96c82515b -> e597692ac
[SPARK-7478] [SQL] Added SQLContext.getOrCreate
Having a SQLContext singleton would make it easier for applications to use a
lazily instantiated single shared instance of SQLContext when needed. It would
avoid
Repository: spark
Updated Branches:
refs/heads/master 3d085 -> 6b18cdc1b
[SPARK-7711] Add a startTime property to match the corresponding one in Scala
Author: Holden Karau hol...@pigscanfly.ca
Closes #6275 from holdenk/SPARK-771-startTime-is-missing-from-pyspark and
squashes the
Repository: spark
Updated Branches:
refs/heads/branch-1.4 c9a80fc40 -> ba04b5236
[SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning
According to yhuai we spent 6-7 seconds cleaning closures in a partitioning job
that takes 12 seconds. Since we provide these closures in
Repository: spark
Updated Branches:
refs/heads/master 15680aeed -> 6d75ed7e5
[SPARK-7585] [ML] [DOC] VectorIndexer user guide section
Added VectorIndexer section to ML user guide. Also added javaCategoryMaps()
method and Java unit test for it.
CC: mengxr
Author: Joseph K. Bradley
Repository: spark
Updated Branches:
refs/heads/master 6b18cdc1b -> 5287eec5a
[SPARK-7718] [SQL] Speed up partitioning by avoiding closure cleaning
According to yhuai we spent 6-7 seconds cleaning closures in a partitioning job
that takes 12 seconds. Since we provide these closures in Spark we
Repository: spark
Updated Branches:
refs/heads/branch-1.4 e4489c36d -> 2be72c99a
[BUILD] Always run SQL tests in master build.
Seems our master build does not run HiveCompatibilitySuite (because
_RUN_SQL_TESTS is not set). This PR introduces a property `AMP_JENKINS_PRB` to
differentiate a PR
Repository: spark
Updated Branches:
refs/heads/master 5a3c04bb9 -> 147b6be3b
[BUILD] Always run SQL tests in master build.
Seems our master build does not run HiveCompatibilitySuite (because
_RUN_SQL_TESTS is not set). This PR introduces a property `AMP_JENKINS_PRB` to
differentiate a PR
Repository: spark
Updated Branches:
refs/heads/master 5287eec5a -> 5a3c04bb9
[SPARK-7800] isDefined should not be marked too early in putNewKey
JIRA: https://issues.apache.org/jira/browse/SPARK-7800
`isDefined` is marked as true twice in `Location.putNewKey`. The first one is
unnecessary and
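The fix is about ordering: the defined flag should flip only after the slot's data is fully written. A heavily simplified sketch of that invariant (not the real map code):

```python
class Location:
    def __init__(self):
        self.is_defined = False
        self.key = None
        self.value = None

    def put_new_key(self, key, value):
        # Write the slot's contents first; setting is_defined before the
        # writes complete could expose a half-initialized entry.
        self.key = key
        self.value = value
        self.is_defined = True
```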
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ba04b5236 -> e4489c36d
[SPARK-7800] isDefined should not be marked too early in putNewKey
JIRA: https://issues.apache.org/jira/browse/SPARK-7800
`isDefined` is marked as true twice in `Location.putNewKey`. The first one is
unnecessary
Repository: spark
Updated Branches:
refs/heads/master 147b6be3b -> 347b50106
[SPARK-7737] [SQL] Use leaf dirs having data files to discover partitions.
https://issues.apache.org/jira/browse/SPARK-7737
cc liancheng
Author: Yin Huai yh...@databricks.com
Closes #6329 from yhuai/spark-7737 and
Repository: spark
Updated Branches:
refs/heads/branch-1.4 2be72c99a -> 11a0640db
[SPARK-7737] [SQL] Use leaf dirs having data files to discover partitions.
https://issues.apache.org/jira/browse/SPARK-7737
cc liancheng
Author: Yin Huai yh...@databricks.com
Closes #6329 from yhuai/spark-7737
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ba620d62f -> a17a5cb30
[SPARK-7776] [STREAMING] Added shutdown hook to StreamingContext
Shutdown hook to stop SparkContext was added recently. This results in ugly
errors when a streaming application is terminated by ctrl-C.
```
Repository: spark
Updated Branches:
refs/heads/branch-1.4 df55a0d76 -> 2cc7907d7
[DOCS] [MLLIB] Fixing broken link in MLlib Linear Methods documentation.
Just a small change: fixed a broken link in the MLlib Linear Methods
documentation by removing a newline character between the link title
Repository: spark
Updated Branches:
refs/heads/branch-1.4 11a0640db -> ba620d62f
[SPARK-7783] [SQL] [PySpark] add DataFrame.rollup/cube in Python
Author: Davies Liu dav...@databricks.com
Closes #6311 from davies/rollup and squashes the following commits:
0261db1 [Davies Liu] use @since
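To recall the semantics: rollup groups by successive prefixes of the columns down to the grand total, while cube groups by every subset. A small illustration of the grouping sets each produces (helper names are made up):

```python
from itertools import combinations

def rollup_sets(cols):
    # ROLLUP: prefixes of the column list, e.g. [a, b] -> (a, b), (a,), ().
    return [tuple(cols[:i]) for i in range(len(cols), -1, -1)]

def cube_sets(cols):
    # CUBE: every subset of the column list.
    return [c for r in range(len(cols), -1, -1) for c in combinations(cols, r)]
```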
Repository: spark
Updated Branches:
refs/heads/master d68ea24d6 -> 17791a581
[SPARK-7783] [SQL] [PySpark] add DataFrame.rollup/cube in Python
Author: Davies Liu dav...@databricks.com
Closes #6311 from davies/rollup and squashes the following commits:
0261db1 [Davies Liu] use @since
a51ca6b
Repository: spark
Updated Branches:
refs/heads/master f5db4b416 -> 85b96372c
[SPARK-7219] [MLLIB] Output feature attributes in HashingTF
This PR updates `HashingTF` to output ML attributes that tell the number of
features in the output column. We need to expand `UnaryTransformer` to support
Repository: spark
Updated Branches:
refs/heads/branch-1.4 ef9336335 -> df55a0d76
[SPARK-7219] [MLLIB] Output feature attributes in HashingTF
This PR updates `HashingTF` to output ML attributes that tell the number of
features in the output column. We need to expand `UnaryTransformer` to
Repository: spark
Updated Branches:
refs/heads/master 85b96372c -> 956c4c910
[SPARK-7657] [YARN] Add driver logs links in application UI, in cluster mode.
This PR adds the URLs to the driver logs to `SparkListenerApplicationStarted`
event, which is later used by the `ExecutorsListener` to
Repository: spark
Updated Branches:
refs/heads/master 17791a581 -> f5db4b416
[SPARK-7794] [MLLIB] update RegexTokenizer default settings
The previous default is `{gaps: false, pattern: \\p{L}+|[^\\p{L}\\s]+}`. The
default pattern is hard to understand. This PR changes the default to `{gaps:
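The `gaps` flag decides whether the pattern matches the separators between tokens (split on it) or the tokens themselves (find them). A minimal sketch of those semantics; the defaults below are illustrative, since the message is truncated before stating the new ones:

```python
import re

def regex_tokenize(text, pattern=r"\s+", gaps=True):
    # gaps=True: pattern describes the gaps between tokens (split).
    # gaps=False: pattern describes the tokens themselves (findall).
    if gaps:
        return [t for t in re.split(pattern, text) if t]
    return re.findall(pattern, text)
```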
Repository: spark
Updated Branches:
refs/heads/branch-1.4 a17a5cb30 -> ef9336335
[SPARK-7794] [MLLIB] update RegexTokenizer default settings
The previous default is `{gaps: false, pattern: \\p{L}+|[^\\p{L}\\s]+}`. The
default pattern is hard to understand. This PR changes the default to
Repository: spark
Updated Branches:
refs/heads/branch-1.4 fec3041a6 -> f6a29c72c
[SPARK-7793] [MLLIB] Use getOrElse for getting the threshold of SVM model
same issue and fix as in Spark-7694.
Author: Shuo Xiang shuoxiang...@gmail.com
Closes #6321 from coderxiang/nb and squashes the following
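The fix replaces an unconditional Option.get with getOrElse, supplying a default when no threshold has been set. The same shape in Python, with an illustrative default value:

```python
def effective_threshold(threshold, default=0.0):
    # Option.getOrElse analogue: calling .get on an empty Option throws,
    # whereas getOrElse falls back to the supplied default.
    return default if threshold is None else threshold
```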
Repository: spark
Updated Branches:
refs/heads/master 4f572008f -> f6c486aa4
[SQL] [TEST] udf_java_method failed due to jdk version
java.lang.Math.exp(1.0) returns different results across JDK versions, so do not
use createQueryTest; write a separate test for it.
```
jdk version result
Repository: spark
Updated Branches:
refs/heads/master f6c486aa4 -> 15680aeed
[SPARK-7775] YARN AM negative sleep exception
```
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.slf4j.impl.Log4jLoggerFactory]
Exception in
Repository: spark
Updated Branches:
refs/heads/master feb3a9d3f -> a25c1ab8f
[SPARK-7565] [SQL] fix MapType in JsonRDD
The key of Map in JsonRDD should be converted into UTF8String (also failed
records), Thanks to yhuai viirya
Closes #6084
Author: Davies Liu dav...@databricks.com
Closes
Repository: spark
Updated Branches:
refs/heads/branch-1.4 f0e421351 -> 3aa618510
[SPARK-7565] [SQL] fix MapType in JsonRDD
The key of Map in JsonRDD should be converted into UTF8String (also failed
records), Thanks to yhuai viirya
Closes #6084
Author: Davies Liu dav...@databricks.com
Repository: spark
Updated Branches:
refs/heads/master 1ee8eb431 -> feb3a9d3f
[SPARK-7320] [SQL] [Minor] Move the testData into beforeAll()
Follow-up of #6340, to avoid the test report going missing when it fails.
Author: Cheng Hao hao.ch...@intel.com
Closes #6312 from chenghao-intel/rollup_minor
Repository: spark
Updated Branches:
refs/heads/branch-1.4 f08c6f319 -> f0e421351
[SPARK-7320] [SQL] [Minor] Move the testData into beforeAll()
Follow-up of #6340, to avoid the test report going missing when it fails.
Author: Cheng Hao hao.ch...@intel.com
Closes #6312 from
Repository: spark
Updated Branches:
refs/heads/master a25c1ab8f -> 13348e21b
[SPARK-7752] [MLLIB] Use lowercase letters for NaiveBayes.modelType
to be consistent with other string names in MLlib. This PR also updates the
implementation to use vals instead of hardcoded strings. jkbradley
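Replacing scattered string literals with named constants keeps the accepted values in one place. A sketch of the pattern (a Python stand-in, not the MLlib code):

```python
class NaiveBayes:
    # Named constants instead of repeating string literals everywhere.
    MULTINOMIAL = "multinomial"
    BERNOULLI = "bernoulli"
    SUPPORTED_MODEL_TYPES = (MULTINOMIAL, BERNOULLI)

    def __init__(self, model_type=MULTINOMIAL):
        if model_type not in self.SUPPORTED_MODEL_TYPES:
            raise ValueError(f"unsupported modelType: {model_type}")
        self.model_type = model_type
```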
Repository: spark
Updated Branches:
refs/heads/branch-1.4 3aa618510 -> b97a8053a
[SPARK-7752] [MLLIB] Use lowercase letters for NaiveBayes.modelType
to be consistent with other string names in MLlib. This PR also updates the
implementation to use vals instead of hardcoded strings. jkbradley