GitHub user steveloughran opened a pull request:
https://github.com/apache/spark/pull/9438
[SPARK-11265] [YARN] YarnClient can't get tokens to talk to Hive in a
secure cluster - backport to branch-1.5
This is a backport of the [SPARK-11265] patch to branch-1.5; it won't compile
against master.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/steveloughran/spark stevel/patches/SPARK-11265-on-branch-1.5
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9438.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9438
----
commit 76d920f2b814304051dd76f0ca78301e872fc811
Author: Yu ISHIKAWA <[email protected]>
Date: 2015-08-25T07:28:51Z
[SPARK-10214] [SPARKR] [DOCS] Improve SparkR Column, DataFrame API docs
cc: shivaram
## Summary
- Add name tags to each method in DataFrame.R and column.R
- Replace `rdname column` with `rdname {each_func}`, e.g. for the alias method:
`rdname column` => `rdname alias`
## Generated PDF File
https://drive.google.com/file/d/0B9biIZIU47lLNHN2aFpnQXlSeGs/view?usp=sharing
## JIRA
[[SPARK-10214] Improve SparkR Column, DataFrame API docs - ASF JIRA](https://issues.apache.org/jira/browse/SPARK-10214)
Author: Yu ISHIKAWA <[email protected]>
Closes #8414 from yu-iskw/SPARK-10214.
(cherry picked from commit d4549fe58fa0d781e0e891bceff893420cb1d598)
Signed-off-by: Shivaram Venkataraman <[email protected]>
commit 4841ebb1861025067a1108c11f64bb144427a308
Author: Sean Owen <[email protected]>
Date: 2015-08-25T07:32:20Z
[SPARK-6196] [BUILD] Remove MapR profiles in favor of hadoop-provided
Follow up to https://github.com/apache/spark/pull/7047
pwendell mentioned that MapR should use `hadoop-provided` now, and indeed
the new build script does not produce `mapr3`/`mapr4` artifacts anymore. Hence
the action seems to be to remove the profiles, which are now not used.
CC trystanleftwich
Author: Sean Owen <[email protected]>
Closes #8338 from srowen/SPARK-6196.
(cherry picked from commit 57b960bf3706728513f9e089455a533f0244312e)
Signed-off-by: Sean Owen <[email protected]>
commit 2032d66706d165079550f06bf695e0b08be7e143
Author: Tathagata Das <[email protected]>
Date: 2015-08-25T07:35:51Z
[SPARK-10210] [STREAMING] Filter out non-existent blocks before creating
BlockRDD
When the write ahead log is not enabled, a recovered streaming driver still
tries to run jobs using pre-failure block ids, and fails because the blocks no
longer exist in memory (and cannot be recovered, as the receiver WAL is not
enabled).
This occurs because the driver-side WAL of ReceivedBlockTracker recovers
that past block information, and ReceiverInputDStream creates BlockRDDs even if
those blocks do not exist.
The solution in this PR is to filter out block ids that do not exist before
creating the BlockRDD. In addition, it adds unit tests to verify other logic in
ReceiverInputDStream.
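The filtering step described above can be sketched in a few lines. This is only an illustration in Python, not the Scala from the patch; `filter_existing_blocks` and the `known_blocks` set are hypothetical names standing in for a lookup against the block manager.

```python
def filter_existing_blocks(block_ids, known_blocks):
    """Split block ids into those still present and those lost.

    `known_blocks` is a hypothetical stand-in for whatever the block
    manager reports as currently stored; it is not a Spark API.
    """
    present = [b for b in block_ids if b in known_blocks]
    missing = [b for b in block_ids if b not in known_blocks]
    return present, missing


# Block ids recovered from the driver-side WAL after a restart.
recovered = ["input-0-1", "input-0-2", "input-0-3"]
# Assume only one block survived the failure.
present, missing = filter_existing_blocks(recovered, {"input-0-2"})
print(present)  # ids that are safe to build an RDD from
print(missing)  # ids to drop (and log) instead of failing the job
```

Returning the missing ids separately makes it easy to log a warning about them rather than silently discarding them.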
Author: Tathagata Das <[email protected]>
Closes #8405 from tdas/SPARK-10210.
(cherry picked from commit 1fc37581a52530bac5d555dbf14927a5780c3b75)
Signed-off-by: Tathagata Das <[email protected]>
commit e5cea566a32d254adc9424a2f9e79b92eda3e6e4
Author: Davies Liu <[email protected]>
Date: 2015-08-25T08:00:44Z
[SPARK-10177] [SQL] fix reading Timestamp in parquet from Hive
We misinterpreted the Julian day and nanoseconds-of-day fields that Hive/Impala
write for Parquet timestamps (read as TimestampType): they overlap, so they
can't be added together directly.
To avoid confusing rounding during the conversion, we use `2440588` as the
Julian day of the Unix epoch (strictly, it is 2440587.5).
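As a rough sketch of the arithmetic, assuming the day field counts whole Julian days and the nanosecond field counts from the start of that day, the conversion might look like the following. The function name and the microsecond output unit are assumptions made for illustration, not taken from the patch.

```python
JULIAN_DAY_OF_EPOCH = 2440588  # value quoted in the commit message
SECONDS_PER_DAY = 86400
MICROS_PER_SECOND = 1_000_000


def julian_to_unix_micros(julian_day, nanos_of_day):
    """Convert (Julian day, nanoseconds within that day) to Unix microseconds."""
    days_since_epoch = julian_day - JULIAN_DAY_OF_EPOCH
    micros_from_days = days_since_epoch * SECONDS_PER_DAY * MICROS_PER_SECOND
    # Integer division drops any sub-microsecond remainder.
    return micros_from_days + nanos_of_day // 1000
```

Using the integer 2440588 keeps the whole computation in integer arithmetic, which is what avoids the rounding that a fractional epoch of 2440587.5 would introduce.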
Author: Davies Liu <[email protected]>
Author: Cheng Lian <[email protected]>
Closes #8400 from davies/timestamp_parquet.
(cherry picked from commit 2f493f7e3924b769160a16f73cccbebf21973b91)
Signed-off-by: Cheng Lian <[email protected]>
commit a0f22cf295a1d20814c5be6cc727e39e95a81c27
Author: Josh Rosen <[email protected]>
Date: 2015-08-25T08:06:36Z
[SPARK-10195] [SQL] Data sources Filter should not expose internal types
Spark SQL's data sources API exposes Catalyst's internal types through its
Filter interfaces. This is a problem because types like UTF8String are not
stable developer APIs and should not be exposed to third parties.
This issue caused incompatibilities when upgrading our `spark-redshift`
library to work against Spark 1.5.0. To avoid these issues in the future we
should only expose public types through these Filter objects. This patch
accomplishes this by using CatalystTypeConverters to add the appropriate
conversions.
Author: Josh Rosen <[email protected]>
Closes #8403 from JoshRosen/datasources-internal-vs-external-types.
(cherry picked from commit 7bc9a8c6249300ded31ea931c463d0a8f798e193)
Signed-off-by: Reynold Xin <[email protected]>
commit 73f1dd1b5acf1c6c37045da25902d7ca5ab795e4
Author: Yin Huai <[email protected]>
Date: 2015-08-25T08:19:34Z
[SPARK-10197] [SQL] Add null check in wrapperFor (inside HiveInspectors).
https://issues.apache.org/jira/browse/SPARK-10197
Author: Yin Huai <[email protected]>
Closes #8407 from yhuai/ORCSPARK-10197.
(cherry picked from commit 0e6368ffaec1965d0c7f89420e04a974675c7f6e)
Signed-off-by: Cheng Lian <[email protected]>
commit 5d6840569761a42624f9852b942e33039d21f46a
Author: Zhang, Liye <[email protected]>
Date: 2015-08-25T10:48:55Z
[DOC] add missing parameters in SparkContext.scala for scala doc
Author: Zhang, Liye <[email protected]>
Closes #8412 from liyezhang556520/minorDoc.
(cherry picked from commit 5c14890159a5711072bf395f662b2433a389edf9)
Signed-off-by: Sean Owen <[email protected]>
commit bdcc8e608d9a1160db988faa76808149c28a3b50
Author: ehnalis <[email protected]>
Date: 2015-08-25T11:30:06Z
Fixed a typo in DAGScheduler.
Author: ehnalis <[email protected]>
Closes #8308 from ehnalis/master.
(cherry picked from commit 7f1e507bf7e82bff323c5dec3c1ee044687c4173)
Signed-off-by: Sean Owen <[email protected]>
commit 0402f1297c697bfbe8b5c7bfc170fcdc6b2c9de5
Author: Michael Armbrust <[email protected]>
Date: 2015-08-25T17:22:54Z
[SPARK-10198] [SQL] Turn off partition verification by default
Author: Michael Armbrust <[email protected]>
Closes #8404 from marmbrus/turnOffPartitionVerification.
(cherry picked from commit 5c08c86bfa43462fb2ca5f7c5980ddfb44dd57f8)
Signed-off-by: Michael Armbrust <[email protected]>
commit 742c82ed97ed3fc60d4f17c4363c52062829ea49
Author: Yuhao Yang <[email protected]>
Date: 2015-08-25T17:54:03Z
[SPARK-8531] [ML] Update ML user guide for MinMaxScaler
jira: https://issues.apache.org/jira/browse/SPARK-8531
Update ML user guide for MinMaxScaler
Author: Yuhao Yang <[email protected]>
Author: unknown <[email protected]>
Closes #7211 from hhbyyh/minmaxdoc.
(cherry picked from commit b37f0cc1b4c064d6f09edb161250fa8b783de52a)
Signed-off-by: Joseph K. Bradley <[email protected]>
commit c740f5dd20459b491a8c088383c19c11a76c225d
Author: Feynman Liang <[email protected]>
Date: 2015-08-25T18:58:47Z
[SPARK-10230] [MLLIB] Rename optimizeAlpha to optimizeDocConcentration
See
[discussion](https://github.com/apache/spark/pull/8254#discussion_r37837770)
CC jkbradley
Author: Feynman Liang <[email protected]>
Closes #8422 from feynmanliang/SPARK-10230.
(cherry picked from commit 881208a8e849facf54166bdd69d3634407f952e7)
Signed-off-by: Joseph K. Bradley <[email protected]>
commit 5a32ed75c939dc42886ea940aba2b14b89e9f40e
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-25T19:16:23Z
[SPARK-10231] [MLLIB] update @Since annotation for mllib.classification
Update `Since` annotation in `mllib.classification`:
1. add version to classes, objects, constructors, and public variables
declared in constructors
2. correct some versions
3. remove `Since` on `toString`
MechCoder dbtsai
Author: Xiangrui Meng <[email protected]>
Closes #8421 from mengxr/SPARK-10231 and squashes the following commits:
b2dce80 [Xiangrui Meng] update @Since annotation for mllib.classification
(cherry picked from commit 16a2be1a84c0a274a60c0a584faaf58b55d4942b)
Signed-off-by: DB Tsai <[email protected]>
commit 95e44b4df81b09803be2fde8c4e2566be0c8fdbc
Author: Feynman Liang <[email protected]>
Date: 2015-08-25T20:21:05Z
[SPARK-9800] Adds docs for GradientDescent$.runMiniBatchSGD alias
* Adds doc for alias of runMiniBatchSGD documenting default value for
convergeTol
* Cleans up a note in code
Author: Feynman Liang <[email protected]>
Closes #8425 from feynmanliang/SPARK-9800.
(cherry picked from commit c0e9ff1588b4d9313cc6ec6e00e5c7663eb67910)
Signed-off-by: Joseph K. Bradley <[email protected]>
commit 186326df21daf8d8271a522f2569eb5cd7be1442
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-25T20:22:38Z
[SPARK-10237] [MLLIB] update since versions in mllib.fpm
Same as #8421 but for `mllib.fpm`.
cc feynmanliang
Author: Xiangrui Meng <[email protected]>
Closes #8429 from mengxr/SPARK-10237.
(cherry picked from commit c619c7552f22d28cfa321ce671fc9ca854dd655f)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 055387c087989c8790b6761429b68416ecee3a33
Author: Feynman Liang <[email protected]>
Date: 2015-08-25T20:23:15Z
[SPARK-9797] [MLLIB] [DOC]
StreamingLinearRegressionWithSGD.setConvergenceTol default value
Adds default convergence tolerance (0.001, set in
`GradientDescent.convergenceTol`) to `setConvergenceTol`'s scaladoc
Author: Feynman Liang <[email protected]>
Closes #8424 from feynmanliang/SPARK-9797.
(cherry picked from commit 9205907876cf65695e56c2a94bedd83df3675c03)
Signed-off-by: Joseph K. Bradley <[email protected]>
commit 6f05b7aebd66a00e2556a29b35084e81ac526406
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-25T21:11:38Z
[SPARK-10239] [SPARK-10244] [MLLIB] update since versions in mllib.pmml and
mllib.util
Same as #8421 but for `mllib.pmml` and `mllib.util`.
cc dbtsai
Author: Xiangrui Meng <[email protected]>
Closes #8430 from mengxr/SPARK-10239 and squashes the following commits:
a189acf [Xiangrui Meng] update since versions in mllib.pmml and mllib.util
(cherry picked from commit 00ae4be97f7b205432db2967ba6d506286ef2ca6)
Signed-off-by: DB Tsai <[email protected]>
commit 8925896b1eb0a13d723d38fb263d3bec0a01ec10
Author: Davies Liu <[email protected]>
Date: 2015-08-25T21:55:34Z
[SPARK-10245] [SQL] Fix decimal literals with precision < scale
In BigDecimal or java.math.BigDecimal, the precision can be smaller than the
scale; for example, BigDecimal("0.001") has precision = 1 and scale = 3. But
DecimalType requires the precision to be no smaller than the scale, so we
should use the maximum of precision and scale when inferring the schema from
a decimal literal.
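The rule can be illustrated with Python's `decimal` module, which exposes the same digits-and-exponent view of a number. This is a sketch of the idea only; `infer_decimal_type` is a hypothetical helper, not the Spark code.

```python
from decimal import Decimal


def infer_decimal_type(value):
    """Return (precision, scale) for a finite Decimal, with precision >= scale."""
    _sign, digits, exponent = value.as_tuple()
    precision = len(digits)  # count of unscaled digits
    scale = -exponent        # digits to the right of the decimal point
    # Widen the precision when the literal has fewer digits than its scale.
    return max(precision, scale), scale


print(infer_decimal_type(Decimal("0.001")))
print(infer_decimal_type(Decimal("12.34")))
```

For `0.001` the raw precision is 1 and the scale is 3, so the inferred precision is widened to match the scale; for `12.34` the raw precision already exceeds the scale and is left alone.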
Author: Davies Liu <[email protected]>
Closes #8428 from davies/smaller_decimal.
(cherry picked from commit ec89bd840a6862751999d612f586a962cae63f6d)
Signed-off-by: Yin Huai <[email protected]>
commit ab7d46d1d6e7e6705a3348a0cab2d05fe62951cf
Author: Davies Liu <[email protected]>
Date: 2015-08-25T22:19:41Z
[SPARK-10215] [SQL] Fix precision of division (follow the rule in Hive)
Follow the rule in Hive for decimal division. See
https://github.com/apache/hive/blob/ac755ebe26361a4647d53db2a28500f71697b276/ql/src/java/org/apache/hadoop/hive/ql/udf/generic/GenericUDFOPDivide.java#L113
cc chenghao-intel
Author: Davies Liu <[email protected]>
Closes #8415 from davies/decimal_div2.
(cherry picked from commit 7467b52ed07f174d93dfc4cb544dc4b69a2c2826)
Signed-off-by: Yin Huai <[email protected]>
commit 727771352855dbb780008c449a877f5aaa5fc27a
Author: Patrick Wendell <[email protected]>
Date: 2015-08-25T22:56:37Z
Preparing Spark release v1.5.0-rc2
commit 4c03cb4da846bf3ea4cd99f593d74c4a817a7d2d
Author: Patrick Wendell <[email protected]>
Date: 2015-08-25T22:56:44Z
Preparing development version 1.5.1-SNAPSHOT
commit 5cf266fdeb6632622642e5d9bc056a76680b1970
Author: Feynman Liang <[email protected]>
Date: 2015-08-26T00:39:20Z
[SPARK-9888] [MLLIB] User guide for new LDA features
* Adds two new sections to LDA's user guide; one for each optimizer/model
* Documents new features added to LDA (e.g. topXXXperXXX, asymmetric
priors, hyperparameter optimization)
* Cleans up a TODO and sets a default parameter in LDA code
jkbradley hhbyyh
Author: Feynman Liang <[email protected]>
Closes #8254 from feynmanliang/SPARK-9888.
(cherry picked from commit 125205cdb35530cdb4a8fff3e1ee49cf4a299583)
Signed-off-by: Joseph K. Bradley <[email protected]>
commit af98e51f273d95e0fc19da1eca32a5f87a8c5576
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T01:17:54Z
[SPARK-10233] [MLLIB] update since version in mllib.evaluation
Same as #8421 but for `mllib.evaluation`.
cc avulanov
Author: Xiangrui Meng <[email protected]>
Closes #8423 from mengxr/SPARK-10233.
(cherry picked from commit 8668ead2e7097b9591069599fbfccf67c53db659)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 46750b912781433b6ce0845ac22805cde975361e
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T03:07:56Z
[SPARK-10238] [MLLIB] update since versions in mllib.linalg
Same as #8421 but for `mllib.linalg`.
cc dbtsai
Author: Xiangrui Meng <[email protected]>
Closes #8440 from mengxr/SPARK-10238 and squashes the following commits:
b38437e [Xiangrui Meng] update since versions in mllib.linalg
(cherry picked from commit ab431f8a970b85fba34ccb506c0f8815e55c63bf)
Signed-off-by: DB Tsai <[email protected]>
commit b7766699aef65586b0c3af96fb625efaa218d2b2
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T05:31:23Z
[SPARK-10240] [SPARK-10242] [MLLIB] update since versions in mllib.random
and mllib.stat
The same as #8241 but for `mllib.stat` and `mllib.random`.
cc feynmanliang
Author: Xiangrui Meng <[email protected]>
Closes #8439 from mengxr/SPARK-10242.
(cherry picked from commit c3a54843c0c8a14059da4e6716c1ad45c69bbe6c)
Signed-off-by: Xiangrui Meng <[email protected]>
commit be0c9915c0084a187933f338e51e606dc68e93af
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T05:33:48Z
[SPARK-10234] [MLLIB] update since version in mllib.clustering
Same as #8421 but for `mllib.clustering`.
cc feynmanliang yu-iskw
Author: Xiangrui Meng <[email protected]>
Closes #8435 from mengxr/SPARK-10234.
(cherry picked from commit d703372f86d6a59383ba8569fcd9d379849cffbf)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 6d8ebc801799714d297c83be6935b37e26dc2df7
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T05:35:49Z
[SPARK-10243] [MLLIB] update since versions in mllib.tree
Same as #8421 but for `mllib.tree`.
cc jkbradley
Author: Xiangrui Meng <[email protected]>
Closes #8442 from mengxr/SPARK-10236.
(cherry picked from commit fb7e12fe2e14af8de4c206ca8096b2e8113bfddc)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 08d390f457f80ffdc2dfce61ea579d9026047f12
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T05:49:33Z
[SPARK-10235] [MLLIB] update since versions in mllib.regression
Same as #8421 but for `mllib.regression`.
cc freeman-lab dbtsai
Author: Xiangrui Meng <[email protected]>
Closes #8426 from mengxr/SPARK-10235 and squashes the following commits:
6cd28e4 [Xiangrui Meng] update since versions in mllib.regression
(cherry picked from commit 4657fa1f37d41dd4c7240a960342b68c7c591f48)
Signed-off-by: DB Tsai <[email protected]>
commit 21a10a86d20ec1a6fea42286b4d2aae9ce7e848d
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T06:45:41Z
[SPARK-10236] [MLLIB] update since versions in mllib.feature
Same as #8421 but for `mllib.feature`.
cc dbtsai
Author: Xiangrui Meng <[email protected]>
Closes #8449 from mengxr/SPARK-10236.feature and squashes the following
commits:
0e8d658 [Xiangrui Meng] remove unnecessary comment
ad70b03 [Xiangrui Meng] update since versions in mllib.feature
(cherry picked from commit 321d7759691bed9867b1f0470f12eab2faa50aff)
Signed-off-by: DB Tsai <[email protected]>
commit 5220db9e352b5d5eae59cead9478ca0a9f73f16b
Author: felixcheung <[email protected]>
Date: 2015-08-26T06:48:16Z
[SPARK-9316] [SPARKR] Add support for filtering using `[` (synonym for
filter / select)
Add support for
```
df[df$name == "Smith", c(1,2)]
df[df$age %in% c(19, 30), 1:2]
```
shivaram
Author: felixcheung <[email protected]>
Closes #8394 from felixcheung/rsubset.
(cherry picked from commit 75d4773aa50e24972c533e8b48697fde586429eb)
Signed-off-by: Shivaram Venkataraman <[email protected]>
commit b0dde36009ce371824ce3e47e60fa0711d7733bb
Author: Xiangrui Meng <[email protected]>
Date: 2015-08-26T18:47:05Z
[SPARK-9665] [MLLIB] audit MLlib API annotations
I found only `ml.NaiveBayes` missing the `Experimental` annotation. This PR
doesn't cover Python APIs.
cc jkbradley
Author: Xiangrui Meng <[email protected]>
Closes #8452 from mengxr/SPARK-9665.
(cherry picked from commit 6519fd06cc8175c9182ef16cf8a37d7f255eb846)
Signed-off-by: Joseph K. Bradley <[email protected]>
----