GitHub user dsjch123 opened a pull request:
https://github.com/apache/spark/pull/20792
Branch 2.1
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise,
remove this)
Please review http://spark.apache.org/contributing.html before opening a
pull request.
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-2.1
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/20792.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #20792
commit 43084b3cc3918b720fe28053d2037fa22a71264e
Author: Herman van Hovell
Date: 2017-02-23T22:58:02Z
[SPARK-19459][SQL][BRANCH-2.1] Support for nested char/varchar fields in ORC
## What changes were proposed in this pull request?
This is a backport of the two following commits:
https://github.com/apache/spark/commit/78eae7e67fd5dec0c2d5b1853ce86cd0f1ae
&
https://github.com/apache/spark/commit/de8a03e68202647555e30fffba551f65bc77608d
This PR adds support for ORC tables with (nested) char/varchar fields.
## How was this patch tested?
Added a regression test to `OrcSourceSuite`.
Author: Herman van Hovell
Closes #17041 from hvanhovell/SPARK-19459-branch-2.1.
commit 66a7ca28a9de92e67ce24896a851a0c96c92aec6
Author: Takeshi Yamamuro
Date: 2017-02-24T09:54:00Z
[SPARK-19691][SQL][BRANCH-2.1] Fix ClassCastException when calculating
percentile of decimal column
## What changes were proposed in this pull request?
This is a backport of the following commit:
https://github.com/apache/spark/commit/93aa4271596a30752dc5234d869c3ae2f6e8e723
This PR fixes the `ClassCastException` shown below:
```
scala> spark.range(10).selectExpr("cast (id as decimal) as x").selectExpr("percentile(x, 0.5)").collect()
java.lang.ClassCastException: org.apache.spark.sql.types.Decimal cannot be cast to java.lang.Number
  at org.apache.spark.sql.catalyst.expressions.aggregate.Percentile.update(Percentile.scala:141)
  at org.apache.spark.sql.catalyst.expressions.aggregate.Percentile.update(Percentile.scala:58)
  at org.apache.spark.sql.catalyst.expressions.aggregate.TypedImperativeAggregate.update(interfaces.scala:514)
  at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$1.apply(AggregationIterator.scala:171)
  at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$1$$anonfun$applyOrElse$1.apply(AggregationIterator.scala:171)
  at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:187)
  at org.apache.spark.sql.execution.aggregate.AggregationIterator$$anonfun$generateProcessRow$1.apply(AggregationIterator.scala:181)
  at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.processInputs(ObjectAggregationIterator.scala:151)
  at org.apache.spark.sql.execution.aggregate.ObjectAggregationIterator.<init>(ObjectAggregationIterator.scala:78)
  at org.apache.spark.sql.execution.aggregate.ObjectHashAggregateExec$$anonfun$doExecute$1$$anonfun$2.apply(ObjectHashAggregateExec.scala:109)
  at ...
```
The fix simply converts Catalyst values (i.e., `Decimal`) into Scala ones using `CatalystTypeConverters`.
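The failure mode and the conversion-based fix can be sketched in plain Scala. This is a toy illustration with hypothetical names (`DecimalLike` stands in for `org.apache.spark.sql.types.Decimal`, which does not extend `java.lang.Number`); the real fix lives in `Percentile` and uses `CatalystTypeConverters`:

```scala
// Toy stand-in for Catalyst's Decimal, which is NOT a java.lang.Number.
final case class DecimalLike(underlying: BigDecimal) {
  def toScalaBigDecimal: BigDecimal = underlying
}

object PercentileSketch {
  // Before the fix: blindly casting the untyped aggregation input fails
  // with a ClassCastException when the input is a Catalyst Decimal.
  def updateBroken(value: Any): Double =
    value.asInstanceOf[Number].doubleValue

  // After the fix: convert the Catalyst value to a Scala value first.
  // scala.math.BigDecimal extends java.lang.Number (via ScalaNumber),
  // so the converted value is safe to treat numerically.
  def updateFixed(value: Any): Double = value match {
    case d: DecimalLike => d.toScalaBigDecimal.doubleValue
    case n: Number      => n.doubleValue
  }
}
```

With this sketch, `updateBroken(DecimalLike(BigDecimal(1)))` throws `ClassCastException`, while `updateFixed` handles both decimal and ordinary numeric inputs.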
## How was this patch tested?
Added a test in `DataFrameSuite`.
Author: Takeshi Yamamuro
Closes #17046 from maropu/SPARK-19691-BACKPORT2.1.
commit 6da6a27f673f6e45fe619e0411fbaaa14ea34bfb
Author: jerryshao
Date: 2017-02-24T17:28:59Z
[SPARK-19707][CORE] Improve the invalid path check for sc.addJar
## What changes were proposed in this pull request?
Currently there are two issues in Spark when a jar is added with an invalid path:
* If the jar path is an empty string (e.g. `--jars ",dummy.jar"`), Spark resolves it to the current directory path and adds that to the classpath / file server, which is unwanted. This happens when we submit a Spark application programmatically. Spark should defensively filter out such empty paths.
* If the jar path is invalid (the file does not exist), `addJar` does not check it and still adds it to the file server; the exception is delayed until the job runs. This local path could be checked beforehand, with no need to wait until a task runs. We have a similar check in `addFile`, but `addJar` lacks an equivalent mechanism.
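The defensive checks described above might be sketched as follows. This is a hypothetical helper, not Spark's actual `SparkContext.addJar` implementation:

```scala
import java.io.File

object AddJarSketch {
  // Validate a list of local jar paths before registering them:
  // drop empty entries (e.g. the "" produced by ",dummy.jar") and
  // fail fast on paths that do not point at an existing file, rather
  // than deferring the error to job run time.
  def validateLocalJars(paths: Seq[String]): Seq[String] =
    paths.map(_.trim)
      .filter(_.nonEmpty)
      .map { p =>
        require(new File(p).isFile, s"Jar not found: $p")
        p
      }
}
```

Filtering before validating means an empty entry is silently dropped rather than rejected, matching the idea that empty strings from a malformed jar list are noise, while a concrete path that does not exist is a genuine user error worth failing on immediately.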
## How was this patch tested?