[GitHub] spark pull request #15501: Branch 2.0

2016-10-31 Thread asfgit
Github user asfgit closed the pull request at:

https://github.com/apache/spark/pull/15501





[GitHub] spark pull request #15501: Branch 2.0

2016-10-15 Thread lastbus
GitHub user lastbus opened a pull request:

https://github.com/apache/spark/pull/15501

Branch 2.0

## What changes were proposed in this pull request?

(Please fill in changes proposed in this fix)

## How was this patch tested?

(Please explain how this patch was tested. E.g. unit tests, integration 
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise, 
remove this)

Please review 
https://cwiki.apache.org/confluence/display/SPARK/Contributing+to+Spark before 
opening a pull request.


You can merge this pull request into a Git repository by running:

$ git pull https://github.com/apache/spark branch-2.0

Alternatively you can review and apply these changes as the patch at:

https://github.com/apache/spark/pull/15501.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

This closes #15501


commit 0297896119e11f23da4b14f62f50ec72b5fac57f
Author: Junyang Qian 
Date:   2016-08-20T13:59:23Z

[SPARK-16508][SPARKR] Fix CRAN undocumented/duplicated arguments warnings.

This PR tries to fix all the remaining "undocumented/duplicated arguments" 
warnings reported by CRAN-check.

One warning remains: the documentation for R's `stats::glm`, which SparkR 
exports. To silence it, we would also have to document every argument of that 
non-SparkR function.

Earlier discussion is in #14558.

Tested with the R unit tests and the `check-cran.sh` script (with no-test).

Author: Junyang Qian 

Closes #14705 from junyangq/SPARK-16508-master.

(cherry picked from commit 01401e965b58f7e8ab615764a452d7d18f1d4bf0)
Signed-off-by: Shivaram Venkataraman 

commit e62b29f29f44196a1cbe13004ff4abfd8e5be1c1
Author: Dongjoon Hyun 
Date:   2016-08-21T20:07:47Z

[SPARK-17098][SQL] Fix `NullPropagation` optimizer to handle `COUNT(NULL) OVER` correctly

## What changes were proposed in this pull request?

Currently, the `NullPropagation` optimizer replaces `COUNT` on null literals in 
a bottom-up fashion. In doing so, it does not handle `WindowExpression` properly. 
This PR adds the missing propagation logic.

**Before**
```scala
scala> sql("SELECT COUNT(1 + NULL) OVER ()").show
java.lang.UnsupportedOperationException: Cannot evaluate expression: cast(0 as bigint) windowspecdefinition(ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)
```

**After**
```scala
scala> sql("SELECT COUNT(1 + NULL) OVER ()").show
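+----------------------------------------------------------------------------------------------+
|count((1 + CAST(NULL AS INT))) OVER (ROWS BETWEEN UNBOUNDED PRECEDING AND UNBOUNDED FOLLOWING)|
+----------------------------------------------------------------------------------------------+
|                                                                                             0|
+----------------------------------------------------------------------------------------------+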
```
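
The failure mode can be illustrated with a small standalone sketch. It is not Spark's Catalyst code: the `Expr` hierarchy and the `propagate` function are hypothetical stand-ins, and the window handling shown is only an assumption about the shape of the fix. The idea is that once `COUNT` over a null literal folds to `0`, a window wrapper left around that literal can no longer be evaluated, so the rewrite also has to cover the window node.

```scala
// Toy expression tree. These types are hypothetical and are not Spark's Catalyst classes.
sealed trait Expr
case object NullLit extends Expr
final case class IntLit(value: Long) extends Expr
final case class Add(left: Expr, right: Expr) extends Expr
final case class Count(child: Expr) extends Expr
final case class Window(function: Expr) extends Expr

object NullPropagationSketch {
  // Rewrites the children first, then the node itself (bottom-up).
  def propagate(e: Expr): Expr = e match {
    case Add(l, r) =>
      (propagate(l), propagate(r)) match {
        case (NullLit, _) | (_, NullLit) => NullLit // NULL + x is NULL
        case (pl, pr)                    => Add(pl, pr)
      }
    case Count(c) =>
      propagate(c) match {
        case NullLit => IntLit(0L) // COUNT over a null literal is always 0
        case pc      => Count(pc)
      }
    case Window(f) =>
      // Assumed handling: if the window function folded to a literal,
      // replace the whole window expression with that literal.
      propagate(f) match {
        case lit: IntLit => lit
        case pf          => Window(pf)
      }
    case other => other
  }
}
```

Without the `Window` case, the toy rewrite of `Window(Count(Add(IntLit(1), NullLit)))` would stop at `Window(IntLit(0))`, which mirrors the `cast(0 as bigint) windowspecdefinition(...)` expression in the error above.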

## How was this patch tested?

Pass the Jenkins test with a new test case.

Author: Dongjoon Hyun 

Closes #14689 from dongjoon-hyun/SPARK-17098.

(cherry picked from commit 91c2397684ab791572ac57ffb2a924ff058bb64f)
Signed-off-by: Herman van Hovell 

commit 49cc44de3ad5495b2690633791941aa00a62b553
Author: Davies Liu 
Date:   2016-08-22T08:16:03Z

[SPARK-17115][SQL] Decrease the threshold when splitting expressions

## What changes were proposed in this pull request?

In 2.0, we changed the threshold for splitting expressions from 16K to 64K, 
which causes very bad performance on wide tables, because the generated method 
can't be JIT-compiled by default (it exceeds the 8K bytecode limit).

This PR decreases it to 1K, based on the benchmark results for a wide 
table with 400 columns of LongType.

It also fixes a bug around splitting expressions in whole-stage codegen 
(whole-stage codegen should not split them).
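
As a rough illustration of threshold-based splitting, the sketch below greedily groups generated code snippets into chunks so that each chunk could become its own small method. It is not Spark's implementation: `SplitSketch`, `split`, and the choice to measure the threshold in characters of source text are all assumptions made for the example.

```scala
object SplitSketch {
  // Greedily packs snippets into chunks. A new chunk is started when adding the
  // next snippet would push the current chunk past `threshold` characters.
  // A single snippet longer than the threshold still gets a chunk of its own.
  def split(snippets: Seq[String], threshold: Int): Seq[String] = {
    val chunks = scala.collection.mutable.ArrayBuffer.empty[String]
    val current = new StringBuilder
    for (s <- snippets) {
      if (current.nonEmpty && current.length + s.length > threshold) {
        chunks += current.toString
        current.clear()
      }
      current.append(s)
    }
    if (current.nonEmpty) chunks += current.toString
    chunks.toSeq
  }
}
```

A smaller threshold produces more, shorter chunks, which is the trade-off the commit describes: each generated method is more likely to stay under the JIT compilation limit, at the cost of more method calls.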

## How was this patch tested?

Added benchmark suite.

Author: Davies Liu 

Closes #14692 from davies/split_exprs.

(cherry picked from commit 8d35a6f68d6d733212674491cbf31bed73fada0f)
Signed-off-by: Wenchen Fan 

commit 2add45fabeb0ea4f7b17b5bc4910161370e72627
Author: Jagadeesan 
Date:   2016-08-22T08:30:31Z

[SPARK-17085][STREAMING][DOCUMENTATION AND ACTUAL CODE DIFFERS - UNSUPPORTED OPERATIONS]

Changes to the Spark Structured Streaming doc at this link:

https://spark.apache.org/docs/2.0.0/structured-streaming-programming-guide.html#unsupported-operations

Author: Jagadeesan 

Closes #14715 from jagadeesanas2/SPARK-17085.

(cherry picked from commit bd9655063bdba8836b4ec96ed115e5653e246b65)
Signed-off-by: Sean Owen 

commit 79195982a4c6f8b1a3e02069dea00049cc806574
Author: Junyang Qian 
Date:   2016-08-22T