GitHub user skillgapfinder opened a pull request:
https://github.com/apache/spark/pull/14462
Branch 2.0
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise,
remove this)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-2.0
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/14462.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #14462
----
commit 5c9555e1115ce52954db2a1b18f78cd77ec8c15f
Author: Tom Magrino <[email protected]>
Date: 2016-06-28T20:36:41Z
[SPARK-16148][SCHEDULER] Allow for underscores in TaskLocation in the
Executor ID
## What changes were proposed in this pull request?
Previously, the TaskLocation implementation would not allow for executor
ids which include underscores. This tweaks the string split used to get the
hostname and executor id, allowing for underscores in the executor id.
This addresses the JIRA found here:
https://issues.apache.org/jira/browse/SPARK-16148
This is moved over from a previous PR against branch-1.6:
https://github.com/apache/spark/pull/13857
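A minimal sketch of the idea, assuming the location string has the form
`executor_<host>_<executorId>` (the helper below is invented for
illustration; it is not the actual Spark code):
```scala
// Split with limit = 2 so that only the first underscore after the host
// separates the fields; any later underscores stay in the executor id.
def parseExecutorLocation(str: String): (String, String) = {
  val body = str.stripPrefix("executor_")
  val Array(host, executorId) = body.split("_", 2)
  (host, executorId)
}
```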
## How was this patch tested?
Ran existing unit tests for core and streaming. Manually ran a simple
streaming job with an executor whose id contained underscores and confirmed
that the job ran successfully.
This is my original work and I license the work to the project under the
project's open source license.
Author: Tom Magrino <[email protected]>
Closes #13858 from tmagrino/fixtasklocation.
(cherry picked from commit ae14f362355b131fcb3e3633da7bb14bdd2b6893)
Signed-off-by: Shixiong Zhu <[email protected]>
commit 43bd612f35490c11a76d5379d723ba65f7afbefd
Author: Davies Liu <[email protected]>
Date: 2016-06-28T21:09:38Z
[SPARK-16175] [PYSPARK] handle None for UDT
## What changes were proposed in this pull request?
Scala UDTs bypass nulls entirely and never pass them into serialize() and
deserialize(); this PR updates the Python UDT to do the same.
## How was this patch tested?
Added tests.
Author: Davies Liu <[email protected]>
Closes #13878 from davies/udt_null.
(cherry picked from commit 35438fb0ad3bcda5c5a3a0ccde1a620699d012db)
Signed-off-by: Davies Liu <[email protected]>
commit 5626a0af598168a15d68a8817d1dec2a0e3dec7e
Author: gatorsmile <[email protected]>
Date: 2016-06-28T22:32:45Z
[SPARK-16236][SQL] Add Path Option back to Load API in DataFrameReader
#### What changes were proposed in this pull request?
koertkuipers identified that PR https://github.com/apache/spark/pull/13727/
changed the behavior of the `load` API: after the change, the `load` API no
longer adds the value of `path` into the `options`. Thank you!
This PR adds the option `path` back to the `load()` API in
`DataFrameReader`, if and only if users specify exactly one `path` in the
`load` call. For example, users can see the `path` option after the
following API call:
```Scala
spark.read
.format("parquet")
.load("/test")
```
#### How was this patch tested?
Added test cases.
Author: gatorsmile <[email protected]>
Closes #13933 from gatorsmile/optionPath.
(cherry picked from commit 25520e976275e0d1e3bf9c73128ef4dec4618568)
Signed-off-by: Reynold Xin <[email protected]>
commit d73c38ed0e129bdcb634000153516fca4b31b9d0
Author: Wenchen Fan <[email protected]>
Date: 2016-06-28T22:39:28Z
[SPARK-16100][SQL] fix bug when use Map as the buffer type of Aggregator
## What changes were proposed in this pull request?
The root cause is in `MapObjects`. Its parameter `loopVar` is not declared
as a child, but can sometimes be the same as `lambdaFunction` (e.g. when the
function that takes `loopVar` and produces `lambdaFunction` is `identity`),
which is a child. This causes trouble when calling `withNewChildren`: it may
mistakenly treat `loopVar` as a child and later fail with
`IndexOutOfBoundsException: 0`.
This PR fixes the bug by simply pulling the parameters out of
`LambdaVariable` and passing them to `MapObjects` directly.
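A hedged sketch of the user-facing shape that used to trigger this (the
class and names are invented; an encoder for the Map buffer, e.g.
`org.apache.spark.sql.catalyst.encoders.ExpressionEncoder[Map[String, Long]]()`,
is assumed to be supplied):
```scala
import org.apache.spark.sql.Encoder
import org.apache.spark.sql.expressions.Aggregator

// Sketch only: an Aggregator whose buffer type is a Map. Before this fix,
// resolving the Map buffer's serializer through MapObjects could fail with
// IndexOutOfBoundsException: 0.
class SumPerKey(implicit mapEnc: Encoder[Map[String, Long]])
    extends Aggregator[(String, Long), Map[String, Long], Map[String, Long]] {
  def zero: Map[String, Long] = Map.empty
  def reduce(b: Map[String, Long], a: (String, Long)): Map[String, Long] =
    b.updated(a._1, b.getOrElse(a._1, 0L) + a._2)
  def merge(x: Map[String, Long], y: Map[String, Long]): Map[String, Long] =
    y.foldLeft(x) { case (m, (k, v)) => m.updated(k, m.getOrElse(k, 0L) + v) }
  def finish(m: Map[String, Long]): Map[String, Long] = m
  def bufferEncoder: Encoder[Map[String, Long]] = mapEnc
  def outputEncoder: Encoder[Map[String, Long]] = mapEnc
}
```
With the fix, selecting `new SumPerKey().toColumn` over a
`Dataset[(String, Long)]` aggregates as expected.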
## How was this patch tested?
new test in `DatasetAggregatorSuite`
Author: Wenchen Fan <[email protected]>
Closes #13835 from cloud-fan/map-objects.
(cherry picked from commit 8a977b065418f07d2bf4fe1607a5534c32d04c47)
Signed-off-by: Cheng Lian <[email protected]>
commit 5fb7804e55e50ba61c3a780b771d9b20b0bf2424
Author: James Thomas <[email protected]>
Date: 2016-06-28T23:12:48Z
[SPARK-16114][SQL] structured streaming network word count examples
## What changes were proposed in this pull request?
Network word count example for structured streaming
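The general shape of such an example, sketched here in Scala (host and
port are placeholders):
```scala
import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder
  .appName("StructuredNetworkWordCount")
  .getOrCreate()
import spark.implicits._

// Read lines from a socket, split them into words, keep a running count.
val lines = spark.readStream
  .format("socket")
  .option("host", "localhost")
  .option("port", 9999)
  .load()
val words = lines.as[String].flatMap(_.split(" "))
val wordCounts = words.groupBy("value").count()

// Emit the full set of counts to the console on every trigger.
val query = wordCounts.writeStream
  .outputMode("complete")
  .format("console")
  .start()
query.awaitTermination()
```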
## How was this patch tested?
Run locally
Author: James Thomas <[email protected]>
Author: James Thomas <[email protected]>
Closes #13816 from jjthomas/master.
(cherry picked from commit 3554713a163c58ca176ffde87d2c6e4a91bacb50)
Signed-off-by: Tathagata Das <[email protected]>
commit 52c9d69f7da05c45cb191fef8f7ce54c8f40b1bb
Author: Burak Yavuz <[email protected]>
Date: 2016-06-29T00:02:16Z
[MINOR][DOCS][STRUCTURED STREAMING] Minor doc fixes around
`DataFrameWriter` and `DataStreamWriter`
## What changes were proposed in this pull request?
Fixes a couple of old references from `DataFrameWriter.startStream` to
`DataStreamWriter.start`.
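For reference, a sketch of the renamed call (paths are placeholders and
`streamingDF` is assumed to be a streaming DataFrame):
```scala
// DataStreamWriter.start replaces the old DataFrameWriter.startStream.
val query = streamingDF.writeStream
  .format("parquet")
  .option("checkpointLocation", "/tmp/checkpoint")
  .start("/tmp/output")
```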
Author: Burak Yavuz <[email protected]>
Closes #13952 from brkyvz/minor-doc-fix.
(cherry picked from commit 5545b791096756b07b3207fb3de13b68b9a37b00)
Signed-off-by: Shixiong Zhu <[email protected]>
commit d7a59f1f450aae06baac96867a26042bd1ccd1d5
Author: Felix Cheung <[email protected]>
Date: 2016-06-29T00:08:28Z
[SPARKR] add csv tests
## What changes were proposed in this pull request?
Add unit tests for csv data for SPARKR
## How was this patch tested?
unit tests
Author: Felix Cheung <[email protected]>
Closes #13904 from felixcheung/rcsv.
(cherry picked from commit 823518c2b5259c8a954431467639198c808c9198)
Signed-off-by: Shivaram Venkataraman <[email protected]>
commit 835c5a3bd549811178f5b455dc127401c5551866
Author: Shixiong Zhu <[email protected]>
Date: 2016-06-29T01:33:37Z
[SPARK-16268][PYSPARK] SQLContext should import DataStreamReader
## What changes were proposed in this pull request?
Fixed the following error:
```
>>> sqlContext.readStream
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
File "...", line 442, in readStream
return DataStreamReader(self._wrapped)
NameError: global name 'DataStreamReader' is not defined
```
## How was this patch tested?
The added test.
Author: Shixiong Zhu <[email protected]>
Closes #13958 from zsxwing/fix-import.
(cherry picked from commit 5bf8881b34a18f25acc10aeb28a06af4c44a6ac8)
Signed-off-by: Tathagata Das <[email protected]>
commit dd70a115cd562223e97f0b5e6172a9ea758be95d
Author: Reynold Xin <[email protected]>
Date: 2016-06-29T02:36:53Z
[SPARK-16248][SQL] Whitelist the list of Hive fallback functions
## What changes were proposed in this pull request?
This patch removes the blind fallback into Hive for functions. Instead, it
creates a whitelist and adds only a small number of functions to the whitelist,
i.e. the ones we intend to support in the long run in Spark.
## How was this patch tested?
Updated tests to reflect the change.
Author: Reynold Xin <[email protected]>
Closes #13939 from rxin/hive-whitelist.
(cherry picked from commit 363bcedeea40fe3f1a92271b96af2acba63e058c)
Signed-off-by: Reynold Xin <[email protected]>
commit 22b4072e704f9a68a605e9a4cebf54d2122fe448
Author: Yanbo Liang <[email protected]>
Date: 2016-06-29T02:53:07Z
[SPARK-16245][ML] model loading backward compatibility for ml.feature.PCA
## What changes were proposed in this pull request?
model loading backward compatibility for ml.feature.PCA.
## How was this patch tested?
Existing unit tests and a manual test for loading models saved by Spark 1.6.
Author: Yanbo Liang <[email protected]>
Closes #13937 from yanboliang/spark-16245.
(cherry picked from commit 0df5ce1bc1387a58b33cd185008f4022bd3dcc69)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 345212b9fc91638f6cda8519ddbfec6a780854c1
Author: Davies Liu <[email protected]>
Date: 2016-06-28T20:43:59Z
[SPARK-16259][PYSPARK] cleanup options in DataFrame read/write API
## What changes were proposed in this pull request?
There is some duplicated code for options in the DataFrame reader/writer
API; this PR cleans it up. It also fixes a bug in `escapeQuotes` for csv().
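For reference, a sketch of the writer-side option in question, shown here
on the Scala side (`df` and the path are placeholders):
```scala
// escapeQuotes controls whether values containing quotes are escaped
// when writing CSV; it defaults to true.
df.write
  .option("escapeQuotes", "false")
  .csv("/tmp/out")
```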
## How was this patch tested?
Existing tests.
Author: Davies Liu <[email protected]>
Closes #13948 from davies/csv_options.
commit 6650c0533e5c60f8653d2e0a608a42d5838fa553
Author: Tathagata Das <[email protected]>
Date: 2016-06-29T05:07:11Z
[SPARK-16266][SQL][STREAMING] Moved DataStreamReader/Writer from pyspark.sql
to pyspark.sql.streaming
## What changes were proposed in this pull request?
- Moved DataStreamReader/Writer from pyspark.sql to pyspark.sql.streaming
to make them consistent with scala packaging
- Exposed the necessary classes in sql.streaming package so that they
appear in the docs
- Added pyspark.sql.streaming module to the docs
## How was this patch tested?
- updated unit tests.
- generated docs for testing visibility of pyspark.sql.streaming classes.
Author: Tathagata Das <[email protected]>
Closes #13955 from tdas/SPARK-16266.
commit 904122335d94681be2afbaf4f41a50d468e707b9
Author: Holden Karau <[email protected]>
Date: 2016-06-29T08:52:20Z
[TRIVIAL][DOCS][STREAMING][SQL] The return type mentioned in the Javadoc is
incorrect for toJavaRDD, …
## What changes were proposed in this pull request?
Change the return type mentioned in the Javadoc for `toJavaRDD` / `javaRDD`
to match the actual return type and be consistent with the Scala RDD return
type.
## How was this patch tested?
Docs only change.
Author: Holden Karau <[email protected]>
Closes #13954 from holdenk/trivial-streaming-tojavardd-doc-fix.
(cherry picked from commit 757dc2c09d23400dacac22e51f52062bbe471136)
Signed-off-by: Tathagata Das <[email protected]>
commit 1b4d63f6f1e9f5aa819a149e1f5e45bba7d865bb
Author: Cheng Lian <[email protected]>
Date: 2016-06-29T11:08:36Z
[SPARK-16291][SQL] CheckAnalysis should capture nested aggregate functions
that reference no input attributes
## What changes were proposed in this pull request?
`MAX(COUNT(*))` is invalid since an aggregate expression can't be nested
within another aggregate expression. This case should be caught in the
analysis phase, but somehow sneaks through to runtime.
The reason is that when checking aggregate expressions in `CheckAnalysis`,
one checking branch treats all expressions that reference no input attributes
as valid. However, `MAX(COUNT(*))` is translated into `MAX(COUNT(1))` during
analysis, and so also references no input attribute.
This PR fixes the issue by removing the aforementioned branch.
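Concretely, after this fix a query like the following is rejected during
analysis instead of failing at runtime (table and column names invented):
```scala
// Now raises AnalysisException at analysis time: an aggregate function
// cannot appear inside another aggregate function.
spark.sql("SELECT MAX(COUNT(*)) FROM events GROUP BY userId")
```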
## How was this patch tested?
New test case added in `AnalysisErrorSuite`.
Author: Cheng Lian <[email protected]>
Closes #13968 from liancheng/spark-16291-nested-agg-functions.
(cherry picked from commit d1e8108854deba3de8e2d87eb4389d11fb17ee57)
Signed-off-by: Wenchen Fan <[email protected]>
commit ba71cf451efceaa6b454baa51c7a6b7e184d3cb7
Author: Bryan Cutler <[email protected]>
Date: 2016-06-29T12:06:38Z
[SPARK-16261][EXAMPLES][ML] Fixed incorrect appNames in ML Examples
## What changes were proposed in this pull request?
Some appNames in ML examples are incorrect, mostly in PySpark but one in
Scala. This corrects the names.
## How was this patch tested?
Style, local tests
Author: Bryan Cutler <[email protected]>
Closes #13949 from BryanCutler/pyspark-example-appNames-fix-SPARK-16261.
(cherry picked from commit 21385d02a987bcee1198103e447c019f7a769d68)
Signed-off-by: Nick Pentreath <[email protected]>
commit d96e8c2dd0a9949751d3074b6ab61eee12f5d622
Author: Yanbo Liang <[email protected]>
Date: 2016-06-29T18:20:35Z
[MINOR][SPARKR] Fix arguments of survreg in SparkR
## What changes were proposed in this pull request?
Fix the wrong argument descriptions of ```survreg``` in SparkR.
## How was this patch tested?
```Arguments``` section of the ```survreg``` doc before this PR (with a
wrong description for ```path``` and a missing ```overwrite```):
[screenshot]
After this PR:
[screenshot]
Author: Yanbo Liang <[email protected]>
Closes #13970 from yanboliang/spark-16143-followup.
(cherry picked from commit c6a220d756f23ee89a0d1366b20259890c9d67c9)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 1cde325e29286a8c6631b0b32351994aad7db567
Author: Xin Ren <[email protected]>
Date: 2016-06-29T18:25:00Z
[SPARK-16140][MLLIB][SPARKR][DOCS] Group k-means method in generated R doc
https://issues.apache.org/jira/browse/SPARK-16140
## What changes were proposed in this pull request?
Group the R doc of spark.kmeans, predict(KM), summary(KM),
read/write.ml(KM) under Rd spark.kmeans. The example code was updated.
## How was this patch tested?
Tested on my local machine
And on my laptop `jekyll build` is failing to build the API docs, so here I
can only show the HTML I manually generated from the Rd files, with no CSS
applied, but the doc content should be there.
[screenshot]
Author: Xin Ren <[email protected]>
Closes #13921 from keypointt/SPARK-16140.
(cherry picked from commit 8c9cd0a7a719ce4286f77f35bb787e2b626a472e)
Signed-off-by: Xiangrui Meng <[email protected]>
commit edd1905c0fde69025cb6d8d8f15d13d6a6da0e3b
Author: gatorsmile <[email protected]>
Date: 2016-06-29T18:30:49Z
[SPARK-16236][SQL][FOLLOWUP] Add Path Option back to Load API in
DataFrameReader
#### What changes were proposed in this pull request?
In the Python API, we have the same issue. Thanks for identifying this
issue, zsxwing! Below is an example:
```Python
spark.read.format('json').load('python/test_support/sql/people.json')
```
#### How was this patch tested?
Existing test cases cover the changes made by this PR.
Author: gatorsmile <[email protected]>
Closes #13965 from gatorsmile/optionPaths.
(cherry picked from commit 39f2eb1da34f26bf68c535c8e6b796d71a37a651)
Signed-off-by: Shixiong Zhu <[email protected]>
commit 3cc258efb14ee9a35163daa3fa8f4724507ac4af
Author: Tathagata Das <[email protected]>
Date: 2016-06-29T18:45:57Z
[SPARK-16256][SQL][STREAMING] Added Structured Streaming Programming Guide
Title defines all.
Author: Tathagata Das <[email protected]>
Closes #13945 from tdas/SPARK-16256.
(cherry picked from commit 64132a14fb7a7255feeb5847a54f541fe551bf23)
Signed-off-by: Tathagata Das <[email protected]>
commit 809af6d9d7df17f5889ebd8640c189e8d1e143a8
Author: hyukjinkwon <[email protected]>
Date: 2016-06-29T20:32:03Z
[TRIVIAL] [PYSPARK] Clean up orc compression option as well
## What changes were proposed in this pull request?
This PR corrects the ORC compression option for PySpark as well. I think
this was mistakenly missed in https://github.com/apache/spark/pull/13948.
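The corresponding option on the Scala side, for reference (`df` and the
path are placeholders; "snappy" is one of the supported codecs):
```scala
// ORC writer compression codec, analogous to the PySpark option here.
df.write
  .option("compression", "snappy")
  .orc("/tmp/orc-out")
```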
## How was this patch tested?
N/A
Author: hyukjinkwon <[email protected]>
Closes #13963 from HyukjinKwon/minor-orc-compress.
(cherry picked from commit d8a87a3ed211dd08f06eeb9560661b8f11ce82fa)
Signed-off-by: Davies Liu <[email protected]>
commit a7f66ef62b94cdcf65c3043406fd5fd8d6a584c1
Author: Yin Huai <[email protected]>
Date: 2016-06-29T21:42:58Z
[SPARK-16301] [SQL] The analyzer rule for resolving using joins should
respect the case sensitivity setting.
## What changes were proposed in this pull request?
The analyzer rule for resolving using joins should respect the case
sensitivity setting.
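A hedged sketch of what respecting the setting means for a USING join
(`left`, `right`, and the column names are invented):
```scala
// With spark.sql.caseSensitive=false (the default), the using-column
// should resolve even if the two sides differ in letter case.
spark.conf.set("spark.sql.caseSensitive", "false")
val joined = left.join(right, Seq("id")) // "id" on left, "ID" on right
```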
## How was this patch tested?
New tests in ResolveNaturalJoinSuite
Author: Yin Huai <[email protected]>
Closes #13977 from yhuai/SPARK-16301.
(cherry picked from commit 8b5a8b25b9d29b7d0949d5663c7394b26154a836)
Signed-off-by: Davies Liu <[email protected]>
commit ef0253ff6d7fb9bf89ef023f2d5864c70d9d792d
Author: Dongjoon Hyun <[email protected]>
Date: 2016-06-29T22:00:41Z
[SPARK-16006][SQL] Attempting to write empty DataFrame with no fields throws
non-intuitive exception
## What changes were proposed in this pull request?
This PR allows `emptyDataFrame.write` to succeed, since the user didn't
specify any partition columns.
**Before**
```scala
scala> spark.emptyDataFrame.write.parquet("/tmp/t1")
org.apache.spark.sql.AnalysisException: Cannot use all columns for
partition columns;
scala> spark.emptyDataFrame.write.csv("/tmp/t1")
org.apache.spark.sql.AnalysisException: Cannot use all columns for
partition columns;
```
After this PR, no exception occurs and the created directory contains only
one file, `_SUCCESS`, as expected.
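**After** (a sketch mirroring the block above):
```scala
scala> spark.emptyDataFrame.write.parquet("/tmp/t1")
// succeeds; /tmp/t1 contains only the _SUCCESS file
```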
## How was this patch tested?
Pass the Jenkins tests including updated test cases.
Author: Dongjoon Hyun <[email protected]>
Closes #13730 from dongjoon-hyun/SPARK-16006.
(cherry picked from commit 9b1b3ae771babf127f64898d5dc110721597a760)
Signed-off-by: Reynold Xin <[email protected]>
commit c4cebd5725e6d8ade8c0a02652e251d04903da72
Author: Eric Liang <[email protected]>
Date: 2016-06-29T22:07:32Z
[SPARK-16238] Metrics for generated method and class bytecode size
## What changes were proposed in this pull request?
This extends SPARK-15860 to include metrics for the actual bytecode size of
janino-generated methods. They can be accessed in the same way as any other
codahale metric, e.g.
```
scala>
org.apache.spark.metrics.source.CodegenMetrics.METRIC_GENERATED_CLASS_BYTECODE_SIZE.getSnapshot().getValues()
res7: Array[Long] = Array(532, 532, 532, 542, 1479, 2670, 3585, 3585)
scala>
org.apache.spark.metrics.source.CodegenMetrics.METRIC_GENERATED_METHOD_BYTECODE_SIZE.getSnapshot().getValues()
res8: Array[Long] = Array(5, 5, 5, 5, 10, 10, 10, 10, 15, 15, 15, 38, 63,
79, 88, 94, 94, 94, 132, 132, 165, 165, 220, 220)
```
## How was this patch tested?
Small unit test, also verified manually that the performance impact is
minimal (<10%). hvanhovell
Author: Eric Liang <[email protected]>
Closes #13934 from ericl/spark-16238.
(cherry picked from commit 23c58653f900bfb71ef2b3186a95ad2562c33969)
Signed-off-by: Reynold Xin <[email protected]>
commit 011befd2098bf78979cc8e00de1576bf339583b2
Author: Dongjoon Hyun <[email protected]>
Date: 2016-06-29T23:08:10Z
[SPARK-16228][SQL] HiveSessionCatalog should return `double`-param
functions for decimal param lookups
## What changes were proposed in this pull request?
This PR supports a fallback lookup by casting `DecimalType` into
`DoubleType` for the external functions with `double`-type parameter.
**Reported Error Scenarios**
```scala
scala> sql("select percentile(value, 0.5) from values 1,2,3 T(value)")
org.apache.spark.sql.AnalysisException: ... No matching method for class
org.apache.hadoop.hive.ql.udf.UDAFPercentile with (int, decimal(38,18)).
Possible choices: _FUNC_(bigint, array<double>) _FUNC_(bigint, double) ; line
1 pos 7
scala> sql("select percentile_approx(value, 0.5) from values 1.0,2.0,3.0
T(value)")
org.apache.spark.sql.AnalysisException: ... Only a float/double or
float/double array argument is accepted as parameter 2, but decimal(38,18) was
passed instead.; line 1 pos 7
```
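With the fallback in place, the decimal argument is cast to double and the
same queries resolve (a sketch of the expected behavior):
```scala
scala> sql("select percentile(value, 0.5) from values 1,2,3 T(value)")
// decimal(38,18) now falls back to double, matching _FUNC_(bigint, double)
```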
## How was this patch tested?
Pass the Jenkins tests (including a new testcase).
Author: Dongjoon Hyun <[email protected]>
Closes #13930 from dongjoon-hyun/SPARK-16228.
(cherry picked from commit 2eaabfa4142d4050be2b45fd277ff5c7fa430581)
Signed-off-by: Reynold Xin <[email protected]>
commit 8da4314735ed55f259642e2977d8d7bf2212474f
Author: Wenchen Fan <[email protected]>
Date: 2016-06-30T00:15:08Z
[SPARK-16134][SQL] optimizer rules for typed filter
## What changes were proposed in this pull request?
This PR adds 3 optimizer rules for typed filter:
1. push typed filter down through `SerializeFromObject` and eliminate the
deserialization in filter condition.
2. pull typed filter up through `SerializeFromObject` and eliminate the
deserialization in filter condition.
3. combine adjacent typed filters and share the deserialized object among
all the condition expressions.
This PR also adds a `TypedFilter` logical plan to separate it from the
normal filter, so that the concept is clearer and it's easier to write
optimizer rules.
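The shape these rules target, as a small sketch (assumes a SparkSession
`spark` with its implicits imported):
```scala
// Two adjacent typed filters: rule 3 combines them so the underlying
// object is deserialized once and shared by both predicates.
val ds = spark.range(100).as[Long]
val result = ds.filter(_ % 2 == 0).filter(_ > 10)
```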
## How was this patch tested?
`TypedFilterOptimizationSuite`
Author: Wenchen Fan <[email protected]>
Closes #13846 from cloud-fan/filter.
(cherry picked from commit d063898bebaaf4ec2aad24c3ac70aabdbf97a190)
Signed-off-by: Cheng Lian <[email protected]>
commit e1bdf1e02483bf513b6e012e8921d440a5efbc11
Author: Cheng Lian <[email protected]>
Date: 2016-06-30T00:17:43Z
Revert "[SPARK-16134][SQL] optimizer rules for typed filter"
This reverts commit 8da4314735ed55f259642e2977d8d7bf2212474f.
commit b52bd8070dc852b419283f8a14595e42c179d3d0
Author: Dongjoon Hyun <[email protected]>
Date: 2016-06-30T00:29:17Z
[SPARK-16267][TEST] Replace deprecated `CREATE TEMPORARY TABLE ... USING`
from testsuites.
## What changes were proposed in this pull request?
After SPARK-15674, `DDLStrategy` prints out the following deprecation
messages in the testsuites.
```
12:10:53.284 WARN
org.apache.spark.sql.execution.SparkStrategies$DDLStrategy:
CREATE TEMPORARY TABLE normal_orc_source USING... is deprecated,
please use CREATE TEMPORARY VIEW viewName USING... instead
```
Total : 40
- JDBCWriteSuite: 14
- DDLSuite: 6
- TableScanSuite: 6
- ParquetSourceSuite: 5
- OrcSourceSuite: 2
- SQLQuerySuite: 2
- HiveCommandSuite: 2
- JsonSuite: 1
- PrunedScanSuite: 1
- FilteredScanSuite 1
This PR replaces `CREATE TEMPORARY TABLE` with `CREATE TEMPORARY VIEW` in
order to remove the deprecation messages in the above testsuites except
`DDLSuite`, `SQLQuerySuite`, `HiveCommandSuite`.
The Jenkins results show only 10 remaining messages:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/61422/consoleFull
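The replacement pattern, sketched (the source name and options are
placeholders):
```scala
// Deprecated form replaced in the testsuites:
//   CREATE TEMPORARY TABLE normal_orc_source USING orc OPTIONS (...)
// New form:
sql("CREATE TEMPORARY VIEW normal_orc_source USING orc OPTIONS (path '/tmp/orc')")
```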
## How was this patch tested?
This is a testsuite-only change.
Author: Dongjoon Hyun <[email protected]>
Closes #13956 from dongjoon-hyun/SPARK-16267.
(cherry picked from commit 831a04f5d152d1839c0edfdf65bb728aa5957f16)
Signed-off-by: Reynold Xin <[email protected]>
commit a54852350346cacae61d851d796bc3a7abd3a048
Author: Cheng Lian <[email protected]>
Date: 2016-06-30T05:50:53Z
[SPARK-16294][SQL] Labelling support for the include_example Jekyll plugin
## What changes were proposed in this pull request?
This PR adds labelling support for the `include_example` Jekyll plugin, so
that we may split a single source file into multiple line blocks with different
labels, and include them in multiple code snippets in the generated HTML page.
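A hedged sketch of the labelling convention (the label name is invented,
and the marker syntax reflects my best understanding of the plugin):
```scala
// In an example source file, a labelled region is delimited with markers
// the include_example plugin can extract on its own:
// $example on:init_session$
val spark = SparkSession.builder.appName("Example").getOrCreate()
// $example off:init_session$
```
A doc page would then pull just that region with something like
`{% include_example init_session scala/path/To/Example.scala %}`.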
## How was this patch tested?
Manually tested.
<img width="923" alt="screenshot at jun 29 19-53-21"
src="https://cloud.githubusercontent.com/assets/230655/16451099/66a76db2-3e33-11e6-84fb-63104c2f0688.png">
Author: Cheng Lian <[email protected]>
Closes #13972 from liancheng/include-example-with-labels.
(cherry picked from commit bde1d6a61593aeb62370f526542cead94919b0c0)
Signed-off-by: Xiangrui Meng <[email protected]>
commit 3134f116a3565c3a299fa2e7094acd7304d64280
Author: cody koeninger <[email protected]>
Date: 2016-06-30T06:21:03Z
[SPARK-12177][STREAMING][KAFKA] Update KafkaDStreams to new Kafka 0.10
Consumer API
## What changes were proposed in this pull request?
New Kafka consumer api for the released 0.10 version of Kafka
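A hedged sketch of the new consumer API usage (broker, group, and topic
are placeholders; `ssc` is assumed to be an existing StreamingContext):
```scala
import org.apache.kafka.common.serialization.StringDeserializer
import org.apache.spark.streaming.kafka010.KafkaUtils
import org.apache.spark.streaming.kafka010.LocationStrategies.PreferConsistent
import org.apache.spark.streaming.kafka010.ConsumerStrategies.Subscribe

val kafkaParams = Map[String, Object](
  "bootstrap.servers" -> "localhost:9092",
  "key.deserializer" -> classOf[StringDeserializer],
  "value.deserializer" -> classOf[StringDeserializer],
  "group.id" -> "example-group"
)
// Direct stream backed by the new Kafka 0.10 consumer.
val stream = KafkaUtils.createDirectStream[String, String](
  ssc, PreferConsistent, Subscribe[String, String](Seq("topicA"), kafkaParams))
```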
## How was this patch tested?
Unit tests, manual tests
Author: cody koeninger <[email protected]>
Closes #11863 from koeninger/kafka-0.9.
(cherry picked from commit dedbceec1ef33ccd88101016de969a1ef3e3e142)
Signed-off-by: Tathagata Das <[email protected]>
commit c8a7c23054209db5474d96de2a7e2d8a6f8cc0da
Author: Tathagata Das <[email protected]>
Date: 2016-06-30T06:38:19Z
[SPARK-16256][DOCS] Minor fixes on the Structured Streaming Programming
Guide
Author: Tathagata Das <[email protected]>
Closes #13978 from tdas/SPARK-16256-1.
(cherry picked from commit 2c3d96134dcc0428983eea087db7e91072215aea)
Signed-off-by: Tathagata Das <[email protected]>
----