GitHub user junyangq reopened a pull request:

    https://github.com/apache/spark/pull/14558

    [SPARK-16508][SparkR] Fix warnings on undocumented/duplicated arguments by 
CRAN-check

    ## What changes were proposed in this pull request?
    
    This PR tries to fix all the remaining "undocumented/duplicated arguments" 
warnings given by CRAN-check.
    
    ## How was this patch tested?
    
    R unit test and check-cran.sh script.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/junyangq/spark SPARK-16508-branch-2.0

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/14558.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #14558
    
----
commit 82e2f09517e9f3d726af0046d251748f892f59c8
Author: Junyang Qian <[email protected]>
Date:   2016-08-09T04:52:34Z

    Fix part of undocumented/duplicated arguments warnings by CRAN-check

commit 41d9dcac3e1d3ee1a27fe094ebb60c1c18d6bcff
Author: Mariusz Strzelecki <[email protected]>
Date:   2016-08-09T16:44:43Z

    [SPARK-16950] [PYSPARK] fromOffsets parameter support in 
KafkaUtils.createDirectStream for python3
    
    ## What changes were proposed in this pull request?
    
    Ability to use KafkaUtils.createDirectStream with starting offsets in Python 3,
    by using java.lang.Number instead of Long during param mapping in the Scala helper.
    This allows py4j to pass either Integer or Long to the map and resolves the
    ClassCastException problems.
    
    ## How was this patch tested?
    
    unit tests
    
    jerryshao  - could you please look at this PR?
    
    Author: Mariusz Strzelecki <[email protected]>
    
    Closes #14540 from szczeles/kafka_pyspark.
    
    (cherry picked from commit 29081b587f3423bf5a3e0066357884d0c26a04bf)
    Signed-off-by: Davies Liu <[email protected]>
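    
    A rough, hedged sketch of the mapping change described above (simplified; the real
    helper keys offsets by topic and partition, not by a plain string): accepting
    java.lang.Number lets py4j hand over either Integer or Long values.
    
    ```scala
    import java.lang.{Number => JNumber}
    import java.util.{Map => JMap}
    import scala.collection.JavaConverters._
    
    // Simplified sketch: convert whatever numeric type py4j delivers into a Long offset.
    def toLongOffsets(fromOffsets: JMap[String, JNumber]): Map[String, Long] =
      fromOffsets.asScala.mapValues(_.longValue()).toMap
    ```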

commit 44115e90ef2a80d8ecf3965b97ce7bee21e29158
Author: Josh Rosen <[email protected]>
Date:   2016-08-09T18:21:45Z

    [SPARK-16956] Make ApplicationState.MAX_NUM_RETRY configurable
    
    ## What changes were proposed in this pull request?
    
    This patch introduces a new configuration, 
`spark.deploy.maxExecutorRetries`, to let users configure an obscure behavior 
in the standalone master where the master will kill Spark applications which 
have experienced too many back-to-back executor failures. The current setting 
is a hardcoded constant (10); this patch replaces that with a new cluster-wide 
configuration.
    
    **Background:** This application-killing was added in 
6b5980da796e0204a7735a31fb454f312bc9daac (from September 2012) and I believe 
that it was designed to prevent a faulty application whose executors could 
never launch from DOS'ing the Spark cluster via an infinite series of executor 
launch attempts. In a subsequent patch (#1360), this feature was refined to 
prevent applications which have running executors from being killed by this 
code path.
    
    **Motivation for making this configurable:** Previously, if a Spark 
Standalone application experienced more than `ApplicationState.MAX_NUM_RETRY` 
executor failures and was left with no executors running then the Spark master 
would kill that application, but this behavior is problematic in environments 
where the Spark executors run on unstable infrastructure and can all 
simultaneously die. For instance, if your Spark driver runs on an on-demand EC2 
instance while all workers run on ephemeral spot instances then it's possible 
for all executors to die at the same time while the driver stays alive. In this 
case, it may be desirable to keep the Spark application alive so that it can 
recover once new workers and executors are available. In order to accommodate 
this use-case, this patch modifies the Master to never kill faulty applications 
if `spark.deploy.maxExecutorRetries` is negative.
    
    I'd like to merge this patch into master, branch-2.0, and branch-1.6.
    
    ## How was this patch tested?
    
    I tested this manually using `spark-shell` and `local-cluster` mode. This 
is a tricky feature to unit test and historically this code has not changed 
very often, so I'd prefer to skip the additional effort of adding a testing 
framework and would rather rely on manual tests and review for now.
    
    Author: Josh Rosen <[email protected]>
    
    Closes #14544 from JoshRosen/add-setting-for-max-executor-failures.
    
    (cherry picked from commit b89b3a5c8e391fcaebe7ef3c77ef16bb9431d6ab)
    Signed-off-by: Josh Rosen <[email protected]>
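    
    A hedged illustration of the new setting (not part of the original patch text): a
    negative value disables the application-killing behavior described above. Since this
    is a cluster-wide setting read by the standalone Master, in practice it would usually
    go into the master's spark-defaults.conf rather than an application's conf.
    
    ```scala
    import org.apache.spark.SparkConf
    
    // Any negative value means: never kill an application for repeated back-to-back
    // executor failures; the previous hardcoded behavior corresponds to a limit of 10.
    val masterConf = new SparkConf()
      .set("spark.deploy.maxExecutorRetries", "-1")
    ```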

commit 2d136dba415a40a04598068ac2cea0490a6fd091
Author: Davies Liu <[email protected]>
Date:   2016-08-09T17:04:36Z

    [SPARK-16905] SQL DDL: MSCK REPAIR TABLE
    
    MSCK REPAIR TABLE can be used to recover the partitions in the external catalog
    based on the partitions present in the file system.
    
    Another syntax is: ALTER TABLE table RECOVER PARTITIONS
    
    The implementation in this PR only lists partitions (not the files within a
    partition) in the driver (in parallel if needed).
    
    Added unit tests for it and Hive compatibility test suite.
    
    Author: Davies Liu <[email protected]>
    
    Closes #14500 from davies/repair_table.
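    
    Both syntaxes above, shown through the SQL API (table name hypothetical; assumes
    `spark` is a SparkSession, e.g. from spark-shell):
    
    ```scala
    // Re-register partition directories found on the file system in the catalog.
    spark.sql("MSCK REPAIR TABLE logs")
    // Equivalent alternative syntax mentioned above:
    spark.sql("ALTER TABLE logs RECOVER PARTITIONS")
    ```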

commit 901edbb8a41231137796d823c8b6624460163b3a
Author: Junyang Qian <[email protected]>
Date:   2016-08-10T03:24:42Z

    More fixes of the docs.

commit 475ee38150ee5a234156a903e4de227954b0063e
Author: Michał Kiełbowicz <[email protected]>
Date:   2016-08-10T06:01:50Z

    Fixed typo
    
    ## What changes were proposed in this pull request?
    
    Fixed small typo - "value ... ~~in~~ is null"
    
    ## How was this patch tested?
    
    Still compiles!
    
    Author: Michał Kiełbowicz <[email protected]>
    
    Closes #14569 from jupblb/typo-fix.
    
    (cherry picked from commit 9dc3e602d77ccdf670f1b6648e5674066d189cc0)
    Signed-off-by: Reynold Xin <[email protected]>

commit 2285de7347653ea6b3d35d58639ac70563f3c57a
Author: Sun Rui <[email protected]>
Date:   2016-08-10T09:01:29Z

    [SPARK-16522][MESOS] Spark application throws exception on exit.
    
    This is backport of https://github.com/apache/spark/pull/14175 to branch 2.0
    
    Author: Sun Rui <[email protected]>
    
    Closes #14575 from sun-rui/SPARK-16522-branch-2.0.

commit 20efb7969ac8b313cd0895b57789e47d657453a4
Author: Sean Owen <[email protected]>
Date:   2016-08-10T09:14:43Z

    [SPARK-16324][SQL] regexp_extract should doc that it returns empty string 
when match fails
    
    ## What changes were proposed in this pull request?
    
    Document that regexp_extract returns an empty string when the regex or group does
    not match.
    
    ## How was this patch tested?
    
    Jenkins test, with a few new test cases
    
    Author: Sean Owen <[email protected]>
    
    Closes #14525 from srowen/SPARK-16324.
    
    (cherry picked from commit 0578ff9681edbaab4ae68f67272dc3d4d890d53b)
    Signed-off-by: Sean Owen <[email protected]>
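    
    A small, hedged illustration of the documented behavior (values made up; assumes
    `spark` is a SparkSession):
    
    ```scala
    // No digits to match -> empty string, not null.
    spark.sql("SELECT regexp_extract('foobar', '([0-9]+)', 1)").show()
    // Group 1 matches -> "123".
    spark.sql("SELECT regexp_extract('foo123bar', '([0-9]+)', 1)").show()
    ```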

commit 719ac5f37ccf32c34c70524b8cf9a2699c71a353
Author: avulanov <[email protected]>
Date:   2016-08-10T09:25:00Z

    [SPARK-15899][SQL] Fix the construction of the file path with hadoop Path
    
    ## What changes were proposed in this pull request?
    
    Fix the construction of the file path. The previous way of construction created an
    incorrect path on Windows.
    
    ## How was this patch tested?
    
    Run SQL unit tests on Windows
    
    Author: avulanov <[email protected]>
    
    Closes #13868 from avulanov/SPARK-15899-file.
    
    (cherry picked from commit 11a6844bebbad1968bcdc295ab2de31c60dc0874)
    Signed-off-by: Sean Owen <[email protected]>
    
    # Conflicts:
    #   sql/core/src/main/scala/org/apache/spark/sql/internal/SQLConf.scala
    #   
sql/core/src/test/scala/org/apache/spark/sql/execution/command/DDLSuite.scala

commit 15637f735f4b27b291f40bbeadb98c5e0318bf70
Author: Sean Owen <[email protected]>
Date:   2016-08-10T15:48:57Z

    Revert "[SPARK-15899][SQL] Fix the construction of the file path with 
hadoop Path"
    
    This reverts commit 719ac5f37ccf32c34c70524b8cf9a2699c71a353.

commit 977fbbfcae705dbdbf203bd0a6e7c75a12156d3f
Author: Liang-Chi Hsieh <[email protected]>
Date:   2016-08-10T17:03:55Z

    [SPARK-15639] [SPARK-16321] [SQL] Push down filter at RowGroups level for 
parquet reader
    
    The base class `SpecificParquetRecordReaderBase`, used by the vectorized Parquet
    reader, tries to get pushed-down filters from the given configuration. These
    pushed-down filters are used for RowGroups-level filtering. However, we don't set
    the filters to push down into the configuration, so the filters are never actually
    pushed down to do RowGroups-level filtering. This patch fixes that by setting up
    the filters to push down into the configuration for the reader.
    
    The benchmark below excludes the time of writing the Parquet file:
    
        test("Benchmark for Parquet") {
          val N = 500 << 12
            withParquetTable((0 until N).map(i => (101, i)), "t") {
              val benchmark = new Benchmark("Parquet reader", N)
              benchmark.addCase("reading Parquet file", 10) { iter =>
                sql("SELECT _1 FROM t where t._1 < 100").collect()
              }
              benchmark.run()
          }
        }
    
    `withParquetTable` by default runs tests for both the vectorized and non-vectorized
    readers. I only let it run the vectorized reader.
    
    When we set the Parquet block size to 1024 so that there are multiple row groups,
    the benchmark is:
    
    Before this patch:
    
    The retrieved row groups: 8063
    
        Java HotSpot(TM) 64-Bit Server VM 1.8.0_71-b15 on Linux 3.19.0-25-generic
        Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
        Parquet reader:                          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
        ------------------------------------------------------------------------------------------------
        reading Parquet file                           825 / 1233          2.5         402.6       1.0X
    
    After this patch:
    
    The retrieved row groups: 0
    
        Java HotSpot(TM) 64-Bit Server VM 1.8.0_71-b15 on Linux 3.19.0-25-generic
        Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
        Parquet reader:                          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
        ------------------------------------------------------------------------------------------------
        reading Parquet file                           306 /  503          6.7         149.6       1.0X
    
    Next, I run the benchmark for non-pushdown case using the same benchmark 
code but with disabled pushdown configuration. This time the parquet block size 
is default value.
    
    Before this patch:
    
        Java HotSpot(TM) 64-Bit Server VM 1.8.0_71-b15 on Linux 3.19.0-25-generic
        Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
        Parquet reader:                          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
        ------------------------------------------------------------------------------------------------
        reading Parquet file                           136 /  238         15.0          66.5       1.0X
    
    After this patch:
    
        Java HotSpot(TM) 64-Bit Server VM 1.8.0_71-b15 on Linux 3.19.0-25-generic
        Intel(R) Core(TM) i7-5557U CPU @ 3.10GHz
        Parquet reader:                          Best/Avg Time(ms)    Rate(M/s)   Per Row(ns)   Relative
        ------------------------------------------------------------------------------------------------
        reading Parquet file                           124 /  193         16.5          60.7       1.0X
    
    For the non-pushdown case, the results suggest that this patch doesn't affect the
    normal code path.
    
    I've manually output the `totalRowCount` in `SpecificParquetRecordReaderBase` to
    see whether this patch actually filters the row groups. When running the above
    benchmark:
    
    After this patch:
        `totalRowCount = 0`
    
    Before this patch:
        `totalRowCount = 1024000`
    
    Existing tests should be passed.
    
    Author: Liang-Chi Hsieh <[email protected]>
    
    Closes #13701 from viirya/vectorized-reader-push-down-filter2.
    
    (cherry picked from commit 19af298bb6d264adcf02f6f84c8dc1542b408507)
    Signed-off-by: Davies Liu <[email protected]>
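    
    A hedged sketch of the mechanism described above (not the exact patch): row-group
    filtering only takes effect if the pushed-down predicate is actually placed into
    the Hadoop configuration that the record reader later consults.
    
    ```scala
    import org.apache.hadoop.conf.Configuration
    import org.apache.parquet.filter2.predicate.FilterApi
    import org.apache.parquet.hadoop.ParquetInputFormat
    
    val hadoopConf = new Configuration()
    // Mirrors the benchmark's predicate t._1 < 100 on the Parquet column "_1".
    val predicate = FilterApi.lt(FilterApi.intColumn("_1"), Integer.valueOf(100))
    ParquetInputFormat.setFilterPredicate(hadoopConf, predicate)
    ```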

commit d3a30d2f0531049b60d1b321b3b8b3d0a6d716d2
Author: Junyang Qian <[email protected]>
Date:   2016-08-10T18:18:23Z

    [SPARK-16579][SPARKR] add install.spark function
    
    Add an install.spark function to the SparkR package. Users can run
    `install.spark()` to install Spark to a local directory within R.
    
    Updates:
    
    Several changes have been made:
    
    - `install.spark()`
        - check existence of tar file in the cache folder, and download only if 
not found
        - look-up priority for mirror_url: user-provided -> preferred mirror site
          from the Apache website -> hardcoded backup option
        - use 2.0.0
    
    - `sparkR.session()`
        - can install spark when not found in `SPARK_HOME`
    
    Manual tests, running the check-cran.sh script added in #14173.
    
    Author: Junyang Qian <[email protected]>
    
    Closes #14258 from junyangq/SPARK-16579.
    
    (cherry picked from commit 214ba66a030bc3a718c567a742b0db44bf911d61)
    Signed-off-by: Shivaram Venkataraman <[email protected]>

commit 1e4013571b18ca337ea664838f7f8e781c8de7aa
Author: Tao Wang <[email protected]>
Date:   2016-08-11T05:30:18Z

    [SPARK-17010][MINOR][DOC] Wrong description in memory management document
    
    ## What changes were proposed in this pull request?
    
    Change the remaining percentage to the right one.
    
    ## How was this patch tested?
    
    Manual review
    
    Author: Tao Wang <[email protected]>
    
    Closes #14591 from WangTaoTheTonic/patch-1.
    
    (cherry picked from commit 7a6a3c3fbcea889ca20beae9d4198df2fe53bd1b)
    Signed-off-by: Reynold Xin <[email protected]>

commit 8611bc2058eb7397c372de39b59934494569623c
Author: petermaxlee <[email protected]>
Date:   2016-08-10T09:17:21Z

    [SPARK-16866][SQL] Infrastructure for file-based SQL end-to-end tests
    
    ## What changes were proposed in this pull request?
    This patch introduces SQLQueryTestSuite, a basic framework for end-to-end SQL test
    cases defined in spark/sql/core/src/test/resources/sql-tests. This is a more
    standard way to test SQL queries end-to-end, as done in other open source database
    systems, because files are easier to manage.
    
    This is inspired by HiveCompatibilitySuite, but simplified for general 
Spark SQL tests. Once this is merged, I can work towards porting SQLQuerySuite 
over, and eventually also move the existing HiveCompatibilitySuite to use this 
framework.
    
    Unlike HiveCompatibilitySuite, SQLQueryTestSuite compares both the output 
schema and the output data (in string form).
    
    When there is a mismatch, the error message looks like the following:
    
    ```
    [info] - blacklist.sql !!! IGNORED !!!
    [info] - number-format.sql *** FAILED *** (2 seconds, 405 milliseconds)
    [info]   Expected "...147483648     -214748364[8]", but got "...147483648   
-214748364[9]" Result should match for query #1 (SQLQueryTestSuite.scala:171)
    [info]   org.scalatest.exceptions.TestFailedException:
    [info]   at 
org.scalatest.Assertions$class.newAssertionFailedException(Assertions.scala:495)
    [info]   at 
org.scalatest.FunSuite.newAssertionFailedException(FunSuite.scala:1555)
    [info]   at 
org.scalatest.Assertions$class.assertResult(Assertions.scala:1171)
    ```
    
    ## How was this patch tested?
    This is a test infrastructure change.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14472 from petermaxlee/SPARK-16866.
    
    (cherry picked from commit b9f8a117097bc102e261b68f38a679d16e19f2e2)
    Signed-off-by: Wenchen Fan <[email protected]>

commit 51b1016682a805e06b857a6b1f160a877839dbd5
Author: petermaxlee <[email protected]>
Date:   2016-08-11T04:05:32Z

    [SPARK-17008][SPARK-17009][SQL] Normalization and isolation in 
SQLQueryTestSuite.
    
    ## What changes were proposed in this pull request?
    This patch enhances SQLQueryTestSuite in two ways:
    
    1. SPARK-17009: Use a new SparkSession for each test case to provide stronger
    isolation (e.g. config changes in one test case do not impact another). That said,
    we do not currently isolate catalog changes.
    2. SPARK-17008: Normalize query output using sorting, inspired by 
HiveComparisonTest.
    
    I also ported a few new test cases over from SQLQuerySuite.
    
    ## How was this patch tested?
    This is a test harness update.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14590 from petermaxlee/SPARK-17008.
    
    (cherry picked from commit 425c7c2dbd2923094712e1215dd29272fb09cd79)
    Signed-off-by: Wenchen Fan <[email protected]>

commit ea8a198b9838f731458456f369b700815f02198a
Author: petermaxlee <[email protected]>
Date:   2016-08-11T04:26:46Z

    [SPARK-17007][SQL] Move test data files into a test-data folder
    
    ## What changes were proposed in this pull request?
    This patch moves all the test data files in sql/core/src/test/resources to
    sql/core/src/test/resources/test-data, so we don't clutter the top-level
    sql/core/src/test/resources. It also deletes
    sql/core/src/test/resources/old-repeated.parquet since it is no longer used.
    
    The change will make it easier to spot the sql-tests directory.
    
    ## How was this patch tested?
    This is a test-only change.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14589 from petermaxlee/SPARK-17007.
    
    (cherry picked from commit 665e175328130ab3eb0370cdd2a43ed5a7bed1d6)
    Signed-off-by: Wenchen Fan <[email protected]>

commit 4b434e7dadffd83fe701668a23f0ece03e3f08bb
Author: petermaxlee <[email protected]>
Date:   2016-08-11T06:22:14Z

    [SPARK-17011][SQL] Support testing exceptions in SQLQueryTestSuite
    
    ## What changes were proposed in this pull request?
    This patch adds exception testing to SQLQueryTestSuite. When there is an exception
    in query execution, the query result contains the exception class along with the
    exception message.
    
    As part of this, I moved some additional test cases for limit from 
SQLQuerySuite over to SQLQueryTestSuite.
    
    ## How was this patch tested?
    This is a test harness change.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14592 from petermaxlee/SPARK-17011.
    
    (cherry picked from commit 0db373aaf87991207a7a8a09853b6fa602f0f45b)
    Signed-off-by: Wenchen Fan <[email protected]>

commit 0ed6236e94318ae0b56363ee1aef4a5577eeebd3
Author: Andrew Ash <[email protected]>
Date:   2016-08-11T10:26:57Z

    Correct example value for spark.ssl.YYY.XXX settings
    
    Docs adjustment to:
    - link to other relevant section of docs
    - correct a statement claiming only one value is supported when other values are
    actually supported
    
    Author: Andrew Ash <[email protected]>
    
    Closes #14581 from ash211/patch-10.
    
    (cherry picked from commit 8a6b7037bb058d00cc767895c3292509576ea2f9)
    Signed-off-by: Sean Owen <[email protected]>

commit 33a213f330bd746fb54783b16ea90c91b23a02a6
Author: avulanov <[email protected]>
Date:   2016-08-11T12:07:14Z

    [SPARK-15899][SQL] Fix the construction of the file path with hadoop Path 
for Spark 2.0
    
    This PR contains the adaptation of 
https://github.com/apache/spark/pull/13868 for Spark 2.0
    
    ## What changes were proposed in this pull request?
    
    Fix the construction of the file path in `SQLConf.scala` and the unit tests that
    rely on it: `SQLConfSuite` and `DDLSuite`. The previous way of construction created
    an incorrect path on Windows.
    
    ## How was this patch tested?
    
    Run unit tests on Windows
    
    Author: avulanov <[email protected]>
    
    Closes #14600 from avulanov/SPARK-15899-file-2.0.
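    
    A minimal sketch of the general idea (not the exact patch): joining paths with
    Hadoop's Path instead of raw string concatenation keeps separators consistent
    across platforms, including Windows.
    
    ```scala
    import org.apache.hadoop.fs.Path
    
    // Path normalizes separators, so the same code works on Windows and Unix.
    val base = System.getProperty("java.io.tmpdir")
    val warehousePath = new Path(base, "spark-warehouse").toString
    ```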

commit b87ba8f3504eee8df2b3b524086038e66cd68cc3
Author: Junyang Qian <[email protected]>
Date:   2016-08-11T16:15:43Z

    Fix remaining undocumented/duplicated warnings

commit 6bf20cd9460fd27c3e1e434b1cf31a3778ec3443
Author: petermaxlee <[email protected]>
Date:   2016-08-11T08:43:08Z

    [SPARK-17015][SQL] group-by/order-by ordinal and arithmetic tests
    
    This patch adds three test files:
    1. arithmetic.sql.out
    2. order-by-ordinal.sql
    3. group-by-ordinal.sql
    
    This includes https://github.com/apache/spark/pull/14594.
    
    This is a test case change.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14595 from petermaxlee/SPARK-17015.
    
    (cherry picked from commit a7b02db457d5fc663ce6a1ef01bf04689870e6b4)
    Signed-off-by: Reynold Xin <[email protected]>
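    
    A hedged illustration of the kind of queries those test files exercise (table name
    hypothetical; assumes `spark` is a SparkSession): ordinals in GROUP BY and ORDER BY
    refer to positions in the select list.
    
    ```scala
    // Group by the first select-list expression, order by the second.
    spark.sql("SELECT _1, count(*) FROM t GROUP BY 1 ORDER BY 2 DESC")
    ```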

commit bc683f037d4e84f2a42eb7b1aaa9e0e4fd5f833a
Author: petermaxlee <[email protected]>
Date:   2016-08-11T20:55:10Z

    [SPARK-17018][SQL] literals.sql for testing literal parsing
    
    ## What changes were proposed in this pull request?
    This patch adds literals.sql for testing literal parsing end-to-end in SQL.
    
    ## How was this patch tested?
    The patch itself is only about adding test cases.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14598 from petermaxlee/SPARK-17018-2.
    
    (cherry picked from commit cf9367826c38e5f34ae69b409f5d09c55ed1d319)
    Signed-off-by: Reynold Xin <[email protected]>

commit 0fb01496c09defa1436dbb7f5e1cbc5461617a31
Author: WangTaoTheTonic <[email protected]>
Date:   2016-08-11T22:09:23Z

    [SPARK-17022][YARN] Handle potential deadlock in driver handling messages
    
    ## What changes were proposed in this pull request?
    
    We send RequestExecutors directly to the AM instead of transferring it to
    yarnSchedulerBackend first, to avoid a potential deadlock.
    
    ## How was this patch tested?
    
    manual tests
    
    Author: WangTaoTheTonic <[email protected]>
    
    Closes #14605 from WangTaoTheTonic/lock.
    
    (cherry picked from commit ea0bf91b4a2ca3ef472906e50e31fd6268b6f53e)
    Signed-off-by: Marcelo Vanzin <[email protected]>

commit d2c1d641ef05692f629ef7cefa0b2b3131ba3475
Author: Junyang Qian <[email protected]>
Date:   2016-08-12T00:27:33Z

    Keep to the convention of having docs for both the generic and the function.

commit b4047fc21cefcf6a43c1ee88af330a042f02bebc
Author: Dongjoon Hyun <[email protected]>
Date:   2016-08-12T06:40:12Z

    [SPARK-16975][SQL] Column-partition path starting '_' should be handled 
correctly
    
    Currently, Spark ignores path names starting with underscore `_` and `.`. This
    causes read failures for column-partitioned file data sources whose partition
    column names start with '_', e.g. `_col`.
    
    **Before**
    ```scala
    scala> spark.range(10).withColumn("_locality_code", 
$"id").write.partitionBy("_locality_code").save("/tmp/parquet")
    scala> spark.read.parquet("/tmp/parquet")
    org.apache.spark.sql.AnalysisException: Unable to infer schema for 
ParquetFormat at /tmp/parquet20. It must be specified manually;
    ```
    
    **After**
    ```scala
    scala> spark.range(10).withColumn("_locality_code", 
$"id").write.partitionBy("_locality_code").save("/tmp/parquet")
    scala> spark.read.parquet("/tmp/parquet")
    res2: org.apache.spark.sql.DataFrame = [id: bigint, _locality_code: int]
    ```
    
    Pass the Jenkins with a new test case.
    
    Author: Dongjoon Hyun <[email protected]>
    
    Closes #14585 from dongjoon-hyun/SPARK-16975-PARQUET.
    
    (cherry picked from commit abff92bfdc7d4c9d2308794f0350561fe0ceb4dd)
    Signed-off-by: Cheng Lian <[email protected]>

commit bde94cd71086fd348f3ba96de628d6df3f87dba5
Author: petermaxlee <[email protected]>
Date:   2016-08-12T06:56:55Z

    [SPARK-17013][SQL] Parse negative numeric literals
    
    ## What changes were proposed in this pull request?
    This patch updates the SQL parser to parse negative numeric literals as numeric
    literals, instead of as a unary minus applied to positive literals.
    
    This allows the parser to parse the minimal value for each data type, e.g.
    "-32768S".
    
    ## How was this patch tested?
    Updated test cases.
    
    Author: petermaxlee <[email protected]>
    
    Closes #14608 from petermaxlee/SPARK-17013.
    
    (cherry picked from commit 00e103a6edd1a1f001a94d41dd1f7acc40a1e30f)
    Signed-off-by: Reynold Xin <[email protected]>
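    
    A hedged illustration using the value quoted above (assumes `spark` is a
    SparkSession): the minimal short value now parses as a single smallint literal.
    
    ```scala
    spark.sql("SELECT -32768S AS s").printSchema()  // s: smallint (short)
    ```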

commit 38378f59f2c91a6f07366aa2013522c334066c69
Author: Jagadeesan <[email protected]>
Date:   2016-08-13T10:25:03Z

    [SPARK-12370][DOCUMENTATION] Documentation should link to examples …
    
    ## What changes were proposed in this pull request?
    
    When documentation is built, it should reference examples from the same build.
    There are times when the docs have links that point to files in the GitHub head
    which may not be valid on the current release. Changed the URLs to make them point
    to the right tag in git using ```SPARK_VERSION_SHORT```
    
    …from its own release version] [Streaming programming guide]
    
    Author: Jagadeesan <[email protected]>
    
    Closes #14596 from jagadeesanas2/SPARK-12370.
    
    (cherry picked from commit e46cb78b3b9fd04a50b5ae50f360db612d656a48)
    Signed-off-by: Sean Owen <[email protected]>

commit a21ecc9964bbd6e41a5464dcc85db1529de14d67
Author: Luciano Resende <[email protected]>
Date:   2016-08-13T10:42:38Z

    [SPARK-17023][BUILD] Upgrade to Kafka 0.10.0.1 release
    
    ## What changes were proposed in this pull request?
    Update Kafka streaming connector to use Kafka 0.10.0.1 release
    
    ## How was this patch tested?
    Tested via Spark unit and integration tests
    
    Author: Luciano Resende <[email protected]>
    
    Closes #14606 from lresende/kafka-upgrade.
    
    (cherry picked from commit 67f025d90e6ba8c039ff45e26d34f20d24b92e6a)
    Signed-off-by: Sean Owen <[email protected]>

commit 750f8804540df5ad68a732f68598c4a2dbbc4761
Author: Sean Owen <[email protected]>
Date:   2016-08-13T22:40:43Z

    [SPARK-16966][SQL][CORE] App Name is a randomUUID even when 
"spark.app.name" exists
    
    ## What changes were proposed in this pull request?
    
    Don't override app name specified in `SparkConf` with a random app name. 
Only set it if the conf has no app name even after options have been applied.
    
    See also https://github.com/apache/spark/pull/14602
    This is similar to Sherry302 's original proposal in 
https://github.com/apache/spark/pull/14556
    
    ## How was this patch tested?
    
    Jenkins test, with new case reproducing the bug
    
    Author: Sean Owen <[email protected]>
    
    Closes #14630 from srowen/SPARK-16966.2.
    
    (cherry picked from commit cdaa562c9a09e2e83e6df4e84d911ce1428a7a7c)
    Signed-off-by: Reynold Xin <[email protected]>
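    
    A hedged illustration of the intended behavior (app name hypothetical): a name
    already present in the conf should survive session creation instead of being
    replaced by a random UUID.
    
    ```scala
    import org.apache.spark.sql.SparkSession
    
    val spark = SparkSession.builder()
      .master("local[*]")
      .config("spark.app.name", "my-app")
      .getOrCreate()
    
    println(spark.sparkContext.appName)  // expected: "my-app", not a random UUID
    ```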

commit e02d0d0852c5d56558ddfd13c675b3f2d70a7eea
Author: zero323 <[email protected]>
Date:   2016-08-14T10:59:24Z

    [SPARK-17027][ML] Avoid integer overflow in PolynomialExpansion.getPolySize
    
    ## What changes were proposed in this pull request?
    
    Replaces custom choose function with 
o.a.commons.math3.CombinatoricsUtils.binomialCoefficient
    
    ## How was this patch tested?
    
    Spark unit tests
    
    Author: zero323 <[email protected]>
    
    Closes #14614 from zero323/SPARK-17027.
    
    (cherry picked from commit 0ebf7c1bff736cf54ec47957d71394d5b75b47a7)
    Signed-off-by: Sean Owen <[email protected]>
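    
    A hedged sketch of the replacement named above (inputs made up): commons-math3
    computes the binomial coefficient with overflow checking instead of silently
    wrapping.
    
    ```scala
    import org.apache.commons.math3.util.CombinatoricsUtils
    
    // n-choose-k; throws MathArithmeticException if the result overflows a Long.
    val polySize: Long = CombinatoricsUtils.binomialCoefficient(13, 3)
    ```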

----

