GitHub user thinkborm opened a pull request:
https://github.com/apache/spark/pull/12407
Branch 1.6
## What changes were proposed in this pull request?
(Please fill in changes proposed in this fix)
## How was this patch tested?
(Please explain how this patch was tested. E.g. unit tests, integration
tests, manual tests)
(If this patch involves UI changes, please attach a screenshot; otherwise,
remove this)
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/apache/spark branch-1.6
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/12407.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #12407
----
commit 04e868b63bfda5afe5cb1a0d6387fb873ad393ba
Author: Yanbo Liang <[email protected]>
Date: 2015-12-16T20:59:22Z
[SPARK-12364][ML][SPARKR] Add ML example for SparkR
We have DataFrame example for SparkR, we also need to add ML example under
```examples/src/main/r```.
cc mengxr jkbradley shivaram
Author: Yanbo Liang <[email protected]>
Closes #10324 from yanboliang/spark-12364.
(cherry picked from commit 1a8b2a17db7ab7a213d553079b83274aeebba86f)
Signed-off-by: Joseph K. Bradley <[email protected]>
commit 552b38f87fc0f6fab61b1e5405be58908b7f5544
Author: Davies Liu <[email protected]>
Date: 2015-12-16T23:48:11Z
[SPARK-12380] [PYSPARK] use SQLContext.getOrCreate in mllib
MLlib should use SQLContext.getOrCreate() instead of creating a new
SQLContext.
Author: Davies Liu <[email protected]>
Closes #10338 from davies/create_context.
(cherry picked from commit 27b98e99d21a0cc34955337f82a71a18f9220ab2)
Signed-off-by: Davies Liu <[email protected]>
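The getOrCreate pattern this patch adopts can be sketched outside Spark; a minimal Python analogy (the `Context` class is a hypothetical stand-in, not Spark's API):

```python
class Context:
    """Stand-in for a heavyweight context object such as SQLContext."""
    _instance = None  # cached singleton

    @classmethod
    def get_or_create(cls):
        # Reuse the existing context instead of constructing a new one,
        # which is the change SPARK-12380 makes in MLlib.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

a = Context.get_or_create()
b = Context.get_or_create()
assert a is b  # every caller shares the same context
```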
commit 638b89bc3b1c421fe11cbaf52649225662d3d3ce
Author: Andrew Or <[email protected]>
Date: 2015-12-17T00:13:48Z
[MINOR] Add missing interpolation in NettyRPCEnv
```
Exception in thread "main" org.apache.spark.rpc.RpcTimeoutException:
Cannot receive any reply in ${timeout.duration}. This timeout is controlled
by spark.rpc.askTimeout
at
org.apache.spark.rpc.RpcTimeout.org$apache$spark$rpc$RpcTimeout$$createRpcTimeoutException(RpcTimeout.scala:48)
at
org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:63)
at
org.apache.spark.rpc.RpcTimeout$$anonfun$addMessageIfTimeout$1.applyOrElse(RpcTimeout.scala:59)
at
scala.runtime.AbstractPartialFunction.apply(AbstractPartialFunction.scala:33)
```
Author: Andrew Or <[email protected]>
Closes #10334 from andrewor14/rpc-typo.
(cherry picked from commit 861549acdbc11920cde51fc57752a8bc241064e5)
Signed-off-by: Shixiong Zhu <[email protected]>
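The bug behind the message above is a Scala string literal missing its `s` interpolator, so `${timeout.duration}` was printed verbatim. A Python analogy, where the `f` prefix plays the role of Scala's `s`:

```python
duration = "120 seconds"

# Forgetting the prefix leaves the placeholder literal -- the analogue of
# the missing `s` interpolator in NettyRPCEnv's error message.
broken = "Cannot receive any reply in {duration}."
fixed = f"Cannot receive any reply in {duration}."

assert broken == "Cannot receive any reply in {duration}."
assert fixed == "Cannot receive any reply in 120 seconds."
```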
commit fb02e4e3bcc50a8f823dfecdb2eef71287225e7b
Author: Imran Rashid <[email protected]>
Date: 2015-12-17T03:01:05Z
[SPARK-10248][CORE] track exceptions in dagscheduler event loop in tests
`DAGSchedulerEventLoop` normally only logs errors (so it can continue to
process more events, from other jobs). However, this is not desirable in the
tests -- the tests should be able to easily detect any exception, and also
shouldn't silently succeed if there is an exception.
This was suggested by mateiz on https://github.com/apache/spark/pull/7699.
It may have already turned up an issue in "zero split job".
Author: Imran Rashid <[email protected]>
Closes #8466 from squito/SPARK-10248.
(cherry picked from commit 38d9795a4fa07086d65ff705ce86648345618736)
Signed-off-by: Andrew Or <[email protected]>
commit 4af64385b085002d94c54d11bbd144f9f026bbd8
Author: tedyu <[email protected]>
Date: 2015-12-17T03:02:12Z
[SPARK-12365][CORE] Use ShutdownHookManager where
Runtime.getRuntime.addShutdownHook() is called
SPARK-9886 fixed ExternalBlockStore.scala
This PR fixes the remaining references to
Runtime.getRuntime.addShutdownHook()
Author: tedyu <[email protected]>
Closes #10325 from ted-yu/master.
(cherry picked from commit f590178d7a06221a93286757c68b23919bee9f03)
Signed-off-by: Andrew Or <[email protected]>
Conflicts:
sql/hive-thriftserver/src/main/scala/org/apache/spark/sql/hive/thriftserver/SparkSQLCLIDriver.scala
commit 154567dca126d4992c9c9b08d71d22e9af43c995
Author: Rohit Agarwal <[email protected]>
Date: 2015-12-17T03:04:33Z
[SPARK-12186][WEB UI] Send the complete request URI including the query
string when redirecting.
Author: Rohit Agarwal <[email protected]>
Closes #10180 from mindprince/SPARK-12186.
(cherry picked from commit fdb38227564c1af40cbfb97df420b23eb04c002b)
Signed-off-by: Andrew Or <[email protected]>
commit 4ad08035d28b8f103132da9779340c5e64e2d1c2
Author: Marcelo Vanzin <[email protected]>
Date: 2015-12-17T03:47:49Z
[SPARK-12386][CORE] Fix NPE when spark.executor.port is set.
Author: Marcelo Vanzin <[email protected]>
Closes #10339 from vanzin/SPARK-12386.
(cherry picked from commit d1508dd9b765489913bc948575a69ebab82f217b)
Signed-off-by: Andrew Or <[email protected]>
commit d509194b81abc3c7bf9563d26560d596e1415627
Author: Yin Huai <[email protected]>
Date: 2015-12-17T07:18:53Z
[SPARK-12057][SQL] Prevent failure on corrupt JSON records
This PR makes the JSON parser and schema inference handle more cases where
we have unparseable records. It is based on #10043. The last commit fixes the
failed test and updates the logic of schema inference.
Regarding the schema inference change, if we have something like
```
{"f1":1}
[1,2,3]
```
originally, we would get a DataFrame without any columns.
After this change, we get a DataFrame with columns `f1` and
`_corrupt_record`. For the second row, `[1,2,3]` becomes the value of
`_corrupt_record`.
When merging this PR, please make sure that the author is simplyianm.
JIRA: https://issues.apache.org/jira/browse/SPARK-12057
Closes #10043
Author: Ian Macalinao <[email protected]>
Author: Yin Huai <[email protected]>
Closes #10288 from yhuai/handleCorruptJson.
(cherry picked from commit 9d66c4216ad830812848c657bbcd8cd50949e199)
Signed-off-by: Reynold Xin <[email protected]>
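The new behaviour can be sketched with a plain Python loop; this is an illustration of the rule the commit describes, not Spark's actual parser:

```python
import json

def parse_records(lines, corrupt_col="_corrupt_record"):
    """Parse JSON lines; route unparseable rows into a corrupt-record
    column instead of discarding every column (sketch of the SPARK-12057
    behaviour)."""
    rows = []
    for line in lines:
        try:
            obj = json.loads(line)
            if not isinstance(obj, dict):  # e.g. a bare array like [1,2,3]
                raise ValueError("not a JSON object")
            rows.append(obj)
        except ValueError:  # json.JSONDecodeError subclasses ValueError
            rows.append({corrupt_col: line})
    return rows

rows = parse_records(['{"f1":1}', '[1,2,3]'])
assert rows == [{"f1": 1}, {"_corrupt_record": "[1,2,3]"}]
```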
commit da7542f2408140a9a3b7ea245350976ac18676a5
Author: echo2mei <[email protected]>
Date: 2015-12-17T15:59:17Z
Once driver register successfully, stop it to connect to master.
This commit is to resolve SPARK-12396.
Author: echo2mei <[email protected]>
Closes #10354 from echoTomei/master.
(cherry picked from commit 5a514b61bbfb609c505d8d65f2483068a56f1f70)
Signed-off-by: Davies Liu <[email protected]>
commit a8466489ab01e59fe07ba20adfc3983ec6928157
Author: Davies Liu <[email protected]>
Date: 2015-12-17T16:01:59Z
Revert "Once driver register successfully, stop it to connect to master."
This reverts commit da7542f2408140a9a3b7ea245350976ac18676a5.
commit 1ebedb20f2c5b781eafa9bf2b5ab092d744cc4fd
Author: Davies Liu <[email protected]>
Date: 2015-12-17T16:04:11Z
[SPARK-12395] [SQL] fix resulting columns of outer join
For the API DataFrame.join(right, usingColumns, joinType), if the joinType
is right_outer or full_outer, the resulting join columns could be wrong (they
will be null).
The order of columns has been changed to match that of MySQL and
PostgreSQL [1].
This PR also fixes the nullability of the output for outer joins.
[1] http://www.postgresql.org/docs/9.2/static/queries-table-expressions.html
Author: Davies Liu <[email protected]>
Closes #10353 from davies/fix_join.
(cherry picked from commit a170d34a1b309fecc76d1370063e0c4f44dc2142)
Signed-off-by: Davies Liu <[email protected]>
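The key point is that in a full outer join the `usingColumns` column must be coalesced from both sides so it is non-null whenever either side matched. A toy Python sketch of that rule (dict-based, not Spark's implementation):

```python
def full_outer_join(left, right, key):
    """Join two lists of dicts on `key`, coalescing the join column so it
    is non-null whenever either side matched -- the behaviour SPARK-12395
    restores for right_outer/full_outer joins."""
    keys = {r[key] for r in left} | {r[key] for r in right}
    lmap = {r[key]: r for r in left}
    rmap = {r[key]: r for r in right}
    out = []
    for k in sorted(keys):
        row = {key: k}  # coalesce(left.key, right.key): never null here
        row.update({f"l_{c}": v for c, v in lmap.get(k, {}).items() if c != key})
        row.update({f"r_{c}": v for c, v in rmap.get(k, {}).items() if c != key})
        out.append(row)
    return out

res = full_outer_join([{"id": 1, "a": "x"}], [{"id": 2, "b": "y"}], "id")
assert res == [{"id": 1, "l_a": "x"}, {"id": 2, "r_b": "y"}]
```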
commit 41ad8aced2fc6c694c15e9465cfa34517b2395e8
Author: Yanbo Liang <[email protected]>
Date: 2015-12-17T17:19:46Z
[SQL] Update SQLContext.read.text doc
Since we renamed the column from ```text``` to ```value``` for DataFrames
loaded by ```SQLContext.read.text```, we need to update the doc.
Author: Yanbo Liang <[email protected]>
Closes #10349 from yanboliang/text-value.
(cherry picked from commit 6e0771665b3c9330fc0a5b2c7740a796b4cd712e)
Signed-off-by: Reynold Xin <[email protected]>
commit 1fbca41200d6e73cb276d5949b894881c700323f
Author: Shixiong Zhu <[email protected]>
Date: 2015-12-17T17:55:37Z
[SPARK-12220][CORE] Make Utils.fetchFile support files that contain special
characters
This PR encodes and decodes the file name to fix the issue.
Author: Shixiong Zhu <[email protected]>
Closes #10208 from zsxwing/uri.
(cherry picked from commit 86e405f357711ae93935853a912bc13985c259db)
Signed-off-by: Shixiong Zhu <[email protected]>
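The encode-then-decode approach can be illustrated with Python's standard URL quoting; this mirrors the idea of the fix rather than Spark's actual `Utils.fetchFile` code:

```python
from urllib.parse import quote, unquote

# File names with spaces, '+' or '%' must be percent-encoded before being
# placed in a URI, and decoded back when the file is fetched.
name = "my jar+file%1.jar"
encoded = quote(name, safe="")
assert "%20" in encoded            # the space was encoded
assert unquote(encoded) == name    # and the name survives the round trip
```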
commit 881f2544e13679c185a7c34ddb82e885aaa79813
Author: Iulian Dragos <[email protected]>
Date: 2015-12-17T18:19:31Z
[SPARK-12345][MESOS] Properly filter out SPARK_HOME in the Mesos REST server
Fix problem with #10332, this one should fix Cluster mode on Mesos
Author: Iulian Dragos <[email protected]>
Closes #10359 from dragos/issue/fix-spark-12345-one-more-time.
(cherry picked from commit 8184568810e8a2e7d5371db2c6a0366ef4841f70)
Signed-off-by: Kousuke Saruta <[email protected]>
commit 88bbb5429dd3efcff6b2835a70143247b08ae6b2
Author: Andrew Or <[email protected]>
Date: 2015-12-17T04:01:47Z
[SPARK-12390] Clean up unused serializer parameter in BlockManager
No change in functionality is intended. This only changes internal API.
Author: Andrew Or <[email protected]>
Closes #10343 from andrewor14/clean-bm-serializer.
Conflicts:
core/src/main/scala/org/apache/spark/storage/BlockManager.scala
commit c0ab14fbeab2a81d174c3643a4fcc915ff2902e8
Author: Shixiong Zhu <[email protected]>
Date: 2015-12-17T21:23:48Z
[SPARK-12410][STREAMING] Fix places that use '.' and '|' directly in split
String.split accepts a regular expression, so we should escape "." and "|".
Author: Shixiong Zhu <[email protected]>
Closes #10361 from zsxwing/reg-bug.
(cherry picked from commit 540b5aeadc84d1a5d61bda4414abd6bf35dc7ff9)
Signed-off-by: Shixiong Zhu <[email protected]>
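The pitfall is the same in Python's `re.split` as in Java's `String.split`: the separator is a regular expression, so an unescaped "." matches every character. (Python keeps the empty fields that Java trims from the end.)

```python
import re

line = "a.b.c"
# Unescaped "." is a regex wildcard: every character becomes a separator.
assert re.split(r".", line) == ["", "", "", "", "", ""]
# Escaping it splits on the literal dot.
assert re.split(r"\.", line) == ["a", "b", "c"]
# re.escape (cf. Pattern.quote on the JVM) escapes metacharacters safely.
assert re.split(re.escape("."), line) == ["a", "b", "c"]
```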
commit 48dcee48416d87bf9572ace0a82285bacfcbf46e
Author: Reynold Xin <[email protected]>
Date: 2015-12-17T22:16:49Z
[SPARK-12397][SQL] Improve error messages for data sources when they are
not found
Point users to spark-packages.org to find them.
Author: Reynold Xin <[email protected]>
Closes #10351 from rxin/SPARK-12397.
(cherry picked from commit e096a652b92fc64a7b3457cd0766ab324bcc980b)
Signed-off-by: Michael Armbrust <[email protected]>
commit 4df1dd403441a4e4ca056d294385d8d0d8a0c65d
Author: Evan Chen <[email protected]>
Date: 2015-12-17T22:22:30Z
[SPARK-12376][TESTS] Spark Streaming Java8APISuite fails in
assertOrderInvariantEquals method
org.apache.spark.streaming.Java8APISuite.java is failing because it tries
to sort an immutable list in the assertOrderInvariantEquals method.
Author: Evan Chen <[email protected]>
Closes #10336 from evanyc15/SPARK-12376-StreamingJavaAPISuite.
commit 9177ea383a29653f0591a59e1ee2dff6b87d5a1c
Author: jhu-chang <[email protected]>
Date: 2015-12-18T01:53:15Z
[SPARK-11749][STREAMING] Duplicate creating the RDD in file stream when
recovering from checkpoint data
Add a transient flag `DStream.restoredFromCheckpointData` to control
restore processing in DStream and avoid duplicate work:
`DStream.restoreCheckpointData` checks this flag first and runs the restore
process only when it is `false`.
Author: jhu-chang <[email protected]>
Closes #9765 from jhu-chang/SPARK-11749.
(cherry picked from commit f4346f612b6798517153a786f9172cf41618d34d)
Signed-off-by: Shixiong Zhu <[email protected]>
commit df0231952e5542e9870f8dde9ecbd7ad9a50f847
Author: Michael Gummelt <[email protected]>
Date: 2015-12-18T11:18:00Z
[SPARK-12413] Fix Mesos ZK persistence
I believe this fixes SPARK-12413. I'm currently running an integration
test to verify.
Author: Michael Gummelt <[email protected]>
Closes #10366 from mgummelt/fix-zk-mesos.
(cherry picked from commit 2bebaa39d9da33bc93ef682959cd42c1968a6a3e)
Signed-off-by: Kousuke Saruta <[email protected]>
commit 1dc71ec777ff7cac5d3d7adb13f2d63ffe8909b6
Author: Yin Huai <[email protected]>
Date: 2015-12-18T18:52:14Z
[SPARK-12218][SQL] Invalid splitting of nested AND expressions in Data
Source filter API
JIRA: https://issues.apache.org/jira/browse/SPARK-12218
When creating filters for Parquet/ORC, we should not push nested AND
expressions partially.
Author: Yin Huai <[email protected]>
Closes #10362 from yhuai/SPARK-12218.
(cherry picked from commit 41ee7c57abd9f52065fd7ffb71a8af229603371d)
Signed-off-by: Yin Huai <[email protected]>
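The rule the fix enforces can be sketched in a few lines: when translating predicates into data source filters, an AND may be pushed down only if both children translate; pushing one side alone would silently drop the other condition. A toy Python sketch (tuple trees, not Spark's DataSourceStrategy):

```python
def convert(pred):
    """Translate a toy predicate tree into a 'source filter', returning
    None when unsupported -- a sketch of the SPARK-12218 rule."""
    op = pred[0]
    if op == "and":
        left, right = convert(pred[1]), convert(pred[2])
        # Push the AND only if BOTH sides convert.
        if left is not None and right is not None:
            return ("and", left, right)
        return None
    if op == "eq":
        return pred  # supported leaf
    return None      # unsupported leaf (e.g. a UDF)

assert convert(("and", ("eq", "a", 1), ("eq", "b", 2))) is not None
assert convert(("and", ("eq", "a", 1), ("udf", "f"))) is None  # not split
```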
commit 3b903e44b912cd36ec26e9e95444656eee7b0c46
Author: Andrew Or <[email protected]>
Date: 2015-12-18T20:56:03Z
Revert "[SPARK-12365][CORE] Use ShutdownHookManager where
Runtime.getRuntime.addShutdownHook() is called"
This reverts commit 4af64385b085002d94c54d11bbd144f9f026bbd8.
commit bd33d4ee847973289a58032df35375f03e9f9865
Author: Kousuke Saruta <[email protected]>
Date: 2015-12-18T22:05:06Z
[SPARK-12404][SQL] Ensure objects passed to StaticInvoke is Serializable
Now `StaticInvoke` receives `Any` as an object, and while `StaticInvoke`
itself can be serialized, the object passed in is sometimes not serializable.
For example, the following code raises an exception because
`RowEncoder#extractorsFor`, invoked indirectly, creates a `StaticInvoke`.
```
case class TimestampContainer(timestamp: java.sql.Timestamp)
val rdd = sc.parallelize(1 to 2).map(_ =>
TimestampContainer(System.currentTimeMillis))
val df = rdd.toDF
val ds = df.as[TimestampContainer]
val rdd2 = ds.rdd // <----------------- invokes extractorsFor indirectly
```
I'll add test cases.
Author: Kousuke Saruta <[email protected]>
Author: Michael Armbrust <[email protected]>
Closes #10357 from sarutak/SPARK-12404.
(cherry picked from commit 6eba655259d2bcea27d0147b37d5d1e476e85422)
Signed-off-by: Michael Armbrust <[email protected]>
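The failure mode is generic to any serialization framework: a serializable wrapper that captures an arbitrary object breaks as soon as the captured object itself cannot be serialized. A Python analogy using pickle (illustration only, not Spark's code):

```python
import pickle

class Wrapper:
    """Toy analogue of StaticInvoke: a serializable wrapper that captures
    an arbitrary object typed as `Any`."""
    def __init__(self, target):
        self.target = target

# A serializable target round-trips fine...
ok = pickle.loads(pickle.dumps(Wrapper("java.sql.Timestamp")))
assert ok.target == "java.sql.Timestamp"

# ...but capturing something unserializable (here, a lambda) fails at
# serialization time, the failure mode SPARK-12404 guards against.
try:
    pickle.dumps(Wrapper(lambda x: x))
    raised = False
except Exception:
    raised = True
assert raised
```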
commit eca401ee5d3ae683cbee531c1f8bc981f9603fc8
Author: Burak Yavuz <[email protected]>
Date: 2015-12-18T23:24:41Z
[SPARK-11985][STREAMING][KINESIS][DOCS] Update Kinesis docs
- Provide example on `message handler`
- Provide bit on KPL record de-aggregation
- Fix typos
Author: Burak Yavuz <[email protected]>
Closes #9970 from brkyvz/kinesis-docs.
(cherry picked from commit 2377b707f25449f4557bf048bb384c743d9008e5)
Signed-off-by: Shixiong Zhu <[email protected]>
commit d6a519ff20652494ac3aeba477526ad1fd810a3c
Author: Yanbo Liang <[email protected]>
Date: 2015-12-19T08:34:30Z
[SQL] Fix mistake doc of join type for dataframe.join
Fix a mistake in the join type doc for ```dataframe.join```.
Author: Yanbo Liang <[email protected]>
Closes #10378 from yanboliang/leftsemi.
(cherry picked from commit a073a73a561e78c734119c8b764d37a4e5e70da4)
Signed-off-by: Reynold Xin <[email protected]>
commit c754a08793458813d608e48ad1b158da770cd992
Author: pshearer <[email protected]>
Date: 2015-12-21T22:04:59Z
Doc typo: ltrim = trim from left end, not right
Author: pshearer <[email protected]>
Closes #10414 from pshearer/patch-1.
(cherry picked from commit fc6dbcc7038c2b030ef6a2dc8be5848499ccee1c)
Signed-off-by: Andrew Or <[email protected]>
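The distinction the doc fix makes, shown with Python's equivalents of ltrim/rtrim:

```python
s = "  padded  "
assert s.lstrip() == "padded  "   # ltrim: trims the left end only
assert s.rstrip() == "  padded"   # rtrim: trims the right end only
```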
commit ca3998512dd7801379c96c9399d3d053ab7472cd
Author: Andrew Or <[email protected]>
Date: 2015-12-21T22:09:04Z
[SPARK-12466] Fix harmless NPE in tests
```
[info] ReplayListenerSuite:
[info] - Simple replay (58 milliseconds)
java.lang.NullPointerException
at
org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:982)
at
org.apache.spark.deploy.master.Master$$anonfun$asyncRebuildSparkUI$1.applyOrElse(Master.scala:980)
```
https://amplab.cs.berkeley.edu/jenkins/view/Spark-QA-Test/job/Spark-Master-SBT/4316/AMPLAB_JENKINS_BUILD_PROFILE=hadoop2.2,label=spark-test/consoleFull
This was introduced in #10284. It's harmless because the NPE is caused by a
race that occurs mainly in `local-cluster` tests (but doesn't actually fail
the tests).
Tested locally to verify that the NPE is gone.
Author: Andrew Or <[email protected]>
Closes #10417 from andrewor14/fix-harmless-npe.
(cherry picked from commit d655d37ddf59d7fb6db529324ac8044d53b2622a)
Signed-off-by: Andrew Or <[email protected]>
commit 4062cda3087ae42c6c3cb24508fc1d3a931accdf
Author: Patrick Wendell <[email protected]>
Date: 2015-12-22T01:50:29Z
Preparing Spark release v1.6.0-rc4
commit 5b19e7cfded0e2e41b6f427b4c3cfc3f06f85466
Author: Patrick Wendell <[email protected]>
Date: 2015-12-22T01:50:36Z
Preparing development version 1.6.0-SNAPSHOT
commit 309ef355fc511b70765983358d5c92b5f1a26bce
Author: Shixiong Zhu <[email protected]>
Date: 2015-12-22T06:28:18Z
[MINOR] Fix typos in JavaStreamingContext
Author: Shixiong Zhu <[email protected]>
Closes #10424 from zsxwing/typo.
(cherry picked from commit 93da8565fea42d8ac978df411daced4a9ea3a9c8)
Signed-off-by: Reynold Xin <[email protected]>
----
---
If your project is set up for it, you can reply to this email and have your
reply appear on GitHub as well. If your project does not have this feature
enabled and wishes so, or if the feature is enabled but not working, please
contact infrastructure at [email protected] or file a JIRA ticket
with INFRA.
---
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]