GitHub user e-dorigatti opened a pull request:

    https://github.com/apache/spark/pull/21461

    [SPARK-23754][Python] Backport bugfix

    The fix for master was already accepted in [another pull request](https://github.com/apache/spark/pull/21383), but there were conflicts when merging it into branch-2.3.

You can merge this pull request into a Git repository by running:

    $ git pull https://github.com/e-dorigatti/spark fix_spark_23754

Alternatively you can review and apply these changes as the patch at:

    https://github.com/apache/spark/pull/21461.patch

To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:

    This closes #21461
    
----
commit b30a7d28b399950953d4b112c57d4c9b9ab223e9
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-03-26T19:45:45Z

    [SPARK-23572][DOCS] Bring "security.md" up to date.
    
    This change basically rewrites the security documentation so that it's
    up to date with new features, more correct, and more complete.
    
    Because security is such an important feature, I chose to move all the
    relevant configuration documentation to the security page, instead of
    having it scattered across the configuration page. This makes the
    security page an almost one-stop shop for security configuration in
    Spark. The only exceptions are some minor YARN-specific features, which
    I left on the YARN page.
    
    I also re-organized the page's topics, since they didn't make a lot of
    sense: Kerberos features were described inside paragraphs about UI
    access control, among other oddities. It should now be easier to find
    information about specific Spark security features. I also enabled TOCs
    for both the Security and YARN pages, since that makes it easier to see
    what is covered.
    
    I removed most of the comments from the SecurityManager javadoc, since
    they just replicated information from the security doc at varying
    levels of staleness.
    
    Author: Marcelo Vanzin <[email protected]>
    
    Closes #20742 from vanzin/SPARK-23572.

commit 3e778f5a91b0553b09fe0e0ee84d771a71504960
Author: Kevin Yu <qyu@...>
Date:   2018-03-26T22:45:27Z

    [SPARK-23162][PYSPARK][ML] Add r2adj into Python API in LinearRegressionSummary
    
    ## What changes were proposed in this pull request?
    
    Add r2adj to LinearRegressionSummary in the Python API.
    
    ## How was this patch tested?
    
    Added unit tests in tests.py to exercise the API calls for the summary classes.
    
    Author: Kevin Yu <[email protected]>
    
    Closes #20842 from kevinyu98/spark-23162.
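    
    For illustration, a minimal PySpark sketch of the newly exposed
    property (the tiny dataset is made up for this example):
    
    ```python
    from pyspark.sql import SparkSession
    from pyspark.ml.linalg import Vectors
    from pyspark.ml.regression import LinearRegression
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    
    # Tiny illustrative dataset: label roughly linear in one feature.
    df = spark.createDataFrame(
        [(1.0, Vectors.dense(1.0)), (2.1, Vectors.dense(2.0)), (2.9, Vectors.dense(3.0))],
        ["label", "features"])
    
    model = LinearRegression().fit(df)
    # r2adj adjusts r2 for the number of explanatory variables.
    print(model.summary.r2, model.summary.r2adj)
    ```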

commit 35997b59f3116830af06b3d40a7675ef0dbf7091
Author: Liang-Chi Hsieh <viirya@...>
Date:   2018-03-27T12:49:50Z

    [SPARK-23794][SQL] Make UUID as stateful expression
    
    ## What changes were proposed in this pull request?
    
    The UUID() expression is stateful and should implement the `Stateful` trait 
instead of the `Nondeterministic` trait.
    
    ## How was this patch tested?
    
    Added test.
    
    Author: Liang-Chi Hsieh <[email protected]>
    
    Closes #20912 from viirya/SPARK-23794.
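    
    For context, the per-row behavior is visible from SQL; a minimal sketch
    using the built-in `uuid()` function (assuming a local session):
    
    ```python
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    
    # Each row gets a distinct value. The expression's random generator is
    # mutable state, which is why it must implement `Stateful` (a fresh
    # copy per use) rather than plain `Nondeterministic`.
    spark.range(3).selectExpr("uuid() AS id").show(truncate=False)
    ```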

commit c68ec4e6a1ed9ea13345c7705ea60ff4df7aec7b
Author: jerryshao <sshao@...>
Date:   2018-03-27T21:39:05Z

    [SPARK-23096][SS] Migrate rate source to V2
    
    ## What changes were proposed in this pull request?
    
    This PR migrates the micro-batch rate source to the V2 API and rewrites the UTs to suit the V2 tests.
    
    ## How was this patch tested?
    
    UTs.
    
    Author: jerryshao <[email protected]>
    
    Closes #20688 from jerryshao/SPARK-23096.
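    
    The user-facing API is unchanged by the migration; a minimal usage
    sketch (option values are illustrative):
    
    ```python
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.master("local[2]").getOrCreate()
    
    # The rate source emits (timestamp, value) rows at a configurable
    # rate; the V2 migration changes the internals, not this usage.
    stream = (spark.readStream
              .format("rate")
              .option("rowsPerSecond", 5)
              .load())
    query = stream.writeStream.format("console").start()
    query.awaitTermination(10)  # run briefly for demonstration
    query.stop()
    ```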

commit ed72badb04a56d8046bbd185245abf5ae265ccfd
Author: Bryan Cutler <cutlerb@...>
Date:   2018-03-28T03:06:12Z

    [SPARK-23699][PYTHON][SQL] Raise same type of error caught with Arrow enabled
    
    ## What changes were proposed in this pull request?
    
    When using Arrow for createDataFrame or toPandas and an error is
    encountered with fallback disabled, the same type of error is now
    raised instead of a RuntimeError. This change also retains the
    traceback of the error and prevents accidental exception chaining
    under Python 3.
    
    ## How was this patch tested?
    
    Updated existing tests to verify error type.
    
    Author: Bryan Cutler <[email protected]>
    
    Closes #20839 from BryanCutler/arrow-raise-same-error-SPARK-23699.
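    
    A hedged sketch of the behavior change (the MapType column is just one
    example of data Arrow cannot convert; the fallback config name is taken
    from contemporary releases and is an assumption here):
    
    ```python
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    spark.conf.set("spark.sql.execution.arrow.enabled", "true")
    # Assumed config: surface errors instead of silently falling back.
    spark.conf.set("spark.sql.execution.arrow.fallback.enabled", "false")
    
    df = spark.createDataFrame([({"a": 1},)], ["m"])  # MapType column
    try:
        df.toPandas()
    except Exception as e:
        # Previously this surfaced as a RuntimeError; with this change the
        # original error type and traceback are preserved.
        print(type(e).__name__)
    ```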

commit 34c4b9c57e114cdb390e4dbc7383284d82fea317
Author: hyukjinkwon <gurwls223@...>
Date:   2018-03-28T11:49:27Z

    [SPARK-23765][SQL] Supports custom line separator for json datasource
    
    ## What changes were proposed in this pull request?
    
    This PR proposes to add a lineSep option for a configurable line
    separator in the JSON datasource. It supports this option by using
    `LineRecordReader`'s functionality, passing the separator to its
    constructor.
    
    The approach is similar to https://github.com/apache/spark/pull/20727;
    one main difference is that it uses the text datasource's `lineSep`
    option to parse line by line during JSON schema inference.
    
    ## How was this patch tested?
    
    Manually tested and unit tests were added.
    
    Author: hyukjinkwon <[email protected]>
    Author: hyukjinkwon <[email protected]>
    
    Closes #20877 from HyukjinKwon/linesep-json.
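    
    A small usage sketch of the new option (separator, path, and data are
    illustrative):
    
    ```python
    import os
    import tempfile
    
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    
    path = os.path.join(tempfile.mkdtemp(), "data.json")
    with open(path, "w") as f:
        # Two JSON records separated by ';' instead of newlines.
        f.write('{"a": 1};{"a": 2}')
    
    spark.read.option("lineSep", ";").json(path).show()  # rows a=1, a=2
    ```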

commit 761565a3ccbf7f083e587fee14a27b61867a3886
Author: gatorsmile <gatorsmile@...>
Date:   2018-03-28T16:11:52Z

    Revert "[SPARK-23096][SS] Migrate rate source to V2"
    
    This reverts commit c68ec4e6a1ed9ea13345c7705ea60ff4df7aec7b.

commit ea2fdc0d286e449884de44f22a908a26ab1248a5
Author: guoxiaolong <guo.xiaolong1@...>
Date:   2018-03-29T00:49:32Z

    [SPARK-23675][WEB-UI] Add the Spark logo to the title, using the Spark logo image
    
    ## What changes were proposed in this pull request?
    
    Add the Spark logo to the page title, using the Spark logo image.
    Other big data system UIs do this, so I think Spark should add it as
    well.
    
    spark fix before:
    
![spark_fix_before](https://user-images.githubusercontent.com/26266482/37387866-2d5add0e-2799-11e8-9165-250f2b59df3f.png)
    
    spark fix after:
    
![spark_fix_after](https://user-images.githubusercontent.com/26266482/37387874-329e1876-2799-11e8-8bc5-c619fc1e680e.png)
    
    reference kafka ui:
    
![kafka](https://user-images.githubusercontent.com/26266482/37387878-35ca89d0-2799-11e8-834e-1598ae7158e1.png)
    
    reference storm ui:
    
![storm](https://user-images.githubusercontent.com/26266482/37387880-3854f12c-2799-11e8-8968-b428ba361995.png)
    
    reference yarn ui:
    
![yarn](https://user-images.githubusercontent.com/26266482/37387881-3a72e130-2799-11e8-97bb-dea85f573e95.png)
    
    reference nifi ui:
    
![nifi](https://user-images.githubusercontent.com/26266482/37387887-3cecfea0-2799-11e8-9a71-6c454d25840b.png)
    
    reference flink ui:
    
![flink](https://user-images.githubusercontent.com/26266482/37387888-3f16b1ee-2799-11e8-9d37-8355f0100548.png)
    
    ## How was this patch tested?
    
    manual tests
    
    Please review http://spark.apache.org/contributing.html before opening a 
pull request.
    
    Author: guoxiaolong <[email protected]>
    
    Closes #20818 from guoxiaolongzte/SPARK-23675.

commit 641aec68e8167546dbb922874c086c9b90198f08
Author: Thomas Graves <tgraves@...>
Date:   2018-03-29T08:37:46Z

    [SPARK-23806] Broadcast.unpersist can cause fatal exception when used with dynamic allocation
    
    ## What changes were proposed in this pull request?
    
    Ignore errors while waiting for a broadcast.unpersist. This handles it
    the same way as rdd.unpersist in
    https://issues.apache.org/jira/browse/SPARK-22618
    
    ## How was this patch tested?
    
    The patch was tested manually against a couple of jobs that exhibit
    this behavior; with the change, the application no longer dies and
    just prints the warning.
    
    Please review http://spark.apache.org/contributing.html before opening a 
pull request.
    
    Author: Thomas Graves <[email protected]>
    
    Closes #20924 from tgravescs/SPARK-23806.
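    
    The affected code path is ordinary broadcast cleanup; a minimal sketch
    of the call that could previously kill an application under dynamic
    allocation:
    
    ```python
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.master("local[2]").getOrCreate()
    sc = spark.sparkContext
    
    b = sc.broadcast(list(range(1000)))
    print(sc.parallelize(range(10)).map(lambda x: x + b.value[0]).sum())
    
    # Under dynamic allocation, executors holding broadcast blocks may
    # already be gone; a blocking unpersist used to surface that as a
    # fatal exception, and after this change it only logs a warning.
    b.unpersist(blocking=True)
    ```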

commit 505480cb578af9f23acc77bc82348afc9d8468e8
Author: hyukjinkwon <gurwls223@...>
Date:   2018-03-29T10:38:28Z

    [SPARK-23770][R] Exposes repartitionByRange in SparkR
    
    ## What changes were proposed in this pull request?
    
    This PR proposes to expose `repartitionByRange`.
    
    ```R
    > df <- createDataFrame(iris)
    ...
    > getNumPartitions(repartitionByRange(df, 3, col = df$Species))
    [1] 3
    ```
    
    ## How was this patch tested?
    
    Manually tested, and unit tests were added. The difference from
    `repartition` can be checked as below:
    
    ```R
    > df <- createDataFrame(mtcars)
    > take(repartition(df, 10, df$wt), 3)
       mpg cyl  disp  hp drat    wt  qsec vs am gear carb
    1 14.3   8 360.0 245 3.21 3.570 15.84  0  0    3    4
    2 10.4   8 460.0 215 3.00 5.424 17.82  0  0    3    4
    3 32.4   4  78.7  66 4.08 2.200 19.47  1  1    4    1
    > take(repartitionByRange(df, 10, df$wt), 3)
       mpg cyl disp hp drat    wt  qsec vs am gear carb
    1 30.4   4 75.7 52 4.93 1.615 18.52  1  1    4    2
    2 33.9   4 71.1 65 4.22 1.835 19.90  1  1    4    1
    3 27.3   4 79.0 66 4.08 1.935 18.90  1  1    4    1
    ```
    
    Author: hyukjinkwon <[email protected]>
    
    Closes #20902 from HyukjinKwon/r-repartitionByRange.

commit 491ec114fd3886ebd9fa29a482e3d112fb5a088c
Author: Sahil Takiar <stakiar@...>
Date:   2018-03-29T17:23:23Z

    [SPARK-23785][LAUNCHER] LauncherBackend doesn't check state of connection before setting state
    
    ## What changes were proposed in this pull request?
    
    Changed the `LauncherBackend` `set` method so that it checks whether
    the connection is open before writing to it (using `isConnected`).
    
    ## How was this patch tested?
    
    None
    
    Author: Sahil Takiar <[email protected]>
    
    Closes #20893 from sahilTakiar/master.

commit a7755fd8ce2f022118b9827aaac7d5d59f0f297a
Author: Kent Yao <yaooqinn@...>
Date:   2018-03-29T17:46:28Z

    [SPARK-23639][SQL] Obtain token before init metastore client in SparkSQL CLI
    
    ## What changes were proposed in this pull request?
    
    In the SparkSQL CLI, the SessionState is created before the
    SparkContext is instantiated. When --proxy-user is used to impersonate,
    a metastore client cannot be initialized to talk to the secured
    metastore because there is no Kerberos ticket.
    
    This PR uses the real user's UGI to obtain a token for the owner before
    talking to the kerberized metastore.
    
    ## How was this patch tested?
    
    Manually verified with a kerberized Hive metastore / HDFS.
    
    Author: Kent Yao <[email protected]>
    
    Closes #20784 from yaooqinn/SPARK-23639.

commit b348901192b231153b58fe5720253168c87963d4
Author: Jose Torres <torres.joseph.f+github@...>
Date:   2018-03-30T04:36:56Z

    [SPARK-23808][SQL] Set default Spark session in test-only spark sessions.
    
    ## What changes were proposed in this pull request?
    
    Set default Spark session in the TestSparkSession and TestHiveSparkSession 
constructors.
    
    ## How was this patch tested?
    
    new unit tests
    
    Author: Jose Torres <[email protected]>
    
    Closes #20926 from jose-torres/test3.

commit df05fb63abe6018ccbe572c34cf65fc3ecbf1166
Author: Jongyoul Lee <jongyoul@...>
Date:   2018-03-30T06:07:35Z

    [SPARK-23743][SQL] Changed the comparison logic from containing 'slf4j' to starting with 'org.slf4j'
    
    ## What changes were proposed in this pull request?
    isSharedClass returns whether given classes can/should be shared. It
    checks whether class names contain certain keywords or start with
    certain prefixes. Under this logic, unintended behavior can occur when
    a custom package has `slf4j` inside its package or class name. The
    original intention was presumably to match classes under `org.slf4j`,
    so it is better to change the comparison to
    `name.startsWith("org.slf4j")`.
    
    ## How was this patch tested?
    This patch should pass all of the current tests and keep all of the
    current behaviors. In my case, I'm using ProtobufDeserializer to get a
    table schema from Hive tables, and some Protobuf packages and class
    names contain `slf4j`. Without this patch, they cannot be resolved
    because of a ClassCastException between different classloaders.
    
    Author: Jongyoul Lee <[email protected]>
    
    Closes #20860 from jongyoul/SPARK-23743.
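    
    A minimal sketch of the comparison change (the class name below is
    hypothetical):
    
    ```python
    def is_slf4j_class_old(name):
        # Old logic: substring match, which misfires on user classes.
        return "slf4j" in name
    
    def is_slf4j_class_new(name):
        # New logic: match only the real org.slf4j package.
        return name.startswith("org.slf4j")
    
    name = "com.example.myslf4jproto.ProtobufDeserializer"
    print(is_slf4j_class_old(name))  # True  -> wrongly treated as shared
    print(is_slf4j_class_new(name))  # False -> loaded by the right classloader
    ```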

commit b02e76cbffe9e589b7a4e60f91250ca12a4420b2
Author: yucai <yyu1@...>
Date:   2018-03-30T07:07:38Z

    [SPARK-23727][SQL] Support for pushing down filters for DateType in parquet
    
    ## What changes were proposed in this pull request?
    
    This PR adds support for pushing down filters for DateType in Parquet.
    
    ## How was this patch tested?
    
    Added a UT and tested locally.
    
    Author: yucai <[email protected]>
    
    Closes #20851 from yucai/SPARK-23727.
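    
    A usage sketch: with this change, a predicate like the one below can be
    evaluated inside the Parquet reader instead of in Spark (path and data
    are illustrative):
    
    ```python
    import datetime
    import tempfile
    
    from pyspark.sql import SparkSession
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    path = tempfile.mkdtemp() + "/dates.parquet"
    
    spark.createDataFrame(
        [(datetime.date(2018, 1, 1),), (datetime.date(2018, 6, 1),)], ["d"]
    ).write.parquet(path)
    
    # The date comparison can now be pushed down to Parquet, which stores
    # DateType as days since the epoch.
    spark.read.parquet(path).filter("d > CAST('2018-03-01' AS DATE)").show()
    ```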

commit 5b5a36ed6d2bb0971edfeccddf0f280936d2275f
Author: Jose Torres <torres.joseph.f+github@...>
Date:   2018-03-30T13:54:26Z

    Roll forward "[SPARK-23096][SS] Migrate rate source to V2"
    
    ## What changes were proposed in this pull request?
    
    Roll forward c68ec4e (#20688).
    
    There are two minor test changes required:
    
    * An error which used to be TreeNodeException[ArithmeticException] is no 
longer wrapped and is now just ArithmeticException.
    * The test framework simply does not set the active Spark session. (Or 
rather, it doesn't do so early enough - I think it only happens when a query is 
analyzed.) I've added the required logic to SQLTestUtils.
    
    ## How was this patch tested?
    
    existing tests
    
    Author: Jose Torres <[email protected]>
    Author: jerryshao <[email protected]>
    
    Closes #20922 from jose-torres/ratefix.

commit bc8d0931170cfa20a4fb64b3b11a2027ddb0d6e9
Author: gatorsmile <gatorsmile@...>
Date:   2018-03-30T15:21:07Z

    [SPARK-23500][SQL][FOLLOWUP] Fix complex type simplification rules to apply to entire plan
    
    ## What changes were proposed in this pull request?
    This PR improves the test coverage of the original PR
    https://github.com/apache/spark/pull/20687
    
    ## How was this patch tested?
    N/A
    
    Author: gatorsmile <[email protected]>
    
    Closes #20911 from gatorsmile/addTests.

commit ae9172017c361e5c1039bc2ca94048117021974a
Author: Yuming Wang <yumwang@...>
Date:   2018-03-30T21:09:14Z

    [SPARK-23640][CORE] Fix hadoop config may override spark config
    
    ## What changes were proposed in this pull request?
    
    `spark.shuffle.service.port` may be picked up from the Hadoop configuration at
    https://github.com/apache/spark/blob/9745ec3a61c99be59ef6a9d5eebd445e8af65b7a/core/src/main/scala/org/apache/spark/deploy/SparkHadoopUtil.scala#L459
    
    Therefore, the client configuration `spark.shuffle.service.port` does
    not take effect unless it is set as
    `spark.hadoop.spark.shuffle.service.port`.
    
    - This configuration does not work:
    ```
    bin/spark-sql --master yarn --conf spark.shuffle.service.port=7338
    ```
    - This configuration works:
    ```
    bin/spark-sql --master yarn --conf 
spark.hadoop.spark.shuffle.service.port=7338
    ```
    
    This PR fixes this issue.
    
    ## How was this patch tested?
    
    This is difficult to cover with unit tests, but I've tested it manually.
    
    Author: Yuming Wang <[email protected]>
    
    Closes #20785 from wangyum/SPARK-23640.
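    
    A sketch of the intended precedence after the fix (the helper and dict
    arguments are hypothetical, not Spark internals; 7337 is the documented
    default shuffle service port):
    
    ```python
    def effective_shuffle_service_port(spark_conf, hadoop_conf, default=7337):
        # The explicit Spark setting must win over whatever was picked up
        # from the Hadoop configuration; the bug was the reverse order.
        if "spark.shuffle.service.port" in spark_conf:
            return int(spark_conf["spark.shuffle.service.port"])
        if "spark.shuffle.service.port" in hadoop_conf:
            return int(hadoop_conf["spark.shuffle.service.port"])
        return default
    
    print(effective_shuffle_service_port(
        {"spark.shuffle.service.port": "7338"},
        {"spark.shuffle.service.port": "7337"}))  # 7338
    ```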

commit 15298b99ac8944e781328423289586176cf824d7
Author: Tathagata Das <tathagata.das1565@...>
Date:   2018-03-30T23:48:26Z

    [SPARK-23827][SS] StreamingJoinExec should ensure that input data is partitioned into specific number of partitions
    
    ## What changes were proposed in this pull request?
    
    Currently, the requiredChildDistribution does not specify the number of
    partitions. This can cause weird corner cases where the child's
    distribution is `SinglePartition`, which satisfies the required
    distribution of `ClusteredDistribution` (with no num-partition
    requirement), thus eliminating the shuffle needed to repartition input
    data into the required number of partitions (i.e. the same as the state
    stores). That can lead to "file not found" errors on the state store
    delta files, as the micro-batch with no shuffle will not run certain
    tasks and therefore not generate the expected state store delta files.
    
    This PR adds the required constraint on the number of partitions.
    
    ## How was this patch tested?
    Modified test harness to always check that ANY stateful operator should 
have a constraint on the number of partitions. As part of that, the existing 
opt-in checks on child output partitioning were removed, as they are redundant.
    
    Author: Tathagata Das <[email protected]>
    
    Closes #20941 from tdas/SPARK-23827.

commit 529f847105fa8d98a5dc4d20955e4870df6bc1c5
Author: Xingbo Jiang <xingbo.jiang@...>
Date:   2018-03-31T02:34:01Z

    [SPARK-23040][CORE][FOLLOW-UP] Avoid double wrap result Iterator.
    
    ## What changes were proposed in this pull request?
    
    Address https://github.com/apache/spark/pull/20449#discussion_r172414393:
    if `resultIter` is already an `InterruptibleIterator`, don't double
    wrap it.
    
    ## How was this patch tested?
    Existing tests.
    
    Author: Xingbo Jiang <[email protected]>
    
    Closes #20920 from jiangxb1987/SPARK-23040.
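    
    A minimal sketch of the guard, with Python stand-ins for the Scala
    types:
    
    ```python
    class InterruptibleIterator:
        """Stand-in for org.apache.spark.InterruptibleIterator."""
        def __init__(self, context, underlying):
            self.context = context
            self.underlying = underlying
        def __iter__(self):
            return self
        def __next__(self):
            # The real class checks for task interruption here.
            return next(self.underlying)
    
    def wrap(context, result_iter):
        # The follow-up: skip wrapping when result_iter is already
        # interruptible, instead of wrapping unconditionally.
        if isinstance(result_iter, InterruptibleIterator):
            return result_iter
        return InterruptibleIterator(context, result_iter)
    ```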

commit 44a9f8e6e82c300dc61ca18515aee16f17f27501
Author: Bryan Cutler <cutlerb@...>
Date:   2018-04-02T16:53:37Z

    [SPARK-15009][PYTHON][FOLLOWUP] Add default param checks for CountVectorizerModel
    
    ## What changes were proposed in this pull request?
    
    Adding a test for default params for `CountVectorizerModel` constructed
    from a vocabulary. This required the param `maxDF` to be added, which
    was done in SPARK-23615.
    
    ## How was this patch tested?
    
    Added an explicit test for CountVectorizerModel in DefaultValuesTests.
    
    Author: Bryan Cutler <[email protected]>
    
    Closes #20942 from 
BryanCutler/pyspark-CountVectorizerModel-default-param-test-SPARK-15009.
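    
    The construction path under test looks roughly like this in PySpark
    (vocabulary and column names are illustrative):
    
    ```python
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import CountVectorizerModel
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    
    # Build the model directly from a vocabulary; the default-param check
    # verifies that params such as maxDF are also set on this path.
    model = CountVectorizerModel.from_vocabulary(
        ["a", "b", "c"], inputCol="words", outputCol="features")
    
    df = spark.createDataFrame([(["a", "b", "a"],)], ["words"])
    model.transform(df).show(truncate=False)
    ```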

commit 6151f29f9f589301159482044fc32717f430db6e
Author: David Vogelbacher <dvogelbacher@...>
Date:   2018-04-02T19:00:37Z

    [SPARK-23825][K8S] Requesting memory + memory overhead for pod memory
    
    ## What changes were proposed in this pull request?
    
    Kubernetes driver and executor pods should request `memory + 
memoryOverhead` as their resources instead of just `memory`, see 
https://issues.apache.org/jira/browse/SPARK-23825
    
    ## How was this patch tested?
    Existing unit tests were adapted.
    
    Author: David Vogelbacher <[email protected]>
    
    Closes #20943 from dvogelbacher/spark-23825.
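    
    The arithmetic at stake, as a sketch (the 10% factor and 384 MiB floor
    match Spark's documented memoryOverhead defaults; treat them as
    assumptions here):
    
    ```python
    def pod_memory_request_mib(executor_memory_mib, overhead_mib=None):
        # Assumed default: overhead = max(10% of executor memory, 384 MiB).
        if overhead_mib is None:
            overhead_mib = max(int(0.10 * executor_memory_mib), 384)
        # The fix: request memory + overhead for the pod, not just memory,
        # so off-heap usage doesn't push the pod past its limit.
        return executor_memory_mib + overhead_mib
    
    print(pod_memory_request_mib(4096))  # 4096 + 409 = 4505 MiB
    ```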

commit fe2b7a4568d65a62da6e6eb00fff05f248b4332c
Author: Yinan Li <ynli@...>
Date:   2018-04-02T19:20:55Z

    [SPARK-23285][K8S] Add a config property for specifying physical executor cores
    
    ## What changes were proposed in this pull request?
    
    As mentioned in SPARK-23285, this PR introduces a new configuration 
property `spark.kubernetes.executor.cores` for specifying the physical CPU 
cores requested for each executor pod. This is to avoid changing the semantics 
of `spark.executor.cores` and `spark.task.cpus` and their role in task 
scheduling, task parallelism, dynamic resource allocation, etc. The new 
configuration property only determines the physical CPU cores available to an 
executor. An executor can still run multiple tasks simultaneously by using 
appropriate values for `spark.executor.cores` and `spark.task.cpus`.
    
    ## How was this patch tested?
    
    Unit tests.
    
    felixcheung srowen jiangxb1987 jerryshao mccheah foxish
    
    Author: Yinan Li <[email protected]>
    Author: Yinan Li <[email protected]>
    
    Closes #20553 from liyinan926/master.
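    
    A configuration sketch showing the separation of concerns (values are
    illustrative):
    
    ```python
    from pyspark import SparkConf
    
    conf = (SparkConf()
            # Task scheduling: 4 task slots per executor, 1 CPU per task.
            .set("spark.executor.cores", "4")
            .set("spark.task.cpus", "1")
            # New: the physical CPU request for the executor pod, decoupled
            # from the scheduling settings above.
            .set("spark.kubernetes.executor.cores", "2"))
    ```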

commit a7c19d9c21d59fd0109a7078c80b33d3da03fafd
Author: Kazuaki Ishizaki <ishizaki@...>
Date:   2018-04-02T19:48:44Z

    [SPARK-23713][SQL] Cleanup UnsafeWriter and BufferHolder classes
    
    ## What changes were proposed in this pull request?
    
    This PR implements the following cleanups related to the `UnsafeWriter` class:
    - Remove code duplication between `UnsafeRowWriter` and `UnsafeArrayWriter`
    - Make `BufferHolder` class internal by delegating its accessor methods to 
`UnsafeWriter`
    - Replace `UnsafeRow.setTotalSize(...)` with 
`UnsafeRowWriter.setTotalSize()`
    
    ## How was this patch tested?
    
    Tested by existing UTs
    
    Author: Kazuaki Ishizaki <[email protected]>
    
    Closes #20850 from kiszk/SPARK-23713.

commit 28ea4e3142b88eb396aa8dd5daf7b02b556204ba
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-04-02T21:35:07Z

    [SPARK-23834][TEST] Wait for connection before disconnect in LauncherServer test.
    
    It was possible that the disconnect() was called on the handle before the
    server had received the handshake messages, so no connection was yet
    attached to the handle. The fix waits until we're sure the handle has been
    mapped to a client connection.
    
    Author: Marcelo Vanzin <[email protected]>
    
    Closes #20950 from vanzin/SPARK-23834.

commit a1351828d376a01e5ee0959cf608f767d756dd86
Author: Yogesh Garg <yogesh(dot)garg()databricks(dot)com>
Date:   2018-04-02T23:41:26Z

    [SPARK-23690][ML] Add handleinvalid to VectorAssembler
    
    ## What changes were proposed in this pull request?
    
    Introduce a `handleInvalid` parameter in `VectorAssembler` that can take
    the options `"keep", "skip", "error"`. "error" throws an error on seeing
    a row containing a `null`, "skip" filters out all such rows, and "keep"
    inserts the relevant number of NaNs. "keep" inspects an example row to
    determine how many NaNs should be added, and throws an error when no
    such number can be found.
    
    ## How was this patch tested?
    
    Unit tests are added to check the behavior of `assemble` on specific
    rows, and the transformer is called on `DataFrame`s of different
    configurations to test various corner cases.
    
    Author: Yogesh Garg <yogesh(dot)garg()databricks(dot)com>
    Author: Bago Amirbekian <[email protected]>
    Author: Yogesh Garg <[email protected]>
    
    Closes #20829 from yogeshg/rformula_handleinvalid.
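    
    A usage sketch of the new parameter (data is illustrative):
    
    ```python
    from pyspark.sql import SparkSession
    from pyspark.ml.feature import VectorAssembler
    
    spark = SparkSession.builder.master("local[1]").getOrCreate()
    
    df = spark.createDataFrame(
        [(1.0, 2.0), (3.0, None)], ["x1", "x2"])  # second row has a null
    
    assembler = VectorAssembler(
        inputCols=["x1", "x2"], outputCol="features", handleInvalid="skip")
    assembler.transform(df).show()  # the null row is filtered out
    
    # handleInvalid="error" would throw on the null row instead, and
    # handleInvalid="keep" would emit NaN in its place.
    ```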

commit 441d0d0766e9a6ac4c6ff79680394999ff7191fd
Author: Marcelo Vanzin <vanzin@...>
Date:   2018-04-03T01:31:47Z

    [SPARK-19964][CORE] Avoid reading from remote repos in SparkSubmitSuite.
    
    These tests can fail with a timeout if the remote repos are not
    responding or are slow. The tests don't need anything from those repos,
    so use an empty Ivy config file to avoid setting up the defaults.
    
    The tests now pass reliably for me locally, and as of today they fail
    more often than not without this change, since
    http://dl.bintray.com/spark-packages/maven doesn't seem to be loading
    from my machine.
    
    Author: Marcelo Vanzin <[email protected]>
    
    Closes #20916 from vanzin/SPARK-19964.

commit 8020f66fc47140a1b5f843fb18c34ec80541d5ca
Author: lemonjing <932191671@...>
Date:   2018-04-03T01:36:44Z

    [MINOR][DOC] Fix a few markdown typos
    
    ## What changes were proposed in this pull request?
    
    Easy fix in the markdown.
    
    ## How was this patch tested?
    
    Jekyll build tested manually.
    
    Please review http://spark.apache.org/contributing.html before opening a 
pull request.
    
    Author: lemonjing <[email protected]>
    
    Closes #20897 from Lemonjing/master.

commit 7cf9fab33457ccc9b2d548f15dd5700d5e8d08ef
Author: Xingbo Jiang <xingbo.jiang@...>
Date:   2018-04-03T13:26:49Z

    [MINOR][CORE] Show the block manager ID when removing an RDD/Broadcast fails.
    
    ## What changes were proposed in this pull request?
    
    Address https://github.com/apache/spark/pull/20924#discussion_r177987175:
    show the block manager ID when removing an RDD/Broadcast fails.
    
    ## How was this patch tested?
    
    N/A
    
    Author: Xingbo Jiang <[email protected]>
    
    Closes #20960 from jiangxb1987/bmid.

commit 66a3a5a2dc83e03dedcee9839415c1ddc1fb8125
Author: Jose Torres <torres.joseph.f+github@...>
Date:   2018-04-03T18:05:29Z

    [SPARK-23099][SS] Migrate foreach sink to DataSourceV2
    
    ## What changes were proposed in this pull request?
    
    Migrate foreach sink to DataSourceV2.
    
    Since the previous attempt at this PR #20552, we've changed and strictly 
defined the lifecycle of writer components. This means we no longer need the 
complicated lifecycle shim from that PR; it just naturally works.
    
    ## How was this patch tested?
    
    existing tests
    
    Author: Jose Torres <[email protected]>
    
    Closes #20951 from jose-torres/foreach.

----

