[ https://issues.apache.org/jira/browse/SPARK-30296?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17012430#comment-17012430 ]
Dongjoon Hyun commented on SPARK-30296:
---------------------------------------
Hi, [~EnricoMi].
Please don't set `Fixed Version`. Committers set that field when they merge
the PR. Also, a `New Feature` issue should have the version of the `master`
branch, 3.0.0 (as of today), because the Apache Spark community has a policy
that allows backporting bug fixes only.
- https://spark.apache.org/contributing.html
> Dataset diffing transformation
> ------------------------------
>
> Key: SPARK-30296
> URL: https://issues.apache.org/jira/browse/SPARK-30296
> Project: Spark
> Issue Type: New Feature
> Components: SQL
> Affects Versions: 3.0.0
> Reporter: Enrico Minack
> Priority: Major
>
> Evolving Spark code needs frequent regression testing to prove it still
> produces identical results, or if changes are expected, to investigate those
> changes. Diffing the Datasets of two code paths provides confidence.
> Diffing small schemata is easy, but with wide schemata the Spark query
> becomes laborious and error-prone. With a single proven and tested method,
> diffing becomes an easier and more reliable operation. As a Dataset
> transformation, this operation is available directly through the Dataset API.
> This has proven to be useful for interactive Spark sessions as well as
> deployed production code.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)