[ https://issues.apache.org/jira/browse/DATAFU-159?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17927485#comment-17927485 ]
Eyal Allweil commented on DATAFU-159:
-------------------------------------
Thank you, [~anuta]! We will look into it. Do you have an example of
input and output that would make it easier to understand?
> Add diff functionality to datafu-spark
> --------------------------------------
>
> Key: DATAFU-159
> URL: https://issues.apache.org/jira/browse/DATAFU-159
> Project: DataFu
> Issue Type: New Feature
> Reporter: Eyal Allweil
> Priority: Major
>
> A useful feature when examining results is the ability to clearly see the
> differences between two datasets - for example, when running regression tests
> that compare expected and actual results.
> Spark provides the _except_ functionality, but it is often not enough for
> this purpose - for example, see [this question on Stack
> Overflow|https://stackoverflow.com/questions/44338412/how-to-compare-two-dataframe-and-print-columns-that-are-different-in-scala].
> Datafu-pig had a macro for doing this, and it could be a useful addition to
> datafu-spark.
>
>