If you use the join that takes USING columns, it should automatically coalesce (take the non-null value from) the left and right columns:
https://github.com/apache/spark/blob/master/sql/core/src/main/scala/org/apache/spark/sql/DataFrame.scala#L405

On Tue, Jan 19, 2016 at 10:51 PM, Zhong Wang <wangzhong....@gmail.com> wrote:

> Hi all,
>
> I am joining two tables with common columns using full outer join.
> However, the current Dataframe API doesn't support nature joins, so the
> output contains redundant common columns from both of the tables.
>
> Is there any way to remove these redundant columns for a "nature" full
> outer join? For a left outer join or right outer join, I can select just
> the common columns from the left table or the right table. However, for a
> full outer join, it seems it is quite difficult to do that, because there
> are null values in both of the left and right common columns.
>
>
> Thanks,
> Zhong
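
For reference, here is a minimal sketch of the USING-column join mentioned in the reply, next to a manual equivalent that coalesces the key column by hand. The table contents and the column names (`id`, `left_val`, `right_val`) are made up for illustration, and the sketch assumes a Spark version that provides `SparkSession` (2.0 or later); on 1.6 the same `join` call should work on DataFrames created through a `SQLContext`.

```scala
import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.coalesce

object FullOuterUsingJoin {
  def main(args: Array[String]): Unit = {
    // Local session for illustration only (assumes Spark 2.0+).
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("using-join-sketch")
      .getOrCreate()
    import spark.implicits._

    // Hypothetical example tables sharing the key column "id".
    val left  = Seq((1, "a"), (2, "b")).toDF("id", "left_val")
    val right = Seq((2, "x"), (3, "y")).toDF("id", "right_val")

    // USING-style full outer join: the result has a single "id" column
    // holding the non-null key from whichever side matched.
    val joined = left.join(right, Seq("id"), "outer")
    joined.show()

    // Manual equivalent: join on an expression, then coalesce the two
    // key columns into one and drop the duplicates via select.
    val manual = left
      .join(right, left("id") === right("id"), "outer")
      .select(
        coalesce(left("id"), right("id")).as("id"),
        $"left_val",
        $"right_val")
    manual.show()

    spark.stop()
  }
}
```

Both results should contain one row per distinct `id` across the two tables, with nulls in `left_val` or `right_val` where the other side had no match. The manual form is only needed when the join condition is more than plain equality on shared column names.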