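I think you have to use aliases. Both Datasets expose the same column names,
so $"edid" === $"edid" compares a column with itself, which is why you get the
"trivially true equals predicate" warning. To give a Dataset an alias: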
val d1 = a.as("d1")
val d2 = b.as("d2")
Then join, using the alias in the column names:
d1.joinWith(d2, $"d1.edid" === $"d2.edid")
Finally, please double-check your column names. I did not see "edid" in your
case class; the closest field I see is "recentEdid".
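
In case it helps, here is a minimal self-contained sketch of the alias approach. It makes several assumptions that are not from your code: it uses SparkSession (Spark 2.0 or later; on 1.6 you would import sqlContext.implicits._ instead), a trimmed-down stand-in for your Stat case class, made-up sample rows, and it guesses that recentEdid is the column you meant to join on.

```scala
import org.apache.spark.sql.{Dataset, SparkSession}

// Hypothetical, trimmed-down stand-in for the Stat case class
// (only three of the original fields are kept).
case class Stat(statType: Int, recentEdid: Int, avg: Double)

object AliasJoinSketch {

  // Builds two small Datasets of the same case class and joins them via aliases.
  def joinOnRecentEdid(spark: SparkSession): Dataset[(Stat, Stat)] = {
    import spark.implicits._ // provides toDS() and the $"..." column syntax

    // Made-up sample data; only recentEdid = 100 appears on both sides.
    val a: Dataset[Stat] = Seq(Stat(1, 100, 0.5), Stat(2, 200, 1.5)).toDS()
    val b: Dataset[Stat] = Seq(Stat(3, 100, 2.5), Stat(4, 300, 3.5)).toDS()

    // Alias each side so the two identically named columns can be told apart.
    val d1 = a.as("d1")
    val d2 = b.as("d2")

    // Assumption: recentEdid is the intended join key.
    d1.joinWith(d2, $"d1.recentEdid" === $"d2.recentEdid")
  }

  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .master("local[*]")
      .appName("alias-join-sketch")
      .getOrCreate()

    joinOnRecentEdid(spark).show()
    spark.stop()
  }
}
```

joinWith defaults to an inner join and returns a Dataset[(Stat, Stat)], so each result row keeps both original objects and there is no column-name clash in the output.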
Xinh
On Thu, Mar 10, 2016 at 9:09 PM, 박주형 wrote:
> Hi. I want to join two Datasets, but the stderr output below is shown:
>
> 16/03/11 13:55:51 WARN ColumnName: Constructing trivially true equals
> predicate, ''edid = 'edid'. Perhaps you need to use aliases.
> Exception in thread "main" org.apache.spark.sql.AnalysisException: cannot
> resolve 'edid' given input columns dataType, avg, sigma, countUnique,
> numRows, recentEdid, categoryId, accCount, statType, categoryId, max,
> accCount, firstQuarter, recentEdid, replicationRateAvg, numRows, min,
> countNotNull, countNotNull, dcid, numDistinctRows, max, firstQuarter, min,
> replicationRateAvg, dcid, statType, avg, sigma, dataType, median,
> thirdQuarter, numDistinctRows, median, countUnique, thirdQuarter;
>
>
> my case class is
> case class Stat(statType: Int, dataType: Int, dcid: Int,
> categoryId: Int, recentEdid: Int, countNotNull: Int, countUnique:
> Int, accCount: Int, replicationRateAvg: Double,
> numDistinctRows: Double, numRows: Double,
> min: Double, max: Double, sigma: Double, avg: Double,
> firstQuarter: Double, thirdQuarter: Double, median: Double)
>
> and my code is
> a.joinWith(b, $"edid" === $"edid").show()
>
> If I use DataFrames, renaming a's column would solve it. How can I join two
> Datasets of the same case class?
>