Hi All, I have a requirement to pivot multiple columns using single columns, the pivot API doesn't support doing that hence I have been doing pivot for two columns and then trying to merge the dataset the result is producing empty dataset. Below is the sudo code
Main dataset => 33 columns (30 columns are string and 2 columns are of type double array lets say vector1 and vector2, 1 column Decider which has 0 & 1 values) String grouByColumns = "col1,col2,col3,col4,col5,col6.......col30"; Vector columns : Vector1 and Vector2 i do pivot like below List< Object > values = new ArrayList<Object>(); values.add("0"); values.add("1") Dataset<Row> pivot1 = mainDataset.grouBy(grouByColumns).pivot("Decider",values).agg(functions.callUDF(CUSTOM_UDAF,mainDataset.col("Vector1")); pivot1 = pivot1.withColumRenamed("0","Vector1_0"); pivot1 = pivot1.withColumRenamed("1","Vector1_1"); *Count on pivot1* = 12856 Dataset<Row> pivot2 = mainDataset.grouBy(grouByColumns).pivot("Decider",values).agg(functions.callUDF(CUSTOM_UDAF,mainDataset.col("Vector2")); pivot2 = pivot2.withColumRenamed("0","Vector2_0"); pivot2 = pivot2.withColumRenamed("1","Vector2_1"); *Count on pivot2* = 12856 Dataset<Row> finalDataset = pivot1.join(pivot2,Seq<grouByColumns >); *Count on pivot1 *= 0 ? Why this sould be 12856 right? The same code works on local with less columns and 100 records. Is there anything i am missing here is there any better way to pivot the multiple columns i can not do combine because my aggregation columns are array of doubles. The pivot1 & pivot2 dataset derived by same parent dataset the group by columns are same all i am doing is inner join on these two dataset with same group by columns why it doesn't work? Thank you Anil Langote