I have two RDDs:

leftRDD = RDD[(Long, (DetailInputRecord, VISummary, Long))]
and
rightRDD =
RDD[(Long, com.ebay.ep.poc.spark.reporting.process.model.dw.SpsLevelMetricSum)]

DetailInputRecord is an object that contains (guid, sessionKey,
sessionStartDate, siteId).

There are 10 records in leftRDD (confirmed with leftRDD.count), and every
DetailInputRecord in leftRDD has data in each of its members.

I then call leftRDD.leftOuterJoin(rightRDD), where:

viEventsWithListings  = leftRDD
spsLvlMetric   = rightRDD

val viEventsWithListingsJoinSpsLevelMetric =
  viEventsWithListings.leftOuterJoin(spsLvlMetric).map {
    // Each joined element has the shape (key, (leftValue, Option[rightValue])).
    case (sellerId, ((viEventDetail, viSummary, itemId), spsLvlMetric)) =>
      println("sellerId:" + sellerId)
      println("sessionKey:" + viEventDetail.get("sessionKey"))
      println("guid:" + viEventDetail.get("guid"))
      println("sessionStartDate:" + viEventDetail.get("sessionStartDate"))
      println("siteId:" + viEventDetail.get("siteId"))

      // spsLvlMetric is an Option here; it is None when the right side has no match.
      if (spsLvlMetric.isDefined) {
        // do something
      }
  }
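
To make the shape I expect from the join concrete, here is a minimal sketch that uses plain Scala collections instead of Spark. Detail, Metric and the local leftOuterJoin helper are hypothetical stand-ins written only for illustration; they are not my real classes and not the Spark API, and the sketch only mimics the (key, (leftValue, Option[rightValue])) result shape.

```scala
// Hypothetical stand-ins for the real classes; only the field names come from the post.
case class Detail(guid: String, sessionKey: String)
case class Metric(score: Int)

object JoinShape {
  // Emulates the result shape of a left outer join on plain Seqs:
  // every left element is kept, paired with Some(right) for each match, or None.
  def leftOuterJoin[K, V, W](left: Seq[(K, V)], right: Seq[(K, W)]): Seq[(K, (V, Option[W]))] = {
    val rightByKey: Map[K, Seq[W]] =
      right.groupBy(_._1).map { case (k, kvs) => k -> kvs.map(_._2) }
    left.flatMap { case (k, v) =>
      rightByKey.get(k) match {
        case Some(ws) =>
          ws.map { w =>
            val matched: Option[W] = Some(w)
            (k, (v, matched))
          }
        case None =>
          val missing: Option[W] = None
          Seq((k, (v, missing)))
      }
    }
  }

  def main(args: Array[String]): Unit = {
    // Left values mirror (DetailInputRecord, VISummary, Long) with placeholder types.
    val left = Seq(
      (1L, (Detail("g1", "s1"), "summary1", 100L)),
      (2L, (Detail("g2", "s2"), "summary2", 200L)))
    val right = Seq((1L, Metric(5)))

    leftOuterJoin(left, right).foreach {
      case (sellerId, ((detail, _, itemId), metricOpt)) =>
        println(s"sellerId=$sellerId guid=${detail.guid} " +
          s"sessionKey=${detail.sessionKey} itemId=$itemId metric=$metricOpt")
    }
  }
}
```

In this sketch each field keeps its own value after the join, which is the behaviour I expected from the real RDDs as well.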

I print each member of the DetailInputRecord (viEventDetail) in
viEventsWithListings both before and inside the leftOuterJoin. Before the
leftOuterJoin, each member of every record (10 records in total) has its own
value.

Inside the join, however, the print statements show the guid as the value of
every member. How is this possible?

Output inside the join (every value printed is the guid):
sessionKey:27c9fbc014b4f61526f0574001b73b00
guid:27c9fbc014b4f61526f0574001b73b00
sessionStartDate:27c9fbc014b4f61526f0574001b73b00
siteId:27c9fbc014b4f61526f0574001b73b00

What went wrong? I have debugged this multiple times but cannot work out the
reason.
I would appreciate your help.
-- 
Deepak
