Dear Romi, Priya, Sujt, Shivaram, and all,
I have spent many days thinking about this issue but have not found a good 
enough solution. I would appreciate any help.
There is an RDD<StringDate> rdd1 and another RDD<StringDate, float> rdd2 
(rdd2 can be a PairRDD, or a DataFrame with two columns <StringDate, float>). 
The StringDate values in rdd1 and rdd2 overlap but are not identical.

I would like to get a new RDD<StringDate, float> rdd3. The StringDate values 
in rdd3 would be exactly those of rdd1, and the float in rdd3 would come from 
rdd2 if that StringDate is in rdd2; otherwise NULL would be assigned.
That is, each row rdd3[ i ] = <rdd1[ i ].StringDate, float or NULL>: 
if some row of rdd2 has the same StringDate as rdd1[ i ].StringDate, 
then that row's float is assigned to the float part of rdd3[ i ]; if not, 
the float part is NULL. Which API or function should I use?
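
To make the desired result concrete, here is a small sketch in plain Python 
(not Spark). The dates, the values, and the helper name fill_from are made up 
purely for illustration. The behaviour looks like a left outer join of rdd1 
with rdd2 keyed on StringDate, but I am not sure which Spark API fits best.

```python
# Made-up sample data standing in for rdd1 and rdd2 (plain lists, not Spark RDDs).
rdd1 = ["2015-10-01", "2015-10-02", "2015-10-03", "2015-10-04"]
rdd2 = [("2015-10-02", 1.5), ("2015-10-04", 2.5), ("2015-10-09", 9.0)]


def fill_from(dates, pairs):
    """Keep every date in `dates`; attach the float from `pairs`, or None if absent."""
    lookup = dict(pairs)  # StringDate -> float
    return [(d, lookup.get(d)) for d in dates]


# Hypothetical helper applied to the sample data; None plays the role of NULL.
rdd3 = fill_from(rdd1, rdd2)
for row in rdd3:
    print(row)
```

Every date from rdd1 appears once in rdd3, dates missing from rdd2 get None, 
and dates that exist only in rdd2 are dropped.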
Thanks very much!
Zhiliang