spark program for dependent cascading operations

srimugunthan dhandapani Thu, 29 Dec 2016 02:46:38 -0800

Hi all,

Can somebody help me solve the problem below using Spark?


There are two datasets of numbers:

Set1= {4,6,11,14} and Set2= {5,11,12,3}

I have to subtract each element of Set2 from the corresponding element of
Set1. If the element in Set2 is bigger, the result is 0 and the residue left
over after the subtraction is carried forward and also subtracted from the
next element of Set1.

For example:

Set1.[0] - Set2.[0] will be (4-5) [result = 0 and the residue = 1, which is
used in the subtraction for the next element]

Set1.[1] - Set2.[1] will be (6 - 1(residue from last subtraction) - 11)
[result = 0 and the residue is 6]

Set1.[2] - Set2.[2] will be (11-6-12) [result = 0 and with residue = 7]

Set1.[3] - Set2.[3] will be (14-7-3) [result = 4 and with residue = 0]

giving the answer sets {0, 0, 0, 4} (results) and {1, 6, 7, 0} (residues)
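
To make the rule concrete, the first example can be written as an ordinary
sequential loop. This is only a sketch in plain Python, not Spark code. The
function name cascade_subtract is made up for illustration, and the sketch
assumes both sets have the same length, so it does not cover inputs of
unequal length.

```python
def cascade_subtract(set1, set2):
    """Sequential cascading subtraction (equal-length inputs assumed)."""
    if len(set1) != len(set2):
        raise ValueError("this sketch only handles sets of equal length")
    results, residues = [], []
    residue = 0  # nothing is carried into the first step
    for a, b in zip(set1, set2):
        # Subtract the carried residue and the Set2 element from the Set1 element.
        diff = a - residue - b
        results.append(max(diff, 0))   # a negative difference gives result 0
        residue = max(-diff, 0)        # the shortfall is carried to the next step
        residues.append(residue)
    return results, residues


results, residues = cascade_subtract([4, 6, 11, 14], [5, 11, 12, 3])
print(results, residues)  # prints [0, 0, 0, 4] [1, 6, 7, 0]
```

Each step needs the residue from the step before it, which is the dependency
the question is about.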

Here are two more examples:

Set1 = {12, 4, 5} and Set2 = {10, 11, 12, 13}
giving answer sets { 2, 0, 0} and {0, 7, 14, 13}

and

Set1 = {4, 6, 7, 2} and Set2 = {6, 10}
giving answer sets {0, 0, 1, 2} and {2, 6}

Can Spark's programming model be used to perform dependent operations on
two Spark RDDs? If so, how can I use Spark's DataFrame or RDD API to perform
these operations?
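
One possible direction, offered as a sketch rather than a confirmed answer:
for sets of equal length, the carried residue r[i] = max(0, r[i-1] + Set2[i] -
Set1[i]) can be rewritten without the step-by-step dependency, as a running
sum minus a running minimum. Cumulative sums and minimums are the kind of
operation that ordered window aggregates can usually express, so this form
may be easier to port to a DataFrame. The code below is plain Python and
needs Python 3.8 or later, not Spark, and the function name
cascade_subtract_scan is hypothetical.

```python
from itertools import accumulate


def cascade_subtract_scan(set1, set2):
    """Same rule as the sequential loop, expressed with prefix scans.

    Assumes both inputs have the same length.
    """
    # Per-position shortfall: positive when the Set2 element is bigger.
    shortfall = [b - a for a, b in zip(set1, set2)]
    # Running sum of the shortfalls.
    running_sum = list(accumulate(shortfall))
    # Running minimum of those sums, starting from 0 (no residue at the start).
    running_min = list(accumulate(running_sum, min, initial=0))[1:]
    # Residue after each step = running sum minus running minimum.
    residues = [s - m for s, m in zip(running_sum, running_min)]
    # Residue carried into each step is the previous step's residue.
    carried = [0] + residues[:-1]
    results = [max(a - r - b, 0) for a, r, b in zip(set1, carried, set2)]
    return results, residues


print(cascade_subtract_scan([4, 6, 11, 14], [5, 11, 12, 3]))
# prints ([0, 0, 0, 4], [1, 6, 7, 0])
```

The two scans no longer depend on the previous result row by row, only on
the order of the elements, so both sets would need an explicit index column
before being joined.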
