Hi Shiyuan,
Re 1) Yes, but this has (almost) nothing to do with Spark: model1 =
pipeline1.fit(df) is a blocking call, so the line after it is only
executed once that call has returned.
Re 2) Use a concurrency library like Java's
https://docs.oracle.com/javase/8/docs/api/j
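
A minimal sketch of the idea in Re 2), written in Python with the standard
concurrent.futures module rather than the Java library, on the assumption
(from the # markers) that the original code is PySpark. fit_stand_in and
fit_concurrently are hypothetical names, and the sleep only simulates a
blocking call such as pipeline1.fit(df); no Spark is involved here. In a
real application each thread would submit its own job, which Spark's
scheduler is documented to support.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def fit_stand_in(name, seconds):
    """Hypothetical stand-in for a blocking call such as pipeline1.fit(df)."""
    time.sleep(seconds)  # simulates the calling thread waiting on the job
    return f"{name} fitted"


def fit_concurrently(jobs):
    """Run each (name, seconds) job in its own thread and wait for all results."""
    with ThreadPoolExecutor(max_workers=len(jobs)) as pool:
        # submit() returns immediately, so both jobs are in flight at once
        futures = [pool.submit(fit_stand_in, name, seconds) for name, seconds in jobs]
        # result() blocks until that job is done; order matches the input order
        return [future.result() for future in futures]


if __name__ == "__main__":
    start = time.perf_counter()
    results = fit_concurrently([("model1", 0.2), ("model2", 0.2)])
    elapsed = time.perf_counter() - start
    print(results, f"elapsed: {elapsed:.2f}s")
```

Threads rather than processes are used here because the driver thread
mostly waits while the cluster does the work, so the two calls can overlap.
Whether they actually run side by side on the cluster depends on the
resources available and the scheduler configuration.
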
Hi Spark users,
I am looking for a way to parallelize #A and #B in the code below. Since
DataFrames in Spark are immutable, #A and #B are completely independent
operations.
My question is:
1). As of Spark 2.1, #B only starts once #A has completed. Is that right?
2). What's the best way to paralleli