Hi Antony,
Yes, it's possible to "inject"/reuse an existing Spark context via the pipeline
options. From SparkPipelineOptions:
@Description("If the spark runner will be initialized with a provided Spark
Context. "
+ "The Spark Context should be provided with SparkContextOptions.")
@Default.Boolean(false)
boolean getUsesProvidedSparkContext();
void setUsesProvidedSparkContext(boolean value);
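For example, here is a rough sketch of how it could be wired up, assuming the JavaSparkContext created by your existing batch job is available (the method name runBeamOnProvidedContext and the parameters jsc and collected are just placeholders for whatever your job already has):

import java.util.List;
import org.apache.beam.runners.spark.SparkContextOptions;
import org.apache.beam.runners.spark.SparkRunner;
import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.transforms.Create;
import org.apache.spark.api.java.JavaSparkContext;

// jsc: the JavaSparkContext your existing Spark batch job already created.
// collected: the (small) output of that job, e.g. the result of rdd.collect().
void runBeamOnProvidedContext(JavaSparkContext jsc, List<String> collected) {
  // SparkContextOptions extends SparkPipelineOptions and carries the context itself,
  // so it has to be built programmatically rather than from command line arguments.
  SparkContextOptions options = PipelineOptionsFactory.as(SparkContextOptions.class);
  options.setRunner(SparkRunner.class);
  options.setUsesProvidedSparkContext(true);
  options.setProvidedSparkContext(jsc);

  Pipeline pipeline = Pipeline.create(options);
  pipeline.apply(Create.of(collected));
  // ... the rest of the Beam pipeline ...
  pipeline.run().waitUntilFinish();
}

The Beam pipeline then runs on the provided context instead of creating its own, which should avoid the "two separate Spark contexts" failure you are seeing.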
Regards
JB
On 05/10/2017 10:16 AM, Antony Mayi wrote:
I've got a (dirty) use case where I have an existing Spark batch job which produces
an output that I would like to feed into my Beam pipeline (assuming it runs on the
SparkRunner). I was trying to run it as one job (the output is reduced, so it's not
big data and it should be fine to do something like Create.of(rdd.collect())), but
that's failing because of the two separate Spark contexts.
Is it possible to build the Beam pipeline on an existing Spark context?
thx,
Antony.
--
Jean-Baptiste Onofré
[email protected]
http://blog.nanthrax.net
Talend - http://www.talend.com