[ https://issues.apache.org/jira/browse/SPARK-32534?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Hyukjin Kwon updated SPARK-32534:
---------------------------------
    Priority: Minor  (was: Blocker)

> Cannot load a Pipeline Model on a stopped Spark Context
> -------------------------------------------------------
>
>                 Key: SPARK-32534
>                 URL: https://issues.apache.org/jira/browse/SPARK-32534
>             Project: Spark
>          Issue Type: Bug
>          Components: Deploy, Kubernetes
>    Affects Versions: 2.4.6
>            Reporter: Kevin Van Lieshout
>            Priority: Minor
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> I am running Spark in a Kubernetes cluster that is running Spark NLP, using
> the PySpark ML PipelineModel class to load the model and then transform a
> Spark dataframe. We run this within a Docker container that starts up a
> Spark context, mounts volumes, spins up executors, etc., then does its
> transformations, UDFs, etc., and then closes down the Spark context.
>
> The first time I load the model, when my service has just been started,
> everything is fine. If I run my application a second time without resetting
> my service, even though the context from the previous run is entirely
> stopped and a new one has been started, the Pipeline Model has some
> attribute in one of its base classes that thinks the context it's running on
> is closed. I then get a "cannot call a function on a stopped spark context"
> error when I try to load the model in my service again.
>
> I have to shut down my service each time I want consecutive runs through my
> Spark pipeline, which is not ideal. Is this a common issue among fellow
> PySpark users of Pipeline Model? Is there a common workaround for resetting
> all Spark contexts, or does the Pipeline Model cache a Spark context of some
> sort? Any help is very useful.
>
>
> cls.pipeline = PipelineModel.read().load(NLP_MODEL)
>
> is how I load the model. And our Spark context is very similar to a typical
> kubernetes/spark setup.
> Nothing special there.

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
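
The failure mode the reporter suspects (a class-level attribute that keeps pointing at a context which has since been stopped) can be illustrated without Spark. The sketch below is only that: `FakeContext`, `FakeLoader` and `run_service` are hypothetical stand-ins invented for this illustration, not PySpark or Spark NLP API, and nothing here confirms that PipelineModel actually behaves this way.

```python
class FakeContext:
    """Hypothetical stand-in for a Spark context (not a PySpark class)."""

    def __init__(self):
        self.stopped = False

    def stop(self):
        self.stopped = True

    def call(self, fn):
        # Mimics the kind of error described in the report.
        if self.stopped:
            raise RuntimeError("cannot call a function on a stopped context")
        return fn()


class FakeLoader:
    """Hypothetical loader that remembers the first context it ever saw."""

    _cached_context = None  # class-level: outlives any single service run

    @classmethod
    def load(cls, path, context):
        if cls._cached_context is None:
            cls._cached_context = context
        # Always goes through the cached context, which may be stale.
        return cls._cached_context.call(lambda: "model loaded from " + path)

    @classmethod
    def reset(cls):
        cls._cached_context = None


def run_service(path, reset_cache):
    """One service run: start a context, load the model, stop the context."""
    context = FakeContext()
    try:
        if reset_cache:
            FakeLoader.reset()
        return FakeLoader.load(path, context)
    finally:
        context.stop()
```

In this toy model the first `run_service` call succeeds, a second call with `reset_cache=False` raises because the loader still holds the stopped context, and a call with `reset_cache=True` succeeds again. Whether an equivalent reset exists for the real classes involved is exactly the open question in the report.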