Hi Shane,

I think what you are looking for, to set the max result size on the driver, is to pass a
spark-submit argument through pio train. Everything after the -- separator is handed
straight to spark-submit, which is also why the --driver-maxResultSize option you tried is
rejected: spark-submit has no option by that name. The command looks something like this:

pio train ... -- --conf spark.driver.maxResultSize=4g ...

Regarding PAlgorithm, the predict() method does not actually take the SparkContext as an
argument. The "model" argument, however, unlike in P2LAlgorithm, can contain RDDs, so in
PAlgorithm.predict() you can perform RDD operations directly on the model. If the
SparkContext is needed, it can be obtained from a model RDD via its context() method.
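
For illustration only, here is a minimal sketch of what that could look like. The Query,
PredictedResult, and PreparedData types and all names below are made up for the example
(your template defines its own), and the controller package may differ by PredictionIO
version (org.apache.predictionio.controller in recent releases, io.prediction.controller
in older ones):

import org.apache.predictionio.controller.PAlgorithm
import org.apache.spark.SparkContext
import org.apache.spark.rdd.RDD

// Illustrative types; your template will have its own.
case class Query(entityId: String)
case class PredictedResult(score: Double)
case class PreparedData(events: RDD[(String, Double)])

// The model holds an RDD, which PAlgorithm allows (P2LAlgorithm does not).
class Model(val scores: RDD[(String, Double)]) extends Serializable

class Algorithm
  extends PAlgorithm[PreparedData, Model, Query, PredictedResult] {

  def train(sc: SparkContext, data: PreparedData): Model =
    new Model(data.events.reduceByKey(_ + _))

  def predict(model: Model, query: Query): PredictedResult = {
    // RDD operations run directly against the model's RDD...
    val score = model.scores.lookup(query.entityId).headOption.getOrElse(0.0)
    // ...and the SparkContext is reachable from the RDD if you ever need it.
    val sc: SparkContext = model.scores.context
    PredictedResult(score)
  }
}

Because predict() reads from the distributed RDD instead of a collected map, nothing has
to fit under spark.driver.maxResultSize in the first place.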

Hope these help.


On Wed, Feb 21, 2018 at 12:08 PM Shane Johnson <shanewaldenjohn...@gmail.com> wrote:

> Hi team,
> We have a specific use case where we are trying to save a map from the
> train function and reuse it in the predict function to improve our predict
> function's response time. I know that collect() forces everything to the
> driver. We are collecting the RDD to a map because we don't have a SparkContext
> in the predict function.
> I am getting this error and am looking for a way to adjust the parameter
> from 1G to 4G+. I can see a way to do it in Spark 1.6 but we are using
> Spark 2.1.1 and I have not seen the ability to set this. *Has anyone been
> able to adjust the maxResultSize to something more than 1G?*
> Exception in thread "main" org.apache.spark.SparkException: Job aborted due 
> to stage failure: Total size of serialized results of 7 tasks (1156.3 MB) is 
> bigger than spark.driver.maxResultSize (1024.0 MB)
> I have tried to set this parameter but get this as a result with Spark
> 2.1.1
> Error: Unrecognized option: --driver-maxResultSize
> Our other option is to do the work to obtain a SparkContext in the
> predict function so we can pass the RDD through from the train function to
> the predict function. The PredictionIO documentation was a little unclear to me. *Is
> this the right place to learn how to get a SparkContext in the predict
> function?* https://predictionio.incubator.apache.org/templates/vanilla/dase/
> Also, I am not seeing in this documentation how to get the SparkContext
> into the predict function; it looks like it is only used in the train
> function.
> Thanks in advance for your expertise.
> *Shane Johnson | 801.360.3350*
> LinkedIn <https://www.linkedin.com/in/shanewjohnson> | Facebook
> <https://www.facebook.com/shane.johnson.71653>
