Hi team,

We have a specific use case where we are trying to save off a map from the train function and reuse it in the predict function to reduce our predict function's response time. I know that collect() forces everything to the driver, but we are collecting the RDD to a map because we don't have a SparkContext in the predict function.
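For context, here is roughly the shape of what we have today (a minimal sketch: the case classes, field names, and scoring data are simplified stand-ins for our actual code, and the package names assume the incubator releases):

    import org.apache.predictionio.controller.{P2LAlgorithm, Params}
    import org.apache.spark.SparkContext
    import org.apache.spark.rdd.RDD

    case class AlgorithmParams(n: Int) extends Params
    case class Query(item: String)
    case class PredictedResult(score: Double)
    case class PreparedData(scores: RDD[(String, Double)])

    // The collected map lives in the model object on the driver.
    class Model(val scores: Map[String, Double]) extends Serializable

    class Algorithm(ap: AlgorithmParams)
      extends P2LAlgorithm[PreparedData, Model, Query, PredictedResult] {

      def train(sc: SparkContext, data: PreparedData): Model = {
        // collectAsMap() pulls every (item, score) pair onto the driver;
        // this is the step that trips spark.driver.maxResultSize.
        new Model(data.scores.collectAsMap().toMap)
      }

      def predict(model: Model, query: Query): PredictedResult = {
        // No SparkContext here; just an in-memory map lookup.
        PredictedResult(model.scores.getOrElse(query.item, 0.0))
      }
    }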
I am getting the error below and am looking for a way to raise the limit from 1G to 4G+. I can see a way to do it in Spark 1.6, but we are using Spark 2.1.1 and I have not found the equivalent setting. *Has anyone been able to adjust maxResultSize to something more than 1G?*

    Exception in thread "main" org.apache.spark.SparkException: Job aborted
    due to stage failure: Total size of serialized results of 7 tasks
    (1156.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

I have tried to set this parameter, but with Spark 2.1.1 I get:

    Error: Unrecognized option: --driver-maxResultSize

Our other option is to do the work to obtain a SparkContext in the predict function so we can pass the RDD through from the train function to the predict function. The PredictionIO documentation was a little unclear to me. *Is this the right place to learn how to get a SparkContext in the predict function?*
https://predictionio.incubator.apache.org/templates/vanilla/dase/

Also, I am not seeing in that documentation how to get the SparkContext into the predict function; it looks like it is only used in the train function.

Thanks in advance for your expertise.

*Shane Johnson | 801.360.3350*
LinkedIn <https://www.linkedin.com/in/shanewjohnson> | Facebook <https://www.facebook.com/shane.johnson.71653>
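P.S. To clarify what we are after: in a plain Spark app we would set this as a generic Spark property rather than a dedicated spark-submit flag, either with --conf spark.driver.maxResultSize=4g on spark-submit or on the SparkConf before the context is created. A minimal standalone sketch (the app name and job body are placeholders):

    import org.apache.spark.{SparkConf, SparkContext}

    object MaxResultSizeExample {
      def main(args: Array[String]): Unit = {
        val conf = new SparkConf()
          .setAppName("max-result-size-example") // placeholder app name
          .set("spark.driver.maxResultSize", "4g") // raise the 1g default
        val sc = new SparkContext(conf)
        // ... job that collects a large result back to the driver ...
        sc.stop()
      }
    }

Since PredictionIO creates the SparkContext for us, what we really need is for the equivalent of pio train -- --conf spark.driver.maxResultSize=4g (assuming arguments after -- are forwarded to spark-submit) to take effect.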
