That turned out to be a silly data type mistake. At one point in the iterative call, I was passing an integer value for the 'alpha' parameter of the ALS train API, which expects a Double. Since py4j resolves the JVM method by the exact argument types, it complained that no method taking an integer value for that parameter exists.
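For anyone hitting the same trace, here is a minimal sketch of the fix. The grid values are made-up examples, and `scoreTableTrain` is the ratings RDD from my notebook; the actual `ALS.trainImplicit` call is commented out since it needs a live SparkContext:

```python
import itertools

# Hypothetical parameter grids for the search (example values only).
ranks = [8, 12]
lambdas = [0.01, 0.1]
alphas = [1, 40]       # plain Python ints get marshalled as java.lang.Integer
iters = [10, 20]

for index, (r, l, a, i) in enumerate(
        itertools.product(ranks, lambdas, alphas, iters)):
    # Cast lambda_ and alpha to float so py4j sends them as java.lang.Double,
    # matching the JVM-side trainImplicitALSModel signature.
    l, a = float(l), float(a)
    # model = ALS.trainImplicit(scoreTableTrain, rank=r, iterations=i,
    #                           lambda_=l, alpha=a)  # requires a SparkContext
    assert isinstance(l, float) and isinstance(a, float)
```

Casting at the call site (rather than fixing one grid entry) guards every combination the product generates.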
On 8 July 2015 at 12:35, sooraj <soora...@gmail.com> wrote:
> Hi,
>
> I am using MLlib collaborative filtering API on an implicit preference
> data set. From a pySpark notebook, I am iteratively creating the matrix
> factorization model with the aim of measuring the RMSE for each combination
> of parameters for this API like the rank, lambda and alpha. After the code
> successfully completed six iterations, on the seventh call of the
> ALS.trainImplicit API, I get a confusing exception that says py4j cannot
> find the method trainImplicitALSModel. The full trace is included at the
> end of the email.
>
> I am running Spark over YARN (yarn-client mode) with five executors. This
> error seems to be happening completely on the driver as I don't see any
> error on the Spark web interface. I have tried changing the
> spark.yarn.am.memory configuration value, but it doesn't help. Any
> suggestion on how to debug this will be very helpful.
>
> Thank you,
> Sooraj
>
> Here is the full error trace:
>
> ---------------------------------------------------------------------------
> Py4JError                                 Traceback (most recent call last)
> <ipython-input-8-ad6ca35e7521> in <module>()
>       3
>       4 for index, (r, l, a, i) in enumerate(itertools.product(ranks, lambdas, alphas, iters)):
> ----> 5     model = ALS.trainImplicit(scoreTableTrain, rank = r, iterations = i, lambda_ = l, alpha = a)
>       6
>       7 predictionsTrain = model.predictAll(userProductTrainRDD)
>
> /usr/local/spark-1.4/spark-1.4.0-bin-hadoop2.6/python/pyspark/mllib/recommendation.pyc in trainImplicit(cls, ratings, rank, iterations, lambda_, blocks, alpha, nonnegative, seed)
>     198                       nonnegative=False, seed=None):
>     199         model = callMLlibFunc("trainImplicitALSModel", cls._prepare(ratings), rank,
> --> 200                               iterations, lambda_, blocks, alpha, nonnegative, seed)
>     201         return MatrixFactorizationModel(model)
>     202
>
> /usr/local/spark-1.4/spark-1.4.0-bin-hadoop2.6/python/pyspark/mllib/common.pyc in callMLlibFunc(name, *args)
>     126     sc = SparkContext._active_spark_context
>     127     api = getattr(sc._jvm.PythonMLLibAPI(), name)
> --> 128     return callJavaFunc(sc, api, *args)
>     129
>     130
>
> /usr/local/spark-1.4/spark-1.4.0-bin-hadoop2.6/python/pyspark/mllib/common.pyc in callJavaFunc(sc, func, *args)
>     119     """ Call Java Function """
>     120     args = [_py2java(sc, a) for a in args]
> --> 121     return _java2py(sc, func(*args))
>     122
>     123
>
> /usr/local/lib/python2.7/site-packages/py4j/java_gateway.pyc in __call__(self, *args)
>     536         answer = self.gateway_client.send_command(command)
>     537         return_value = get_return_value(answer, self.gateway_client,
> --> 538                 self.target_id, self.name)
>     539
>     540         for temp_arg in temp_args:
>
> /usr/local/lib/python2.7/site-packages/py4j/protocol.pyc in get_return_value(answer, gateway_client, target_id, name)
>     302                 raise Py4JError(
>     303                     'An error occurred while calling {0}{1}{2}. Trace:\n{3}\n'.
> --> 304                     format(target_id, '.', name, value))
>     305             else:
>     306                 raise Py4JError(
>
> Py4JError: An error occurred while calling o667.trainImplicitALSModel. Trace:
> py4j.Py4JException: Method trainImplicitALSModel([class org.apache.spark.api.java.JavaRDD, class java.lang.Integer, class java.lang.Integer, class java.lang.Integer, class java.lang.Integer, class java.lang.Double, class java.lang.Boolean, null]) does not exist
>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:333)
>         at py4j.reflection.ReflectionEngine.getMethod(ReflectionEngine.java:342)
>         at py4j.Gateway.invoke(Gateway.java:252)
>         at py4j.commands.AbstractCommand.invokeMethod(AbstractCommand.java:133)
>         at py4j.commands.CallCommand.execute(CallCommand.java:79)
>         at py4j.GatewayConnection.run(GatewayConnection.java:207)
>         at java.lang.Thread.run(Thread.java:724)