Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2017-01-26 Thread Geoffrey Mon
Hello Chesnay, Thanks for the advice. I've begun adding multiple jobs per Python plan file here: https://issues.apache.org/jira/browse/FLINK-5183 and https://github.com/GEOFBOT/flink/tree/FLINK-5183 The functionality of the patch works. I am able to run multiple jobs per file successfully, but

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-23 Thread Chesnay Schepler
Hello, implementing collect() in Python is not that trivial and the gain is questionable. There is an inherent size limit (think 10 MB), and it is a bit at odds with the deployment model of the Python API. Something easier would be to execute each iteration of the for-loop as a separate job
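
The suggestion above (one job per loop iteration) could look roughly like the following. This is only a minimal sketch in plain Python, not the Flink Python API: `run_job`, `run_iterations`, and the `S_<i>.json` file names are hypothetical, and the "job" just halves a list of numbers. The point is the shape of the driver loop, where each iteration reads the previous result from storage and writes a new one, so no single plan has to contain every iteration.

```python
import json
import os
import tempfile


def run_job(input_path, output_path):
    """Hypothetical stand-in for submitting one plan as its own job.

    It only reads a list of numbers, halves each one, and writes the result.
    A real job would read and write data sets on the cluster instead.
    """
    with open(input_path) as f:
        values = json.load(f)
    updated = [v / 2 for v in values]
    with open(output_path, "w") as f:
        json.dump(updated, f)


def run_iterations(initial_values, num_iterations, work_dir):
    """Run each loop iteration as a separate job, chained through files."""
    current = os.path.join(work_dir, "S_0.json")
    with open(current, "w") as f:
        json.dump(initial_values, f)
    for i in range(1, num_iterations + 1):
        next_path = os.path.join(work_dir, "S_%d.json" % i)
        # Each call corresponds to one small, independent job.
        run_job(current, next_path)
        current = next_path
    with open(current) as f:
        return json.load(f)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as work_dir:
        print(run_iterations([8.0, 4.0], 3, work_dir))
```

The trade-off in this pattern is extra I/O between iterations in exchange for keeping each submitted plan small.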

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-20 Thread Geoffrey Mon
Hello, I know that the reuse of the data set in my plan is causing the problem (after one dictionary atom is learned using the data set "S", "S" is updated for use with the next dictionary atom). When I comment out the line updating the data set "S", I have no problem and the plan processing

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-14 Thread Geoffrey Mon
Hi Ufuk, The master instance of the cluster was also an m3.xlarge instance with 15 GB RAM, which I would've expected to be enough. I have gotten the program to run successfully on a personal virtual cluster where each node has 8 GB RAM and where the master node was also a worker node, so the

Re: Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-14 Thread Ufuk Celebi
The Python API is in alpha state currently, so we would have to check if it is related specifically to that. Looping in Chesnay who worked on that. The JVM GC error happens on the client side as that's where the optimizer runs. How much memory does the client submitting the job have? How do

Many operations cause StackOverflowError with AWS EMR YARN cluster

2016-11-13 Thread Geoffrey Mon
Hello all, I have a pretty complicated plan file using the Flink Python API running on an AWS EMR cluster of m3.xlarge instances using YARN. The plan is for a dictionary learning algorithm and has to run a sequence of operations many times; each sequence involves bulk iterations with join