travis-leith commented on PR #40744: URL: https://github.com/apache/spark/pull/40744#issuecomment-2205869472
> > Thanks @peter-toth. I tested this patch locally, but it seems it throws `StackOverflowError`. How to reproduce:
> >
> > ```
> > ./dev/make-distribution.sh --tgz -Phive -Phive-thriftserver
> > tar -zxf spark-3.5.0-SNAPSHOT-bin-3.3.5.tgz
> > cd spark-3.5.0-SNAPSHOT-bin-3.3.5
> > bin/spark-sql
> > ```
> >
> > ```
> > spark-sql (default)> WITH RECURSIVE t(n) AS (
> >                    > VALUES (1)
> >                    > UNION ALL
> >                    > SELECT n+1 FROM t WHERE n < 100
> >                    > )
> >                    > SELECT sum(n) FROM t;
> > 23/05/30 13:21:21 ERROR Executor: Exception in task 0.0 in stage 265.0 (TID 199)
> > java.lang.StackOverflowError
> >     at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
> > ```
>
> Thanks for testing this PR @wangyum. Interestingly, I didn't encounter a stack overflow when the recursion level is <100. The error starts to appear at level ~170 in my tests, so I guess this depends on your default stack size. Since recursion works in a way that each iteration depends on the previous iteration, the RDD lineage of the tasks gets bigger and bigger, and the deserialization of those tasks can throw a stack overflow error at some point. Let me amend this PR to add optional checkpointing so as to truncate the RDD lineage and be able to deal with deeper recursion...

@peter-toth I have not looked closely at the implementation, but I do have a question about this: has the logic been implemented in some way similar to tail call optimization, such that there is no recursion depth limit?
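For what it's worth, my mental model of the lineage problem (and of why checkpointing would help) is roughly the driver-side loop below. This is only a sketch under my own assumptions, not the implementation in this PR; the checkpoint directory, the `checkpointInterval` knob, and the object name are made up for illustration.

```scala
import org.apache.spark.sql.SparkSession

object RecursiveCteLineageSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder()
      .appName("recursive-cte-lineage-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Checkpointing needs a reliable location; this path is just a placeholder.
    spark.sparkContext.setCheckpointDir("/tmp/spark-checkpoints")

    // Anchor term of the recursion: VALUES (1)
    var current = Seq(1).toDF("n")
    var result = current
    var depth = 0
    val checkpointInterval = 50 // hypothetical knob, not an actual Spark config

    // Driver-side loop standing in for the recursive step:
    //   SELECT n+1 FROM t WHERE n < 100
    // Each pass builds on the previous iteration's plan, so the lineage grows
    // with the recursion depth. (Calling count() per iteration is a
    // simplification to keep the sketch short.)
    while (current.count() > 0 && depth < 1000) {
      current = current.where("n < 100").selectExpr("n + 1 AS n")
      result = result.union(current)
      depth += 1
      // Periodically materialize and cut the lineage; without this, the
      // ever-deeper plan can eventually overflow the stack when tasks are
      // (de)serialized.
      if (depth % checkpointInterval == 0) {
        current = current.checkpoint()
        result = result.checkpoint()
      }
    }

    result.selectExpr("sum(n) AS total").show() // 5050 for the SQL example above
    spark.stop()
  }
}
```

As I understand it, `checkpoint()` materializes the intermediate result and replaces its lineage with the checkpointed data, which is what bounds the deserialization depth even as the recursion goes deeper.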
