travis-leith commented on PR #40744:
URL: https://github.com/apache/spark/pull/40744#issuecomment-2205869472

   > > Thanks @peter-toth. I tested this patch locally, but it seems to throw a 
`StackOverflowError`. How to reproduce:
   > > ```
   > > ./dev/make-distribution.sh --tgz  -Phive -Phive-thriftserver
   > > tar -zxf spark-3.5.0-SNAPSHOT-bin-3.3.5.tgz
   > > cd spark-3.5.0-SNAPSHOT-bin-3.3.5
   > > bin/spark-sql
   > > ```
   > > 
   > > ```
   > > spark-sql (default)> WITH RECURSIVE t(n) AS (
   > >                    >     VALUES (1)
   > >                    > UNION ALL
   > >                    >     SELECT n+1 FROM t WHERE n < 100
   > >                    > )
   > >                    > SELECT sum(n) FROM t;
   > > 23/05/30 13:21:21 ERROR Executor: Exception in task 0.0 in stage 265.0 
(TID 199)
   > > java.lang.StackOverflowError
   > >  at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
   > > ```
   > 
   > Thanks for testing this PR @wangyum. Interestingly, I didn't encounter a 
stack overflow when the recursion level is <100. The error starts to appear at 
level ~170 in my tests. I guess this depends on your default stack size. Since 
recursion works in a way that each iteration depends on the previous iteration, 
the RDD lineage of the tasks keeps growing, and the 
deserialization of those tasks can throw a stack overflow error at some point. 
Let me amend this PR by adding optional checkpointing so as to truncate the RDD 
lineage and be able to deal with deeper recursion...
   
   @peter-toth I have not looked closely at the implementation but I do have a 
question about this: has the logic been implemented in some way similar to tail 
call optimization such that there is no recursion depth limit?
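
   To make the quoted explanation concrete, here is a minimal toy model in 
plain Python. It is not Spark code and not the PR's implementation: `Step`, 
`lineage_depth`, and `iterate` are hypothetical names, and the recursive walk 
only stands in for the recursive task deserialization described above. It 
sketches how a chain where each iteration depends on the previous one grows 
with the recursion level, and how periodically cutting that chain (the role 
checkpointing would play) keeps it bounded.

```python
class Step:
    """One iteration's result; `parent` models the lineage link to the previous step."""

    def __init__(self, value, parent=None):
        self.value = value
        self.parent = parent


def lineage_depth(step):
    """Walk the chain recursively, as a stand-in for recursive deserialization."""
    if step.parent is None:
        return 1
    return 1 + lineage_depth(step.parent)


def iterate(levels, checkpoint_every=None):
    """Build `levels` dependent steps in a plain loop.

    If `checkpoint_every` is set, every N-th step keeps only its computed
    value and drops the link to its history (a hypothetical analogue of
    truncating lineage with a checkpoint).
    """
    step = Step(1)
    for i in range(1, levels):
        step = Step(step.value + 1, parent=step)
        if checkpoint_every and i % checkpoint_every == 0:
            step = Step(step.value)  # keep the value, forget the ancestors
    return step
```

   With Python's default recursion limit of 1000, `lineage_depth(iterate(5000))` 
raises `RecursionError`, while `lineage_depth(iterate(5000, checkpoint_every=50))` 
returns 50. In this model the loop in `iterate` is already iterative, so the 
limit comes from the retained chain rather than from the loop itself.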


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

