HeartSaVioR edited a comment on issue #27872: [SPARK-31115][SQL] Detect known 
Janino bug janino-compiler/janino#113 and apply workaround automatically as a 
fail-back via avoid using switch statement in generated code
URL: https://github.com/apache/spark/pull/27872#issuecomment-598492806
 
 
   Thanks for spending your time to analyze the patch and provide and detailed 
voice, @rednaxelafx .
   
   The rationalization of this patch is to address known issue "before" we 
reach the real solution, as the real solution is out of our control; no one can 
answer when Janino 3.0.16 will be released, and even whether it will be 
released or not. Janino 3.1 refactors everything by oneself and makes things be 
unstable - recent version of Janino makes lots of Spark UTs fail which makes me 
less confident we can migrate to Janino 3.1.x.
   
   I totally agree the patch is a band-aid fix and not well designed to extend 
(I didn't intend it though), and there're better options available (more to 
less preferable personally)
   
   1. Participate actively to Janino - fix all the issues in 3.1.x which breaks 
Spark, and adopt Janino 3.1.x. It should be ideal if we can provide set of 
regression tests so that Janino community can keep the stability. For sure, I'm 
not an expert of JVM so I can't lead the effort. A couple of JVM experts should 
jump in and lead the effort.
   
   2. Fork Janino - create a main dev. branch from v3.0.15, apply the patch for 
janino-compiler/janino#114 (and port back some bugfixes after 3.0.15), release 
v3.0.16 by our side. License seems to be 3-clause BSD license, which doesn't 
seem to restrict redistribution. The thing of this option would be who/which 
group is willing to maintain the fork and release under their name.
   
   3. This patch. (try to compile and fail-back)
   
   4. Modify ExpandExec to check the number of operations in for statement, and 
use `if ~ else if` when the number of operations exceed the threshold. This 
should be ideally checking the length of offset but it would be weird if Spark 
does it, so count the lines blindly. Performance regression may happen in some 
cases where it can run with `switch` but due to blind count it runs with `if ~ 
else if`, but the case wouldn't be common. 
   
   5. Give up addressing issue - either let end users to tune their query or 
guide that end users should enable failback option and run with interactive 
mode.
   
   WDYT? Would it be better to raise this to discussion in dev@ list?

----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
 
For queries about this service, please contact Infrastructure at:
[email protected]


With regards,
Apache Git Services

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to