HeartSaVioR edited a comment on issue #27872: [SPARK-31115][SQL] Detect known Janino bug janino-compiler/janino#113 and apply workaround automatically as a fail-back via avoid using switch statement in generated code URL: https://github.com/apache/spark/pull/27872#issuecomment-598492806 Thanks for spending your time to analyze the patch and provide and detailed voice, @rednaxelafx . The rationalization of this patch is to address known issue "before" we reach the real solution, as the real solution is out of our control; no one can answer when Janino 3.0.16 will be released, and even whether it will be released or not. Janino 3.1 refactors everything by oneself and makes things be unstable - recent version of Janino makes lots of Spark UTs fail which makes me less confident we can migrate to Janino 3.1.x. I totally agree the patch is a band-aid fix and not well designed to extend (I didn't intend it though), and there're better options available (more to less preferable personally) 1. Participate actively to Janino - fix all the issues in 3.1.x which breaks Spark, and adopt Janino 3.1.x. It should be ideal if we can provide set of regression tests so that Janino community can keep the stability. For sure, I'm not an expert of JVM so I can't lead the effort. A couple of JVM experts should jump in and lead the effort. 2. Fork Janino - create a main dev. branch from v3.0.15, apply the patch for janino-compiler/janino#114 (and port back some bugfixes after 3.0.15), release v3.0.16 by our side. License seems to be 3-clause BSD license, which doesn't seem to restrict redistribution. The thing of this option would be who/which group is willing to maintain the fork and release under their name. 3. This patch. (try to compile and fail-back) 4. Modify ExpandExec to check the number of operations in for statement, and use `if ~ else if` when the number of operations exceed the threshold. This should be ideally checking the length of offset but it would be weird if Spark does it, so count the lines blindly. Performance regression may happen in some cases where it can run with `switch` but due to blind count it runs with `if ~ else if`, but the case wouldn't be common. 5. Give up addressing issue - either let end users to tune their query or guide that end users should enable failback option and run with interactive mode. WDYT? Would it be better to raise this to discussion in dev@ list?
---------------------------------------------------------------- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. For queries about this service, please contact Infrastructure at: [email protected] With regards, Apache Git Services --------------------------------------------------------------------- To unsubscribe, e-mail: [email protected] For additional commands, e-mail: [email protected]
