Github user mgaido91 commented on a diff in the pull request:

    https://github.com/apache/spark/pull/19480#discussion_r144541081
  
    --- Diff: sql/catalyst/src/test/scala/org/apache/spark/sql/catalyst/expressions/CodeGenerationSuite.scala ---
    @@ -201,6 +201,23 @@ class CodeGenerationSuite extends SparkFunSuite with ExpressionEvalHelper {
         }
       }
     
    +  test("SPARK-22226: group splitted expressions into one method per nested class") {
    --- End diff --
    
    @viirya I have good news and bad news. Thanks to your suggestion, I have 
been able to understand and reproduce the issue. Moreover, I also found another 
issue which is fixed by this PR, and I am adding a UT for that too: in some 
cases, we might get a 
    ```
    Code of method apply(...) grows beyond 64 KB
    ```
    And with this PR the problem is fixed.
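
For readers unfamiliar with the idea in the test name, here is a minimal sketch of grouping split functions so that the outer method makes one call per nested class instead of one call per function. It is not code from this PR: `GroupedCallsSketch`, `SplitFunction`, `applyAll`, and the `NestedClass1`/`apply_0` names are all hypothetical, and the real code generator works on generated source strings rather than on objects like these.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative only: every name here is hypothetical, not Spark's codegen API.
public class GroupedCallsSketch {

    /** A split-out function: the nested class that holds it and its method name. */
    static final class SplitFunction {
        final String nestedClass;
        final String method;

        SplitFunction(String nestedClass, String method) {
            this.nestedClass = nestedClass;
            this.method = method;
        }
    }

    /** Groups method names by nested class, preserving first-seen order. */
    static Map<String, List<String>> groupByClass(List<SplitFunction> functions) {
        Map<String, List<String>> grouped = new LinkedHashMap<>();
        for (SplitFunction f : functions) {
            grouped.computeIfAbsent(f.nestedClass, k -> new ArrayList<>()).add(f.method);
        }
        return grouped;
    }

    /** Returns the call statements the outer method would contain: one per nested class. */
    static List<String> outerCalls(List<SplitFunction> functions) {
        List<String> calls = new ArrayList<>();
        for (String nestedClass : groupByClass(functions).keySet()) {
            // Assumed convention: each nested class exposes a single wrapper method
            // that invokes all of its own split functions.
            calls.add(nestedClass + ".applyAll(i);");
        }
        return calls;
    }

    public static void main(String[] args) {
        List<SplitFunction> functions = Arrays.asList(
            new SplitFunction("NestedClass1", "apply_0"),
            new SplitFunction("NestedClass1", "apply_1"),
            new SplitFunction("NestedClass2", "apply_2"));
        // Three split functions collapse into two calls in the outer method.
        for (String call : outerCalls(functions)) {
            System.out.println(call);
        }
    }
}
```

Under these assumptions, the outer method's size and its number of method references grow with the number of nested classes rather than with the number of split functions.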
    
    The bad news is that the UT you provided still fails, but with a different 
error: it is still a Constant Pool limit exceeded exception, but now it is 
thrown in a NestedClass. From my analysis, this is caused by another problem, 
i.e. that we might reference too many fields of the superclass in the 
NestedClasses. This might be addressed by tuning the magic number which I 
raised to 1000k in this PR, but I am pretty sure that it will also be addressed 
by the ongoing PR for SPARK-18016, which is trying to reduce the number of 
variables. Thus I consider this out of scope for this PR.


---
