Github user kiszk commented on the issue:

    https://github.com/apache/spark/pull/19518
  
    Based on the performance results and the constant pool entry usage, I would 
like to use a hybrid approach that combines flat global variables and arrays.  
    For example, the first 500 variables are stored in flat global variables, 
and the rest are stored in arrays of 32767 elements each. I think that most 
non-extreme cases can enjoy simple code without array accesses, as well as good 
performance.
    
    WDYT?
    
    ```
    class Foo {
      int globalVars1;
      int globalVars2;
      ...
      int globalVars499;
      int globalVars500;
      int[] globalArrays1 = new int[32767];
      int[] globalArrays2 = new int[32767];
      int[] globalArrays3 = new int[32767];
      ...
    
      void apply1(InternalRow i) {
        globalVars1 = 1;
        globalVars2 = 1;
        ...
        globalVars499 = 1;
        globalVars500 = 1;
      }
    
      void apply2(InternalRow i) {
        globalArrays1[0] = 1;
        globalArrays1[1] = 1;
        ...
      }
    
      void apply(InternalRow i) {
        apply1(i);
        apply2(i);
        ...
      }
    }
    ```
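
To make the layout above concrete, here is a minimal sketch of how the n-th variable could be mapped to either a flat field or an array slot. The class name `HybridStateNaming`, the method `accessor`, and the constant names are hypothetical and only for illustration; they are not taken from the PR. The thresholds 500 and 32767 are the ones proposed above.

```java
// Hypothetical helper, not part of the PR: maps a 1-based variable number
// to the name of the storage location used in the hybrid layout.
final class HybridStateNaming {
    // Assumed thresholds, taken from the proposal above.
    static final int FLAT_LIMIT = 500;
    static final int ARRAY_SIZE = 32767;

    // Returns "globalVarsN" for the first FLAT_LIMIT variables,
    // and "globalArraysK[slot]" for all later ones.
    static String accessor(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("variable number must be >= 1");
        }
        if (n <= FLAT_LIMIT) {
            return "globalVars" + n;
        }
        int offset = n - FLAT_LIMIT - 1;        // 0-based position among array-backed variables
        int arrayNo = offset / ARRAY_SIZE + 1;  // arrays are numbered from 1, as in the example
        int slot = offset % ARRAY_SIZE;         // index inside that array
        return "globalArrays" + arrayNo + "[" + slot + "]";
    }
}
```

With these assumptions, variable 500 is still a flat field, variable 501 is the first element of `globalArrays1`, and a new array starts after every 32767 array-backed variables.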


---
