Github user kiszk commented on the issue:
https://github.com/apache/spark/pull/19518
Based on the performance results and the constant pool usage, I would like
to use a hybrid approach that combines flat global variables and arrays.
For example, the first 500 variables are stored in flat global variables,
and the rest are stored in arrays of 32767 elements each. I think most
non-extreme cases can then enjoy simple code without array accesses, and
good performance.
WDYT?
```
class Foo {
  // The first 500 variables are flat fields.
  int globalVars1;
  int globalVars2;
  ...
  int globalVars499;
  int globalVars500;
  // The remaining variables live in arrays of 32767 elements each.
  int[] globalArrays1 = new int[32767];
  int[] globalArrays2 = new int[32767];
  int[] globalArrays3 = new int[32767];
  ...
  void apply1(InternalRow i) {
    globalVars1 = 1;
    globalVars2 = 1;
    ...
    globalVars499 = 1;
    globalVars500 = 1;
  }
  void apply2(InternalRow i) {
    globalArrays1[0] = 1;
    globalArrays1[1] = 1;
    ...
  }
  void apply(InternalRow i) {
    apply0(i);
    apply1(i);
    apply2(i);
    ...
  }
}
```
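To make the mapping concrete, here is a rough sketch of how a code generator could pick the access expression for the n-th global variable under this scheme. The class name `HybridVarNaming` and the method `accessor` are hypothetical and not part of the PR. The 1-based numbering and the 500 / 32767 thresholds are simply taken from the example above.

```java
class HybridVarNaming {
    // Thresholds from the example above; assumed here, not fixed by the PR.
    static final int FLAT_LIMIT = 500;
    static final int ARRAY_SIZE = 32767;

    // Returns the Java expression that accesses the n-th global variable (1-based).
    static String accessor(int n) {
        if (n < 1) {
            throw new IllegalArgumentException("n must be >= 1");
        }
        if (n <= FLAT_LIMIT) {
            // Flat field: no array access in the generated code.
            return "globalVars" + n;
        }
        int offset = n - FLAT_LIMIT - 1;        // 0-based position among array-backed variables
        int arrayNo = offset / ARRAY_SIZE + 1;  // arrays are numbered from 1, as in the example
        int slot = offset % ARRAY_SIZE;         // index inside that array
        return "globalArrays" + arrayNo + "[" + slot + "]";
    }

    public static void main(String[] args) {
        System.out.println(accessor(500)); // last flat variable
        System.out.println(accessor(501)); // first array-backed variable
    }
}
```

With this mapping, only classes with more than 500 global variables would pay for array accesses.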