Hi all,

While investigating Hive query compilation performance, I found some problems
around memoization in the GeneratedMetadataHandler_xxx classes. These classes
are generated by JaninoRelMetadataProvider, and the generated code memoizes
results in the `map` field of the RelMetadataQuery `mq` parameter. My
measurements show that these calls explode into deep recursive stacks. I
measured a complex query (49 joins, plenty of expressions), and some of the
numbers are staggering. Take, for instance,
GeneratedMetadataHandler_RowCount.getRowCount:
- It is called 9303 times as a top-level call
- It generates 165754 total recursive calls, nested up to 18 levels deep
- The memo cache is hit 18065 times (key found)
- The memo cache is populated 147689 times (key missed)
- The function receives no fewer than 22186 distinct RelMetadataQuery `mq`
  instances (!!).
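
To put the getRowCount counters in perspective, here is the arithmetic as a
small Java sketch. The class and method names are made up for illustration;
only the four input numbers come from the measurements above. It works out to
a hit ratio of roughly 10.9% and about 17.8 handler calls per top-level call.

```java
/** Arithmetic on the getRowCount counters; names here are illustrative only. */
class MemoStats {
    /** Fraction of memo lookups that found the key. */
    static double hitRatio(long hits, long misses) {
        return (double) hits / (hits + misses);
    }

    /** Average number of handler invocations caused by one top-level call. */
    static double callsPerTopCall(long totalCalls, long topCalls) {
        return (double) totalCalls / topCalls;
    }

    public static void main(String[] args) {
        // Inputs are the measured getRowCount numbers: hits, misses, total, top.
        System.out.printf("hit ratio: %.1f%%%n", 100 * hitRatio(18065, 147689));
        System.out.printf("calls per top call: %.1f%n",
            callsPerTopCall(165754, 9303));
    }
}
```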

The same pattern repeats for every GeneratedMetadataHandler_XXX class:
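
| Method               | Top Calls | Total Calls | Memo Cache Hits | Memo Cache Misses | Distinct RelMetadataQuery Instances |
|----------------------|-----------|-------------|-----------------|-------------------|-------------------------------------|
| getMaxRowCount       | 489       | 105262      | 6615            | 98647             | 489                                 |
| getDistinctRowCount  | 6828      | 74286       | 6179            | 68107             | 5828                                |
| areColumnsUnique     | 19197     | 149708      | 13367           | 136341            | 2139                                |
| getColumnOrigins     | 250       | 3021        | 39              | 2982              | 24                                  |
| getCumulativeCost    | 2249      | 9267        | 4660            | 4607              | 1                                   |
| getNonCumulativeCost | 3559      | 3559        | 1285            | 3559              | 18                                  |
| getSelectivity       | 15636     | 35727       | 1114            | 34613             | 4984                                |
| getUniqueKeys        | 26311     | 111715      | 5212            | 106503            | 3266                                |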

Looking at this, the root problem seems to be that the code often uses the
construct `RelMetadataQuery.instance()` to obtain a reference to the object it
needs. But each call to `instance()` returns a new object, and this new object
has a new, empty memoization `map` field. So we get a very poor memo cache hit
ratio, but far worse is the effect of repeating recursive calls ad nauseam on
deep trees. I ran an experiment in which I modified
`RelMetadataQuery.instance()` to reference a thread-local `map` field, and the
difference is like night and day:

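| Method               | Top Calls | Total Calls | Memo Cache Hits | Memo Cache Misses |
|----------------------|-----------|-------------|-----------------|-------------------|
| getRowCount          | 8179      | 14426       | 12920           | 1506              |
| getMaxRowCount       | 489       | 2535        | 1327            | 1208              |
| getDistinctRowCount  | 3138      | 7947        | 4939            | 3008              |
| areColumnsUnique     | 1103      | 3191        | 1495            | 1696              |
| getColumnOrigins     | 250       | 1273        | 205             | 1068              |
| getCumulativeCost    | 2249      | 6755        | 4288            | 2467              |
| getNonCumulativeCost | 2274      | 2274        | 0               | 2274              |
| getSelectivity       | 539       | 562         | 27              | 535               |
| getUniqueKeys        | 2635      | 5248        | 2437            | 2811              |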

Gone are the deep recursive calls and the explosion into 100k+ calls. I’m
seeing 2x to 10x compile-time improvements.
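
The effect can be reproduced in a toy model. The sketch below is not Calcite
code: `Node`, `Query`, `THREAD_MAP`, `instance(boolean)` and `simulate` are
hypothetical stand-ins, and the plan is just a linear chain of 50 nodes, not
the 49-join query measured above. It only illustrates how handing every
top-level call a fresh memo map multiplies the recursive calls, and how a
shared thread-local map avoids that.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only; none of these types are Calcite's real classes. */
class MemoSketch {
    /** Hypothetical plan node with zero or more inputs. */
    static final class Node {
        final List<Node> inputs = new ArrayList<>();
    }

    /** Hypothetical stand-in for RelMetadataQuery: it only carries the memo map. */
    static final class Query {
        final Map<Node, Long> map;

        Query(Map<Node, Long> map) {
            this.map = map;
        }
    }

    /** Per-thread memo map, mirroring the experiment described above. */
    static final ThreadLocal<Map<Node, Long>> THREAD_MAP =
        ThreadLocal.withInitial(HashMap::new);

    static long calls; // handler invocations, top-level and recursive

    /** Returns a query with a fresh map, or one backed by the thread-local map. */
    static Query instance(boolean shareMap) {
        return new Query(shareMap ? THREAD_MAP.get() : new HashMap<Node, Long>());
    }

    /** Memoized toy metric: one row per node plus the rows of its inputs. */
    static long rowCount(Node node, Query mq) {
        calls++;
        Long cached = mq.map.get(node);
        if (cached != null) {
            return cached; // memo hit
        }
        long rows = 1;
        for (Node input : node.inputs) {
            rows += rowCount(input, mq);
        }
        mq.map.put(node, rows); // memo miss: populate the map
        return rows;
    }

    /** Issues one top-level call per node of a linear chain; returns total calls. */
    static long simulate(int nodes, boolean shareMap) {
        calls = 0;
        THREAD_MAP.get().clear();
        List<Node> chain = new ArrayList<>();
        for (int i = 0; i < nodes; i++) {
            Node n = new Node();
            if (i > 0) {
                n.inputs.add(chain.get(i - 1)); // each node reads the one below it
            }
            chain.add(n);
        }
        for (Node n : chain) {
            rowCount(n, instance(shareMap)); // new query object per top-level call
        }
        return calls;
    }

    public static void main(String[] args) {
        System.out.println("fresh map per top call:  " + simulate(50, false));
        System.out.println("shared thread-local map: " + simulate(50, true));
    }
}
```

In this model the fresh-map variant makes 1275 calls for 50 top-level calls,
while the shared-map variant makes 99. A real fix would also have to decide
when the shared map gets cleared, which this sketch ignores.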

My question is whether there is a reason behind the frequent replacement of
the RelMetadataQuery object (and hence a clean memo cache), or whether it is
just an unintended consequence. I am now making changes on the Hive side to
address this (HIVE-16757); if the cache-reset effect is accidental, we should
address it in Calcite as well.

BTW, I think `RelMetadataQuery.instance()` should be named
`RelMetadataQuery.createInstance()` to make clear what it does.

Thanks,
~Remus
