vrajat opened a new issue, #14656:
URL: https://github.com/apache/pinot/issues/14656

   `mapFields` in the IdealState (IS) and ExternalView (EV) has a structure like:
   
   ```
   "10000": {
         "Server...8098": "ONLINE"
       },
   ```
   where
   * `10000` is the segment name,
   * `Server...8098` is the server name, and
   * `ONLINE` is the status of the segment on that server.
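
As a rough illustration of the structure above, the nested layout can be modelled with plain Java maps. This is only a sketch: the class name `MapFieldsSketch` and the method `sample` are hypothetical, and the server name `Server_8098` is borrowed from the heap dump output in this issue rather than from a real cluster.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of the nested layout: segment name -> (server name -> state).
public class MapFieldsSketch {
    public static Map<String, Map<String, String>> sample() {
        Map<String, Map<String, String>> mapFields = new LinkedHashMap<>();

        // The no-arg constructor uses the default initial capacity of 16,
        // even though this inner map only ever holds one entry here.
        Map<String, String> instanceStates = new LinkedHashMap<>();
        instanceStates.put("Server_8098", "ONLINE");

        mapFields.put("10000", instanceStates);
        return mapFields;
    }

    public static void main(String[] args) {
        System.out.println(sample());
    }
}
```

Each segment contributes one inner map, so the number of small maps grows linearly with the number of segments.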
   
   The state of each segment is held in multiple hash maps in the controller, 
among other places.
   The root cause of the memory overhead is that _the default capacity of each 
hash map is 16_, while the mean number of entries is <= 3.
   The overhead grows with replication and replica groups.
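
One possible mitigation, sketched here under assumptions and not taken from the Pinot or Helix code base, is to pass an explicit initial capacity when the expected number of entries is known. The class `RightSizedSegmentMap` and its methods are hypothetical names used for illustration; the only library behaviour relied on is the `LinkedHashMap(int initialCapacity)` constructor and the default load factor of 0.75.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: size a map for a known, small number of entries
// instead of relying on the default capacity of 16.
public class RightSizedSegmentMap {
    // Smallest initial capacity that holds expectedSize entries
    // without triggering a resize at the default load factor of 0.75.
    public static int capacityFor(int expectedSize) {
        return (int) Math.ceil(expectedSize / 0.75);
    }

    // Creates an empty server -> state map sized for the expected replica count.
    public static Map<String, String> newInstanceStateMap(int expectedReplicas) {
        return new LinkedHashMap<>(capacityFor(expectedReplicas));
    }

    public static void main(String[] args) {
        Map<String, String> states = newInstanceStateMap(1);
        states.put("Server_8098", "ONLINE");
        System.out.println(states);
    }
}
```

For a single replica this requests a capacity of 2 rather than 16, and for three replicas a capacity of 4. Whether this is practical depends on where the maps are created, for example during deserialization, which this sketch does not address.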
   
   In an experiment on a table with 100K segments, 500MB of memory was wasted 
on unused capacity. There were 300K single-entry hash maps.
   
   ```
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   j.u.LinkedHashMap<>(size: 1, capacity: 16) {(key:"Server_8098", val:"ONLINE")}
   ```
   
   Additionally, the data structure overhead from the large number of hash 
maps is ~200MB for 100K segments.
   
   
   | # instances | (Average) object size | Total overhead per class | Class name |
   |-------------|-----------------------|--------------------------|------------|
   | 6,412,230 | 40b | 75,143Kb (3.8%) | j.u.LinkedHashMap$Entry |
   | 5,356,965 | 24b | 62,776Kb (3.2%) | String |
   | 3,498,014 | 32b | 40,992Kb (2.1%) | j.u.HashMap$Node |
   | 3,290,971 | 86b | 38,566Kb (2.0%) | j.u.HashMap$Node[] |
   | 2,307,923 | 40b | 27,045Kb (1.4%) | j.u.TreeMap$Entry |
   | 1,784,111 | 64b | 20,907Kb (1.1%) | j.u.LinkedHashMap |
   | 1,518,617 | 48b | 17,796Kb (0.9%) | j.u.HashMap |
   | 489,050 | 24b | 5,731Kb (0.3%) | j.u.LinkedHashMap$LinkedKeySet |
   | 377,361 | 24b | 4,422Kb (0.2%) | j.u.LinkedHashMap$LinkedEntrySet |
   | 325,425 | 16b | 3,813Kb (0.2%) | j.u.HashMap$KeySet |
   | 315,886 | 32b | 3,701Kb (0.2%) | j.u.concurrent.ConcurrentHashMap$Node |
   | 216,027 | 16b | 2,531Kb (0.1%) | j.u.HashMap$EntrySet |
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

