bsglz commented on PR #5193:
URL: https://github.com/apache/hbase/pull/5193#issuecomment-1536008360

   > Oh, there is a problem that, what is the default value for 
G1HeapRegionSize? After googling, I think the default size is based on the heap 
size?
   > 
   > > The G1 GC is a regionalized and generational garbage collector, which 
means that the Java object heap (heap) is divided into a number of equally 
sized regions. Upon startup, the Java Virtual Machine (JVM) sets the region 
size. The region sizes can vary from 1 MB to 32 MB depending on the heap size. 
The goal is to have no more than 2048 regions. The eden, survivor, and old 
generations are logical sets of these regions and are not contiguous.
   > 
   > For having 2048 regions, if we have 16GB heap size, it will have 8MB 
region size, then there is no problem for using 2MB chunk size. And if we have 
less than 8GB heap size, 2047KB will also make a lot of humongous objects...
   > 
   > So I think a better approach is to document this? If you are 
using G1 GC, be careful about the chunk size and the heap region size...
   
   1. G1HeapRegionSize is calculated from initial_heap_size and max_heap_size. 
Here are some cases for better understanding:
   //xms=0,xmx=10g -> region size 2048K
   //xms=10g,xmx=10g -> region size 4096K
   //xms=0,xmx=20g -> region size 4096K
   //xms=20g,xmx=20g -> region size 8192K
   //xms=0,xmx=30g -> region size 4096K
   //xms=0,xmx=32g -> region size 8192K
   
   The related JDK 8 code is shown below:
   ```
   void HeapRegion::setup_heap_region_size(size_t initial_heap_size, size_t max_heap_size) {
     uintx region_size = G1HeapRegionSize;
     if (FLAG_IS_DEFAULT(G1HeapRegionSize)) {
       size_t average_heap_size = (initial_heap_size + max_heap_size) / 2;
       region_size = MAX2(average_heap_size / HeapRegionBounds::target_number(),
                          (uintx) HeapRegionBounds::min_size());
     }
   
     int region_size_log = log2_long((jlong) region_size);
     // Recalculate the region size to make sure it's a power of
     // 2. This means that region_size is the largest power of 2 that's
     // <= what we've calculated so far.
     region_size = ((uintx)1 << region_size_log);
   
     // Now make sure that we don't go over or under our limits.
     if (region_size < HeapRegionBounds::min_size()) {
       region_size = HeapRegionBounds::min_size();
     } else if (region_size > HeapRegionBounds::max_size()) {
       region_size = HeapRegionBounds::max_size();
     }
   
     // And recalculate the log.
     region_size_log = log2_long((jlong) region_size);
   
     // Now, set up the globals.
     guarantee(LogOfHRGrainBytes == 0, "we should only set it once");
     LogOfHRGrainBytes = region_size_log;
   
     guarantee(LogOfHRGrainWords == 0, "we should only set it once");
     LogOfHRGrainWords = LogOfHRGrainBytes - LogHeapWordSize;
   
     guarantee(GrainBytes == 0, "we should only set it once");
     // The cast to int is safe, given that we've bounded region_size by
     // MIN_REGION_SIZE and MAX_REGION_SIZE.
     GrainBytes = (size_t)region_size;
   
     guarantee(GrainWords == 0, "we should only set it once");
     GrainWords = GrainBytes >> LogHeapWordSize;
     guarantee((size_t) 1 << LogOfHRGrainWords == GrainWords, "sanity");
   
     guarantee(CardsPerRegion == 0, "we should only set it once");
     CardsPerRegion = GrainBytes >> CardTableModRefBS::card_shift;
   }
   ```
   So it is easy to end up with a 4MB region size.
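
   To make the calculation easier to check, here is a minimal Java sketch of 
   the default-size logic in the JDK 8 code above. The class and method names 
   are hypothetical, and the constants (1MB minimum, 32MB maximum, 2048 target 
   regions) are assumed to match HeapRegionBounds in JDK 8. An xms of 0 
   follows the notation of the cases listed earlier.

   ```java
   // Hypothetical helper, not part of HBase or the JDK.
   class G1RegionSizeSketch {
       static final long MIN_REGION_SIZE = 1L << 20;   // assumed HeapRegionBounds::min_size()
       static final long MAX_REGION_SIZE = 32L << 20;  // assumed HeapRegionBounds::max_size()
       static final long TARGET_REGION_COUNT = 2048;   // assumed HeapRegionBounds::target_number()

       // Mirrors the branch taken when G1HeapRegionSize is left at its default.
       static long defaultRegionSize(long initialHeapBytes, long maxHeapBytes) {
           long averageHeap = (initialHeapBytes + maxHeapBytes) / 2;
           long size = Math.max(averageHeap / TARGET_REGION_COUNT, MIN_REGION_SIZE);
           // Round down to the largest power of 2 that is <= size.
           size = Long.highestOneBit(size);
           // Clamp to the allowed range.
           return Math.min(Math.max(size, MIN_REGION_SIZE), MAX_REGION_SIZE);
       }

       public static void main(String[] args) {
           long gb = 1L << 30;
           long[][] cases = {
               {0, 10 * gb}, {10 * gb, 10 * gb}, {0, 20 * gb},
               {20 * gb, 20 * gb}, {0, 30 * gb}, {0, 32 * gb}
           };
           for (long[] c : cases) {
               System.out.printf("xms=%dg,xmx=%dg -> region size %dK%n",
                   c[0] / gb, c[1] / gb, defaultRegionSize(c[0], c[1]) / 1024);
           }
       }
   }
   ```

   The region size actually in use on a given JVM can differ, for example 
   when G1HeapRegionSize is set explicitly.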
   
   2. If the chunk size is 2047KB and the heapRegionSize is 2MB, we only 
waste 1KB of memory, which is fine. But if the chunk size is 2047KB and the 
heapRegionSize is 4MB, we will waste about 2MB of memory.
   
   > Every humongous object gets allocated as a sequence of contiguous regions 
in the old generation. The start of the object itself is always located at the 
start of the first region in that sequence. Any leftover space in the last 
region of the sequence will be lost for allocation until the entire object is 
reclaimed.
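
   The leftover-space arithmetic described in the quote can be sketched as 
   follows. This is only an illustration with hypothetical names. It assumes 
   the chunk is allocated as a humongous object and ignores the array header. 
   It does not model how G1 decides whether an allocation is humongous.

   ```java
   // Hypothetical helper, not part of HBase or the JDK.
   class HumongousWasteSketch {
       // Number of contiguous regions needed to hold the object.
       static long regionsNeeded(long objectBytes, long regionBytes) {
           return (objectBytes + regionBytes - 1) / regionBytes;
       }

       // Space left unused in the last region of the sequence.
       static long wastedBytes(long objectBytes, long regionBytes) {
           return regionsNeeded(objectBytes, regionBytes) * regionBytes - objectBytes;
       }

       public static void main(String[] args) {
           long chunk = 2047L * 1024;
           long[] regionSizes = {2L << 20, 4L << 20};
           for (long region : regionSizes) {
               System.out.printf("chunk=%dK, region=%dK -> wasted %dK%n",
                   chunk / 1024, region / 1024, wastedBytes(chunk, region) / 1024);
           }
       }
   }
   ```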
   
   3. In practice, many users have little background knowledge about 
heapRegionSize and humongous objects and can easily run into problems, so only 
documenting it does not seem to be enough.
   
   
   
   

