Sergey, Ilya,

Thanks for the clarification. We’re on the same page.

Dmitriy, Sergi, Alex G., do you have any thoughts on this?

--
Denis

> On Mar 17, 2017, at 5:33 AM, Ilya Lantukh <[email protected]> wrote:
>
> Denis, Sergey,
>
> Changes in https://issues.apache.org/jira/browse/IGNITE-4535 won't be so
> global. I am not going to replace older mechanics, but rather re-enable and
> adjust them to work with PageMemory. We will still use the same distributed
> hash table and evict entries based on the existing EvictionPolicy API. I
> think 'Dht' methods, like getDhtEvictQueueCurrentSize(), are still relevant
> and important.
>
> On Fri, Mar 17, 2017 at 12:08 PM, Sergey Chugunov <[email protected]> wrote:
>
>> Dmitriy,
>>
>> My main goal was to add a metric to estimate FreeList space fragmentation,
>> and "hist" was the first thing I came up with.
>>
>> Let's consider one case: we placed 4 entities into a cache, each 60% of
>> the page size. After that we'll have 4 pages in FreeList, each with a hole
>> of 40% of its size. Utilization of FreeList will be 60% but with big
>> fragmentation.
>>
>> Let's consider another case: we have added and removed a bunch of entries
>> much smaller than a page. After that we have two pages 90% full, one page
>> 50% full and one page 10% full. Utilization of FreeList is 60% again
>> (very simple math), but fragmentation is much smaller.
>>
>> So, when we calculate only a simple average we lose a lot of information,
>> and this information may be very useful when deciding on the best page
>> size configuration.
>>
>> Thanks,
>> Sergey.
>>
>> On Thu, Mar 16, 2017 at 10:22 PM, Dmitriy Setrakyan <[email protected]> wrote:
>>
>>> As far as the percentage of the free page space, why do we need to
>>> provide 3 ranges: 0 -> 16, 16 -> 32, 32 -> 64, etc? Why not just provide
>>> average free bytes percentage as one value?
>>>
>>> Am I misunderstanding something?
>>>
>>> On Thu, Mar 16, 2017 at 11:04 AM, Denis Magda <[email protected]> wrote:
>>>
>>>> Sergey,
>>>>
>>>> Considering that the swap tier will no longer be supported in 2.0, all
>>>> the methods that start with ‘getSwap…’ are no longer relevant and have
>>>> to be removed from metrics. For instance, the swap functionality has
>>>> already been wiped out from .NET:
>>>> https://issues.apache.org/jira/browse/IGNITE-4736
>>>>
>>>> Next, I’m also confused by the metrics that include ‘Dht’ in their
>>>> names. The on-heap tier we have in 1.x will be replaced with on-heap
>>>> cache:
>>>> https://issues.apache.org/jira/browse/IGNITE-4535
>>>> Does it mean that ‘Dht’ methods are still relevant, or do they need to
>>>> be replaced with something more meaningful? *Alex G.*, please chime in.
>>>>
>>>> Finally, personally I don’t like the API for these 3 methods:
>>>>
>>>>> public float getPagesPercentage_8_16_freeBytes();
>>>>> public float getPagesPercentage_16_64_freeBytes();
>>>>> public float getPagesPercentage_64_256_freeBytes();
>>>>
>>>> Wouldn’t it be better to have a single method like this?
>>>>
>>>> public float[] getPagesFreeBytesPercentage();
>>>>
>>>> where
>>>>
>>>> float[0] - 0 to 16 free bytes.
>>>> float[1] - 16 to 32 free bytes.
>>>> float[2] - 32 to 64 free bytes.
>>>> …..
>>>> float[N] - page_size - 16 to page size free bytes.
>>>>
>>>> --
>>>> Denis
>>>>
>>>>> On Mar 16, 2017, at 10:22 AM, Sergey Chugunov <[email protected]> wrote:
>>>>>
>>>>> Denis,
>>>>>
>>>>> Here is a version of the CacheMetrics interface with all the changes as
>>>>> I see them (pretty long list :)).
>>>>>
>>>>> public interface CacheMetrics {
>>>>>     public long getCacheHits();
>>>>>     public float getCacheHitPercentage();
>>>>>     public long getCacheMisses();
>>>>>     public float getCacheMissPercentage();
>>>>>     public long getCacheGets();
>>>>>     public long getCachePuts();
>>>>>     public long getCacheRemovals();
>>>>>     public long getCacheEvictions();
>>>>>     public float getAverageGetTime();
>>>>>     public float getAveragePutTime();
>>>>>     public float getAverageRemoveTime();
>>>>>     public float getAverageTxCommitTime();
>>>>>     public float getAverageTxRollbackTime();
>>>>>     public long getCacheTxCommits();
>>>>>     public long getCacheTxRollbacks();
>>>>>     public String name();
>>>>>     public long getOverflowSize();
>>>>>     public long getOffHeapGets();
>>>>>     public long getOffHeapPuts(); // removing as it duplicates cachePuts
>>>>>     public long getOffHeapRemovals();
>>>>>     public long getOffHeapEvictions();
>>>>>     public long getOffHeapHits();
>>>>>     public float getOffHeapHitPercentage();
>>>>>     public long getOffHeapMisses(); // removing as it duplicates cacheMisses
>>>>>     public float getOffHeapMissPercentage(); // removing as it duplicates cacheMissPercentage
>>>>>     public long getOffHeapEntriesCount();
>>>>>     public long getOffHeapPrimaryEntriesCount();
>>>>>     public long getOffHeapBackupEntriesCount();
>>>>>     public long getOffHeapAllocatedSize();
>>>>>     public long getOffHeapMaxSize();
>>>>>     public long getSwapGets();
>>>>>     public long getSwapPuts();
>>>>>     public long getSwapRemovals();
>>>>>     public long getSwapHits();
>>>>>     public long getSwapMisses();
>>>>>     public long getSwapEntriesCount();
>>>>>     public long getSwapSize();
>>>>>     public float getSwapHitPercentage();
>>>>>     public float getSwapMissPercentage();
>>>>>     public int getSize();
>>>>>     public int getKeySize();
>>>>>     public boolean isEmpty();
>>>>>     public int getDhtEvictQueueCurrentSize();
>>>>>     public int getTxThreadMapSize();
>>>>>     public int getTxXidMapSize();
>>>>>     public int getTxCommitQueueSize();
>>>>>     public int getTxPrepareQueueSize();
>>>>>     public int getTxStartVersionCountsSize();
>>>>>     public int getTxCommittedVersionsSize();
>>>>>     public int getTxRolledbackVersionsSize();
>>>>>     public int getTxDhtThreadMapSize();
>>>>>     public int getTxDhtXidMapSize();
>>>>>     public int getTxDhtCommitQueueSize();
>>>>>     public int getTxDhtPrepareQueueSize();
>>>>>     public int getTxDhtStartVersionCountsSize();
>>>>>     public int getTxDhtCommittedVersionsSize();
>>>>>     public int getTxDhtRolledbackVersionsSize();
>>>>>     public boolean isWriteBehindEnabled();
>>>>>     public int getWriteBehindFlushSize();
>>>>>     public int getWriteBehindFlushThreadCount();
>>>>>     public long getWriteBehindFlushFrequency();
>>>>>     public int getWriteBehindStoreBatchSize();
>>>>>     public int getWriteBehindTotalCriticalOverflowCount();
>>>>>     public int getWriteBehindCriticalOverflowCount();
>>>>>     public int getWriteBehindErrorRetryCount();
>>>>>     public int getWriteBehindBufferSize();
>>>>>     public String getKeyType();
>>>>>     public String getValueType();
>>>>>     public boolean isStoreByValue();
>>>>>     public boolean isStatisticsEnabled();
>>>>>     public boolean isManagementEnabled();
>>>>>     public boolean isReadThrough();
>>>>>     public boolean isWriteThrough();
>>>>>     public long getTotalAllocatedPages();
>>>>>     public long getTotalEvictedPages();
>>>>> }
>>>>>
>>>>> I also suggest introducing a new interface for MemoryPolicy metrics and
>>>>> making it available through *IgniteCacheDatabaseSharedManager*:
>>>>>
>>>>> public interface IgniteMemoryPolicyMetrics {
>>>>>     /**
>>>>>      * @return Memory policy name.
>>>>>      */
>>>>>     public String getName();
>>>>>
>>>>>     /**
>>>>>      * @return Total number of allocated pages.
>>>>>      */
>>>>>     public long getTotalAllocatedPages();
>>>>>
>>>>>     /**
>>>>>      * @return Amount (in bytes) of not yet allocated space in PageMemory.
>>>>>      */
>>>>>     public long getAvailableSpace();
>>>>>
>>>>>     /**
>>>>>      * @return Number of allocated pages per second within PageMemory.
>>>>>      */
>>>>>     public float getAllocationRate();
>>>>>
>>>>>     /**
>>>>>      * @return Number of evicted pages per second within PageMemory.
>>>>>      */
>>>>>     public float getEvictionRate();
>>>>>
>>>>>     /**
>>>>>      * Large entities bigger than a page are split into fragments so
>>>>>      * each fragment can fit into a page.
>>>>>      *
>>>>>      * @return Percentage of pages fully occupied by large entities.
>>>>>      */
>>>>>     public long getLargeEntriesPagesPercentage();
>>>>>
>>>>>     //---FreeList-related metrics
>>>>>
>>>>>     /**
>>>>>      * @return Free space to overall size ratio across all pages in
>>>>>      * FreeList.
>>>>>      */
>>>>>     public float getPagesFillFactor();
>>>>>
>>>>>     /**
>>>>>      * @return Percentage of pages in FreeList with free space >= 8 and
>>>>>      * < 16 bytes
>>>>>      */
>>>>>     public float getPagesPercentage_8_16_freeBytes();
>>>>>
>>>>>     /**
>>>>>      * @return Percentage of pages in FreeList with free space >= 16 and
>>>>>      * < 64 bytes
>>>>>      */
>>>>>     public float getPagesPercentage_16_64_freeBytes();
>>>>>
>>>>>     /**
>>>>>      * @return Percentage of pages in FreeList with free space >= 64 and
>>>>>      * < 256 bytes
>>>>>      */
>>>>>     public float getPagesPercentage_64_256_freeBytes();
>>>>> }
>>>>>
>>>>> In my mind, the last three methods provide a kind of histogram ("hist")
>>>>> that gives insight into memory fragmentation.
>>>>> If there are a lot of pages with relatively big free chunks and fewer
>>>>> with smaller chunks, it may indicate that memory is fragmented and it
>>>>> may be reasonable to adjust page sizes.
>>>>>
>>>>> Thanks,
>>>>> Sergey.
>>>>>
>>>>> On Thu, Mar 16, 2017 at 1:29 AM, Denis Magda <[email protected]> wrote:
>>>>>
>>>>>> Hi Sergey,
>>>>>>
>>>>>>>> In memory management scheme based on MemoryPolicies it may be
>>>>>>>> useful (and easier) to collect some metrics not for individual
>>>>>>>> caches but for whole MemoryPolicies where several caches may
>>>>>>>> reside.
>>>>>>
>>>>>> I would collect the metrics for every single MemoryPolicy as well as
>>>>>> for individual caches. It makes sense to expose which cache
>>>>>> contributes more to memory utilization.
>>>>>>
>>>>>>>> - free space / used space tracking;
>>>>>>>> - allocation / eviction rate;
>>>>>>
>>>>>> Please consider this as well:
>>>>>> - total number of pages;
>>>>>> - total number of entries (how hard to support?).
>>>>>>
>>>>>>>> - metrics to track memory fragmentation: e.g. % of pages with only
>>>>>>>>   8 bytes free, 16 bytes free and so on;
>>>>>>>> - % of big fragmented entries in cache: may be useful to adjust
>>>>>>>>   page size.
>>>>>>
>>>>>> How do you see this in the metrics interface?
>>>>>>
>>>>>>> 3. Useful, not going to remove:
>>>>>>> getOffHeapGets //useful as there still may be deserialized entries
>>>>>>> residing on-heap
>>>>>>> getOffHeapHitPercentage
>>>>>>> getOffHeapHits //overall hits include offheap and onheap
>>>>>>> getOffHeapMisses //I think in new model is the same as getCacheMisses
>>>>>>> getOffHeapMissPercentage //same as above
>>>>>>> getOffHeapPuts //same as above
>>>>>>> getOffHeapRemovals //same as above
>>>>>>
>>>>>> Could you please prepare an updated version of the cache metrics,
>>>>>> adding new methods and renaming existing ones (only if necessary)? It
>>>>>> will be simpler to keep up the discussion relying on this updated
>>>>>> interface.
>>>>>>
>>>>>> --
>>>>>> Denis
>>>>>>
>>>>>>> On Mar 15, 2017, at 8:32 AM, Sergey Chugunov <[email protected]> wrote:
>>>>>>>
>>>>>>> Also, I looked through the current set of metrics available on the
>>>>>>> *CacheMetrics* interface and suggest the following changes:
>>>>>>>
>>>>>>> 1. All methods related to tracking swap space (including
>>>>>>> *getOverflowSize*) to be removed.
>>>>>>>
>>>>>>> 2. Useless/hard to calculate in new memory management approach:
>>>>>>> getOffHeapAllocatedSize //max size is constrained by MemoryPolicy
>>>>>>> config
>>>>>>> getOffHeapEntriesCount //all cache entries live offheap
>>>>>>> getOffHeapEvictions //will be captured on MemoryPolicyMetrics level;
>>>>>>> getOffHeapMaxSize //same as the first one
>>>>>>>
>>>>>>> 3. Useful, not going to remove:
>>>>>>> getOffHeapGets //useful as there still may be deserialized entries
>>>>>>> residing on-heap
>>>>>>> getOffHeapHitPercentage
>>>>>>> getOffHeapHits //overall hits include offheap and onheap
>>>>>>> getOffHeapMisses //I think in new model is the same as getCacheMisses
>>>>>>> getOffHeapMissPercentage //same as above
>>>>>>> getOffHeapPuts //same as above
>>>>>>> getOffHeapRemovals //same as above
>>>>>>>
>>>>>>> Please share your thoughts if I missed something here.
>>>>>>>
>>>>>>> Thanks,
>>>>>>> Sergey Chugunov.
>>>>>>>
>>>>>>> On Wed, Mar 15, 2017 at 4:51 PM, Sergey Chugunov <[email protected]> wrote:
>>>>>>>
>>>>>>>> Hello Igniters,
>>>>>>>>
>>>>>>>> As part of [1], cache metrics need to be updated, as some of them
>>>>>>>> (like swap hits) are not applicable anymore.
>>>>>>>>
>>>>>>>> In memory management scheme based on MemoryPolicies it may be
>>>>>>>> useful (and easier) to collect some metrics not for individual
>>>>>>>> caches but for whole MemoryPolicies where several caches may
>>>>>>>> reside.
>>>>>>>>
>>>>>>>> I suggest the following list of new metrics to collect for each
>>>>>>>> MemoryPolicy:
>>>>>>>>
>>>>>>>> - free space / used space tracking;
>>>>>>>> - allocation / eviction rate;
>>>>>>>> - metrics to track memory fragmentation: e.g. % of pages with only
>>>>>>>>   8 bytes free, 16 bytes free and so on;
>>>>>>>> - % of big fragmented entries in cache: may be useful to adjust
>>>>>>>>   page size.
>>>>>>>>
>>>>>>>> Please suggest any other metrics that may be worth tracking.
>>>>>>>>
>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-3477
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Sergey Chugunov.
>
> --
> Best regards,
> Ilya
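
To make the two FreeList cases from Sergey's Mar 17 message concrete, here is a minimal Java sketch. It is not Ignite code: the class name, both method names, and the choice of four equal-width buckets expressed as a percentage of the page size are assumptions made purely for illustration (the thread itself discusses byte ranges such as 8-16 or 16-64). It only shows that two page sets with the same average fill can have different free-space histograms, which is the information a single average value loses.

```java
import java.util.Arrays;

/** Illustrative only: none of these names exist in Apache Ignite. */
final class FreeListFragmentationSketch {

    /** Single-value metric: mean page fill, in percent. */
    public static double averageFillPercent(int[] fillPercents) {
        return Arrays.stream(fillPercents).average().orElse(0.0);
    }

    /**
     * Histogram metric: element i is the percentage of pages whose free
     * space falls into the i-th of {@code buckets} equal-width ranges of
     * the page size. Completely empty pages go into the last bucket.
     */
    public static float[] freeSpaceHistogram(int[] fillPercents, int buckets) {
        if (buckets <= 0)
            throw new IllegalArgumentException("buckets must be positive");

        int[] counts = new int[buckets];

        for (int fill : fillPercents) {
            if (fill < 0 || fill > 100)
                throw new IllegalArgumentException("fill must be in [0, 100]: " + fill);

            int free = 100 - fill;
            // Integer arithmetic keeps bucket boundaries exact.
            int idx = Math.min(free * buckets / 100, buckets - 1);
            counts[idx]++;
        }

        float[] hist = new float[buckets];

        if (fillPercents.length == 0)
            return hist; // No pages: all buckets stay at zero.

        for (int i = 0; i < buckets; i++)
            hist[i] = 100f * counts[i] / fillPercents.length;

        return hist;
    }

    public static void main(String[] args) {
        int[] caseA = {60, 60, 60, 60}; // four pages, each with a 40% hole
        int[] caseB = {90, 90, 50, 10}; // mixed fill after adds and removes

        System.out.println(averageFillPercent(caseA)); // 60.0
        System.out.println(averageFillPercent(caseB)); // 60.0

        // Buckets: [0-25%), [25-50%), [50-75%), [75-100%] free space.
        System.out.println(Arrays.toString(freeSpaceHistogram(caseA, 4))); // [0.0, 100.0, 0.0, 0.0]
        System.out.println(Arrays.toString(freeSpaceHistogram(caseB, 4))); // [50.0, 0.0, 25.0, 25.0]
    }
}
```

Returning one float[] follows the shape Denis proposed for getPagesFreeBytesPercentage(); the bucket boundaries would still have to be agreed on separately.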
