Sergey, Ilya,

Thanks for the clarification. We’re on the same page.

Dmitriy, Sergi, Alex G., do you have any thoughts on this? 

—
Denis

> On Mar 17, 2017, at 5:33 AM, Ilya Lantukh <[email protected]> wrote:
> 
> Denis, Sergey,
> 
> Changes in https://issues.apache.org/jira/browse/IGNITE-4535 won't be so
> global. I am not going to replace older mechanics, but rather re-enable and
> adjust them to work with PageMemory. We will still use the same distributed
> hash table and evict entries based on existing EvictionPolicy API. I think
> 'Dht' methods, like getDhtEvictQueueCurrentSize(), are still relevant and
> important.
> 
> On Fri, Mar 17, 2017 at 12:08 PM, Sergey Chugunov <[email protected]> wrote:
> 
>> Dmitriy,
>> 
>> My main goal was to add a metric to estimate FreeList space fragmentation
>> and "hist" was the first thing I came up with.
>> 
>> Let's consider one case: we placed 4 entities into a cache, each 60% of the
>> page size.
>> After that we'll have 4 pages in FreeList, each with a hole of 40% of its
>> size.
>> Utilization of FreeList will be 60% but with big fragmentation.
>> 
>> Let's consider another case: we have added and removed a bunch of entries
>> much smaller than a page. After that we have two pages 90% full, one page
>> 50% full and one page 10% full.
>> Utilization of FreeList is 60% again, very simple math, but fragmentation
>> is much smaller.
>> 
>> So, when we calculate only a simple average we lose a lot of information,
>> and this information may be very useful for deciding on the best page
>> size configuration.
>> 
>> Thanks,
>> Sergey.
>> 
>> 
>> On Thu, Mar 16, 2017 at 10:22 PM, Dmitriy Setrakyan <[email protected]> wrote:
>> 
>>> As for the percentage of free page space, why do we need to provide
>>> 3 ranges: 0 -> 16, 16 -> 32, 32 -> 64, etc? Why not just provide the average
>>> free bytes percentage as one value?
>>> 
>>> Am I misunderstanding something?
>>> 
>>> On Thu, Mar 16, 2017 at 11:04 AM, Denis Magda <[email protected]> wrote:
>>> 
>>>> Sergey,
>>>> 
>>>> Considering that the swap tier will no longer be supported in 2.0, all the
>>>> methods that start with ‘getSwap…’ are no longer relevant and have to be
>>>> removed from metrics. For instance, the swap functionality has already been
>>>> wiped out from .NET:
>>>> https://issues.apache.org/jira/browse/IGNITE-4736
>>>> 
>>>> Next, I’m also confused by the metrics that include ‘Dht’ in their names.
>>>> The on-heap tier we have in 1.x will be replaced with on-heap cache:
>>>> https://issues.apache.org/jira/browse/IGNITE-4535
>>>> Does it mean that ‘Dht’ methods are still relevant or do they need to be
>>>> replaced with something more meaningful? *Alex G.*, please chime in.
>>>> 
>>>> Finally, personally I don’t like the API for these 3 methods:
>>>> 
>>>>> 
>>>>>   public float getPagesPercentage_8_16_freeBytes();
>>>>>   public float getPagesPercentage_16_64_freeBytes();
>>>>>   public float getPagesPercentage_64_256_freeBytes();
>>>> 
>>>> Wouldn’t it be better to have a single method like this?
>>>> 
>>>> public float[] getPagesFreeBytesPercentage();
>>>> 
>>>> where
>>>> 
>>>> float[0] - 0 to 16 free bytes.
>>>> float[1] - 16 to 32 free bytes.
>>>> float[2] - 32 to 64 free bytes.
>>>> …..
>>>> float[N] - (page_size - 16) to page_size free bytes.
>>>> 
>>>> —
>>>> Denis
>>>> 
>>>>> On Mar 16, 2017, at 10:22 AM, Sergey Chugunov <[email protected]> wrote:
>>>>> 
>>>>> Denis,
>>>>> 
>>>>> Here is a version of the CacheMetrics interface with all the changes as I
>>>>> see them (pretty long list :)).
>>>>> 
>>>>> public interface CacheMetrics {
>>>>> 
>>>>>  public long getCacheHits();
>>>>> 
>>>>>  public float getCacheHitPercentage();
>>>>> 
>>>>>  public long getCacheMisses();
>>>>> 
>>>>>  public float getCacheMissPercentage();
>>>>> 
>>>>>  public long getCacheGets();
>>>>> 
>>>>>  public long getCachePuts();
>>>>> 
>>>>>  public long getCacheRemovals();
>>>>> 
>>>>>  public long getCacheEvictions();
>>>>> 
>>>>>  public float getAverageGetTime();
>>>>> 
>>>>>  public float getAveragePutTime();
>>>>> 
>>>>>  public float getAverageRemoveTime();
>>>>> 
>>>>>  public float getAverageTxCommitTime();
>>>>> 
>>>>>  public float getAverageTxRollbackTime();
>>>>> 
>>>>>  public long getCacheTxCommits();
>>>>> 
>>>>>  public long getCacheTxRollbacks();
>>>>> 
>>>>>  public String name();
>>>>> 
>>>>>  public long getOverflowSize();
>>>>> 
>>>>>  public long getOffHeapGets();
>>>>> 
>>>>>  public long getOffHeapPuts();//removing as it duplicates cachePuts
>>>>> 
>>>>>  public long getOffHeapRemovals();
>>>>> 
>>>>>  public long getOffHeapEvictions();
>>>>> 
>>>>>  public long getOffHeapHits();
>>>>> 
>>>>>  public float getOffHeapHitPercentage();
>>>>> 
>>>>>  public long getOffHeapMisses();//removing as it duplicates cacheMisses
>>>>> 
>>>>>  public float getOffHeapMissPercentage();//removing as it duplicates cacheMissPercentage
>>>>> 
>>>>>  public long getOffHeapEntriesCount();
>>>>> 
>>>>>  public long getOffHeapPrimaryEntriesCount();
>>>>> 
>>>>>  public long getOffHeapBackupEntriesCount();
>>>>> 
>>>>>  public long getOffHeapAllocatedSize();
>>>>> 
>>>>>  public long getOffHeapMaxSize();
>>>>> 
>>>>>  public long getSwapGets();
>>>>> 
>>>>>  public long getSwapPuts();
>>>>> 
>>>>>  public long getSwapRemovals();
>>>>> 
>>>>>  public long getSwapHits();
>>>>> 
>>>>>  public long getSwapMisses();
>>>>> 
>>>>>  public long getSwapEntriesCount();
>>>>> 
>>>>>  public long getSwapSize();
>>>>> 
>>>>>  public float getSwapHitPercentage();
>>>>> 
>>>>>  public float getSwapMissPercentage();
>>>>> 
>>>>>  public int getSize();
>>>>> 
>>>>>  public int getKeySize();
>>>>> 
>>>>>  public boolean isEmpty();
>>>>> 
>>>>>  public int getDhtEvictQueueCurrentSize();
>>>>> 
>>>>>  public int getTxThreadMapSize();
>>>>> 
>>>>>  public int getTxXidMapSize();
>>>>> 
>>>>>  public int getTxCommitQueueSize();
>>>>> 
>>>>>  public int getTxPrepareQueueSize();
>>>>> 
>>>>>  public int getTxStartVersionCountsSize();
>>>>> 
>>>>>  public int getTxCommittedVersionsSize();
>>>>> 
>>>>>  public int getTxRolledbackVersionsSize();
>>>>> 
>>>>>  public int getTxDhtThreadMapSize();
>>>>> 
>>>>>  public int getTxDhtXidMapSize();
>>>>> 
>>>>>  public int getTxDhtCommitQueueSize();
>>>>> 
>>>>>  public int getTxDhtPrepareQueueSize();
>>>>> 
>>>>>  public int getTxDhtStartVersionCountsSize();
>>>>> 
>>>>>  public int getTxDhtCommittedVersionsSize();
>>>>> 
>>>>>  public int getTxDhtRolledbackVersionsSize();
>>>>> 
>>>>>  public boolean isWriteBehindEnabled();
>>>>> 
>>>>>  public int getWriteBehindFlushSize();
>>>>> 
>>>>>  public int getWriteBehindFlushThreadCount();
>>>>> 
>>>>>  public long getWriteBehindFlushFrequency();
>>>>> 
>>>>>  public int getWriteBehindStoreBatchSize();
>>>>> 
>>>>>  public int getWriteBehindTotalCriticalOverflowCount();
>>>>> 
>>>>>  public int getWriteBehindCriticalOverflowCount();
>>>>> 
>>>>>  public int getWriteBehindErrorRetryCount();
>>>>> 
>>>>>  public int getWriteBehindBufferSize();
>>>>> 
>>>>>  public String getKeyType();
>>>>> 
>>>>>  public String getValueType();
>>>>> 
>>>>>  public boolean isStoreByValue();
>>>>> 
>>>>>  public boolean isStatisticsEnabled();
>>>>> 
>>>>>  public boolean isManagementEnabled();
>>>>> 
>>>>>  public boolean isReadThrough();
>>>>> 
>>>>>  public boolean isWriteThrough();
>>>>> 
>>>>>  public long getTotalAllocatedPages();
>>>>> 
>>>>>  public long getTotalEvictedPages();
>>>>> 
>>>>> }
>>>>> 
>>>>> 
>>>>> I also suggest introducing a new interface for MemoryPolicy metrics and
>>>>> making it available through *IgniteCacheDatabaseSharedManager*:
>>>>> 
>>>>> 
>>>>> public interface IgniteMemoryPolicyMetrics {
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Memory policy name.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public String getName();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Total number of allocated pages.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public long getTotalAllocatedPages();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Amount (in bytes) of not yet allocated space in PageMemory.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public long getAvailableSpace();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Number of allocated pages per second within PageMemory.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public float getAllocationRate();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Number of evicted pages per second within PageMemory.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public float getEvictionRate();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * Large entities bigger than a page are split into fragments so that each fragment can fit into a page.
>>>>> 
>>>>>    *
>>>>> 
>>>>>    * @return Percentage of pages fully occupied by large entities.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public long getLargeEntriesPagesPercentage();
>>>>> 
>>>>> 
>>>>>   //---FreeList-related metrics
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Free space to overall size ratio across all pages in FreeList.
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public float getPagesFillFactor();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Percentage of pages in FreeList with free space >= 8 and < 16 bytes
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public float getPagesPercentage_8_16_freeBytes();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Percentage of pages in FreeList with free space >= 16 and < 64 bytes
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public float getPagesPercentage_16_64_freeBytes();
>>>>> 
>>>>> 
>>>>>   /**
>>>>> 
>>>>>    * @return Percentage of pages in FreeList with free space >= 64 and < 256 bytes
>>>>> 
>>>>>    */
>>>>> 
>>>>>   public float getPagesPercentage_64_256_freeBytes();
>>>>> 
>>>>> }
>>>>> 
>>>>> In my mind, the last three methods provide a kind of histogram that gives
>>>>> insight into memory fragmentation.
>>>>> If there are a lot of pages with relatively big free chunks and fewer with
>>>>> smaller chunks, it may indicate that memory is fragmented and it may be
>>>>> reasonable to adjust the page size.
>>>>> 
>>>>> Thanks,
>>>>> Sergey.
>>>>> 
>>>>> 
>>>>> 
>>>>> On Thu, Mar 16, 2017 at 1:29 AM, Denis Magda <[email protected]> wrote:
>>>>> 
>>>>>> Hi Sergey,
>>>>>> 
>>>>>>>> In a memory management scheme based on MemoryPolicies it may be useful (and
>>>>>>>> easier) to collect some metrics not for individual caches but for whole
>>>>>>>> MemoryPolicies where several caches may reside.
>>>>>>>> 
>>>>>> 
>>>>>> I would collect the metrics for every single MemoryPolicy as well as for
>>>>>> individual caches. It makes sense to expose which cache contributes more to
>>>>>> memory utilization.
>>>>>> 
>>>>>>>> - free space / used space tracking;
>>>>>>>> - allocation / eviction rate;
>>>>>> 
>>>>>> Please consider this as well:
>>>>>> - total number of pages;
>>>>>> - total number of entries (how hard to support?).
>>>>>> 
>>>>>>>> - metrics to track memory fragmentation: e.g. % of pages with only 8
>>>>>>>> bytes free, 16 bytes free and so on;
>>>>>>>> - % of big fragmented entries in cache: may be useful to adjust page
>>>>>>>> size.
>>>>>> 
>>>>>>> 
>>>>>> How do you see this in the metrics interface?
>>>>>> 
>>>>>> 
>>>>>>> 3. Useful, not going to remove:
>>>>>>> getOffHeapGets //useful as there still may be deserialized entries
>>>>>>> residing on-heap
>>>>>>> getOffHeapHitPercentage
>>>>>>> getOffHeapHits //overall hits include offheap and onheap
>>>>>>> getOffHeapMisses //I think in new model is the same as getCacheMisses
>>>>>>> getOffHeapMissPercentage //same as above
>>>>>>> getOffHeapPuts //same as above
>>>>>>> getOffHeapRemovals //same as above
>>>>>> 
>>>>>> Could you please prepare an updated version of the cache metrics, adding
>>>>>> new methods and renaming existing ones (only if necessary)? It will be
>>>>>> simpler to keep up the discussion relying on this updated interface.
>>>>>> 
>>>>>> —
>>>>>> Denis
>>>>>> 
>>>>>>> On Mar 15, 2017, at 8:32 AM, Sergey Chugunov <[email protected]> wrote:
>>>>>>> 
>>>>>>> Also, I looked through the current set of metrics available on the
>>>>>>> *CacheMetrics* interface and suggest the following changes:
>>>>>>> 
>>>>>>> 
>>>>>>> 1. All methods related to tracking swap space (including
>>>>>>> *getOverflowSize*) to be removed.
>>>>>>> 
>>>>>>> 2. Useless/hard to calculate in new memory management approach:
>>>>>>> getOffHeapAllocatedSize //max size is constrained by MemoryPolicy config
>>>>>>> getOffHeapEntriesCount //all cache entries live offheap
>>>>>>> getOffHeapEvictions //will be captured on MemoryPolicyMetrics level;
>>>>>>> getOffHeapMaxSize //same as the first one
>>>>>>> 
>>>>>>> 3. Useful, not going to remove:
>>>>>>> getOffHeapGets //useful as there still may be deserialized entries
>>>>>>> residing on-heap
>>>>>>> getOffHeapHitPercentage
>>>>>>> getOffHeapHits //overall hits include offheap and onheap
>>>>>>> getOffHeapMisses //I think in new model is the same as getCacheMisses
>>>>>>> getOffHeapMissPercentage //same as above
>>>>>>> getOffHeapPuts //same as above
>>>>>>> getOffHeapRemovals //same as above
>>>>>>> 
>>>>>>> Please share your thoughts if I missed something here.
>>>>>>> 
>>>>>>> Thanks,
>>>>>>> Sergey Chugunov.
>>>>>>> 
>>>>>>> On Wed, Mar 15, 2017 at 4:51 PM, Sergey Chugunov <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Hello Igniters,
>>>>>>>> 
>>>>>>>> As part of [1], cache metrics need to be updated, as some of them, like
>>>>>>>> swap hits, are not applicable anymore.
>>>>>>>> 
>>>>>>>> In a memory management scheme based on MemoryPolicies it may be useful (and
>>>>>>>> easier) to collect some metrics not for individual caches but for whole
>>>>>>>> MemoryPolicies where several caches may reside.
>>>>>>>> 
>>>>>>>> I suggest the following list of new metrics to collect for each
>>>>>>>> MemoryPolicy:
>>>>>>>> 
>>>>>>>> - free space / used space tracking;
>>>>>>>> - allocation / eviction rate;
>>>>>>>> - metrics to track memory fragmentation: e.g. % of pages with only 8
>>>>>>>> bytes free, 16 bytes free and so on;
>>>>>>>> - % of big fragmented entries in cache: may be useful to adjust page
>>>>>>>> size.
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Please suggest any other metrics that may be worth tracking.
>>>>>>>> 
>>>>>>>> [1] https://issues.apache.org/jira/browse/IGNITE-3477
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Sergey Chugunov.
>>>>>>>> 
>>>>>> 
>>>>>> 
>>>> 
>>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Best regards,
> Ilya
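
The two FreeList cases Sergey describes in the quoted thread can be checked with a few lines of Java. This is only a minimal sketch of the arithmetic: the class name, the method names, and the four 25%-wide free-space buckets are assumptions made for illustration and are not part of the proposed Ignite API. It suggests that both page sets average 60% fill while their free-space distributions differ, which is the information a single average would hide.

```java
import java.util.Arrays;

// Hypothetical helper, not an Ignite class: compares a plain average
// with a small free-space histogram for a set of pages.
public class FreeListHistogramSketch {
    /** Average fill of the given pages, in percent. */
    public static double averageFillPercent(int[] fillPercent) {
        long sum = 0;
        for (int fill : fillPercent)
            sum += fill;
        return (double) sum / fillPercent.length;
    }

    /**
     * Counts pages per free-space bucket. The bucket bounds are an assumption:
     * [0,25), [25,50), [50,75), [75,100] percent of free space.
     */
    public static int[] freeSpaceHistogram(int[] fillPercent) {
        int[] buckets = new int[4];
        for (int fill : fillPercent) {
            int free = 100 - fill;
            // A completely empty page (100% free) goes into the last bucket.
            buckets[Math.min(free / 25, 3)]++;
        }
        return buckets;
    }

    public static void main(String[] args) {
        int[] caseOne = {60, 60, 60, 60}; // four entities, each 60% of a page
        int[] caseTwo = {90, 90, 50, 10}; // many small entries added and removed

        System.out.println(averageFillPercent(caseOne) + " "
            + Arrays.toString(freeSpaceHistogram(caseOne)));
        System.out.println(averageFillPercent(caseTwo) + " "
            + Arrays.toString(freeSpaceHistogram(caseTwo)));
    }
}
```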

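
Denis's single-method alternative, public float[] getPagesFreeBytesPercentage(), could be backed by something like the sketch below. It assumes bucket boundaries that keep doubling after the first three listed in the thread (0 to 16, 16 to 32, 32 to 64), a page size that is a power of two and at least 32 bytes, and a last bucket that is closed at the page size; the thread does not settle any of these, and every name here is hypothetical.

```java
import java.util.Arrays;

// Hypothetical sketch, not an Ignite class: maps free bytes per page to
// doubling buckets [0,16), [16,32), [32,64), ... up to the page size.
public class FreeBytesBucketsSketch {
    /** Number of buckets for a power-of-two page size (assumed >= 32). */
    public static int bucketCount(int pageSize) {
        int log2 = 31 - Integer.numberOfLeadingZeros(pageSize);
        return log2 - 3;
    }

    /** Bucket index for a page with the given number of free bytes. */
    public static int bucketIndex(int freeBytes, int pageSize) {
        if (freeBytes < 16)
            return 0;
        int log2 = 31 - Integer.numberOfLeadingZeros(freeBytes);
        // Clamp so that a fully free page lands in the last bucket.
        return Math.min(log2 - 3, bucketCount(pageSize) - 1);
    }

    /** Percentage of pages falling into each bucket. */
    public static float[] pagesFreeBytesPercentage(int[] freeBytesPerPage, int pageSize) {
        float[] res = new float[bucketCount(pageSize)];
        if (freeBytesPerPage.length == 0)
            return res;
        for (int free : freeBytesPerPage)
            res[bucketIndex(free, pageSize)]++;
        for (int i = 0; i < res.length; i++)
            res[i] = res[i] * 100f / freeBytesPerPage.length;
        return res;
    }

    public static void main(String[] args) {
        int[] freeBytes = {8, 20, 40, 50};
        System.out.println(Arrays.toString(pagesFreeBytesPercentage(freeBytes, 4096)));
    }
}
```

A caller would read index 0 as the share of pages with fewer than 16 free bytes, index 1 as 16 to 32, and so on, so new ranges would not require new methods on the interface.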