[ https://issues.apache.org/jira/browse/GIRAPH-1125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15749458#comment-15749458 ]

ASF GitHub Bot commented on GIRAPH-1125:
----------------------------------------

GitHub user edunov commented on a diff in the pull request:

    https://github.com/apache/giraph/pull/12#discussion_r92487131
  
    --- Diff: giraph-core/src/main/java/org/apache/giraph/utils/ThreadLocalProgressCounter.java ---
    @@ -0,0 +1,67 @@
    +/*
    + * Licensed to the Apache Software Foundation (ASF) under one
    + * or more contributor license agreements.  See the NOTICE file
    + * distributed with this work for additional information
    + * regarding copyright ownership.  The ASF licenses this file
    + * to you under the Apache License, Version 2.0 (the
    + * "License"); you may not use this file except in compliance
    + * with the License.  You may obtain a copy of the License at
    + *
    + *     http://www.apache.org/licenses/LICENSE-2.0
    + *
    + * Unless required by applicable law or agreed to in writing, software
    + * distributed under the License is distributed on an "AS IS" BASIS,
    + * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
    + * See the License for the specific language governing permissions and
    + * limitations under the License.
    + */
    +package org.apache.giraph.utils;
    +
    +import java.util.ArrayList;
    +import java.util.List;
    +
    +/**
    + * Makes a list of {@link ProgressCounter} accessible through
    + * a {@link ThreadLocal}.
    + */
    +public class ThreadLocalProgressCounter extends ThreadLocal<ProgressCounter> {
    +  /**
    +   * List of counters.
    +   */
    +  private final List<ProgressCounter> counters = new ArrayList<>();
    +
    +  /**
    +   * Initializes a new counter, adds it to the list of counters
    +   * and returns it.
    +   * @return Progress counter.
    +   */
    +  @Override
    +  protected ProgressCounter initialValue() {
    +    ProgressCounter threadCounter = new ProgressCounter();
    +    synchronized (counters) {
    +      counters.add(threadCounter);
    +    }
    +    return threadCounter;
    +  }
    +
    +  /**
    +   * Sums the progress of all counters.
    +   * @return Sum of all counters
    +   */
    +  public long getProgress() {
    +    long progress = 0;
    +    synchronized (counters) {
    +      for (ProgressCounter entry : counters) {
    +        progress += entry.getValue();
    +      }
    +    }
    +    return progress;
    +  }
    +
    +  /**
    +   * Removes all counters.
    +   */
    +  public void reset() {
    --- End diff --
    
    What is the purpose of this function and how do you use it? 
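
For context on the class under review, the following is a minimal, self-contained sketch of how a thread-local counter of this shape can be exercised. It is not code from the pull request: `Counter` is a hypothetical stand-in for `ProgressCounter` (whose definition is not part of this diff), and `ThreadLocalCounter`, `run`, and `inc` are illustrative names. The sketch deliberately leaves out `reset()`, since its body is not shown in the diff.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicLong;

public class PerThreadCounterSketch {
  /** Hypothetical stand-in for ProgressCounter; not the Giraph class. */
  static class Counter {
    private final AtomicLong value = new AtomicLong();
    void inc() { value.incrementAndGet(); }
    long getValue() { return value.get(); }
  }

  /** Mirrors the shape of the class in the diff: one counter per thread. */
  static class ThreadLocalCounter extends ThreadLocal<Counter> {
    private final List<Counter> counters = new ArrayList<>();

    @Override
    protected Counter initialValue() {
      // Called once per thread, on that thread's first get().
      Counter c = new Counter();
      synchronized (counters) {
        counters.add(c);
      }
      return c;
    }

    long getProgress() {
      long sum = 0;
      synchronized (counters) {
        for (Counter c : counters) {
          sum += c.getValue();
        }
      }
      return sum;
    }
  }

  /** Starts worker threads that each bump their own counter, then sums. */
  public static long run(int threads, int perThread) {
    ThreadLocalCounter progress = new ThreadLocalCounter();
    List<Thread> workers = new ArrayList<>();
    for (int i = 0; i < threads; i++) {
      Thread t = new Thread(() -> {
        for (int j = 0; j < perThread; j++) {
          progress.get().inc(); // each thread touches only its own counter
        }
      });
      workers.add(t);
      t.start();
    }
    try {
      for (Thread t : workers) {
        t.join();
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
      throw new IllegalStateException(e);
    }
    return progress.getProgress();
  }

  public static void main(String[] args) {
    System.out.println(run(4, 1000)); // prints 4000
  }
}
```

In this arrangement worker threads never contend on a shared counter while counting; only the occasional reader that calls `getProgress()` takes the lock on the list.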


> Add memory estimation mechanism to out-of-core
> ----------------------------------------------
>
>                 Key: GIRAPH-1125
>                 URL: https://issues.apache.org/jira/browse/GIRAPH-1125
>             Project: Giraph
>          Issue Type: Improvement
>            Reporter: Hassan Eslami
>            Assignee: Hassan Eslami
>
> The new out-of-core mechanism is designed with adaptivity in mind: we want
> the out-of-core mechanism to kick in only when it is necessary. In other
> words, when all the data (graph, messages, and mutations) fits in memory, we
> want to take advantage of the entire memory. And when memory runs short at
> some stage, only the minimal necessary amount of data should go out of core
> (to disk). This ensures good performance for the out-of-core mechanism.
> To satisfy the adaptivity goal, we need to know how much memory is used at
> each point in time. The default out-of-core mechanism (ThresholdBasedOracle)
> gets memory information from the JVM's internal methods (Runtime's
> freeMemory()). This method is inaccurate (and pessimistic): it counts
> garbage that GC has not yet purged as used memory. Using the JVM's default
> methods, OOC behaves pessimistically and moves data out of core even when it
> is not necessary. For instance, consider the case where there is a lot of
> garbage on the heap, but GC has not happened for a while. In this case, the
> default OOC pushes data to disk, and immediately after a major GC it brings
> the data back into memory. This makes the default out-of-core mechanism
> inefficient: if out-of-core is enabled but the data can fit entirely in
> memory, the job still goes out of core even though doing so is not necessary.
> To address this issue, we need a mechanism that more accurately tells us how
> much of the heap is filled with non-garbage data. Consequently, we need to
> change the Oracle (OOC policy) to take advantage of a more accurate memory
> usage estimate.
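
To illustrate the gap described above, here is a minimal sketch that contrasts the Runtime-based figure with an estimate taken from the heap pools' usage after their most recent collection. It uses only the standard `java.lang.management` API and is an assumption about one possible way to approximate live data, not necessarily the estimator this issue introduces. The class and method names are hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryType;
import java.lang.management.MemoryUsage;

public class HeapEstimateSketch {
  /** Used heap as seen through Runtime; includes garbage not yet collected. */
  public static long usedIncludingGarbage() {
    Runtime rt = Runtime.getRuntime();
    return rt.totalMemory() - rt.freeMemory();
  }

  /**
   * Sums, over all heap pools, the usage recorded right after the most
   * recent collection of each pool. This approximates non-garbage data,
   * but it is only as fresh as the last GC of each pool.
   */
  public static long usedAfterLastGc() {
    long used = 0;
    for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
      if (pool.getType() != MemoryType.HEAP) {
        continue;
      }
      // getCollectionUsage() returns null if the pool does not support it.
      MemoryUsage afterGc = pool.getCollectionUsage();
      if (afterGc != null) {
        used += afterGc.getUsed();
      }
    }
    return used;
  }

  public static void main(String[] args) {
    System.out.println("Runtime-based used bytes: " + usedIncludingGarbage());
    System.out.println("Used bytes after last GC: " + usedAfterLastGc());
  }
}
```

When a lot of garbage has accumulated since the last collection, the first number can be much larger than the second, which is the situation in which a freeMemory()-based policy would spill data to disk unnecessarily.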



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
