Chetan Mehrotra created OAK-6339:
------------------------------------

             Summary: MapRecord#getKeys should initialize child 
iterables lazily
                 Key: OAK-6339
                 URL: https://issues.apache.org/jira/browse/OAK-6339
             Project: Jackrabbit Oak
          Issue Type: Improvement
          Components: segment-tar
            Reporter: Chetan Mehrotra
            Priority: Minor
             Fix For: 1.8


Recently we saw an OutOfMemory error when running the 
[oakRepoStats|https://github.com/chetanmeh/oak-console-scripts/tree/master/src/main/groovy/repostats]
script against a SegmentNodeStore setup where the uuid index has 16M+ entries, 
which results in a very flat hierarchy. The error occurred while computing the 
Tree#getChildren iterator, which internally invokes MapRecord#getKeys to obtain 
an iterable of child node names.

This happened because getKeys computes the key list eagerly: it calls 
bucket.getKeys() on each child bucket, which recursively does the same for its 
own child buckets, so the whole map is evaluated up front.
{code}
        if (isBranch(size, level)) {
            List<MapRecord> buckets = getBucketList(segment);
            List<Iterable<String>> keys =
                    newArrayListWithCapacity(buckets.size());
            for (MapRecord bucket : buckets) {
                keys.add(bucket.getKeys());
            }
            return concat(keys);
        }
{code}

Instead, getKeys should use the same approach as MapRecord#getEntries, i.e. 
evaluate the iterable for each child bucket lazily:
{code}
        if (isBranch(size, level)) {
            List<MapRecord> buckets = getBucketList(segment);
            List<Iterable<MapEntry>> entries =
                    newArrayListWithCapacity(buckets.size());
            for (final MapRecord bucket : buckets) {
                entries.add(new Iterable<MapEntry>() {
                    @Override
                    public Iterator<MapEntry> iterator() {
                        return bucket.getEntries(diffKey, diffValue).iterator();
                    }
                });
            }
            return concat(entries);
        }
{code}
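
The difference between the two snippets can be illustrated with a small, self-contained sketch. It is not the Oak code: Bucket is a hypothetical stand-in for MapRecord, leafReads is a counter added only for the demonstration, and concat is a hand-written helper standing in for the concat used above (presumably Guava's), so that the example runs without dependencies.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

public class LazyKeysSketch {

    /** Hypothetical stand-in for MapRecord: a leaf holds keys, a branch holds child buckets. */
    static final class Bucket {
        static int leafReads = 0;            // demo-only counter of leaf buckets actually read

        private final List<String> keys;     // non-null for a leaf
        private final List<Bucket> children; // non-null for a branch

        private Bucket(List<String> keys, List<Bucket> children) {
            this.keys = keys;
            this.children = children;
        }

        static Bucket leaf(String... keys) {
            return new Bucket(Arrays.asList(keys), null);
        }

        static Bucket branch(Bucket... children) {
            return new Bucket(null, Arrays.asList(children));
        }

        /** Mirrors the current getKeys: every child is evaluated while building the list. */
        Iterable<String> getKeysEager() {
            if (children != null) {
                List<Iterable<String>> parts = new ArrayList<>(children.size());
                for (Bucket bucket : children) {
                    parts.add(bucket.getKeysEager()); // recurses immediately
                }
                return concat(parts);
            }
            return readLeaf();
        }

        /** Mirrors the getEntries approach: a child is evaluated only when iteration reaches it. */
        Iterable<String> getKeysLazy() {
            if (children != null) {
                List<Iterable<String>> parts = new ArrayList<>(children.size());
                for (final Bucket bucket : children) {
                    parts.add(() -> bucket.getKeysLazy().iterator()); // deferred
                }
                return concat(parts);
            }
            return readLeaf();
        }

        private List<String> readLeaf() {
            leafReads++;
            return new ArrayList<>(keys);
        }
    }

    /** Concatenates iterables, asking each part for its iterator only when it is needed. */
    static <T> Iterable<T> concat(final List<Iterable<T>> parts) {
        return () -> new Iterator<T>() {
            private final Iterator<Iterable<T>> outer = parts.iterator();
            private Iterator<T> current = Collections.emptyIterator();

            @Override
            public boolean hasNext() {
                while (!current.hasNext() && outer.hasNext()) {
                    current = outer.next().iterator();
                }
                return current.hasNext();
            }

            @Override
            public T next() {
                if (!hasNext()) {
                    throw new NoSuchElementException();
                }
                return current.next();
            }
        };
    }

    public static void main(String[] args) {
        Bucket root = Bucket.branch(
                Bucket.leaf("a", "b"), Bucket.leaf("c"), Bucket.leaf("d", "e"));

        Bucket.leafReads = 0;
        root.getKeysEager();
        System.out.println("eager, before iterating: " + Bucket.leafReads); // 3

        Bucket.leafReads = 0;
        Iterator<String> it = root.getKeysLazy().iterator();
        System.out.println("lazy, before iterating: " + Bucket.leafReads);  // 0
        it.next();
        System.out.println("lazy, after first key: " + Bucket.leafReads);   // 1
    }
}
```

In the eager variant all leaf buckets are read before the caller sees the first key. In the lazy variant only the bucket currently being iterated has been read, which is what keeps memory bounded for a very flat hierarchy.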



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)
