[ https://issues.apache.org/jira/browse/OAK-9052?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17099871#comment-17099871 ]
Thomas Mueller commented on OAK-9052:
-------------------------------------

Data structure:

* FlatFileBufferLinkedList is used in the second phase and contains a list of NodeStateEntry objects.
* NodeStateEntry.nodeState is a LazyChildrenNodeState for entries in memory, but can be a DocumentNodeState when reading from MongoDB (in the first phase).
* NodeStateEntry objects can be (de-)serialized using the NodeStateEntryWriter / NodeStateEntryReader. That is usually only used in the first phase.
* The temp file is stored in temp/flat-file-store/sort-work-dir/sortInBatch...flatfile (by default using compression).

> Reindexing using --doc-traversal-mode may need a lot of memory
> --------------------------------------------------------------
>
>                 Key: OAK-9052
>                 URL: https://issues.apache.org/jira/browse/OAK-9052
>             Project: Jackrabbit Oak
>          Issue Type: Improvement
>          Components: indexing, mongomk
>            Reporter: Thomas Mueller
>            Priority: Major
>
> Indexing using oak-run and --doc-traversal-mode uses the FlatFileStore. For
> aggregation, there is a limit on memory usage, by default around 100 MB.
> Depending on the content structure, this limit can be exceeded.
> It would be good to find a way to avoid a memory limit, for example using a
> temporary storage (a file, or a persistent key/value store).

--
This message was sent by Atlassian Jira
(v8.3.4#803005)
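
For illustration, here is a minimal sketch of a memory-limited in-memory buffer of the kind the issue description refers to. It is not Oak's FlatFileBufferLinkedList: the class name BoundedEntryBuffer, the caller-supplied size estimates, and the fail-fast behaviour when the limit is reached are all assumptions made for this example.

```java
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical illustration only; this is not the Oak FlatFileBufferLinkedList API.
class BoundedEntryBuffer {

    // One buffered entry: a path plus a caller-supplied size estimate in bytes.
    static final class Entry {
        final String path;
        final long estimatedSize;

        Entry(String path, long estimatedSize) {
            this.path = path;
            this.estimatedSize = estimatedSize;
        }
    }

    private final Deque<Entry> entries = new ArrayDeque<>();
    private final long memoryLimit;
    private long memoryUsage;

    BoundedEntryBuffer(long memoryLimitBytes) {
        this.memoryLimit = memoryLimitBytes;
    }

    // Appends an entry, or fails if the estimated usage would exceed the limit.
    void add(String path, long estimatedSize) {
        if (memoryUsage + estimatedSize > memoryLimit) {
            throw new IllegalStateException(
                "Memory limit of " + memoryLimit + " bytes exceeded at " + path);
        }
        entries.addLast(new Entry(path, estimatedSize));
        memoryUsage += estimatedSize;
    }

    // Removes the oldest entry and releases its share of the memory budget.
    Entry remove() {
        Entry e = entries.removeFirst();
        memoryUsage -= e.estimatedSize;
        return e;
    }

    long estimatedMemoryUsage() {
        return memoryUsage;
    }

    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        BoundedEntryBuffer buffer = new BoundedEntryBuffer(100);
        buffer.add("/content/a", 60);
        buffer.remove();
        buffer.add("/content/b", 80);
        System.out.println(buffer.size() + " entries, "
            + buffer.estimatedMemoryUsage() + " bytes");
    }
}
```

With a fixed budget like this, a node with many large descendants can exhaust the limit before any entry is removed, which is the situation the issue proposes to avoid by spilling to temporary storage.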