Hi there.

I was looking at the performance of code that creates large collections (CreateManyChildNodesTest, XmlImportTest), and found:

    Iterable<NodeDocument> readChildDocs(@Nonnull final String path,
                                         @Nullable String name,
                                         int limit) {
        String to = Utils.getKeyUpperLimit(checkNotNull(path));
        String from;
        if (name != null) {
            from = Utils.getIdFromPath(PathUtils.concat(path, name));
        } else {
            from = Utils.getKeyLowerLimit(path);
        }
        if (name != null || limit > NUM_CHILDREN_CACHE_LIMIT) {
            // do not use cache when there is a lower bound name
            // or more than 16k child docs are requested
            return store.query(Collection.NODES, from, to, limit);
        }

So *if* we use paging, only the first page uses the cache.
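To illustrate the effect (this is a hypothetical sketch, not actual Oak code; only the condition is copied from the snippet above): the first page is requested without a lower-bound name, while every subsequent page passes the last-seen child name, which triggers the name != null branch and bypasses the cache.

```java
// Hypothetical sketch of the paging pattern described above.
// PagingSketch and usesCache are made-up names; only the condition
// inside usesCache mirrors the readChildDocs snippet.
public class PagingSketch {

    static final int NUM_CHILDREN_CACHE_LIMIT = 16 * 1024;

    // Returns true if the (simplified) condition from readChildDocs
    // would allow the children cache to be used.
    static boolean usesCache(String name, int limit) {
        return !(name != null || limit > NUM_CHILDREN_CACHE_LIMIT);
    }

    public static void main(String[] args) {
        // First page: no lower-bound name -> cache eligible.
        System.out.println("page 1 cached: " + usesCache(null, 100));
        // Second page: lower bound is the last child seen -> cache bypassed.
        System.out.println("page 2 cached: " + usesCache("child-0099", 100));
    }
}
```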

Paging appears to use at most 1600 entries at a time (DocumentNodeState):

    /**
     * The number of child nodes to fetch initially.
     */
    static final int INITIAL_FETCH_SIZE = 100;

    /**
     * The maximum number of child nodes to fetch in one call. (1600).
     */
    static final int MAX_FETCH_SIZE = INITIAL_FETCH_SIZE << 4;
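If the fetch size indeed doubles from one call to the next until it hits the cap (an assumption on my part, suggested by the shift by 4), the progression would be:

```java
// Sketch of the assumed fetch-size growth; the constants are copied
// from the snippet above, the doubling loop is my assumption.
public class FetchSizeProgression {

    static final int INITIAL_FETCH_SIZE = 100;
    static final int MAX_FETCH_SIZE = INITIAL_FETCH_SIZE << 4; // 1600

    public static void main(String[] args) {
        // Assuming the fetch size doubles per call until the cap:
        // 100, 200, 400, 800, 1600
        for (int size = INITIAL_FETCH_SIZE; ; size <<= 1) {
            System.out.println(size);
            if (size >= MAX_FETCH_SIZE) {
                break;
            }
        }
    }
}
```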

The maximum number of entries that can be cached, however, seems to be bigger:

    /**
     * Do not cache more than this number of children for a document.
     */
    static final int NUM_CHILDREN_CACHE_LIMIT =
            Integer.getInteger("oak.documentMK.childrenCacheLimit", 16 * 1024);
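Comparing the two constants directly (the values are copied from the snippets above, not imported from the actual Oak classes):

```java
// Quick check of the two limits quoted above.
public class LimitCheck {

    static final int INITIAL_FETCH_SIZE = 100;
    static final int MAX_FETCH_SIZE = INITIAL_FETCH_SIZE << 4;   // 1600
    static final int NUM_CHILDREN_CACHE_LIMIT = 16 * 1024;       // 16384

    public static void main(String[] args) {
        System.out.println("max fetch per call: " + MAX_FETCH_SIZE);
        System.out.println("children cache cap: " + NUM_CHILDREN_CACHE_LIMIT);
        // The cache cap is about 10x the largest single fetch, yet only
        // the first (name == null) call is ever cache eligible.
        System.out.println("ratio: " + (NUM_CHILDREN_CACHE_LIMIT / MAX_FETCH_SIZE));
    }
}
```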

Maybe it's just me, but something seems weird here:

1) Why the different limits, which do not seem to be consistent with each other?

2) Why disable caching for all but the first page?

Best regards, Julian
