nfsantos commented on code in PR #979:
URL: https://github.com/apache/jackrabbit-oak/pull/979#discussion_r1229698077


##########
oak-run-commons/src/main/java/org/apache/jackrabbit/oak/index/indexer/document/flatfile/pipelined/PipelinedNodeStateHolderFactory.java:
##########
@@ -0,0 +1,43 @@
+/*
+ * Licensed to the Apache Software Foundation (ASF) under one
+ * or more contributor license agreements.  See the NOTICE file
+ * distributed with this work for additional information
+ * regarding copyright ownership.  The ASF licenses this file
+ * to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance
+ * with the License.  You may obtain a copy of the License at
+ *
+ *   http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing,
+ * software distributed under the License is distributed on an
+ * "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
+ * KIND, either express or implied.  See the License for the
+ * specific language governing permissions and limitations
+ * under the License.
+ */
+package org.apache.jackrabbit.oak.index.indexer.document.flatfile.pipelined;
+
+import java.util.ArrayList;
+
+import static org.apache.jackrabbit.oak.commons.PathUtils.elements;
+import static org.apache.jackrabbit.oak.index.indexer.document.flatfile.NodeStateEntryWriter.getPath;
+
+/**
+ * Factory of {@link PipelinedNodeStateHolder}, reuse an internal array list to reduce object allocation.
+ * Not thread safe.

Review Comment:
   This code is on the hot path, so I am willing to sacrifice simplicity for 
efficiency. However, I have not run any benchmark to validate that creating a 
new intermediate array for every line read from the intermediate files 
causes a measurable performance degradation. Intuitively, I expect it does, 
because this is a tight loop and the files contain millions of lines, but I 
could be wrong.
   
   I did not know about `PathUtils.getDepth(path)`. It seems to be a 
better solution, as it avoids the need for any intermediate object and can be 
done in a thread-safe way. I will rewrite the code to use that method to 
create an array of the right size, as you suggest, which makes it thread 
safe. If I have time, I will run some microbenchmarks.
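
   For illustration only, a minimal sketch of the idea discussed above: compute 
the depth first, allocate an array of exactly that size, then fill it, so no 
shared intermediate list is needed. This is not the code in the PR. The class 
`PathSplitSketch` and its methods `depth` and `splitPath` are hypothetical 
stand-ins written so the example runs without Oak on the classpath; in Oak the 
depth would presumably come from `PathUtils.getDepth(path)`. The sketch assumes 
absolute, normalized paths such as `/content/site/page`.

```java
import java.util.Arrays;

// Stand-alone sketch; depth() and splitPath() are hypothetical helpers, not Oak code.
final class PathSplitSketch {

    // Counts the path elements without allocating anything.
    // Assumption: the path is absolute and normalized (leading '/', no trailing '/').
    static int depth(String path) {
        if (path.equals("/")) {
            return 0;
        }
        int depth = 0;
        for (int i = 0; i < path.length(); i++) {
            if (path.charAt(i) == '/') {
                depth++;
            }
        }
        return depth;
    }

    // Splits the path into its elements using an array sized up front.
    // Only local state is used, so concurrent calls do not interfere.
    static String[] splitPath(String path) {
        String[] parts = new String[depth(path)];
        int start = 1; // skip the leading '/'
        for (int i = 0; i < parts.length; i++) {
            int end = path.indexOf('/', start);
            if (end < 0) {
                end = path.length();
            }
            parts[i] = path.substring(start, end);
            start = end + 1;
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(splitPath("/content/site/page")));
    }
}
```

   The trade-off is that the path is scanned twice (once to count, once to 
split), in exchange for exactly one array allocation per line and no mutable 
state shared between calls. Whether that is measurably faster than the reusable 
list is the open question the microbenchmarks would answer.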



