[GitHub] [jackrabbit-oak] rishabhdaim opened a new pull request, #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…

2022-04-25 Thread GitBox


rishabhdaim opened a new pull request, #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550

   …lder size


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org



[GitHub] [jackrabbit-oak] mreutegg commented on a diff in pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…

2022-04-25 Thread GitBox


mreutegg commented on code in PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#discussion_r857716136


##
oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:
##
@@ -93,6 +93,9 @@ public void addSerializedProperty(@Nullable String json) {
 }
 if (sizeWithinLimits()) {
 indexedNodes.put(path, reader.readString());
+} else {
+// return if max limit reached for builder to avoid overflow exception
+return;

Review Comment:
   The same thought crossed my mind as well when I reviewed the change but I 
didn't point it out.
   
   +1 for parsing the entire JSON.






[GitHub] [jackrabbit-oak] thomasmueller commented on pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…

2022-04-25 Thread GitBox


thomasmueller commented on PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#issuecomment-1108719091

   BTW, thanks a lot for the PR! It is great that you have found and fixed the 
issue!





[GitHub] [jackrabbit-oak] thomasmueller commented on a diff in pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…

2022-04-25 Thread GitBox


thomasmueller commented on code in PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#discussion_r857640876


##
oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java:
##
@@ -93,6 +93,9 @@ public void addSerializedProperty(@Nullable String json) {
 }
 if (sizeWithinLimits()) {
 indexedNodes.put(path, reader.readString());
+} else {
+// return if max limit reached for builder to avoid overflow exception
+return;

Review Comment:
   I can see why the old code is broken: if the size is exceeded, we don't call 
readString(), but continue and try to read a comma...
   
   But I also don't like the new code: we wouldn't detect any errors in the 
JSON format afterwards, which I would expect us to do.
   
   Instead of return, what about:
   
   ```
   String x = reader.readString();
   if (sizeWithinLimits()) {
       indexedNodes.put(path, x);
   }
   ```
   
   Sure, one could argue it is more efficient to return, but I think the side 
effect of _not_ parsing the rest in this case is worse than the performance hit 
(which is unlikely to be measurable).
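
To make the trade-off concrete, here is a minimal, self-contained sketch of the two shapes. It does not use Oak's JSON reader: a plain `Iterator<String>` over alternating path/value tokens stands in for it, and `MAX_ENTRIES`, `returnWhenFull`, `readThenCheck` and `readValue` are hypothetical names made up for this illustration.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class ReadThenCheckSketch {
    // Hypothetical limit, far smaller than any real builder limit, to keep the demo short.
    static final int MAX_ENTRIES = 2;

    // Stand-in for reader.readString(): fails if the value token is missing.
    static String readValue(Iterator<String> reader, String path) {
        if (!reader.hasNext()) {
            throw new IllegalArgumentException("missing value for " + path);
        }
        return reader.next();
    }

    // Shape from the PR: stop parsing as soon as the limit is reached.
    static Map<String, String> returnWhenFull(List<String> tokens) {
        Map<String, String> indexedNodes = new LinkedHashMap<>();
        Iterator<String> reader = tokens.iterator();
        while (reader.hasNext()) {
            String path = reader.next();
            if (indexedNodes.size() < MAX_ENTRIES) {
                indexedNodes.put(path, readValue(reader, path));
            } else {
                return indexedNodes; // the rest of the input is never inspected
            }
        }
        return indexedNodes;
    }

    // Shape suggested in the review: always consume the value, store it only if there is room.
    static Map<String, String> readThenCheck(List<String> tokens) {
        Map<String, String> indexedNodes = new LinkedHashMap<>();
        Iterator<String> reader = tokens.iterator();
        while (reader.hasNext()) {
            String path = reader.next();
            String value = readValue(reader, path); // input is validated even when full
            if (indexedNodes.size() < MAX_ENTRIES) {
                indexedNodes.put(path, value);
            }
        }
        return indexedNodes;
    }

    public static void main(String[] args) {
        // Third entry has a path but no value, i.e. the input is malformed past the limit.
        List<String> truncated = Arrays.asList("/a", "1", "/b", "2", "/c");
        System.out.println(returnWhenFull(truncated)); // returns quietly, error goes unnoticed
        try {
            readThenCheck(truncated);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Both variants keep the stored map bounded; the difference is only whether malformed input after the limit is still noticed.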






[GitHub] [jackrabbit-oak] mreutegg commented on a diff in pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…

2022-04-25 Thread GitBox


mreutegg commented on code in PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#discussion_r857606356


##
oak-lucene/src/test/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilderTest.java:
##
@@ -53,7 +53,25 @@ public void nullOrEmptyJson() throws Exception{
 
 assertTrue(Iterables.isEmpty(((IndexedPaths)builder2.build())));
 }
+@Test
+public void addJsonLessThanMaxBuilderSize() throws Exception {
+String a = null;
+for (int i = 0; i < 499; i++) {
+a = "{\"/var/eventing/jobs/foo/2022/4/19/14/27/af96fcfa9e32_8589" + i + "\" :[\"/oak:index/foo\",\"/oak:index/bar\"]}";
+builder.addSerializedProperty(a);
+}
+assertEquals(createdIndexPathMap((IndexedPaths)builder.build()).size(), 998);

Review Comment:
   Please swap the two arguments for assertEquals(). The first argument is the 
expected value, which is 998 in this test.
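
The order mostly matters for the failure message. The sketch below uses a small stand-in for `assertEquals(expected, actual)` rather than JUnit itself (the helper and the `failureMessage` wrapper are assumptions written for this illustration, with a message modeled on JUnit 4's "expected ... but was ..." wording) to show how swapped arguments mislabel the two values when the test fails.

```java
class AssertOrderSketch {
    // Stand-in for JUnit's assertEquals(expected, actual); not the library code.
    static void assertEquals(long expected, long actual) {
        if (expected != actual) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    // Runs an assertion and returns its failure message, or "passed".
    static String failureMessage(Runnable assertion) {
        try {
            assertion.run();
            return "passed";
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }

    public static void main(String[] args) {
        int actualSize = 996; // pretend the builder returned too few entries

        // Swapped arguments: the message claims 996 was the expected value.
        System.out.println(failureMessage(() -> assertEquals(actualSize, 998)));

        // Correct order: the message names 998 as expected and 996 as the actual result.
        System.out.println(failureMessage(() -> assertEquals(998, actualSize)));
    }
}
```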



##
oak-lucene/src/test/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilderTest.java:
##
@@ -53,7 +53,25 @@ public void nullOrEmptyJson() throws Exception{
 
 assertTrue(Iterables.isEmpty(((IndexedPaths)builder2.build())));
 }
+@Test
+public void addJsonLessThanMaxBuilderSize() throws Exception {
+String a = null;
+for (int i = 0; i < 499; i++) {
+a = "{\"/var/eventing/jobs/foo/2022/4/19/14/27/af96fcfa9e32_8589" + i + "\" :[\"/oak:index/foo\",\"/oak:index/bar\"]}";
+builder.addSerializedProperty(a);
+}
+assertEquals(createdIndexPathMap((IndexedPaths)builder.build()).size(), 998);
+}
 
+@Test
+public void addJsonBiggerThanMaxBuilderSize() throws Exception {
+String a = null;
+for (int i = 0; i < 502; i++) {
+a = "{\"/var/eventing/jobs/foo/2022/4/19/14/27/af96fcfa9e32_8589" + i + "\" :[\"/oak:index/foo\",\"/oak:index/bar\"]}";
+builder.addSerializedProperty(a);
+}
+assertEquals(createdIndexPathMap((IndexedPaths)builder.build()).size(), 1000);

Review Comment:
   Same as above. Please swap the two arguments to assertEquals().





