[GitHub] [jackrabbit-oak] rishabhdaim opened a new pull request, #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…
rishabhdaim opened a new pull request, #550: URL: https://github.com/apache/jackrabbit-oak/pull/550 …lder size -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: dev-unsubscr...@jackrabbit.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org
[GitHub] [jackrabbit-oak] mreutegg commented on a diff in pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…
mreutegg commented on code in PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#discussion_r857716136

## oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java: ##

@@ -93,6 +93,9 @@ public void addSerializedProperty(@Nullable String json) {
                 }
                 if (sizeWithinLimits()) {
                     indexedNodes.put(path, reader.readString());
+                } else {
+                    // return if max limit reached for builder to avoid overflow exception
+                    return;

Review Comment:
The same thought crossed my mind when I reviewed the change, but I didn't point it out. +1 for parsing the entire JSON.
[GitHub] [jackrabbit-oak] thomasmueller commented on pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…
thomasmueller commented on PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#issuecomment-1108719091

BTW, thanks a lot for the PR! It is great that you have found and fixed the issue!
[GitHub] [jackrabbit-oak] thomasmueller commented on a diff in pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…
thomasmueller commented on code in PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#discussion_r857640876

## oak-lucene/src/main/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilder.java: ##

@@ -93,6 +93,9 @@ public void addSerializedProperty(@Nullable String json) {
                 }
                 if (sizeWithinLimits()) {
                     indexedNodes.put(path, reader.readString());
+                } else {
+                    // return if max limit reached for builder to avoid overflow exception
+                    return;

Review Comment:
I can see why the old code is broken: if the size is exceeded, we don't call readString(), but continue and try to read a comma... But I also don't like the new code: we wouldn't detect any errors in the JSON format afterwards, which I would expect us to do. Instead of return, what about:

```
String x = reader.readString();
if (sizeWithinLimits()) {
    indexedNodes.put(path, x);
}
```

Sure, one could argue it is more efficient to return, but I think the side effect of _not_ parsing the rest in this case is worse than the performance hit (which is unlikely to be measurable).
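To illustrate the suggestion above, here is a minimal, self-contained sketch of the "always consume the token, store only while under the limit" pattern. It does not use Oak's tokenizer: `BoundedArrayReader`, its helper methods, and the `maxSize` parameter are hypothetical stand-ins, and the input is simplified to a flat JSON array of unescaped strings.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical, simplified stand-in for the builder's parsing loop; not Oak's actual code. */
class BoundedArrayReader {
    private final String json;
    private int pos = 0;

    BoundedArrayReader(String json) {
        this.json = json;
    }

    /** Consumes the expected character or fails, as a strict tokenizer would. */
    private void read(char expected) {
        if (pos >= json.length() || json.charAt(pos) != expected) {
            throw new IllegalArgumentException("expected '" + expected + "' at " + pos);
        }
        pos++;
    }

    /** Consumes the character if it is next and reports whether it did. */
    private boolean matches(char c) {
        if (pos < json.length() && json.charAt(pos) == c) {
            pos++;
            return true;
        }
        return false;
    }

    /** Reads a double-quoted string; escape sequences are not handled in this sketch. */
    private String readString() {
        read('"');
        int end = json.indexOf('"', pos);
        if (end < 0) {
            throw new IllegalArgumentException("unterminated string at " + pos);
        }
        String s = json.substring(pos, end);
        pos = end + 1;
        return s;
    }

    /** Parses ["a","b",...] to the end but keeps at most maxSize entries. */
    static List<String> parse(String json, int maxSize) {
        BoundedArrayReader r = new BoundedArrayReader(json);
        List<String> kept = new ArrayList<>();
        r.read('[');
        for (boolean first = true; !r.matches(']'); first = false) {
            if (!first) {
                r.read(',');
            }
            String value = r.readString(); // always consume the token
            if (kept.size() < maxSize) {   // store it only while under the limit
                kept.add(value);
            }
        }
        return kept;
    }
}
```

In this sketch, `BoundedArrayReader.parse("[\"a\",\"b\",\"c\"]", 2)` keeps only the first two values, while an input with a missing comma after the limit has been reached still throws, which an early `return` would have skipped.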
[GitHub] [jackrabbit-oak] mreutegg commented on a diff in pull request #550: OAK-9751 : handled cases where path changes in lucene exceeds max bui…
mreutegg commented on code in PR #550:
URL: https://github.com/apache/jackrabbit-oak/pull/550#discussion_r857606356

## oak-lucene/src/test/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilderTest.java: ##

@@ -53,7 +53,25 @@ public void nullOrEmptyJson() throws Exception{
         assertTrue(Iterables.isEmpty(((IndexedPaths)builder2.build())));
     }
+    @Test
+    public void addJsonLessThanMaxBuilderSize() throws Exception {
+        String a = null;
+        for (int i = 0; i < 499; i++) {
+            a = "{\"/var/eventing/jobs/foo/2022/4/19/14/27/af96fcfa9e32_8589" + i + "\" :[\"/oak:index/foo\",\"/oak:index/bar\"]}";
+            builder.addSerializedProperty(a);
+        }
+        assertEquals(createdIndexPathMap((IndexedPaths)builder.build()).size(), 998);

Review Comment:
Please swap the two arguments for assertEquals(). The first argument is the expected value, which is 998 in this test.

## oak-lucene/src/test/java/org/apache/jackrabbit/oak/plugins/index/lucene/hybrid/LuceneJournalPropertyBuilderTest.java: ##

@@ -53,7 +53,25 @@ public void nullOrEmptyJson() throws Exception{
         assertTrue(Iterables.isEmpty(((IndexedPaths)builder2.build())));
     }
+    @Test
+    public void addJsonLessThanMaxBuilderSize() throws Exception {
+        String a = null;
+        for (int i = 0; i < 499; i++) {
+            a = "{\"/var/eventing/jobs/foo/2022/4/19/14/27/af96fcfa9e32_8589" + i + "\" :[\"/oak:index/foo\",\"/oak:index/bar\"]}";
+            builder.addSerializedProperty(a);
+        }
+        assertEquals(createdIndexPathMap((IndexedPaths)builder.build()).size(), 998);
+    }
+    @Test
+    public void addJsonBiggerThanMaxBuilderSize() throws Exception {
+        String a = null;
+        for (int i = 0; i < 502; i++) {
+            a = "{\"/var/eventing/jobs/foo/2022/4/19/14/27/af96fcfa9e32_8589" + i + "\" :[\"/oak:index/foo\",\"/oak:index/bar\"]}";
+            builder.addSerializedProperty(a);
+        }
+        assertEquals(createdIndexPathMap((IndexedPaths)builder.build()).size(), 1000);

Review Comment:
Same as above. Please swap the two arguments to assertEquals().
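As a rough illustration of why the argument order matters, the sketch below uses a local stand-in for `assertEquals` (not JUnit itself) that follows JUnit 4's `(expected, actual)` parameter order and its `expected:<...> but was:<...>` message shape. `AssertOrderDemo` and `failureMessage` are hypothetical names introduced only for this example.

```java
/** Illustrative only: a local stand-in for an (expected, actual) style assertion. */
class AssertOrderDemo {

    /** Mirrors the (expected, actual) parameter order used by JUnit 4's assertEquals. */
    static void assertEquals(Object expected, Object actual) {
        if (!expected.equals(actual)) {
            throw new AssertionError("expected:<" + expected + "> but was:<" + actual + ">");
        }
    }

    /** Returns the failure message for the given argument order, or null if the values match. */
    static String failureMessage(Object first, Object second) {
        try {
            assertEquals(first, second);
            return null;
        } catch (AssertionError e) {
            return e.getMessage();
        }
    }
}
```

If the builder returned 1000 entries where 998 were intended, `failureMessage(998, 1000)` reports the intended value as expected, whereas the swapped call `failureMessage(1000, 998)` labels the buggy result as the expectation and makes the failure harder to read.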