mikemccand commented on code in PR #12829:
URL: https://github.com/apache/lucene/pull/12829#discussion_r1435648884


##########
lucene/codecs/src/java/org/apache/lucene/codecs/simpletext/SimpleTextSegmentInfoFormat.java:
##########
@@ -197,7 +197,13 @@ public SegmentInfo read(
         sortField[i] = 
SortFieldProvider.forName(provider).readSortField(bytes);
         assert bytes.eof();
       }
-      Sort indexSort = sortField.length == 0 ? null : new Sort(sortField);
+
+      final Sort indexSort;
+      if (sortField.length == 0) {

Review Comment:
   Thank you!  Die ternary operator die!



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##########
@@ -437,6 +486,31 @@ private void verifySoftDeletedFieldName(String fieldName, 
boolean isSoftDeletesF
       }
     }
 
+    private void verifyParentFieldName(String fieldName, boolean 
isParentField) {
+      if (isParentField) {
+        if (parentFieldName == null) {
+          throw new IllegalArgumentException(
+              "this index has ["
+                  + fieldName
+                  + "] as parent document field already but parent document 
field is not configured in IWC");
+        } else if (fieldName.equals(parentFieldName) == false) {
+          throw new IllegalArgumentException(
+              "cannot configure ["
+                  + parentFieldName
+                  + "] as parent document field ; this index uses ["
+                  + fieldName
+                  + "] as parent document field  already");
+        }
+      } else if (fieldName.equals(parentFieldName)) {
+        throw new IllegalArgumentException(
+            "cannot configure ["
+                + parentFieldName
+                + "] as parent document field ; this index uses ["

Review Comment:
   Hmm isn't this exception message wrong?  It should be more like "this index 
has <parentFieldName> as parent field already, cannot index this as non-parent 
field now" or so?



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##########
@@ -132,6 +135,13 @@ public FieldInfos(FieldInfo[] infos) {
         }
         softDeletesField = info.name;
       }
+      if (info.isParentField()) {
+        if (parentField != null && parentField.equals(info.name) == false) {
+          throw new IllegalArgumentException(
+              "multiple parent fields [" + info.name + ", " + parentField + 
"]");

Review Comment:
   Maybe clarify for the user something like `only a single parent field is 
allowed: saw both "X" and "Y"` or so?



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##########
@@ -355,11 +395,19 @@ static final class FieldNumbers {
       this.omitNorms = new HashMap<>();
       this.storeTermVectors = new HashMap<>();
       this.softDeletesFieldName = softDeletesFieldName;
+      this.parentFieldName = parentFieldName;
+      if (softDeletesFieldName != null
+          && parentFieldName != null
+          && parentFieldName.equals(softDeletesFieldName)) {
+        throw new IllegalArgumentException(
+            "parent document and soft-deletes field can't be the same field");

Review Comment:
   Include the field name in the exception message?



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##########
@@ -78,6 +80,7 @@ public FieldInfos(FieldInfo[] infos) {
     boolean hasPointValues = false;
     boolean hasVectorValues = false;
     String softDeletesField = null;
+    String parentField = null;

Review Comment:
   We should remove all these (pointless) initializers since they are already 
javac's default?  Or it's OK for a "when in Rome..." situation...



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfo.java:
##########
@@ -206,6 +210,13 @@ public void checkConsistency() {
       throw new IllegalArgumentException(
           "vectorDimension must be >=0; got " + vectorDimension + " (field: '" 
+ name + "')");
     }
+
+    if (softDeletesField && parentField) {
+      throw new IllegalArgumentException(
+          "field can't be used as soft-deletes field and parent document field 
(field: '"

Review Comment:
   Whoa, sneaky!



##########
lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94FieldInfosFormat.java:
##########
@@ -157,6 +158,13 @@ public FieldInfos read(
           boolean omitNorms = (bits & OMIT_NORMS) != 0;
           boolean storePayloads = (bits & STORE_PAYLOADS) != 0;
           boolean isSoftDeletesField = (bits & SOFT_DELETES_FIELD) != 0;
+          boolean isParentField =
+              format >= FORMAT_PARENT_FIELD ? (bits & PARENT_FIELD_FIELD) != 0 
: false;
+
+          assert (bits & 0xE0) == 0
+              : "unused bits are set \"" + Integer.toBinaryString(bits) + "\"";
+          assert format >= FORMAT_PARENT_FIELD || (bits & 0xF0) == 0
+              : "parent field bit is set but shouldn't' \"" + 
Integer.toBinaryString(bits) + "\"";

Review Comment:
   Can we upgrade these to `CorruptIndexException` since conceivably this can 
happen due to user events (bad disk, RAM, whatever CPU bit fiddling, cosmic 
ray...)?  `assert` should be reserved for cases that we feel strongly can only 
occur if we (Lucene sources) have a bug.



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##########
@@ -188,6 +200,26 @@ public static FieldInfos getMergedFieldInfos(IndexReader 
reader) {
     }
   }
 
+  private static String getAndValidateParentField(List<LeafReaderContext> 
leaves) {
+    boolean set = false;
+    String theField = null;
+    for (LeafReaderContext ctx : leaves) {
+      String field = ctx.reader().getFieldInfos().getParentField();
+      if (set && Objects.equals(field, theField) == false) {
+        throw new IllegalStateException(

Review Comment:
   As far as we know, users should never be able to do anything that would 
trigger this?  This means we have some bug that allowed broken schema into the 
index?



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfo.java:
##########
@@ -84,7 +86,8 @@ public FieldInfo(
       int vectorDimension,
       VectorEncoding vectorEncoding,
       VectorSimilarityFunction vectorSimilarityFunction,
-      boolean softDeletesField) {
+      boolean softDeletesField,
+      boolean parentField) {

Review Comment:
   Could we rename the arg & member to `isParentField`?



##########
lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java:
##########
@@ -545,4 +545,20 @@ public IndexWriterConfig setIndexWriterEventListener(
     this.eventListener = eventListener;
     return this;
   }
+
+  /**
+   * Sets the parent document field. If this optional property is set, 
IndexWriter will add an
+   * internal field to every root document added to the index writer. A 
document is considered a
+   * parent document if it's the last document in a document block indexed via 
{@link
+   * IndexWriter#addDocuments(Iterable)} or {@link 
IndexWriter#updateDocuments(Term, Iterable)}
+   * (Iterable)} and it's relatives. Additionally, all individual documents 
added via the single

Review Comment:
   Hmm is this `(Iterable)}` a leftover?



##########
lucene/core/src/java/org/apache/lucene/index/IndexWriterConfig.java:
##########
@@ -545,4 +545,20 @@ public IndexWriterConfig setIndexWriterEventListener(
     this.eventListener = eventListener;
     return this;
   }
+
+  /**
+   * Sets the parent document field. If this optional property is set, 
IndexWriter will add an
+   * internal field to every root document added to the index writer. A 
document is considered a
+   * parent document if it's the last document in a document block indexed via 
{@link
+   * IndexWriter#addDocuments(Iterable)} or {@link 
IndexWriter#updateDocuments(Term, Iterable)}
+   * (Iterable)} and it's relatives. Additionally, all individual documents 
added via the single

Review Comment:
   Also `it's` -> `its` in this sentence.



##########
lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94FieldInfosFormat.java:
##########
@@ -157,6 +158,13 @@ public FieldInfos read(
           boolean omitNorms = (bits & OMIT_NORMS) != 0;
           boolean storePayloads = (bits & STORE_PAYLOADS) != 0;
           boolean isSoftDeletesField = (bits & SOFT_DELETES_FIELD) != 0;
+          boolean isParentField =
+              format >= FORMAT_PARENT_FIELD ? (bits & PARENT_FIELD_FIELD) != 0 
: false;
+
+          assert (bits & 0xE0) == 0
+              : "unused bits are set \"" + Integer.toBinaryString(bits) + "\"";
+          assert format >= FORMAT_PARENT_FIELD || (bits & 0xF0) == 0
+              : "parent field bit is set but shouldn't' \"" + 
Integer.toBinaryString(bits) + "\"";

Review Comment:
   Also fix `shouldn't'` -> `shouldn't` (remove the extra trailing single 
quote)?



##########
lucene/core/src/java/org/apache/lucene/index/FieldInfos.java:
##########
@@ -437,6 +486,31 @@ private void verifySoftDeletedFieldName(String fieldName, 
boolean isSoftDeletesF
       }
     }
 
+    private void verifyParentFieldName(String fieldName, boolean 
isParentField) {
+      if (isParentField) {
+        if (parentFieldName == null) {
+          throw new IllegalArgumentException(
+              "this index has ["
+                  + fieldName
+                  + "] as parent document field already but parent document 
field is not configured in IWC");
+        } else if (fieldName.equals(parentFieldName) == false) {
+          throw new IllegalArgumentException(
+              "cannot configure ["
+                  + parentFieldName
+                  + "] as parent document field ; this index uses ["
+                  + fieldName
+                  + "] as parent document field  already");
+        }
+      } else if (fieldName.equals(parentFieldName)) {
+        throw new IllegalArgumentException(

Review Comment:
   Maybe add a comment that this is the `else case -- isParentField == false`?  
On quick read I missed that extra closing `}`...



##########
lucene/core/src/java/org/apache/lucene/codecs/lucene94/Lucene94FieldInfosFormat.java:
##########
@@ -157,6 +158,8 @@ public FieldInfos read(
           boolean omitNorms = (bits & OMIT_NORMS) != 0;
           boolean storePayloads = (bits & STORE_PAYLOADS) != 0;
           boolean isSoftDeletesField = (bits & SOFT_DELETES_FIELD) != 0;
+          boolean isParentField =
+              format >= FORMAT_PARENT_FIELD ? (bits & PARENT_FIELD_FIELD) != 0 
: false;

Review Comment:
   +1 for both paranoia checks?  (bit is not set if format is old; no "unused" 
bits are ever set).  Bit flags are otherwise a bit dangerous...



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org
For additional commands, e-mail: issues-h...@lucene.apache.org

Reply via email to