[GitHub] [solr] cpoerschke commented on a change in pull request #243: SOLR-15480 Added putAll method to Tuple and refactored the copy constructor, merge, and clone methods to use it.

GitBox Tue, 07 Sep 2021 04:39:00 -0700


cpoerschke commented on a change in pull request #243:
URL: https://github.com/apache/solr/pull/243#discussion_r703432000




##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/io/Tuple.java
##########
@@ -87,10 +87,20 @@ public Tuple(String k1, Object v1, String k2, Object v2) {
    * @param fields map containing keys and values to be copied to this tuple
    */
   public Tuple(Map<String, ?> fields) {
-    // TODO Use bulk putAll operation that will properly size the map
-    // https://issues.apache.org/jira/browse/SOLR-15480
-    for (Map.Entry<String, ?> entry : fields.entrySet()) {
-      put(entry.getKey(), entry.getValue());
+    putAll(fields);
+  }
+
+  /**
+   * A copy constructor
+   * @param original Tuple that will be copied
+   */
+  public Tuple(Tuple original) {
+    this.putAll(original.fields);
+    if (original.fieldNames != null) {
+      this.fieldNames = new ArrayList<>(original.fieldNames);
+    }
+    if (original.fieldLabels != null) {
+      this.fieldLabels = new HashMap<>(original.fieldLabels);

Review comment:
       > I thought about the case where we rename `other.fieldNames` on 
collision as well, but I'm not sure where we would call them since the only 
real "identifier" that `Tuple`'s would have would be either their position 
(this || other) or their java object reference.
   
   My interpretation of `fieldNames` is as a filter of what is to be 
serialised, possibly with an ordering also, i.e. no collision as such. And I 
agree that there aren't any real "identifiers" that might be used in renaming. 
And `fieldNames` and `fieldLabels` being used together, would renaming in the 
former then also need to imply renaming in the latter?
   
   (It's not very "merge like" but in principle collision or overlap could also 
be resolved via removal: `[A B C] merged with [B  D F] resulting in [A C D F]` 
with the common `B` removed but that seems quite counter-intuitive not least 
how would the caller know that `B` was removed?)

##########
File path: solr/solrj/src/test/org/apache/solr/client/solrj/io/TupleTest.java
##########
@@ -0,0 +1,101 @@
+package org.apache.solr.client.solrj.io;

Review comment:
       minor: the pull request CI or locally `gradlew precommit` or `gradlew 
checks` will likely flag that there's no license header here as yet

##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/io/Tuple.java
##########
@@ -87,10 +87,20 @@ public Tuple(String k1, Object v1, String k2, Object v2) {
    * @param fields map containing keys and values to be copied to this tuple
    */
   public Tuple(Map<String, ?> fields) {
-    // TODO Use bulk putAll operation that will properly size the map
-    // https://issues.apache.org/jira/browse/SOLR-15480
-    for (Map.Entry<String, ?> entry : fields.entrySet()) {
-      put(entry.getKey(), entry.getValue());
+    putAll(fields);
+  }
+
+  /**
+   * A copy constructor
+   * @param original Tuple that will be copied
+   */
+  public Tuple(Tuple original) {
+    this.putAll(original.fields);
+    if (original.fieldNames != null) {
+      this.fieldNames = new ArrayList<>(original.fieldNames);
+    }
+    if (original.fieldLabels != null) {
+      this.fieldLabels = new HashMap<>(original.fieldLabels);

Review comment:
       > 1. Give `other` precedence and do `putAll`s for `fields` and 
`fieldLabels`.
   > 2. Add individual `fieldName` items from `other` only if they aren't 
present in `this.fieldNames`.
   
   Yes, that makes sense to me too. Looking at how `clone` and `merge` are used 
e.g. `HashJoinStream` there is documented also -- e.g. 
https://solr.apache.org/guide/8_9/stream-decorator-reference.html#hashjoin -- 
w.r.t. right being used if both left and right contain the same field. So 
`other` being given precedence would be consistent with that, assuming 
`left.merge(right)` directionality.
   
   > The only downsides I could see would be if the user had a `field` that 
they didn't want externally serializable (ie: it doesn't have an associated 
`fieldName/fieldLabel`), then `other` could override that original state. That 
may be fine in this case. If a user wants more fine grain control, then they 
should probably just do the merge "by hand" instead of relying on `Tuple.merge`.
   
   Yes, that's one of the edge cases where `tupleA` and `tupleB` could have 
labels and names for a particular purpose but the merged result might not 
necessarily suit the purpose. As you say, in that scenario alternative 
"merging" could be done, or regular `Tuple.merge` could be called followed by 
some sort of post-processing.
   
   Another edge case is collisions in the field labels, I'll add a 
`TupleTest.writeMapTest()` for that, let me know what you think and/or feel 
free to amend or revert.




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]



---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

[GitHub] [solr] cpoerschke commented on a change in pull request #243: SOLR-15480 Added putAll method to Tuple and refactored the copy constructor, merge, and clone methods to use it.

Reply via email to