cpoerschke commented on a change in pull request #243:
URL: https://github.com/apache/solr/pull/243#discussion_r703432000
##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/io/Tuple.java
##########
@@ -87,10 +87,20 @@ public Tuple(String k1, Object v1, String k2, Object v2) {
* @param fields map containing keys and values to be copied to this tuple
*/
public Tuple(Map<String, ?> fields) {
- // TODO Use bulk putAll operation that will properly size the map
- // https://issues.apache.org/jira/browse/SOLR-15480
- for (Map.Entry<String, ?> entry : fields.entrySet()) {
- put(entry.getKey(), entry.getValue());
+ putAll(fields);
+ }
+
+ /**
+ * A copy constructor
+ * @param original Tuple that will be copied
+ */
+ public Tuple(Tuple original) {
+ this.putAll(original.fields);
+ if (original.fieldNames != null) {
+ this.fieldNames = new ArrayList<>(original.fieldNames);
+ }
+ if (original.fieldLabels != null) {
+ this.fieldLabels = new HashMap<>(original.fieldLabels);
Review comment:
> I thought about the case where we rename `other.fieldNames` on
collision as well, but I'm not sure where we would call them since the only
real "identifier" that `Tuple`'s would have would be either their position
(this || other) or their java object reference.
My interpretation of `fieldNames` is as a filter of what is to be
serialised, possibly with an ordering also, i.e. no collision as such. And I
agree that there aren't any real "identifiers" that might be used in renaming.
And `fieldNames` and `fieldLabels` being used together, would renaming in the
former then also need to imply renaming in the latter?
(It's not very "merge like" but in principle collision or overlap could also
be resolved via removal: `[A B C] merged with [B D F] resulting in [A C D F]`
with the common `B` removed but that seems quite counter-intuitive not least
how would the caller know that `B` was removed?)
##########
File path: solr/solrj/src/test/org/apache/solr/client/solrj/io/TupleTest.java
##########
@@ -0,0 +1,101 @@
+package org.apache.solr.client.solrj.io;
Review comment:
minor: the pull request CI or locally `gradlew precommit` or `gradlew
checks` will likely flag that there's no license header here as yet
##########
File path: solr/solrj/src/java/org/apache/solr/client/solrj/io/Tuple.java
##########
@@ -87,10 +87,20 @@ public Tuple(String k1, Object v1, String k2, Object v2) {
* @param fields map containing keys and values to be copied to this tuple
*/
public Tuple(Map<String, ?> fields) {
- // TODO Use bulk putAll operation that will properly size the map
- // https://issues.apache.org/jira/browse/SOLR-15480
- for (Map.Entry<String, ?> entry : fields.entrySet()) {
- put(entry.getKey(), entry.getValue());
+ putAll(fields);
+ }
+
+ /**
+ * A copy constructor
+ * @param original Tuple that will be copied
+ */
+ public Tuple(Tuple original) {
+ this.putAll(original.fields);
+ if (original.fieldNames != null) {
+ this.fieldNames = new ArrayList<>(original.fieldNames);
+ }
+ if (original.fieldLabels != null) {
+ this.fieldLabels = new HashMap<>(original.fieldLabels);
Review comment:
> 1. Give `other` precedence and do `putAll`s for `fields` and
`fieldLabels`.
> 2. Add individual `fieldName` items from `other` only if they aren't
present in `this.fieldNames`.
Yes, that makes sense to me too. Looking at how `clone` and `merge` are used
e.g. `HashJoinStream` there is documented also -- e.g.
https://solr.apache.org/guide/8_9/stream-decorator-reference.html#hashjoin --
w.r.t. right being used if both left and right contain the same field. So
`other` being given precedence would be consistent with that, assuming
`left.merge(right)` directionality.
> The only downsides I could see would be if the user had a `field` that
they didn't want externally serializable (ie: it doesn't have an associated
`fieldName/fieldLabel`), then `other` could override that original state. That
may be fine in this case. If a user wants more fine grain control, then they
should probably just do the merge "by hand" instead of relying on `Tuple.merge`.
Yes, that's one of the edge cases where `tupleA` and `tupleB` could have
labels and names for a particular purpose but the merged result might not
necessarily suit the purpose. As you say, in that scenario alternative
"merging" could be done, or regular `Tuple.merge` could be called followed by
some sort of post-processing.
Another edge case is collisions in the field labels, I'll add a
`TupleTest.writeMapTest()` for that, let me know what you think and/or feel
free to amend or revert.
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]