egalpin commented on a change in pull request #15257:
URL: https://github.com/apache/beam/pull/15257#discussion_r681775815
##########
File path:
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
##########
@@ -1512,10 +1528,14 @@ static String createBulkApiEntity(DocToBulk spec,
String document, int backendVe
isDelete = spec.getIsDeleteFn().apply(parsedDocument);
}
}
+ final boolean isAppendOnly = Boolean.TRUE.equals(spec.getAppendOnly());
Review comment:
I also think that we should be consistent in how we make use of the
various settings: either we should use the approach of
`Boolean.TRUE.equals(...)` or the existing non-nullable approach setting a
default via builder. I personally prefer the method you've used for its
flexibility and ease of maintenance over default values set via builder.
Could you please update `getUsePartialUpdate` to be used in the same manner?
##########
File path:
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
##########
@@ -1333,6 +1337,18 @@ public DocToBulk withUsePartialUpdate(boolean
usePartialUpdate) {
return builder().setUsePartialUpdate(usePartialUpdate).build();
}
+ /**
+ * Provide an instruction to control whether the target index should be
considered append-only.
+ * For append-only indexes and/or data streams, only {@code create}
operations will be issued.
+ * Updates and deletions are not allowed, so related options will be
ignored.
+ *
+ * @param appendOnly set to true to allow one document appending
Review comment:
nit: typo I think. `to allow only` rather than `to allow one`
##########
File path:
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
##########
@@ -1333,6 +1337,18 @@ public DocToBulk withUsePartialUpdate(boolean
usePartialUpdate) {
return builder().setUsePartialUpdate(usePartialUpdate).build();
}
+ /**
+ * Provide an instruction to control whether the target index should be
considered append-only.
+ * For append-only indexes and/or data streams, only {@code create}
operations will be issued.
Review comment:
One interesting point about create Vs. index: unless specifying one's
own document IDs, the two methods result in the same behaviour[1]. I think
having a note here in the docstring explaining that point would be a
nice-to-have addition. Really, we should rely on users to consult the
Elasticsearch (or any Sink's) docs to determine their needs, but illustrating
that the 2 options are effectively equivalent depending on other settings might
prompt users to double check the ES docs for their use case.
[1]
https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-bulk.html#docs-bulk-api-desc
##########
File path:
sdks/java/io/elasticsearch/src/main/java/org/apache/beam/sdk/io/elasticsearch/ElasticsearchIO.java
##########
@@ -1512,10 +1528,14 @@ static String createBulkApiEntity(DocToBulk spec,
String document, int backendVe
isDelete = spec.getIsDeleteFn().apply(parsedDocument);
}
}
+ final boolean isAppendOnly = Boolean.TRUE.equals(spec.getAppendOnly());
if (isDelete) {
+ checkState(!isAppendOnly, "No deletions allowed for append-only
indices");
// delete request used for deleting a document
return String.format("{ \"delete\" : %s }%n", documentMetadata);
+ } else if (isAppendOnly) {
Review comment:
I wonder about this control structure a bit. The `isDelete` case is
special because its truthiness will change depending on the document and a
function applied to that document. The other settings all always have constant
behaviour regardless of input document.
Given the number of branches here, I think it would be best to restructure
this section to return early whenever possible and reduce the nesting of the
logic
So with both those points said, I think we should unnest the
settings-branching making each setting its own `if-return` rather than having
any `else/else if` such that all settings are flat but listed in order of
precedence.
Thoughts?
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]