eolivelli commented on code in PR #18668:
URL: https://github.com/apache/pulsar/pull/18668#discussion_r1034503124
##########
site2/docs/io-elasticsearch-sink.md:
##########
@@ -52,43 +52,44 @@ The configuration of the Elasticsearch sink connector has
the following properti
### Property
-| Name | Type|Required | Default | Description
Review Comment:
Did you reformat the table?
##########
pulsar-io/elastic-search/src/main/java/org/apache/pulsar/io/elasticsearch/ElasticSearchSink.java:
##########
@@ -240,20 +240,28 @@ public Pair<String, String>
extractIdAndDocument(Record<GenericObject> record) t
if (id != null
&& idHashingAlgorithm != null
&& idHashingAlgorithm !=
ElasticSearchConfig.IdHashingAlgorithm.NONE) {
- Hasher hasher;
- switch (idHashingAlgorithm) {
- case SHA256:
- hasher = Hashing.sha256().newHasher();
- break;
- case SHA512:
- hasher = Hashing.sha512().newHasher();
- break;
- default:
- throw new UnsupportedOperationException("Unsupported
IdHashingAlgorithm: "
- + idHashingAlgorithm);
+
+ boolean performHashing = true;
+ if (elasticSearchConfig.isConditionalIdHashing()
+ && id.getBytes(StandardCharsets.UTF_8).length <= 512) {
Review Comment:
Do we really need to create the byte[] instance here?
It will generate some garbage on every record.
Maybe you can create the byte[] here and, instead of calling `hasher.putString(id,
StandardCharsets.UTF_8);`, feed the byte[] created here to the hasher.
I suspect that `putString` will perform the UTF-8 encoding a second time.
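
To make the suggestion concrete, here is a minimal sketch of encoding the id once and reusing the same byte[] for both the length check and the hash. It is not the connector's code: the class name, `resolveId`, and `MAX_UNHASHED_ID_BYTES` are hypothetical, it uses the JDK's `MessageDigest` instead of Guava so that it runs standalone, and the Base64 output is only for illustration. It also assumes that an id at or under 512 bytes is meant to be left unhashed when conditional hashing is enabled. With Guava the same idea would be `hasher.putBytes(idBytes)` in place of `putString`.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;

public class ConditionalIdHashingSketch {
    // Threshold taken from the diff above; the constant name is hypothetical.
    static final int MAX_UNHASHED_ID_BYTES = 512;

    // Hypothetical helper: returns the id unchanged when conditional hashing
    // applies and the id is short enough, otherwise a Base64-encoded SHA-256.
    static String resolveId(String id, boolean conditionalIdHashing) {
        // Encode exactly once; this array serves both the check and the hash.
        byte[] idBytes = id.getBytes(StandardCharsets.UTF_8);
        if (conditionalIdHashing && idBytes.length <= MAX_UNHASHED_ID_BYTES) {
            return id;
        }
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            // Hash the already-encoded bytes instead of re-encoding the String.
            return Base64.getEncoder().encodeToString(digest.digest(idBytes));
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform, so this is unexpected.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(resolveId("short-id", true));
        System.out.println(resolveId("x".repeat(600), true));
    }
}
```

This way the only allocation per record is the single byte[], and it is not thrown away after the length check.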