nicoloboschi opened a new pull request, #15428: URL: https://github.com/apache/pulsar/pull/15428
### Motivation

Elasticsearch limits the size of the document id. The hard limit is 512 bytes and it is not configurable.

> _id is limited to 512 bytes in size and larger values will be rejected.

https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-id-field.html

To work around this limitation without losing data integrity and uniqueness, a new option is introduced to create a hash of the Pulsar record key.

### Modifications

* New option `idHashingAlgorithm`, an enum with values NONE, SHA256, SHA512
* The hash is computed with the Guava utility class `com.google.common.hash.Hashing`

Note that this option adds CPU overhead, since the hashed value has to be computed for each record.

It is suggested to use this option together with `canonicalKeyFields` in order to guarantee the same hashed value for the same key content regardless of the order of the key fields. (https://github.com/apache/pulsar/pull/15426)

- [x] `doc`
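
The effect of `idHashingAlgorithm` can be sketched as follows. This is a minimal illustration, not the connector's code: it uses the JDK's `MessageDigest` instead of Guava's `Hashing` so that it runs without dependencies, and the class name `DocumentIdHasher`, the method `toDocumentId`, and the hex encoding of the digest are assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;

// Hypothetical helper, not part of the Pulsar code base.
public class DocumentIdHasher {

    // Mirrors the option values named in the PR description.
    public enum IdHashingAlgorithm { NONE, SHA256, SHA512 }

    // Returns the document id for a record key: the key itself for NONE,
    // otherwise a fixed-length digest of the key.
    public static String toDocumentId(String recordKey, IdHashingAlgorithm algorithm) {
        switch (algorithm) {
            case SHA256:
                return hexDigest("SHA-256", recordKey);
            case SHA512:
                return hexDigest("SHA-512", recordKey);
            default:
                return recordKey;
        }
    }

    // Hex encoding is an assumption of this sketch; the PR description
    // does not say how the digest bytes are encoded.
    private static String hexDigest(String jcaName, String input) {
        try {
            MessageDigest md = MessageDigest.getInstance(jcaName);
            byte[] digest = md.digest(input.getBytes(StandardCharsets.UTF_8));
            StringBuilder sb = new StringBuilder(digest.length * 2);
            for (byte b : digest) {
                sb.append(String.format("%02x", b));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("Digest not available: " + jcaName, e);
        }
    }

    public static void main(String[] args) {
        // A 1000-character key would exceed the 512-byte _id limit if used as-is.
        char[] chars = new char[1000];
        Arrays.fill(chars, 'k');
        String longKey = new String(chars);

        System.out.println(toDocumentId(longKey, IdHashingAlgorithm.NONE).length());
        System.out.println(toDocumentId(longKey, IdHashingAlgorithm.SHA256).length());
        System.out.println(toDocumentId(longKey, IdHashingAlgorithm.SHA512).length());
    }
}
```

With hex encoding, a SHA-256 id is always 64 characters and a SHA-512 id is always 128 characters, so both stay under the 512-byte limit whatever the length of the record key. The same key content always produces the same id, which is why a stable field order (as provided by `canonicalKeyFields`) matters when the key is a structured value.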
