nicoloboschi opened a new pull request #14805:
URL: https://github.com/apache/pulsar/pull/14805


   ### Motivation
   
   The OpenSearch high-level REST client does not support Elasticsearch 8 servers.
   It sends some hardcoded request fields that ES 8 no longer supports, so the server returns an error (see the [full migration guide](https://www.elastic.co/guide/en/elasticsearch/reference/current/migrating-8.0.html)).
   For the ES sink, the first problem I found was in the createIndex request and involves the "type" field (`include_type_name` in the request, `type` in responses).
   
   Elastic provides a guide for migrating gradually to a newer installation using [custom HTTP headers](https://www.elastic.co/guide/en/elasticsearch/reference/current/rest-api-compatibility.html).
   
   However, we are using the OpenSearch client, which doesn't support the `"application/vnd.elasticsearch+json;compatible-with=7"` content type.
   
   The best solution is to migrate from the high-level REST client to the official [Elastic Java client](https://www.elastic.co/guide/en/elasticsearch/client/java-api-client/current/index.html), which is licensed under Apache 2.0.
   See the [License page](https://www.elastic.co/pricing/faq/licensing):
   >Update: The Java HLRC has been deprecated in 7.15.0 in favor of the Java API Client. The Java API Client is licensed under Apache 2.0.
   
   However, the Elastic java-client is not compatible with OpenSearch.
   The solution is therefore to keep both clients (the OpenSearch high-level client and the Elastic java-client) and use the proper one based on the target server.
   
   ### Modifications
   * Added a new configuration property, `compatibilityMode`, that accepts:
     * AUTO (default): discovers the server version and chooses the right client implementation
     * ELASTICSEARCH: forces the use of the ES Java client
     * ELASTICSEARCH_7: forces the use of the OpenSearch client. The OpenSearch implementation is preferable for ES 7 because it is the current implementation and is better tested in production (a conservative choice to avoid regressions when upgrading Pulsar)
     * OPENSEARCH: forces the use of the OpenSearch client
   * Created a new interface, `RestClient`, with two implementations (one per client).
   * Created a new BulkProcessor API (very similar to the one in the high-level REST client) to handle bulk requests, with the following features:
      * Multi-threaded async requests
      * Threshold based on the number of operations
      * Threshold based on the byte size of the operations
      * Periodic auto flush
   * Changed all the tests (both unit and integration) that use the container to run against ES 7, ES 8, and OpenSearch Docker containers.
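
The `compatibilityMode` selection described above could look roughly like the following. This is a minimal sketch, not the code in this PR: `ClientKind`, `ClientSelector`, and the `distribution`/`majorVersion` inputs are hypothetical names, and the AUTO rule (Elastic java-client only for Elasticsearch 8 or newer, OpenSearch client otherwise) is an assumption inferred from the ELASTICSEARCH_7 rationale.

```java
/** Values of the compatibilityMode property listed above. */
enum CompatibilityMode { AUTO, ELASTICSEARCH, ELASTICSEARCH_7, OPENSEARCH }

/** Hypothetical label for the two client implementations. */
enum ClientKind { ELASTIC_JAVA_CLIENT, OPENSEARCH_HIGH_LEVEL_CLIENT }

final class ClientSelector {
    private ClientSelector() {}

    /**
     * Picks a client implementation.
     *
     * @param mode         the configured compatibility mode
     * @param distribution hypothetical discovery result, e.g. "elasticsearch" or "opensearch";
     *                     only consulted in AUTO mode
     * @param majorVersion hypothetical discovered major version; only consulted in AUTO mode
     */
    static ClientKind select(CompatibilityMode mode, String distribution, int majorVersion) {
        switch (mode) {
            case ELASTICSEARCH:
                return ClientKind.ELASTIC_JAVA_CLIENT;
            case ELASTICSEARCH_7:
            case OPENSEARCH:
                return ClientKind.OPENSEARCH_HIGH_LEVEL_CLIENT;
            case AUTO:
            default:
                // Assumption: only Elasticsearch 8+ needs the Elastic java-client;
                // ES 7 and OpenSearch stay on the OpenSearch client.
                boolean isElastic = "elasticsearch".equalsIgnoreCase(distribution);
                return (isElastic && majorVersion >= 8)
                        ? ClientKind.ELASTIC_JAVA_CLIENT
                        : ClientKind.OPENSEARCH_HIGH_LEVEL_CLIENT;
        }
    }
}
```

For example, `ClientSelector.select(CompatibilityMode.AUTO, "elasticsearch", 7)` returns `OPENSEARCH_HIGH_LEVEL_CLIENT` under this assumed rule, which matches the conservative choice described for ES 7.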
   
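To illustrate the flush triggers listed for the new BulkProcessor (operation count, byte size, periodic flush), here is a simplified sketch. It is not the API added by this PR: `SimpleBulkBuffer` and its constructor parameters are hypothetical, and the multi-threaded async sending is left out, so the `sender` callback runs synchronously while the buffer lock is held.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

/** Hypothetical buffer that flushes on count, byte size, or a timer. */
final class SimpleBulkBuffer implements AutoCloseable {
    private final int maxActions;
    private final long maxBytes;
    private final Consumer<List<byte[]>> sender;
    private final ScheduledExecutorService scheduler;
    private List<byte[]> pending = new ArrayList<>();
    private long pendingBytes = 0;

    SimpleBulkBuffer(int maxActions, long maxBytes, long flushIntervalMs,
                     Consumer<List<byte[]>> sender) {
        this.maxActions = maxActions;
        this.maxBytes = maxBytes;
        this.sender = sender;
        if (flushIntervalMs > 0) {
            // Periodic auto flush on a daemon thread; disabled when the interval is <= 0.
            this.scheduler = Executors.newSingleThreadScheduledExecutor(r -> {
                Thread t = new Thread(r, "bulk-flush");
                t.setDaemon(true);
                return t;
            });
            this.scheduler.scheduleAtFixedRate(this::flush, flushIntervalMs,
                    flushIntervalMs, TimeUnit.MILLISECONDS);
        } else {
            this.scheduler = null;
        }
    }

    /** Buffers one serialized operation and flushes if either threshold is reached. */
    synchronized void add(byte[] operation) {
        pending.add(operation);
        pendingBytes += operation.length;
        if (pending.size() >= maxActions || pendingBytes >= maxBytes) {
            flush();
        }
    }

    /** Hands the current batch to the sender and starts a new, empty batch. */
    synchronized void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<byte[]> batch = pending;
        pending = new ArrayList<>();
        pendingBytes = 0;
        sender.accept(batch);
    }

    /** Stops the timer and flushes whatever is still buffered. */
    @Override
    public void close() {
        if (scheduler != null) {
            scheduler.shutdown();
        }
        flush();
    }
}
```

A production version would hand each batch to a pool of workers rather than calling `sender` inline, and would need to handle a failing `sender`, since `scheduleAtFixedRate` cancels further runs once a task throws.
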
   ### Verifying this change
   
   - [x] Make sure that the change passes the CI checks.
   
   This change is already covered by existing tests, such as:
   * tests under `pulsar-io/elastic-search`
   * Pulsar Sink integration tests
   
   ### Documentation
   
   - [x] `doc` 
     
   
   



