lhotari opened a new pull request, #23762:
URL: https://github.com/apache/pulsar/pull/23762

   Fixes #23717 #23306
   
   ### Motivation
   
   The Pulsar Alpine docker image has stability issues because it includes the 
[glibc-package 
solution](https://github.com/apache/pulsar/tree/e535d990f60b5c15f1ec440e82de2fa80d6783a5/docker/glibc-package),
 which adds the glibc library to the Alpine docker image.
   
   These stability issues cause JVM crashes, mainly when Netty native 
libraries are loaded and used. Other native libraries are also affected, such 
as Conscrypt 
(https://github.com/apache/pulsar/pull/23364#issuecomment-2380635032) and 
Snappy (#22804).
   
   Alpine maintainers don't recommend adding glibc to Alpine, since mixing real 
glibc into Alpine results in an unstable environment. In Alpine maintainer 
[Ariadne Conill's 
words](https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/24647#note_176723):
 _"Combining glibc and musl runtimes is basically all but guaranteed to create 
an unstable environment, unless the system is appropriately configured (glibc 
side uses glibc binaries only, and vice versa)."_
   
   * blog post by Ariadne Conill, [there is no such thing as a "glibc based 
alpine 
image"](https://ariadne.space/2021/08/26/there-is-no-such-thing-as-a-glibc-based-alpine-image/)
   * comments explaining reasons:
     * 
https://github.com/docker-library/official-images/pull/10779#issuecomment-906095981
     * 
https://github.com/docker-library/official-images/pull/10779#issuecomment-906240756
     * 
https://gitlab.alpinelinux.org/alpine/aports/-/merge_requests/24647#note_176723
   
   The glibc solution was originally added to support the Pulsar IO Kinesis 
connector. The Amazon Kinesis Producer Library isn't pure Java: its Java API 
calls a native executable that is built for glibc. Amazon doesn't provide a 
binary for Alpine, but the source code of the native executable is available. 
The stable way to use the Amazon Kinesis Producer Library on Alpine is to 
compile this binary specifically for Alpine.
   
   ### Modifications
   
   This PR includes changes to:
   - remove the glibc-package solution
   - provide an Amazon Kinesis Producer Library (KPL) executable compiled for 
Alpine
     - There's a Dockerfile that downloads and compiles KPL version 0.15.12 for 
Alpine.
       - The resulting docker image has been published to 
[`apachepulsar/pulsar-io-kinesis-sink-kinesis_producer:0.15.12`](https://hub.docker.com/r/apachepulsar/pulsar-io-kinesis-sink-kinesis_producer/tags).
         - This will need to be updated only when the Kinesis Producer Library 
is upgraded within Pulsar.
   - The `kinesis_producer` executable is copied from the 
`apachepulsar/pulsar-io-kinesis-sink-kinesis_producer:0.15.12` image to the 
`apachepulsar/pulsar-all` image, and the environment variable 
`PULSAR_IO_KINESIS_KPL_PATH` is set to the executable path.
   - The Amazon Kinesis Producer Library has been upgraded from version 0.14.13 
to 0.15.12.
     - Version 0.14.13 contains several critical issues, including a potential 
data loss issue.
     - In 0.15.12, the implementation uses AWS STS (Security Token Service) 
under the covers.
       - This required updating the Localstack configuration of the unit and 
integration tests to include STS support and overriding the endpoints in the 
Pulsar IO Kinesis Sink connector.
   - The Pulsar IO Kinesis Sink connector has been modified to support the AWS 
Kinesis Producer Library parameter `nativeExecutable`. When the 
`PULSAR_IO_KINESIS_KPL_PATH` env var is set, its value is used as the default 
value of the `nativeExecutable` parameter. This is how the Pulsar IO Kinesis 
Sink connector uses the `kinesis_producer` binary compiled for Alpine when 
running in the `pulsar-all` image.
   
   - In the Pulsar docker image, `LD_PRELOAD=/lib/libgcompat.so.0` is set. This 
is required to support loading Netty native libraries on Alpine unless the JVM 
already loads the `gcompat` library, which provides a `glibc` compatibility 
layer for Alpine. The Alpine `gcompat` library is the recommended option for 
Netty native libraries on Alpine.
     - More details in comments: 
https://github.com/netty/netty-tcnative/issues/907#issuecomment-2548228536 and 
https://github.com/apache/pulsar/issues/23717#issuecomment-2548242000
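
   For illustration, the default-value behavior of `nativeExecutable` described 
above could be sketched roughly as follows. This is a minimal sketch and not the 
connector's actual code: the class name `KplNativeExecutableResolver` and the 
`resolve` method are hypothetical, and it assumes that an explicitly configured 
value takes precedence over the env var and that the KPL falls back to its 
bundled binary when no path is provided.

```java
import java.util.Map;

// Hypothetical helper, not the connector's actual code: it only illustrates
// the precedence assumed for the KPL `nativeExecutable` setting.
final class KplNativeExecutableResolver {

    // Environment variable named in this PR description.
    static final String KPL_PATH_ENV = "PULSAR_IO_KINESIS_KPL_PATH";

    private KplNativeExecutableResolver() {
    }

    /**
     * Returns the path to use for `nativeExecutable`, or null when neither
     * source provides one (assumption: the KPL then uses its bundled binary).
     */
    static String resolve(String configuredValue, Map<String, String> env) {
        if (configuredValue != null && !configuredValue.isBlank()) {
            return configuredValue; // explicit connector config wins (assumed)
        }
        String fromEnv = env.get(KPL_PATH_ENV);
        if (fromEnv != null && !fromEnv.isBlank()) {
            return fromEnv; // env var supplies the default value
        }
        return null; // nothing configured
    }

    public static void main(String[] args) {
        // Resolve against the real process environment and print the result.
        String path = resolve(null, System.getenv());
        System.out.println("nativeExecutable = " + path);
    }
}
```

   Passing the environment as a `Map` keeps the lookup easy to test without 
modifying the process environment.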
   
   ### Documentation
   
   <!-- DO NOT REMOVE THIS SECTION. CHECK THE PROPER BOX ONLY. -->
   
   - [ ] `doc` <!-- Your PR contains doc changes. -->
   - [ ] `doc-required` <!-- Your PR changes impact docs and you will update 
later -->
   - [x] `doc-not-needed` <!-- Your PR changes do not impact docs -->
   - [ ] `doc-complete` <!-- Docs have been already added -->

