Hi Olasoji, Thanks for the update.
1. Could you add these as code examples to the KIP, and not as prose? See [1-3] for some examples of well-written public interfaces. 2. I find the proposed API surface area excessive, and think we need to narrow it down. Looking at what's merged on trunk right now and not including your proposed modifications, this KIP includes: * 2 interfaces * 12 classes * 5 constructors * 8 static methods * 2 static variables * 18 unique instance methods Committing to long-term stability of this many classes and interfaces as public API will make it hard to initially review it, and innovate on it in the future. 4. Is there a way to modify the existing signatures/implementations such that we don't have to expose any subclasses of Compression as public API, only the CompressionTypes? 6. What does "first provider loaded" mean to users? AFAIU, this looks arbitrary to users, as it depends on classpath ordering, filesystem ordering, etc. 8. I'm trying to understand under what conditions this can be attacked (threat model), and what mitigations Kafka needs to provide (implementation). Is Kafka providing special filtering? Is Kafka providing the custom class loader? 6, 7, 8. Please add these details to the KIP so that other reviewers don't need to follow the discussion thread so closely. 9. The testing section looks like it is written for individuals implementing accelerator plugins. Could you also write about how the Kafka side of the implementation will be tested? Thanks, Greg [1] https://cwiki.apache.org/confluence/display/KAFKA/KIP-98+-+Exactly+Once+Delivery+and+Transactional+Messaging [2] https://cwiki.apache.org/confluence/display/KAFKA/KIP-405%3A+Kafka+Tiered+Storage [3] https://cwiki.apache.org/confluence/display/KAFKA/KIP-877%3A+Mechanism+for+plugins+and+connectors+to+register+metrics On Fri, Feb 21, 2025 at 12:15 AM Denloye, Olasoji <olasoji.denl...@intel.com> wrote: > Hi Greg, Thanks for your response. > > 1. Some changes will be made to the Compression and Builder interfaces. At > least one default method, isAvailable(), would be added to the Builder > interface. The static methods in the Compression interface that create and > return specific builders would be modified to return a new builder or a > loaded builder from a service provider if it exists. The return types may > need to be more generic e.g Builder<GzipCompression> vs > GzipCompression.Builder. > 2. These classes are appropriate to add to the API surface as they may be > required to develop efficient compression implementations. I will modify > the KIP. > 4. This is because the Compression interface (currently) expects a Builder > of type 'GzipCompression.Builder'. I agree that alternative implementations > need not extend the base implementation with some modifications to the > Compression interface. However if an alternative implementation is > registered as a service provider of GzipCompression then it must be of type > 'CompressionType.GZIP' and it must be fully compatible with the base > implementation of GzipCompression. > 5. So far no, but 'Configurable' looks promising > 6. If no specific configuration is set, then the first provider loaded and > available will be used. If a specific configuration is set then that > specific provider will be used if available else the base implementation. > 7. Yes, a ConfigException will be thrown if the class specified in the > config is not found on the classpath. > 8. This can be implemented by using special filtering or a custom class > loader. The concern here is that any class on the class path (trusted or > not) could be loaded which could intercept messages or corrupt messages by > compressing using an unknown algorithm. > > -----Original Message----- > From: Greg Harris <greg.har...@aiven.io.INVALID> > Sent: Tuesday, February 18, 2025 1:58 PM > To: dev@kafka.apache.org > Subject: Re: [DISCUSS] KIP-1121: Compression acceleration in Kafka > > Hi Olasoji, > > Thanks for the KIP! I find it addresses many of the concerns I had with > KIP-984. Thank you for following up on that! > > 1. Since the Compression interface is internal currently, it may contain > some components that were not designed with pluggability in mind, such as > the static methods. Are these interfaces being published as-is, or are some > changes going to be made first? > 2. Similarly, the ByteBufferOutputStream, CompressionType, BufferSupplier, > ByteBufferInputStream, ChunkedByteStream, and GzipOutputStream classes in > the signature or implementations of Compression are not public API > currently, and would need to be made public along with Compression. Are > these classes appropriate to add to our API surface? > 3. Could you adjust the KIP structure/formatting to make the public API > surface area more clear? Some prose in "Proposed Changes" could become code > examples in "Public Interfaces". > 4. What is the significance of this statement: "Any new service provider > should extend the base implementation and thus return the same > CompressionType". Are we expecting that `o instanceof GzipCompression == > o.type().equals(CompressionType.GZIP)` at all times? If an alternative > implementation doesn't need any part of the upstream implementation, does > it make sense to force them to extend one another? > 5. Is this interface expected to be composed with any other plugin > mix-ins, such as Configurable, Reconfigurable, Closeable, Monitorable, > Versioned, etc? > 6. If there are multiple accelerators present, which one will be chosen? > 7. If the configuration is set but the accelerator implementation isn't > available, does that cause a ConfigException? > 8. Could you expand on this statement: "Due to security and/or > compatibility concerns it is possible to limit this feature to only load > approved service providers (e.g those distributed with Kafka)". How will > this be implemented? What threat model does this assume? > > Thanks, > Greg > > On Tue, Feb 18, 2025 at 11:41 AM Denloye, Olasoji < > olasoji.denl...@intel.com> > wrote: > > > Hi, I would like to bring this back to everyone's attention. This KIP > > introduces a simple framework that enables the use of alternative, > > faster implementations of the existing compression codecs (Gzip, > > Snappy, Zstandard and LZ4) in Kafka. > > > > https://cwiki.apache.org/confluence/x/Xgr0Ew > > > > Thanks > > > > -----Original Message----- > > From: Denloye, Olasoji <olasoji.denl...@intel.com> > > Sent: Thursday, February 13, 2025 1:22 PM > > To: dev@kafka.apache.org > > Subject: RE: [DISCUSS] KIP-1121: Compression acceleration in Kafka > > > > Hi, I would like to bring this back to everyone's attention. I have > > made changes to the proposal to incorporate the ServiceLoader API as > suggested. > > > > https://cwiki.apache.org/confluence/x/Xgr0Ew > > > > Thanks > > > > -----Original Message----- > > From: Denloye, Olasoji > > Sent: Thursday, December 19, 2024 11:01 AM > > To: dev@kafka.apache.org > > Subject: RE: [DISCUSS] KIP-1121: Compression acceleration in Kafka > > > > Hi Frédérik, Thanks for the suggestion to use the existing > > ServiceLoader feature in Java. It could serve as suitable mechanism > > for selecting the best available compression codec. > > > > Thanks > > > > -----Original Message----- > > From: Frédérik Rouleau <froul...@confluent.io.INVALID> > > Sent: Wednesday, December 18, 2024 12:24 AM > > To: dev@kafka.apache.org > > Subject: Re: [DISCUSS] KIP-1121: Compression acceleration in Kafka > > > > Hi Olasoji, > > > > I think that having a way to provide some optimised/accelerated > > implementation is a good idea. > > What not using the ServiceLoader feature of Java? That would allow > > anyone to provide their own optimised implementation depending on > > their infrastructure. We can keep the existing implementation as the > > default and allow Kafka to find a better implementation based on > > Service found. As you mentioned, we might need to enrich the interface > > to allow Kafka to find the best version available at runtime. > > > > Regards, > > > > On Tue, Dec 17, 2024 at 6:36 PM Denloye, Olasoji < > > olasoji.denl...@intel.com> > > wrote: > > > > > Hi Everyone, > > > > > > I would like to start a discussion on KIP-1121: Compression > > > acceleration acceleration in Kafka. KIP-1121: Compression > > > acceleration in Kafka - Apache Kafka - Apache Software Foundation< > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-1121%3A+Compre > > > ss > > > ion+acceleration+in+Kafka > > > > > > > > > >