Re: [DISCUSS] KIP-388 Add observer interface to record request and response

Dong Lin Sat, 17 Nov 2018 14:38:24 -0800

Hey Lincong,

The main concern for the original version of the KIP is that, if we expose
internal classes such as RequestChannel.Request and AbstractResponse, it
will be difficult to make change to those internal classes in the future.


In the latest version of the KIP, it seems that user is going to provide
implementation of classes such as RequestAdapter, ResponseAdapter, which
again interact with internal classes RequestChannel.Request and
AbstractResponse directly. So it seems that we still have the same issue of
exposing the same internal classes to the user plugin implementation?

Here is some high level idea of how to make this work without exposing
internal classes. Basically we want to have the same information as
currently contained in AbstractRequest, AbstractResponse and RequestHeader.
These classes can be replaced with the following fields. The transformation
between ByteBuffer and AbstractRequest/AbstractResponse/RequestHeader is
totally specified by the schema of each request/response which is already
public interface in Kafka. So this approach will not expose internal
classes.

AbstractRequest -> int apiKeyId, short apiVersion, ByteBuffer buffer
AbstractResponse -> int apiKeyId, short apiVersion, ByteBuffer buffer
RequestHeader -> short requsetHeaderSchemaVersion, ByteBuffer buffer

And user are also free to compile their plugin together with the Kafka
source code so that they can just do the following without re-writing the
serialization/deserialization logic:

apiKey.parseRequest(apiVersion, buffer)
Struct struct = apiKey.parseRequest(apiVersion, buffer);
AbstractRequest requset = AbstractRequest.parseRequest(apiKey, apiVersion,
struct);

Currently I am not sure whether it is going to be expensive to do
conversion between Struct and ByteBuffer. My understanding is that this is
not relatively expensive because the main CPU overhead in broker appears to
be e.g. decompression, re-compression, SSL. But if it is expensive, it may
be reasonable to replace ByteBuffer with Struct, where Struct already
contain fields such as Schema. I have not carefully considered this
directly. The final solution probably needs to explicitly specify whether
we want to keep binary compatibility and source compatibility. Hopefully
this is good direction to move this KIP forward.

Thanks,
Dong

On Wed, Nov 14, 2018 at 10:41 AM Lincong Li <andrewlinc...@gmail.com> wrote:

> Hi Wesley,
>
> Thank you very much for your feedback. The concern on memory pressure is
> definitely valid. However it should be the user's job to keep this concern
> in mind and implement the observer in the most reasonable way for their use
> case. In other words, implement it at their own risks.
>
> The alternative approach to mitigate this concern is to implement adapters
> in a way that it extracts all information required by all its getters at
> initialization when a reference to request/response is given so that
> references to request/response could be garbaged collected. However, this
> proactive initialization might be wasteful. For example, methods such as "
> public Map<TopicPartition, Long> getFetchFromTopicPartitionSizeInBytes()"
> takes non-constant time.
>
> Best regards,
> Lincong Li
>
> On Tue, Nov 13, 2018 at 5:56 PM xiongqi wu <xiongq...@gmail.com> wrote:
>
> > Lincong,
> >
> > Thanks for the KIP.
> > I have a question about the lifecycle of request and response.
> > With the current (requestAdapter, responseAdapter) implementation,
> > the observer can potentially extend the lifecycle of request and response
> > through adapter.
> > Anyone can implement own observer, and some observers may want to do
> async
> > process or batched processing.
> >
> > Could you clarify how could we make sure this do not increase the memory
> > pressure on potentially holding large request/response object?
> >
> >
> >
> > Xiongqi (Wesley) Wu
> >
> >
> > On Mon, Nov 12, 2018 at 10:23 PM Lincong Li <andrewlinc...@gmail.com>
> > wrote:
> >
> > > Thanks Mayuresh, Ismael and Colin for your feedback!
> > >
> > > I updated the KIP basing on your feedback. The change is basically that
> > two
> > > interfaces are introduced to prevent the internal classes from being
> > > exposed. These two interfaces contain getters that allow user to
> extract
> > > information from request and response in their own implementation(s) of
> > the
> > > observer interface and they would not constraint future implementation
> > > changes in neither RequestChannel.Request nor AbstractResponse. There
> > could
> > > be more getters defined in these two interfaces. The implementation of
> > > these two interfaces will be provided as part of the KIP.
> > >
> > > I also expanded on "Add code to the broker (in KafkaApis) to allow
> Kafka
> > > servers to invoke any
> > > observers defined. More specifically, change KafkaApis code to invoke
> all
> > > defined observers, in the order in which they were defined, for every
> > > request-response pair" by providing a sample code block which shows how
> > > these interfaces are used in the KafkaApis class.
> > >
> > > Let me know if you have any question, concern, or comments. Thank you
> > very
> > > much!
> > >
> > > Best regards,
> > > Lincong Li
> > >
> > > On Fri, Nov 9, 2018 at 10:34 AM Mayuresh Gharat <
> > > gharatmayures...@gmail.com>
> > > wrote:
> > >
> > > > Hi Lincong,
> > > >
> > > > Thanks for the KIP.
> > > >
> > > > As Colin pointed out, would it better to expose certain specific
> pieces
> > > of
> > > > information from the request/response like api key, request headers,
> > > record
> > > > counts, client ID instead of the entire request/response objects ?
> This
> > > > enables us to change the request response apis independently of this
> > > > pluggable public API, in future, unless you think we have a strong
> > reason
> > > > that we need to expose the request, response objects.
> > > >
> > > > Also, it would be great if you can expand on :
> > > > "Add code to the broker (in KafkaApis) to allow Kafka servers to
> invoke
> > > any
> > > > observers defined. More specifically, change KafkaApis code to invoke
> > all
> > > > defined observers, in the order in which they were defined, for every
> > > > request-response pair."
> > > > probably with an example of how you visualize it. It would help the
> KIP
> > > to
> > > > be more concrete and easier to understand the end to end workflow.
> > > >
> > > > Thanks,
> > > >
> > > > Mayuresh
> > > >
> > > > On Thu, Nov 8, 2018 at 7:44 PM Ismael Juma <ism...@juma.me.uk>
> wrote:
> > > >
> > > > > I agree, the current KIP doesn't discuss the public API that we
> would
> > > be
> > > > > exposing and it's extensive if the normal usage would allow for
> > casting
> > > > > AbstractRequest into the various subclasses and potentially even
> > > > accessing
> > > > > Records and related for produce request.
> > > > >
> > > > > There are many use cases where this could be useful, but it
> requires
> > > > quite
> > > > > a bit of thinking around the APIs that we expose and the expected
> > > usage.
> > > > >
> > > > > Ismael
> > > > >
> > > > > On Thu, Nov 8, 2018, 6:09 PM Colin McCabe <cmcc...@apache.org
> wrote:
> > > > >
> > > > > > Hi Lincong Li,
> > > > > >
> > > > > > I agree that server-side instrumentation is helpful.  However, I
> > > don't
> > > > > > think this is the right approach.
> > > > > >
> > > > > > The problem is that RequestChannel.Request and AbstractResponse
> are
> > > > > > internal classes that should not be exposed.  These are
> > > implementation
> > > > > > details that we may change in the future.  Freezing these into a
> > > public
> > > > > API
> > > > > > would really hold back the project.  For example, for really
> large
> > > > > > responses, we might eventually want to avoid materializing the
> > whole
> > > > > > response all at once.  It would make more sense to return it in a
> > > > > streaming
> > > > > > fashion.  But if we need to support this API forever, we can't do
> > > that.
> > > > > >
> > > > > > I think it's fair to say that this is, at best, half a solution
> to
> > > the
> > > > > > problem of tracing requests.  Users still need to write the
> plugin
> > > code
> > > > > and
> > > > > > arrange for it to be on their classpath to make this work.  I
> think
> > > the
> > > > > > alternative here is not client-side instrumentation, but simply
> > > making
> > > > > the
> > > > > > change to the broker without using a plugin interface.
> > > > > >
> > > > > > If a public interface is absolutely necessary here we should
> expose
> > > > only
> > > > > > things like the API key, client ID, time, etc. that don't
> constrain
> > > the
> > > > > > implementation a lot in the future.  I think we should also use
> > java
> > > > here
> > > > > > to avoid the compatibility issues we have had with Scala APIs in
> > the
> > > > > past.
> > > > > >
> > > > > > best,
> > > > > > Colin
> > > > > >
> > > > > >
> > > > > > On Thu, Nov 8, 2018, at 11:34, radai wrote:
> > > > > > > another downside to client instrumentation (beyond the number
> of
> > > > > > > client codebases one would need to cover) is that in a large
> > > > > > > environments you'll have a very long tail of applications using
> > > older
> > > > > > > clients to upgrade - it would be a long and disruptive process
> > (as
> > > > > > > opposed to updating broker-side instrumentation)
> > > > > > > On Thu, Nov 8, 2018 at 11:04 AM Peter M. Elias <
> > > > petermel...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > I know we have a lot of use cases for this type of
> > functionality
> > > at
> > > > > my
> > > > > > > > enterprise deployment. I think it's helpful for maintaining
> > > > > > reliability of
> > > > > > > > the cluster especially and identifying clients that are not
> > > > properly
> > > > > > tuned
> > > > > > > > and therefore applying excessive load to the brokers.
> > > Additionally,
> > > > > > there
> > > > > > > > is a bit of a dark spot without something like as currently.
> > For
> > > > > > example,
> > > > > > > > if a client is not using a consumer group, there is no direct
> > way
> > > > to
> > > > > > query
> > > > > > > > the state of the consumer without looking at raw network
> > > > connections
> > > > > to
> > > > > > > > determine the extent of the traffic generated by that
> > particular
> > > > > > consumer.
> > > > > > > >
> > > > > > > > While client instrumentation can certainly help with this
> > > > currently,
> > > > > > given
> > > > > > > > that Kafka is intended to be a shared service across a
> > > potentially
> > > > > very
> > > > > > > > large surface area of clients, central observation of client
> > > > activity
> > > > > > is in
> > > > > > > > my opinion an essential feature.
> > > > > > > >
> > > > > > > > Peter
> > > > > > > >
> > > > > > > > On Thu, Nov 8, 2018 at 12:13 PM radai <
> > > radai.rosenbl...@gmail.com>
> > > > > > wrote:
> > > > > > > >
> > > > > > > > > bump.
> > > > > > > > >
> > > > > > > > > I think the proposed API (Observer) is useful for any sort
> of
> > > > > > > > > multi-tenant environment for chargeback and reporting
> > purposes.
> > > > > > > > >
> > > > > > > > > if no one wants to comment, can we initiate a vote?
> > > > > > > > > On Mon, Nov 5, 2018 at 6:31 PM Lincong Li <
> > > > andrewlinc...@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > Hi everyone. Here
> > > > > > > > > > <
> > > > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://cwiki.apache.org/confluence/display/KAFKA/KIP-388%3A+Add+observer+interface+to+record+request+and+response
> > > > > > > > > >
> > > > > > > > > > is
> > > > > > > > > > my KIP. Any feedback is appreciated.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > > Lincong Li
> > > > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > > --
> > > > -Regards,
> > > > Mayuresh R. Gharat
> > > > (862) 250-7125
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-388 Add observer interface to record request and response

Reply via email to