On Wed, May 6, 2015 at 10:40 AM, Jiangjie Qin <j...@linkedin.com.invalid>
wrote:

> Jun & Ewen,
>
> Thanks a lot for the comments. Both of them look better than my original
> plan.
>
> Jun, one downside about not changing NetworkClient but use a different
> metadata implementation is that NetworkClient will still have all the
> metadata related code there which makes it a little bit weird. I think
> Ewen¹s approach solve this problem.
>
> Ewen, If I understand correctly, you are proposing something similar to
> the structure we are using in
> AbstractFetcherThread/ConsumerFetcherThread/ReplicaFetcherThread. That
> will make NetworkClient look clean but increase the layers which is
> probably OK.
>

Yes, same basic idea.


>
> Inspired by your suggestions. I have another thought which seems closer to
> Jun¹s idea. What if we move maybeUpdateMetadata()/handleMetadataResponse()
> and related logic in NetworkClient to metadata and pass in NetworkClient
> as an argument. Like Jun suggested, we need a Metadata interface and
> different implementations.
>

This could work too. I think it ends up with the new Metadata interface and
NetworkClient being more coupled than layered. What I was hoping to do was
to remove the concept of Metadata entirely from NetworkClient so it only
deals with Nodes and is focused on the request/response handling. But that
might require introducing a listener interface (if Metadata was refactored
to be passed a NetworkClient it would use to initiate metadata refreshes)
or provide the right hooks so that behavior can be overridden (if Metadata
was to subclass NetworkClient).

I think all these options could work and it's not clear to me which will
work out best in separating the code, so I'd just try one of them and see
how the patch looks.

-Ewen


>
> Thanks.
>
> Jiangjie (Becket) Qin
>
>
>
> On 5/5/15, 11:31 PM, "Ewen Cheslack-Postava" <e...@confluent.io> wrote:
>
> >+1 on trying to reuse the NetworkClient code.
> >
> >I think Jun's approach could work, but I'm wondering if refactoring a bit
> >could get better separation of concerns without a somewhat awkward nop
> >implementation of Metadata. I'm not sure what combination of delegation or
> >subclassing makes sense yet, but here's another approach that I think
> >could
> >work:
> >
> >* Get rid of metadata stuff from NetworkClient. Add a subclass that also
> >manages all the metadata. (Since it's used for both producer and consumer,
> >the obvious name that I first jumped to is ClientNetworkClient, but
> >somehow
> >I think we can come up with something less confusing.)
> >* leastLoadedNode is the only caller of metadata.fetch() in that class,
> >maybeUpdateMetadata is the only caller of leastLoadedNode,
> >maybeUpdateMetadata is only called in poll when a combination of metadata
> >related timeouts end up being 0. These can be safely refactored into the
> >subclass with one override of poll(). Same with metadataFetchInProgress
> >assuming the rest of the changes below.
> >* Some of the default implementations (e.g. handleMetadataResponse) can be
> >left nops in NetworkClient and moved to the subclass.
> >* Others can be overridden to call the super method then take the
> >additional action necessary (e.g., on disconnect, move the metadata update
> >request to the subclass).
> >* Making the timeout handling in poll() work for both NetworkClient and
> >the
> >new base class might be the messiest part and might require breaking down
> >the implementation of poll into multiple methods.
> >* isReady uses metadataFetchInProgress and gets a timeout from the
> >Metadata
> >class. We can just override this method as well, though I feel like
> >there's
> >probably a cleaner solution.
> >
> >-Ewen
> >
> >
> >On Tue, May 5, 2015 at 4:54 PM, Jun Rao <j...@confluent.io> wrote:
> >
> >> Hi, Jiangjie,
> >>
> >> Thanks for taking on this.
> >>
> >> I was thinking that one way to decouple the dependency on Metadata in
> >> NetworkClient is the following.
> >> 1. Make Metadata an interface.
> >> 2. Rename current Metadata class to sth like KafkaMetadata that
> >>implements
> >> the Metadata interface.
> >> 3. Have a new NoOpMetadata class that implements the Metadata interface.
> >> This class
> >> 3.1 does nothing for any write method
> >> 3.2 returns max long for any method that asks for a timestamp
> >> 3.3. returns an empty Cluster for fetch().
> >>
> >> Then we can leave NetworkClient unchanged and just pass in a
> >>NoOpMetadata
> >> when using NetworkClient in the controller. The consumer/producer client
> >> will be using KafkaMetadata.
> >>
> >> As for replica fetchers, it may be possible to use KafkaConsumer.
> >>However,
> >> we don't need the metadata and the offset management. So, perhaps it's
> >> easier to just use NetworkClient. Also, currently, there is one replica
> >> fetcher thread per source broker. By using NetworkClient, we can change
> >> that to using a single thread for all source brokers. This is probably a
> >> bigger change. So, maybe we can do it later.
> >>
> >> Jun
> >>
> >>
> >> I think we probably need to replace replica fetcher with NetworkClient
> >>as
> >> well. Replica fetcher gets leader from the controller and therefore
> >>doesn't
> >>
> >> On Tue, May 5, 2015 at 1:37 PM, Jiangjie Qin <j...@linkedin.com.invalid
> >
> >> wrote:
> >>
> >> > I am trying to see if we can reuse the NetworkClient class to be used
> >>in
> >> > controller to broker communication. (Also, we can probably use
> >> > KafkaConsumer which is already using NetworkClient in replica
> >>fetchers).
> >> > Currently NetworkClient does the following things in addition to
> >>sending
> >> > requests.
> >> >
> >> >   1.  Connection state management.
> >> >   2.  Flow control (inflight requests)
> >> >   3.  Metadata refresh
> >> >
> >> > In controller we need (1) and (2) but not (3). NetworkClient is
> >>tightly
> >> > coupled with metadata now and this is the major blocker of reusing
> >> > NetworkClient in controller.  For controller, we don¹t need
> >>NetworkClient
> >> > to manage any metadata because the controller has listeners to monitor
> >> the
> >> > cluster state and has all the information about topic metadata.
> >> > I am thinking we can add a disable metadata refresh flag to
> >>NetworkClient
> >> > or set metadata refresh interval to be Long.MAX_VALUE, so the metadata
> >> will
> >> > be managed outside NetworkClient.
> >> > This needs minimal change to allow NetworkClient to be reused, but the
> >> > ugly part is NetworkClient still has the entire Metadata while it
> >> actually
> >> > only needs a NodeList.
> >> >
> >> > Want to see what do people think about this.
> >> >
> >> > Thanks.
> >> >
> >> > Jiangjie (Becket) Qin
> >> >
> >> >
> >>
> >
> >
> >
> >--
> >Thanks,
> >Ewen
>
>


-- 
Thanks,
Ewen

Reply via email to