HI Colin.  Thanks for the KIP.  Here is some feedback and various questions.

"*Controller processes will listen on a separate port from brokers.  This
will be true even when the broker and controller are co-located in the same
JVM*". I assume it is possible that the port numbers could be the same when
using separate JVMs (i.e. broker uses port 9192 and controller also uses
port 9192).  I think it would be clearer to state this along these
lines: "Controller
nodes will listen on a port, and the controller port must differ from any
port that a broker in the same JVM is listening on.  In other words, a
controller and a broker node, when in the same JVM, do not share ports"

I think the sentence "*In the realm of ACLs, this translates to controllers
requiring CLUSTERACTION on CLUSTER for all operations*" is confusing.  It
feels to me that you can just delete it.  Am I missing something here?

The KIP states "*The metadata will be stored in memory on all the active
controllers.*"  Can there be multiple active controllers?  Should it
instead read "The metadata will be stored in memory on all potential
controllers." (or something like that)?

KIP-595 states "*we have assumed the name __cluster_metadata for this
topic, but this is not a formal part of this proposal*".  This KIP-631
states "*Metadata changes need to be persisted to the __metadata log before
we propagate them to the other nodes in the cluster.  This means waiting
for the metadata log's last stable offset to advance to the offset of the
change.*"  Are we here formally defining "__metadata" as the topic name,
and should these sentences refer to "__metadata topic" rather than
"__metadata log"?  What are the "other nodes in the cluster" that are
referred to?  These are not controller nodes but brokers, right?  If so,
then should we say "before we propagate them to the brokers"?  Technically
we have a controller cluster and a broker cluster -- two separate clusters,
correct?  (Even though we could potentially share JVMs and therefore
require no additional processes.). If the statement is referring to nodes
in both clusters then maybe we should state "before we propagate them to
the other nodes in the controller cluster or to brokers."

"*The controller may have several of these uncommitted changes in flight at
any given time.  In essence, the controller's in-memory state is always a
little bit in the future compared to the current state.  This allows the
controller to continue doing things while it waits for the previous changes
to be committed to the Raft log.*"  Should the three references above be to
the active controller rather than just the controller?

"*Therefore, the controller must not make this future state "visible" to
the rest of the cluster until it has been made persistent – that is, until
it becomes current state*". Again I wonder if this should refer to "active"
controller, and indicate "anyone else" as opposed to "the rest of the
cluster" since we are talking about 2 clusters here?

"*When the active controller decides that it itself should create a
snapshot, it will first try to give up the leadership of the Raft quorum.*"
Why?  Is it necessary to state this?  It seems like it might be an
implementation detail rather than a necessary constraint/requirement that
we declare publicly and would have to abide by.

"*It will reject brokers whose metadata is too stale*". Why?  An example
might be helpful here.

"*it may lose subsequent conflicts if its broker epoch is stale*" This is
the first time a "broker epoch" is mentioned.  I am assuming it is the
controller epoch communicated to it (if any).  It would be good to
introduce it/explicitly state what it is before referring to it.

Ron

On Tue, Jul 7, 2020 at 6:48 PM Colin McCabe <cmcc...@apache.org> wrote:

> Hi all,
>
> I posted a KIP about how the quorum-based controller envisioned in KIP-500
> will work.  Please take a look here:
> https://cwiki.apache.org/confluence/x/4RV4CQ
>
> best,
> Colin
>

Reply via email to