On Sun, Mar 25, 2018 at 6:16 AM, Ben Pfaff <[email protected]> wrote:

> I applied this series to master, with the exception of ovn-ctl, which
> needs some extra work and will come in a separate patch to be posted
> later.
>
>
Hi Ben,

I am on it. I will address the review comments and post a new patch.

Thanks
Numan


> On Fri, Mar 23, 2018 at 03:27:24PM -0700, Ben Pfaff wrote:
> > Thank you for the review.  As before, I am applying most of your
> > comments and not bothering to respond to them, so below you can find
> > what needed a reply.
> >
> > On Sat, Feb 24, 2018 at 03:17:51PM -0800, Justin Pettit wrote:
> > > > On Dec 31, 2017, at 9:16 PM, Ben Pfaff <[email protected]> wrote:
> > > > diff --git a/ovsdb/raft-private.c b/ovsdb/raft-private.c
> > > > new file mode 100644
> > > > index 000000000000..457d1292a949
> > > > --- /dev/null
> > > > +++ b/ovsdb/raft-private.c
> > > >
> > > > +struct ovsdb_error * OVS_WARN_UNUSED_RESULT
> > > > +raft_address_validate(const char *address)
> > > > +{
> > > > ...
> > > > +        return ovsdb_error(NULL, "%s: expected \"tcp\" or \"ssl\"
> address",
> > > > +                           address);
> > >
> > > I assume you're downplaying "unix:", since it only makes sense for
> testing purposes?
> >
> > Yes.  I don't want to encourage users to use it, since it isn't useful
> > for distribution but only for testing..
> >
> > > > +char *
> > > > +raft_address_to_nickname(const char *address, const struct uuid
> *sid)
> > > > +{
> > >
> > > I think it may be worth explaining what a nickname is and how one gets
> one.
> >
> > OK, I added a comment.
> >
> > > > +void
> > > > +raft_servers_format(const struct hmap *servers, struct ds *ds)
> > > > +{
> > > > +    int i = 0;
> > > > +    const struct raft_server *s;
> > > > +    HMAP_FOR_EACH (s, hmap_node, servers) {
> > > > +        if (i++) {
> > > > +            ds_put_cstr(ds, ", ");
> > > > +        }
> > > > +        ds_put_format(ds, SID_FMT"(%s)", SID_ARGS(&s->sid),
> s->address);
> > >
> > > Do you think it's worth putting a space between the SID and
> parenthetical address?
> >
> > It's actually more readable the way it is.  I spent a lot of time
> > reading this stuff.
> >
> > > > +static void
> > > > +raft_append_request_from_jsonrpc(struct ovsdb_parser *p,
> > > > +                                 struct raft_append_request *rq)
> > > > +{
> > > > ...
> > > > +
> > > > +    const struct json *log = ovsdb_parser_member(p, "log",
> OP_ARRAY);
> > > > +    if (!log) {
> > > > +        return;
> > > > +    }
> > >
> > > Should this parser argument include OP_OPTIONAL?
> >
> > No, an append request always includes a log entry array (although I
> > guess it could be an empty array).  Can you help me to understand what
> > makes you think so?  Perhaps there is some inconsistency here that I do
> > not yet see, that I should fix.
> >
> > > > +static void
> > > > +raft_append_reply_to_jsonrpc(const struct raft_append_reply *rpy,
> > > > +                             struct json *args)
> > > > +{
> > > > ...
> > > > +    json_object_put_string(args, "result",
> > > > +                           raft_append_result_to_string(
> rpy->result));
> > >
> > > I believe raft_append_result_to_string() can return NULL, which could
> > > cause a problem here.
> >
> > The code shouldn't try to send an invalid RPC; if it does, then a
> > segfault will allow the problem to be diagnosed about as well as an
> > assertion failure would.
> >
> > > > +static void
> > > > +raft_format_append_reply(const struct raft_append_reply *rpy,
> struct ds *s)
> > > > +{
> > > > ...
> > > > +    ds_put_format(s, " result=\"%s\"",
> > > > +                  raft_append_result_to_string(rpy->result));
> > >
> > > I don't think this will crash, but it will print strange if
> > > 'rpy->result' is not valid.
> >
> > The code shouldn't try to print an invalid RPC; such an RPC, if
> > received, would fail to demarshal before it got to that point.
> >
> > > Is it intentional that "n_entries" and "prev_log_*" aren't printed?
> >
> > They aren't very useful for debugging and clutter the output.
> >
> > > > +void
> > > > +raft_rpc_format(const union raft_rpc *rpc, struct ds *s)
> > > > +{
> > > > +    ds_put_format(s, "%s", raft_rpc_type_to_string(rpc->type));
> > > > +    if (rpc->common.comment) {
> > > > +        ds_put_format(s, " \"%s\"", rpc->common.comment);
> > > > +    }
> > > > +    ds_put_char(s, ':');
> > >
> > > Is printing the SID not important?
> >
> > It's better to let the caller print it, since it has more context that
> > allows for more human-friendly names (e.g. those nicknames you asked
> > about).
> >
> > I added a comment.
> >
> > > > +/* Creates a database file that represents a new server in an
> existing Raft
> > > > + * cluster.
> > > > + *
> > > > + * Creates the local copy of the cluster's log in 'file_name',
> which must not
> > > > + * already exist.  Gives it the name 'name', which must be the same
> name
> > > > + * passed in to raft_create_cluster() earlier.
> > >
> > > Is raft_create_cluster() supposed to be called before this?  I think
> > > raft_create_cluster() would have already created the log file, which
> > > the previous sentence said must not already exist.
> >
> > Every server needs a log file, not just the first server in the cluster.
> >
> > > > + * This only creates the on-disk file and does no network access.
> Use
> > > > + * raft_open() to start operating the new server.  (Until this
> happens, the
> > > > + * new server has not joined the cluster.)
> > >
> > > The name for this function (and the corresponding ovsdb-tool command)
> > > seems a bit confusing in this way.  Not a big deal, but it doesn't
> > > seem obvious.
> >
> > I'm open to better names, can you suggest one?
> >
> > > > + * Returns null if successful, otherwise an ovsdb_error describing
> the
> > > > + * problem. */
> > > > +struct ovsdb_error * OVS_WARN_UNUSED_RESULT
> > > > +raft_join_cluster(const char *file_name,
> > > > +                  const char *name, const char *local_address,
> > > > +                  const struct sset *remote_addresses,
> > > > +                  const struct uuid *cid)
> > > > +{
> > > > ...
> > > > +    /* Write log file. */
> > > > +    struct raft_header h = {
> > > > +        .sid = uuid_random(),
> > > > +        .cid = cid ? *cid : UUID_ZERO,
> > > > +        .name = xstrdup(name),
> > > > +        .local_address = xstrdup(local_address),
> > > > +        .joining = true,
> > > > +    };
> > > > +    sset_clone(&h.remote_addresses, remote_addresses);
> > > > +    error = ovsdb_log_write_and_free(log, raft_header_to_json(&h));
> > > > +    raft_header_uninit(&h);
> > >
> > > I don't think anything initializes 'h->snap', which seems like it
> > > could be a problem for raft_header_to_json() and raft_header_uninit().
> >
> > In C, an initializer always initializes every member, and those
> > functions appear to properly handle nulls.
> >
> > I added a comment.
> >
> > > > +static struct json *
> > > > +raft_servers_for_index(const struct raft *raft, uint64_t index)
> > >
> > > I think it would be worth mentioning that the caller must free the
> > > return value.
> >
> > Good catch.  The existing caller didn't, so there was a bug here.
> > Should be fixed now.
> >
> > > > +static void
> > > > +raft_add_conn(struct raft *raft, struct jsonrpc_session *js,
> > > > +              const struct uuid *sid, bool incoming)
> > > > +{
> > > > ...
> > > > +    if (sid) {
> > > > +        conn->sid = *sid;
> > > > +    }
> > > > +    conn->nickname = raft_address_to_nickname(jsonr
> pc_session_get_name(js),
> > > > +                                              &conn->sid);
> > >
> > > The 'sid' argument can be null, which seems like it could be a problem
> with raft_address_to_nickname().
> >
> > Yes, the nickname can end up being 0000 for a while, but it changes to a
> > better one on the first received RPC.  It's not ideal, admittedly, but
> > it's clear enough in the logs.
> >
> > > > +static bool
> > > > +raft_conn_receive(struct raft *raft, struct raft_conn *conn,
> > > > +                  union raft_rpc *rpc)
> > > > +{
> > > > ...
> > > > +    } else if (!uuid_equals(&conn->sid, &rpc->common.sid)) {
> > > > +        static struct vlog_rate_limit rl = VLOG_RATE_LIMIT_INIT(1,
> 5);
> > > > +        VLOG_WARN_RL(&rl,
> > > > +                     "%s: remote server ID changed from "SID_FMT"
> to "SID_FMT,
> > > > +                     jsonrpc_session_get_name(conn->js),
> > > > +                     SID_ARGS(&conn->sid),
> SID_ARGS(&rpc->common.sid));
> > >
> > > If the SID changed, shouldn't we update "conn->sid"?
> >
> > It's more of an error message.  The sid shouldn't change.  I updated the
> > code to make that clearer, and to discard the message.
> >
> > > > +static void
> > > > +raft_run_reconfigure(struct raft *raft)
> > > > +{
> > > > ...
> > > > +    /* Remove a server, if one is scheduled for removal. */
> > > > +    HMAP_FOR_EACH (s, hmap_node, &raft->servers) {
> > > > +        if (s->phase == RAFT_PHASE_REMOVE) {
> > > > +            hmap_remove(&raft->servers, &s->hmap_node);
> > > > +            raft->remove_server = s;
> > >
> > > Are we sure that 'raft->remove_server' will only be assigned once?  I
> > > think it might be worth asserting that it's null before assigning
> > > here.
> >
> > We can be sure that it is null because we set it to null earlier in the
> > function.
> >
> > > > +/* Returns true if 'raft' has grown enough that reducing the log to
> a snapshot
> > > > + * would be valuable, false otherwise.  When this function returns
> true, the
> > > > + * client should consider using raft_store_snapshot() to reduce the
> log storage
> > > > + * requirements. */
> > > > +bool
> > > > +raft_grew_lots(const struct raft *raft)
> > > > +{
> > > > +    return (!raft->joining
> > > > +            && !raft->leaving
> > > > +            && !raft->left
> > > > +            && !raft->failed
> > > > +            && raft->last_applied - raft->log_start >= 100
> > > > +            && ovsdb_log_grew_lots(raft->log));
> > >
> > > ovsdb_log_grew_lots() is based on the filesize.  Raft entries are
> > > bigger.  It's probably not significant, but should this difference be
> > > taken that into account?
> >
> > I think that doubling the size of a file should be independent of the
> > size of the individual entries.  It's meant to be, anyhow.
> >
> > > > +/* Replaces the log for 'raft', up to the last log entry read, by
> > > > + * 'new_snapshot_data'.  Returns NULL if successful, otherwise an
> error that
> > > > + * the caller must eventually free. */
> > > > +struct ovsdb_error * OVS_WARN_UNUSED_RESULT
> > > > +raft_store_snapshot(struct raft *raft, const struct json
> *new_snapshot_data)
> > >
> > > I think it would be worth mentioning that this function doesn't take
> > > ownership of 'new_snapshot_data'.
> >
> > I still think that 'const' is a good enough hint.  I always assume that
> > const arguments don't transfer ownership, unless there's a comment to
> > the contrary.
> >
> > > Some final random things:
> > >
> > >     - It might be nice to add some tests that use "tcp" instead of
> "unix" sockets.  I thought I noticed some instances in the code that could
> be problematic.
> > >
> > >     - I think corner cases, such as 'raft->failed' getting set to true
> aren't handled in the client code.
> > >
> > >     - It's very difficult to determine whether an argument is consumed
> or not, or whether the return value should be freed.  Better documentation
> (particularly for public functions) would help.
> > >
> > > Thanks for implementing this important feature!
> > >
> > > Acked-by: Justin Pettit <[email protected]>
> >
> > Thanks.  I'll look at those last things and polish this a bit more and
> > then commit it to master.
> _______________________________________________
> dev mailing list
> [email protected]
> https://mail.openvswitch.org/mailman/listinfo/ovs-dev
>
_______________________________________________
dev mailing list
[email protected]
https://mail.openvswitch.org/mailman/listinfo/ovs-dev

Reply via email to