Re: [tabled patch] abstract out TCP-write code
On 09/22/2010 10:37 PM, Pete Zaitcev wrote: On Wed, 22 Sep 2010 21:26:13 -0400 Jeff Garzik wrote: So, we go a longer route and re-hook the list of completions to a per-server global instead of a client. The patch is straight- forward. The only thing we need to be careful is to make sure that no outstanding completions are left in the queue before freeing a client struct. This is ensured by force-running completions. Looking at this change again, I don't see how this avoids use-after-free. If completions exist after state change function leads one to cli_evt_dispose() -> cli_free(), then cli_write_run_compl() still calls cli_write_free() with the stale 'cli' pointer. We run completions before freeing in all cases. My patch was correct. Logically, if completions are run before freeing in all cases, there is no need to make write_compl_q global. That was a red herring, which by side effect avoided the bug with the stale 'cli' pointer. Jeff -- To unsubscribe from this list: send the line "unsubscribe hail-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [tabled patch] abstract out TCP-write code
On Wed, 22 Sep 2010 21:26:13 -0400 Jeff Garzik wrote: > > So, we go a longer route and re-hook the list of completions > > to a per-server global instead of a client. The patch is straight- > > forward. The only thing we need to be careful is to make sure > > that no outstanding completions are left in the queue before > > freeing a client struct. This is ensured by force-running completions. > Looking at this change again, I don't see how this avoids > use-after-free. If completions exist after state change function leads > one to cli_evt_dispose() -> cli_free(), then cli_write_run_compl() still > calls cli_write_free() with the stale 'cli' pointer. We run completions before freeing in all cases. My patch was correct. > It seems like the real fix is to have the functions in the FSM loop > return an additional piece of information, indicating that 'cli' is no > longer valid. That's kinda backwards. Might as well add refcounts. > client_write's object lifetime should always be a subset of client's > object lifetime. Sure. But that's an argument for refcounting struct cli. > > Speaking of backpointers, I think it would be much cleaner > > to get rid of two-argument format for callback. It stinks of > > special-casing. Either throw a back pointer into the first word > > of buf, or create some kind of object/struct passed as > > argument to atcp_writeq(), that's what I would do. > > It is a common idiom even in GLib that callbacks receive two anonymous > pointers; witness the data type GFunc's 'data' and 'user_data' > arguments: > http://library.gnome.org/devel/glib/stable/glib-Doubly-Linked-Lists.html#GFunc There's a lot of retarged garbage in Glib, just look at their lists. If someone smarter wrote Glib, we would not need struct list_head. Heck in fact queue.h existed before Glib and they still blew it. Oh god and their hash is no better. > This is because typically one has two objects -- a parent and a child -- > with different object lifetime. [] I think your excuses are getting more sophisticated than the code is worth. If you don't see how 2 arguments are special-casing, fine. Just say you like it. -- Pete -- To unsubscribe from this list: send the line "unsubscribe hail-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [tabled patch] abstract out TCP-write code
On 09/22/2010 08:28 PM, Pete Zaitcev wrote: On Wed, 22 Sep 2010 20:09:08 -0400 Jeff Garzik wrote: @@ -1410,7 +1224,7 @@ static void tcp_cli_event(int fd, short events, void *userdata) do { loop = cli->evt_table[cli->state](cli, events); - loop |= cli_write_run_compl(); + loop |= atcp_write_run_compl(&cli->wst); } while (loop); } This cannot be right. Please see commit d1a45fca7908b7128ed4fe2ab61f02ee938f: tabled: fix running completions over disposed cli Miracluously this never actually crashed on me, but I added unrelated debugging printout into the dispatch routine and it printed weird values. Then it dawned on me that a state change function may dispose of the struct cli, in which case cli_write_run_compl is use-after-free. It may seem that checking if the old state was evt_dispose before running cli_write_run_compl is an expedient fix, but that does not work, because we do not always dispose of the cli in such case. If the cli to be disposed still has anything in the queue, we need to continue to deliver events, and for that we have to run outstanding completions. So, we go a longer route and re-hook the list of completions to a per-server global instead of a client. The patch is straight- forward. The only thing we need to be careful is to make sure that no outstanding completions are left in the queue before freeing a client struct. This is ensured by force-running completions. One other necessary change was to add a back poiter from a completion to the current client. This is because one caller needed the client pointer (object_get_more). Thanks for the reminder. Looking at this change again, I don't see how this avoids use-after-free. If completions exist after state change function leads one to cli_evt_dispose() -> cli_free(), then cli_write_run_compl() still calls cli_write_free() with the stale 'cli' pointer. It seems like the real fix is to have the functions in the FSM loop return an additional piece of information, indicating that 'cli' is no longer valid. client_write's object lifetime should always be a subset of client's object lifetime. Speaking of backpointers, I think it would be much cleaner to get rid of two-argument format for callback. It stinks of special-casing. Either throw a back pointer into the first word of buf, or create some kind of object/struct passed as argument to atcp_writeq(), that's what I would do. It is a common idiom even in GLib that callbacks receive two anonymous pointers; witness the data type GFunc's 'data' and 'user_data' arguments: http://library.gnome.org/devel/glib/stable/glib-Doubly-Linked-Lists.html#GFunc This is because typically one has two objects -- a parent and a child -- with different object lifetime. When considering both tabled and itd, you see example of this in the need for struct client (tabled) or struct session (itd) pointers on occasion, in cli_writeq calls. Jeff -- To unsubscribe from this list: send the line "unsubscribe hail-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Re: [tabled patch] abstract out TCP-write code
On Wed, 22 Sep 2010 20:09:08 -0400 Jeff Garzik wrote: > @@ -1410,7 +1224,7 @@ static void tcp_cli_event(int fd, short events, void > *userdata) > > do { > loop = cli->evt_table[cli->state](cli, events); > - loop |= cli_write_run_compl(); > + loop |= atcp_write_run_compl(&cli->wst); > } while (loop); > } This cannot be right. Please see commit d1a45fca7908b7128ed4fe2ab61f02ee938f: tabled: fix running completions over disposed cli Miracluously this never actually crashed on me, but I added unrelated debugging printout into the dispatch routine and it printed weird values. Then it dawned on me that a state change function may dispose of the struct cli, in which case cli_write_run_compl is use-after-free. It may seem that checking if the old state was evt_dispose before running cli_write_run_compl is an expedient fix, but that does not work, because we do not always dispose of the cli in such case. If the cli to be disposed still has anything in the queue, we need to continue to deliver events, and for that we have to run outstanding completions. So, we go a longer route and re-hook the list of completions to a per-server global instead of a client. The patch is straight- forward. The only thing we need to be careful is to make sure that no outstanding completions are left in the queue before freeing a client struct. This is ensured by force-running completions. One other necessary change was to add a back poiter from a completion to the current client. This is because one caller needed the client pointer (object_get_more). Speaking of backpointers, I think it would be much cleaner to get rid of two-argument format for callback. It stinks of special-casing. Either throw a back pointer into the first word of buf, or create some kind of object/struct passed as argument to atcp_writeq(), that's what I would do. -- Pete -- To unsubscribe from this list: send the line "unsubscribe hail-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
[tabled patch] abstract out TCP-write code
This is step #1. Other steps in the process: 2) update server/atcp.c and itd/*.c in tandem, until they have matching TCP-write code. should be straightforward, as both are based on the same codebase (tabled). 3) move atcp to libhail 4) remove TCP-write code from tabled, itd 5) update atcp to support SSL, sendfile 6) update chunkd to support atcp (req. step #5) All this code bears the same lineage, so it shouldn't be too difficult. Also note, this is a first draft with embedded libevent dependencies. I agree w/ zaitcev that the goal should be to eliminate these. atcp is wonderfully generic at present; not even a GLib dependency, IIRC. server/Makefile.am |1 server/atcp.c | 243 + server/atcp.h | 90 +++ server/bucket.c|8 - server/object.c| 56 ++-- server/server.c| 237 +-- server/tabled.h| 37 +--- 7 files changed, 400 insertions(+), 272 deletions(-) diff --git a/server/Makefile.am b/server/Makefile.am index 5b53a0a..5e0abd5 100644 --- a/server/Makefile.am +++ b/server/Makefile.am @@ -4,6 +4,7 @@ INCLUDES= -I$(top_srcdir)/include @GLIB_CFLAGS@ @HAIL_CFLAGS@ sbin_PROGRAMS = tabled tdbadm tabled_SOURCES = tabled.h \ + atcp.c atcp.h \ bucket.c cldu.c config.c metarep.c object.c replica.c \ server.c status.c storage.c storparse.c util.c tabled_LDADD = ../lib/libtdb.a \ diff --git a/server/atcp.c b/server/atcp.c new file mode 100644 index 000..dac5b91 --- /dev/null +++ b/server/atcp.c @@ -0,0 +1,243 @@ + +/* + * Copyright 2010 Red Hat, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; see the file COPYING. If not, write to + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. + * + */ + +#define _GNU_SOURCE +#include "tabled-config.h" + +#include +#include +#include +#include +#include "atcp.h" + +bool atcp_cb_free(void *cb_data1, void *cb_data2, bool done) +{ + /* in typical usage, cb_data1 is the owner of cb_data2, + and has a longer lifetime. Therefore, by convention, + cb_data2 is the buffer to be released. +*/ + free(cb_data2); + return false; +} + +static void atcp_write_complete(struct atcp_write *tmp) +{ + struct atcp_wr_state *wst = tmp->wst; + + list_del(&tmp->node); + list_add_tail(&tmp->node, &wst->write_compl_q); +} + +static bool atcp_write_free(struct atcp_write *tmp, bool done) +{ + struct atcp_wr_state *wst = tmp->wst; + bool rcb = false; + + wst->write_cnt -= tmp->length; + list_del_init(&tmp->node); + if (tmp->cb) + rcb = tmp->cb(tmp->cb_data1, tmp->cb_data2, done); + free(tmp); + + return rcb; +} + +bool atcp_write_run_compl(struct atcp_wr_state *wst) +{ + struct atcp_write *wr; + bool do_loop; + + do_loop = false; + while (!list_empty(&wst->write_compl_q)) { + wr = list_entry(wst->write_compl_q.next, + struct atcp_write, node); + do_loop |= atcp_write_free(wr, true); + } + return do_loop; +} + +void atcp_write_free_all(struct atcp_wr_state *wst) +{ + struct atcp_write *wr, *tmp; + + atcp_write_run_compl(wst); + list_for_each_entry_safe(wr, tmp, &wst->write_q, node) { + atcp_write_free(wr, false); + } +} + +size_t atcp_wqueued(struct atcp_wr_state *wst) +{ + return wst->write_cnt; +} + +static bool atcp_writable(struct atcp_wr_state *wst) +{ + int n_iov; + struct atcp_write *tmp; + ssize_t rc; + struct iovec iov[ATCP_MAX_WR_IOV]; + + /* accumulate pending writes into iovec */ + n_iov = 0; + list_for_each_entry(tmp, &wst->write_q, node) { + if (n_iov == ATCP_MAX_WR_IOV) + break; + /* bleh, struct iovec should declare iov_base const */ + iov[n_iov].iov_base = (void *) tmp->buf; + iov[n_iov].iov_len = tmp->togo; + n_iov++; + } + + /* execute non-blocking write */ +do_write: + rc = writev(wst->fd, iov, n_iov); + if (rc < 0) { + if (errno == EINTR) + goto do_write; + if (errno != EAGAIN) + goto err_out; +
[tabled patch 2/2] test/start-daemon: Ignore stale .pid files.
The effect of this is to proceed with "make check" if - stale file is present, and - no process uses the PID recorded therein. So, the worst that can happen is a revert to old behavior if an unrelated process reuses the PID. Should make no difference for RPM builds at all. Signed-off-by: Jim Meyering Signed-off-by: Pete Zaitcev --- test/start-daemon | 10 +++--- 1 file changed, 7 insertions(+), 3 deletions(-) You know, if we decide that we want to permit "make check" on unclean tree (which is where this patch aims), we might want to rm -r test/data too. Otherwise code lock conflicts are inevitable and check is likely to fail anyway. diff --git a/test/start-daemon b/test/start-daemon index da653d6..6985468 100755 --- a/test/start-daemon +++ b/test/start-daemon @@ -1,10 +1,14 @@ #!/bin/sh -for d in cld chunkd tabled; do +for d in cld chunkd tabled +do if [ -f $d.pid ] then - echo "$d.pid file found. daemon still running?" - exit 1 + if kill -0 `cat $d.pid` 2> /dev/null + then + echo "$d.pid file found. daemon still running?" >&2 + exit 1 + fi fi done -- To unsubscribe from this list: send the line "unsubscribe hail-devel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html