Re: [tabled patch] abstract out TCP-write code

2010-09-22 Thread Jeff Garzik

On 09/22/2010 10:37 PM, Pete Zaitcev wrote:

On Wed, 22 Sep 2010 21:26:13 -0400
Jeff Garzik  wrote:


  So, we go a longer route and re-hook the list of completions
  to a per-server global instead of a client. The patch is straight-
  forward. The only thing we need to be careful is to make sure
  that no outstanding completions are left in the queue before
  freeing a client struct. This is ensured by force-running completions.



Looking at this change again, I don't see how this avoids
use-after-free.  If completions exist after state change function leads
one to cli_evt_dispose() ->  cli_free(), then cli_write_run_compl() still
calls cli_write_free() with the stale 'cli' pointer.


We run completions before freeing in all cases. My patch was correct.


Logically, if completions are run before freeing in all cases, there is 
no need to make write_compl_q global.  That was a red herring, which by 
side effect avoided the bug with the stale 'cli' pointer.


Jeff


--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [tabled patch] abstract out TCP-write code

2010-09-22 Thread Pete Zaitcev
On Wed, 22 Sep 2010 21:26:13 -0400
Jeff Garzik  wrote:

> >  So, we go a longer route and re-hook the list of completions
> >  to a per-server global instead of a client. The patch is straight-
> >  forward. The only thing we need to be careful is to make sure
> >  that no outstanding completions are left in the queue before
> >  freeing a client struct. This is ensured by force-running completions.

> Looking at this change again, I don't see how this avoids 
> use-after-free.  If completions exist after state change function leads 
> one to cli_evt_dispose() -> cli_free(), then cli_write_run_compl() still 
> calls cli_write_free() with the stale 'cli' pointer.

We run completions before freeing in all cases. My patch was correct.

> It seems like the real fix is to have the functions in the FSM loop 
> return an additional piece of information, indicating that 'cli' is no 
> longer valid.

That's kinda backwards. Might as well add refcounts.

> client_write's object lifetime should always be a subset of client's 
> object lifetime.

Sure. But that's an argument for refcounting struct cli.

> > Speaking of backpointers, I think it would be much cleaner
> > to get rid of two-argument format for callback. It stinks of
> > special-casing. Either throw a back pointer into the first word
> > of buf, or create some kind of object/struct passed as
> > argument to atcp_writeq(), that's what I would do.
> 
> It is a common idiom even in GLib that callbacks receive two anonymous 
> pointers; witness the data type GFunc's 'data' and 'user_data' 
> arguments: 
> http://library.gnome.org/devel/glib/stable/glib-Doubly-Linked-Lists.html#GFunc

There's a lot of retarged garbage in Glib, just look at their lists.
If someone smarter wrote Glib, we would not need struct list_head.
Heck in fact queue.h existed before Glib and they still blew it.
Oh god and their hash is no better.

> This is because typically one has two objects -- a parent and a child -- 
> with different object lifetime. []

I think your excuses are getting more sophisticated than the code
is worth. If you don't see how 2 arguments are special-casing, fine.
Just say you like it.

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [tabled patch] abstract out TCP-write code

2010-09-22 Thread Jeff Garzik

On 09/22/2010 08:28 PM, Pete Zaitcev wrote:

On Wed, 22 Sep 2010 20:09:08 -0400
Jeff Garzik  wrote:


@@ -1410,7 +1224,7 @@ static void tcp_cli_event(int fd, short events, void 
*userdata)

do {
loop = cli->evt_table[cli->state](cli, events);
-   loop |= cli_write_run_compl();
+   loop |= atcp_write_run_compl(&cli->wst);
} while (loop);
  }


This cannot be right. Please see commit
d1a45fca7908b7128ed4fe2ab61f02ee938f:

 tabled: fix running completions over disposed cli

 Miracluously this never actually crashed on me, but I added unrelated
 debugging printout into the dispatch routine and it printed weird
 values. Then it dawned on me that a state change function may dispose
 of the struct cli, in which case cli_write_run_compl is use-after-free.

 It may seem that checking if the old state was evt_dispose before
 running cli_write_run_compl is an expedient fix, but that does not
 work, because we do not always dispose of the cli in such case.
 If the cli to be disposed still has anything in the queue, we
 need to continue to deliver events, and for that we have to
 run outstanding completions.

 So, we go a longer route and re-hook the list of completions
 to a per-server global instead of a client. The patch is straight-
 forward. The only thing we need to be careful is to make sure
 that no outstanding completions are left in the queue before
 freeing a client struct. This is ensured by force-running completions.

 One other necessary change was to add a back poiter from a completion
 to the current client. This is because one caller needed the client
 pointer (object_get_more).


Thanks for the reminder.

Looking at this change again, I don't see how this avoids 
use-after-free.  If completions exist after state change function leads 
one to cli_evt_dispose() -> cli_free(), then cli_write_run_compl() still 
calls cli_write_free() with the stale 'cli' pointer.


It seems like the real fix is to have the functions in the FSM loop 
return an additional piece of information, indicating that 'cli' is no 
longer valid.


client_write's object lifetime should always be a subset of client's 
object lifetime.




Speaking of backpointers, I think it would be much cleaner
to get rid of two-argument format for callback. It stinks of
special-casing. Either throw a back pointer into the first word
of buf, or create some kind of object/struct passed as
argument to atcp_writeq(), that's what I would do.


It is a common idiom even in GLib that callbacks receive two anonymous 
pointers; witness the data type GFunc's 'data' and 'user_data' 
arguments: 
http://library.gnome.org/devel/glib/stable/glib-Doubly-Linked-Lists.html#GFunc


This is because typically one has two objects -- a parent and a child -- 
with different object lifetime.  When considering both tabled and itd, 
you see example of this in the need for struct client (tabled) or struct 
session (itd) pointers on occasion, in cli_writeq calls.


Jeff



--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [tabled patch] abstract out TCP-write code

2010-09-22 Thread Pete Zaitcev
On Wed, 22 Sep 2010 20:09:08 -0400
Jeff Garzik  wrote:

> @@ -1410,7 +1224,7 @@ static void tcp_cli_event(int fd, short events, void 
> *userdata)
>  
>   do {
>   loop = cli->evt_table[cli->state](cli, events);
> - loop |= cli_write_run_compl();
> + loop |= atcp_write_run_compl(&cli->wst);
>   } while (loop);
>  }

This cannot be right. Please see commit
d1a45fca7908b7128ed4fe2ab61f02ee938f:

tabled: fix running completions over disposed cli

Miracluously this never actually crashed on me, but I added unrelated
debugging printout into the dispatch routine and it printed weird
values. Then it dawned on me that a state change function may dispose
of the struct cli, in which case cli_write_run_compl is use-after-free.

It may seem that checking if the old state was evt_dispose before
running cli_write_run_compl is an expedient fix, but that does not
work, because we do not always dispose of the cli in such case.
If the cli to be disposed still has anything in the queue, we
need to continue to deliver events, and for that we have to
run outstanding completions.

So, we go a longer route and re-hook the list of completions
to a per-server global instead of a client. The patch is straight-
forward. The only thing we need to be careful is to make sure
that no outstanding completions are left in the queue before
freeing a client struct. This is ensured by force-running completions.

One other necessary change was to add a back poiter from a completion
to the current client. This is because one caller needed the client
pointer (object_get_more).

Speaking of backpointers, I think it would be much cleaner
to get rid of two-argument format for callback. It stinks of
special-casing. Either throw a back pointer into the first word
of buf, or create some kind of object/struct passed as
argument to atcp_writeq(), that's what I would do.

-- Pete
--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[tabled patch] abstract out TCP-write code

2010-09-22 Thread Jeff Garzik

This is step #1.  Other steps in the process:

2) update server/atcp.c and itd/*.c in tandem, until they have matching
   TCP-write code.  should be straightforward, as both are based on the
   same codebase (tabled).
3) move atcp to libhail
4) remove TCP-write code from tabled, itd
5) update atcp to support SSL, sendfile
6) update chunkd to support atcp (req. step #5)

All this code bears the same lineage, so it shouldn't be too difficult.

Also note, this is a first draft with embedded libevent dependencies.  I
agree w/ zaitcev that the goal should be to eliminate these.  atcp is
wonderfully generic at present; not even a GLib dependency, IIRC.


 server/Makefile.am |1 
 server/atcp.c  |  243 +
 server/atcp.h  |   90 +++
 server/bucket.c|8 -
 server/object.c|   56 ++--
 server/server.c|  237 +--
 server/tabled.h|   37 +---
 7 files changed, 400 insertions(+), 272 deletions(-)

diff --git a/server/Makefile.am b/server/Makefile.am
index 5b53a0a..5e0abd5 100644
--- a/server/Makefile.am
+++ b/server/Makefile.am
@@ -4,6 +4,7 @@ INCLUDES= -I$(top_srcdir)/include @GLIB_CFLAGS@ 
@HAIL_CFLAGS@
 sbin_PROGRAMS  = tabled tdbadm
 
 tabled_SOURCES = tabled.h  \
+ atcp.c atcp.h \
  bucket.c cldu.c config.c metarep.c object.c replica.c \
  server.c status.c storage.c storparse.c util.c
 tabled_LDADD   = ../lib/libtdb.a   \
diff --git a/server/atcp.c b/server/atcp.c
new file mode 100644
index 000..dac5b91
--- /dev/null
+++ b/server/atcp.c
@@ -0,0 +1,243 @@
+
+/*
+ * Copyright 2010 Red Hat, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; see the file COPYING.  If not, write to
+ * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
+ *
+ */
+
+#define _GNU_SOURCE
+#include "tabled-config.h"
+
+#include 
+#include 
+#include 
+#include 
+#include "atcp.h"
+
+bool atcp_cb_free(void *cb_data1, void *cb_data2, bool done)
+{
+   /* in typical usage, cb_data1 is the owner of cb_data2,
+  and has a longer lifetime.  Therefore, by convention,
+  cb_data2 is the buffer to be released.
+*/
+   free(cb_data2);
+   return false;
+}
+
+static void atcp_write_complete(struct atcp_write *tmp)
+{
+   struct atcp_wr_state *wst = tmp->wst;
+
+   list_del(&tmp->node);
+   list_add_tail(&tmp->node, &wst->write_compl_q);
+}
+
+static bool atcp_write_free(struct atcp_write *tmp, bool done)
+{
+   struct atcp_wr_state *wst = tmp->wst;
+   bool rcb = false;
+
+   wst->write_cnt -= tmp->length;
+   list_del_init(&tmp->node);
+   if (tmp->cb)
+   rcb = tmp->cb(tmp->cb_data1, tmp->cb_data2, done);
+   free(tmp);
+
+   return rcb;
+}
+
+bool atcp_write_run_compl(struct atcp_wr_state *wst)
+{
+   struct atcp_write *wr;
+   bool do_loop;
+
+   do_loop = false;
+   while (!list_empty(&wst->write_compl_q)) {
+   wr = list_entry(wst->write_compl_q.next,
+   struct atcp_write, node);
+   do_loop |= atcp_write_free(wr, true);
+   }
+   return do_loop;
+}
+
+void atcp_write_free_all(struct atcp_wr_state *wst)
+{
+   struct atcp_write *wr, *tmp;
+
+   atcp_write_run_compl(wst);
+   list_for_each_entry_safe(wr, tmp, &wst->write_q, node) {
+   atcp_write_free(wr, false);
+   }
+}
+
+size_t atcp_wqueued(struct atcp_wr_state *wst)
+{
+   return wst->write_cnt;
+}
+
+static bool atcp_writable(struct atcp_wr_state *wst)
+{
+   int n_iov;
+   struct atcp_write *tmp;
+   ssize_t rc;
+   struct iovec iov[ATCP_MAX_WR_IOV];
+
+   /* accumulate pending writes into iovec */
+   n_iov = 0;
+   list_for_each_entry(tmp, &wst->write_q, node) {
+   if (n_iov == ATCP_MAX_WR_IOV)
+   break;
+   /* bleh, struct iovec should declare iov_base const */
+   iov[n_iov].iov_base = (void *) tmp->buf;
+   iov[n_iov].iov_len = tmp->togo;
+   n_iov++;
+   }
+
+   /* execute non-blocking write */
+do_write:
+   rc = writev(wst->fd, iov, n_iov);
+   if (rc < 0) {
+   if (errno == EINTR)
+   goto do_write;
+   if (errno != EAGAIN)
+   goto err_out;
+   

[tabled patch 2/2] test/start-daemon: Ignore stale .pid files.

2010-09-22 Thread Pete Zaitcev
The effect of this is to proceed with "make check" if
 - stale file is present, and
 - no process uses the PID recorded therein.
So, the worst that can happen is a revert to old behavior if an unrelated
process reuses the PID. Should make no difference for RPM builds at all.

Signed-off-by: Jim Meyering 
Signed-off-by: Pete Zaitcev 

---
 test/start-daemon |   10 +++---
 1 file changed, 7 insertions(+), 3 deletions(-)

You know, if we decide that we want to permit "make check" on unclean tree
(which is where this patch aims), we might want to rm -r test/data too.
Otherwise code  lock conflicts are inevitable and check is likely
to fail anyway.

diff --git a/test/start-daemon b/test/start-daemon
index da653d6..6985468 100755
--- a/test/start-daemon
+++ b/test/start-daemon
@@ -1,10 +1,14 @@
 #!/bin/sh
 
-for d in cld chunkd tabled; do
+for d in cld chunkd tabled
+do
if [ -f $d.pid ]
then
-   echo "$d.pid file found.  daemon still running?"
-   exit 1
+   if kill -0 `cat $d.pid` 2> /dev/null
+   then
+   echo "$d.pid file found.  daemon still running?" >&2
+   exit 1
+   fi
fi
 done
 
--
To unsubscribe from this list: send the line "unsubscribe hail-devel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html