Hail license changed
Hail license change was just pushed to the github hail repository. Jeff -- To unsubscribe from this list: send the line unsubscribe hail-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hail status and update (was Re: Question about hail)
(CC'd hail-devel list)

On 11/26/2012 02:28 AM, Hideki Yamane wrote:
> Hello hail upstream authors,
>
> I'm interested in porting hail (and Aeolus) to Debian, but have some questions about it.

Cool!

> Q: Is this project is still alive? if so, where is the current main site. Could you tell me the status, please?

The main site for source code is currently github.

    https://github.com/jgarzik/hail
    https://github.com/jgarzik/tabled
    https://github.com/jgarzik/itd
    https://github.com/jgarzik/nfs4d

The Hail home page is https://hail.wiki.kernel.org/ but that clearly has some stale links on it (see below).

The status is definitely "long pause" at the moment, as it got de-prioritized by my employer, but I still have a goal of completing the Paxos implementation in CLD, making it truly distributed. Patches are still accepted, and I want it to keep working on current platforms.

> - its upstream site hail.wiki.kernel.org points out some materials but those are empty at all.
>     slide audio: http://www.kernel.org/pub/media/talks/hail/
>     git repo: http://git.kernel.org/?p=daemon/distsrv/itd.git
>     source: http://www.kernel.org/pub/software/network/distsrv/

Hail was collateral damage in a kernel.org hack. No data was lost or compromised, but it took kernel.org months to recover even basic account services and git access. wikis took months longer after that.

I'm still waiting to see if anybody has an archive of old tarballs, because k.org was my canonical upstream storage location, with zero local ones.

Local git cryptographic integrity -- the canonical root of all the source code -- never disappeared, and was always verified and not-hacked. It briefly moved to github, but is now back at kernel.org.

The tarballs can conceivably be recovered by checking out a git tag, re-running autogen.sh, and then "make dist"... but with autoconf/automake/libtool upgrades over the years, tarball checksums might change using that method.

> and there's no mail in http://news.gmane.org/gmane.comp.distributed.hail.devel except spam.

Pete complained about this too. Not sure what to do about it -- the mail server is vger.kernel.org, same as LKML, with a postmaster who maintains the spam filter to similar setup and standards.

I was surprised at Pete's spam report at first, too. I never see any of the spam, only the rare hail-devel message, because of good spam filtering. Of course, the M-L archiving bots seem to keep it all.

> Q: Can you add special exemptions for OpenSSL to its license, and add "or later", too?
>
> - Its license is GPL-2, right? and it links to OpenSSL. However, openssl license is NOT compatible with GPL without special exemptions
>   see http://lintian.debian.org/tags/possible-gpl-code-linked-with-openssl.html
>   and http://www.openssl.org/support/faq.html#LEGAL2
>   So, we cannot distribute hail binary without its special exemptions.
>
> - GPL-2 and GPL-3 is NOT compatible. However, Image Warehouse (part of Aeolus) is licensed under GPL-3, and it seems to link to hail library (hstor.h). We should change
>     a) hail to "GPL-2 or later" or "GPL-3 (or later)"
>     b) iwhd to GPL-2
>   to avoid license incompatibility.

History: hail was GPL-2 only, following the lead of the kernel. But it sounds like this is impractical given the few existing users, so I am in favor of relicensing to "GPL-2 or later".

I disagree with the interpretation vis a vis openssl and several others do too. However, if this is an impediment to use, I would be happy to accept a pull request adding the openssl exemption language.

Jeff
Project Hail wikis alive again!
kernel.org fixed their wiki system, which means that all the k.org wikis are once again read-write! This includes Project Hail's home page, https://hail.wiki.kernel.org/

I hope to have the git repos moved back from https://github.com/jgarzik/ to kernel.org soon also.

Jeff
Re: Unable to log into Hail Wiki
On 01/25/2012 08:40 PM, Pete Zaitcev wrote:
> Jeff, looks like the wiki rots. The login points to this URL
> https://hail.wiki.kernel.org/articles/u/s/e/Special%7EUserLogin_94cd.html
> It returns 404. HALP?

Yes -- all kernel.org wikis are _still_ read-only, even this many months after the kernel.org breakin. ata.wiki.kernel.org has similar behavior.

Jeff
Re: [patch hail 1/1] Plug leak in hstor_parse_key
On 10/14/2011 01:34 PM, Pete Zaitcev wrote:
> Signed-off-by: Pete Zaitcev <zait...@kotori.zaitcev.us>
>
> ---
>  lib/hstor.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/lib/hstor.c b/lib/hstor.c
> index cb9c4da..5ce9b76 100644
> --- a/lib/hstor.c
> +++ b/lib/hstor.c
> @@ -761,7 +761,7 @@ void hstor_free_keylist(struct hstor_keylist *keylist)
>  static void hstor_parse_key(xmlDocPtr doc, xmlNode *node,
>  			    struct hstor_keylist *keylist)
>  {
> -	struct hstor_object *obj = calloc(1, sizeof(*obj));
> +	struct hstor_object *obj;
>  	xmlChar *xs;

good catch... applied
Re: Hail git tree location
On 10/04/2011 12:12 PM, Pete Zaitcev wrote:
> Are we going to have a git tree somewhere? It looks like our old one was purged from git.kernel.org.

Sorry, I should have posted. It was migrated along with the kernel.org trees to

    https://github.com/jgarzik/{hail,tabled,itd,nfs4d}

though kernel.org ones should be coming back.

Jeff
Re: FYI, rpc/ is gone from Fedora 15
On 05/05/2011 10:14 AM, Jim Meyering wrote:
> FYI, /usr/include/rpc/ no longer exists, as of F15's glibc-headers-2.13.90-10, so hail's lib/cld_msg_rpc.h will have to do something about this #include:
>
>     $ grep rpc.h lib/cld_msg_rpc.h
>     #include <rpc/rpc.h>

hm. Surely they did not delete sunrpc from glibc? That would be disappointing.

Jeff
[hail] CLD conversion to TCP lands in git
I just pushed the CLD protocol change (UDP -> TCP) to hail.git[1]. See the original post[2] for more details.

It seems pretty solid from my beating on it, but it's still raw code. The focus will be on hammering out the kinks in this switch over the next 7-10 days, so expect some breakage and churn during that time.

In particular, while tabled /should/ work, as the API has only seen minor changes, it will need a good stress test before I regain confidence in it. There was also an unrelated-to-CLD API change in libhail that requires a small tabled tweak[3], that will be attended-to this week.

Jeff

[1] http://git.kernel.org/?p=daemon/distsrv/hail.git;a=commit;h=3bdeaab68e1c2776a3488ac03f49f7b4bc2659c8
[2] http://marc.info/?l=hail-devel&m=129379489716486&w=2
[3] http://git.kernel.org/?p=daemon/distsrv/hail.git;a=commit;h=59becbb9e329cdc20e4894f331fcb8dfc104c35a
Re: [PATCH 2/3] CLD: switch network proto from UDP to TCP
On 01/02/2011 06:32 PM, Pete Zaitcev wrote:
> On Fri, 31 Dec 2010 05:57:28 -0500 Jeff Garzik <j...@garzik.org> wrote:
>
>> +	struct cldc_tcp *tcp = private;
>> +	ssize_t rc;
>> +	struct ubbp_header ubbp;
>> +
>> +	memcpy(ubbp.magic, "CLD1", 4);
>> +	ubbp.op_size = (buflen << 8) | 1;
>> +#ifdef WORDS_BIGENDIAN
>> +	swab32(ubbp.op_size);
>> +#endif
>> +
>> +	rc = write(tcp->fd, &ubbp, sizeof(ubbp));
>
> Why not this:
>
>	unsigned int n;
>	n = (buflen << 8) | 1;
>	ubbp.op_size = GUINT32_TO_LE(n);

Yep. I used the #ifdef on the read(2) side, where I did not want to create an additional var... then I copied that onto the write(2) side, where it is less efficient as you point out.

Jeff
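For readers following the byte-order discussion: a third way to write the field, besides the #ifdef and the glib macro, is to emit the bytes one at a time. The sketch below assumes a header of 4 magic bytes followed by a 32-bit little-endian field holding (length << 8) | opcode. The struct and function names are hypothetical and are not taken from include/ubbp.h.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the header in the patch: 4 magic bytes plus a
 * 32-bit field holding (payload length << 8) | opcode, little-endian on
 * the wire. Anything beyond what the quoted code shows is assumed. */
struct ubbp_hdr_sketch {
	char magic[4];
	unsigned char op_size[4];
};

/* Pack the header one byte at a time, so host endianness never matters. */
static void ubbp_pack_sketch(struct ubbp_hdr_sketch *h, uint32_t buflen,
			     uint8_t op)
{
	uint32_t n;

	assert(buflen <= 0xffffffu);	/* length must fit in 24 bits */
	n = (buflen << 8) | op;

	memcpy(h->magic, "CLD1", 4);
	h->op_size[0] = (unsigned char)(n & 0xff);
	h->op_size[1] = (unsigned char)((n >> 8) & 0xff);
	h->op_size[2] = (unsigned char)((n >> 16) & 0xff);
	h->op_size[3] = (unsigned char)((n >> 24) & 0xff);
}

/* Reverse of ubbp_pack_sketch(): recover payload length and opcode. */
static void ubbp_unpack_sketch(const struct ubbp_hdr_sketch *h,
			       uint32_t *buflen, uint8_t *op)
{
	uint32_t n = (uint32_t)h->op_size[0] |
		     ((uint32_t)h->op_size[1] << 8) |
		     ((uint32_t)h->op_size[2] << 16) |
		     ((uint32_t)h->op_size[3] << 24);

	*op = (uint8_t)(n & 0xff);
	*buflen = n >> 8;
}
```

Packing a length of 0x1234 with opcode 1 gives the wire bytes 01 34 12 00 after the magic, on any host. The cost is a few shifts per header, which is likely negligible next to the write(2) call.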
Re: Crash with db5
On 01/02/2011 08:20 PM, Pete Zaitcev wrote:
> Looks like Rawhide throws this if libdb-devel is in use:
>
> make check-TESTS
> make[3]: Entering directory `/q/zaitcev/hail/hail-tip/test/cld'
> PASS: prep-db
> DB_ENV->lsn_reset: method not permitted before handle's open method
> DB_ENV->dbremove: method not permitted before handle's open method
> cld[11548]: SIGSEGV
> PASS: start-daemon
> port file not found.
> FAIL: pid-exists
>
> libdb-5.1.19-2.fc15.x86_64

Are you compiling with db4 headers, but linking with db5? Or vice versa?

This is a problem I ran into, with F14. hail's configure script searches for the first libdb, which will always be libdb5 on >= F14, because libdb5 is always installed due to dependencies. But... you can either have db4-devel or libdb-devel installed for the devel pkg. If you have db4-devel + libdb5... boom.

Jeff
[PATCH 1/3] CLD: convert back to libevent
Switch CLD from hand-rolled server poll code, to libevent. Follows similar techniques and rationale as chunkd commit c1aed7464f237e5a6309351bf003162c77d69e27.

This reverts ancient commit 90b3b5edcf5aa00577f4395fdbb490ed7e9be824.

Signed-off-by: Jeff Garzik <jgar...@redhat.com>
---
 cld/Makefile.am |    3 -
 cld/cld.h       |   22 +++
 cld/server.c    |  161
 cld/session.c   |   69
 4 files changed, 118 insertions(+), 137 deletions(-)

diff --git a/cld/Makefile.am b/cld/Makefile.am
index 9a13ce0..30eea0b 100644
--- a/cld/Makefile.am
+++ b/cld/Makefile.am
@@ -12,7 +12,8 @@ cld_SOURCES = cldb.h cld.h \
 	cldb.c msg.c server.c session.c util.c
 cld_LDADD = \
 	../lib/libhail.la @GLIB_LIBS@ @CRYPTO_LIBS@ \
-	@SSL_LIBS@ @DB4_LIBS@ @XML_LIBS@ @LIBCURL@
+	@SSL_LIBS@ @DB4_LIBS@ @XML_LIBS@ @LIBCURL@ \
+	@EVENT_LIBS@
 
 cldbadm_SOURCES= cldb.h cldbadm.c
 cldbadm_LDADD = @CRYPTO_LIBS@ @GLIB_LIBS@ @DB4_LIBS@
diff --git a/cld/cld.h b/cld/cld.h
index 4c0099f..17f14b8 100644
--- a/cld/cld.h
+++ b/cld/cld.h
@@ -22,8 +22,9 @@
 
 #include <netinet/in.h>
 #include <sys/time.h>
-#include <poll.h>
+#include <event.h>
 #include <glib.h>
+#include <elist.h>
 #include "cldb.h"
 #include <cld_msg_rpc.h>
 #include <cld_common.h>
@@ -59,13 +60,13 @@ struct session {
 	uint64_t		last_contact;
 	uint64_t		next_fh;
 
-	struct cld_timer	timer;
+	struct event		timer;
 
 	uint64_t		next_seqid_in;
 	uint64_t		next_seqid_out;
 
 	GList			*out_q;		/* outgoing pkts (to client) */
-	struct cld_timer	retry_timer;
+	struct event		retry_timer;
 
 	char			user[CLD_MAX_USERNAME];
 
@@ -85,10 +86,10 @@ struct server_stats {
 	unsigned long		garbage;	/* num. garbage pkts dropped */
 };
 
-struct server_poll {
+struct server_socket {
 	int			fd;
-	bool			(*cb)(int fd, short events, void *userdata);
-	void			*userdata;
+	struct event		ev;
+	struct list_head	sockets_node;
 };
 
 struct server {
@@ -103,14 +104,13 @@ struct server {
 
 	struct cldb		cldb;		/* database info */
 
-	GArray			*polls;
-	GArray			*poll_data;
+	struct event_base	*evbase_main;
 
-	GHashTable		*sessions;
+	struct list_head	sockets;
 
-	struct cld_timer_list	timers;
+	GHashTable		*sessions;
 
-	struct cld_timer	chkpt_timer;	/* db4 checkpoint timer */
+	struct event		chkpt_timer;	/* db4 checkpoint timer */
 
 	struct server_stats	stats;		/* global statistics */
 };
diff --git a/cld/server.c b/cld/server.c
index 7a57785..aed501b 100644
--- a/cld/server.c
+++ b/cld/server.c
@@ -559,7 +559,7 @@ static void simple_sendresp(int sock_fd, const struct client *cli,
 		info->op);
 }
 
-static bool udp_srv_event(int fd, short events, void *userdata)
+static void udp_srv_event(int fd, short events, void *userdata)
 {
 	struct client cli;
 	char host[64];
@@ -586,7 +586,7 @@ static bool udp_srv_event(int fd, short events, void *userdata)
 	rrc = recvmsg(fd, &hdr, 0);
 	if (rrc < 0) {
 		syslogerr("UDP recvmsg");
-		return true; /* continue main loop; do NOT terminate server */
+		return;
 	}
 	cli.addr_len = hdr.msg_namelen;
 
@@ -601,59 +601,60 @@ static bool udp_srv_event(int fd, short events, void *userdata)
 
 	if (!parse_pkt_header(raw_pkt, rrc, &pkt, &hdr_len)) {
 		cld_srv.stats.garbage++;
-		return true;
+		return;
 	}
 
 	if (!get_pkt_info(&pkt, raw_pkt, rrc, hdr_len, &info)) {
 		xdr_free((xdrproc_t)xdr_cld_pkt_hdr, (char *)&pkt);
 		cld_srv.stats.garbage++;
-		return true;
+		return;
 	}
 
 	if (packet_is_dupe(&info)) {
 		/* silently drop dupes */
 		xdr_free((xdrproc_t)xdr_cld_pkt_hdr, (char *)&pkt);
-		return true;
+		return;
 	}
 
 	err = validate_pkt_session(&info, &cli);
 	if (err) {
 		simple_sendresp(fd, &cli, &info, err);
 		xdr_free((xdrproc_t)xdr_cld_pkt_hdr, (char *)&pkt);
-		return true;
+		return;
 	}
 
 	err = pkt_chk_sig(raw_pkt, rrc, &pkt);
 	if (err) {
 		simple_sendresp(fd, &cli, &info, err);
 		xdr_free((xdrproc_t)xdr_cld_pkt_hdr, (char *)&pkt);
-		return true;
+		return;
 	}
 
 	if (!(cld_srv.cldb.is_master
[PATCH 2/3] CLD: switch network proto from UDP to TCP
Convert CLD network protocol from UDP to TCP. Server, client lib, and chunkd's cldu module are all updated. tabled's cldu module must be updated also.

The original rationale for UDP use was following Google's lead, based on the advice in the original Chubby paper, describing TCP's back-off policies and other behavior during times of high network congestion. This seems a bit dubious without further third party evidence, and TCP vastly simplifies our lives.

While the code remains open and modular enough to support other protocols (hopefully RDMA or SCTP one day), this upgrade from UDP to TCP promises to make the current codebase much easier to use, while avoiding the "reinvent TCP, by using UDP" problem, which was a rabbit hole threatening CLD.

Signed-off-by: Jeff Garzik <jgar...@redhat.com>
---
 chunkd/cldu.c        |    6
 cld/cld.h            |   43 ++
 cld/msg.c            |    4
 cld/server.c         |  356 ---
 cld/session.c        |    4
 configure.ac         |    1
 include/Makefile.am  |    2
 include/cld_common.h |    4
 include/cldc.h       |   24 ++-
 include/ncld.h       |    4
 include/ubbp.h       |   52 +++
 lib/Makefile.am      |    2
 lib/cldc-dns.c       |    2
 lib/cldc-tcp.c       |  185 ++
 lib/cldc-udp.c       |  141
 lib/cldc.c           |   54 +++
 16 files changed, 595 insertions(+), 289 deletions(-)

diff --git a/chunkd/cldu.c b/chunkd/cldu.c
index 026c523..41f94b5 100644
--- a/chunkd/cldu.c
+++ b/chunkd/cldu.c
@@ -165,7 +165,7 @@ static void cldu_sess_event(void *priv, uint32_t what)
 		 */
 		if (cs->nsess) {
 			applog(LOG_ERR, "Session failed, sid " SIDFMT,
-			       SIDARG(cs->nsess->udp->sess->sid));
+			       SIDARG(cs->nsess->tcp->sess->sid));
 		} else {
 			applog(LOG_ERR, "Session open failed");
 		}
@@ -177,7 +177,7 @@ static void cldu_sess_event(void *priv, uint32_t what)
 	} else {
 		if (cs)
 			applog(LOG_INFO, "cldc event 0x%x sid " SIDFMT,
-			       what, SIDARG(cs->nsess->udp->sess->sid));
+			       what, SIDARG(cs->nsess->tcp->sess->sid));
 		else
 			applog(LOG_INFO, "cldc event 0x%x no sid", what);
 	}
@@ -372,7 +372,7 @@ static int cldu_set_cldc(struct cld_session *cs, int newactive)
 	}
 
 	applog(LOG_INFO, "New CLD session created, sid " SIDFMT,
-	       SIDARG(cs->nsess->udp->sess->sid));
+	       SIDARG(cs->nsess->tcp->sess->sid));
 
 	/*
 	 * First, make sure the base directory exists.
diff --git a/cld/cld.h b/cld/cld.h
index 17f14b8..b1f9bbf 100644
--- a/cld/cld.h
+++ b/cld/cld.h
@@ -30,6 +30,7 @@
 #include <cld_common.h>
 #include <hail_log.h>
 #include <hail_private.h>
+#include <ubbp.h>
 
 struct client;
 struct session_outpkt;
@@ -43,10 +44,39 @@ enum {
 	SFL_FOREGROUND		= (1 << 0),	/* run in foreground */
 };
 
+struct atcp_read {
+	void			*buf;
+	unsigned int		buf_size;
+	unsigned int		bytes_wanted;
+	unsigned int		bytes_read;
+
+	void			(*cb)(void *, bool);
+	void			*cb_data;
+
+	struct list_head	node;
+};
+
+struct atcp_read_state {
+	struct list_head	q;
+};
+
 struct client {
+	int			fd;
+
+	struct event		ev;
+	short			ev_mask;	/* EV_READ and/or EV_WRITE */
+
 	struct sockaddr_in6	addr;		/* inet address */
 	socklen_t		addr_len;	/* inet address len */
 	char			addr_host[64];	/* ASCII version of inet addr */
+	char			addr_port[16];	/* ASCII version of inet addr */
+
+	struct atcp_read_state	rst;
+
+	struct ubbp_header	ubbp;
+
+	char			raw_pkt[CLD_RAW_MSG_SZ];
+	unsigned int		raw_size;
 };
 
 struct session {
@@ -124,6 +154,17 @@ struct pkt_info {
 	size_t			hdr_len;
 };
 
+#define ___constant_swab32(x) ((uint32_t)(			\
+	(((uint32_t)(x) & (uint32_t)0x000000ffUL) << 24) |	\
+	(((uint32_t)(x) & (uint32_t)0x0000ff00UL) <<  8) |	\
+	(((uint32_t)(x) & (uint32_t)0x00ff0000UL) >>  8) |	\
+	(((uint32_t)(x) & (uint32_t)0xff000000UL) >> 24)))
+
+static inline uint32_t swab32(uint32_t v)
+{
+	return ___constant_swab32(v);
+}
+
 /* msg.c */
 extern int inode_lock_rescan(DB_TXN *txn, cldino_t inum);
 extern void msg_get(struct session *sess, const void *v);
@@ -178,7 +219,7 @@ extern int sess_load(GHashTable *ss);
 extern struct server cld_srv;
 extern struct hail_log srv_log;
 extern struct timeval current_time;
-extern int udp_tx(int
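The patch adds a 32-bit byte-swap helper to cld.h. As a standalone illustration of what such a helper does, here is a minimal sketch with hypothetical names. The endianness probe at run time is an assumption made to keep the example self-contained; the patch itself relies on the build-time WORDS_BIGENDIAN define.

```c
#include <stdint.h>
#include <string.h>

/* Swap the four bytes of a 32-bit value: 0xAABBCCDD becomes 0xDDCCBBAA. */
static inline uint32_t swab32_sketch(uint32_t v)
{
	return ((v & 0x000000ffu) << 24) |
	       ((v & 0x0000ff00u) << 8) |
	       ((v & 0x00ff0000u) >> 8) |
	       ((v & 0xff000000u) >> 24);
}

/* Convert a little-endian wire value to host order. The same function
 * converts host order back to little-endian, since the swap is symmetric. */
static inline uint32_t le32_to_host_sketch(uint32_t wire)
{
	const uint16_t probe = 1;
	unsigned char first;

	/* On a little-endian host the low byte of 'probe' is stored first. */
	memcpy(&first, &probe, 1);
	return first == 1 ? wire : swab32_sketch(wire);
}
```

A helper of this shape returns the swapped value and leaves its argument alone, so a caller has to assign the result, as in `x = swab32_sketch(x);`.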
Re: [patch tabled 6/8] Add filesystem back-end
On 11/28/2010 08:41 PM, Pete Zaitcev wrote:
> This patch adds the first new back-end and makes some changes to the way nodes are added, to make the invariants of storage_node more sensible.
>
> The filesystem back-end itself is not intended for production use, so it makes no attempt to run any asynchronous transfers.
>
> We also add a test. Note that this differs from the preliminary versions of this patch. We used to add both chunk and fs back-ends, so that tabled replicates to both. This makes sense as a test of store path, but on retrieval tabled selects any one of available storage nodes with the object, randomly. It creates gaps in test coverage in any given run. Therefore, we test two back-end types sequentially now.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  server/Makefile.am   |    2
>  server/stor_chunk.c  |   21 -
>  server/stor_fs.c     |  498 +
>  server/storage.c     |  157 ++--
>  server/storparse.c   |   97 +++
>  server/tabled.h      |   31 ++
>  test/Makefile.am     |    3
>  test/be_fs-test.conf |    5
>  test/combo-redux     |   74 ++
>  test/prep-db         |    4
>  test/start-daemon    |    1
>  test/stop-daemon     |    9
>  12 files changed, 835 insertions(+), 67 deletions(-)
>
> commit bccedeedabbe713e4053afa185314b3f57f3d204
> Author: Pete Zaitcev <zait...@yahoo.com>
> Date:   Sun Nov 28 17:58:05 2010 -0700
>
>     Add fs back-end, with a test.
>
> diff --git a/server/Makefile.am b/server/Makefile.am
> index 52beec4..71bcb35 100644
> --- a/server/Makefile.am
> +++ b/server/Makefile.am
> @@ -6,7 +6,7 @@ sbin_PROGRAMS = tabled tdbadm
>  tabled_SOURCES= tabled.h \
>  	bucket.c cldu.c config.c metarep.c object.c replica.c \
>  	server.c status.c storage.c storparse.c \
> -	stor_chunk.c util.c
> +	stor_chunk.c stor_fs.c util.c
>  tabled_LDADD = ../lib/libtdb.a \
>  	@HAIL_LIBS@ @PCRE_LIBS@ @GLIB_LIBS@ \
>  	@CRYPTO_LIBS@ @DB4_LIBS@ @EVENT_LIBS@ @SSL_LIBS@
> diff --git a/server/stor_chunk.c b/server/stor_chunk.c
> index 815adcf..7462a9c 100644
> --- a/server/stor_chunk.c
> +++ b/server/stor_chunk.c
> @@ -31,8 +31,7 @@
>  #include <netdb.h>
>  #include "tabled.h"
>  
> -static const char stor_key_fmt[] = "%016llx";
> -#define STOR_KEY_SLEN 16
> +static const char stor_key_fmt[] = STOR_KEY_FMT;
>  
>  static int stor_new_stc(struct storage_node *stn, struct st_client **stcp)
>  {
> @@ -66,24 +65,6 @@ static int stor_new_stc(struct storage_node *stn, struct st_client **stcp)
>  	return 0;
>  }
>  
> -static void stor_read_event(int fd, short events, void *userdata)
> -{
> -	struct open_chunk *cep = userdata;
> -
> -	cep->r_armed = false;	/* no EV_PERSIST */
> -	if (cep->ocb)
> -		(*cep->ocb)(cep);
> -}
> -
> -static void stor_write_event(int fd, short events, void *userdata)
> -{
> -	struct open_chunk *cep = userdata;
> -
> -	cep->w_armed = false;	/* no EV_PERSIST */
> -	if (cep->ocb)
> -		(*cep->ocb)(cep);
> -}
> -
>  /*
>   * Open *cep using stn, set up chunk session if needed.
>   */
> diff --git a/server/stor_fs.c b/server/stor_fs.c
> new file mode 100644
> index 0000000..b433a67
> --- /dev/null
> +++ b/server/stor_fs.c
> @@ -0,0 +1,498 @@
> +
> +/*
> + * Copyright 2010 Red Hat, Inc.
> + *
> + * This program is free software; you can redistribute it and/or modify
> + * it under the terms of the GNU General Public License as published by
> + * the Free Software Foundation.
> + *
> + * This program is distributed in the hope that it will be useful,
> + * but WITHOUT ANY WARRANTY; without even the implied warranty of
> + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> + * GNU General Public License for more details.
> + *
> + * You should have received a copy of the GNU General Public License
> + * along with this program; see the file COPYING.  If not, write to
> + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA.
> + *
> + */
> +
> +#define _GNU_SOURCE
> +#include "tabled-config.h"
> +
> +#include <sys/types.h>
> +#include <sys/stat.h>
> +#include <errno.h>
> +#include <fcntl.h>
> +#include <syslog.h>
> +#include <string.h>
> +#include <glib.h>
> +#include <event.h>
> +#include "tabled.h"
> +
> +static const char stor_key_fmt[] = STOR_KEY_FMT;
> +
> +static char *fs_obj_pathname(const char *base, uint64_t key)
> +{
> +	enum { PREFIX_LEN = 3 };
> +	char prefix[PREFIX_LEN + 1];
> +	char stckey[STOR_KEY_SLEN+1];
> +	char *s;
> +	int rc;
> +
> +	/* we know that stckey is going to be longer than PREFIX_LEN */
> +	sprintf(stckey, stor_key_fmt, (unsigned long long) key);
> +	memcpy(prefix, stckey, PREFIX_LEN);
> +	prefix[PREFIX_LEN] = 0;
> +
> +	rc = asprintf(&s, "%s/%s/%s", base, prefix, stckey + PREFIX_LEN);
> +	if (rc < 0)
> +		goto err_out;
> +
> +	return s;
> +
> +err_out:
> +	return NULL;
> +}
> +
> +static char *fs_ctl_pathname(const char *base, const char *file)
> +{
> +	char *s;
> +	int rc;
> +
> +	rc = asprintf(&s, "%s/%s", base, file);
> +	if
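The object path scheme in the quoted fs_obj_pathname() formats the 64-bit key as 16 hex digits, uses the first three as a subdirectory name, and uses the remaining thirteen as the file name. A portable sketch of the same idea follows, using snprintf and malloc instead of the GNU-specific asprintf. The names and the 16-digit zero-padded format are assumptions based on the "%016llx" format string visible in the stor_chunk.c hunk.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

enum { KEY_SLEN_SKETCH = 16, PREFIX_LEN_SKETCH = 3 };

/* Hypothetical helper: returns a malloc'd path of the form
 * "<base>/<first 3 hex digits>/<remaining 13 hex digits>", or NULL on
 * allocation failure. The caller frees the result. */
static char *obj_pathname_sketch(const char *base, uint64_t key)
{
	char hex[KEY_SLEN_SKETCH + 1];
	size_t need;
	char *s;

	snprintf(hex, sizeof(hex), "%016" PRIx64, key);

	/* base + '/' + prefix + '/' + rest of key + terminating NUL */
	need = strlen(base) + 1 + PREFIX_LEN_SKETCH + 1 +
	       (KEY_SLEN_SKETCH - PREFIX_LEN_SKETCH) + 1;
	s = malloc(need);
	if (!s)
		return NULL;

	snprintf(s, need, "%s/%.*s/%s", base, PREFIX_LEN_SKETCH, hex,
		 hex + PREFIX_LEN_SKETCH);
	return s;
}
```

With base "/srv/tabled" and key 0x0123456789abcdef this yields "/srv/tabled/012/3456789abcdef". Splitting on a three-digit prefix caps the top level at 4096 subdirectories, which keeps any single directory from growing without bound.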
Re: [patch tabled 8/8] Add Swift back-end
On 11/28/2010 08:41 PM, Pete Zaitcev wrote:
> This patch allows to use tabled with OpenStack Swift object store as if it were our chunkserver, with some extra tricks. The configuration has to be entered manually into CLD, just like in case of filesystem back-end.
>
> The code is fairly experimental, so it retains extra messages. Also, since Swift authorizes by plaintext passwords, support for SSL is essential, but is currently missing.
>
> There is no build-time test for this, because it would require us to depend on OpenStack, which is untenable.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>

applied patches 6-8. well done, this is a milestone for tabled!
Re: [patch hail 1/2] Add subdomain calling format
On 12/05/2010 10:53 PM, Pete Zaitcev wrote:
> Amazon appears to give up on forcing users to migrate and bucket-in-path format is going to stay. However, they still refuse to list buckets from other regions on the default endpoint, which leads to annoying indirection (need to know the region somehow before listing). Easier just use the subdomain format in one invocation.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  include/hstor.h |    6 +
>  lib/hstor.c     |  178 +-
>  2 files changed, 106 insertions(+), 78 deletions(-)

applied 1-2
Re: [patch tabled 1/8] Shuffle fields of storage nodes
On 11/28/2010 08:39 PM, Pete Zaitcev wrote:
> This helps copy-paste safer later, mostly.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  server/object.c  |    2 -
>  server/storage.c |   79 ++---
>  server/tabled.h  |   12 +++---
>  3 files changed, 53 insertions(+), 40 deletions(-)

applied 1-5

Gonna give the file backend a cursory test, and swift backend a slightly-more-than-cursory test, then merge those.

Well done! Pluggable storage backends make tabled more interesting.
Re: AC_CONFIG_MACRO_DIR([m4])
On 12/05/2010 04:56 PM, Pete Zaitcev wrote:
> Autoconf printed a warning when reconfiguring Hail, so I gave up and added this:
> [...]
> Now I have a directory m4/ with symlinks... This does not seem to be helping any portability, unless I miss where the promised macro are being saved locally. What was this about, do you happen to know?

I presume you refer to this:

[jgar...@bd hail]$ ./autogen.sh && CFLAGS="-O2 -Wall -Wshadow -g -march=native" ./configure --disable-shared
libtoolize: putting auxiliary files in `.'.
libtoolize: linking file `./ltmain.sh'
libtoolize: You should add the contents of the following files to `aclocal.m4':
libtoolize:   `/usr/share/aclocal/libtool.m4'
libtoolize:   `/usr/share/aclocal/ltoptions.m4'
libtoolize:   `/usr/share/aclocal/ltversion.m4'
libtoolize:   `/usr/share/aclocal/lt~obsolete.m4'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.
libtoolize: putting auxiliary files in `.'.
libtoolize: linking file `./ltmain.sh'
libtoolize: Consider adding `AC_CONFIG_MACRO_DIR([m4])' to configure.ac and
libtoolize: rerunning libtoolize, to keep the correct libtool macros in-tree.
libtoolize: Consider adding `-I m4' to ACLOCAL_AMFLAGS in Makefile.am.

Think about what this implies: Keeping "the correct libtool macros in-tree" implies adding a pointless maintenance burden. The distro always gives us correct, up-to-date files. Why would hail want to potentially lag upstream's version of these macros, forcing us to manually track macros that are currently updated automatically for each ./autogen.sh invocation?

It is this same silly logic that leads programmers to ship in-tree copies of (for example) zlib.

Therefore, the requirement to rebuild hail's configure script is to have a recent distro. Users of tarballs never see this, so this is only an issue for those on oddball or ancient OS's, who are building release tarballs, or working directly out the git repo. And if someone is doing that, they have a lot more headaches than just outdated libtool to contend with.

Jeff
Re: AC_CONFIG_MACRO_DIR([m4])
On 12/06/2010 12:44 PM, Pete Zaitcev wrote:
> On Mon, 06 Dec 2010 12:32:22 -0500 Jeff Garzik <j...@garzik.org> wrote:
>
>> Keeping the correct libtool macros in-tree implies adding a pointless maintenance burden. The distro always gives us correct, up-to-date files. Why would hail want to potentially lag upstream's version of these macros, forcing us to manually track macros that are currently updated automatically for each ./autogen.sh invocation?
>
> I presumed that the important part is a compatibility between the syntax used in various .am files and the libtool scriptography that underpins them. Lagging upstream has no downside in this case (unlike zlib, where security fixes may exist).

It does not seem optimal to run a current libtool with outdated macro files.

In all cases except current one, you're checking in third party, maintained, versioned files to hail.git where they will be less-well maintained, and generally out-of-date vis a vis current [upstream | Fedora]. Where is the value in performing this additional work, besides silencing a warning seen only by git repo users?

>> Users of tarballs never see this, so this is only an issue for those on oddball or ancient OS's, who are building release tarballs, or working directly out the git repo.
>
> Well, if you say so...

Do you have knowledge to the contrary?

Jeff
Re: [patch hail] remove duplicated stc_readport
On 10/26/2010 03:47 PM, Pete Zaitcev wrote:
> Now that we have a common library for Hail, an opportunity opens to trim some duplication, such as stc_readport. It even had a comment about it.
>
> Note that we leave cld_readport in the API for a few weeks, while I get my tabled trees and RPMs in order. Unfortunately we routinely neglect to set specific version in RPM headers (e.g. no "Requires: cld >= 0.8.2").
>
> Also, get rid of g_file_get_contents. Talk about pointless: it requires caller to free memory, and it's not like code is any more compact or easier to understand.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>

applied

it would be nice if a follow-up patch moved the hail_readport() definition into a more generic, not-CLD-specific header such as include/hail.h[1]

Jeff

[1] which doesn't exist yet. maybe we could rename hail_log.h to hail.h, and make hail.h a dumping ground for hail-generic items.
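To make the g_file_get_contents point concrete, a port-file reader needs only a few lines of plain stdio and no heap allocation. The sketch below is a hypothetical helper, not hail's actual hail_readport(), whose signature is not shown in this thread. It assumes the file holds a decimal port number on its first line.

```c
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: read a decimal TCP port from the first line of
 * a file. Returns the port (1..65535), or -1 on any error. */
static int readport_sketch(const char *fname)
{
	char buf[32];
	char *end;
	long port;
	FILE *f = fopen(fname, "r");

	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	/* Reject empty or non-numeric content and out-of-range values. */
	port = strtol(buf, &end, 10);
	if (end == buf || port < 1 || port > 65535)
		return -1;
	return (int)port;
}
```

The buffer lives on the stack, so nothing needs freeing on any return path.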
tabled + atcp
Just committed this:

commit 57c4be44cdfa6c0cda6cf26d19e8048a945c5a78
Author: Jeff Garzik <j...@garzik.org>
Date:   Sat Oct 23 14:01:20 2010 -0400

    Use libhail's atcp rather than our own async TCP write code.

Should be functionally equivalent, as atcp originated from tabled code. Please test, and highlight any behavior differences with vanilla tabled v0.5.2.

Jeff
Re: [PATCH hail] const-correctness tweaks
On 10/20/2010 04:53 AM, Jim Meyering wrote: Jeff Garzik wrote: ... Hi Jeff. Sorry I didn't notice that the first time. I built with ./autogen.sh ./configure make. It looks like you recommend -Wall -Wshadow. The two warnings above are the only ones I see with the patch, and they're easy to fix. When storing const pointer params into a struct like that, I've found that it's best to cast away the const, which really does reflect the semantics: by using const on the parameter, I view it as promising not to deref through the pointer *in that function*. Since it's usually not reasonable to make the struct member const (as you saw, it propagates too far and often ends up being contradictory), the lesser evil of the cast is preferable here. If you're still game, the following incremental patch seems to be enough for me: Let me know and I'll resubmit the full one. Well, my primary concern now originates from curl_easy_setopt(3) documentation: CURLOPT_WRITEFUNCTION Function pointer that should match the following prototype: size_t function( void *ptr, size_t size, size_t nmemb, void *stream); hstor's callback is passed directly to libcurl, so we seem to be bound by outside constraints, no? I compiled hail (with that patch) on F13 with -Wall -Wshadow with no warnings. That curl_easy_setopt documentation seems to be overly strict, or perhaps out of date?. 
When I compare with the code (curl/typecheck-gcc.h), I see all of the necessary const attributes: /* evaluates to true if expr is of type curl_write_callback or similar */ #define _curl_is_write_cb(expr) \ (_curl_is_read_cb(expr) ||\ __builtin_types_compatible_p(__typeof__(expr), __typeof__(fwrite)) || \ __builtin_types_compatible_p(__typeof__(expr), curl_write_callback) || \ _curl_callback_compatible((expr), _curl_write_callback1) ||\ _curl_callback_compatible((expr), _curl_write_callback2) ||\ _curl_callback_compatible((expr), _curl_write_callback3) ||\ _curl_callback_compatible((expr), _curl_write_callback4) ||\ _curl_callback_compatible((expr), _curl_write_callback5) ||\ _curl_callback_compatible((expr), _curl_write_callback6)) typedef size_t (_curl_write_callback1)(const char *, size_t, size_t, void*); typedef size_t (_curl_write_callback2)(const char *, size_t, size_t, const void*); typedef size_t (_curl_write_callback3)(const char *, size_t, size_t, FILE*); typedef size_t (_curl_write_callback4)(const void *, size_t, size_t, void*); typedef size_t (_curl_write_callback5)(const void *, size_t, size_t, const void*); typedef size_t (_curl_write_callback6)(const void *, size_t, size_t, FILE*); But even if curl were requiring some suboptimal signature, it would be nice not to impose that on all projects that use hail. Are there older curl headers that do require the const-free signature? If there are and you want to support them, too, let me know -- maybe I can cook up an autoconf test to make things work there, with minimal impact. Nah, I wouldn't worry about the const signature, it's probably just out of date documentation. If users appear running old OS's or OS versions, we can tackle autoconf'ing on a piecemeal basis as needs arise. Committed these patches of yours to hail.git and tabled.git. 
Jeff
hail version 0.7.2 released
Home: https://hail.wiki.kernel.org/
Git: git://git.kernel.org/pub/scm/daemon/distsrv/hail.git
Download: http://www.kernel.org/pub/software/network/distsrv/hail/

Version 0.7.2 release notes (NEWS):

- cld: read overrun bug fix
- chunkd: add checksum table to disk format, one checksum per 64k of obj data
- chunkd, libhail: add new GET_PART operation for partial object retrieval
- chunkd: bug fixes
- chunkd: use libevent (again) for main loop polling
- libhail: add async TCP network writing API, atcp_wr*
- libhail: bug fixes

This release includes incompatible API and on-disk format changes.

Git shortlog attached.

Jeff Garzik (12):
      chunkd: Add checksum table to on-disk format, one sum per 64k of data
      chunkd: checksum data prior to returning via GET
      chunk: Add Get-Partial-Object (GET_PART) operation
      lib/chunksrv.c: add FIXME
      chunkd: internal 32/64-bit type fixes
      test/chunkd/get-part: read and test segment of randomized memory
      libhail: add async TCP network writing API, atcp_wr*
      Use libevent in chunkd, rather than hand-rolled server-poll functionality.
      atcp: extract pre- and post-writev code into separate functions
      Merge branch 'chunkd-libevent'
      chunkd: properly checksum a multi-block range
      Release version 0.7.2.

Jim Meyering (4):
      cld: don't expect inode name to be NUL-terminated (avoid read overrun)
      lib/hstor.c: avoid an unconditional leak in append_qparam
      chunkd: don't leak an FS object iterator
      libhail/hstor: const-correctness tweaks

Pete Zaitcev (3):
      libhail: Fix calling convention of huri_field_escape
      Change cfgfile.txt into a real config file
      pkg: add doc/setup.txt to install
tabled version 0.5.2 released
Home: https://hail.wiki.kernel.org/
Git: git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git
Download: http://www.kernel.org/pub/software/network/distsrv/tabled/

Version 0.5.2 release notes (NEWS):

- Permit randomly allocated TCP port, for db4 replication master
- Install etc.tabled.conf as a useful example configuration
- minor testsuite additions
- many minor bug fixes

Git shortlog attached.
Re: tabled version 0.5.2 released
On 10/22/2010 11:39 PM, Jeff Garzik wrote:
Home: https://hail.wiki.kernel.org/
Git: git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git
Download: http://www.kernel.org/pub/software/network/distsrv/tabled/

Version 0.5.2 release notes (NEWS):

- Permit randomly allocated TCP port, for db4 replication master
- Install etc.tabled.conf as a useful example configuration
- minor testsuite additions
- many minor bug fixes

Git shortlog attached.

er, now it's attached.

Jeff Garzik (2):
      test/.gitignore: ignore list-keys test
      Release version 0.5.2.

Jim Meyering (6):
      server/server.c (net_write_port): Don't ignore write error.
      server/server.c: use sizeof(s) rather than equivalent 64
      don't dereference NULL on OOM
      server/status.c: don't deref NULL on failed strdup
      server/bucket.c: don't deref NULL upon failed malloc
      adapt to changed signature of hstor_get's callback function

Pete Zaitcev (8):
      Fix crash when stopping slave
      Clean name vs host
      cleanup a call to closelog()
      Support auto replicaton port
      test/start-daemon: Factor 3 pid-checking if blocks into a loop.
      test/start-daemon: Ignore stale .pid files.
      Add a test for hstor_keys
      Install etc.tabled.conf
Re: hail version 0.7.2 released
On 10/22/2010 11:22 PM, Jeff Garzik wrote:
Home: https://hail.wiki.kernel.org/
Git: git://git.kernel.org/pub/scm/daemon/distsrv/hail.git
Download: http://www.kernel.org/pub/software/network/distsrv/hail/

It seems that kernel.org mirroring is broken or extremely slow at the moment. The releases should appear as soon as kernel.org mirrors pick back up.

Jeff
Re: [PATCH hail] const-correctness tweaks
On 10/20/2010 04:00 AM, Jim Meyering wrote:
Jeff Garzik wrote:
On 10/06/2010 08:07 AM, Jim Meyering wrote:
Make write_cb callback's buffer parameter const, like all write-like functions. Give a few char * parameters the const attribute.

Signed-off-by: Jim Meyering <meyer...@redhat.com>
---
It looks like most of hail's interfaces are const-correct, but one stood out because it provokes a warning when I tried to pass a const-correct write_cb function to hstor_get from iwhd:

    proxy.c:382: warning: passing argument 4 of 'hstor_get' from \
        incompatible pointer type
    /usr/include/hstor.h:173: note: expected \
        'size_t (*)(void *, size_t, size_t, void *)' but argument is of type \
        'size_t (*)(const void *, size_t, size_t, void *)'

In case you feel comfortable fixing this, here's a patch:

Sorry for not getting back to this; I had hoped to solve some additional problems that cropped up, but didn't have time. So, to forestall further delay,

    libtool: compile: gcc -DHAVE_CONFIG_H -I. -I.. -I../include -pthread -I/usr/include/glib-2.0 -I/usr/lib64/glib-2.0/include -I/usr/include/libxml2 -O2 -Wall -Wshadow -g -MT hutil.lo -MD -MP -MF .deps/hutil.Tpo -c hutil.c -o hutil.o
    hutil.c: In function ‘hreq_hdr_push’:
    hutil.c:145: warning: assignment discards qualifiers from pointer target type
    hutil.c:146: warning: assignment discards qualifiers from pointer target type

warnings appear after this patch. When solving these warnings with 'const' markers, it quickly becomes a bit of a rat's nest. At a minimum, the write_cb callback signature must match libcurl's, which does not use 'const'. I can see this makes sense from libcurl implementation's perspective, even if it does not really match the constness one expects from a foo-get function.

Hi Jeff. Sorry I didn't notice that the first time. I built with ./autogen.sh && ./configure && make. It looks like you recommend -Wall -Wshadow. The two warnings above are the only ones I see with the patch, and they're easy to fix.

When storing const pointer params into a struct like that, I've found that it's best to cast away the const, which really does reflect the semantics: by using const on the parameter, I view it as promising not to deref through the pointer *in that function*. Since it's usually not reasonable to make the struct member const (as you saw, it propagates too far and often ends up being contradictory), the lesser evil of the cast is preferable here. If you're still game, the following incremental patch seems to be enough for me: Let me know and I'll resubmit the full one.

Well, my primary concern now originates from curl_easy_setopt(3) documentation:

    CURLOPT_WRITEFUNCTION
        Function pointer that should match the following prototype:
        size_t function( void *ptr, size_t size, size_t nmemb, void *stream);

hstor's callback is passed directly to libcurl, so we seem to be bound by outside constraints, no?

Jeff
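The "cast away the const when storing into a struct" approach described above can be sketched as follows. This is an illustration under assumptions, not the actual hreq_hdr_push code: the types hdr_pair and hdr_list, the function hdr_push, and the HDR_MAX limit are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>

#define HDR_MAX 8	/* hypothetical capacity, for illustration only */

struct hdr_pair {
	char *key;	/* deliberately non-const, as in the discussion */
	char *value;
};

struct hdr_list {
	struct hdr_pair pair[HDR_MAX];
	int n;
};

/*
 * The const on key/val promises only that this function does not
 * modify the strings; the struct members stay non-const, so the
 * qualifier is dropped with an explicit cast at the point of storage.
 */
static int hdr_push(struct hdr_list *hl, const char *key, const char *val)
{
	assert(hl != NULL);
	if (hl->n >= HDR_MAX)
		return -1;

	hl->pair[hl->n].key = (char *) key;
	hl->pair[hl->n].value = (char *) val;
	hl->n++;
	return 0;
}
```

The explicit cast silences the "assignment discards qualifiers" warning in one place instead of spreading const through every user of the struct.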
Re: [PATCH hail] const-correctness tweaks
On 10/06/2010 08:07 AM, Jim Meyering wrote:
Make write_cb callback's buffer parameter const, like all write-like functions. Give a few char * parameters the const attribute.

Signed-off-by: Jim Meyering <meyer...@redhat.com>
---
It looks like most of hail's interfaces are const-correct, but one stood out because it provokes a warning when I tried to pass a const-correct write_cb function to hstor_get from iwhd:

    proxy.c:382: warning: passing argument 4 of 'hstor_get' from \
        incompatible pointer type
    /usr/include/hstor.h:173: note: expected \
        'size_t (*)(void *, size_t, size_t, void *)' but argument is of type \
        'size_t (*)(const void *, size_t, size_t, void *)'

In case you feel comfortable fixing this, here's a patch:

 include/hstor.h |4 ++--
 lib/hstor.c |5 +++--
 lib/hutil.c |2 +-
 3 files changed, 6 insertions(+), 5 deletions(-)

This requires updating test/large-object.c in tabled, too. Would you mind sending along that companion patch?
Re: CLD multi-node status
On 09/30/2010 04:55 AM, Geert Jansen wrote:
is it correct that CLD is basically single-master right now? I can't find any trace of the mentioned Paxos implementation in the source.

The current main branch is single-master, correct.

The 'replica' branch of hail.git contains the multi-node server, where the Paxos implementation is imported from the db4 replication engine. No multi-node client lib update exists, however.

Look for this to change in the next 2 weeks, though!

Jeff
Re: [PATCH hail] chunkd: don't leak an FS object iterator
On 09/29/2010 11:20 AM, Jim Meyering wrote:
chk_list_objs called fs_list_objs_open without also calling fs_list_objs_close.

32,808 bytes in 1 blocks are definitely lost in loss record 413 of 419
   at 0x4A0515D: malloc (vg_replace_malloc.c:195)
   by 0x31BA8A26D0: __alloc_dir (opendir.c:184)
   by 0x405619: fs_list_objs_open (be-fs.c:974)
   by 0x40B202: chk_list_objs (selfcheck.c:41)
   by 0x40B575: chk_dbscan (selfcheck.c:131)
   by 0x40B628: chk_thread_scan (selfcheck.c:147)
   by 0x40B757: chk_thread_command (selfcheck.c:179)
   by 0x40B890: chk_thread_func (selfcheck.c:219)
   by 0x31BC464E83: g_thread_create_proxy (gthread.c:1893)
   by 0x31BB407760: start_thread (pthread_create.c:301)
   by 0x31BA8E151C: clone (clone.S:115)

After seeing a few valgrind references from you, I'm curious... do you by chance happen to have a valgrind suppression file for openssl on Fedora?

I've been wanting to run valgrind on chunkd, but each time I attempt it, I -- and valgrind -- have been overwhelmed by openssl false positives. openssl, deep in its RAND_xxx functions, intentionally does crazy stuff like using random, uninitialized stack contents as RNG entropy. Cute, but valgrind quite rightly complains loudly about it.

It's a topic I've been meaning to research, because I currently lack the valgrind-fu necessary to have an effective valgrind+chunkd session.

Thanks,
Jeff
Re: [PATCH hail] chunkd: don't leak an FS object iterator
On 09/29/2010 11:20 AM, Jim Meyering wrote:
chk_list_objs called fs_list_objs_open without also calling fs_list_objs_close.

32,808 bytes in 1 blocks are definitely lost in loss record 413 of 419
   at 0x4A0515D: malloc (vg_replace_malloc.c:195)
   by 0x31BA8A26D0: __alloc_dir (opendir.c:184)
   by 0x405619: fs_list_objs_open (be-fs.c:974)
   by 0x40B202: chk_list_objs (selfcheck.c:41)
   by 0x40B575: chk_dbscan (selfcheck.c:131)
   by 0x40B628: chk_thread_scan (selfcheck.c:147)
   by 0x40B757: chk_thread_command (selfcheck.c:179)
   by 0x40B890: chk_thread_func (selfcheck.c:219)
   by 0x31BC464E83: g_thread_create_proxy (gthread.c:1893)
   by 0x31BB407760: start_thread (pthread_create.c:301)
   by 0x31BA8E151C: clone (clone.S:115)

Signed-off-by: Jim Meyering <meyer...@redhat.com>

applied
[chunkd patch] convert to libevent
For a nice code savings... chunkd/Makefile.am |1 chunkd/chunkd.h| 28 + chunkd/cldu.c | 64 +-- chunkd/server.c| 289 + chunkd/util.c | 23 configure.ac |3 6 files changed, 116 insertions(+), 292 deletions(-) diff --git a/chunkd/Makefile.am b/chunkd/Makefile.am index 78bba72..a45a89b 100644 --- a/chunkd/Makefile.am +++ b/chunkd/Makefile.am @@ -10,4 +10,5 @@ chunkd_SOURCES= chunkd.h \ objcache.c chunkd_LDADD = \ ../lib/libhail.la @GLIB_LIBS@ @CRYPTO_LIBS@ \ + @EVENT_LIBS@ \ @SSL_LIBS@ @TOKYOCABINET_LIBS@ @XML_LIBS@ @LIBCURL@ diff --git a/chunkd/chunkd.h b/chunkd/chunkd.h index 937573c..5be155a 100644 --- a/chunkd/chunkd.h +++ b/chunkd/chunkd.h @@ -28,7 +28,7 @@ #include chunk_msg.h #include hail_log.h #include tchdb.h -#include cldc.h /* for cld_timer */ +#include event.h #include objcache.h #ifndef ARRAY_SIZE @@ -77,6 +77,8 @@ struct client { charaddr_host[64]; /* ASCII version of inet addr */ charaddr_port[16]; /* ASCII version of port */ int fd; /* socket */ + struct eventev; + short ev_mask;/* EV_READ and/or EV_WRITE */ charuser[CHD_USER_SZ + 1]; @@ -172,18 +174,10 @@ struct server_stats { unsigned long opt_write; /* optimistic writes */ }; -struct server_poll { - short events; /* POLL* from poll.h */ - boolbusy; /* if true, do not poll us */ - - /* callback function, data */ - bool(*cb)(int fd, short events, void *userdata); - void*userdata; -}; - struct server_socket { int fd; const struct listen_cfg *cfg; + struct eventev; struct list_headsockets_node; }; @@ -207,14 +201,15 @@ struct server { char*pid_file; /* PID file */ int pid_fd; + struct event_base *evbase_main; + struct list_headlisteners; struct list_headsockets;/* points into listeners */ - GHashTable *fd_info; - GThreadPool *workers; /* global thread worker pool */ int max_workers; int worker_pipe[2]; + struct eventworker_ev; struct list_headwr_trash; unsigned inttrash_sz; @@ -311,11 +306,6 @@ extern void syslogerr(const char *prefix); extern void strup(char *s); extern int write_pid_file(const char 
*pid_fn); extern int fsetflags(const char *prefix, int fd, int or_flags); -extern void timer_init(struct cld_timer *timer, const char *name, - void (*cb)(struct cld_timer *), void *userdata); -extern void timer_add(struct cld_timer *timer, time_t expires); -extern void timer_del(struct cld_timer *timer); -extern time_t timers_run(void); extern char *time2str(char *strbuf, time_t time); extern void hexstr(const unsigned char *buf, size_t buf_len, char *outstr); @@ -328,7 +318,7 @@ extern bool cli_err(struct client *cli, enum chunk_errcode code, bool recycle_ok extern int cli_writeq(struct client *cli, const void *buf, unsigned int buflen, cli_write_func cb, void *cb_data); extern bool cli_wr_sendfile(struct client *, cli_write_func); -extern bool cli_rd_set_poll(struct client *cli, bool readable); +extern void cli_rd_set_poll(struct client *cli, bool readable); extern void cli_wr_set_poll(struct client *cli, bool writable); extern bool cli_cb_free(struct client *cli, struct client_write *wr, bool done); @@ -336,7 +326,7 @@ extern bool cli_write_start(struct client *cli); extern int cli_req_avail(struct client *cli); extern int cli_poll_mod(struct client *cli); extern bool worker_pipe_signal(struct worker_info *wi); -extern bool tcp_cli_event(int fd, short events, void *userdata); +extern void tcp_cli_event(int fd, short events, void *userdata); extern void resp_init_req(struct chunksrv_resp *resp, const struct chunksrv_req *req); diff --git a/chunkd/cldu.c b/chunkd/cldu.c index dd8b67c..026c523 100644 --- a/chunkd/cldu.c +++ b/chunkd/cldu.c @@ -21,6 +21,7 @@ #include hail-config.h #include sys/types.h +#include sys/time.h #include sys/socket.h #include glib.h #include syslog.h @@ -39,21 +40,23 @@ struct cld_host { }; struct cld_session { - bool forced_hosts; /* Administrator overrode default CLD */ - bool is_dead; - struct ncld_sess *nsess;/* library state */ + bool
Re: Autostart
On Wed, Sep 29, 2010 at 7:09 PM, Pete Zaitcev <zait...@redhat.com> wrote:
An interesting question is what to do when iwhd exits. I decided not to kill what was started. So, we have a little self-contained cell of tabled, chunkd, S3, based off a certain local directory or other namespace anchor. Therefore, when iwhd restarts, it tests if the cell is still there, and uses that. It also tests if the service started successfully, using the same method.

As we see, for each service iwhd starts, it needs to verify that it's available (either before spawning it, or after). This would be done best by talking to the service. But iwhd only has S3 client, and no CLD client, so it cannot talk to cld (or chunkd).

I had an idea: add an autostart feature to tabled. Tabled knows how to talk to both chunkd and cld, so it can verify that they are running. It would not be that much code. The downside is that it's clearly a special case, encoding of a policy. So I am asking how objectionable it would be (including do we want tabled -a for tests... they kinda run ok as they are).

It seems like quite a special case feature. tabled is designed to use multiple chunkd nodes (and hopefully soon, multiple cld nodes). So having tabled start chunkd/cld seems misaligned with the existing design.

That said, if it was possible to write a script or program that performed autostart without modifying tabled, it would be a nice addition to the git repository. tabled.autostart could be a simple program that pinged cld/chunkd, started them if necessary, then exec'd the real tabled. Something modular and separate like that would be great.

Jeff
Re: [PATCH hail] lib/hstor.c: avoid an unconditional leak in append_qparam
On 09/27/2010 04:53 AM, Jim Meyering wrote:
Signed-off-by: Jim Meyering <meyer...@redhat.com>
---
I would have preferred to insert a single line right before the huri_field_escape call:

    char *v = strdup(val);

[would result in a more compact, single-hunk patch] but it looks like hail uses the anachronistic (pre-C99) "declare all vars at outer scope" style, so I conformed.

 lib/hstor.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/lib/hstor.c b/lib/hstor.c
index 6c67bfa..79e0420 100644
--- a/lib/hstor.c
+++ b/lib/hstor.c
@@ -676,6 +676,7 @@ static GString *append_qparam(GString *str, const char *key, const char *val,
 	char *arg_char)
 {
 	char *stmp;
+	char *v;
 
 	str = g_string_append(str, arg_char);
 	arg_char[0] = '&';
@@ -683,9 +684,11 @@ static GString *append_qparam(GString *str, const char *key, const char *val,
 	str = g_string_append(str, key);
 	str = g_string_append(str, "=");
-	stmp = huri_field_escape(strdup(val), QUERY_ESCAPE_MASK);
+	v = strdup(val);
+	stmp = huri_field_escape(v, QUERY_ESCAPE_MASK);
 	str = g_string_append(str, stmp);
 	free(stmp);
+	free(v);

applied

Yeah, I don't like C++ var decls; I think the code gets too disorganized, making it really easy to miss a decl when reviewing.

Jeff
Re: [PATCH hail] lib/hstor.c: avoid an unconditional leak in append_qparam
On 09/27/2010 12:29 PM, Pete Zaitcev wrote:
On Mon, 27 Sep 2010 10:53:06 +0200 Jim Meyering <j...@meyering.net> wrote:

-	stmp = huri_field_escape(strdup(val), QUERY_ESCAPE_MASK);
+	v = strdup(val);
+	stmp = huri_field_escape(v, QUERY_ESCAPE_MASK);
 	str = g_string_append(str, stmp);
 	free(stmp);
+	free(v);

I think you may be fooled by the ridiculous calling convention

Doh, my memory and I were fooled, too.
Re: [hail patch 1/1] Fix calling convention of huri_field_escape
On 09/27/2010 08:49 PM, Pete Zaitcev wrote:
Premature optimization is the root of all evil. Use a sensible convention of not screwing with the argument, at the expense of extra strdup. Fortunately, all users are confined to Hail itself, even if huri_field_escape is exported.

Signed-off-by: Pete Zaitcev <zait...@redhat.com>
---
 include/hstor.h |2 +-
 lib/hstor.c | 44 +---
 lib/huri.c | 10 +-
 3 files changed, 35 insertions(+), 21 deletions(-)

applied
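As an illustration of the convention argued for here (leave the argument alone, return a fresh allocation), the following is a minimal sketch of a percent-escaping helper. It is not the real huri_field_escape, whose signature and escape mask are not reproduced in this thread; field_escape_copy and its fixed set of unescaped characters are assumptions made for the example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: returns a newly allocated, percent-escaped
 * copy of 'in'.  The input is const and is never modified or freed;
 * the caller frees only the returned string.
 */
static char *field_escape_copy(const char *in)
{
	static const char hex[] = "0123456789ABCDEF";
	char *out, *p;

	assert(in != NULL);

	/* worst case: every input byte expands to "%XX" */
	out = malloc(strlen(in) * 3 + 1);
	if (!out)
		return NULL;

	for (p = out; *in; in++) {
		unsigned char c = (unsigned char) *in;

		if (isalnum(c) || c == '-' || c == '.' || c == '_' || c == '~') {
			*p++ = (char) c;
		} else {
			*p++ = '%';
			*p++ = hex[c >> 4];
			*p++ = hex[c & 0x0f];
		}
	}
	*p = '\0';
	return out;
}
```

With this shape a caller can pass val directly, with no strdup at the call site, and ownership is unambiguous: exactly one free, on the return value. That ambiguity is what made the earlier leak fix turn into a double free.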
Re: [tabled patch 1/1] Add a test for hstor_keys
On 09/27/2010 08:52 PM, Pete Zaitcev wrote:
Our current tests do not invoke hstor_keys at all, and so they did not catch a crash with double free in append_qparam. Add a very basic test which at least calls hstor_keys to verify that it does not crash right away. This test does not exercise complex modes such as S3 paging, but better this than nothing.

Signed-off-by: Pete Zaitcev <zait...@redhat.com>
---
 test/Makefile.am |4 +
 test/list-keys.c | 102 +
 2 files changed, 105 insertions(+), 1 deletion(-)

applied.

FYI you forgot to update test/.gitignore. Fixed.
Re: [PATCH tabled 1/2] server/config.c: don't dereference NULL on OOM
On 09/24/2010 07:32 AM, Jim Meyering wrote:
You can pull from the oom branch here: git://git.infradead.org/users/meyering/tabled.git

Got nearly everything perfect. Need one more minor yet important change. As described in doc/contributions.txt, every changeset MUST have a Signed-off-by line at the end of a changeset's description.

I was able to pull and build just fine, so your git repo setup and push appears correct.

Also, in your pull request, please put the branch immediately following the repo URL on the same line, for easier cut-n-paste. Here's how Linus requests his pull-requests to look:

---SNIP-
Please pull from 'upstream-linus' branch of
git://git.kernel.org/pub/scm/git/jgarzik/libata-dev.git upstream-linus

to receive the following updates:

 drivers/ata/ahci.c| 193 +++--
 drivers/ata/libata-acpi.c | 40 +-
 drivers/ata/libata-core.c |3 +
 drivers/ata/libata.h |2 +
 drivers/ata/pata_ali.c|2 +-
 include/linux/ata.h |9 ++-
 include/linux/libata.h| 12 +++
 7 files changed, 178 insertions(+), 83 deletions(-)

Dirk Hohndel (1):
      pata_ali: trivial fix of a very frequent spelling mistake

Robert Hancock (1):
      ahci: display all AHCI 1.3 HBA capability flags (v2)

Tejun Heo (5):
      ahci: disable 64bit DMA by default on SB600s
      libata: cosmetic updates
      libata: implement more acpi filtering options
      libata: make gtf_filter per-dev
      ahci: filter FPDMA non-zero offset enable for Aspire 3810T

diff --git a/drivers/ata/ahci.c b/drivers/ata/ahci.c
index acd1162..4edca6e 100644
--- a/drivers/ata/ahci.c
+++ b/drivers/ata/ahci.c
[COMBINED PATCH FOLLOWS...]
---SNIP-
Re: [PATCH tabled 1/2] server/config.c: don't dereference NULL on OOM
On 09/24/2010 01:43 PM, Jim Meyering wrote:
Jeff Garzik wrote:
On 09/24/2010 07:32 AM, Jim Meyering wrote:
You can pull from the oom branch here: git://git.infradead.org/users/meyering/tabled.git

Got nearly everything perfect. Need one more minor yet important change. As described in doc/contributions.txt, every changeset MUST have a Signed-off-by line at the end of a changeset's description.

I was able to pull and build just fine, so your git repo setup and push appears correct.

Also, in your pull request, please put the branch immediately following the repo URL on the same line, for easier cut-n-paste. Here's how Linus requests his pull-requests to look:

Ok. I've added those pesky S.O.B lines with this:

    git filter-branch --msg-filter \
      'cat && printf "\nSigned-off-by: Jim Meyering <meyer...@redhat.com>\n"' \
      HEAD~4..HEAD

and pushed the result.

Please pull from the 'oom' branch of git://git.infradead.org/users/meyering/tabled.git

pulled from you, pushed upstream, thanks!
Re: [PATCH tabled] server/server.c (net_write_port): Don't ignore write error.
On 09/23/2010 03:55 AM, Jim Meyering wrote:
Better safe than sorry... Unreported write failures can be unpleasant. I fixed the one below so that a failure indication can propagate up the call tree. You might also want to report the failure to stderr.

I let my editor automatically update the copyright date and remove trailing spaces. If you'd rather separate those from the fix, let me know and I can adjust and resend.

Patch applied, thanks. The typical preference is to receive whitespace and other cosmetic changes in a separate patch, thereby highlighting the functional changes. But we're not so strict here that I would reject an otherwise useful patch...

Also FWIW, we're not very strict about reproducing the GCC-ish (GNU-ish?) style of "$FILENAME ($FUNCTION):" in each changelog -- though you're certainly welcome to continue, if that's your preference.

Given that "git show $COMMIT" will give you filename and per-diff-chunk function names, reproducing that in the git changelog entry seems somewhat redundant. A simple, English-language summary of the change is fine.

Just a style tip, though, feel free to ignore! :)

Jeff
[tabled patch v2] abstract out TCP-write code
Changes from v1: - avoid referencing dead struct client (grep for 'invalidate_cli'), by changing FSM callback prototype. - insert 'void *priv' member into struct atcp_wr_state, and replace cb_data1/cb_data2 callback parameters with (struct atcp_wr_state *, void *). struct client / struct session, or whatever, may be stored in atcp_wr_state::priv. - minor API polishing and further abstraction server/Makefile.am |1 server/atcp.c | 238 +++ server/atcp.h | 100 +++ server/bucket.c|8 - server/object.c| 56 +-- server/server.c| 268 + server/status.c|3 server/tabled.h| 46 ++--- 8 files changed, 436 insertions(+), 284 deletions(-) diff --git a/server/Makefile.am b/server/Makefile.am index 5b53a0a..5e0abd5 100644 --- a/server/Makefile.am +++ b/server/Makefile.am @@ -4,6 +4,7 @@ INCLUDES= -I$(top_srcdir)/include @GLIB_CFLAGS@ @HAIL_CFLAGS@ sbin_PROGRAMS = tabled tdbadm tabled_SOURCES = tabled.h \ + atcp.c atcp.h \ bucket.c cldu.c config.c metarep.c object.c replica.c \ server.c status.c storage.c storparse.c util.c tabled_LDADD = ../lib/libtdb.a \ diff --git a/server/atcp.c b/server/atcp.c new file mode 100644 index 000..0050a68 --- /dev/null +++ b/server/atcp.c @@ -0,0 +1,238 @@ + +/* + * Copyright 2010 Red Hat, Inc. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License as published by + * the Free Software Foundation. + * + * This program is distributed in the hope that it will be useful, + * but WITHOUT ANY WARRANTY; without even the implied warranty of + * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the + * GNU General Public License for more details. + * + * You should have received a copy of the GNU General Public License + * along with this program; see the file COPYING. If not, write to + * the Free Software Foundation, 675 Mass Ave, Cambridge, MA 02139, USA. 
+ *
+ */
+
+#define _GNU_SOURCE
+#include "tabled-config.h"
+
+#include <string.h>
+#include <stdlib.h>
+#include <errno.h>
+#include <sys/uio.h>
+#include "atcp.h"
+
+bool atcp_cb_free(struct atcp_wr_state *wst, void *cb_data, bool done)
+{
+	free(cb_data);
+	return false;
+}
+
+static void atcp_write_complete(struct atcp_write *tmp)
+{
+	struct atcp_wr_state *wst = tmp->wst;
+
+	list_del(&tmp->node);
+	list_add_tail(&tmp->node, &wst->write_compl_q);
+}
+
+static bool atcp_write_free(struct atcp_write *tmp, bool done)
+{
+	struct atcp_wr_state *wst = tmp->wst;
+	bool rcb = false;
+
+	wst->write_cnt -= tmp->length;
+	list_del_init(&tmp->node);
+	if (tmp->cb)
+		rcb = tmp->cb(wst, tmp->cb_data, done);
+	free(tmp);
+
+	return rcb;
+}
+
+bool atcp_write_run_compl(struct atcp_wr_state *wst)
+{
+	struct atcp_write *wr;
+	bool do_loop;
+
+	do_loop = false;
+	while (!list_empty(&wst->write_compl_q)) {
+		wr = list_entry(wst->write_compl_q.next,
+				struct atcp_write, node);
+		do_loop |= atcp_write_free(wr, true);
+	}
+	return do_loop;
+}
+
+void atcp_write_free_all(struct atcp_wr_state *wst)
+{
+	struct atcp_write *wr, *tmp;
+
+	atcp_write_run_compl(wst);
+	list_for_each_entry_safe(wr, tmp, &wst->write_q, node) {
+		atcp_write_free(wr, false);
+	}
+}
+
+static bool atcp_writable(struct atcp_wr_state *wst)
+{
+	int n_iov;
+	struct atcp_write *tmp;
+	ssize_t rc;
+	struct iovec iov[ATCP_MAX_WR_IOV];
+
+	/* accumulate pending writes into iovec */
+	n_iov = 0;
+	list_for_each_entry(tmp, &wst->write_q, node) {
+		if (n_iov == ATCP_MAX_WR_IOV)
+			break;
+		/* bleh, struct iovec should declare iov_base const */
+		iov[n_iov].iov_base = (void *) tmp->buf;
+		iov[n_iov].iov_len = tmp->togo;
+		n_iov++;
+	}
+
+	/* execute non-blocking write */
+do_write:
+	rc = writev(wst->fd, iov, n_iov);
+	if (rc < 0) {
+		if (errno == EINTR)
+			goto do_write;
+		if (errno != EAGAIN)
+			goto err_out;
+		return true;
+	}
+
+	/* iterate through write queue, issuing completions based on
+	 * amount of data written
+	 */
+	while (rc > 0) {
+		int sz;
+
+		/* get pointer to first record on list */
+		tmp = list_entry(wst->write_q.next, struct atcp_write, node);
+
+		/* mark data consumed by decreasing tmp->len */
+		sz = (tmp->togo < rc) ? tmp->togo : rc;
+		tmp->togo -= sz;
+		tmp->buf +=
Re: [tabled patch] abstract out TCP-write code
On 09/23/2010 11:28 AM, Jim Meyering wrote:
Every developer should have MALLOC_PERTURB_=N (N in 1..255) set in his/her environment on glibc-based systems. Almost all the time.

I heard about it a while ago, even submitted a bugzilla bug to have it documented adequately. But apparently it's absent from my .bash_profile. Added.

Jeff
Re: [tabled patch] abstract out TCP-write code
On 09/22/2010 10:37 PM, Pete Zaitcev wrote:
On Wed, 22 Sep 2010 21:26:13 -0400 Jeff Garzik <j...@garzik.org> wrote:
It is a common idiom even in GLib that callbacks receive two anonymous pointers; witness the data type GFunc's 'data' and 'user_data' arguments: http://library.gnome.org/devel/glib/stable/glib-Doubly-Linked-Lists.html#GFunc

There's a lot of retarged garbage in Glib, just look at their lists. If someone smarter wrote Glib, we would not need struct list_head.

I use both list types, because there's a use case for both. You don't always have the luxury of having a struct in which to embed data+next pointers. Allocated strings are an excellent example.

GFunc has two parameters for a reason :) See for example http://library.gnome.org/devel/glib/stable/glib-Doubly-Linked-Lists.html#g-list-foreach

It really is a common idiom, based on a common need, not just my style preference. :)

Jeff
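The two-pointer callback idiom referred to above can be sketched without GLib. The callback type below has the same shape as GFunc (one pointer for the element, one for caller-supplied state); item_func, array_foreach, and sum_len are hypothetical names, and the array walk stands in for g_list_foreach.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Same shape as GLib's GFunc: per-element data plus caller state. */
typedef void (*item_func)(void *data, void *user_data);

/* Call fn once per element, passing user_data through untouched. */
static void array_foreach(void **items, size_t n, item_func fn,
			  void *user_data)
{
	size_t i;

	assert(fn != NULL);
	for (i = 0; i < n; i++)
		fn(items[i], user_data);
}

/* Example callback: 'data' is a C string, 'user_data' a running total. */
static void sum_len(void *data, void *user_data)
{
	const char *s = data;
	size_t *total = user_data;

	*total += strlen(s);
}
```

The elements here are bare strings with no struct to embed a list node or a back-pointer in, which is the case where the second pointer is the only way to hand the callback its context.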
Re: [tabled patch] abstract out TCP-write code
On 09/22/2010 10:37 PM, Pete Zaitcev wrote:
On Wed, 22 Sep 2010 21:26:13 -0400 Jeff Garzik <j...@garzik.org> wrote:
So, we go a longer route and re-hook the list of completions to a per-server global instead of a client. The patch is straightforward. The only thing we need to be careful about is to make sure that no outstanding completions are left in the queue before freeing a client struct. This is ensured by force-running completions.

Looking at this change again, I don't see how this avoids use-after-free. If completions exist after state change function leads one to cli_evt_dispose() -> cli_free(), then cli_write_run_compl() still calls cli_write_free() with the stale 'cli' pointer.

We run completions before freeing in all cases. My patch was correct.

Logically, if completions are run before freeing in all cases, there is no need to make write_compl_q global. That was a red herring, which by side effect avoided the bug with the stale 'cli' pointer.

Jeff
Re: Reconsidering libevent
On Tue, Sep 21, 2010 at 5:06 PM, Steven Dake <sd...@redhat.com> wrote:
> libevent version 2 has proper mutual exclusion, but the code needs some work. 1.x should work for chunkd at the moment.

I need to resist my own urge to think too far ahead and overengineer for the future sometimes; I think this is one of those occasions.

libevent 1.x seems solid for single-thread usage, and that's how we'll use libevent, even if multiple chunkd threads are in existence.

Jeff
Re: [hail patch 0/3] chunkd: on-disk checksumming and get-partial operation
Just pushed this out to hail.git.
Re: [PATCH] don't expect inode name to be NUL-terminated (avoid read overrun)
On 09/10/2010 08:55 AM, Jim Meyering wrote:
> * server/msg.c (msg_get): Copy only name_len bytes, then NUL-terminate, rather than using snprintf to copy up to and including nonexistent NUL.
> ---
> valgrind exposed this. The use of snprintf would have been correct if the inode name buffer (following the struct raw_inode) were NUL-terminated, but it is not.

applied -- good catch

Out of curiosity, what is your patch base? We combined cld and chunkd into a single 'hail' pkg, and from the pathname, your patch was generated from the older cld pkg. We'd like to find the source and replace cld/chunkd with 'hail'. F12? F13? rawhide?

Thanks,

Jeff
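The fix described above (copy exactly name_len bytes, then terminate) follows a common pattern for length-counted buffers. A minimal sketch, assuming nothing about the real msg_get(): copy_counted_name is a hypothetical helper, not a function in cld or hail.

```c
#include <string.h>

/* Copy a length-counted name that is NOT NUL-terminated into dst and
 * terminate it.  Returns the number of bytes copied (excluding the
 * NUL), truncating if dst is too small.  dst_sz must be at least 1. */
static size_t copy_counted_name(char *dst, size_t dst_sz,
				const void *name, size_t name_len)
{
	size_t n = name_len;

	if (n > dst_sz - 1)
		n = dst_sz - 1;
	memcpy(dst, name, n);	/* never reads past name_len bytes */
	dst[n] = '\0';
	return n;
}
```

Unlike snprintf(dst, sz, "%s", name), this never scans the source for a terminator, so it cannot read past the end of the name.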
[hail patch 0/3] chunkd: on-disk checksumming and get-partial operation
This patchset is just about ready to go upstream. Just need to write a couple tests (familiar refrain eh? :)).

These changes add a new Get-Partial-Object (GET_PART) chunkd operation. GET_PART permits partial retrieval of an object, by adding an (offset,length) pair to the standard Get-Object (GET) operation. length==0 is special-cased as meaning "retrieve until end of object". The maximum number of bytes that may be requested in a single GET_PART request is 4 x 64k blocks (256k). Larger lengths will be truncated down to the maximum.

Because we currently only store whole-object SHA1 checksums, we are left without an ability to verify on-disk data is valid, when retrieving a subset of an object. Thus, a necessary pre-req of GET_PART is changing the checksum scheme, which is done as follows:

* objects are defined as runs of 64k logical blocks
* checksums are stored on-disk for each 64k in an object
* Rather than returning the stored SHA1 checksum, which serves to verify both on-disk and network integrity, we break this into two steps,
  * verify per-64k checksums at GET_PART time
  * generate on-the-fly SHA1 checksum for GET_PART returned data

The chunkd network protocol supports any offset/length, including not-64k-aligned values. However, failure to align GET_PART requests on 64k boundaries will result in reduced performance, due to additional work chunkd must perform [and then throw away], because chunkd now works in 64k chunks internally.

This is a major protocol milestone, and should immediately enable sane usage by nfs4d and itd (see wiki if unfamiliar), as well as hopefully providing useful benefits to tabled as well.
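The GET_PART length rules and the cost of unaligned requests described above can be sketched as plain arithmetic. This is an illustration only, not chunkd's implementation: get_part_len, blocks_touched and the constant names are hypothetical, and two behaviours are assumptions (that the 256k cap also applies to the length==0 case, and that an offset at or past the end of the object yields zero bytes).

```c
#include <stdint.h>

enum {
	BLK_ORDER	= 16,			/* 64k blocks */
	BLK_SZ		= 1 << BLK_ORDER,
	GET_PART_MAX	= 4 * BLK_SZ,		/* 256k per request */
};

/* Effective number of bytes a GET_PART request would return.
 * length == 0 means "until end of object"; anything above the
 * per-request maximum is truncated (assumed to include length == 0). */
static uint64_t get_part_len(uint64_t obj_len, uint64_t offset,
			     uint64_t length)
{
	uint64_t avail;

	if (offset >= obj_len)
		return 0;			/* assumption, see lead-in */
	avail = obj_len - offset;
	if (length == 0 || length > avail)
		length = avail;
	if (length > GET_PART_MAX)
		length = GET_PART_MAX;
	return length;
}

/* Number of 64k blocks that must be read and verified to serve
 * [offset, offset + len).  len must be non-zero. */
static uint64_t blocks_touched(uint64_t offset, uint64_t len)
{
	uint64_t first = offset >> BLK_ORDER;
	uint64_t last = (offset + len - 1) >> BLK_ORDER;

	return last - first + 1;
}
```

A 64k request at offset 0 touches one block, while the same request at offset 1 touches two, which is the extra work the cover letter warns about.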
[hail patch 1/3] chunkd: Add checksum table to on-disk format
commit f1de17a6e2b3afdbfbfa581228280b65a4a17e5f
Author: Jeff Garzik <j...@garzik.org>
Date:   Thu Aug 5 17:47:03 2010 -0400

    chunkd: Add checksum table to on-disk format, one sum per 64k of data

    Signed-off-by: Jeff Garzik <jgar...@redhat.com>

 chunkd/be-fs.c |  162
 1 file changed, 137 insertions(+), 25 deletions(-)

diff --git a/chunkd/be-fs.c b/chunkd/be-fs.c
index 4b851a7..d714e7c 100644
--- a/chunkd/be-fs.c
+++ b/chunkd/be-fs.c
@@ -53,14 +53,23 @@ struct fs_obj {
 	int			in_fd;
 	char			*in_fn;
 	off_t			sendfile_ofs;
+
+	size_t			checked_bytes;
+	SHA_CTX			checksum;
+	unsigned int		csum_idx;
+	void			*csum_tbl;
+	size_t			csum_tbl_sz;
+
+	unsigned int		n_blk;
 };
 
 struct be_fs_obj_hdr {
 	char			magic[4];
 	uint32_t		key_len;
 	uint64_t		value_len;
+	uint32_t		n_blk;
 
-	char			reserved[16];
+	char			reserved[12];
 
 	unsigned char		hash[CHD_CSUM_SZ];
 	char			owner[128];
@@ -208,6 +217,8 @@ static struct fs_obj *fs_obj_alloc(void)
 	obj->out_fd = -1;
 	obj->in_fd = -1;
 
+	SHA1_Init(&obj->checksum);
+
 	return obj;
 }
 
@@ -318,6 +329,17 @@ static bool key_valid(const void *key, size_t key_len)
 	return true;
 }
 
+static unsigned int fs_blk_count(uint64_t data_len)
+{
+	uint64_t n_blk;
+
+	n_blk = data_len >> CHUNK_BLK_ORDER;
+	if (data_len & (CHUNK_BLK_SZ - 1))
+		n_blk++;
+
+	return (unsigned int) n_blk;
+}
+
 struct backend_obj *fs_obj_new(uint32_t table_id,
 			       const void *key, size_t key_len,
 			       uint64_t data_len,
@@ -325,6 +347,7 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 {
 	struct fs_obj *obj;
 	char *fn = NULL;
+	size_t csum_bytes;
 	enum chunk_errcode erc = che_InternalError;
 	off_t skip_len;
 
@@ -339,6 +362,13 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 		return NULL;
 	}
 
+	obj->n_blk = fs_blk_count(data_len);
+	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
+	obj->csum_tbl = malloc(csum_bytes);
+	if (!obj->csum_tbl)
+		goto err_out;
+	obj->csum_tbl_sz = csum_bytes;
+
 	/* build local fs pathname */
 	fn = fs_obj_pathname(table_id, key, key_len);
 	if (!fn)
@@ -359,7 +389,7 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 	obj->out_fn = fn;
 
 	/* calculate size of front-of-file metadata area */
-	skip_len = sizeof(struct be_fs_obj_hdr) + key_len;
+	skip_len = sizeof(struct be_fs_obj_hdr) + key_len + csum_bytes;
 
 	/* position file pointer where object data (as in, not metadata)
 	 * will begin
@@ -397,7 +427,10 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 	struct be_fs_obj_hdr hdr;
 	ssize_t rrc;
 	uint64_t value_len, tmp64;
+	size_t csum_bytes;
 	enum chunk_errcode erc = che_InternalError;
+	struct iovec iov[2];
+	size_t total_rd_len;
 
 	if (!key_valid(key, key_len)) {
 		*err_code = che_InvalidKey;
@@ -457,23 +490,45 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 		goto err_out;
 
 	value_len = GUINT64_FROM_LE(hdr.value_len);
+	obj->n_blk = GUINT32_FROM_LE(hdr.n_blk);
+	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
 
 	/* verify file size large enough to contain value */
-	tmp64 = value_len + sizeof(hdr) + key_len;
+	tmp64 = value_len + sizeof(hdr) + key_len + csum_bytes;
 	if (G_UNLIKELY(st.st_size < tmp64)) {
 		applog(LOG_ERR, "obj(%s) size error, too small", obj->in_fn);
 		goto err_out;
 	}
 
+	/* verify expected size of checksum table */
+	if (G_UNLIKELY(fs_blk_count(value_len) != obj->n_blk)) {
+		applog(LOG_ERR, "obj(%s) unexpected blk count "
+		       "(%u from val sz, %u from hdr)",
+		       obj->in_fn, fs_blk_count(value_len), obj->n_blk);
+		goto err_out;
+	}
+
+	obj->csum_tbl = malloc(csum_bytes);
+	if (!obj->csum_tbl)
+		goto err_out;
+	obj->csum_tbl_sz = csum_bytes;
+
 	obj->bo.key = malloc(key_len);
 	obj->bo.key_len = key_len;
 	if (!obj->bo.key)
 		goto err_out;
 
-	/* read object variable-length header */
-	rrc = read(obj->in_fd, obj->bo.key, key_len);
-	if ((rrc != key_len) || (memcmp(key, obj->bo.key, key_len))) {
-		applog(LOG_ERR, "read hdr key obj(%s) failed: %s",
+	/* init additional header segment list */
+	iov[0].iov_base
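The block-count and metadata-size arithmetic in this patch is easy to check in isolation. A minimal sketch, assuming a 20-byte SHA1 digest for CHD_CSUM_SZ (the constant's value is not shown in the patch); blk_count, data_offset, hdr_sz and the constant names are illustrative stand-ins, not chunkd's own.

```c
#include <stddef.h>
#include <stdint.h>

enum {
	BLK_ORDER	= 16,		/* 64k blocks */
	BLK_SZ		= 1 << BLK_ORDER,
	CSUM_SZ		= 20,		/* assumed: SHA1 digest size */
};

/* Number of 64k blocks needed to hold data_len bytes, rounding up. */
static uint64_t blk_count(uint64_t data_len)
{
	uint64_t n_blk = data_len >> BLK_ORDER;

	if (data_len & (BLK_SZ - 1))
		n_blk++;
	return n_blk;
}

/* Bytes of metadata preceding the object data: fixed header, plus
 * key, plus one checksum per block.  hdr_sz stands in for
 * sizeof(struct be_fs_obj_hdr), whose full layout is not shown. */
static uint64_t data_offset(size_t hdr_sz, size_t key_len, uint64_t data_len)
{
	return hdr_sz + key_len + blk_count(data_len) * CSUM_SZ;
}
```

Because the table size is derived from the value length, a reader can cross-check the n_blk stored in the header against the length, which is what the "unexpected blk count" test in fs_obj_open() does.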
Re: [tabled patch 4/5] Support auto replication port
On 08/12/2010 03:22 PM, Pete Zaitcev wrote:
> Allow random ports for replication master to listen on.
>
> The patch is somewhat larger than expected, because before we had the MASTER file written right after locking. Now we may have it written without listening parameters, and the slaves must be ready to deal with it.
>
> Unlike the auto client port, we do not need to write any accessor files, because we already report the host and port through CLD.
>
> Listening on random ports has security implications.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>

applied 1-4 of 5
Re: [tabled patch 1/3] make a const struct static
On 08/05/2010 11:40 PM, Pete Zaitcev wrote:
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
> ---
>  server/server.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> commit 93b990f68e5c2c652759a2db8af049d172b8489c
> Author: Pete Zaitcev <zait...@yahoo.com>
> Date:   Thu Aug 5 20:33:21 2010 -0600
>
>     Make initialized struct a static const.

applied 1-2
Re: [hail patch 2/3] fix 32/64 wire interoperability
On 08/04/2010 07:16 PM, Pete Zaitcev wrote:
> Testing found that tabled and chunkd running on CPUs with different word length cannot talk to each other. The bug was introduced by commit ea5d20bc22aeed077312c9c1824e84651af17a16. The fix is to add named padding that takes the place of the invisible padding, thus making the layout platform-neutral.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
> ---
>  include/chunk_msg.h |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/include/chunk_msg.h b/include/chunk_msg.h
> index a34fc21..4c170e4 100644
> --- a/include/chunk_msg.h
> +++ b/include/chunk_msg.h
> @@ -91,6 +91,7 @@ struct chunksrv_resp {
>  	uint32_t		nonce;		/* txn id, copied from request */
>  	uint64_t		data_len;	/* len of addn'l data */
>  	unsigned char		hash[CHD_CSUM_SZ];	/* SHA1 checksum */
> +	unsigned char		rsv2[4];	/* pad for 64 bits */
>  };

good catch. applied 1-3, and pushed out.

I wonder if we shouldn't switch to attribute(packed) for safety, though.
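The padding problem fixed here can be demonstrated with compile-time checks. The struct below is a hypothetical stand-in, not the real struct chunksrv_resp: the leading magic and resp_code fields are invented for illustration, and the hash is assumed to be a 20-byte SHA1 digest. Only nonce, data_len, hash and rsv2 are taken from the patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical wire header, loosely modelled on the patch. */
struct demo_resp {
	uint8_t		magic[8];	/* invented for illustration */
	uint32_t	resp_code;	/* invented for illustration */
	uint32_t	nonce;		/* txn id, copied from request */
	uint64_t	data_len;	/* len of addn'l data */
	unsigned char	hash[20];	/* SHA1 checksum (assumed 20 bytes) */
	unsigned char	rsv2[4];	/* explicit pad to a multiple of 8 */
};

/* Without rsv2, an ABI that aligns uint64_t to 8 bytes appends 4 bytes
 * of invisible tail padding (size 48), while 32-bit x86, which aligns
 * it to 4 inside structs, does not (size 44).  With the named pad both
 * agree, and these checks break the build if the layout drifts. */
_Static_assert(sizeof(struct demo_resp) == 48, "wire size must be 48");
_Static_assert(offsetof(struct demo_resp, data_len) == 16, "data_len offset");
_Static_assert(offsetof(struct demo_resp, hash) == 24, "hash offset");
```

Static assertions like these are one alternative to a packed attribute: they leave natural alignment in place but make any platform-dependent layout a compile error.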
Re: [hail patch 1/1] Make host, url, orig_path dynamic
On 07/29/2010 01:41 PM, Pete Zaitcev wrote:
> On Tue, 20 Jul 2010 16:34:19 -0400 Jeff Garzik <j...@garzik.org> wrote:
>>>  lib/hstor.c |  147 +++---
>>>  1 file changed, 104 insertions(+), 43 deletions(-)
>>
>> applied
>
> It's not in the git repo. Check this URL: http://git.kernel.org/?p=daemon/distsrv/hail.git

Forgot to push, sorry. It's there now.

Jeff
Re: [hail patch 1/7] Drop old comments about chunkdc
On 07/29/2010 10:49 PM, Pete Zaitcev wrote:
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
> ---
>  configure.ac |    2 --
>  1 file changed, 2 deletions(-)
>
> commit 00be6055a3801ef8e84a4c78b43b43b67a76eab9
> Author: Pete Zaitcev <zait...@yahoo.com>
> Date:   Thu Jul 29 19:10:05 2010 -0600
>
>     Drop comment for a dead library.

applied 1-7 to tabled repo
Re: [hail patch 1/1] Make host, url, orig_path dynamic
On 07/20/2010 04:16 PM, Pete Zaitcev wrote:
> Some of my performance tests for tabled hit truncation again:
>
> [zait...@hitlain tests]$ ./poke5 -v -h niphredil.zaitcev.lan -u auser -p apass -b test -o -k testkey-hitlain/73b84a11e6d83c65e45853338d646042 -f testdir/73b84a11e6d83c65e45853338d646042
> * About to connect() to niphredil.zaitcev.lan port 80 (#0)
> * Trying fec0::1:219:b9ff:fe58:7ad6...
> * TCP_NODELAY set
> * connected
> * Connected to niphredil.zaitcev.lan (fec0::1:219:b9ff:fe59:7ad6) port 80 (#0)
> PUT /test/testkey-hitlain/73b84a11e6d83c65e45853338d HTTP/1.1
> Accept: */*
> Host: niphredil.zaitcev.lan
> Date: Tue, 20 Jul 2010 01:07:33 +
> Authorization: AWS testuser:RefcbVYgr2m9KTRxOrCfr4zzfPE=
> Content-Length: 214745088
> Expect: 100-continue
> * The requested URL returned error: 403
>
> As you can see, the path in PUT is truncated, and this causes 403 since it's included into a hash. The patch addresses this issue and a bunch of other fixed-size strings before we hit that.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
> ---
>  lib/hstor.c |  147 +++---
>  1 file changed, 104 insertions(+), 43 deletions(-)

applied
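The truncation reported above is the usual failure mode of formatting into a fixed-size array. One common remedy, shown here as a sketch and not as the contents of the lib/hstor.c patch (which is not reproduced in this message), is to measure first and allocate to fit. build_path is a hypothetical helper name.

```c
#include <stdio.h>
#include <stdlib.h>

/* Build "/bucket/key" in a heap buffer sized to fit, instead of a
 * fixed-size array that silently truncates long keys.
 * Returns NULL on failure; the caller frees the result. */
static char *build_path(const char *bucket, const char *key)
{
	/* C99: snprintf with a zero size reports the length needed */
	int len = snprintf(NULL, 0, "/%s/%s", bucket, key);
	char *path;

	if (len < 0)
		return NULL;
	path = malloc((size_t) len + 1);
	if (!path)
		return NULL;
	snprintf(path, (size_t) len + 1, "/%s/%s", bucket, key);
	return path;
}
```

Since the request path feeds into the signature hash, any silent truncation changes the signed string and the server rejects the request, as in the 403 above.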
[PATCH 4/4] chunkd checksums each block, as it is read from disk
Note that we are checksumming hot cache data, so SHA1 isn't as punishing as one might think.

 chunkd/be-fs.c |   51 ++-
 1 file changed, 50 insertions(+), 1 deletion(-)

commit 2211e3b58620093866be4130397cb3b476620725
Author: Jeff Garzik <j...@garzik.org>
Date:   Sun Jul 18 03:03:35 2010 -0400

    [chunkd] checksum data prior to returning via GET

    When reading a file off disk, checksum the data after reading from disk,
    prior to sending across network to client. Fail read, if checksum fails.

    This guarantees we will never send corrupted data to a client.

    Signed-off-by: Jeff Garzik <jgar...@redhat.com>

diff --git a/chunkd/be-fs.c b/chunkd/be-fs.c
index 2120991..dce2561 100644
--- a/chunkd/be-fs.c
+++ b/chunkd/be-fs.c
@@ -49,6 +49,10 @@ struct fs_obj {
 
 	int			in_fd;
 	char			*in_fn;
+	off_t			in_pos;
+
+	off_t			tail_pos;
+	size_t			tail_len;
 
 	size_t			checked_bytes;
 	SHA_CTX			checksum;
@@ -364,6 +368,8 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 	if (!obj->csum_tbl)
 		goto err_out;
 	obj->csum_tbl_sz = csum_bytes;
+	obj->tail_pos = data_len & ~(CHUNK_BLK_SZ - 1);
+	obj->tail_len = data_len & (CHUNK_BLK_SZ - 1);
 
 	/* build local fs pathname */
 	fn = fs_obj_pathname(table_id, key, key_len);
@@ -488,6 +494,8 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 	value_len = GUINT64_FROM_LE(hdr.value_len);
 	obj->n_blk = GUINT32_FROM_LE(hdr.n_blk);
 	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
+	obj->tail_pos = value_len & ~(CHUNK_BLK_SZ - 1);
+	obj->tail_len = value_len & (CHUNK_BLK_SZ - 1);
 
 	/* verify file size large enough to contain value */
 	tmp64 = value_len + sizeof(hdr) + key_len + csum_bytes;
@@ -571,15 +579,56 @@ void fs_obj_free(struct backend_obj *bo)
 	free(obj);
 }
 
+static bool can_csum_blk(struct fs_obj *obj, size_t len)
+{
+	if (obj->in_pos & (CHUNK_BLK_SZ - 1))
+		return false;
+
+	if (obj->in_pos == obj->tail_pos && len == obj->tail_len)
+		return true;
+	if (len == CHUNK_BLK_SZ)
+		return true;
+
+	return false;
+}
+
 ssize_t fs_obj_read(struct backend_obj *bo, void *ptr, size_t len)
 {
 	struct fs_obj *obj = bo->private;
 	ssize_t rc;
 
 	rc = read(obj->in_fd, ptr, len);
-	if (rc < 0)
+	if (rc < 0) {
 		applog(LOG_ERR, "obj read(%s) failed: %s",
 		       obj->in_fn, strerror(errno));
+		return -errno;
+	}
+
+	if (can_csum_blk(obj, rc)) {
+		unsigned char md[CHD_CSUM_SZ];
+		unsigned int blk_pos;
+		int cmprc;
+
+		SHA1(ptr, rc, md);
+
+		blk_pos = (unsigned int) (obj->in_pos >> CHUNK_BLK_ORDER);
+		cmprc = memcmp(md, obj->csum_tbl + (blk_pos * CHD_CSUM_SZ),
+			       CHD_CSUM_SZ);
+
+		if (cmprc) {
+			applog(LOG_WARNING, "obj(%s) csum failed @ 0x%llx",
+			       obj->in_fn,
+			       (unsigned long long) obj->in_pos);
+			return -EIO;
+		}
+	} else {
+		applog(LOG_INFO, "obj(%s) unaligned read, 0x%x @ 0x%llx",
+		       obj->in_fn, len,
+		       (unsigned long long) obj->in_pos);
+
+	}
+
+	obj->in_pos += rc;
 
 	return rc;
 }
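The alignment test that decides whether a read can be verified is worth seeing on its own. The sketch below restates the logic of can_csum_blk() as a pure function, with the object length passed in rather than cached in the object; covers_whole_blk, blk_index and the constant names are hypothetical, and this is an illustration rather than chunkd code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

enum {
	BLK_ORDER	= 16,		/* 64k blocks */
	BLK_SZ		= 1 << BLK_ORDER,
};

/* Decide whether a read of 'len' bytes at 'pos' covers exactly one
 * checksummed block of an object 'obj_len' bytes long: either a full
 * 64k block, or the short tail block at the end of the object. */
static bool covers_whole_blk(uint64_t obj_len, uint64_t pos, size_t len)
{
	uint64_t tail_pos = obj_len & ~(uint64_t) (BLK_SZ - 1);
	size_t tail_len = obj_len & (BLK_SZ - 1);

	if (pos & (BLK_SZ - 1))
		return false;		/* not block-aligned */
	if (pos == tail_pos && len == tail_len)
		return true;		/* final, partial block */
	return len == BLK_SZ;		/* full block */
}

/* Index into the per-block checksum table for a block-aligned position. */
static uint64_t blk_index(uint64_t pos)
{
	return pos >> BLK_ORDER;
}
```

For a 100000-byte object the tail block starts at 65536 and holds 34464 bytes, so only reads of (0, 65536) and (65536, 34464) can be checked against the table; anything else falls into the "unaligned read" branch.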
Re: [PATCH 1/3 v2] chunkd: remove sendfile(2) zero-copy support
On 07/17/2010 11:45 PM, Steven Dake wrote:
> On 07/16/2010 10:46 PM, Jeff Garzik wrote:
>> chunkd: remove sendfile(2) zero-copy support
>>
>> chunkd will be soon checksumming data in main memory. That removes the utility of a zero-copy interface which bypasses the on-heap data requirement.
>>
>> Signed-off-by: Jeff Garzik <jgar...@redhat.com>
>
> May be able to use vmsplice with sendfile (if linux is only target platform). Haven't tried it myself, but the operations look interesting at achieving zero copy with sockets from memory addresses. Even though the man pages say only for pipes, this syscall definitely works with TCP.

The big question: is it actually faster than read()+write() ?

Years ago, I experimented with using some fancy new Linux-specific syscalls in a from-scratch implementation of cp(1). It turned out that read()+write() was faster than other methods. That was file->file copying.

It's probably worth investigating vmsplice() for our file->checksum->TCP case, definitely.

Jeff
[PATCH 2/3 v2] chunkd: Add checksum table to on-disk format, one sum per 64k of data
 chunkd/be-fs.c  |  145 +---
 chunkd/chunkd.h |    3 +
 2 files changed, 131 insertions(+), 17 deletions(-)

commit 394109d5c2fc2d15d91c2d36eecd57594922c1b3
Author: Jeff Garzik <j...@garzik.org>
Date:   Sat Jul 17 01:05:15 2010 -0400

    chunkd: Add checksum table to on-disk format, one sum per 64k of data

    Signed-off-by: Jeff Garzik <jgar...@redhat.com>

diff --git a/chunkd/be-fs.c b/chunkd/be-fs.c
index 0a81134..5955afa 100644
--- a/chunkd/be-fs.c
+++ b/chunkd/be-fs.c
@@ -49,14 +49,23 @@ struct fs_obj {
 
 	int			in_fd;
 	char			*in_fn;
+
+	size_t			checked_bytes;
+	SHA_CTX			checksum;
+	unsigned int		csum_idx;
+	void			*csum_tbl;
+	size_t			csum_tbl_sz;
+
+	unsigned int		n_blk;
 };
 
 struct be_fs_obj_hdr {
 	char			magic[4];
 	uint32_t		key_len;
 	uint64_t		value_len;
+	uint32_t		n_blk;
 
-	char			reserved[16];
+	char			reserved[12];
 
 	unsigned char		hash[CHD_CSUM_SZ];
 	char			owner[128];
@@ -204,6 +213,8 @@ static struct fs_obj *fs_obj_alloc(void)
 	obj->out_fd = -1;
 	obj->in_fd = -1;
 
+	SHA1_Init(&obj->checksum);
+
 	return obj;
 }
 
@@ -314,6 +325,17 @@ static bool key_valid(const void *key, size_t key_len)
 	return true;
 }
 
+static unsigned int fs_blk_count(uint64_t data_len)
+{
+	uint64_t n_blk;
+
+	n_blk = data_len >> CHUNK_BLK_ORDER;
+	if (data_len & (CHUNK_BLK_SZ - 1))
+		n_blk++;
+
+	return (unsigned int) n_blk;
+}
+
 struct backend_obj *fs_obj_new(uint32_t table_id,
 			       const void *key, size_t key_len,
 			       uint64_t data_len,
@@ -321,6 +343,7 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 {
 	struct fs_obj *obj;
 	char *fn = NULL;
+	size_t csum_bytes;
 	enum chunk_errcode erc = che_InternalError;
 	off_t skip_len;
 
@@ -335,6 +358,13 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 		return NULL;
 	}
 
+	obj->n_blk = fs_blk_count(data_len);
+	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
+	obj->csum_tbl = malloc(csum_bytes);
+	if (!obj->csum_tbl)
+		goto err_out;
+	obj->csum_tbl_sz = csum_bytes;
+
 	/* build local fs pathname */
 	fn = fs_obj_pathname(table_id, key, key_len);
 	if (!fn)
@@ -355,7 +385,7 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 	obj->out_fn = fn;
 
 	/* calculate size of front-of-file metadata area */
-	skip_len = sizeof(struct be_fs_obj_hdr) + key_len;
+	skip_len = sizeof(struct be_fs_obj_hdr) + key_len + csum_bytes;
 
 	/* position file pointer where object data (as in, not metadata)
 	 * will begin
@@ -393,7 +423,10 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 	struct be_fs_obj_hdr hdr;
 	ssize_t rrc;
 	uint64_t value_len, tmp64;
+	size_t csum_bytes;
 	enum chunk_errcode erc = che_InternalError;
+	struct iovec iov[2];
+	size_t total_rd_len;
 
 	if (!key_valid(key, key_len)) {
 		*err_code = che_InvalidKey;
@@ -453,23 +486,45 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 		goto err_out;
 
 	value_len = GUINT64_FROM_LE(hdr.value_len);
+	obj->n_blk = GUINT32_FROM_LE(hdr.n_blk);
+	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
 
 	/* verify file size large enough to contain value */
-	tmp64 = value_len + sizeof(hdr) + key_len;
+	tmp64 = value_len + sizeof(hdr) + key_len + csum_bytes;
 	if (G_UNLIKELY(st.st_size < tmp64)) {
 		applog(LOG_ERR, "obj(%s) size error, too small", obj->in_fn);
 		goto err_out;
 	}
 
+	/* verify expected size of checksum table */
+	if (G_UNLIKELY(fs_blk_count(value_len) != obj->n_blk)) {
+		applog(LOG_ERR, "obj(%s) unexpected blk count "
+		       "(%u from val sz, %u from hdr)",
+		       obj->in_fn, fs_blk_count(value_len), obj->n_blk);
+		goto err_out;
+	}
+
+	obj->csum_tbl = malloc(csum_bytes);
+	if (!obj->csum_tbl)
+		goto err_out;
+	obj->csum_tbl_sz = csum_bytes;
+
 	obj->bo.key = malloc(key_len);
 	obj->bo.key_len = key_len;
 	if (!obj->bo.key)
 		goto err_out;
 
-	/* read object variable-length header */
-	rrc = read(obj->in_fd, obj->bo.key, key_len);
-	if ((rrc != key_len) || (memcmp(key, obj->bo.key, key_len))) {
-		applog(LOG_ERR, "read hdr key obj(%s) failed: %s",
+	/* init additional header segment list */
+	iov[0].iov_base = obj->bo.key
[PATCH 0/3] update chunkd checksum verification scheme
This patchset is part of the work necessary to get ranged-GET (aka partial GET) working. As explained in http://marc.info/?l=hail-devel&m=127871407125539&w=2 the current chunkd checksum scheme does not work at all for partial retrievals, and must be revamped.

These patches present step 1 of 4, adding a table of checksums to chunkd's local on-disk format. There are no protocol or API changes in this patchset; existing clients should work fine without any changes.

Nevertheless, this will not be committed to the main branch until partial retrieval is actually implemented. I don't commit changes unless they are actually needed. This checksum table and sendfile removal work is not required until partial-GET actually exists.

Jeff
[PATCH 1/3] chunkd: remove sendfile(2) support
commit d663521ba7e6a808be02633e57dbeb7a95973c0f
Author: Jeff Garzik <j...@garzik.org>
Date:   Thu Jul 15 13:50:10 2010 -0400

    chunkd: remove sendfile(2) zero-copy support

    chunkd will be soon checksumming data in main memory. That removes
    the utility of a zero-copy interface which bypasses the on-heap
    data requirement.

    Signed-off-by: Jeff Garzik <jgar...@redhat.com>

 chunkd/be-fs.c  |   60
 chunkd/chunkd.h |   14 -
 chunkd/object.c |   31
 chunkd/server.c |   28 --
 configure.ac    |    3 --
 5 files changed, 15 insertions(+), 121 deletions(-)

diff --git a/chunkd/be-fs.c b/chunkd/be-fs.c
index f72ed48..5c97388 100644
--- a/chunkd/be-fs.c
+++ b/chunkd/be-fs.c
@@ -25,9 +25,6 @@
 #include <sys/stat.h>
 #include <sys/socket.h>
 #include <sys/uio.h>
-#if defined(HAVE_SYS_SENDFILE_H)
-#include <sys/sendfile.h>
-#endif
 #include <stdlib.h>
 #include <unistd.h>
 #include <stdio.h>
@@ -52,7 +49,6 @@ struct fs_obj {
 
 	int			in_fd;
 	char			*in_fn;
-	off_t			sendfile_ofs;
 };
 
 struct be_fs_obj_hdr {
@@ -542,62 +538,6 @@ ssize_t fs_obj_write(struct backend_obj *bo, const void *ptr, size_t len)
 	return rc;
 }
 
-#if defined(HAVE_SENDFILE) && defined(__linux__)
-
-ssize_t fs_obj_sendfile(struct backend_obj *bo, int out_fd, size_t len)
-{
-	struct fs_obj *obj = bo->private;
-	ssize_t rc;
-
-	if (obj->sendfile_ofs == 0) {
-		obj->sendfile_ofs += sizeof(struct be_fs_obj_hdr);
-		obj->sendfile_ofs += bo->key_len;
-	}
-
-	rc = sendfile(out_fd, obj->in_fd, &obj->sendfile_ofs, len);
-	if (rc < 0)
-		applog(LOG_ERR, "obj sendfile(%s) failed: %s",
-		       obj->in_fn, strerror(errno));
-
-	return rc;
-}
-
-#elif defined(HAVE_SENDFILE) && defined(__FreeBSD__)
-
-ssize_t fs_obj_sendfile(struct backend_obj *bo, int out_fd, size_t len)
-{
-	struct fs_obj *obj = bo->private;
-	ssize_t rc;
-	off_t sbytes = 0;
-
-	if (obj->sendfile_ofs == 0) {
-		obj->sendfile_ofs += sizeof(struct be_fs_obj_hdr);
-		obj->sendfile_ofs += bo->key_len;
-	}
-
-	rc = sendfile(obj->in_fd, out_fd, obj->sendfile_ofs, len,
-		      NULL, &sbytes, 0);
-	if (rc < 0) {
-		applog(LOG_ERR, "obj sendfile(%s) failed: %s",
-		       obj->in_fn, strerror(errno));
-		return rc;
-	}
-
-	obj->sendfile_ofs += sbytes;
-
-	return sbytes;
-}
-
-#else
-
-ssize_t fs_obj_sendfile(struct backend_obj *bo, int out_fd, size_t len)
-{
-	applog(LOG_ERR, "BUG: sendfile used but not supported");
-	return -EOPNOTSUPP;
-}
-
-#endif /* HAVE_SENDFILE && HAVE_SYS_SENDFILE_H */
-
 bool fs_obj_write_commit(struct backend_obj *bo, const char *user,
 			 unsigned char *md, bool sync_data)
 {
diff --git a/chunkd/chunkd.h b/chunkd/chunkd.h
index 1e1b1d3..1e3741a 100644
--- a/chunkd/chunkd.h
+++ b/chunkd/chunkd.h
@@ -48,8 +48,6 @@ enum {
 	STD_COOKIE_MIN		= 7,
 
 	STD_TRASH_MAX		= 1000,
-
-	CLI_MAX_SENDFILE_SZ	= 512 * 1024,
 };
 
 struct client;
@@ -63,7 +61,6 @@ struct client_write {
 	uint64_t		len;		/* write buffer length */
 	cli_write_func		cb;		/* callback */
 	void			*cb_data;	/* data passed to cb */
-	bool			sendfile;	/* using sendfile? */
 
 	struct list_head	node;
 };
@@ -275,7 +272,6 @@ extern bool fs_obj_delete(uint32_t table_id, const char *user,
 			  const void *kbuf, size_t klen,
 			  enum chunk_errcode *err_code);
 extern int fs_obj_disable(const char *fn);
-extern ssize_t fs_obj_sendfile(struct backend_obj *bo, int out_fd, size_t len);
 extern int fs_list_objs_open(struct fs_obj_lister *t,
 			     const char *root_path, uint32_t table_id);
 extern int fs_list_objs_next(struct fs_obj_lister *t, char **fnp);
@@ -330,7 +326,6 @@ extern void applog(int prio, const char *fmt, ...);
 extern bool cli_err(struct client *cli, enum chunk_errcode code, bool recycle_ok);
 extern int cli_writeq(struct client *cli, const void *buf, unsigned int buflen,
 		     cli_write_func cb, void *cb_data);
-extern bool cli_wr_sendfile(struct client *, cli_write_func);
 extern bool cli_rd_set_poll(struct client *cli, bool readable);
 extern void cli_wr_set_poll(struct client *cli, bool writable);
 extern bool cli_cb_free(struct client *cli, struct client_write *wr,
@@ -349,15 +344,6 @@ extern void read_config(void);
 /* selfcheck.c */
 extern int chk_spawn(TCHDB *hdb);
 
-static inline bool use_sendfile(struct client *cli)
-{
-#if defined(HAVE_SENDFILE) && defined(HAVE_SYS_SENDFILE_H)
-	return cli->ssl ? false : true;
-#else
[PATCH 3/3] chunkd: on-disk format stores per-64k checksums
commit e6fcc02bea062af291148771a59ee2028ae98834
Author: Jeff Garzik <j...@garzik.org>
Date:   Thu Jul 15 13:57:17 2010 -0400

    chunkd: Add checksum table to on-disk format, one sum per 64k of data

    Signed-off-by: Jeff Garzik <jgar...@redhat.com>

 chunkd/be-fs.c |  145 +
 1 file changed, 127 insertions(+), 18 deletions(-)

diff --git a/chunkd/be-fs.c b/chunkd/be-fs.c
index 671c8fd..1bd85ea 100644
--- a/chunkd/be-fs.c
+++ b/chunkd/be-fs.c
@@ -40,6 +40,11 @@
 
 #define BE_FS_OBJ_MAGIC		"CHU1"
 
+enum {
+	CHUNK_BLK_ORDER		= 16,	/* 64k blocks */
+	CHUNK_BLK_SZ		= 1 << CHUNK_BLK_ORDER,
+};
+
 struct fs_obj {
 	struct backend_obj	bo;
 
@@ -49,14 +54,23 @@ struct fs_obj {
 
 	int			in_fd;
 	char			*in_fn;
+
+	size_t			checked_bytes;
+	SHA_CTX			checksum;
+	unsigned int		csum_idx;
+	void			*csum_tbl;
+	size_t			csum_tbl_sz;
+
+	unsigned int		n_blk;
 };
 
 struct be_fs_obj_hdr {
 	char			magic[4];
 	uint32_t		key_len;
 	uint64_t		value_len;
+	uint32_t		n_blk;
 
-	char			reserved[16];
+	char			reserved[12];
 
 	unsigned char		hash[CHD_CSUM_SZ];
 	char			owner[128];
@@ -204,6 +218,8 @@ static struct fs_obj *fs_obj_alloc(void)
 	obj->out_fd = -1;
 	obj->in_fd = -1;
 
+	SHA1_Init(&obj->checksum);
+
 	return obj;
 }
 
@@ -314,6 +330,17 @@ static bool key_valid(const void *key, size_t key_len)
 	return true;
 }
 
+static unsigned int fs_blk_count(uint64_t data_len)
+{
+	uint64_t n_blk;
+
+	n_blk = data_len >> CHUNK_BLK_ORDER;
+	if (data_len & (CHUNK_BLK_SZ - 1))
+		n_blk++;
+
+	return (unsigned int) n_blk;
+}
+
 struct backend_obj *fs_obj_new(uint32_t table_id,
 			       const void *key, size_t key_len,
 			       uint64_t data_len,
@@ -321,6 +348,7 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 {
 	struct fs_obj *obj;
 	char *fn = NULL;
+	size_t csum_bytes;
 	enum chunk_errcode erc = che_InternalError;
 	off_t skip_len;
 
@@ -335,6 +363,13 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 		return NULL;
 	}
 
+	obj->n_blk = fs_blk_count(data_len);
+	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
+	obj->csum_tbl = malloc(csum_bytes);
+	if (!obj->csum_tbl)
+		goto err_out;
+	obj->csum_tbl_sz = csum_bytes;
+
 	/* build local fs pathname */
 	fn = fs_obj_pathname(table_id, key, key_len);
 	if (!fn)
@@ -355,7 +390,7 @@ struct backend_obj *fs_obj_new(uint32_t table_id,
 	obj->out_fn = fn;
 
 	/* calculate size of front-of-file metadata area */
-	skip_len = sizeof(struct be_fs_obj_hdr) + key_len;
+	skip_len = sizeof(struct be_fs_obj_hdr) + key_len + csum_bytes;
 
 	/* position file pointer where object data (as in, not metadata)
 	 * will begin
@@ -391,8 +426,11 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 	struct stat st;
 	struct be_fs_obj_hdr hdr;
 	ssize_t rrc;
-	uint64_t value_len;
+	uint64_t value_len, tmp64;
+	size_t csum_bytes;
 	enum chunk_errcode erc = che_InternalError;
+	struct iovec iov[2];
+	size_t total_rd_len;
 
 	if (!key_valid(key, key_len)) {
 		*err_code = che_InvalidKey;
@@ -447,25 +485,49 @@ struct backend_obj *fs_obj_open(uint32_t table_id, const char *user,
 	}
 
 	/* verify object key length matches input key length */
-	if (GUINT32_FROM_LE(hdr.key_len) != key_len)
+	if (G_UNLIKELY(GUINT32_FROM_LE(hdr.key_len) != key_len))
 		goto err_out;
 
-	/* verify file size large enough to contain value */
 	value_len = GUINT64_FROM_LE(hdr.value_len);
-	if ((st.st_size - sizeof(hdr) - key_len) < value_len) {
+	obj->n_blk = GUINT32_FROM_LE(hdr.n_blk);
+	csum_bytes = obj->n_blk * CHD_CSUM_SZ;
+
+	/* verify file size large enough to contain value */
+	tmp64 = value_len + sizeof(hdr) + key_len + csum_bytes;
+	if (G_UNLIKELY(st.st_size < tmp64)) {
 		applog(LOG_ERR, "obj(%s) unexpected size change", obj->in_fn);
 		goto err_out;
 	}
 
+	/* verify expected size of checksum table */
+	if (G_UNLIKELY(fs_blk_count(value_len) != obj->n_blk)) {
+		applog(LOG_ERR, "obj(%s) unexpected blk count "
+		       "(%u from val sz, %u from hdr)",
+		       obj->in_fn, fs_blk_count(value_len), obj->n_blk);
+		goto err_out;
+	}
+
+	obj->csum_tbl = malloc(csum_bytes
Re: New 'hail' repository created, with major packaging rework
On 07/06/2010 11:24 AM, Pete Zaitcev wrote:
> On Mon, 05 Jul 2010 15:22:40 -0400 Jeff Garzik <j...@garzik.org> wrote:
>> Moving libhttpstor is now a simple matter of simultaneous commits to hail.git and tabled.git, moving the code and updating build machinery.
>
> BTW, I suggest we do it differently: rename the functions and the struct httpstor as they are introduced in libhail (without changing anything else, to prevent accidental regressions). This way, tabled and our out-of-tree tests can continue to build for a couple of days and smoothly switch over to new libraries.

OK, just pushed the following out to hail.git. If people disagree with naming, now's the time to speak up.

commit 5188f48dd3c73ce86f2bc453a326ee0bf40fd6db
Author: Jeff Garzik <j...@garzik.org>
Date:   Wed Jul 7 02:16:28 2010 -0400

    libhail: Import httpstor, httputil modules from tabled

    With the following transformations:

        s/req_/hreq_/
        s/httpstor_/hstor_/
        s//huri_/
        s//hutil_/

    Signed-off-by: Jeff Garzik <jgar...@redhat.com>
Re: [PATCH] tabled: use httpstor API from libhail
On Wed, Jul 07, 2010 at 06:38:22PM -0400, Jeff Garzik wrote:
> Just committed the following to tabled.git on my local laptop, on a side branch. This won't be pushed onto the main tabled branch until Friday, to give people time to convert as zaitcev suggested in the 'new hail repository' thread.

This has now been pushed, as branch 'libhail-merge'. Branch master, aka the main trunk, remains untouched until Friday (unless some critical tabled issue arises before then, of course).

> As a side note, this requires a couple hail.git commits that will be pushed to upstream hail.git from my local laptop in a couple hours (movement of uri_parse from tabled's libhttpstor into libhail), so you'll need to update hail.git before being able to use the patch below.

These hail.git commits have now been pushed:

commit c7b833069e28cf9bddb69f46bb5e09138ab4984d
    libhail: add huri_parse API (imported from tabled)

commit 55b0c57ca8f2b6beecea5a4680d76f45a7c32c28
    Fix .gitignore issue causing test/chunkd/ to be completely ignored.

commit 5188f48dd3c73ce86f2bc453a326ee0bf40fd6db
    libhail: Import httpstor, httputil modules from tabled

Jeff
Re: [PATCH] chunkd: add cp command, for local intra-table copies
On 07/06/2010 11:17 AM, Pete Zaitcev wrote:
On Tue, 6 Jul 2010 03:24:29 -0400 Jeff Garzik j...@garzik.org wrote:
The following patch, against current hail.git, adds the CP command to chunkd, permitting copying from object->object inside a single table.
What is it for?
Fun! :)
More seriously, it is mainly an infrastructure patch, adding things that the upcoming RCP command will use. As CP is far less complex, this allows me to verify several bits of machinery before moving forward.
I imagine CP will be tangentially helpful, but not a crucial feature in and of itself.
Jeff
Re: [PATCH] chunkd: add cp command, for local intra-table copies
On 07/06/2010 11:17 AM, Pete Zaitcev wrote:
On Tue, 6 Jul 2010 03:24:29 -0400 Jeff Garzik j...@garzik.org wrote:
The following patch, against current hail.git, adds the CP command to chunkd, permitting copying from object->object inside a single table.
What is it for?
Here's a real-world example. Quoting from the S3 documentation, this describes the PUT (copy) operation, something that tabled does not yet support, but should:
This implementation of the PUT operation creates a copy of an object that is already stored in Amazon S3. A PUT copy operation is the same as performing a GET and then a PUT. Adding the request header, x-amz-copy-source, makes the PUT operation copy the source object into the destination bucket.
Assuming that a given tabled object is already fully replicated -- HOPEFULLY the common case for us -- the least expensive way to implement this is
    for each chunkd containing object OLD_KEY
        CHO_CP(object OLD_KEY -> object NEW_KEY)
Assuming each chunkd node has the necessary free space, this method totally avoids using network bandwidth when creating a copy of an object.
Jeff
tabled: Some Amazon S3 features to consider
Here are a few interesting things that have appeared in the S3 API since its initial release:
1) Object versioning. All objects are now uniquely identified by a (key, version) pair. API compatibility is maintained by supporting the notion of a current version.
2) Object copying. Rather than an expensive S3-client-S3 round-trip, you may supply the x-amz-copy-source header to the PUT operation, causing S3 to use an existing object's data as the source for the PUT.
3) Reduced redundancy. The x-amz-storage-class header may be used to specify normal durability (STANDARD) or reduced durability (REDUCED_REDUNDANCY).
4) Regions (localization). Bucket locations may be set. Project Hail services have some notion of location as well. See if we can match up the two...
5) POST HTTP method. POST is like PUT, but can be used directly from a browser.
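To make item 2 concrete, here is a minimal sketch of how a server might split an x-amz-copy-source value into its source bucket and key. It assumes the value has the form /bucket/key (leading slash optional) and does no URL-decoding; parse_copy_source is a hypothetical helper, not an existing tabled function.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper (not part of tabled): split "/bucket/key" into its
 * two components. Returns 0 on success, -1 if the value is malformed or
 * either output buffer is too small. No URL-decoding is attempted.
 */
static int parse_copy_source(const char *val,
			     char *bucket, size_t bucket_sz,
			     char *key, size_t key_sz)
{
	const char *slash;
	size_t blen, klen;

	assert(val != NULL && bucket != NULL && key != NULL);

	if (*val == '/')		/* tolerate an optional leading slash */
		val++;

	slash = strchr(val, '/');
	if (!slash)
		return -1;		/* no key component at all */

	blen = (size_t)(slash - val);
	klen = strlen(slash + 1);
	if (blen == 0 || klen == 0 || blen >= bucket_sz || klen >= key_sz)
		return -1;

	memcpy(bucket, val, blen);
	bucket[blen] = '\0';
	memcpy(key, slash + 1, klen + 1);	/* copies the trailing NUL too */
	return 0;
}
```

The key keeps any further slashes, so "/photos/2010/cat.jpg" yields bucket "photos" and key "2010/cat.jpg".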
[PATCH] chunk: add CP operation
This patch
* adds local, intra-table copy operation to chunkd/libhail
* illustrates what files need updating, when adding a new op to chunk
* adds some 'worker' infrastructure which should help with future ops, notably remote copy (RCP)
* should assist tabled's implementation of S3 copy (x-amz-copy-source)
 chunkd/chunkd.h | 19 +++
 chunkd/object.c | 117 ++
 chunkd/server.c | 122
 doc/chcli.8 | 13 -
 include/chunk_msg.h | 2
 include/chunkc.h | 10 +++
 lib/chunkdc.c | 56 ++
 test/chunkd/Makefile.am | 5 +
 tools/chcli.c | 77 ++
 9 files changed, 409 insertions(+), 12 deletions(-)
diff --git a/chunkd/chunkd.h b/chunkd/chunkd.h
index e019f0d..5d39353 100644
--- a/chunkd/chunkd.h
+++ b/chunkd/chunkd.h
@@ -104,6 +104,8 @@ struct client {
 	unsigned int		req_used;	/* amount of req_buf in use */
 	void			*req_ptr;	/* start of unexamined data */
 	uint16_t		key_len;
+	unsigned int		var_len;	/* len of vari len record */
+	bool			second_var;	/* inside 2nd vari len rec? */
 
 	char			*hdr_start;	/* current hdr start */
 	char			*hdr_end;	/* current hdr end (so far) */
@@ -124,6 +126,7 @@ struct client {
 	char			netbuf_out[CLI_DATA_BUF_SZ];
 	char			key[CHD_KEY_SZ];
 	char			table[CHD_KEY_SZ];
+	char			key2[CHD_KEY_SZ];
 };
 
 struct backend_obj {
@@ -162,6 +165,14 @@ struct volume_entry {
 	char			*owner;		/* obj owner username */
 };
 
+struct worker_info {
+	enum chunk_errcode	err;		/* error returned to pipe */
+	struct client		*cli;		/* associated client conn */
+
+	void			(*thr_ev)(struct worker_info *);
+	void			(*pipe_ev)(struct worker_info *);
+};
+
 struct server_stats {
 	unsigned long		poll;		/* number polls */
 	unsigned long		event;		/* events dispatched */
@@ -209,6 +220,10 @@ struct server {
 
 	GHashTable		*fd_info;
 
+	GThreadPool		*workers;	/* global thread worker pool */
+	int			max_workers;
+	int			worker_pipe[2];
+
 	struct list_head	wr_trash;
 	unsigned int		trash_sz;
 
@@ -278,6 +293,7 @@ extern int fs_obj_do_sum(const char *fn, unsigned int klen, char **csump);
 extern bool object_del(struct client *cli);
 extern bool object_put(struct client *cli);
 extern bool object_get(struct client *cli, bool want_body);
+extern bool object_cp(struct client *cli);
 extern bool cli_evt_data_in(struct client *cli, unsigned int events);
 extern void cli_out_end(struct client *cli);
 extern void cli_in_end(struct client *cli);
@@ -314,12 +330,15 @@ extern bool cli_err(struct client *cli, enum chunk_errcode code, bool recycle_ok
 extern int cli_writeq(struct client *cli, const void *buf, unsigned int buflen, cli_write_func cb, void *cb_data);
 extern bool cli_wr_sendfile(struct client *, cli_write_func);
+extern bool cli_rd_set_poll(struct client *cli, bool readable);
 extern void cli_wr_set_poll(struct client *cli, bool writable);
 extern bool cli_cb_free(struct client *cli, struct client_write *wr, bool done);
 extern bool cli_write_start(struct client *cli);
 extern int cli_req_avail(struct client *cli);
 extern int cli_poll_mod(struct client *cli);
+extern bool worker_pipe_signal(struct worker_info *wi);
+extern bool tcp_cli_event(int fd, short events, void *userdata);
 
 extern void resp_init_req(struct chunksrv_resp *resp, const struct chunksrv_req *req);
diff --git a/chunkd/object.c b/chunkd/object.c
index 116792f..af187b6 100644
--- a/chunkd/object.c
+++ b/chunkd/object.c
@@ -25,6 +25,7 @@
 #include <unistd.h>
 #include <string.h>
 #include <errno.h>
+#include <poll.h>
 #include <stdio.h>
 #include <syslog.h>
 #include <glib.h>
@@ -356,3 +357,119 @@ start_write:
 	return cli_write_start(cli);
 }
 
+static void worker_cp_thr(struct worker_info *wi)
+{
+	static const unsigned bufsz = (1 * 1024 * 1024);
+	void *buf = NULL;
+	struct client *cli = wi->cli;
+	struct backend_obj *obj = NULL, *out_obj = NULL;
+	enum chunk_errcode err = che_InternalError;
+	unsigned char md[SHA_DIGEST_LENGTH];
+	char hashstr[50];
+
+	buf = malloc(bufsz);
+	if (!buf)
+		goto out;
+
+	cli->in_obj = obj = fs_obj_open(cli->table_id, cli->user, cli->key2,
+					cli->var_len, &err);
+	if
stor_obj_test
This function seems to be missing the meat. It retrieves then disposes of a keylist.
bool stor_obj_test(struct open_chunk *cep, uint64_t key)
{
	struct st_keylist *klist;

	if (!cep->stc)
		return false;

	klist = stc_keys(cep->stc);
	if (!klist)
		return false;
	stc_free_keylist(klist);
	return true;
}
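For illustration only, the missing step would presumably be a scan of the retrieved key list for the requested key before the list is freed. The sketch below shows that scan over a hypothetical struct demo_keylist; it is not the real struct st_keylist, whose layout is not shown in this thread, and demo_keylist_has is an invented name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a key listing; NOT the real struct st_keylist. */
struct demo_keylist {
	size_t		n_keys;		/* number of entries in keys[] */
	const uint64_t	*keys;		/* object keys present on the node */
};

/* Return true only if the requested key actually appears in the listing. */
static bool demo_keylist_has(const struct demo_keylist *kl, uint64_t key)
{
	size_t i;

	assert(kl != NULL);

	for (i = 0; i < kl->n_keys; i++) {
		if (kl->keys[i] == key)
			return true;
	}
	return false;
}
```

In a function shaped like stor_obj_test, the result of such a scan would be captured before the list is released and then returned instead of an unconditional true.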
chunkd on-disk and network protocol format change
The following commit introduces an incompatible chunkd change, which breaks compatibility with (a) existing on-disk chunkd databases, and (b) existing chunkd network protocol entities.
Prior to commit ea5d20bc22aeed077312c9c1824e84651af17a16, chunkd stored SHA1 checksums as ASCII, and sent them across the wire in each message in ASCII. Converting these to directly store and use SHA1 binary checksums on-disk saves several memory allocations, and more importantly, shaves 44 bytes off each chunkd message.
ASCII is only needed in the XML-based list-objects output, so we only perform the conversion at list-objects time.
Jeff
commit ea5d20bc22aeed077312c9c1824e84651af17a16
Author: Jeff Garzik j...@garzik.org
Date: Wed Jul 7 00:51:48 2010 -0400
    [chunk] protocol, disk fmt: Replace ASCII checksum representation with binary
    Rather than converting SHA1 checksums back and forth between ASCII and binary, always store and compare binary checksums. Only convert to ASCII when performing a list-objects request, which requires XML output.
    Among other savings, this decreases the size of the per-message fixed-length header by 44 bytes.
    Signed-off-by: Jeff Garzik jgar...@redhat.com
 chunkd/be-fs.c | 47 +++
 chunkd/chunkd.h | 9 +
 chunkd/object.c | 14 --
 chunkd/selfcheck.c | 19 +++
 include/chunk_msg.h | 4 ++--
 5 files changed, 37 insertions(+), 56 deletions(-)
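As a rough sketch of the approach described above: keep the 20-byte SHA-1 digest in binary, compare it with memcmp, and render hex text only when a textual listing needs it. The names sha1_to_hex, sha1_equal, SHA1_BIN_LEN and SHA1_HEX_LEN are illustrative and are not taken from the commit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SHA1_BIN_LEN 20				/* raw SHA-1 digest size */
#define SHA1_HEX_LEN (SHA1_BIN_LEN * 2 + 1)	/* hex digits plus NUL */

/* Compare two digests in binary form; no text conversion needed. */
static bool sha1_equal(const unsigned char a[SHA1_BIN_LEN],
		       const unsigned char b[SHA1_BIN_LEN])
{
	return memcmp(a, b, SHA1_BIN_LEN) == 0;
}

/* Render a binary digest as lowercase hex, e.g. for XML output. */
static void sha1_to_hex(const unsigned char md[SHA1_BIN_LEN],
			char out[SHA1_HEX_LEN])
{
	static const char digits[] = "0123456789abcdef";
	size_t i;

	assert(md != NULL && out != NULL);

	for (i = 0; i < SHA1_BIN_LEN; i++) {
		out[i * 2]     = digits[md[i] >> 4];
		out[i * 2 + 1] = digits[md[i] & 0x0f];
	}
	out[SHA1_BIN_LEN * 2] = '\0';
}
```

With this split, the hot paths (store, compare, send) touch only the fixed-size binary field, and the text form exists only transiently while a listing is built.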
Re: New 'hail' repository created, with major packaging rework
On 07/05/2010 03:13 PM, Pete Zaitcev wrote:
On Fri, 02 Jul 2010 02:59:20 -0400 Jeff Garzik j...@garzik.org wrote:
git://git.kernel.org/pub/scm/daemon/distsrv/hail.git
libhail is a single shared library binary, linking together cldc, ncld, libtimer, and chunkdc modules. In other words, libhail at present is a simplistic combination of cld/lib and chunkd/lib.
[zait...@lembas hail-tip]$ ls lib include
include:
chunkc.h chunksrv.h cld-private.h Makefile ncld.h
chunk_msg.h cldc.h elist.h Makefile.am objcache.h
chunk-private.h cld_common.h hail_log.h Makefile.in
lib:
chunkdc.c cldc-udp.c libhail.pc.in Makefile
chunksrv.c cld_msg_rpc.x libhail-uninstalled.pc Makefile.am
cldc.c common.c libhail-uninstalled.pc.in Makefile.in
cldc-dns.c libhail.pc libtimer.c pkt.c
[zait...@lembas hail-tip]$ grep httpstor lib/*.c
[zait...@lembas hail-tip]$
What has happened to the plan to include httpstor into libhail?
Still planned, and can easily be done. Important first step was getting the foundation laid -- creating hail.git, and synchronizing hail.git and tabled.git, and associated RPM packaging.
Moving libhttpstor is now a simple matter of simultaneous commits to hail.git and tabled.git, moving the code and updating build machinery.
I can release a hail 0.7.1 and tabled 0.5.1 with this change, if you feel versioning and pushing out this libhttpstor change is highly important. (or you can do that yourself, doesn't make a difference to me)
Jeff
chunkd near-term enhancements
Here are a few chunkd enhancements that are currently on my drawing board, for the near term:
(CHO_xxx denotes new chunkd network protocol commands, as listed in include/chunk_msg.h)
* CHO_SET_SERVERS: chunkd shall maintain a per-connection buffer known as SERVER_LIST. This chunkd command is issued by the client prior to using a SERVER_LIST-related command (see below), to reset the contents of the connection's SERVER_LIST buffer.
* CHO_RCP: copy a single object to each remote server in SERVER_LIST
* CHO_PUT_THRU: like PUT, but causes chunkd to further replicate the incoming object to each remote server in SERVER_LIST
* CHO_APPEND: append data onto an object.
* CHO_APPEND_THRU: append data locally, and replicate to each remote server in SERVER_LIST
The authentication used in chunkd-chunkd connections is the logged-in username/shared-secret combination.
Jeff
New 'hail' repository created, with major packaging rework
A new git repository
git://git.kernel.org/pub/scm/daemon/distsrv/hail.git
was created, preserving the full histories of cld.git and chunkd.git. The existing cld.git and chunkd.git repositories have been left untouched, for now. I also have not yet updated tabled.git for this new work, though it should be an easy matter of linking against libhail rather than other libs.
This new repository creates hail-$VERSION.tar.gz tarballs via make distcheck, producing libhail, cld and chunkd binaries.
libhail is a single shared library binary, linking together cldc, ncld, libtimer, and chunkdc modules. In other words, libhail at present is a simplistic combination of cld/lib and chunkd/lib.
The RPM package specfile has been updated (pkg/hail.spec) to generate the following complement of packages on Fedora:
Wrote: /garz/rpm/SRPMS/hail-0.7-0.1.gc69acd63.fc12.src.rpm
Wrote: /garz/rpm/RPMS/x86_64/hail-0.7-0.1.gc69acd63.fc12.x86_64.rpm
    - contains libhail
Wrote: /garz/rpm/RPMS/x86_64/hail-cld-0.7-0.1.gc69acd63.fc12.x86_64.rpm
    - contains cld
Wrote: /garz/rpm/RPMS/x86_64/hail-chunkd-0.7-0.1.gc69acd63.fc12.x86_64.rpm
    - contains chunkd
Wrote: /garz/rpm/RPMS/x86_64/hail-devel-0.7-0.1.gc69acd63.fc12.x86_64.rpm
    - contains libhail devel libs, headers
Wrote: /garz/rpm/RPMS/x86_64/hail-debuginfo-0.7-0.1.gc69acd63.fc12.x86_64.rpm
rpmlint still issues several warnings about hail-cld and hail-chunkd packages. That must be fixed before this package suite rename can be submitted to Fedora (pkg renames must be submitted as new packages, and go through the pkg review process all over again).
To produce hail*.rpm packages on Fedora, I would do something like this:
1) set up rpm build directories (== $RBD in this example)
2) git clone git://git.kernel.org/pub/scm/daemon/distsrv/hail.git
3) cd hail
4) ./autogen.sh
5) ./configure
6) make -s dist
7) cp *.tar.gz pkg/*.init pkg/*.sysconf $RBD/SOURCES
8) cp pkg/hail.spec $RBD/SPECS
9) cd $RBD
10) rpmbuild -ba SPECS/hail.spec
As mentioned above, the {cld,chunkd}.git repositories have been left untouched, so if something goes wildly wrong with this scheme, we can easily backtrack.
Comments welcome.
Jeff
Hail version 0.7 released
Version 0.7 of hail core services has been released, at the expected places:
http://www.kernel.org/pub/software/network/distsrv/hail/
ftp://ftp.kernel.org/pub/software/network/distsrv/hail/
git://git.kernel.org/pub/scm/daemon/distsrv/hail.git
This release replaces separate chunkd and cld packages with a single 'hail' package, which provides libhail, cld and chunkd binaries.
Release notes (from the NEWS file), showing changes since the last official cld/chunkd releases:
- cld and chunkd merged into single 'hail' package, providing libhail, cld and chunkd binaries. libcldc and libchunkdc libraries no longer exist.
- cld: bug fixes
- cld: use XDR for all messages
- cldc: bug fixes
- cldc: improve verbose output
- cldc: add new 'ncld' client API
- add experimental 'cldfuse' FUSE filesystem
- support db 4.9, 5.0
- chunkd: bug fixes
- chunkd: update to ncld, fix CLD-related bugs
- chunkd: improve and canonicalize verbosity controls and output
- chunkd: be less inflexible about CLD paths
- chunkd: (protocol change) replace SSL/no-SSL split ports with in-band SSL negotiation
- chunkd: integrity self-checking
- chunkd: fix GET/PUT for larger than 2GB values
- chcli: bug fixes
As with prior cld/chunkd releases, there will be no attempt at backwards compatibility, API freeze or protocol freeze until just prior to 1.0 release. In this release, backwards incompatible cld and chunkd network protocol changes have occurred.
Jeff
tabled version 0.5 released
Coinciding with hail core v0.7 release is this tabled release, v0.5, at the usual places:
git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git
http://www.kernel.org/pub/software/network/distsrv/tabled/
ftp://ftp.kernel.org/pub/software/network/distsrv/tabled/
Release notes:
- update for hail v0.7, newly combined from cld+chunkd packages
- reduce CLD client verbosity
- check for db 4.8, 4.9, 5.0
- use new ncld API internally
- background replication thread
- config: add Group (one cell), drop StorageNode
- add new tests
- fixes for many serious bugs and crashes
Re: [tabled patch 1/1] Stagger the start-daemon
On 06/30/2010 10:49 AM, Pete Zaitcev wrote:
My rule of thumb is that magic delays are evil or stupid, so I worked on eliminating them from our scripts. However, in this case it's just not worth it, because the result is that we have to wait way more than 100s for several cycles of CLD timeouts to complete, not just one, before we declare a failure.
With this patch, all builds completed that I submitted to Fedora build system.
Signed-off-by: Pete Zaitcev zait...@redhat.com
---
 test/start-daemon | 4
 test/wait-for-listen.c | 7 ++-
 2 files changed, 6 insertions(+), 5 deletions(-)
applied
[RFC] new structure: hail pkg instead of cld, chunkd
I've been thinking about a new structure for the projects, namely having a single hail or hail-core package, that includes cld and chunkd services, and associated client libraries inside a new libhail.
In real terms, it would look like this:
cld -> hail
libcldc -> libhail, libhail-devel
chunkd -> hail
libchunkdc -> libhail, libhail-devel
tabled -> tabled (no change)
libhttpstor -> libhail, libhail-devel
itd -> itd (no change)
nfs4d -> nfs4d (no change)
Core services (cld, chunkd), their associated client libs (libcldc, libchunkdc), and other useful common routines (libhttpstor) would find a new home in the hail RPM, providing cld, chunkd and libhail.
tabled, itd and nfs4d are considered hail applications, and live in their own separate packages, BuildRequire-ing the core hail packages.
I think this new organization will be more useful to both developers and future users. For developers, changing the core services and packaging commonly reused routines is easier. For users, the separation between core services and applications is clearer and, IMO, easier to understand at a glance.
Comments?
Jeff
Re: Metadata replication in tabled
On 06/24/2010 08:31 PM, Pete Zaitcev wrote:
I worked on fixing the metadata replication in tabled. There were some difficulties in existing code, in particular the aliasing between the hostname used to identify nodes and the hostname used in bind() for listening was impossible to work around in repmgr. In the end I gave up on repmgr and switched tabled to the Base API.
So, the replication works now... for some values of works, which is still progress. We essentially have a tabled that can really be considered as replicated. Before, it was only data replication, which was great and all but useless against disk failures in the tabled's database. I think it's a major threshold for tabled.
er, huh? In addition to data replication, we already have metadata replication via db4 repmgr in tabled.git, which ensures metadata db integrity in the case of disk or tabled node failure.
The core problem with current tabled.git is that S3 clients expect all nodes to support PUT/DELETE as well as GET. Our current use w/ db4 slave mode does not fulfill this client requirement.
Your work here, moving to the base replication API, eliminates several obstacles on the path to making all tabled nodes support PUT/DELETE. But it is not true to say that metadata replication did not exist prior to this patch.
With either repmgr or base API, we still need to make failover more transparent to our S3 clients.
Unfortunately, the code is rather ugly. I tried to create a kind of an optional replication layer, so that tdbadm could be built without it. Although I succeeded, the result is a hideous mess of methods and callbacks, functions with side effects, and a bunch of poorly laid out state machines. In places I cannot wrap my own head around what's going on without the help of pencil and paper.
So, while working, it's not ready for going in. Still, I'm going to throw it here in case I get hit by a bus, or if anyone wants an example of using db4 replication early.
Based on a quick read, it seems straightforward, and looks like something I can try tomorrow... Very excited to try this :)
Jeff
Re: Zookeeper instead of CLD in Hail
On 06/04/2010 11:27 PM, Pete Zaitcev wrote:
I heard people say they cribbed from the same Chubby paper, but it's bollocks. It's absolutely nothing like what Chubby implies. No locks for one thing. To be sure, Zookeeper provides a canned piece of code which implements locks, kinda like you can implement compare-and-swap using Dekker's algorithm on a CPU that doesn't have it. The canned lock creates sequenced files (using a ZK server call that creates unique filenames), then sets some watches (same as CLD offers), then re-reads the directory to find the lowest number sequential file, which is the winner of the lock. Haha, only serious. I tested it, it works, but ew.
Yeah, the main similarity is... both ZK and CLD offer some type of filesystem (with all that implies). ZK is IMO not much like Chubby at all, in terms of focus / design goals.
Jeff
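The "lowest sequence number wins" step of the lock recipe described above can be sketched in isolation. This is only the selection logic over a directory listing already held in memory; it makes no ZooKeeper calls, and the lock-NNNNNNNNNN naming and the functions seq_of, lock_winner and holds_lock are assumptions made for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>
#include <string.h>

/*
 * Extract the numeric suffix after the last '-' in a sequenced name
 * such as "lock-0000000003" (naming is hypothetical).
 * Returns -1 if the name carries no valid sequence number.
 */
static long seq_of(const char *name)
{
	const char *dash = strrchr(name, '-');
	char *end;
	long seq;

	if (!dash || dash[1] == '\0')
		return -1;

	seq = strtol(dash + 1, &end, 10);
	return (*end == '\0' && seq >= 0) ? seq : -1;
}

/* Index of the entry with the lowest sequence number, or -1 if none. */
static int lock_winner(const char *const names[], int n)
{
	int i, best = -1;
	long best_seq = -1;

	assert(names != NULL || n == 0);

	for (i = 0; i < n; i++) {
		long s = seq_of(names[i]);

		if (s < 0)
			continue;	/* ignore unrelated entries */
		if (best < 0 || s < best_seq) {
			best = i;
			best_seq = s;
		}
	}
	return best;
}

/* A client holds the lock only if its own entry is the lowest one. */
static bool holds_lock(const char *const names[], int n, const char *mine)
{
	int w = lock_winner(names, n);

	return w >= 0 && strcmp(names[w], mine) == 0;
}
```

A client that does not hold the lock would then wait on a watch and re-read the directory, as the message describes.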
Re: [chunkd patch 4/6] Print client port
On 05/21/2010 12:54 AM, Pete Zaitcev wrote:
-			host, sizeof(host), NULL, 0, NI_NUMERICHOST);
+			host, sizeof(host), port, sizeof(port), NI_NUMERICHOST);
 	host[sizeof(host) - 1] = 0;
-	applog(LOG_INFO, "client %s connected%s", host,
+	host[sizeof(port) - 1] = 0;
You truncate the wrong variable.
Re: [chunkd patch 1/6] Fix the leak of suddenly closed connections
On 05/21/2010 12:54 AM, Pete Zaitcev wrote:
After a period of uptime, chunkd may stop working with this:
May 20 08:51:47 azdragon2 chunkd[4034]: tcp accept: Too many open files
An examination with lsof shows that file descriptors for sockets and object data files are leaked in neat pairs.
As it turns out, the root cause is not processing the case when tabled opens a connection to read an object, then closes it before the data is transferred. On some systems, sendfile returns no error in such case, but the amount of data that it attempted to send before it recognized that the socket was closed. If that happens, chunkd will not receive a POLLOUT indication and the struct cli will linger forever with non-empty write queue.
The fix has two parts:
1. Permit a client in evt_recycle state to process outstanding writes in the same manner a client in evt_dispose does. Note that in our specific failure case no actual processing is going to occur, so this part has an effect of permitting the dispatch to work. If we do not do this, a POLLIN may throw us into the evt_read_fixed stage.
2. Once we're getting dispatched, dispose of clients that had connections closed, using the unmaskable POLLHUP bit.
As an aside, tabled 0.5-0.7.x resets the connections when Firefox asks for a file that was modified after a certain date. In that case, tabled wants to know when the file was modified, so it reads the header off chunkd. If it turns out that the client is not interested in the data, tabled simply closes the connection without reading whatever data has arrived. This may change in the future, but the bug in chunkd should be fixed anyway, for general robustness.
Signed-off-by: Pete Zaitcev zait...@redhat.com
applied 1-6, after fixing truncation bug newly introduced
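Part 2 of the fix relies on poll() reporting POLLHUP whether or not it was requested. A minimal sketch of such a check follows; conn_is_dead is a hypothetical helper for illustration and is not the code from the patch.

```c
#include <assert.h>
#include <poll.h>
#include <sys/socket.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if the connection has hung up or is in
 * error, 0 if it still looks alive, -1 if poll() itself failed.
 */
static int conn_is_dead(int fd)
{
	struct pollfd pfd;
	int rc;

	assert(fd >= 0);

	pfd.fd = fd;
	pfd.events = POLLOUT;	/* POLLHUP/POLLERR are reported regardless */
	pfd.revents = 0;

	rc = poll(&pfd, 1, 0);	/* zero timeout: just sample the state */
	if (rc < 0)
		return -1;

	return (pfd.revents & (POLLHUP | POLLERR | POLLNVAL)) ? 1 : 0;
}
```

In an event loop, the same revents test would be applied to each descriptor poll() returns, and a hit would route the client to disposal instead of leaving it waiting for a POLLOUT that never arrives.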
Re: [chunkd patch 6/6] Make cli_wr_set_poll bool
On 05/21/2010 12:54 AM, Pete Zaitcev wrote:
The upside of this cleanup is an ease of reading and evaluating with fewer control paths.
[This patch will only work if patch 2/6 is applied. Sorry.]
Signed-off-by: Pete Zaitcev zait...@redhat.com
---
 server/chunkd.h | 2 +-
 server/object.c | 3 +--
 server/server.c | 18 +-
 3 files changed, 7 insertions(+), 16 deletions(-)
ITYM Make cli_wr_set_poll void
Re: [tabled patch 1/1] fix the selection of chunk
On 05/25/2010 11:30 PM, Pete Zaitcev wrote:
If a chunkserver goes down, tabled sometimes throws a phantom object not found. It happens because we keep hitting the same down node and exhaust the retries. The existing code calls rand() every time and hopes for the best, but this is too likely to end poorly.
The fix is to only randomize once before the retry loop, and then cycle through all available nodes deterministically. The same fix would apply even if we used a better technique to select an available chunkserver than just random.
Also, we refactor the code just a little bit, so that the enormous function object_get_body gets somewhat easier to follow.
Signed-off-by: Pete Zaitcev zait...@redhat.com
applied
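The selection strategy described above (randomize once, then walk every node in order) can be sketched as follows. pick_node, random_start, node_try_fn and node_is_up are hypothetical names used for illustration; the actual tabled code is not shown in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Callback that attempts one node; returns true if the node answered. */
typedef bool (*node_try_fn)(int node, void *ctx);

/* Choose the random starting point once, before the retry loop. */
static int random_start(int n_nodes)
{
	assert(n_nodes > 0);
	return rand() % n_nodes;
}

/*
 * Starting at 'start', try each node exactly once in cyclic order.
 * Returns the first node that works, or -1 if every node failed.
 */
static int pick_node(int n_nodes, int start, node_try_fn try_node, void *ctx)
{
	int i;

	assert(n_nodes > 0 && start >= 0);

	for (i = 0; i < n_nodes; i++) {
		int node = (start + i) % n_nodes;

		if (try_node(node, ctx))
			return node;
	}
	return -1;
}

/* Example callback: ctx points at a bool array marking which nodes are up. */
static bool node_is_up(int node, void *ctx)
{
	const bool *up = ctx;

	return up[node];
}
```

A caller would use pick_node(n, random_start(n), node_is_up, up). Because every node is visited once, a single live node is always found regardless of where the walk starts, which is what calling rand() on each retry could not guarantee.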
Re: iSCSI front-end for Hail
As of itd commit 196e8f317fc7202460d7adde93dac939caf23f5d, the iSCSI target daemon appears to survive stress tests, and does not leak memory. I call that a good first milestone.
Jeff
Re: iSCSI front-end for Hail
As of commit 23a5795e3ca555a6454b199e071482bb50655508, itd is passing integrity and stress tests from two test suites, iscsi-harness found in netbsd-iscsi pkg, and basic blkdev integrity tests using dd(1). There is a whopping big memory leak that needs fixing, but the basics appear to be working.
Jeff
Re: [cld patch 1/1] use specified username in cldcli
On 05/03/2010 12:07 AM, Pete Zaitcev wrote:
I suspect I copy-pasted over it when I converted to ncld, but anyhow this patch seems to work and do what's expected for --user flag.
Signed-off-by: Pete Zaitcev zait...@redhat.com
---
 tools/cldcli.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/tools/cldcli.c b/tools/cldcli.c
index 7c73091..79a1009 100644
--- a/tools/cldcli.c
+++ b/tools/cldcli.c
@@ -712,7 +712,7 @@ int main (int argc, char *argv[])
 	dr = host_list->data;
 
 	nsess = ncld_sess_open(dr->host, dr->port, &error, sess_event, NULL,
-			       "cldcli", "cldcli", &cli_log);
+			       our_user, our_user, &cli_log);
 	if (!nsess) {
 		if (error < 1000) {
applied
PS. you sent this to j...@garzik.com, and unfortunately, I don't own that domain. I should. :) For future reference, jgar...@pobox.com or j...@garzik.org are equivalent and should exist for the long term (even if I get fired from Red Hat or somesuch :)).
Jeff
Re: [chunkd patch 1/2] eradicate last vestiges of libevent
On 05/01/2010 12:51 AM, Pete Zaitcev wrote:
We stopped using libevent in Chunk a while ago, but for some reason not all references were removed.
I tested this patch by building on a fresh Fedora 13 system without libevent.
Signed-off-by: Pete Zaitcev zait...@redhat.com
---
 configure.ac | 3 ---
 pkg/chunkd.spec | 2 +-
 server/Makefile.am | 2 +-
 3 files changed, 2 insertions(+), 5 deletions(-)
applied 1-2
Thanks for updating the email subject lines.
Jeff
iSCSI front-end for Hail
Hail devs,
Project Hail was, in part, conceived as an umbrella of libraries and services enabling the mating of a well known, Internet-standard API with a back-end that enables distributed storage.
tabled is an example of this: it provides an application front-end compatible with S3 API, using Hail back-end services chunkd and CLD.
nfs4d[1] is a second, work-in-progress example. nfs4d is a fully working NFSv4 front-end, waiting to be mated to the Hail back-end services.
A third example is something I poked at long ago, iSCSI. The vinzvault announcement[2] got me thinking about the iSCSI target[3] daemon that I had worked on, a while ago. vinzvault, sheepdog, DST, drbd, nbd and iSCSI all attempt to provide remote network attached storage, usually for storage on ephemeral virtual machines, similar to Amazon's Elastic Block Storage (EBS) on their EC2 grid.
I dusted off my itd (iSCSI target daemon) project, fixed a bunch of bugs, and got it working[4] in the hopes that this might be useful to Hail or vinzvault or so. itd is a remote iSCSI service exporting one or more slices of storage as a standard SCSI device on your system. It is based off of 'netbsd-iscsi' in Fedora, which is in turn based off an old, open source Intel codebase. netbsd-iscsi seemed a more pliable codebase than the very-nice SCSI TGT project[5].
The web browsable itd tree (with git:// URL for cloning) can be found at http://git.kernel.org/?p=daemon/distsrv/itd.git
As I write this email, I am borrowing a lot of networking code from tabled, to convert from GNet over to the more-flexible TCP server codebase found in tabled -- notably the asynchronous background TCP writing code in tabled. Hopefully will finish and commit this by the end of the weekend.
At that point, itd should be a fully compliant SCSI target, capable of reading/writing -- to a pre-allocated RAM space.
Once that milestone is reached, the RAM storage may be replaced with Hail components, or other gadgets like MongoDB[6], to provide scalable, distributed storage.
Jeff
[1] https://hail.wiki.kernel.org/index.php/Nfs4d
[2] http://www.mail-archive.com/linux-clus...@redhat.com/msg08555.html
[3] a SCSI target is a remote network server, in SCSI parlance. It is mated with an initiator, which is SCSI's term for client.
[4] well, only small WRITEs work at the moment. but READ is fully working at high speeds.
[5] http://stgt.sourceforge.net/
[6] http://www.mongodb.org/
Re: [Patch 09/12] tabled: drop double prefixing
On 04/18/2010 12:42 AM, Pete Zaitcev wrote:
> On Fedora 14, the following is seen in syslog:
>
> Apr 17 19:58:52 niphredil tabled: tabled: connecting to site hitlain.zaitcev.lan:8083: No route to host
> Apr 17 19:58:56 niphredil tabled: tabled: DB_ENV->rep_elect:WARNING: nvotes (1) is sub-majority with nsites (2)
>
> Drop the extra prefix, it only wastes screen space.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  lib/tdb.c |    7 ++-
>  1 file changed, 6 insertions(+), 1 deletion(-)

applied 9-12
Re: [Patch 1/8] CLD: cleanup: add cld_msg_rpc.x
On 04/16/2010 10:18 PM, Pete Zaitcev wrote:
> On Wed, 14 Apr 2010 15:55:01 -0400
> Jeff Garzik <j...@garzik.org> wrote:
>
>>> +++ b/lib/Makefile.am
>>> @@ -27,6 +27,7 @@ libcldc_la_SOURCES = \
>>>      common.c \
>>>      libtimer.c \
>>>      pkt.c \
>>> +    cld_msg_rpc.x \
>>>      cld_msg_rpc_xdr.c
>>
>> that's quite strange, because I built an official rawhide copy just
>> fine without this...
>
> Strange indeed, I re-checked and it went away now. Oh well.

I wonder if it's a problem with the 'clean' functionality.

The EXTRA_DIST line contains a list of things forced to be included in the tarball, typically used for things not contained in *_SOURCES. AFAICT from the autoconf/automake docs, that is where sources for generated sources[1] should reside.

So I still wonder how it disappeared for you...

    Jeff

[1] Brought to you by the Department of Redundant Redundancies
Re: Trivial Q about chunkd's main_loop
On 04/17/2010 09:36 PM, Pete Zaitcev wrote:
> Is there a reason why the main_loop in chunkd uses a naked
> g_hash_table_lookup instead of srv_poll_lookup? Performance?
>
> @@ -1681,8 +1681,7 @@ static int main_loop(void)
>
>          fired++;
>
> -        sp = g_hash_table_lookup(chunkd_srv.fd_info,
> -                                 GINT_TO_POINTER(pfd->fd));
> +        sp = srv_poll_lookup(pfd->fd);
>          if (G_UNLIKELY(!sp)) {

Looks like it should be changed to call srv_poll_lookup(), indeed.

srv_poll_lookup() is marked 'static', so there should not be any performance difference after the compiler's optimizer passes get finished with it.

    Jeff
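To illustrate the point about 'static' helpers, here is a minimal sketch, not chunkd's actual code. It assumes a plain array in place of the GHashTable behind chunkd_srv.fd_info, and the names MAX_FDS, fd_info, struct server_poll's fields and srv_poll_register() are hypothetical. Only srv_poll_lookup() is a name taken from the thread.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FDS 64  /* hypothetical table size */

/* Hypothetical stand-in for chunkd's per-descriptor poll record. */
struct server_poll {
    int fd;
    const char *label;
};

/* Stand-in for chunkd_srv.fd_info; the real code uses a GHashTable. */
static struct server_poll *fd_info[MAX_FDS];

/* Store a record under its descriptor; returns 0, or -1 if fd is out of range. */
static int srv_poll_register(struct server_poll *sp)
{
    assert(sp != NULL);
    if (sp->fd < 0 || sp->fd >= MAX_FDS)
        return -1;
    fd_info[sp->fd] = sp;
    return 0;
}

/*
 * Small lookup helper.  It has internal linkage ('static') and its body is
 * visible at every call site in this file, so an optimizing compiler is
 * free to inline it.  Callers get one name for the lookup without having
 * to spell out the table access themselves.
 */
static struct server_poll *srv_poll_lookup(int fd)
{
    if (fd < 0 || fd >= MAX_FDS)
        return NULL;
    return fd_info[fd];
}
```

A main loop would then call srv_poll_lookup(pfd->fd) and handle a NULL result, exactly as in the quoted hunk. Whether the call is actually inlined depends on the compiler and optimization level.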
tabled RPM build fails before it succeeds
The same source, same spec.

Build #1 (fails on x86_64):
http://koji.fedoraproject.org/koji/taskinfo?taskID=2119825

Build #2 (fails on i686):
http://koji.fedoraproject.org/koji/taskinfo?taskID=2120174

Build #3 (success on all platforms):
http://koji.fedoraproject.org/koji/taskinfo?taskID=2120215
Re: [Patch 2/8] CLD: cleanup: add a log entry about sent packet
On 04/14/2010 02:34 PM, Pete Zaitcev wrote:
> Currently, there's nothing in the verbose output about sent packets
> at all. No, really! This is very confusing, even if I run tcpdump
> at the same time. I think we should add this.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  lib/cldc.c |    2 ++
>  1 file changed, 2 insertions(+)

applied 2-6
Re: [Patch 7/8] tabled: cleanup: add #include
On 04/14/2010 02:35 PM, Pete Zaitcev wrote:
> Same as everywhere else: missing prototypes, so implementations
> are not actually matched by the compiler.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  lib/readport.c |    1 +
>  test/libtest.c |    1 +
>  2 files changed, 2 insertions(+)

applied 7-8
Re: [Patch 1/8] CLD: cleanup: add cld_msg_rpc.x
On 04/14/2010 02:33 PM, Pete Zaitcev wrote:
> You know what's weird... Without this, I cannot build an RPM at all,
> the rpmbuild complains about unpackaged files and aborts. But everyone
> else seems to have no problem? Strange. BTW, I am on Fedora 14.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  lib/Makefile.am |    1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/lib/Makefile.am b/lib/Makefile.am
> index ea72426..012d558 100644
> --- a/lib/Makefile.am
> +++ b/lib/Makefile.am
> @@ -27,6 +27,7 @@ libcldc_la_SOURCES = \
>      common.c \
>      libtimer.c \
>      pkt.c \
> +    cld_msg_rpc.x \
>      cld_msg_rpc_xdr.c

that's quite strange, because I built an official rawhide copy just fine without this... Maybe you can try the SRPM from the koji build?

http://koji.fedoraproject.org/koji/taskinfo?taskID=2114193

May I presume you are using "make distcheck" to generate the tarball for your custom RPMs?

    Jeff
Re: [Patch 1/3] CLD: End-to-end verbosity
On 03/31/2010 08:43 PM, Pete Zaitcev wrote:
> diff --git a/server/server.c b/server/server.c
> index 3208e0f..2d68ee6 100644
> --- a/server/server.c
> +++ b/server/server.c
> @@ -55,7 +55,7 @@ static struct argp_option options[] = {
>        "Store database environment in DIRECTORY. Default: " CLD_DEF_DATADIR },
>      { "debug", 'D', "LEVEL", 0,
> -      "Set debug output to LEVEL (0 = off, 2 = max)" },
> +      "Set debug output to LEVEL (0 = off, 1 = debugging)" },
>      { "stderr", 'E', NULL, 0,
>        "Switch the log to standard error" },
>      { "foreground", 'F', NULL, 0,
> @@ -64,6 +64,8 @@ static struct argp_option options[] = {
>        "Bind to UDP port PORT. Default: " CLD_DEF_PORT },
>      { "pid", 'P', "FILE", 0,
>        "Write daemon process id to FILE. Default: " CLD_DEF_PIDFN },
> +    { "verbose", 'v', NULL, 0,
> +      "Enable the session-level verbosity" },
>      { "strict-free", 1001, NULL, 0,
>        "For memory-checker runs. When shutting down server, free local heap, rather than simply exit(2)ing and letting OS clean up." },

As is hinted by the current code's debugging switch being an integer 'level' value, the server [and client?] has increasing levels of verbosity.

The debug levels are

    0: key messages affecting server operation, only
    1: debugging output enabled, sans per-packet output
    2: debugging output enabled, including per-packet output

ie. clearly ordered by increasing value == increased verbosity.

As is clearly illustrated when I cut the patch down to the above snippet, the user interface you have created gives the user two knobs for log verbosity, and it is not clear to a casual user which knob controls which sets of messages.

That makes for a -more- confusing user interface, because the user must constantly ask themselves the question "do I need debug? or verbose? I don't know!"

Additionally, this interface change runs counter to other tools, which increase verbosity with added -v switches -- analogous to the existing integer-based debug level interface.

If it is truly your desire to permit fine-grained selection of certain classes of messages, then don't dick around! Go ahead and create a bitmap log mask which permits fine-grained selection of various messages, much like netif_msg_* and netif_msg_init() in the kernel's include/linux/netdevice.h.

Having two switches, -d and -v, for different, undocumented classes of message just increases confusion. Put yourself in the mind of a user trying to figure out which is which.

I readily admit the __internal implementation__ resulting from your patches is a useful cleanup, but at a macro level, it merely increases logging user interface confusion.

    Jeff
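The bitmap log mask suggested above might look roughly like the following. This is a sketch under stated assumptions, not CLD code and not the kernel's netif_msg implementation: the class names (LOG_CLS_*), log_mask, log_mask_from_level(), log_enabled() and log_msg() are all hypothetical, and the level-to-mask mapping simply mirrors the three debug levels listed in the message.

```c
#include <assert.h>
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical message classes, one bit each. */
enum log_class {
    LOG_CLS_CORE    = 1u << 0,  /* key messages affecting server operation */
    LOG_CLS_DEBUG   = 1u << 1,  /* general debugging output */
    LOG_CLS_PACKET  = 1u << 2,  /* per-packet output */
    LOG_CLS_SESSION = 1u << 3   /* session-level output, selectable on its own */
};

/* Currently enabled classes; key messages are always on by default. */
static unsigned int log_mask = LOG_CLS_CORE;

/*
 * Translate the existing integer debug level into a mask, so a single
 * "level" switch keeps working while individual bits stay selectable.
 */
static unsigned int log_mask_from_level(int level)
{
    unsigned int mask = LOG_CLS_CORE;

    if (level >= 1)
        mask |= LOG_CLS_DEBUG;
    if (level >= 2)
        mask |= LOG_CLS_PACKET;
    return mask;
}

/* Nonzero if any of the given classes is enabled. */
static int log_enabled(unsigned int cls)
{
    return (log_mask & cls) != 0;
}

/* Print to stderr only if the class is enabled; returns characters written, 0 if suppressed. */
static int log_msg(unsigned int cls, const char *fmt, ...)
{
    va_list ap;
    int n;

    assert(fmt != NULL);
    if (!log_enabled(cls))
        return 0;
    va_start(ap, fmt);
    n = vfprintf(stderr, fmt, ap);
    va_end(ap);
    return n;
}
```

With this shape, a level switch sets log_mask = log_mask_from_level(n), while a user who wants only session messages can OR in LOG_CLS_SESSION explicitly. There is one knob, and each class of message is documented by its bit.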
Re: [Patch 1/7] tabled: make two dump displays uniform
On 04/01/2010 09:51 PM, Pete Zaitcev wrote:
>> From: Jeff Garzik <jgar...@pobox.com>
>> Subject: Re: Tabled issues
>> Date: Mon, 29 Mar 2010 15:32:33 -0400
>>
>> I asserted that the standard stats dump facility must dump all
>> available statistics. That does not exclude other methods of stat(us)
>> dumping. Your patch added new stats to the HTML-pretty version of
>> output, but failed to add the new stats to the standard stat dump
>> facility.
>
> Your wish is my command.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  server/replica.c |   28 +
>  server/server.c  |   47 ++
>  server/status.c  |   22 +--
>  server/storage.c |   50 +
>  server/tabled.h  |    3 ++
>  5 files changed, 117 insertions(+), 33 deletions(-)

applied, thanks. I will endeavor to make the stats dump more like nfs4d in the future, FWIW.
Re: [Patch 2/7] tabled: fix the endless recusion when reading long objects
On 04/01/2010 09:51 PM, Pete Zaitcev wrote:
> At certain network and disk speeds, tabled can blow its stack by
> filling it with (essentially) endless recursion:
>
> #2  0x0040c077 in cli_write_free (cli=<value optimized out>, tmp=0x7bb910, done=<value optimized out>) at server.c:397
> #3  0x0040ca55 in cli_writable (cli=0x686e90) at server.c:525
> #4  0x0040da65 in cli_write_start (cli=0x686e90) at server.c:561
> #5  0x00408ad5 in object_get_poke (cli=0x686e90) at object.c:1039
> #6  0x0040c077 in cli_write_free (cli=<value optimized out>, tmp=0x7bb8d0, done=<value optimized out>) at server.c:397
> #7  0x0040ca55 in cli_writable (cli=0x686e90) at server.c:525
> #8  0x0040da65 in cli_write_start (cli=0x686e90) at server.c:561
> #9  0x00408ad5 in object_get_poke (cli=0x686e90) at object.c:1039
> #10 0x0040c077 in cli_write_free (cli=<value optimized out>, tmp=0x7bb890, done=<value optimized out>) at server.c:397
>
> The fix is to deliver callbacks only from the top level.
>
> Callbacks must be delivered every time a send is completed, which
> amounts to every call to is_writeable(). Since there is a large number
> of callers to it, we found it advantageous to run callbacks from every
> source of events. In other words, every function that is passed to
> event_set must invoke cli_write_run_compl. Mind that storage.c contains
> calls to event_set.
>
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  server/object.c |    4 +++
>  server/server.c |   52 +++---
>  server/tabled.h |    6 +
>  3 files changed, 50 insertions(+), 12 deletions(-)

applied 2-7
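The idea of delivering callbacks only from the top level can be sketched as follows. This is an illustration under assumptions, not the tabled patch: only the name cli_write_run_compl comes from the message, while struct client's layout, MAX_PENDING, cli_write_complete(), struct stream and stream_chunk_done() are hypothetical. A finished send is merely recorded; the event handler later drains the queue in a loop, so a callback that triggers another send never nests deeper on the stack.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_PENDING 16  /* hypothetical queue capacity */

typedef void (*write_cb)(void *ctx);

struct write_compl {
    write_cb cb;
    void *ctx;
};

/* Hypothetical client state: a FIFO ring of completed-but-undelivered writes. */
struct client {
    struct write_compl pending[MAX_PENDING];
    size_t head;
    size_t count;
};

/* Called wherever a send finishes: record the completion, do NOT invoke it. */
static int cli_write_complete(struct client *cli, write_cb cb, void *ctx)
{
    size_t tail;

    assert(cb != NULL);
    if (cli->count == MAX_PENDING)
        return -1;
    tail = (cli->head + cli->count) % MAX_PENDING;
    cli->pending[tail].cb = cb;
    cli->pending[tail].ctx = ctx;
    cli->count++;
    return 0;
}

/*
 * Called only from top-level event handlers.  Callbacks may queue further
 * completions; the loop picks those up iteratively, so the stack depth
 * stays constant no matter how long the object is.
 */
static size_t cli_write_run_compl(struct client *cli)
{
    size_t ran = 0;

    while (cli->count > 0) {
        struct write_compl c = cli->pending[cli->head];

        cli->head = (cli->head + 1) % MAX_PENDING;
        cli->count--;
        c.cb(c.ctx);
        ran++;
    }
    return ran;
}

/* Demo: a long object streamed chunk by chunk, tracking callback nesting. */
struct stream {
    struct client *cli;
    int chunks_left;
    int depth;
    int max_depth;
};

static void stream_chunk_done(void *ctx)
{
    struct stream *s = ctx;

    s->depth++;
    if (s->depth > s->max_depth)
        s->max_depth = s->depth;
    if (s->chunks_left > 0) {
        s->chunks_left--;
        /* Pretend the next send completed immediately. */
        cli_write_complete(s->cli, stream_chunk_done, s);
    }
    s->depth--;
}
```

Streaming 1000 chunks through this queue runs every callback at nesting depth one, whereas calling the callback directly from the completion path would nest once per chunk, which is the pattern visible in the backtrace.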
Re: [Patch 1/3] CLD: End-to-end verbosity
On 04/06/2010 11:32 PM, Pete Zaitcev wrote:
> On Tue, 06 Apr 2010 10:40:33 -0400
> Jeff Garzik <j...@garzik.org> wrote:
>
>> The debug levels are
>>     0: key messages affecting server operation, only
>>     1: debugging output enabled, sans per-packet output
>>     2: debugging output enabled, including per-packet output
>
> The previous patch did just that:
>
> Why did you reject it?

That's a damned good question. I have no idea. Did I ever reply to that patch? It looks like I fscked up and missed it?

    Jeff
Re: CLD doesn't build on db-4.3
On 04/01/2010 07:01 AM, Samba - BoYang wrote:
> hi,
>
> * CLD doesn't build on db-4.3 on suse 11, since db-4.3 uses deprecated
>   structure members DBC->c_xxx (c_close(), etc) instead of DBC->xxx. :-)
>   It won't build on db-4.4, either. probably won't build on db-4.5, as
>   db-5.0 says DBC->xxx was introduced in db-4.6. :-)
>
>   Should we disable support for 4.3 - 4.5 and add 4.9 - 5.0?

I'd answer yes, by a circuitous route: if I understand things correctly, the replicated PAXOS db4 backend that we are heading towards (see the 'replica' branch of git://git.kernel.org/pub/scm/daemon/cld/cld.git) was buggy in early db4 releases.

Therefore, it sounds like we could eliminate two issues, the DBC issue and the PAXOS issue, with a single change: removing support for db 4.3 - 4.5.

I'm fine with adding support for 4.9+ as long as the APIs function in a compatible manner.

Want to create the simple patch for this? :)

Thanks,

    Jeff
Re: [Patch 1/1] tabled: fix a crash when looking up non-existing NID
On 03/28/2010 09:57 PM, Pete Zaitcev wrote:
> Signed-off-by: Pete Zaitcev <zait...@redhat.com>
>
> ---
>  server/storage.c |    3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)

applied
Re: [PATCH] chunkd: fix duplicate stc_object allocation in stc_parse_key()
On 03/16/2010 05:59 AM, Akinobu Mita wrote:
> At the beginning of stc_parse_key(), st_object is allocated twice
> for the same variable.
>
> Signed-off-by: Akinobu Mita <akinobu.m...@gmail.com>
>
> ---
>  lib/chunkdc.c |    2 +-
>  1 files changed, 1 insertions(+), 1 deletions(-)

good catch, applied