Hail license changed

2012-12-12 Thread Jeff Garzik
Hail license change was just pushed to the github hail repository. Jeff -- To unsubscribe from this list: send the line unsubscribe hail-devel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html

Hail status and update (was Re: Question about hail)

2012-11-27 Thread Jeff Garzik
(CC'd hail-devel list) On 11/26/2012 02:28 AM, Hideki Yamane wrote: Hello hail upstream authors, I'm interested in porting hail (and Aeolus) to Debian, but have some questions about it. Cool! Q: Is this project is still alive? if so, where is the current main site. Could you tell

Project Hail wikis alive again!

2012-05-02 Thread Jeff Garzik
kernel.org fixed their wiki system, which means that all the k.org wikis are once again read-write! This includes Project Hail's home page, https://hail.wiki.kernel.org/ I hope to have the git repos moved back from https://github.com/jgarzik/ to kernel.org soon also. Jeff --

Re: Unable to log into Hail Wiki

2012-01-26 Thread Jeff Garzik
On 01/25/2012 08:40 PM, Pete Zaitcev wrote: Jeff, looks like the wiki rots. The login points to this URL https://hail.wiki.kernel.org/articles/u/s/e/Special%7EUserLogin_94cd.html It returns 404. HALP? Yes -- all kernel.org wikis are _still_ read-only, even this many months after the

Re: [patch hail 1/1] Plug leak in hstor_parse_key

2011-10-18 Thread Jeff Garzik
On 10/14/2011 01:34 PM, Pete Zaitcev wrote: Signed-off-by: Pete Zaitcev <zait...@kotori.zaitcev.us> --- lib/hstor.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/lib/hstor.c b/lib/hstor.c index cb9c4da..5ce9b76 100644 --- a/lib/hstor.c +++ b/lib/hstor.c @@ -761,7 +761,7

Re: Hail git tree location

2011-10-04 Thread Jeff Garzik
On 10/04/2011 12:12 PM, Pete Zaitcev wrote: Are we going to have a git tree somewhere? It looks like our old one was purged from git.kernel.org. Sorry, I should have posted. It was migrated along with the kernel.org trees to https://github.com/jgarzik/{hail,tabled,itd,nfs4d}

Re: FYI, rpc/ is gone from Fedora 15

2011-05-05 Thread Jeff Garzik
On 05/05/2011 10:14 AM, Jim Meyering wrote: FYI, /usr/include/rpc/ no longer exists, as of F15's glibc-headers-2.13.90-10, so hail's lib/cld_msg_rpc.h will have to do something about this #include: $ grep rpc.h lib/cld_msg_rpc.h #include <rpc/rpc.h> hm. Surely they did not delete

[hail] CLD conversion to TCP lands in git

2011-03-24 Thread Jeff Garzik
I just pushed the CLD protocol change (UDP -> TCP) to hail.git[1]. See the original post[2] for more details. It seems pretty solid from my beating on it, but it's still raw code. The focus will be on hammering out the kinks in this switch over the next 7-10 days, so expect some breakage

Re: [PATCH 2/3] CLD: switch network proto from UDP to TCP

2011-01-03 Thread Jeff Garzik
On 01/02/2011 06:32 PM, Pete Zaitcev wrote: On Fri, 31 Dec 2010 05:57:28 -0500 Jeff Garzik <j...@garzik.org> wrote: + struct cldc_tcp *tcp = private; + ssize_t rc; + struct ubbp_header ubbp; + + memcpy(ubbp.magic, "CLD1", 4); + ubbp.op_size = (buflen << 8) | 1; +#ifdef

Re: Crash with db5

2011-01-02 Thread Jeff Garzik
On 01/02/2011 08:20 PM, Pete Zaitcev wrote: Looks like Rawhide throws this if libdb-devel is in use: make check-TESTS make[3]: Entering directory `/q/zaitcev/hail/hail-tip/test/cld' PASS: prep-db DB_ENV->lsn_reset: method not permitted before handle's open method DB_ENV->dbremove: method not

[PATCH 1/3] CLD: convert back to libevent

2010-12-31 Thread Jeff Garzik
Switch CLD from hand-rolled server poll code, to libevent. Follows similar techniques and rationale as chunkd commit c1aed7464f237e5a6309351bf003162c77d69e27. This reverts ancient commit 90b3b5edcf5aa00577f4395fdbb490ed7e9be824. Signed-off-by: Jeff Garzik jgar...@redhat.com --- cld

[PATCH 2/3] CLD: switch network proto from UDP to TCP

2010-12-31 Thread Jeff Garzik
to TCP promises to make the current codebase much easier to use, while avoiding the "reinvent TCP, by using UDP" problem, which was a rabbit hole threatening CLD. Signed-off-by: Jeff Garzik jgar...@redhat.com --- chunkd/cldu.c|6 cld/cld.h| 43 ++ cld/msg.c

Re: [patch tabled 6/8] Add filesystem back-end

2010-12-13 Thread Jeff Garzik
On 11/28/2010 08:41 PM, Pete Zaitcev wrote: This patch adds the first new back-end and makes some changes to the way nodes are added, to make the invariants of storage_node more sensible. The filesystem back-end itself is not intended for production use, so it makes no attempt to run any

Re: [patch tabled 8/8] Add Swift back-end

2010-12-13 Thread Jeff Garzik
On 11/28/2010 08:41 PM, Pete Zaitcev wrote: This patch allows to use tabled with OpenStack Swift object store as if it were our chunkserver, with some extra tricks. The configuration has to be entered manually into CLD, just like in case of filesystem back-end. The code is fairly experimental,

Re: [patch hail 1/2] Add subdomain calling format

2010-12-07 Thread Jeff Garzik
On 12/05/2010 10:53 PM, Pete Zaitcev wrote: Amazon appears to give up on forcing users to migrate and bucket-in-path format is going to stay. However, they still refuse to list buckets from other regions on the default endpoint, which leads to annoying indirection (need to know the region

Re: [patch tabled 1/8] Shuffle fields of storage nodes

2010-12-07 Thread Jeff Garzik
On 11/28/2010 08:39 PM, Pete Zaitcev wrote: This helps copy-paste safer later, mostly. Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- server/object.c |2 - server/storage.c | 79 ++--- server/tabled.h | 12 +++--- 3 files changed, 53

Re: AC_CONFIG_MACRO_DIR([m4])

2010-12-06 Thread Jeff Garzik
On 12/05/2010 04:56 PM, Pete Zaitcev wrote: Autoconf printed a warning when reconfiguring Hail, so I gave up and added this: [...] Now I have a directory m4/ with symlinks... This does not seem to be helping any portability, unless I miss where the promised macro are being saved locally. What

Re: AC_CONFIG_MACRO_DIR([m4])

2010-12-06 Thread Jeff Garzik
On 12/06/2010 12:44 PM, Pete Zaitcev wrote: On Mon, 06 Dec 2010 12:32:22 -0500 Jeff Garzik <j...@garzik.org> wrote: Keeping the correct libtool macros in-tree implies adding a pointless maintenance burden. The distro always gives us correct, up-to-date files. Why would hail want to potentially

Re: [patch hail] remove duplicated stc_readport

2010-10-26 Thread Jeff Garzik
On 10/26/2010 03:47 PM, Pete Zaitcev wrote: Now that we have a common library for Hail, an opportunity opens to trim some duplication, such as stc_readport. It even had a comment about it. Note that we leave cld_readport in the API for a few weeks, while I get my tabled trees and RPMs in order.

tabled + atcp

2010-10-23 Thread Jeff Garzik
Just committed this: commit 57c4be44cdfa6c0cda6cf26d19e8048a945c5a78 Author: Jeff Garzik j...@garzik.org Date: Sat Oct 23 14:01:20 2010 -0400 Use libhail's atcp rather than our own async TCP write code. Should be functionally equivalent, as atcp originated from tabled code

Re: [PATCH hail] const-correctness tweaks

2010-10-22 Thread Jeff Garzik
On 10/20/2010 04:53 AM, Jim Meyering wrote: Jeff Garzik wrote: ... Hi Jeff. Sorry I didn't notice that the first time. I built with ./autogen.sh && ./configure && make. It looks like you recommend -Wall -Wshadow. The two warnings above are the only ones I see with the patch, and they're easy

hail version 0.7.2 released

2010-10-22 Thread Jeff Garzik
. Git shortlog attached. Jeff Garzik (12): chunkd: Add checksum table to on-disk format, one sum per 64k of data chunkd: checksum data prior to returning via GET chunk: Add Get-Partial-Object (GET_PART) operation lib/chunksrv.c: add FIXME chunkd: internal 32/64-bit

tabled version 0.5.2 released

2010-10-22 Thread Jeff Garzik
Home: https://hail.wiki.kernel.org/ Git: git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git Download: http://www.kernel.org/pub/software/network/distsrv/tabled/ Version 0.5.2 release notes (NEWS): - Permit randomly allocated TCP port, for db4 replication master - Install etc.tabled.conf as

Re: tabled version 0.5.2 released

2010-10-22 Thread Jeff Garzik
On 10/22/2010 11:39 PM, Jeff Garzik wrote: Home: https://hail.wiki.kernel.org/ Git: git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git Download: http://www.kernel.org/pub/software/network/distsrv/tabled/ Version 0.5.2 release notes (NEWS): - Permit randomly allocated TCP port, for db4

Re: hail version 0.7.2 released

2010-10-22 Thread Jeff Garzik
On 10/22/2010 11:22 PM, Jeff Garzik wrote: Home: https://hail.wiki.kernel.org/ Git: git://git.kernel.org/pub/scm/daemon/distsrv/hail.git Download: http://www.kernel.org/pub/software/network/distsrv/hail/ It seems that kernel.org mirroring is broken or extremely slow at the moment

Re: [PATCH hail] const-correctness tweaks

2010-10-20 Thread Jeff Garzik
On 10/20/2010 04:00 AM, Jim Meyering wrote: Jeff Garzik wrote: On 10/06/2010 08:07 AM, Jim Meyering wrote: Make write_cb callback's buffer parameter const, like all write-like functions. Give a few char * parameters the const attribute. Signed-off-by: Jim Meyering <meyer...@redhat.com>

Re: [PATCH hail] const-correctness tweaks

2010-10-07 Thread Jeff Garzik
On 10/06/2010 08:07 AM, Jim Meyering wrote: Make write_cb callback's buffer parameter const, like all write-like functions. Give a few char * parameters the const attribute. Signed-off-by: Jim Meyering <meyer...@redhat.com> --- It looks like most of hail's interfaces are const-correct, but one

Re: CLD multi-node status

2010-09-30 Thread Jeff Garzik
On 09/30/2010 04:55 AM, Geert Jansen wrote: is it correct that CLD is basically single-master right now? I can't find any trace of the mentioned Paxos implementation in the source. The current main branch is single-master, correct. The 'replica' branch of hail.git contains the multi-node

Re: [PATCH hail] chunkd: don't leak an FS object iterator

2010-09-30 Thread Jeff Garzik
On 09/29/2010 11:20 AM, Jim Meyering wrote: chk_list_objs called fs_list_objs_open without also calling fs_list_objs_close. 32,808 bytes in 1 blocks are definitely lost in loss record 413 of 419 at 0x4A0515D: malloc (vg_replace_malloc.c:195) by 0x31BA8A26D0: __alloc_dir

Re: [PATCH hail] chunkd: don't leak an FS object iterator

2010-09-30 Thread Jeff Garzik
On 09/29/2010 11:20 AM, Jim Meyering wrote: chk_list_objs called fs_list_objs_open without also calling fs_list_objs_close. 32,808 bytes in 1 blocks are definitely lost in loss record 413 of 419 at 0x4A0515D: malloc (vg_replace_malloc.c:195) by 0x31BA8A26D0: __alloc_dir

[chunkd patch] convert to libevent

2010-09-30 Thread Jeff Garzik
For a nice code savings... chunkd/Makefile.am |1 chunkd/chunkd.h| 28 + chunkd/cldu.c | 64 +-- chunkd/server.c| 289 + chunkd/util.c | 23 configure.ac |3 6 files changed, 116

Re: Autostart

2010-09-30 Thread Jeff Garzik
On Wed, Sep 29, 2010 at 7:09 PM, Pete Zaitcev zait...@redhat.com wrote: An interesting question is what to do when iwhd exits. I decided not to kill what was started. So, we have a little self-contained cell of tabled, chunkd, S3, based off a certain local directory or other namespace anchor.

Re: [PATCH hail] lib/hstor.c: avoid an unconditional leak in append_qparam

2010-09-27 Thread Jeff Garzik
On 09/27/2010 04:53 AM, Jim Meyering wrote: Signed-off-by: Jim Meyering <meyer...@redhat.com> --- I would have preferred to insert a single line right before the huri_field_escape call: char *v = strdup(val); [would result in a more compact, single-hunk patch] but it looks like hail uses

Re: [PATCH hail] lib/hstor.c: avoid an unconditional leak in append_qparam

2010-09-27 Thread Jeff Garzik
On 09/27/2010 12:29 PM, Pete Zaitcev wrote: On Mon, 27 Sep 2010 10:53:06 +0200 Jim Meyering <j...@meyering.net> wrote: - stmp = huri_field_escape(strdup(val), QUERY_ESCAPE_MASK); + v = strdup(val); + stmp = huri_field_escape(v, QUERY_ESCAPE_MASK); str =

Re: [hail patch 1/1] Fix calling convention of huri_field_escape

2010-09-27 Thread Jeff Garzik
On 09/27/2010 08:49 PM, Pete Zaitcev wrote: Premature optimization is the root of all evil. Use a sensible convention of not screwing with the argument, at the expense of extra strdup. Fortunately, all users are confined to Hail itself, even if huri_field_escape is exported. Signed-off-by:

Re: [tabled patch 1/1] Add a test for hstor_keys

2010-09-27 Thread Jeff Garzik
On 09/27/2010 08:52 PM, Pete Zaitcev wrote: Our current tests do not invoke hstor_keys at all, and so they did not catch a crash with double free in append_qparam. Add a very basic test which at least calls hstor_keys to verify that it does not crash right away. This test does not exercise

Re: [PATCH tabled 1/2] server/config.c: don't dereference NULL on OOM

2010-09-24 Thread Jeff Garzik
On 09/24/2010 07:32 AM, Jim Meyering wrote: You can pull from the oom branch here: git://git.infradead.org/users/meyering/tabled.git Got nearly everything perfect. Need one more minor yet important change. As described in doc/contributions.txt, every changeset MUST have a Signed-off-by

Re: [PATCH tabled 1/2] server/config.c: don't dereference NULL on OOM

2010-09-24 Thread Jeff Garzik
On 09/24/2010 01:43 PM, Jim Meyering wrote: Jeff Garzik wrote: On 09/24/2010 07:32 AM, Jim Meyering wrote: You can pull from the oom branch here: git://git.infradead.org/users/meyering/tabled.git Got nearly everything perfect. Need one more minor yet important change. As described

Re: [PATCH tabled] server/server.c (net_write_port): Don't ignore write error.

2010-09-23 Thread Jeff Garzik
On 09/23/2010 03:55 AM, Jim Meyering wrote: Better safe than sorry... Unreported write failures can be unpleasant. I fixed the one below so that a failure indication can propagate up the call tree. You might also want to report the failure to stderr. I let my editor automatically update the

[tabled patch v2] abstract out TCP-write code

2010-09-23 Thread Jeff Garzik
Changes from v1: - avoid referencing dead struct client (grep for 'invalidate_cli'), by changing FSM callback prototype. - insert 'void *priv' member into struct atcp_wr_state, and replace cb_data1/cb_data2 callback parameters with (struct atcp_wr_state *, void *). struct client / struct

Re: [tabled patch] abstract out TCP-write code

2010-09-23 Thread Jeff Garzik
On 09/23/2010 11:28 AM, Jim Meyering wrote: Every developer should have MALLOC_PERTURB_=N (N in 1..255) set in his/her environment on glibc-based systems. Almost all the time. I heard about it a while ago, even submitted a bugzilla bug to have it documented adequately. But apparently its

Re: [tabled patch] abstract out TCP-write code

2010-09-23 Thread Jeff Garzik
On 09/22/2010 10:37 PM, Pete Zaitcev wrote: On Wed, 22 Sep 2010 21:26:13 -0400 Jeff Garzik <j...@garzik.org> wrote: It is a common idiom even in GLib that callbacks receive two anonymous pointers; witness the data type GFunc's 'data' and 'user_data' arguments:

Re: [tabled patch] abstract out TCP-write code

2010-09-22 Thread Jeff Garzik
On 09/22/2010 10:37 PM, Pete Zaitcev wrote: On Wed, 22 Sep 2010 21:26:13 -0400 Jeff Garzik <j...@garzik.org> wrote: So, we go a longer route and re-hook the list of completions to a per-server global instead of a client. The patch is straightforward. The only thing we need to

Re: Reconsidering libevent

2010-09-21 Thread Jeff Garzik
On Tue, Sep 21, 2010 at 5:06 PM, Steven Dake sd...@redhat.com wrote: libevent version 2 has proper mutual exclusion, but the code needs some work. 1.x should work for chunkd at the moment. I need to resist my own urge to think too far ahead and overengineer for the future sometimes; I think

Re: [hail patch 0/3] chunkd: on-disk checksumming and get-partial operation

2010-09-15 Thread Jeff Garzik
Just pushed this out to hail.git.

Re: [PATCH] don't expect inode name to be NUL-terminated (avoid read overrun)

2010-09-14 Thread Jeff Garzik
On 09/10/2010 08:55 AM, Jim Meyering wrote: * server/msg.c (msg_get): Copy only name_len bytes, then NUL-terminate, rather than using snprintf to copy up to and including nonexistent NUL. --- valgrind exposed this. The use of snprintf would have been correct if the inode name buffer

[hail patch 0/3] chunkd: on-disk checksumming and get-partial operation

2010-09-14 Thread Jeff Garzik
This patchset is just about ready to go upstream. Just need to write a couple tests (familiar refrain eh?:)). These changes add a new Get-Partial-Object (GET_PART) chunkd operation. GET_PART permits partial retrieval of an object, by adding an (offset,length) pair to the standard Get-Object

[hail patch 1/3] chunkd: Add checksum table to on-disk format

2010-09-14 Thread Jeff Garzik
commit f1de17a6e2b3afdbfbfa581228280b65a4a17e5f Author: Jeff Garzik j...@garzik.org Date: Thu Aug 5 17:47:03 2010 -0400 chunkd: Add checksum table to on-disk format, one sum per 64k of data Signed-off-by: Jeff Garzik jgar...@redhat.com chunkd/be-fs.c | 162

Re: [tabled patch 4/5] Support auto replicaton port

2010-08-13 Thread Jeff Garzik
On 08/12/2010 03:22 PM, Pete Zaitcev wrote: Allow random ports for replication master to listen on. The patch is somewhat larger than expected, because before we had the MASTER file written right after locking. Now we may have it written without listening parameters, and the slaves must be

Re: [tabled patch 1/3] make a const struct static

2010-08-10 Thread Jeff Garzik
On 08/05/2010 11:40 PM, Pete Zaitcev wrote: Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- server/server.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) commit 93b990f68e5c2c652759a2db8af049d172b8489c Author: Pete Zaitcev <zait...@yahoo.com> Date: Thu Aug 5 20:33:21 2010 -0600

Re: [hail patch 2/3] fix 32/64 wire interoperability

2010-08-05 Thread Jeff Garzik
On 08/04/2010 07:16 PM, Pete Zaitcev wrote: Testing found that tabled and chunkd running on CPUs with different word length cannot talk to each other. The bug was introduced by commit ea5d20bc22aeed077312c9c1824e84651af17a16. The fix is to add named padding that takes the place of the

Re: [hail patch 1/1] Make host, url, orig_path dynamic

2010-07-29 Thread Jeff Garzik
On 07/29/2010 01:41 PM, Pete Zaitcev wrote: On Tue, 20 Jul 2010 16:34:19 -0400 Jeff Garzik <j...@garzik.org> wrote: lib/hstor.c | 147 +++--- 1 file changed, 104 insertions(+), 43 deletions(-) applied It's not in the git repo. Check this URL:

Re: [hail patch 1/7] Drop old comments about chunkdc

2010-07-29 Thread Jeff Garzik
On 07/29/2010 10:49 PM, Pete Zaitcev wrote: Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- configure.ac |2 -- 1 file changed, 2 deletions(-) commit 00be6055a3801ef8e84a4c78b43b43b67a76eab9 Author: Pete Zaitcev <zait...@yahoo.com> Date: Thu Jul 29 19:10:05 2010 -0600 Drop

Re: [hail patch 1/1] Make host, url, orig_path dynamic

2010-07-20 Thread Jeff Garzik
On 07/20/2010 04:16 PM, Pete Zaitcev wrote: Some of my performance tests for tabled hit truncation again: [zait...@hitlain tests]$ ./poke5 -v -h niphredil.zaitcev.lan -u auser -p apass -b test -o -k testkey-hitlain/73b84a11e6d83c65e45853338d646042 -f

[PATCH 4/4] chunkd checksums each block, as it is read from disk

2010-07-18 Thread Jeff Garzik
Note that we are checksumming hot cache data, so SHA1 isn't as punishing as one might think. chunkd/be-fs.c | 51 ++- 1 file changed, 50 insertions(+), 1 deletion(-) commit 2211e3b58620093866be4130397cb3b476620725 Author: Jeff Garzik j

Re: [PATCH 1/3 v2] chunkd: remove sendfile(2) zero-copy support

2010-07-18 Thread Jeff Garzik
On 07/17/2010 11:45 PM, Steven Dake wrote: On 07/16/2010 10:46 PM, Jeff Garzik wrote: chunkd: remove sendfile(2) zero-copy support chunkd will be soon checksumming data in main memory. That removes the utility of a zero-copy interface which bypasses the on-heap data requirement. Signed-off

[PATCH 2/3 v2] chunkd: Add checksum table to on-disk format, one sum per 64k of data

2010-07-16 Thread Jeff Garzik
chunkd/be-fs.c | 145 +--- chunkd/chunkd.h |3 + 2 files changed, 131 insertions(+), 17 deletions(-) commit 394109d5c2fc2d15d91c2d36eecd57594922c1b3 Author: Jeff Garzik j...@garzik.org Date: Sat Jul 17 01:05:15 2010 -0400 chunkd

[PATCH 0/3] update chunkd checksum verification scheme

2010-07-15 Thread Jeff Garzik
This patchset is part of the work necessary to get ranged-GET (aka partial GET) working. As explained in http://marc.info/?l=hail-devel&m=127871407125539&w=2 the current chunkd checksum scheme does not work at all for partial retrievals, and must be revamped. These patches present step 1 of 4,

[PATCH 1/3] chunkd: remove sendfile(2) support

2010-07-15 Thread Jeff Garzik
commit d663521ba7e6a808be02633e57dbeb7a95973c0f Author: Jeff Garzik j...@garzik.org Date: Thu Jul 15 13:50:10 2010 -0400 chunkd: remove sendfile(2) zero-copy support chunkd will be soon checksumming data in main memory. That removes the utility of a zero-copy interface which

[PATCH 3/3] chunkd: on-disk format stores per-64k checksums

2010-07-15 Thread Jeff Garzik
commit e6fcc02bea062af291148771a59ee2028ae98834 Author: Jeff Garzik j...@garzik.org Date: Thu Jul 15 13:57:17 2010 -0400 chunkd: Add checksum table to on-disk format, one sum per 64k of data Signed-off-by: Jeff Garzik jgar...@redhat.com chunkd/be-fs.c | 145

Re: New 'hail' repository created, with major packaging rework

2010-07-07 Thread Jeff Garzik
the following out to hail.git. If people disagree with naming, now's the time to speak up. commit 5188f48dd3c73ce86f2bc453a326ee0bf40fd6db Author: Jeff Garzik j...@garzik.org Date: Wed Jul 7 02:16:28 2010 -0400 libhail: Import httpstor, httputil modules from tabled With the following

Re: [PATCH] tabled: use httpstor API from libhail

2010-07-07 Thread Jeff Garzik
On Wed, Jul 07, 2010 at 06:38:22PM -0400, Jeff Garzik wrote: Just committed the following to tabled.git on my local laptop, on a side branch. This won't be pushed onto the main tabled branch until Friday, to give people time to convert as zaitcev suggested in the 'new hail repository' thread

Re: [PATCH] chunkd: add cp command, for local intra-table copies

2010-07-06 Thread Jeff Garzik
On 07/06/2010 11:17 AM, Pete Zaitcev wrote: On Tue, 6 Jul 2010 03:24:29 -0400 Jeff Garzik <j...@garzik.org> wrote: The following patch, against current hail.git, adds the CP command to chunkd, permitting copying from object->object inside a single table. What is it for? Fun! :) More

Re: [PATCH] chunkd: add cp command, for local intra-table copies

2010-07-06 Thread Jeff Garzik
On 07/06/2010 11:17 AM, Pete Zaitcev wrote: On Tue, 6 Jul 2010 03:24:29 -0400 Jeff Garzik <j...@garzik.org> wrote: The following patch, against current hail.git, adds the CP command to chunkd, permitting copying from object->object inside a single table. What is it for? Here's a real-world

tabled: Some Amazon S3 features to consider

2010-07-06 Thread Jeff Garzik
Here are a few interesting things that have appeared in the S3 API since its initial release: 1) Object versioning. All objects now uniquely identified by (key, version) pair. API compatibility is maintained by supporting the notion of current version. 2) Object copying. Rather than an

[PATCH] chunk: add CP operation

2010-07-06 Thread Jeff Garzik
This patch * adds local, intra-table copy operation to chunkd/libhail * illustrates what files need updating, when adding a new op to chunk * adds some 'worker' infrastructure which should help with future ops, notably remote copy (RCP) * should assist tabled's implementation of S3 copy

stor_obj_test

2010-07-06 Thread Jeff Garzik
This function seems to be missing the meat. It retrieves then disposes of a keylist. bool stor_obj_test(struct open_chunk *cep, uint64_t key) { struct st_keylist *klist; if (!cep->stc) return false; klist = stc_keys(cep->stc); if (!klist)

chunkd on-disk and network protocol format change

2010-07-06 Thread Jeff Garzik
perform the conversion at list-objects time. Jeff commit ea5d20bc22aeed077312c9c1824e84651af17a16 Author: Jeff Garzik j...@garzik.org Date: Wed Jul 7 00:51:48 2010 -0400 [chunk] protocol, disk fmt: Replace ASCII checksum representation with binary Rather than converting SHA1

Re: New 'hail' repository created, with major packaging rework

2010-07-05 Thread Jeff Garzik
On 07/05/2010 03:13 PM, Pete Zaitcev wrote: On Fri, 02 Jul 2010 02:59:20 -0400 Jeff Garzik <j...@garzik.org> wrote: git://git.kernel.org/pub/scm/daemon/distsrv/hail.git libhail is a single shared library binary, linking together cldc, ncld, libtimer, and chunkdc modules. In other

chunkd near-term enhancements

2010-07-04 Thread Jeff Garzik
Here are a few chunkd enhancements that are currently on my drawing board, for the near term: (CHO_xxx denotes new chunkd network protocol commands, as listed in include/chunk_msg.h) * CHO_SET_SERVERS: chunkd shall maintain a per-connection buffer known as SERVER_LIST. This chunkd command

New 'hail' repository created, with major packaging rework

2010-07-02 Thread Jeff Garzik
A new git repository git://git.kernel.org/pub/scm/daemon/distsrv/hail.git was created, preserving the full histories of cld.git and chunkd.git. The existing cld.git and chunkd.git repositories have been left untouched, for now. I also have not yet updated tabled.git for this new

Hail version 0.7 released

2010-07-02 Thread Jeff Garzik
Version 0.7 of hail core services has been released, at the expected places: http://www.kernel.org/pub/software/network/distsrv/hail/ ftp://ftp.kernel.org/pub/software/network/distsrv/hail/ git://git.kernel.org/pub/scm/daemon/distsrv/hail.git This release replaces

tabled version 0.5 released

2010-07-02 Thread Jeff Garzik
Coinciding with hail core v0.7 release is this tabled release, v0.5, at the usual places: git://git.kernel.org/pub/scm/daemon/distsrv/tabled.git http://www.kernel.org/pub/software/network/distsrv/tabled/ ftp://ftp.kernel.org/pub/software/network/distsrv/tabled/ Release notes: - update for

Re: [tabled patch 1/1] Stagger the start-daemon

2010-07-01 Thread Jeff Garzik
On 06/30/2010 10:49 AM, Pete Zaitcev wrote: My rule of thumb is that magic delays are evil or stupid, so I worked on eliminating them from our scripts. However, in this case it's just not worth it, because the result is that we have to wait way more than 100s for several cycles of CLD timeouts

[RFC] new structure: hail pkg instead of cld, chunkd

2010-06-29 Thread Jeff Garzik
I've been thinking about a new structure for the projects, namely having a single hail or hail-core package, that includes cld and chunkd services, and associated client libraries inside a new libhail. In real terms, it would look like this: cld - hail libcldc

Re: Metadata replication in tabled

2010-06-25 Thread Jeff Garzik
On 06/24/2010 08:31 PM, Pete Zaitcev wrote: I worked on fixing the metadata replication in tabled. There were some difficulties in existing code, in particular the aliasing between the hostname used to identify nodes and the hostname used in bind() for listening was impossible to work around in

Re: Zookeeper instead of CLD in Hail

2010-06-07 Thread Jeff Garzik
On 06/04/2010 11:27 PM, Pete Zaitcev wrote: I heard people say they cribbed from the same Chubby paper, but it's bollocks. It's absolutely nothing like what Chubby implies. No locks for one thing. To be sure, Zookeeper provides a canned piece of code which implements locks, kinda like you can

Re: [chunkd patch 4/6] Print client port

2010-05-25 Thread Jeff Garzik
On 05/21/2010 12:54 AM, Pete Zaitcev wrote: - host, sizeof(host), NULL, 0, NI_NUMERICHOST); + host, sizeof(host), port, sizeof(port), NI_NUMERICHOST); host[sizeof(host) - 1] = 0; - applog(LOG_INFO, "client %s connected%s", host, +

Re: [chunkd patch 1/6] Fix the leak of suddenly closed connections

2010-05-25 Thread Jeff Garzik
On 05/21/2010 12:54 AM, Pete Zaitcev wrote: After a period of uptime, chunkd may stop working with this: May 20 08:51:47 azdragon2 chunkd[4034]: tcp accept: Too many open files An examination with lsof shows that file descriptors for sockets and object data files are leaked in neat pairs. As

Re: [chunkd patch 6/6] Make cli_wr_set_poll bool

2010-05-25 Thread Jeff Garzik
On 05/21/2010 12:54 AM, Pete Zaitcev wrote: The upside of this cleanup is an ease of reading and evaluating with fewer control paths. [This patch will only work if patch 2/6 is applied. Sorry.] Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- server/chunkd.h |2 +- server/object.c |

Re: [tabled patch 1/1] fix the selection of chunk

2010-05-25 Thread Jeff Garzik
On 05/25/2010 11:30 PM, Pete Zaitcev wrote: If a chunkserver goes down, tabled sometimes throws a phantom object not found. It happens because we keep hitting the same down node and exhaust the retries. The existing code calls rand() every time and hopes for the best, but this is too likely to

Re: iSCSI front-end for Hail

2010-05-06 Thread Jeff Garzik
As of itd commit 196e8f317fc7202460d7adde93dac939caf23f5d, the iSCSI target daemon appears to survive stress tests, and does not leak memory. I call that a good first milestone. Jeff

Re: iSCSI front-end for Hail

2010-05-05 Thread Jeff Garzik
As of commit 23a5795e3ca555a6454b199e071482bb50655508, itd is passing integrity and stress tests from two test suites, iscsi-harness found in netbsd-iscsi pkg, and basic blkdev integrity tests using dd(1). There is a whopping big memory leak that needs fixing, but the basics appear to be

Re: [cld patch 1/1] use specified username in cldcli

2010-05-02 Thread Jeff Garzik
On 05/03/2010 12:07 AM, Pete Zaitcev wrote: I suspect I copy-pasted over it when I converted to ncld, but anyhow this patch seems to work and do what's expected for --user flag. Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- tools/cldcli.c |2 +- 1 file changed, 1 insertion(+), 1

Re: [chunkd patch 1/2] eradicate last vestiges of libevent

2010-05-01 Thread Jeff Garzik
On 05/01/2010 12:51 AM, Pete Zaitcev wrote: We stopped using libevent in Chunk a while ago, but for some reason not all references were removed. I tested this patch by building on a fresh Fedora 13 system without libevent. Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- configure.ac

iSCSI front-end for Hail

2010-05-01 Thread Jeff Garzik
Hail devs, Project Hail was, in part, conceived as an umbrella of libraries and services enabling the mating of a well known, Internet-standard API with a back-end that enables distributed storage. tabled is an example of this: it provides an application front-end compatible with S3 API,

Re: [Patch 09/12] tabled: drop double prefixing

2010-04-18 Thread Jeff Garzik
On 04/18/2010 12:42 AM, Pete Zaitcev wrote: On Fedora 14, the following is seen in syslog: Apr 17 19:58:52 niphredil tabled: tabled: connecting to site hitlain.zaitcev.lan:8083: No route to host Apr 17 19:58:56 niphredil tabled: tabled: DB_ENV-rep_elect:WARNING: nvotes (1) is sub-majority

Re: [Patch 1/8] CLD: cleanup: add cld_msg_rpc.x

2010-04-17 Thread Jeff Garzik
On 04/16/2010 10:18 PM, Pete Zaitcev wrote: On Wed, 14 Apr 2010 15:55:01 -0400 Jeff Garzik <j...@garzik.org> wrote: +++ b/lib/Makefile.am @@ -27,6 +27,7 @@ libcldc_la_SOURCES= \ common.c\ libtimer.c \ pkt.c \ +

Re: Trivial Q about chunkd's main_loop

2010-04-17 Thread Jeff Garzik
On 04/17/2010 09:36 PM, Pete Zaitcev wrote: Is there a reason why the main_loop in chunkd uses a naked g_hash_table_lookup instead of srv_poll_lookup? Performance? @@ -1681,8 +1681,7 @@ static int main_loop(void) fired++; - sp =

tabled RPM build fails before it succeeds

2010-04-16 Thread Jeff Garzik
The same source, same spec. Build #1 (fails on x86_64): http://koji.fedoraproject.org/koji/taskinfo?taskID=2119825 Build #2 (fails on i686): http://koji.fedoraproject.org/koji/taskinfo?taskID=2120174 Build #3 (success on all platforms):

Re: [Patch 2/8] CLD: cleanup: add a log entry about sent packet

2010-04-14 Thread Jeff Garzik
On 04/14/2010 02:34 PM, Pete Zaitcev wrote: Currently, there's nothing in the verbose output about sent packets at all. No, really! This is very confusing, even if I run tcpdump in the same time. I think we should add this. Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- lib/cldc.c |2

Re: [Patch 7/8] tabled: cleanup: add #include

2010-04-14 Thread Jeff Garzik
On 04/14/2010 02:35 PM, Pete Zaitcev wrote: Same as everywhere else: missing prototypes, so implementations are not actually matched by the compiler. Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- lib/readport.c |1 + test/libtest.c |1 + 2 files changed, 2 insertions(+)

Re: [Patch 1/8] CLD: cleanup: add cld_msg_rpc.x

2010-04-14 Thread Jeff Garzik
On 04/14/2010 02:33 PM, Pete Zaitcev wrote: You know what's weird... Without this, I cannot build an RPM at all, the rpmbuild complains about unpackaged files and aborts. But everyone else seems to have no problem? Strange. BTW, I am on Fedora 14. Signed-off-by: Pete Zaitcev <zait...@redhat.com>

Re: [Patch 1/3] CLD: End-to-end verbosity

2010-04-06 Thread Jeff Garzik
On 03/31/2010 08:43 PM, Pete Zaitcev wrote: diff --git a/server/server.c b/server/server.c index 3208e0f..2d68ee6 100644 --- a/server/server.c +++ b/server/server.c @@ -55,7 +55,7 @@ static struct argp_option options[] = { Store database environment in DIRECTORY. Default:

Re: [Patch 1/7] tabled: make two dump displays uniform

2010-04-06 Thread Jeff Garzik
On 04/01/2010 09:51 PM, Pete Zaitcev wrote: From: Jeff Garzik <jgar...@pobox.com> Subject: Re: Tabled issues Date: Mon, 29 Mar 2010 15:32:33 -0400 I asserted that the standard stats dump facility must dump all available statistics. That does not exclude other methods of stat(us) dumping. Your

Re: [Patch 2/7] tabled: fix the endless recusion when reading long objects

2010-04-06 Thread Jeff Garzik
On 04/01/2010 09:51 PM, Pete Zaitcev wrote: At certain network and disk speeds, tabled can blow its stack by filling it with (essentially) endless recursion: #2 0x0040c077 in cli_write_free (cli=<value optimized out>, tmp= 0x7bb910, done=<value optimized out>) at server.c:397 #3

Re: [Patch 1/3] CLD: End-to-end verbosity

2010-04-06 Thread Jeff Garzik
On 04/06/2010 11:32 PM, Pete Zaitcev wrote: On Tue, 06 Apr 2010 10:40:33 -0400 Jeff Garzik <j...@garzik.org> wrote: The debug levels are 0: key messages affecting server operation, only 1: debugging output enabled, sans per-packet output 2: debugging output enabled,

Re: CLD doesn't build on db-4.3

2010-04-01 Thread Jeff Garzik
On 04/01/2010 07:01 AM, Samba - BoYang wrote: hi, * CLD doesn't build on db-4.3 on suse 11, since db-4.3 uses deprecated structure members DBC->c_xxx (c_close(), etc) instead of DBC->xxx. :-) It won't build on db-4.4, either. probably won't build on db-4.5, as db-5.0 says DBC->xxx was

Re: [Patch 1/1] tabled: fix a crash when looking up non-existing NID

2010-03-29 Thread Jeff Garzik
On 03/28/2010 09:57 PM, Pete Zaitcev wrote: Signed-off-by: Pete Zaitcev <zait...@redhat.com> --- server/storage.c |3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) applied

Re: [PATCH] chunkd: fix duplicate stc_object allocation in stc_parse_key()

2010-03-16 Thread Jeff Garzik
On 03/16/2010 05:59 AM, Akinobu Mita wrote: At the beginning of stc_parse_key(), st_object is allocated twice for the same variable. Signed-off-by: Akinobu Mita <akinobu.m...@gmail.com> --- lib/chunkdc.c |2 +- 1 files changed, 1 insertions(+), 1 deletions(-) good catch, applied
