[PVFS2-developers] patches: acache and perf counters

2005-12-13 Thread Phil Carns
These patches largely depend on the acache patches from this posting: http://www.beowulf-underground.org/pipermail/pvfs2-developers/2005-November/001606.html perf-counter-api.patch: - This patch overhauls the performance counter api to add quite a bit of

[PVFS2-developers] patches: pvfs2-server shutdown fixes

2005-12-13 Thread Phil Carns
These patches fix problems that can crop up if a server is shut down while it is still trying to service I/O operations. The first one depends on perf counter patches from http://www.beowulf-underground.org/pipermail/pvfs2-developers/2005-December/001704.html, but only because of a minor

Re: [PVFS2-developers] Removing of a directory

2005-12-16 Thread Phil Carns
That is a good point. As it turns out there is a cheap way to add this extra check. sys-rename has a bit of code that looks like this (which was part of a fix to some rename bugs I ran into a while back): if(attr->objtype == PVFS_TYPE_DIRECTORY && attr->u.dir.dirent_count != 0) {
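The check quoted in this message can be sketched in isolation as follows. This is a minimal sketch, not the PVFS2 source: the struct layout, the SKETCH_TYPE_* constants, and the helper name dir_is_nonempty are stand-ins invented for illustration. Only the field names objtype and u.dir.dirent_count come from the quoted line, and what the real code does inside the branch is not shown in the snippet.

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in object types; the real PVFS2 constants are not reproduced here. */
enum sketch_objtype { SKETCH_TYPE_DATAFILE, SKETCH_TYPE_DIRECTORY };

/* Hypothetical, trimmed-down attribute structure (assumption). */
struct sketch_attr {
    enum sketch_objtype objtype;
    union {
        struct { uint64_t dirent_count; } dir;
    } u;
};

/* Returns true only for a directory that still holds entries. */
bool dir_is_nonempty(const struct sketch_attr *attr)
{
    return attr->objtype == SKETCH_TYPE_DIRECTORY &&
           attr->u.dir.dirent_count != 0;
}
```

A caller could use a helper like this to refuse an operation on a directory that still has entries, which seems to be the intent of the extra check discussed here.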

Re: [PVFS2-developers] patches: acache and perf counters

2005-12-30 Thread Phil Carns
= GETATTR_ACACHE_MISS; return 1; --- This was causing cache lookups to succeed sometimes when not all of the attributes were present. This showed up when trying to untar a big tarball onto pvfs2 with the acache enabled. -Phil Phil Carns wrote: Great- thanks

Re: [PVFS2-developers] patches: acache and perf counters

2006-01-04 Thread Phil Carns
Two more small changes for perf-counter.c: - line 593 still has an Ld reference (change to lld) - line 174: pc->start_time_array_ms[0] = tv.tv_sec*1000 + should be: pc->start_time_array_ms[0] = ((uint64_t)tv.tv_sec)*1000 + -Phil Phil Carns wrote: Ok, one more update. I think this is just
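The reason for the cast can be illustrated with a small sketch. It assumes a platform where the seconds field is 32 bits wide; the function names are hypothetical, and the remainder of the original expression (truncated after the "+") is deliberately left out. Widening to 64 bits before multiplying keeps the full millisecond value, whereas 32-bit arithmetic keeps only the product modulo 2^32.

```c
#include <stdint.h>

/* Widen the seconds to 64 bits before scaling, as the suggested fix does. */
uint64_t sec_to_ms_widened(int32_t sec)
{
    return ((uint64_t)sec) * 1000;
}

/* What 32-bit arithmetic retains: the product modulo 2^32.
 * Unsigned math is used here so the wraparound is well defined in C;
 * the signed multiplication in the original line would overflow instead. */
uint32_t sec_to_ms_wrapped(int32_t sec)
{
    return (uint32_t)((uint32_t)sec * 1000u);
}
```

For any timestamp from 2006 the two functions disagree, since a value on the order of 10^12 milliseconds does not fit in 32 bits.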

Re: [Pvfs2-developers] patches: dirent_count check bug fixes

2006-01-24 Thread Phil Carns
Sam Lang wrote: On Jan 20, 2006, at 1:50 PM, Phil Carns wrote: pvfs2-remove-dirent-count.patch: === This patch is straightforward; it simply removes the dirent_count check from the sys-remove path. The main problem with this check is that it relies on attributes which

Re: [Pvfs2-developers] Problem with multiple pvfs2 file systems mounted on a single client

2006-03-01 Thread Phil Carns
I was able to replicate it in a little bit simpler environment this afternoon. It looks like the problem is with the statfs and/or mount upcalls. The problem with those two is that they are serviced in pvfs2-client-core using blocking functions- so if one of them hangs on a long network

Re: [Pvfs2-developers] tuning the 2.6 kernels for write performance

2006-03-24 Thread Phil Carns
Murali Vilayannur wrote: Hi Phil, First of all, great work! There are 2 other parameters that I thought could also make an impact a) Choice of file-system (this was investigated by Nathan last year, I think) as well as choice of journaling modes. We looked at this a little bit again recently,

[Pvfs2-developers] patches: misc. bug fixes

2006-03-31 Thread Phil Carns
sys-io-error-handling.patch: This patch fixes a variety of error handling issues in sys-io.sm. The error handling paths in here still seem like they may be a little fragile, but this patch addresses a couple of specific areas where we could report an error faster

[Pvfs2-developers] patches: misc. gossip additions

2006-03-31 Thread Phil Carns
I don't know if all of this is generally useful, but some of it may be. job-flow-err-log.patch: --- - adds flow pointer value to several debugging messages so they are easier to match up - adds error messages if a flow is cancelled. It seems like it is helpful to see in

Re: [Pvfs2-developers] PVFS2 and 32/64 bit oddities (take 3)

2006-04-04 Thread Phil Carns
Is INTPTR_MIN defined in the kernel headers somewhere as well? I am having a hard time compiling the kernel module at the moment because /kernel/linux-2.6/pvfs2-utils.c ends up pulling in pvfs2-sysint.h. I am using the 2.6.15.4 kernel. -Phil Sam Lang wrote: This seems to work everywhere

Re: [Pvfs2-developers] PVFS2 and 32/64 bit oddities (take 3)

2006-04-05 Thread Phil Carns
, Phil Carns wrote: Actually, it does compile now that I look closer (I had a second problem confusing me), but it does generate quite a few warnings like this: pvfs2/include/pvfs2-sysint.h:44:5: warning: INTPTR_MIN is not defined pvfs2/include/pvfs2-sysint.h:44:19: warning: INT32_MIN

Re: [Pvfs2-developers] PVFS2 and 32/64 bit oddities (take 3)

2006-04-05 Thread Phil Carns
to fix it up so that we have something that works in both cases. -sam On Apr 5, 2006, at 11:48 AM, Phil Carns wrote: Ahh, -Wundef is the difference. That flag is getting set automatically on my laptop for some reason but not on the other boxes that I try. -Phil Phil Carns wrote: I'll see

Re: [Pvfs2-developers] PVFS2 and 32/64 bit oddities (take 3)

2006-04-05 Thread Phil Carns
On Apr 5, 2006, at 11:48 AM, Phil Carns wrote: Ahh, -Wundef is the difference. That flag is getting set automatically on my laptop for some reason but not on the other boxes that I try. -Phil Phil Carns wrote: I'll see if I can figure out something to do based on Nathan's suggestion. Looking

[Pvfs2-developers] kernel build problems on 64bit RHEL4

2006-04-25 Thread Phil Carns
The current CVS head is giving this build error on a 64bit RHEL4 box that is running 2.6.9-22.0.1.ELsmp. Anyone have any ideas? CC [M] /home/pcarns/working/pvfs2/vendor/build/src/kernel/linux-2.6/devpvfs2-req.o /home/pcarns/working/pvfs2/vendor/build/src/kernel/linux-2.6/devpvfs2-req.c: In

Re: [Pvfs2-developers] problems creating storage space with -f option

2006-04-26 Thread Phil Carns
machine... Patch also fixes a string parsing bug in dbpf_mkpath() which fails to create a directory whose path is /opt/pvfs2/1 where the last component is a single character... Let me know, Thanks, Murali On Wed, 26 Apr 2006, Phil Carns wrote: After poking at this a little, it looks like

[Pvfs2-developers] patches: karma

2006-05-18 Thread Phil Carns
Whoops- that was a little big for the mailing list. This one has gzip'd patches. Original Message Subject: patches: karma Date: Thu, 18 May 2006 17:10:34 +0200 From: Phil Carns [EMAIL PROTECTED] To: PVFS2-developers pvfs2-developers@beowulf-underground.org perf-mon.patch

[Pvfs2-developers] patches: server bugs and logging enhancements

2006-05-18 Thread Phil Carns
server-logging.patch: - I thought we did this already, but it may have gotten lost in the shuffle somewhere. Anyway, this patch does two things: - make sure that the starting message is always printed both on stdout and in the log files when a server starts, regardless of

[Pvfs2-developers] patches: system interface cleanups

2006-05-18 Thread Phil Carns
msgpairarray-decode.patch: This cleans up how decoding errors (for responses) are handled in msgpairarray. It now saves the error in a slot specific to each particular msgpair, so that the problem will show up in EDETAIL if possible, rather than as a generic problem

Re: [Pvfs2-developers] patches: pvfs2-ping, pvfs2-genconfig, test programs

2006-05-18 Thread Phil Carns
Robert Latham wrote: On Thu, Apr 06, 2006 at 02:40:01PM -0600, Bart Taylor wrote: As an example, suppose that you are using a SAN and you want one particular server to have a dedicated LUN for metadata. In that case you might want to specify something like this: --iospecs

Re: [Pvfs2-developers] patches: karma

2006-05-23 Thread Phil Carns
Robert Latham wrote: On Thu, May 18, 2006 at 05:10:34PM +0200, Phil Carns wrote: Index: pvfs2_src/src/server/perf-mon.sm === --- pvfs2_src/src/server/perf-mon.sm(revision 1541) +++ pvfs2_src/src/server/perf-mon.sm(revision

[Pvfs2-developers] extra trove error messages

2006-05-24 Thread Phil Carns
If I start a pvfs2-server in the foreground (with the -d option), I see quite a few of these messages on the console when accessing the file system, even with just an ls: pvfs2-trove-dbpf: DB-get: DB_NOTFOUND: No matching key/data pair found I assume this isn't really an error; trove was

Re: [Pvfs2-developers] patches: server bugs and logging enhancements

2006-05-24 Thread Phil Carns
rev-lookup-hostnames-runtime.patch: -- Can you tweak this so that the check for HAVE_GETHOSTBYADDR is retained? I know it's a pain, and I can't remember the specifics right now, but we probably need that check for BGL. ==rob Ahh, sure thing. The version of PVFS2

[Pvfs2-developers] depend.sh changes (CC variable)

2006-06-01 Thread Phil Carns
There was a change recently committed to allow depend.sh to work with different compilers. This seems to work pretty well, but I have one suggestion. Could we change the following line in Makefile.in? $(E)CC=$(CC) $(srcdir)/maint/depend.sh ... to $(E)CC=$(CC) $(srcdir)/maint/depend.sh

Re: [Pvfs2-developers] patches: pvfs2-ping, pvfs2-genconfig, test programs

2006-06-01 Thread Phil Carns
have now, as well as covering pretty much all the use cases you guys have. Let me know if you have any problems/concerns. -sam On May 18, 2006, at 11:26 AM, Phil Carns wrote: Robert Latham wrote: On Thu, Apr 06, 2006 at 02:40:01PM -0600, Bart Taylor wrote: As an example, suppose that you

[Pvfs2-developers] patches: bug fixes

2006-06-07 Thread Phil Carns
All of the patches listed in this email are limited scope- they just fix specific bugs and don't make any protocol or storage format changes: pvfs2-kernel-permissions.patch: --- The pvfs2_permission() function is significantly different (due to #ifdefs) based on whether or not the

[Pvfs2-developers] patch: tcache terminology

2006-06-07 Thread Phil Carns
tcache-terminology.patch: -- This patch clarifies/fixes some terminology disagreement between the tcache documentation (in doxygen) and what the tcache actually does. This patch does not change semantics or behavior, but there is a function rename. This is basically what is

[Pvfs2-developers] patches: pvfs2-validate

2006-06-07 Thread Phil Carns
This is a work in progress, but we wanted to go ahead and share some patches to see if anyone has any comments, etc. The last patch in this email (pvfs2-validate.patch) implements a tool similar to pvfs2-fsck that takes a different approach and adds some different functionality. The first two

[Pvfs2-developers] help debugging request processor/distribution

2006-06-12 Thread Phil Carns
Hi all, I am looking at an I/O problem that I don't completely understand. The setup is that there are 15 servers and 20 clients (all RHEL3 SMP). The clients are running a proprietary application. At the end of the run they each write their share of a data set into a 36 GB file. So each

Re: [Pvfs2-developers] help debugging request processor/distribution

2006-06-13 Thread Phil Carns
.. Thanks, Murali On Tue, 13 Jun 2006, Phil Carns wrote: I went back and added some much more specific debugging messages and put some special prefixes on flow, bmi, and request processor messages so I could group them a little easier, and got rid of the extra mutexes. After running a few more tests

Re: [Pvfs2-developers] help debugging request processor/distribution

2006-06-14 Thread Phil Carns
, but likely not the most efficient. Rob Phil Carns wrote: Sorry- rendezvous is the wrong terminology here for what is happening within bmi_tcp at the individual message level. It doesn't implicitly exchange control messages before putting each buffer on the wire. bmi_tcp will send any size message

Re: [Pvfs2-developers] help debugging request processor/distribution

2006-06-16 Thread Phil Carns
Pete Wyckoff wrote: [EMAIL PROTECTED] wrote on Wed, 14 Jun 2006 13:09 -0500: I've attached a patch of the sys-io state machine from trunk that doesn't post the flow until the initial response is received. Phil and Murali, can you let me know if this fixes your respective problems? Also,

Re: [Pvfs2-developers] patches: directory hints

2006-06-16 Thread Phil Carns
These patches are based on part of the functionality Murali originally contributed in this email thread: http://www.beowulf-underground.org/pipermail/pvfs2-developers/2005-November/001624.html Thanks for the patches! No, thank you - you did all of the hard work! Examples: [EMAIL

Re: [Pvfs2-developers] Re: BMI TCP socket close for sock buf size

2006-07-17 Thread Phil Carns
Seems reasonable. BTW, we've talked about this already, but since the msgpairarray state machine is the current topic, I'll reiterate some of my ideas. It's written in such a way that at present it can't be used by sys-io.sm. The problem is that it blocks (doesn't complete) waiting for a

Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns
Walter B. Ligon III wrote: OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem for this: the parent would post a job to wait for

Re: [Pvfs2-developers] bmi_thread_function question

2006-07-26 Thread Phil Carns
Sam Lang wrote: Hi All, I noticed that in thread-mgr.c:bmi_thread_function, there is a call to BMI_test_unexpected if the bmi_unexp_count > 0, which only happens on the server. The value of bmi_unexp_count on the server can be as high as 50 from the job_bmi_unexp posts in server.c. It

Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns
Phil, first your questions: The parent will push a frame onto a stack for each child it is starting. A frame is everything that used to be in either a s_op or sm_p on the server or client, except for the stuff that actually runs the SM (now in an smcb). The parent can pass in anything it

Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns
I don't see why the two have to be dependent for this to work. Do you mean by the parent posting a job, the state machine stepping code would handle the actual posting? I was assuming that the parent state action could just call job_concurrent_sm_post (or whatever it's called).

Re: [Pvfs2-developers] terminating state machines

2006-07-26 Thread Phil Carns
some of yours) Walt Phil Carns wrote: Walter B. Ligon III wrote: OK, guys, I have another issue I want input on. When child SMs terminate they have to notify their parent. The parent has to wait for all the children to terminate. So I've been thinking to use the job subsystem

Re: [Pvfs2-developers] terminating state machines

2006-07-27 Thread Phil Carns
Hmm...I had been thinking about a flow implementation that used the new concurrent state machine code...it sounds like that's a bad idea because the testing and restarting would take too long to switch between bmi and trove? We use the post/test model through pvfs2 though, so maybe I

Re: [Pvfs2-developers] acache timeout

2006-07-27 Thread Phil Carns
Dean Hildebrand wrote: Thanks for the info Phil. While I'm running an I/O performance experiment, I never want the attributes to expire, so I'm using the -a option to set it to a really high value. But for the average pvfs2 setup, what is a useful value? With 5msec, the cache was continually

Re: [Pvfs2-developers] terminating state machines

2006-07-27 Thread Phil Carns
Thanks for the detailed explanation Phil. I hadn't thought about the context switches that might slow down flow. I was primarily thinking of something that would be cleaner, and easier to modify and test for different scenarios. If at some point I get around to playing with a flow

[Pvfs2-developers] patch roundup

2006-08-10 Thread Phil Carns
I am just following up on the status of a few patches that are out on the mailing list to update my notes. I have divided them up based on where I think they stand. Do any of these (aside from the last one listed) need further work? Any that are definitely declined? Should be ready?

Re: [Pvfs2-developers] patch roundup

2006-08-10 Thread Phil Carns
Murali Vilayannur wrote: Hi Phil, - ncache.patch - sys-rename.patch (I think the last issues here were resolved with the sys-rename.patch?) http://www.beowulf-underground.org/pipermail/pvfs2-developers/2006-June/002212.html

Re: [Pvfs2-developers] patch roundup

2006-08-10 Thread Phil Carns
Robert Latham wrote: On Thu, Aug 10, 2006 at 11:52:32PM +0200, Phil Carns wrote: I am just following up on the status of a few patches that are out on the mailing list to update my notes. I have divided them up based on where I think they stand. Do any of these (aside from the last one

Re: [Pvfs2-developers] bmi questions

2006-08-18 Thread Phil Carns
I have some questions related to the design semantics of BMI. * timeouts. It looks like the timeout for bmi test calls is the max amount of time spent _idling_ in the test call (as opposed to the max time spent in the test call). This is correct. The name of the argument is

Re: [Pvfs2-developers] disabling ncache

2006-08-21 Thread Phil Carns
I doubt that this is the problem, but I should mention that although the main ncache patch was accepted, we are still missing the sys-rename update in trunk (I just noticed Friday). It is the second of the two patches mentioned in this thread:

Re: [Pvfs2-developers] BMI_testunexpected and free

2006-08-22 Thread Phil Carns
Pete Wyckoff wrote: [EMAIL PROTECTED] wrote on Tue, 22 Aug 2006 09:23 -0500: something like the attached patch? does not include the ib and gm changes.. Looks good to me (the cleanups too :)). Can you copy the tcp implementation into ib and gm too, just the free()? Then we won't have a

Re: [Pvfs2-developers] server side memory problems in trunk?

2006-08-23 Thread Phil Carns
the BMI_unexpected_free). I'm just going to move the free back down to server_state_machine_complete. -sam On Aug 23, 2006, at 3:17 PM, Phil Carns wrote: I am having some problems with builds from the latest cvs head snapshot. The server keeps crashing while I try to run an ACL test case, but I think

Re: [Pvfs2-developers] ACL errors using LTP tests

2006-08-23 Thread Phil Carns
I am fairly sure that I was running this with the current CVS head. We originally saw these latest problems in our local tree after a vendor merge, but I switched to the HEAD version from Clemson this afternoon to double check that we didn't introduce anything on our end (that's when I saw

Re: [Pvfs2-developers] ACL errors using LTP tests

2006-08-24 Thread Phil Carns
be kernel bugs fixed in newer kernels.. Can't say for sure. Let me know if you find out anything.. Thanks, Murali On Wed, 23 Aug 2006, Phil Carns wrote: I am fairly sure that I was running this with the current CVS head. We originally saw these latest problems in our local tree after a vendor merge

Re: [Pvfs2-developers] ncache causes shared creat problems

2006-08-28 Thread Phil Carns
Pete Wyckoff wrote: The simul code, test #14, does a shared create: all processes try to do creat(file, 0644) at the same time through the VFS. There is no O_EXCL, so what should happen here is that they all succeed, although under the hood, all but one will probably have to unwind the

Re: [Pvfs2-developers] ncache causes shared creat problems

2006-08-28 Thread Phil Carns
I think so. When one node deletes a file, it does not send out messages to invalidate the cache in all of the other clients, so those still have a cached (no longer valid) entry. If those other clients then lookup the file it will succeed (as if another client had won the race to create it),

[Pvfs2-developers] patches: acl test cleanups

2006-09-05 Thread Phil Carns
These patches clean up the remaining issues from the tacl_xattr.sh test script (this is easy stuff- Murali did all of the hard work). tacl-xattr-homedir.patch: - This makes tacl-xattr.sh slightly more portable. Some Linux distributions have adduser utilities that do

[Pvfs2-developers] patches: bug fixes

2006-09-05 Thread Phil Carns
pread-pwrite.patch: --- This fixes a bug in a patch that I submitted earlier to provide a simple alternate AIO implementation. It defines _GNU_SOURCE in a limited area for dbpf so that we can get proper definitions of pread() and pwrite() on Linux. I tried using

Re: [Pvfs2-developers] MPI-io tests

2006-09-08 Thread Phil Carns
We just use the MPI part for starting up a lot of processes.. Sorry for the incorrect phrasing in my emails. We don't use MPI I/O. Used the posix interface directly. Simultaneous create problems with vfs is possibly due to the request scheduler on server not serializing crdirents of the same

Re: [Pvfs2-developers] patch: binding to specific addresses

2006-09-12 Thread Phil Carns
Thanks! Murali Vilayannur wrote: Hi Phil, I just checked in all your fixes to HEAD... Thanks for the patches! Hopefully, all the xattr/acls stuff works with trunk on most distros now(?) Murali ___ Pvfs2-developers mailing list

Re: [Pvfs2-developers] pvfs2_flush_inode() and directory permissions

2006-09-20 Thread Phil Carns
you give it a spin and see if that fixes this issue? Thanks, Murali On Wed, 20 Sep 2006, Phil Carns wrote: We are seeing a new problem (not sure how long it has been around) with setgid permission bits on directories. This happens with 2.6 kernels: /home/pcarns mkdir /mnt/pvfs2/dir1 /home

Re: [Pvfs2-developers] pvfs2_flush_inode() and directory permissions

2006-09-21 Thread Phil Carns
Murali Vilayannur wrote: Thanks, Phil! BTW: Do all the ACL stuff work for you guys now on different distros? Murali It works on all of the redhat variants that we work with, which was our main concern. The test script doesn't jive on the gentoo box that I use for development, but it looks

Re: [Pvfs2-developers] RFC: Config file overhaul

2006-09-27 Thread Phil Carns
think that we want to try to enforce any rules on what parameters may be changed. This is just a convenient way to reload, right? Rob Phil Carns wrote: I have a few questions about some details of the implementation: - What exactly will a HUP do on the server side? For example, is this pretty

[Pvfs2-developers] duplicate entries in directory listing

2006-10-09 Thread Phil Carns
We are seeing a strange bug where if we list the contents of a directory while files are being created in it, we sometimes get duplicates and/or missing files in the output. I can reproduce it on a single machine by running these two scripts at the same time: tester.sh:

Re: [Pvfs2-developers] duplicate entries in directory listing

2006-10-09 Thread Phil Carns
We've talked about having pvfs2-client pull out duplicates (or the kernel module) in the cases where one of those chooses to break a readdir into multiple operations, but we haven't spent much time investigating where the replication is actually happening in order to accomplish this.

Re: [Pvfs2-developers] duplicate entries in directory listing

2006-10-09 Thread Phil Carns
I started thinking about some more possible ideas, but I realized after looking closer at the code that I don't actually see why duplicates would occur in the first place with the algorithm that is being used :) I apologize if this has been discussed a few times already, but could we walk

Re: [Pvfs2-developers] duplicate entries in directory listing

2006-10-09 Thread Phil Carns
Phil Carns wrote: I started thinking about some more possible ideas, but I realized after looking closer at the code that I don't actually see why duplicates would occur in the first place with the algorithm that is being used :) I apologize if this has been discussed a few times already

[Pvfs2-developers] housekeeping

2006-10-10 Thread Phil Carns
There is a duplicate file in the src/kernel/linux-2.6 directory. Right now both xattr-default.c and xattr_default.c are present. I think xattr_default.c is an imposter. There is also still one relatively simple patch hanging around that hasn't been integrated:

Re: [Pvfs2-developers] threaded client-core and the device thread

2006-10-17 Thread Phil Carns
Just to see if I'm noticing the same issue, what was the exact problem Phil was noticing? Shouldn't multiple requests take longer than a single request? The workload I was using was multiple rpc.nfsd threads issuing 64 KB requests (through the writev/readv interface) to the PVFS2 kernel

[Pvfs2-developers] kernel readdir question

2006-11-08 Thread Phil Carns
It's been a while since I've seen this bug first hand, but I am just now getting around to looking at it. Every once in a while we have seen cases where ls -al in a pvfs2 directory fails to show the . and .. entries. I _think_ this has mainly occurred after restarting pvfs2-client and/or

Re: [Pvfs2-developers] kernel readdir question

2006-11-10 Thread Phil Carns
and/or reopen the directory... I think we're in agreement then. I do agree though that we don't need the version field.. I think I can look at it if that's alright. By all means! :) thanks, Murali -sam thanks, Murali -sam thanks, Murali On 11/8/06, Phil Carns [EMAIL PROTECTED

Re: [Pvfs2-developers] encoding negative responses

2006-11-10 Thread Phil Carns
that indicates what really happened (ie, I didn't actually do the creation, but here is your handle anyway) rather than using a negative error code for that purpose. -Phil Rob Ross wrote: Did we reach any sort of consensus on this idea? Rob Phil Carns wrote: We've run into a couple of scenarios lately

[Pvfs2-developers] TroveMethod configuration parameter

2006-11-28 Thread Phil Carns
I have some questions about the TroveMethod parameter. It looks like it can be specified either in the Defaults section or the StorageHints section. I added some gossip prints to Trove's initialize() and collection_lookup() functions to see what these correspond to: - TroveMethod in

[Pvfs2-developers] threaded library dependency

2006-11-28 Thread Phil Carns
The top level Makefile.in has a dependency in it to make sure that the pvfs2 library is built for the KERNAPPS targets (pvfs2-client and friends): $(KERNAPPS): %: %.o $(LIBRARIES) In pvfs 2.6.0, we now need a similar rule for the threaded version of the KERNAPPS:

[Pvfs2-developers] tuning kernel buffer settings

2006-11-29 Thread Phil Carns
We recently ran some tests that we thought would be interesting to share. We used the following setup: - single client - 16 servers - gigabit ethernet - read/write tests, with 40 GB files - using reads and writes of 100 MB each in size - varying number of processes running concurrently on the

[Pvfs2-developers] TroveSyncData settings

2006-11-29 Thread Phil Carns
We recently ran some tests trying different sync settings in PVFS2. We ran into one pleasant surprise, although probably it is already obvious to others. Here is the setup: 12 clients 4 servers read/write test application, 100 MB operations, large files fibre channel SAN storage The test

[Pvfs2-developers] read buffer bug

2006-11-29 Thread Phil Carns
I ran into a problem today with the 2.6.0 release. This happened to show up in the read04 LTP test, but not reliably. I have attached a test program that I think does trigger it reliably, though. When run on ext3: /home/pcarns ./testme /tmp/foo.txt read returned: 7,

Re: [Pvfs2-developers] TroveSyncData settings

2006-11-29 Thread Phil Carns
, so the trove worker thread doesn't get stuck waiting on the sync... -Phil Rob Ross wrote: This is similar to using O_DIRECT, which has also shown benefits. With alt aio, do we sync in the context of the I/O thread? Thanks, Rob Phil Carns wrote: One thing that we noticed while testing

[Pvfs2-developers] server crash on startup with millions of files

2007-02-20 Thread Phil Carns
Hi guys, We have run into a problem recently with a configuration that looks like this: - x86_64 architecture - 16 servers - SAN based storage - approximately 1.4 million files on PVFS Everything works fine, except when we stop and then later restart one of the pvfs2-server daemons. At

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-20 Thread Phil Carns
with just upgrading to a newer version. I didn't realize that those error code changes might have an impact here. -Phil Sam Lang wrote: On Feb 20, 2007, at 6:29 AM, Phil Carns wrote: Hi guys, We have run into a problem recently with a configuration that looks like this: - x86_64

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-22 Thread Phil Carns
It ended up taking a little work to get another environment to trigger this reliably, but I think I have something now. I modified the iterate_handles() function a bit so that it keeps scanning over and over again indefinitely rather than letting the server start up. This forces the code

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-23 Thread Phil Carns
is safe to commit if it looks ok on your end... -Phil Phil Carns wrote: It ended up taking a little work to get another environment to trigger this reliably, but I think I have something now. I modified the iterate_handles() function a bit so that it keeps scanning over and over again

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-23 Thread Phil Carns
was already scratched to avoid the potential costs at creation time, especially as the filesystem grows. -sam On Feb 20, 2007, at 11:23 AM, Phil Carns wrote: Robert Latham wrote: On Tue, Feb 20, 2007 at 07:29:16AM -0500, Phil Carns wrote: Oh, and one other detail; the memory usage

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-23 Thread Phil Carns
I get an error about DB_BUFFER_SMALL being undefined in this patch. Should it have the same #ifdefs wrapped around it as are currently in dbpf-keyval.c? It is using ENOMEM if DB_BUFFER_SMALL isn't defined. -Phil Phil Carns wrote: Thanks Sam! We will give these patches a try and report

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-23 Thread Phil Carns
. -Phil Phil Carns wrote: Ok, I have tried several iterations both with and without these patches. The test system is again using a SAN, this time with a dataspace_attributes.db file of about 451 MB on a particular server. I'm not sure how many files are on the file system; I just cranked out

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-23 Thread Phil Carns
Yeah that is odd. Setting the cursor for each call to iterate_handles may be the reason for it starting over. Do you know how many times it starts over? The number of times iterate_handles is called will be (# of files / 4096). It only goes through the file twice if I am looking at the
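The call count mentioned in this message is simple arithmetic, sketched here. The batch size of 4096 comes from the message itself and the figure of roughly 1.4 million files from the message that opened this thread; rounding up for a final partial batch is an assumption of this sketch, and batches_needed is a hypothetical helper name, not a PVFS2 function.

```c
#include <stdint.h>

/* Number of fixed-size batches needed to cover `total` items,
 * counting a final partial batch as one more call (assumption). */
uint64_t batches_needed(uint64_t total, uint64_t per_call)
{
    return (total + per_call - 1) / per_call;
}
```

With about 1,400,000 files and 4096 handles per call, batches_needed(1400000, 4096) evaluates to 342, so a single pass over the handles would take on the order of a few hundred calls.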

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-02-23 Thread Phil Carns
Phil Carns wrote: Yeah that is odd. Setting the cursor for each call to iterate_handles may be the reason for it starting over. Do you know how many times it starts over? The number of times iterate_handles is called will be (# of files / 4096). It only goes through the file twice

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-03-07 Thread Phil Carns
Can we conclude this discussion? In summary: * The current comparison function causes bad IO patterns for iterate on the dspace db. We can change it but the disk format will change in new releases. - If we change it, either we check a version number and provide the right

Re: [Pvfs2-developers] server crash on startup with millions of files

2007-03-15 Thread Phil Carns
performance (the doc says it can really slow things down). Let me know how these changes look, and if someone gets a chance to look at performance differences, that would be great. Thanks, -sam On Mar 7, 2007, at 2:39 PM, Phil Carns wrote: Can we conclude this discussion? In summary

[Pvfs2-developers] patches: permission/acl bug fixes

2007-03-20 Thread Phil Carns
acl-check-assert.patch: This is a bug fix to the server side acl handling. It replaces an assertion with normal error handling to prevent a server from crashing if it encounters invalid acl information. check-group.patch: -- This follows up on some

[Pvfs2-developers] patches: mount bug fixes

2007-03-20 Thread Phil Carns
initialize-dyn.patch: - This is a correction to the initialize-dyn test program. It previously hardcoded the number of mounted file systems and would crash if a different number were mounted. mount-mem-leaks.patch: -- This patch corrects multiple

[Pvfs2-developers] patch: error code bug fixes

2007-03-20 Thread Phil Carns
This patch corrects a variety of error code problems: - several BMI error codes were not tagged with the BMI error class, which is important to allow client state machines to retry on network errors - ditto above for a few flow errors - ECONNRESET was not understood by BMI or included in the
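
The idea of tagging an error code with an error class so that a client can decide whether to retry might be sketched like this. The bit values and all names (ERR_CLASS_BMI, bmi_error, should_retry, error_code) are assumptions for illustration and are not the PVFS2 definitions.

```c
#include <errno.h>

/* Hypothetical class bits, kept well above the range of errno values. */
#define ERR_CLASS_BMI  (1 << 20)  /* network layer */
#define ERR_CLASS_FLOW (1 << 21)  /* flow layer */
#define ERR_CLASS_MASK (ERR_CLASS_BMI | ERR_CLASS_FLOW)

/* Tag a plain errno-style code as coming from the network layer. */
int bmi_error(int code)
{
    return code | ERR_CLASS_BMI;
}

/* Strip the class bits to recover the underlying code. */
int error_code(int tagged)
{
    return tagged & ~ERR_CLASS_MASK;
}

/* A client retries only errors that carry the network class bit. */
int should_retry(int tagged)
{
    return (tagged & ERR_CLASS_BMI) != 0;
}
```

In this sketch an untagged ECONNRESET is not retried while bmi_error(ECONNRESET) is, which shows why a missing class tag can change client behavior.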

[Pvfs2-developers] patch: namei bug fixes

2007-03-20 Thread Phil Carns
I am sending this patch in a separate email because it may need some discussion to hash out. Sometime in the past several months, the pvfs2_lookup() function in namei.c changed (I think along with something not directly related, but I don't recall exactly what happened now). This change

[Pvfs2-developers] patches: misc. bug fixes

2007-03-20 Thread Phil Carns
auto-sm-tracking.patch: At some point, new linked lists were added to track state machines that are currently running within the server. When an SM completes, it is implicitly removed from the list. However, SMs that were started without a request (i.e., internal state

Re: [Pvfs2-developers] DB_BUFFER_SMALL usage dbpf-dspace.c

2007-03-20 Thread Phil Carns
Whoops, one other thing to report; apparently not all db libraries have the get_pagesize() function either. I happen to be trying this on a box with version 4.1.25 of db. -Phil Phil Carns wrote: I think we talked about this some when batting around patches for the db startup issues, but I

[Pvfs2-developers] deprecated acache-torture test program

2007-03-23 Thread Phil Carns
I just noticed that there is a test program in the source tree called acache-torture.c that should probably be removed. It is in the test/client/sysint directory. I think that a patch I submitted a long time ago actually removed this source file, but that part may have been overlooked. At

Re: [Pvfs2-developers] patch: namei bug fixes

2007-04-02 Thread Phil Carns
to work on 2.4 and 2.6 (as yours does). On Mar 27, 2007, at 4:33 PM, Phil Carns wrote: Just to clarify a little bit, there is actually a problem with the 2.6 code path here as well. I went back and ran some tests with and without the namei.patch on a RHEL4 box (2.6.9-something

Re: [Pvfs2-developers] patch: namei bug fixes

2007-04-02 Thread Phil Carns
It looks like if a non-null dentry is returned from lookup, dput is called on that dentry, which decrements the usage count. If null is returned dput isn't called. Could it be that we're actually leaking entries in the dcache with these patches? -sam Maybe? I'm not sure what's

Re: [Pvfs2-developers] patch: namei bug fixes

2007-04-04 Thread Phil Carns
Hi Phil, Good idea to look at the other file systems. My (admittedly limited) understanding of disconnected dentries is based on the Documentation/filesystems/Exporting doc, which may be a bit outdated. It suggests that lookup should return whatever d_splice_alias returns (assuming

Re: [Pvfs2-developers] patches: misc. bug fixes

2007-04-05 Thread Phil Carns
Murali Vilayannur wrote: Hi Phil, FWIW, if these patches haven't been committed, it looks good :) I am really backlogged with all my emails. auto-sm-tracking.patch: At some point, new linked lists were added to track state machines that are currently running within the

[Pvfs2-developers] patches: misc small fixes

2007-04-05 Thread Phil Carns
These are all pretty minor: mod-parm-desc.patch: The pvfs2 kernel module is missing the parameter description for the op_timeout_secs option that can be set at insmod time. server-gossip-errno.patch: The pvfs2-server is trying to use errno to

[Pvfs2-developers] Re: dspace iterate and MULTIPLE_KEY

2007-05-08 Thread Phil Carns
I had not seen this, but it looks like you have it sorted out already now. How did this bug manifest itself in terms of what the user sees? Does this cause an error when the servers start, or does something pop up later when you create new files? -Phil Sam Lang wrote: Hi Phil, With the

[Pvfs2-developers] setting xattrs on pvfs2 root

2007-05-08 Thread Phil Carns
It looks like pvfs2 does not allow you to set xattrs on the root directory. Is this expected? # checking the mount options: mount -t pvfs2 tcp://localhost:3334/pvfs2-fs on /mnt/pvfs2 type pvfs2 (rw,user_xattr) # confirming that xattrs can be set on a normal directory: setfattr -n
