Re: [autofs] Segmentation fault in read_one_map() - ldap_msgfree()
On Tue, 2011-08-30 at 14:27 -0400, Leonardo Chiquitto wrote:
> Hello,
>
> A customer reported a segmentation fault in automount (one occurrence
> so far, but we have a core dump). More information about the exact
> autofs version and included patches below.

Thanks for reporting this and for spending the time to try and work out
what's going on. I'll have a look see too.

> Call trace:
>
> Thread 7 (Thread 4254):
> #0  do_sigwait (set=0x7fff9eb877c0, sig=0x7fff9eb878dc) at ../sysdeps/unix/sysv/linux/sigwait.c:65
> #1  __sigwait (set=0x7fff9eb877c0, sig=0x7fff9eb878dc) at ../sysdeps/unix/sysv/linux/sigwait.c:100
> #2  statemachine (argc=<value optimized out>, argv=<value optimized out>) at automount.c:1327
> #3  main (argc=<value optimized out>, argv=<value optimized out>) at automount.c:2142
>
> Thread 6 (Thread 4255):
> #0  pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1  alarm_handler (arg=<value optimized out>) at alarm.c:206
> #2  start_thread (arg=<value optimized out>) at pthread_create.c:306
> #3  clone () from /lib64/libc.so.6
>
> Thread 5 (Thread 4256):
> #0  pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
> #1  st_queue_handler (arg=<value optimized out>) at state.c:1103
> #2  start_thread (arg=<value optimized out>) at pthread_create.c:306
> #3  clone () from /lib64/libc.so.6
>
> Thread 4 (Thread 4259):
> #0  __poll (fds=0x40822150, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
> #1  get_pkt (arg=<value optimized out>) at automount.c:882
> #2  handle_packet (arg=<value optimized out>) at automount.c:1019
> #3  handle_mounts (arg=<value optimized out>) at automount.c:1551
> #4  start_thread (arg=<value optimized out>) at pthread_create.c:306
> #5  clone () from /lib64/libc.so.6
>
> Thread 3 (Thread 4262):
> #0  __poll (fds=0x41023150, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
> #1  get_pkt (arg=<value optimized out>) at automount.c:882
> #2  handle_packet (arg=<value optimized out>) at automount.c:1019
> #3  handle_mounts (arg=<value optimized out>) at automount.c:1551
> #4  start_thread (arg=<value optimized out>) at pthread_create.c:306
> #5  clone () from /lib64/libc.so.6
>
> Thread 2 (Thread 4263):
> #0  __poll (fds=0x41824150, nfds=3, timeout=-1) at ../sysdeps/unix/sysv/linux/poll.c:87
> #1  get_pkt (arg=<value optimized out>) at automount.c:882
> #2  handle_packet (arg=<value optimized out>) at automount.c:1019
> #3  handle_mounts (arg=<value optimized out>) at automount.c:1551
> #4  start_thread (arg=<value optimized out>) at pthread_create.c:306
> #5  clone () from /lib64/libc.so.6
>
> Thread 1 (Thread 18996):
> #0  __libc_free (mem=0x556a0498) at malloc.c:3461
> #1  ber_free_buf (ber=0x556a0460) at io.c:177
> #2  ber_free (ber=0x556a0460, freebuf=0) at io.c:196
> #3  ldap_msgfree (lm=0x55e4ddc0) at result.c:1151
> #4  read_one_map (ap=0x5569c120, age=1311189199, context=0x556a0040) at lookup_ldap.c:2341
> #5  lookup_read_map (ap=0x5569c120, age=1311189199, context=0x556a0040) at lookup_ldap.c:2361
> #6  do_read_map (ap=0x5569c120, map=0x5568d1c0, age=1311189199) at lookup.c:296
> #7  read_source_instance (this=<value optimized out>, ap=0x5569c120, map=0x5569c220, age=1311189199) at lookup.c:374
> #8  read_map_source (this=<value optimized out>, ap=0x5569c120, map=0x5569c220, age=1311189199) at lookup.c:393
> #9  lookup_nss_read_map (ap=0x5569c120, source=<value optimized out>, age=1311189199) at lookup.c:534
> #10 do_readmap (arg=<value optimized out>) at state.c:463
> #11 start_thread (arg=<value optimized out>) at pthread_create.c:306
> #12 clone () from /lib64/libc.so.6
>
> The crash happens because we're passing an invalid (already freed?)
> LDAPMessage structure to ldap_msgfree() in read_one_map():
>
> static int read_one_map(struct autofs_point *ap,
> 			struct lookup_context *ctxt,
> 			time_t age, int *result_ldap)
> {
> (...)
> 		rv = do_get_entries(&sp, source, ctxt);
> 		if (rv != LDAP_SUCCESS) {
> 			ldap_msgfree(sp.result);
> 			unbind_ldap_connection(ap->logopt, sp.ldap, ctxt);
> 			*result_ldap = rv;
> 			free(sp.query);
> 			return NSS_STATUS_NOTFOUND;
> 		}
> 		ldap_msgfree(sp.result);	<===
> 	} while (sp.morePages == TRUE);
>
> (gdb) frame 4
> #4  0x2aab8c74 in read_one_map (ap=0x5569c120, age=1311189199, context=0x556a0040) at lookup_ldap.c:2341
> 2341		ldap_msgfree(sp.result);
> (gdb) print sp
> $1 = {ap = 0x5569c120, ldap = 0x55769c10,
>   query = 0x5569ef00 "(objectclass=automount)", attrs = 0x41a25c80,
>   cookie = 0x0, pageSize = 2000, morePages = 0, totalCount = 0,
>   result = 0x556a8e40, age = 1311189199}
>
> I'm staring at the code for some time now, and still can't figure out
> how that could happen. Is it somehow possible that do_get_entries()
Re: [autofs] Segmentation fault in read_one_map() - ldap_msgfree()
On Wed, 2011-08-31 at 09:16 +0800, Ian Kent wrote:
> On Tue, 2011-08-30 at 14:27 -0400, Leonardo Chiquitto wrote:
> > Hello,
> >
> > A customer reported a segmentation fault in automount (one occurrence
> > so far, but we have a core dump). More information about the exact
> > autofs version and included patches below.
>
> Thanks for reporting this and for spending the time to try and work out
> what's going on. I'll have a look see too.

btw, you may find this patch useful too, although you may need to do a
bit of work for it to apply, not sure (since the line numbers you give
don't quite match mine).

autofs-5.0.6 - fix paged query more results check

From: Ian Kent <ra...@themaw.net>

When getting paged results from an LDAP server the server returns an
opaque cookie (of type berval) that is used to retrieve the next page.
The criteria for deciding if there are more pages is that the berval
value is non-null and has a non-zero length.

To determine if the berval value has non-zero length autofs checks the
strlen() of the value but on ppc64 and s390x this can return 0 even if
the value has non-zero length causing a premature termination of the
query.

Fix this by also checking the berval length field. Also make sure we
free the opaque cookie when the query is finished.
---
 CHANGELOG             |    1 +
 modules/lookup_ldap.c |   13 ++++++++++++-
 2 files changed, 13 insertions(+), 1 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index a178b74..884a9ae 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -2,6 +2,7 @@
 ===
 - fix ipv6 name for lookup fix.
 - improve mount location error reporting.
+- fix paged query more results check.
 28/06/2011 autofs-5.0.6
 ---
diff --git a/modules/lookup_ldap.c b/modules/lookup_ldap.c
index 719fed1..a25050a 100644
--- a/modules/lookup_ldap.c
+++ b/modules/lookup_ldap.c
@@ -2041,7 +2041,8 @@ do_paged:
 	rv = ldap_parse_page_control(sp->ldap,
 				     returnedControls, &sp->totalCount,
 				     &sp->cookie);
-	if (sp->cookie && sp->cookie->bv_val && strlen(sp->cookie->bv_val))
+	if (sp->cookie && sp->cookie->bv_val &&
+	    (strlen(sp->cookie->bv_val) || sp->cookie->bv_len))
 		sp->morePages = TRUE;
 	else
 		sp->morePages = FALSE;
@@ -2382,6 +2383,10 @@ static int read_one_map(struct autofs_point *ap,
 		    rv == LDAP_SIZELIMIT_EXCEEDED) {
 			if (sp.result)
 				ldap_msgfree(sp.result);
+			if (sp.cookie) {
+				ber_bvfree(sp.cookie);
+				sp.cookie = NULL;
+			}
 			sp.pageSize = sp.pageSize / 2;
 			if (sp.pageSize < 5) {
 				debug(ap->logopt, MODPREFIX
@@ -2397,6 +2402,8 @@ static int read_one_map(struct autofs_point *ap,
 		if (rv != LDAP_SUCCESS || !sp.result) {
 			unbind_ldap_connection(ap->logopt, sp.ldap, ctxt);
 			*result_ldap = rv;
+			if (sp.cookie)
+				ber_bvfree(sp.cookie);
 			free(sp.query);
 			return NSS_STATUS_UNAVAIL;
 		}
@@ -2406,6 +2413,8 @@ static int read_one_map(struct autofs_point *ap,
 			ldap_msgfree(sp.result);
 			unbind_ldap_connection(ap->logopt, sp.ldap, ctxt);
 			*result_ldap = rv;
+			if (sp.cookie)
+				ber_bvfree(sp.cookie);
 			free(sp.query);
 			return NSS_STATUS_NOTFOUND;
 		}
@@ -2417,6 +2426,8 @@ static int read_one_map(struct autofs_point *ap,
 	unbind_ldap_connection(ap->logopt, sp.ldap, ctxt);
 	source->age = age;
+	if (sp.cookie)
+		ber_bvfree(sp.cookie);
 	free(sp.query);
 	return NSS_STATUS_SUCCESS;

___
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
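For readers following the paged-query fix, here is a minimal sketch of why a strlen() test on an opaque cookie can end paging early. It assumes a stand-in type, struct cookie_berval, in place of OpenLDAP's struct berval so it builds without the LDAP headers; the function names are hypothetical and only mirror the two conditions from the patch. The demo cookie (four bytes, the first of them zero) is an illustration, not data from the reported crash.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Stand-in for OpenLDAP's struct berval (length + value pointer).
 * Assumption: defined locally so the sketch needs no LDAP headers. */
struct cookie_berval {
	size_t bv_len;
	char *bv_val;
};

/* Old test: treats the opaque cookie as a C string. */
int more_pages_strlen(const struct cookie_berval *cookie)
{
	return cookie && cookie->bv_val && strlen(cookie->bv_val);
}

/* Patched test: also consults the length field. */
int more_pages_patched(const struct cookie_berval *cookie)
{
	return cookie && cookie->bv_val &&
	       (strlen(cookie->bv_val) || cookie->bv_len);
}

/* Demo: a 4-byte cookie whose first byte happens to be zero. */
void cookie_check_demo(void)
{
	/* The trailing NUL only keeps strlen() inside the buffer. */
	char raw[] = { 0x00, 0x00, 0x00, 0x2a, 0x00 };
	struct cookie_berval cookie = { 4, raw };

	assert(more_pages_strlen(&cookie) == 0);  /* paging would stop here */
	assert(more_pages_patched(&cookie) == 1); /* length says: more pages */
}
```

Calling cookie_check_demo() exercises both tests on the same cookie. Binary data is not guaranteed to be NUL-terminated, so a length-only check may be the safer form, but that is a design opinion rather than something the patch does.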
Re: [autofs] Segmentation fault in read_one_map() - ldap_msgfree()
On Tue, 2011-08-30 at 14:27 -0400, Leonardo Chiquitto wrote:
> Hello,
>
> A customer reported a segmentation fault in automount (one occurrence
> so far, but we have a core dump). More information about the exact
> autofs version and included patches below.

snip ...

> I'd also like to suggest the patch below. I believe it can potentially
> avoid the crash.
>
> Thanks,
> Leonardo
>
> AutoFS version is 5.0.5 plus all upstream patches up to and including
> autofs-5.0.5-fix-submount-shutdown-wait.patch plus these assorted bug
> fixes:
>
> autofs-5.0.5-auto-adjust-ldap-page-size.patch
> autofs-5.0.5-replace-gplv3-code.patch
> autofs-5.0.5-fix-paged-ldap-map-read.patch
> autofs-5.0.5-fix-next-task-list-update.patch
> autofs-5.0.5-fix-stale-map-read.patch
> autofs-5.0.5-fix-out-of-order-locking-in-readmap.patch
> autofs-5.0.5-remove-master_mutex_unlock-leftover.patch
> autofs-5.0.5-fix-null-cache-deadlock.patch
>
> Index: autofs/modules/lookup_ldap.c
> ===
> --- autofs.orig/modules/lookup_ldap.c
> +++ autofs/modules/lookup_ldap.c
> @@ -361,6 +361,7 @@ static int get_query_dn(unsigned logopt,
>  			MODPREFIX "query succeeded, no matches for %s",
>  			query);
>  		ldap_msgfree(result);
> +		result = NULL;
>  		free(query);
>  		return 0;

There's a check on result just above this so result can't (or shouldn't
be able to) be NULL here, it isn't included within a loop (that can
potentially change result) and the function returns straight after so
this isn't needed.

>  	}
> @@ -1583,6 +1584,7 @@ int lookup_read_master(struct master *ma
>  			MODPREFIX "query succeeded, no matches for %s",
>  			query);
>  		ldap_msgfree(result);
> +		result = NULL;
>  		unbind_ldap_connection(logging, ldap, ctxt);
>  		free(query);
>  		return NSS_STATUS_NOTFOUND;

Same here.

> @@ -2380,8 +2382,8 @@ static int read_one_map(struct autofs_po
>  		if (rv == LDAP_ADMINLIMIT_EXCEEDED ||
>  		    rv == LDAP_SIZELIMIT_EXCEEDED) {
> -			if (sp.result)
> -				ldap_msgfree(sp.result);
> +			ldap_msgfree(sp.result);
> +			sp.result = NULL;
>  			sp.pageSize = sp.pageSize / 2;
>  			if (sp.pageSize < 5) {
>  				debug(ap->logopt, MODPREFIX
> @@ -2404,12 +2406,14 @@ static int read_one_map(struct autofs_po
>  		rv = do_get_entries(&sp, source, ctxt);
>  		if (rv != LDAP_SUCCESS) {
>  			ldap_msgfree(sp.result);
> +			sp.result = NULL;

This one isn't needed because sp.result has been checked for NULL in the
if statement above and the next thing we do is return.

>  			unbind_ldap_connection(ap->logopt, sp.ldap, ctxt);
>  			*result_ldap = rv;
>  			free(sp.query);
>  			return NSS_STATUS_NOTFOUND;
>  		}
>  		ldap_msgfree(sp.result);
> +		sp.result = NULL;
>  	} while (sp.morePages == TRUE);

But this looks suspect.

My not setting sp.result to NULL after freeing it within the loop (when
continuing) is dangerous and needs to be fixed. This is probably due to
not really knowing what the state of the fields of the sp structure (at
least sp.result and sp.cookie) are when we get LDAP_ADMINLIMIT_EXCEEDED
or LDAP_SIZELIMIT_EXCEEDED as a return.

>  	debug(ap->logopt, MODPREFIX "done updating map");
> @@ -2582,6 +2586,7 @@ static int lookup_one(struct autofs_poin
>  		debug(ap->logopt,
>  		      MODPREFIX "got answer, but no entry for %s", query);
>  		ldap_msgfree(result);
> +		result = NULL;
>  		unbind_ldap_connection(ap->logopt, ldap, ctxt);
>  		free(query);
>  		return CHE_MISSING;
> @@ -2768,6 +2773,7 @@ next:
>  	}
>  	ldap_msgfree(result);
> +	result = NULL;
>  	unbind_ldap_connection(ap->logopt, ldap, ctxt);

And this is the same as the case in get_query_dn() and
lookup_read_master() too.

>  	/* Failed to find wild entry, update cache if needed */

So I think this patch might help, can you get it tested?
It assumes the other patch I posted has already been applied.

autofs-5.0.6 - fix result null check in read_one_map()

From: Ian Kent <ik...@redhat.com>

---
 modules/lookup_ldap.c |    7 ++++++-
 1 files changed, 6 insertions(+), 1 deletions(-)

diff --git a/modules/lookup_ldap.c b/modules/lookup_ldap.c
index 6f8b466..9f2d4f3 100644
--- a/modules/lookup_ldap.c
+++ b/modules/lookup_ldap.c
@@ -2380,8 +2380,10 @@ static int read_one_map(struct autofs_point *ap,
 		if (rv == LDAP_ADMINLIMIT_EXCEEDED ||
 		    rv == LDAP_SIZELIMIT_EXCEEDED) {
-			if (sp.result
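The hazard discussed above (a result freed inside the paging loop and then freed again on a later pass or on the error path) can be sketched in isolation. Everything in this example is hypothetical: struct page_state, fetch_page() and release_page() are illustration names, and malloc()/free() stand in for the LDAP result allocation and ldap_msgfree(). It is a sketch of the "reset the pointer after freeing" idiom, not the autofs code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the paged search state. */
struct page_state {
	void *result;    /* current page, owned by the loop */
	int pages_left;  /* pages the fake server still has */
	int frees;       /* counts real releases, for the demo only */
};

/* Hypothetical fetch: on failure it leaves sp->result untouched,
 * which is exactly the case that makes a stale pointer dangerous. */
int fetch_page(struct page_state *sp, int fail_on_page)
{
	if (sp->pages_left == fail_on_page)
		return -1;
	sp->result = malloc(16);
	if (!sp->result)
		return -1;
	sp->pages_left--;
	return 0;
}

/* Free the page and reset the pointer so a second call is a no-op. */
void release_page(struct page_state *sp)
{
	if (sp->result) {
		free(sp->result);
		sp->result = NULL;
		sp->frees++;
	}
}

int read_all_pages(struct page_state *sp, int fail_on_page)
{
	assert(sp != NULL);
	do {
		if (fetch_page(sp, fail_on_page) != 0) {
			/* Safe even though the previous page was already freed. */
			release_page(sp);
			return -1;
		}
		release_page(sp);
	} while (sp->pages_left > 0);
	return 0;
}
```

If release_page() did not reset sp->result, the error path in read_all_pages() would free the previous page a second time whenever a fetch failed after the first iteration.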
Re: [autofs] autofs4 past 2.6.38: how to make it work?
On Wed, 2011-08-17 at 10:27 +0400, Michael Tokarev wrote:
> In 3.0, only two relevant commits are missing as far as I can see:
> this negative autofs dentries fix, and LOOKUP_CONTINUE change.
> The rest - again, ifaics - are merely cosmetics.
>
> I'm not sure about this LOOKUP_CONTINUE change, how relevant it is,
> but the two combined does not look bad for -stable. I think... :)

Not sure what the LOOKUP_CONTINUE change is, what does it look like or
do you have a git commit number?

There is actually one more change that needs to go in, although you
might not see the problem. It is a fix for a lockdep problem.

Ian
Re: [autofs] autofs4 past 2.6.38: how to make it work?
On Wed, 2011-08-17 at 12:42 +0400, Michael Tokarev wrote:
> 17.08.2011 12:28, Ian Kent wrote:
> > On Wed, 2011-08-17 at 10:27 +0400, Michael Tokarev wrote:
> > > In 3.0, only two relevant commits are missing as far as I can see:
> > > this negative autofs dentries fix, and LOOKUP_CONTINUE change.
> > > The rest - again, ifaics - are merely cosmetics.
> > >
> > > I'm not sure about this LOOKUP_CONTINUE change, how relevant it is,
> > > but the two combined does not look bad for -stable. I think... :)
> >
> > Not sure what the LOOKUP_CONTINUE change is, what does it look like
> > or do you have a git commit number?
>
> commit 49084c3bb2055c401f3493c13edae14d49128ca0
> Author: Al Viro <v...@zeniv.linux.org.uk>
> Date:   Sat Jun 25 21:59:52 2011 -0400
> Subject: kill LOOKUP_CONTINUE

OK, that one shouldn't make any difference to autofs.

Ian
Re: [autofs] autofs4 past 2.6.38: how to make it work?
On Tue, 2011-08-16 at 00:02 +0400, Michael Tokarev wrote:
> After searching a bit more, especially after realizing it's not
> 2.6.37+ but 2.6.38+ (so I corrected $subject), I found a few
> references to this, especially
>
> https://bugzilla.redhat.com/show_bug.cgi?id=719607
>
> and a few more (most of which are without answers).
>
> I tested the patch proposed in RH#719607 (attachment #512209) - it
> restores functionality of automounter, at least as far as I can see.
>
> I think it should go to -stable, too... ;)

That patch (actually a different equivalent patch) is included in
3.1-rc.

There were major changes to the kernel automount code in 2.6.38 and
there have been a number of corrections along the way, with this patch
being the last one for known problems.

I recommend staying with 2.6.37 until 3.1 is available.

> Thanks,
>
> /mjt
>
> 15.08.2011 23:38, Michael Tokarev wrote:
> > Hello.
> >
> > I found out that since 2.6.37 kernel, my automount (neither debian
> > version 5.0.4-3.2 nor the latest 5.0.6) does not work anymore.
> >
> > Automounter starts correctly, mounts - say - /misc, and stays there
> > listening for events. But the problem is that no events get
> > delivered to it no matter how I try. Accesses to /misc/foo always
> > return immediately with ENOENT error, and automount (all threads of
> > it) is sleeping without any activity whatsoever.
> >
> > I tried to bisect this issue, but faced some problems -- most
> > kernels around the bad commit OOPS while accessing /misc/foo.
> > Here's the bisect log:
> >
> > # good: [22763c5cf3690a681551162c15d34d935308c8d7] Linux 2.6.32
> > git bisect good 22763c5cf3690a681551162c15d34d935308c8d7
> > # bad: [521cb40b0c44418a4fd36dc633f575813d59a43d] Linux 2.6.38
> > git bisect bad 521cb40b0c44418a4fd36dc633f575813d59a43d
> > # skip: [b7ab39f631f505edc2bbdb86620d5493f995c9da] fs: dcache scale dentry refcount
> > git bisect skip b7ab39f631f505edc2bbdb86620d5493f995c9da
> > # good: [cb4b492ac7595aad10756fe0b04691f0965e0cfc] autofs4: rename dentry to expiring in autofs4_lookup_expiring()
> > git bisect good cb4b492ac7595aad10756fe0b04691f0965e0cfc
> > # good: [5f57cbcc02cf18f6b22ef4066bb10afeb8f930ff] fs: dcache remove d_mounted
> > git bisect good 5f57cbcc02cf18f6b22ef4066bb10afeb8f930ff
> > # skip: [ab90911ff90cdab59b31c045c3f0ae480d14f29d] Allow d_manage() to be used in RCU-walk mode
> > git bisect skip ab90911ff90cdab59b31c045c3f0ae480d14f29d
> > # skip: [cc53ce53c86924bfe98a12ea20b7465038a08792] Add a dentry op to allow processes to be held during pathwalk transit
> > git bisect skip cc53ce53c86924bfe98a12ea20b7465038a08792
> > # skip: [6651149371b842715906311b4631b8489cebf7e8] autofs4: Clean up autofs4_free_ino()
> > git bisect skip 6651149371b842715906311b4631b8489cebf7e8
> > # bad: [726a5e0688fd344110d8f2979d87f243a4ba1a48] autofs4: autofs4_get_inode() doesn't need autofs_info * argument anymore
> > git bisect bad 726a5e0688fd344110d8f2979d87f243a4ba1a48
> > # skip: [9e3fea16ba386fa549a0b2de8a203e5d412997a0] autofs4: Fix wait validation
> > git bisect skip 9e3fea16ba386fa549a0b2de8a203e5d412997a0
> > # bad: [14a2f00bde7668fe18d1c8355d26c7c96961e1f7] autofs4: autofs4_mkroot() is not different from autofs4_init_ino()
> > git bisect bad 14a2f00bde7668fe18d1c8355d26c7c96961e1f7
> > # skip: [71e469db242c2eeb00faf9caf7d9e00150c00a6e] autofs4: Clean up dentry operations
> > git bisect skip 71e469db242c2eeb00faf9caf7d9e00150c00a6e
> > # bad: [c14cc63a63e94d490ac6517a555113c30d420db4] autofs4 - fix get_next_positive_dentry()
> > git bisect bad c14cc63a63e94d490ac6517a555113c30d420db4
> > # skip: [e61da20a50d21725ff27571a6dff9468e4fb7146] autofs4: Clean up inode operations
> > git bisect skip e61da20a50d21725ff27571a6dff9468e4fb7146
> > # skip: [dd89f90d2deb9aa5bc8e1b15d726ff5c0bb2b623] autofs4: Add v4 pseudo direct mount support
> > git bisect skip dd89f90d2deb9aa5bc8e1b15d726ff5c0bb2b623
> > # skip: [8c13a676d5a56495c350f3141824a5ef6c6b4606] autofs4: Remove unused code
> > git bisect skip 8c13a676d5a56495c350f3141824a5ef6c6b4606
> > # bad: [b650c858c26bd9ba29ebc82d30f09355845a294a] autofs4: Merge the remaining dentry ops tables
> > git bisect bad b650c858c26bd9ba29ebc82d30f09355845a294a
> > # skip: [b5b801779d59165c4ecf1009009109545bd1f642] autofs4: Add d_manage() dentry operation
> > git bisect skip b5b801779d59165c4ecf1009009109545bd1f642
> > # skip: [b5b801779d59165c4ecf1009009109545bd1f642] autofs4: Add d_manage() dentry operation
> > git bisect skip b5b801779d59165c4ecf1009009109545bd1f642
> > # skip: [b5b801779d59165c4ecf1009009109545bd1f642] autofs4: Add d_manage() dentry operation
> > git bisect skip b5b801779d59165c4ecf1009009109545bd1f642
> >
> > Kernel b5b801779d59165c4ecf1009009109545bd1f642 shows this OOPS:
> >
> > [ 11.700752] [ cut here ]
> > [ 11.703371] kernel BUG at fs/dcache.c:1304!
> > [ 11.703371] invalid opcode: [#1] SMP
> > [ 11.703371] last sysfs file: /sys/devices/virtual/vc/vcsa6/dev
> > [ 11.703371] CPU 0
> > [ 11.703371] Modules linked in: ext4
Re: [autofs] Autofs dump map option
On Fri, 2011-08-05 at 11:11 +0200, Ondrej Valousek wrote:
> Hi Ian,
>
> Thanks for looking after the bugs #538408 and #704416 for me, but I
> have one slight problem with it:
>
> Automounter won't show any keys in my indirect NIS maps unless I
> specify BROWSE_MODE=yes. Maybe it is expected behaviour so I was
> thinking - can we always force browse_mode when dumping maps?
>
> Sorry for spotting this too late.

Ha, I'll have a look and see what we need to do.

I'm on leave between the 8th and 13th and will have limited internet
connectivity so I likely won't be able to do much until I get back.

> Thanks,
> Ondrej
Re: [autofs] Autofs dump map option
On Sat, 2011-08-06 at 16:54 +0800, Ian Kent wrote:
> On Fri, 2011-08-05 at 11:11 +0200, Ondrej Valousek wrote:
> > Hi Ian,
> >
> > Thanks for looking after the bugs #538408 and #704416 for me, but I
> > have one slight problem with it:
> >
> > Automounter won't show any keys in my indirect NIS maps unless I
> > specify BROWSE_MODE=yes. Maybe it is expected behaviour so I was
> > thinking - can we always force browse_mode when dumping maps?
> >
> > Sorry for spotting this too late.
>
> Ha, I'll have a look and see what we need to do.
>
> I'm on leave between the 8th and 13th and will have limited internet
> connectivity so I likely won't be able to do much until I get back.

I think this patch should help.

autofs-5.0.6 - fix dumpmaps not reading maps

From: Ian Kent <ra...@themaw.net>

The lookup modules won't read any indirect map entries (other than
those in a file map) unless the browse option is set. In order to list
the entries when the dumpmap option is given the browse option needs to
be set.
---
 CHANGELOG    |    1 +
 lib/master.c |    9 +++++++++
 2 files changed, 10 insertions(+), 0 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index 884a9ae..946a196 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -3,6 +3,7 @@
 - fix ipv6 name for lookup fix.
 - improve mount location error reporting.
 - fix paged query more results check.
+- fix dumpmaps not reading maps.
 28/06/2011 autofs-5.0.6
 ---
diff --git a/lib/master.c b/lib/master.c
index 153a38b..6c89e1d 100644
--- a/lib/master.c
+++ b/lib/master.c
@@ -1283,6 +1283,15 @@ int master_show_mounts(struct master *master)
 		printf("\nMount point: %s\n", ap->path);
 		printf("\nsource(s):\n");
+		/*
+		 * Ensure we actually read indirect map entries so we can
+		 * list them. The map reads won't read any indirect map
+		 * entries (other than those in a file map) unless the
+		 * browse option is set.
+		 */
+		if (ap->type == LKP_INDIRECT)
+			ap->flags |= MOUNT_FLAG_GHOST;
+
 		/* Read the map content into the cache */
 		if (lookup_nss_read_map(ap, NULL, now))
 			lookup_prune_cache(ap, now);
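As a rough illustration of what the dumpmaps fix does, the sketch below forces a browse-style flag on for indirect mount points only, leaving direct mounts and already-flagged entries alone. The type names, enum values and the FLAG_BROWSE constant are hypothetical stand-ins chosen for this example; the real definitions (LKP_INDIRECT, MOUNT_FLAG_GHOST, struct autofs_point) live in the autofs headers and may differ in value and layout.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the autofs lookup types and flag bit. */
enum map_type { TYPE_INDIRECT = 1, TYPE_DIRECT = 2 };
#define FLAG_BROWSE 0x0001u

struct mount_point {
	enum map_type type;
	unsigned int flags;
};

/*
 * Turn the browse flag on for an indirect mount so a later map read
 * would load its entries. Returns 1 if the flag was newly set, 0 if
 * nothing changed (direct mount, or flag already present).
 */
int force_browse_for_dump(struct mount_point *mp)
{
	assert(mp != NULL);
	if (mp->type != TYPE_INDIRECT)
		return 0;
	if (mp->flags & FLAG_BROWSE)
		return 0;
	mp->flags |= FLAG_BROWSE;  /* other flag bits are preserved */
	return 1;
}
```

Using |= rather than plain assignment keeps any other option bits on the mount point intact, which matches the form used in the patch.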
Re: [autofs] [Libtirpc-devel] [PATCH] Autofs configure fails to detect IPv6 when libtirpc is enabled
On Thu, 2011-07-28 at 07:20 -0400, Chuck Lever wrote: On Jul 27, 2011, at 10:26 PM, Ian Kent wrote: On Tue, 2011-07-26 at 23:30 -0400, Chuck Lever wrote: For IPv6 support, use functions that are part of the modern libtirpc API. This is described in Sun doc 816-1435. You probably will be most successful with the simplified interface which is described in Chapter 4. You might need somewhat more extensive surgery since I'm guessing you have separate code paths to invoke the IPv4 and IPv6 legacy RPC functions; generally speaking that should not be needed when using the libtirpc API. I doubt the simplified interface will be adequate since this code was written because of a need for greater control over timeouts. Perhaps that won't be the case, I don't know yet. If you want control over connection timeouts, use the expert-level or bottom-level interfaces. Otherwise you can set per-RPC timeouts when clnt_call(3t) is invoked. nfs-utils has some example code (support/nfs/rpc_socket.c is one place to look). Your suggestion amounts to saying I need to re-write all my RPC code. The substantial change with client-side TI-RPC is how CLIENTs are created. The other RPC operations are similar or the same as they were with the legacy API. Once you get over getnetconfigent(3t) it's really not as bad as it looks. Umm ... Why is __rpcb_findaddr() declared in the public header files but not defined anywhere is the source? Why is __rpcb_findaddr_timed() defined in the source but not defined in the public header files? This version of libtirpc was split from the Sun version over a decade ago when the code was immature. So you're going to find this kind of thing in many places. Yes, I was aware of that, but haven't paid enough attention to the doc. The TI-RPC API is defined in 816-1435. You really shouldn't consider using any of the interfaces defined in the headers but not in that doc, as those are internal interfaces and can change. Ummm .. 
rpcb_getaddr() might be what I'm looking for, I'll look further.

On the other hand, we have at least two important RPC-based applications that can make use of this interface. I wonder if it makes sense to harden that API but leave it hidden, so apps external to the library can depend on it.

Yeah, but if I can achieve what I need without it that's the way I'll go.

It looks like I might not be able to do what I want the way I want with ti-rpc but it is still too early to tell. It's also too early to tell if ti-rpc actually already does some or all of what I need already. Time will tell.

One example of something I need is to control the timeout, not the timeout for interactions after the client is constructed but the timeout of the client construction itself, including any queries to rpcbind that may be needed (hence why I want to do that manually too).

Such apps would not be portable away from Linux nor to Linux distributions that don't have libtirpc yet.

It sounds good but, as we all know, adding things that may need to change or are not designed for external use can be a support burden.

Ian
Re: [autofs] [Libtirpc-devel] [PATCH] Autofs configure fails to detect IPv6 when libtirpc is enabled
On Wed, 2011-07-27 at 11:39 +0800, Ian Kent wrote: On Tue, 2011-07-26 at 23:30 -0400, Chuck Lever wrote: On Jul 26, 2011, at 10:40 PM, Ian Kent wrote: On Tue, 2011-07-26 at 22:09 -0400, Chuck Lever wrote: On Jul 26, 2011, at 9:23 PM, Ian Kent wrote: On Wed, 2011-07-27 at 08:57 +0800, Ian Kent wrote: On Tue, 2011-07-26 at 17:13 -0400, Steve Dickson wrote: On 07/26/2011 10:50 AM, Chuck Lever wrote: On Jul 26, 2011, at 2:29 PM, Steve Dickson wrote: From: Ian Kent ra...@themaw.net The IPv6 client functions clntudp6_bufcreate(), clntudp6_create and clnttcp6_create and the server functions svcudp6_bufcreate(), svctcp6_create() and svcudp6_create() are not included in the library whe libtirpc is built. Are these part of the libtirpc standard API? I'm not sure why we would need them if, say, Solaris does not support these. It appears they are not since they are not mentioned the man pages. But, at least in the autofs code, they are expected https://bugzilla.redhat.com/show_bug.cgi?id=711956#c0 Ian, where else are these routines defined? Now that I look I can't find the original source tar that was used for libtirpc, thought I had it. Found what I had. AFAICT what I think was the original source doesn't have any IPv6 code that I can see. Worse, these functions were excluded with the #ifdef INET6_NOT_USED macro as far back as libtirpc version 0.1.5 so, my bad, sorry. The story is that long ago when I changed autofs to use libtirpc (to make it ready for IPv6) I found these functions in the source and they were (obviously) the IPv6 counterparts for the corresponding IPv4 functions which I was already using, so I used them. It took me quite a while to realize my code wasn't working and then I found that somewhere along the line they have been excluded, oops! If there are to be no IPv6 counterparts for the corresponding IPv4 functions which functions should I use then? So what can I use? 
It seems to me that these functions would be useful for people porting code that uses the corresponding IPv4 functions so could we define them please. At some point someone must have had that same idea It looks to me like these functions were part of an original attempt at IPv6 support that was abandoned long ago. They are not part of TI-RPC, but as you observed, they are merely IPv6 versions of the legacy RPC API. I don't see these implemented in glibc, for example. For IPv6 support, use functions that are part of the modern libtirpc API. This is described in Sun doc 816-1435. You probably will be most successful with the simplified interface which is described in Chapter 4. You might need somewhat more extensive surgery since I'm guessing you have separate code paths to invoke the IPv4 and IPv6 legacy RPC functions; generally speaking that should not be needed when using the libtirpc API. I doubt the simplified interface will be adequate since this code was written because of a need for greater control over timeouts. Perhaps that won't be the case, I don't know yet. If you want control over connection timeouts, use the expert-level or bottom-level interfaces. Otherwise you can set per-RPC timeouts when clnt_call(3t) is invoked. nfs-utils has some example code (support/nfs/rpc_socket.c is one place to look). Your suggestion amounts to saying I need to re-write all my RPC code. The substantial change with client-side TI-RPC is how CLIENTs are created. The other RPC operations are similar or the same as they were with the legacy API. Once you get over getnetconfigent(3t) it's really not as bad as it looks. Sure, but it's the dependent code in autofs that uses the RPC routines that will force me to keep the interface. But, like I said, it may be a non-issue since I can lift these routines straight out of libtirpc (as long as I attribute copyright according to the comment in the source file). That's not going to be straight forward either. 
Ian
Re: [autofs] [Libtirpc-devel] [PATCH] Autofs configure fails to detect IPv6 when libtirpc is enabled
On Wed, 2011-07-27 at 08:57 +0800, Ian Kent wrote: On Tue, 2011-07-26 at 17:13 -0400, Steve Dickson wrote: On 07/26/2011 10:50 AM, Chuck Lever wrote: On Jul 26, 2011, at 2:29 PM, Steve Dickson wrote: From: Ian Kent ra...@themaw.net The IPv6 client functions clntudp6_bufcreate(), clntudp6_create and clnttcp6_create and the server functions svcudp6_bufcreate(), svctcp6_create() and svcudp6_create() are not included in the library whe libtirpc is built. Are these part of the libtirpc standard API? I'm not sure why we would need them if, say, Solaris does not support these. It appears they are not since they are not mentioned the man pages. But, at least in the autofs code, they are expected https://bugzilla.redhat.com/show_bug.cgi?id=711956#c0 Ian, where else are these routines defined? Now that I look I can't find the original source tar that was used for libtirpc, thought I had it. Found what I had. AFAICT what I think was the original source doesn't have any IPv6 code that I can see. Worse, these functions were excluded with the #ifdef INET6_NOT_USED macro as far back as libtirpc version 0.1.5 so, my bad, sorry. The story is that long ago when I changed autofs to use libtirpc (to make it ready for IPv6) I found these functions in the source and they were (obviously) the IPv6 counterparts for the corresponding IPv4 functions which I was already using, so I used them. It took me quite a while to realize my code wasn't working and then I found that somewhere along the line they have been excluded, oops! If there are to be no IPv6 counterparts for the corresponding IPv4 functions which functions should I use then? So what can I use? It seems to me that these functions would be useful for people porting code that uses the corresponding IPv4 functions so could we define them please. 
At some point someone must have had that same idea

steved

Signed-off-by: Steve Dickson <ste...@redhat.com>
---
 src/rpc_soc.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/rpc_soc.c b/src/rpc_soc.c
index c678429..584ac71 100644
--- a/src/rpc_soc.c
+++ b/src/rpc_soc.c
@@ -236,7 +236,7 @@ clnttcp_create(raddr, prog, vers, sockp, sendsz, recvsz)
 /* IPv6 version of clnt*_*create */
-#ifdef INET6_NOT_USED
+#ifdef INET6
 CLIENT *
 clntudp6_bufcreate(raddr, prog, vers, wait, sockp, sendsz, recvsz)
@@ -392,7 +392,7 @@ svcraw_create()
 /* IPV6 version */
-#ifdef INET6_NOT_USED
+#ifdef INET6
 SVCXPRT *
 svcudp6_bufcreate(fd, sendsz, recvsz)
 	int fd;
--
1.7.6

___
Libtirpc-devel mailing list
libtirpc-de...@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/libtirpc-devel
Re: [autofs] [Libtirpc-devel] [PATCH] Autofs configure fails to detect IPv6 when libtirpc is enabled
On Tue, 2011-07-26 at 22:09 -0400, Chuck Lever wrote:
On Jul 26, 2011, at 9:23 PM, Ian Kent wrote:

If there are to be no IPv6 counterparts for the corresponding IPv4 functions which functions should I use then? So what can I use? It seems to me that these functions would be useful for people porting code that uses the corresponding IPv4 functions so could we define them please.
At some point someone must have had that same idea

It looks to me like these functions were part of an original attempt at IPv6 support that was abandoned long ago. They are not part of TI-RPC, but as you observed, they are merely IPv6 versions of the legacy RPC API. I don't see these implemented in glibc, for example.

For IPv6 support, use functions that are part of the modern libtirpc API. This is described in Sun doc 816-1435. You probably will be most successful with the simplified interface which is described in Chapter 4. You might need somewhat more extensive surgery since I'm guessing you have separate code paths to invoke the IPv4 and IPv6 legacy RPC functions; generally speaking that should not be needed when using the libtirpc API.

I doubt the simplified interface will be adequate since this code was written because of a need for greater control over timeouts. Perhaps that won't be the case, I don't know yet.

Your suggestion amounts to saying I need to re-write all my RPC code.
-- To unsubscribe from this list: send the line unsubscribe linux-nfs in the body of a message to majord...@vger.kernel.org
Re: [autofs] [Libtirpc-devel] [PATCH] Autofs configure fails to detect IPv6 when libtirpc is enabled
On Wed, 2011-07-27 at 10:40 +0800, Ian Kent wrote:
On Tue, 2011-07-26 at 22:09 -0400, Chuck Lever wrote:
On Jul 26, 2011, at 9:23 PM, Ian Kent wrote:

If there are to be no IPv6 counterparts for the corresponding IPv4 functions which functions should I use then? So what can I use? It seems to me that these functions would be useful for people porting code that uses the corresponding IPv4 functions so could we define them please.
At some point someone must have had that same idea

It looks to me like these functions were part of an original attempt at IPv6 support that was abandoned long ago. They are not part of TI-RPC, but as you observed, they are merely IPv6 versions of the legacy RPC API. I don't see these implemented in glibc, for example.

For IPv6 support, use functions that are part of the modern libtirpc API. This is described in Sun doc 816-1435. You probably will be most successful with the simplified interface which is described in Chapter 4. You might need somewhat more extensive surgery since I'm guessing you have separate code paths to invoke the IPv4 and IPv6 legacy RPC functions; generally speaking that should not be needed when using the libtirpc API.

I doubt the simplified interface will be adequate since this code was written because of a need for greater control over timeouts. Perhaps that won't be the case, I don't know yet.

Your suggestion amounts to saying I need to re-write all my RPC code.

That comment is a bit dramatic, sorry. Actually I should be able to replace these calls with the contents of the functions as they are in libtirpc without too much trouble.
Re: [autofs] [Libtirpc-devel] [PATCH] Autofs configure fails to detect IPv6 when libtirpc is enabled
On Tue, 2011-07-26 at 23:30 -0400, Chuck Lever wrote:
On Jul 26, 2011, at 10:40 PM, Ian Kent wrote:
On Tue, 2011-07-26 at 22:09 -0400, Chuck Lever wrote:
On Jul 26, 2011, at 9:23 PM, Ian Kent wrote:

If there are to be no IPv6 counterparts for the corresponding IPv4 functions which functions should I use then? So what can I use?
It seems to me that these functions would be useful for people porting code that uses the corresponding IPv4 functions so could we define them please. At some point someone must have had that same idea

It looks to me like these functions were part of an original attempt at IPv6 support that was abandoned long ago. They are not part of TI-RPC, but as you observed, they are merely IPv6 versions of the legacy RPC API. I don't see these implemented in glibc, for example.

For IPv6 support, use functions that are part of the modern libtirpc API. This is described in Sun doc 816-1435. You probably will be most successful with the simplified interface which is described in Chapter 4. You might need somewhat more extensive surgery since I'm guessing you have separate code paths to invoke the IPv4 and IPv6 legacy RPC functions; generally speaking that should not be needed when using the libtirpc API.

I doubt the simplified interface will be adequate since this code was written because of a need for greater control over timeouts. Perhaps that won't be the case, I don't know yet.

If you want control over connection timeouts, use the expert-level or bottom-level interfaces. Otherwise you can set per-RPC timeouts when clnt_call(3t) is invoked. nfs-utils has some example code (support/nfs/rpc_socket.c is one place to look).

Your suggestion amounts to saying I need to re-write all my RPC code.

The substantial change with client-side TI-RPC is how CLIENTs are created. The other RPC operations are similar or the same as they were with the legacy API. Once you get over getnetconfigent(3t) it's really not as bad as it looks.

Sure, but it's the dependent code in autofs that uses the RPC routines that will force me to keep the interface. But, like I said, it may be a non-issue since I can lift these routines straight out of libtirpc (as long as I attribute copyright according to the comment in the source file).
Ian
Re: [autofs] Lockdep splat in autofs with 2.6.39-rc2
On Thu, 2011-04-21 at 17:25 -0400, Steven Rostedt wrote:
On Thu, Apr 07, 2011 at 03:44:03PM -0400, Nick Bowler wrote:

Just saw this on 2.6.39-rc2 after half a day or so of uptime. I've never seen it before today so it may be a regression from 2.6.38. Nothing seems to have failed as a result. Please let me know if you need any more info.

Could you try this patch? I know it may be hard to reproduce, but the issue is that we are recursing down the locks in a tree/list and we changed a lock from being nested to being a parent. This patch tells lockdep about what we did.

Signed-off-by: Steven Rostedt rost...@goodmis.org

Hi Steven,

It appears this is not included in current mainline yet so I'm guessing it is still a problem. Is this the correct way to handle the problem? Do you want me to forward the patch to Al Viro for inclusion in his tree and subsequent inclusion in mainline or would you like to do that?

Ian

diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index 450f529..1feb68e 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -124,6 +124,7 @@ start:
 		/* Negative dentry - try next */
 		if (!simple_positive(q)) {
 			spin_unlock(&p->d_lock);
+			lock_set_subclass(&q->d_lock.dep_map, 0, _RET_IP_);
 			p = q;
 			goto again;
 		}
@@ -186,6 +187,7 @@ again:
 		/* Negative dentry - try next */
 		if (!simple_positive(ret)) {
 			spin_unlock(&p->d_lock);
+			lock_set_subclass(&ret->d_lock.dep_map, 0, _RET_IP_);
 			p = ret;
 			goto again;
 		}
Re: [autofs] RPC: Can't bind to reserved port (98) in autofs 4.1.3-240
On Tue, 2011-07-12 at 08:33 -0700, Brickles, Stephen wrote:

This bug seems to crop up from time to time. I’ve never had much luck in being able to fix this by running Redhat updates. It seems to be pretty random. Finally I found a machine on which this bug recurred over and over and so I was able to set up debugging on it.

The mount point listed in these maps below is a fully functioning mount-point most of the time. There are other mount points in ‘auto.mntpnt’ which have the same random error – I just pruned these out to make the explanation simpler.

I set up the debugging according to Jeff’s instructions at http://people.redhat.com/jmoyer and the output is below. It would appear from Google that “found negative cache entry for key” is an issue which seems to have come up before. Also from the debug log, it’s difficult to tell if this is a problem

That's probably a consequence of the mount failures but I'm not very familiar with the caching system changes that Jeff did for autofs version 4 in RHEL-4.

with ‘autofs’ or ‘mount’.

The RPC error is coming from mount.

Also this machine is part of a simulation server farm. It would appear that sending a sequence of simulation jobs in rapid succession to a host (ie. multiple rsh by a user in a short period of time), causing a whole bunch of mount requests at once, seems more likely to trigger this problem.

Which is usually what causes reserved port exhaustion. Just how many reserved ports are actually available has changed somewhat over time and is configurable to a limit (obviously), not sure about RHEL-4.

One thing that compounds the problem is that mount.nfs probes mountd for version information and NFS for version information, leaving behind a bunch of reserved ports in a wait state. Then the kernel uses another port for the mount itself. Again I haven't looked at RHEL-4 for some time so I can't say whether NFS in RHEL-4 can share the port used for all mounts to a given server, it certainly does in RHEL-5 and later.
If NFS doesn't do that then each mount will take another port until it is umounted. The wait state (I think 60 seconds) I mentioned is required by the TCP protocol and so is unavoidable and those ports cannot be reused until the timeout has expired.

Basically, it's fairly easy to use a lot of reserved ports really quickly when rapidly mounting mounts and end up with not a great many mounts actually done.

There's not a lot you can do about the port exhaustion other than ensure that the NFS server allows connections from higher numbered ports (ie. not in reserved port range) and ensure that your version of mount.nfs also supports it. I think the RHEL-4 you are using should be fine with that for mount.nfs but I think it still uses a few more ports than it really could get away with.

Thanks,
Stephen

uname -a
Linux server3 2.6.9-100.ELsmp #1 SMP Tue Feb 1 12:04:42 EST 2011 x86_64 x86_64 x86_64 GNU/Linux

cat /etc/redhat-release
Red Hat Enterprise Linux WS release 4 (Nahant Update 9)

rpm -qa | grep autofs
autofs-4.1.3-240

rpm -qa | grep nfs-utils
nfs-utils-lib-devel-1.0.6-10.el4_8.1
nfs-utils-1.0.6-94.EL4
nfs-utils-lib-1.0.6-10.el4_8.1

auto.master:
/mntpnt auto.mntpnt -rw,intr,soft,vers=3

auto.mntpnt:
mail -rw,actimeo=0,vers=3 nis_server:/var/mail

/var/log/messages:
Jul 11 23:54:01 server3 automount[6778]: attempting to mount entry /mntpnt/mail
Jul 11 23:54:01 server3 kernel: RPC: Can't bind to reserved port (98).
Jul 11 23:54:01 server3 kernel: RPC: can't bind to reserved port.
Jul 11 23:54:01 server3 kernel: RPC: error 5 connecting to server nis_server
Jul 11 23:54:01 server3 kernel: RPC: Can't bind to reserved port (98).
Jul 11 23:54:01 server3 kernel: RPC: can't bind to reserved port.
Jul 11 23:54:01 server3 automount[3564]: mount: nis_server:/var/mail: can't read superblock
Jul 11 23:54:01 server3 kernel: RPC: error 5 connecting to server nis_server
Jul 11 23:54:01 server3 automount[3564]: mount(nfs): nfs: mount failure nis_server:/var/mail on /mntpnt/mail
Jul 11 23:54:01 server3 automount[3564]: failed to mount /mntpnt/mail

/var/log/debug:
Jul 11 23:54:01 server3 automount[6778]: send_fail: token=247707
Jul 11 23:54:01 server3 automount[6778]: handle_packet: type = 0
Jul 11 23:54:01 server3 automount[6778]: handle_packet_missing: token 247708, name mail
Jul 11 23:54:01 server3 automount[6778]: handle_packet_missing: expired negative cache entry for key.
Jul 11 23:54:01 server3 automount[6778]: attempting to mount entry /mntpnt/mail
Jul 11 23:54:01 server3 automount[3564]: lookup(yp): looking up mail
Jul 11 23:54:01 server3 automount[6778]: mt-key set to mail
Jul 11 23:54:01 server3 automount[3564]: lookup(yp): mail - -rw,actimeo=0,vers=3 nis_server:/var/mail
Jul 11 23:54:01 server3 automount[3564]: parse(sun): expanded entry: -rw,actimeo=0,vers=3
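
As a back-of-the-envelope illustration of the port exhaustion described in the reply above, here is a minimal C sketch. It assumes a fixed pool of reserved ports, a number of short-lived probe connections per mount that each linger in the TCP wait state, and a wait time of about 60 seconds as suggested above. The function name max_mount_rate() and every number used with it are hypothetical, chosen only to show the arithmetic; they are not measured values for RHEL-4.

```c
#include <assert.h>

/*
 * Hypothetical helper, not part of autofs or nfs-utils.
 *
 * Rough upper bound on the sustained number of mounts per second that a
 * reserved port pool can absorb, assuming each mount attempt leaves
 * probe_ports_per_mount ports in the wait state for wait_secs seconds.
 * All inputs are assumptions supplied by the caller.
 */
static double max_mount_rate(int pool_size, int ports_in_use,
			     int probe_ports_per_mount, int wait_secs)
{
	int avail = pool_size - ports_in_use;
	double rate;

	/* Nothing left, or nonsensical input: no mounts can be sustained. */
	if (avail <= 0 || probe_ports_per_mount <= 0 || wait_secs <= 0)
		return 0.0;

	/* Each mount ties up probe_ports_per_mount ports for wait_secs. */
	rate = (double) avail / ((double) probe_ports_per_mount * wait_secs);
	assert(rate >= 0.0);
	return rate;
}
```

With a hypothetical pool of 512 ports, 12 of them held by existing mounts and 2 probe connections per mount, max_mount_rate(512, 12, 2, 60) works out to roughly 4 mounts per second, a rate that a burst of rsh jobs could plausibly exceed.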
Re: [autofs] autofs does not resolve names ?
On Sun, 2011-07-17 at 01:37 +0200, JA Magallón wrote:
On Sat, 16 Jul 2011 10:30:58 +0800, Ian Kent ra...@themaw.net wrote:

(CC LKML for info and completeness...)

On Sat, 2011-07-16 at 03:01 +0200, JA Magallón wrote:

Hi all...

Since the update to autofs-5.0.6, I have a curious problem. Probably it is not autofs to blame, but I ask here to see if someone can give me some light...

Have you applied autofs-5.0.6-fix-ipv6-name-for-lookup-fix.patch? Which fixes a stupid mistake on my part.

Love you ! That did the trick, everything works again. Are you planning to release a 5.0.7 with the fix ?

That will go into 5.0.7 but it probably won't be released for a while. Patches, such as this, are available on http://www.kernel.org/pub/linux/daemons/autofs/v5 as soon as they are committed to the autofs repository.

Setup:

/etc/autofs/auto.master:
/home ldap://danae-nfs.cps.unizar.es/automountMapName=auto_home,o=diis --timeout=60,tcp,nobrowse

(user homes map from LDAP, shared homes at a server)

The name is resolved:

annwn:/etc/autofs# host danae-nfs
danae-nfs.cps.unizar.es has address 155.210.152.202
annwn:/etc/autofs# ping danae-nfs
PING danae-nfs.cps.unizar.es (155.210.152.202) 56(84) bytes of data.
64 bytes from danae-nfs.cps.unizar.es (155.210.152.202): icmp_req=1 ttl=255 time=0.339 ms
...

But when I access home (ssh the box):

Jul 15 01:39:31 annwn automount[7774]: Starting automounter version 5.0.6, master map auto.master
Jul 15 01:39:31 annwn automount[7774]: using kernel protocol version 5.02
Jul 15 01:39:31 annwn automount[7774]: lookup_nss_read_master: reading master files auto.master
...
Jul 15 01:39:31 annwn automount[7774]: mount_mount: mount(nfs): root=/home name=magallon what=danae-nfs:/export/home/usuarios/giga/magallon, fstype=nfs, options=tcp,quota
Jul 15 01:39:31 annwn automount[7774]: mount_mount: mount(nfs): nfs options=tcp,quota, nobind=0, nosymlink=0, ro=0
Jul 15 01:39:31 annwn automount[7774]: get_nfs_info: called with host danae-nfs(69.65.19.116) proto udp version 0x30
Jul 15 01:39:37 annwn automount[7774]: get_nfs_info: called with host danae-nfs(69.65.19.116) proto tcp version 0x30

Where the hell does it get that IP address for my server? Digging a bit more:

annwn:/var/log# host 69.65.19.116
116.19.65.69.in-addr.arpa domain name pointer nx-deleted-host-1.no-ip.com

because I have at /etc/resolv.conf:

search cps.unizar.es unizar.es no-ip.org

Why did it not resolve first with the first domain ? If I delete the no-ip.org domain, I get:

Jul 15 01:47:18 annwn automount[8068]: mount_mount: mount(nfs): root=/home name=magallon what=danae-nfs:/export/home/usuarios/giga/magallon, fstype=nfs, options=tcp,quota
Jul 15 01:47:18 annwn automount[8068]: mount_mount: mount(nfs): nfs options=tcp,quota, nobind=0, nosymlink=0, ro=0
Jul 15 01:47:20 annwn automount[8068]: add_host_addrs: hostname lookup failed: Name or service not known
Jul 15 01:47:20 annwn automount[8068]: mount(nfs): no hosts available
Jul 15 01:47:20 annwn automount[8068]: dev_ioctl_send_fail: token = 28
Jul 15 01:47:20 annwn automount[8068]: failed to mount /home/magallon

automount does not add the search domains and resolve ? what am I doing wrong ? is it a bug ? uuhhh ?

TIA

-- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [autofs] autofs does not resolve names ?
On Sat, 2011-07-16 at 03:01 +0200, JA Magallón wrote:

Hi all...

Since the update to autofs-5.0.6, I have a curious problem. Probably it is not autofs to blame, but I ask here to see if someone can give me some light...

Have you applied autofs-5.0.6-fix-ipv6-name-for-lookup-fix.patch? Which fixes a stupid mistake on my part.

Setup:

/etc/autofs/auto.master:
/home ldap://danae-nfs.cps.unizar.es/automountMapName=auto_home,o=diis --timeout=60,tcp,nobrowse

(user homes map from LDAP, shared homes at a server)

The name is resolved:

annwn:/etc/autofs# host danae-nfs
danae-nfs.cps.unizar.es has address 155.210.152.202
annwn:/etc/autofs# ping danae-nfs
PING danae-nfs.cps.unizar.es (155.210.152.202) 56(84) bytes of data.
64 bytes from danae-nfs.cps.unizar.es (155.210.152.202): icmp_req=1 ttl=255 time=0.339 ms
...

But when I access home (ssh the box):

Jul 15 01:39:31 annwn automount[7774]: Starting automounter version 5.0.6, master map auto.master
Jul 15 01:39:31 annwn automount[7774]: using kernel protocol version 5.02
Jul 15 01:39:31 annwn automount[7774]: lookup_nss_read_master: reading master files auto.master
...
Jul 15 01:39:31 annwn automount[7774]: mount_mount: mount(nfs): root=/home name=magallon what=danae-nfs:/export/home/usuarios/giga/magallon, fstype=nfs, options=tcp,quota
Jul 15 01:39:31 annwn automount[7774]: mount_mount: mount(nfs): nfs options=tcp,quota, nobind=0, nosymlink=0, ro=0
Jul 15 01:39:31 annwn automount[7774]: get_nfs_info: called with host danae-nfs(69.65.19.116) proto udp version 0x30
Jul 15 01:39:37 annwn automount[7774]: get_nfs_info: called with host danae-nfs(69.65.19.116) proto tcp version 0x30

Where the hell does it get that IP address for my server? Digging a bit more:

annwn:/var/log# host 69.65.19.116
116.19.65.69.in-addr.arpa domain name pointer nx-deleted-host-1.no-ip.com

because I have at /etc/resolv.conf:

search cps.unizar.es unizar.es no-ip.org

Why did it not resolve first with the first domain ?
If I delete the no-ip.org domain, I get:

Jul 15 01:47:18 annwn automount[8068]: mount_mount: mount(nfs): root=/home name=magallon what=danae-nfs:/export/home/usuarios/giga/magallon, fstype=nfs, options=tcp,quota
Jul 15 01:47:18 annwn automount[8068]: mount_mount: mount(nfs): nfs options=tcp,quota, nobind=0, nosymlink=0, ro=0
Jul 15 01:47:20 annwn automount[8068]: add_host_addrs: hostname lookup failed: Name or service not known
Jul 15 01:47:20 annwn automount[8068]: mount(nfs): no hosts available
Jul 15 01:47:20 annwn automount[8068]: dev_ioctl_send_fail: token = 28
Jul 15 01:47:20 annwn automount[8068]: failed to mount /home/magallon

automount does not add the search domains and resolve ? what am I doing wrong ? is it a bug ? uuhhh ?

TIA
Re: [autofs] [PATCH] Fix non-IPv6 host name lookups (5.0.6)
On Wed, 2011-06-29 at 11:39 -0300, Leonardo Chiquitto wrote:

Fix non-IPv6 host name lookups

Commit 5b083026 (fix ipv6 name for lookup) causes a regression in regular (non-IPv6) host name lookups: it trims the first host name character even when it's not a [. This patch fixes the issue.

Gaah yep.

---
 modules/replicated.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: autofs/modules/replicated.c
===================================================================
--- autofs.orig/modules/replicated.c
+++ autofs/modules/replicated.c
@@ -1125,15 +1125,17 @@ static int add_host_addrs(struct host **
 	}

 	len = strlen(name);
-	if (name[0] == '[' && name[--len] == ']')
+	if (name[0] == '[' && name[--len] == ']') {
 		name[len] = '\0';
+		memmove(name, name + 1, len);
+	}

 	memset(&hints, 0, sizeof(hints));
 	hints.ai_flags = AI_NUMERICHOST;
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;

-	ret = getaddrinfo(name + 1, NULL, &hints, &ni);
+	ret = getaddrinfo(name, NULL, &hints, &ni);
 	if (ret)
 		goto try_name;

@@ -1153,7 +1155,7 @@ try_name:
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;

-	ret = getaddrinfo(name + 1, NULL, &hints, &ni);
+	ret = getaddrinfo(name, NULL, &hints, &ni);
 	if (ret) {
 		error(LOGOPT_ANY, "hostname lookup failed: %s",
 		      gai_strerror(ret));
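
The patch above touches a two-step lookup: getaddrinfo() is first called with AI_NUMERICHOST, and only if that fails does the code fall through to try_name for a real name lookup. A minimal standalone sketch of the first step follows. The helper name is_numeric_host() is hypothetical and is not taken from autofs; the hints mirror the ones visible in the diff context.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <netdb.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Hypothetical helper, not part of autofs.
 *
 * Returns 1 if name parses as a numeric IPv4 or IPv6 address and 0
 * otherwise. With AI_NUMERICHOST set, getaddrinfo() only parses the
 * string and does not consult the resolver, so no search domain is
 * ever applied at this stage.
 */
static int is_numeric_host(const char *name)
{
	struct addrinfo hints, *ni = NULL;
	int ret;

	assert(name != NULL);

	memset(&hints, 0, sizeof(hints));
	hints.ai_flags = AI_NUMERICHOST;
	hints.ai_family = AF_UNSPEC;
	hints.ai_socktype = SOCK_DGRAM;

	ret = getaddrinfo(name, NULL, &hints, &ni);
	if (ret)
		return 0;	/* caller would go on to a name lookup */

	freeaddrinfo(ni);
	return 1;
}
```

A bracketed literal such as "[::1]" is not accepted as numeric by this check, which is presumably why the brackets have to be stripped before the call, and why stripping must only happen when the brackets are actually present.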
Re: [autofs] [PATCH] Fix non-IPv6 host name lookups (5.0.6)
On Thu, 2011-06-30 at 14:37 +0800, Ian Kent wrote:
On Wed, 2011-06-29 at 11:39 -0300, Leonardo Chiquitto wrote:

Fix non-IPv6 host name lookups

Commit 5b083026 (fix ipv6 name for lookup) causes a regression in regular (non-IPv6) host name lookups: it trims the first host name character even when it's not a [. This patch fixes the issue.

Gaah yep.

---
 modules/replicated.c |    8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

Index: autofs/modules/replicated.c
===================================================================
--- autofs.orig/modules/replicated.c
+++ autofs/modules/replicated.c
@@ -1125,15 +1125,17 @@ static int add_host_addrs(struct host **
 	}

 	len = strlen(name);
-	if (name[0] == '[' && name[--len] == ']')
+	if (name[0] == '[' && name[--len] == ']') {
 		name[len] = '\0';
+		memmove(name, name + 1, len);
+	}

 	memset(&hints, 0, sizeof(hints));
 	hints.ai_flags = AI_NUMERICHOST;
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;

-	ret = getaddrinfo(name + 1, NULL, &hints, &ni);
+	ret = getaddrinfo(name, NULL, &hints, &ni);
 	if (ret)
 		goto try_name;

@@ -1153,7 +1155,7 @@ try_name:
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;

-	ret = getaddrinfo(name + 1, NULL, &hints, &ni);
+	ret = getaddrinfo(name, NULL, &hints, &ni);
 	if (ret) {
 		error(LOGOPT_ANY, "hostname lookup failed: %s",
 		      gai_strerror(ret));

I'd prefer to avoid the use of memmove(3) though so I think this will also fix it.

autofs-5.0.6 - fix ipv6 name for lookup fix

From: Ian Kent ik...@redhat.com

Fix an error in the recent ipv6 name for lookup patch.
---
 modules/replicated.c |   13 ++++++++-----
 1 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/modules/replicated.c b/modules/replicated.c
index 7f2b892..a10a817 100644
--- a/modules/replicated.c
+++ b/modules/replicated.c
@@ -,7 +,8 @@ static int add_host_addrs(struct host **list, const char *host,
 			  unsigned int weight, unsigned int options)
 {
 	struct addrinfo hints, *ni, *this;
-	char *name = strdup(host);
+	char *n_ptr;
+	char *name = n_ptr = strdup(host);
 	int len;
 	char buf[MAX_ERR_BUF];
 	int rr = 0;
@@ -1125,15 +1126,17 @@ static int add_host_addrs(struct host **list, const char *host,
 	}

 	len = strlen(name);
-	if (name[0] == '[' && name[--len] == ']')
+	if (name[0] == '[' && name[--len] == ']') {
 		name[len] = '\0';
+		name++;
+	}

 	memset(&hints, 0, sizeof(hints));
 	hints.ai_flags = AI_NUMERICHOST;
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;

-	ret = getaddrinfo(name + 1, NULL, &hints, &ni);
+	ret = getaddrinfo(name, NULL, &hints, &ni);
 	if (ret)
 		goto try_name;

@@ -1153,7 +1156,7 @@ try_name:
 	hints.ai_family = AF_UNSPEC;
 	hints.ai_socktype = SOCK_DGRAM;

-	ret = getaddrinfo(name + 1, NULL, &hints, &ni);
+	ret = getaddrinfo(name, NULL, &hints, &ni);
 	if (ret) {
 		error(LOGOPT_ANY, "hostname lookup failed: %s",
 		      gai_strerror(ret));
@@ -1172,7 +1175,7 @@ try_name:
 	}
 	freeaddrinfo(ni);
 done:
-	free(name);
+	free(n_ptr);
 	return ret;
 }
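
The approach in the patch above (advance the pointer past the opening bracket, but keep the original pointer for free()) can be sketched as a small standalone helper. The function dup_unbracketed() is hypothetical and only illustrates the idea; it is not the code that went into autofs. Unlike the patch, it checks the length before indexing so that an empty string is handled safely.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper, not part of autofs.
 *
 * Duplicates host and, only when it is wrapped in [ and ], returns a
 * pointer to the text between the brackets. *to_free always receives
 * the pointer returned by strdup(), which is the one that must be
 * passed to free() later, even when the returned name has been
 * advanced by one character.
 */
static char *dup_unbracketed(const char *host, char **to_free)
{
	char *name;
	size_t len;

	assert(host != NULL && to_free != NULL);

	name = strdup(host);
	*to_free = name;
	if (!name)
		return NULL;

	len = strlen(name);
	if (len >= 2 && name[0] == '[' && name[len - 1] == ']') {
		name[len - 1] = '\0';	/* drop the closing bracket */
		name++;			/* skip the opening bracket */
	}
	return name;
}
```

A caller would pass the returned name to getaddrinfo() and later call free() on the value stored through to_free. A plain host name such as danae-nfs comes back unchanged, which is the case the original commit broke.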
[autofs] [ANNOUNCE] autofs 5.0.6 release
Hi all,

This is long overdue, even so, I think there are still some things that need fixing but aren't clear cut. However there are too many patches to delay any longer so here is release 5.0.6.

The kernel patches are not being updated any more because of the introduction of the vfs-automount changes to the VFS. If there are needs in this area we will need to discuss how to deal with them on the mailing list.

Known issues (haven't changed since 5.0.5)
==========================================

- Quoted strings in the master map are still not yet handled.
- There is a problem with mount --move in some releases of SuSE (perhaps other distributions as well) which can cause mounts to not be moved correctly resulting in /etc/mtab continually growing due to invalid entries.
- When the active restart is being used it will happily re-connect a mount that is unresponsive, perhaps because the server is not responding. A forced expire (USR1 signal) should be enough to clean up.

autofs
======

The package can be found at:

ftp://ftp.kernel.org/pub/linux/daemons/autofs/v5

It is autofs-5.0.6.tar.[gz|bz2]

No source rpm is there as it can be produced by using:

rpmbuild -ts autofs-5.0.6.tar.gz

and the binary rpm by using:

rpmbuild -tb autofs-5.0.6.tar.gz

See the INSTALL file for information about configure options and kernel requirements.

Here are the entries from the CHANGELOG which outline the updates:

28/06/2011 autofs-5.0.6
-----------------------
- fix included map read fail handling.
- refactor ldap sasl bind handling.
- add mount wait timeout parameter.
- special case cifs escapes.
- fix compile fail with when LDAP is excluded.
- more code analysis corrections (and fix a typo in an init script).
- fix backwards #ifndef INET6.
- fix stale initialization for file map instance.
- add preen fsck for ext4 mounts.
- don't use master_lex_destroy() to clear parse buffer.
- make documentation for set-log-priority clearer.
- fix timeout in connect_nb().
- fix pidof init script usage.
- check for path mount location in generic module.
- dont fail mount on access fail.
- fix rpc fail on large export list.
- fix memory leak on reload.
- update kernel patches for 2.6.18 and 2.6.19.
- dont connect at ldap lookup module init.
- fix random selection option.
- fix disable timeout.
- fix strdup() return value check (Leonardo Chiquitto).
- fix reconnect get base dn.
- add missing sasl mutex callbacks.
- fix get query dn failure.
- fix ampersand escape in auto.smb.
- add locality as valid ldap master map attribute.
- add locality as valid ldap master map attribute fix.
- add simple bind authentication.
- fix master map source server unavailable handling.
- add autofs_ldap_auth.conf man page.
- fix random selection for host on different network.
- make redhat init script more lsb compliant.
- don't hold lock for simple mounts.
- fix remount locking.
- fix wildcard map entry match.
- fix parse_sun() module init.
- dont check null cache on expire.
- fix null cache race.
- fix cache_init() on source re-read.
- fix mapent becomes negative during lookup.
- check each dc server individually.
- fix negative cache included map lookup.
- remove state machine timed wait.
- remove extra read master map call.
- fix error handing in do_mount_indirect().
- expire thread use pending mutex.
- remove ERR_remove_state() openssl call.
- fix init script restart option.
- fix init script status privilege error.
- always read file maps mount lookup map read fix.
- fix direct map not updating on reread.
- add external bind method.
- fix add simple bind auth.
- add option to dump configured automount maps.
- use weight only for server selection.
- fix isspace() wild card substition.
- auto adjust ldap page size.
- fix prune cache valid check.
- fix mountd vers retry.
- fix expire race.
- replace GPLv3 code.
- fix paged ldap map read.
- fix next task list update.
- fix stale map read.
- fix null cache clean.
- automount(8) man page correction.
- fix out of order locking in readmap.
- include ip address in debug logging.
- mount using address for DNS round robin host names.
- reset negative status on cache prune.
- remove master_mutex_unlock() leftover.
- fix sanity checks for brackets in server name.
- fix lsb service name in init script.
- fix map source check in file lookup.
- fix simple bind without SASL support.
- fix sasl bind host name selection.
- add nobind option.
- add base64 password encode.
- fix ipv6 name for lookup.
- fix libtirpc ipv6 check.

Ian
Re: [autofs] Deadlock in automount caused by AB/BA lock ordering
On Fri, 2011-06-24 at 17:02 -0300, Leonardo Chiquitto wrote:
> On Mon, Jun 13, 2011 at 11:34 PM, Ian Kent ra...@themaw.net wrote:
>> On Thu, 2011-05-19 at 20:21 -0300, Leonardo Chiquitto wrote:
>>> Hello,
>>>
>>> We received a support request from a customer reporting a hang in the
>>> automount daemon. Analyzing the core dump, it looks like automount can
>>> deadlock if two threads execute in the following order:
>>>
>>> (All the line numbers are adjusted to match the latest version from Git)
>>>
>>> Thread 7 (Thread 25043):                        Thread 6 (Thread 24007):
>>> #9 start_thread() libpthread.so.0               #3 start_thread() libpthread.so.0
>>> #8 do_read_master() at automount.c:1259         #2 do_readmap() at state.c:479
>>> #7 master_read_master() at master.c:853            -- master_mutex_lock() [state.c:462]
>>>                                                    -- master_mutex_unlock() [state.c:466]
>>>    -- master_mutex_lock() [master.c:836]
>>>    -- cache_writelock() [master.c:838 or :849]
>>>                                                    -- master_source_readlock() [state.c:477]
>>> #6 lookup_nss_read_master() at lookup.c:229
>>> #5 do_read_master() at lookup.c:96
>>> #4 lookup_read_master() at lookup_ldap.c:1676
>>> #3 master_parse_entry() at master_parse.y:829
>>> #2 master_add_map_source() at master.c:192
>>> #1 master_source_writelock() at master.c:543    #1 cache_readlock() at cache.c:60
>>> #0 pthread_rwlock_wrlock() libpthread.so.0      #0 pthread_rwlock_rdlock() libpthread.so.0
>>>
>>> At this point:
>>>
>>> Thread 7: locked(master_mutex_lock)
>>>           locked(cache_writelock)          A
>>>           waits(master_source_writelock)   B
>>>
>>> Thread 6: locked(master_source_readlock)   B
>>>           waits(cache_readlock)            A
>>>
>>> The AutoFS version is 5.0.5 plus all upstream patches up to
>>> autofs-5.0.5-fix-submount-shutdown-wait.patch plus
>>>
>>>   autofs-5.0.5-fix-out-of-order-locking-in-readmap.patch
>>>   autofs-5.0.5-fix-next-task-list-update.patch
>>>   autofs-5.0.5-fix-stale-map-read.patch
>>>
>>> At first sight, moving pthread_cleanup_pop(1) (and consequently
>>> master_mutex_unlock()) to the end of do_readmap() could avoid the
>>> problem, but I didn't test this yet (please see untested patch below).
>>>
>>> Any insight would be much appreciated!
>>>
>>> Thanks,
>>> Leonardo
>>>
>>> Index: autofs/daemon/state.c
>>> ===================================================================
>>> --- autofs.orig/daemon/state.c
>>> +++ autofs/daemon/state.c
>>> @@ -463,7 +463,6 @@ static void *do_readmap(void *arg)
>>>  	status = lookup_nss_read_map(ap, NULL, now);
>>>  	if (!status)
>>>  		pthread_exit(NULL);
>>> -	pthread_cleanup_pop(1);
>>>  
>>>  	if (ap->type == LKP_INDIRECT) {
>>>  		lookup_prune_cache(ap, now);
>>> @@ -504,6 +503,7 @@ static void *do_readmap(void *arg)
>>>  	}
>>>  
>>>  	pthread_cleanup_pop(1);
>>> +	pthread_cleanup_pop(1);
>>>  
>>>  	return NULL;
>>>  }
>>
>> This might be all we need, since once the master map is read the null
>> cache is set up and can't change while we hold the read lock.
>>
>> autofs-5.0.5 - fix null cache deadlock
>>
>> From: Ian Kent ik...@redhat.com
>>
>> ---
>>  daemon/state.c |    4 ++--
>>  1 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/daemon/state.c b/daemon/state.c
>> index 3645440..51809a1 100644
>> --- a/daemon/state.c
>> +++ b/daemon/state.c
>> @@ -473,11 +473,11 @@ static void *do_readmap(void *arg)
>>  
>>  	mnts = tree_make_mnt_tree(_PROC_MOUNTS, "/");
>>  	pthread_cleanup_push(tree_mnts_cleanup, mnts);
>> -	pthread_cleanup_push(master_source_lock_cleanup, ap->entry);
>> -	master_source_readlock(ap->entry);
>>  	nc = ap->entry->master->nc;
>>  	cache_readlock(nc);
>>  	pthread_cleanup_push(cache_lock_cleanup, nc);
>> +	master_source_readlock(ap->entry);
>> +	pthread_cleanup_push(master_source_lock_cleanup, ap->entry);
>>  	map = ap->entry->maps;
>>  	while (map) {
>>  		/* Is map source up to date or no longer valid */
>
> Ian,
>
> Thanks for the patch! It really seems to be enough to resolve this
> issue: the problem didn't happen again after ~10 days of testing.

Great, I'll add a description to the patch and commit it.

Ian

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Bug in autofs4_d_automount()?
On Sat, 2011-06-18 at 00:29 +0100, David Howells wrote:
> Hi Ian,
>
> At the top of autofs4_d_automount() you have:
>
>	/* The daemon never triggers a mount. */
>	if (autofs4_oz_mode(sbi))
>		return NULL;
>
> I think this should be returning -EISDIR. If by some chance we do get
> here in Oz mode, this will cause the kernel to just loop forever. A
> return of NULL is meant to indicate that you got a collision and that it
> should recheck the mountpoint - but it does not advance path in
> follow_managed().

I think you're mistaken this time.

That was done to work around the fact that -EISDIR can't be returned in
the case where the dentry doesn't have (and won't have) an explicit mount
on it without entering an ELOOP situation. IIRC I needed to return NULL
here and handle it in ->d_manage() to make this work.

> -EISDIR is the return to indicate this is to be treated as a normal
> directory.

That would be the better way to do it but the logic doesn't allow it ATM.

> David

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] [PATCH] AUTOFS4: Fix the return from autofs4_d_automount() and simplify autofs4_d_manage()
On Sat, 2011-06-18 at 01:11 +0100, David Howells wrote:
> autofs4_d_automount() returns 0 if it detects that the calling process
> is in Oz mode (ie. it's the autofs userspace daemon). This return,
> however, is meant to indicate to follow_automount() that the caller
> should retry the check on the current path point.
>
> In the Oz mode case, this is a bad idea because nothing has changed on
> the path, and follow_managed() will just repeat until follow_automount()
> hits the total_link_count limit and returns -ELOOP.
>
> What it should do is return -EISDIR to indicate to the callers that
> actually it wants the daemon to see this directory as an ordinary
> directory.

No, not unless the check below can be changed to somehow not trigger when
LOOKUP_CONTINUE is set for autofs automount dentries that don't actually
end up with a mount on them, but are automount triggers nevertheless:

	if (PTR_ERR(mnt) == -EISDIR && (flags & LOOKUP_CONTINUE))
		return -EREMOTE;
	return PTR_ERR(mnt);

Fact is, in our discussions on this we could never reach agreement, so I
worked around it by breaking out of the follow_managed() loop using
->d_manage() instead, thinking that the above check must remain to
satisfy the needs of other kernel users.

> Now, given that change outlined above, it is then unnecessary for
> autofs4_d_manage() to return -EISDIR if the current path point is not a
> mountpoint. If it returns 0 instead, and the path point isn't a
> mountpoint, then follow_managed() will skip the attempt to transit to
> the mounted filesystem and proceed to call autofs4_d_automount(), which
> will return -EISDIR.
>
> Signed-off-by: David Howells dhowe...@redhat.com
> ---
>  fs/autofs4/root.c |    9 ++-------
>  1 files changed, 2 insertions(+), 7 deletions(-)
>
> diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
> index f55ae23..a6dc11c 100644
> --- a/fs/autofs4/root.c
> +++ b/fs/autofs4/root.c
> @@ -334,7 +334,7 @@ static struct vfsmount *autofs4_d_automount(struct path *path)
>  
>  	/* The daemon never triggers a mount. */
>  	if (autofs4_oz_mode(sbi))
> -		return NULL;
> +		return ERR_PTR(-EISDIR);
>  
>  	/*
>  	 * If an expire request is pending everyone must wait.
> @@ -435,13 +435,8 @@ int autofs4_d_manage(struct dentry *dentry, bool rcu_walk)
>  		dentry, dentry->d_name.len, dentry->d_name.name);
>  
>  	/* The daemon never waits. */
> -	if (autofs4_oz_mode(sbi)) {
> -		if (rcu_walk)
> -			return 0;
> -		if (!d_mountpoint(dentry))
> -			return -EISDIR;
> +	if (autofs4_oz_mode(sbi))
>  		return 0;
> -	}
>  
>  	/* We need to sleep, so we need pathwalk to be in ref-mode */
>  	if (rcu_walk)

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Odd NIS map failure with mount point creation time in the future?
On Fri, 2011-06-17 at 15:12 +0100, James Pearson wrote:
> I'm using CentOS 5 on a large number of boxes with a NIS indirect
> automount map. I've been using the following syntax in /etc/auto.master:
>
>   /mntpoint yp:custom.map
>
> And this has worked fine for ages.
>
> Recently, I wanted to provide some custom local overrides to mount
> points in the NIS map, so I've changed /etc/auto.master to be:
>
>   /mntpoint custom.map
>
> and created a file called /etc/custom.map which contains something like:
>
>   host1 host:/disk1
>   +custom.map
>
> i.e. include the NIS map after any local mount point settings.
>
> On most machines, this works fine - but on a number of machines, after a
> reboot, the mounts from the NIS map fail to mount - although the other
> mounts from /etc/custom.map mount fine.
>
> One thing I noticed in common on all the machines with this problem is
> that the datestamp on the automount mount point (/mntpoint) was in the
> future - by an hour or two. I guess the hardware clock is an hour or two
> ahead of the real time. ntp runs on all these boxes - but starts after
> autofs - however if I change ntp to start up before autofs, then autofs
> works fine after a reboot ...
>
> Any idea why autofs fails to read entries from the included NIS map when
> the creation date of the map mount point is in the future? - but works
> fine when the same NIS map is referenced directly from /etc/auto.master?

Don't know about the timestamp but there were some included map fixes in
RHEL-5.7, at least one was a fix for a regression.

Versions?

> James Pearson

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] nesting automount maps in ldap
On Fri, 2011-06-17 at 10:40 -0400, Jimmy Dorff wrote: Hello, I'm attempting to migrate an existing (and working) nis automount system to ldap. We have several layers of nested maps and I'm attempting to recreate that in LDAP. # Automount master for /foo dn: cn=/foo, ou=auto.master,dc=phy,dc=duke,dc=edu objectClass: automount cn: /foo automountInformation: ldap ldapserver:ou=auto.phy,dc=phy,dc=duke,dc=edu dn: ou=auto.phy, dc=phy,dc=duke,dc=edu objectClass: top objectClass: automountMap ou: auto.phy # mounting /foo/web works great! dn: cn=web, ou=auto.phy, dc=phy,dc=duke,dc=edu objectClass: automount cn: web automountInformation: -fstype=nfs nfsserver01:/srv/httpd # here is my attempt at nesting another level dn: cn=project, ou=auto.phy,dc=phy,dc=duke,dc=edu objectClass: automount cn: project automountInformation: ldap ldapserver:ou=auto.project,dc=phy,dc=duke,dc=edu I'm not sure about using this older syntax. Even if it is correct and we find a bug with it I'd be inclined to recommend using the newer syntax if possible. That would be (IIRC): automountInformation: ldap://ldapserver/ou=auto.project,dc=phy,dc=duke,dc=edu # This fails. I get automount: failed to mount /foo/project dn: ou=auto.project, dc=phy,dc=duke,dc=edu objectClass: top objectClass: automountMap ou: auto.project # this never shows up (ghosting enabled) dn: cn=linux, ou=auto.project, dc=phy,dc=duke,dc=edu objectClass: automount cn: linux automountInformation: -fstype=nfs nfsserver02:/srv/linux I can't see to find an example of nest ldap maps. I'm using autofs-5.0.5-31.el6 if that makes any difference. Log a RHEL bug (you probably should go via support actually) and include a full debug log. That is set LOGGING=debug in /etc/sysconfig/autofs and ensure that daemon.* output is being captured by syslog. Thanks! 
Jimmy Dorff ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] nesting automount maps in ldap
On Fri, 2011-06-17 at 16:40 -0400, Jimmy Dorff wrote:
> On 06/17/2011 10:40 AM, Jimmy Dorff wrote:
>> # here is my attempt at nesting another level
>> dn: cn=project, ou=auto.phy,dc=phy,dc=duke,dc=edu
>> objectClass: automount
>> cn: project
>> automountInformation: ldap ldapserver:ou=auto.project,dc=phy,dc=duke,dc=edu
>
> My problem was the lack of -fstype=autofs.

Oh .. yeah, that may well be it, but that should have been needed in the
original map.

Maybe a look at a debug log would be useful and that should show pretty
quickly if it is just the missing fstype option.

> This works:
>
> dn: cn=project,ou=auto.phy,dc=phy,dc=duke,dc=edu
> objectClass: automount
> cn: project
> automountInformation: -fstype=autofs ou=auto.project,dc=phy,dc=duke,dc=edu
>
> I also think this is called layering rather than nesting.. but I'm not
> sure.

Maybe, but the terminology I've always used for these is submount, as in
a sub autofs mount.

Ian

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Odd NIS map failure with mount point creation time in the future?
On Sat, 2011-06-18 at 21:08 +0100, James Pearson wrote:
> James Pearson wrote:
>> Sorry, I should have said - this is CentOS 5.5 with autofs
>> 5.0.1-0.rc2.143.el5_5.6 - and I've also tried it with
>> 5.0.1-0.rc2.143.el5_6.2 - with the same result.
>>
>> It is easy to reproduce - with an included NIS map - if I do:
>>
>>   /etc/init.d/autofs stop; date -s "1 minute"; /etc/init.d/autofs start; date -s "-1 minute"
>>
>> This creates the automount mount point 1 minute into the future. Then
>> if I try to access a server defined in the NIS map, the mount fails.
>> However, if I wait a minute (i.e. until the system clock passes the
>> date stamp of the mount point), the automount of file systems in the
>> NIS map works fine.
>>
>> Is it possible to get a copy of the autofs RPM for RHEL-5.7 to test?
>
> I had a look at the RHEL5 5.0.1-0.rc2.143.el5_6.2 source - and the
> following patch appears to 'fix' my problem - it's a bit of a hack - as
> in the included NIS map case, it just resets the age of the map to
> 'now', if it is in the future.
>
> James Pearson
>
> plain text document attachment (autofs-5.0.1-future.patch)
> --- ./daemon/lookup.c.mpc	2011-06-18 20:04:51.076507000 +0100
> +++ ./daemon/lookup.c	2011-06-18 20:52:18.436154033 +0100
> @@ -848,6 +848,12 @@ int lookup_nss_mount(struct autofs_point
>  	struct map_source *map;
>  	enum nsswitch_status status;
>  	int result = 0;
> +	time_t now = time(NULL);
> +
> +	if (entry->age > now) {
> +		debug(ap->logopt, "map %s age in the future - changing it to now", entry->path);
> +		entry->age = now;
> +	}
>  
>  	/*
>  	 * For each map source (ie. each entry for the mount

That sounds a bit like bug
https://bugzilla.redhat.com/show_bug.cgi?id=632471

It was fixed in development revision 152, which I happen to have on
people.redhat.com. You could give that a try.

http://people.redhat.com/~ikent/autofs-5.0.1-0.rc2.152.el5/

Keep in mind that there were a few other fixes that went into the final
release version so your mileage may vary.

Ian

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
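The workaround in James's patch boils down to clamping a recorded map age so that it never lies ahead of the current time. A minimal standalone sketch of that idea; clamp_map_age() is a hypothetical helper written for illustration and is not part of autofs:

```c
#include <assert.h>
#include <time.h>

/* Hypothetical helper: return an age that is never later than "now".
 * A map read while the clock was ahead would otherwise look newer than
 * anything the daemon compares it against until the clock catches up. */
time_t clamp_map_age(time_t age, time_t now)
{
    /* time(2) reports failure as (time_t)-1; callers should not pass it. */
    assert(now != (time_t)-1);

    return (age > now) ? now : age;
}
```

A caller would use it as `entry_age = clamp_map_age(entry_age, time(NULL));`. As James notes, this hides the symptom rather than explaining why the future timestamp breaks the included map lookup.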
Re: [autofs] [PATCH] Remove master_mutex_unlock() leftover
On Mon, 2011-06-13 at 18:39 -0300, Leonardo Chiquitto wrote:
> Hello Ian, list
>
> While trying to make some progress on the AB/BA locking issue, I think
> I've found a bug introduced by commit dc0c3734.

Committed, pushed and posted on kernel.org.
Thanks Leonard.

> Thanks,
> Leonardo
>
> Remove master_mutex_unlock() leftover
>
> Commit dc0c3734 (fix out of order locking in readmap) removes the calls
> to master_mutex_lock() and master_mutex_unlock() from
> master_find_mapent(), but leaves an unlock behind.
>
> Signed-off-by: Leonardo Chiquitto lchiqui...@novell.com
> ---
>  lib/master.c |    4 +---
>  1 file changed, 1 insertion(+), 3 deletions(-)
>
> Index: autofs/lib/master.c
> ===================================================================
> --- autofs.orig/lib/master.c
> +++ autofs/lib/master.c
> @@ -643,10 +643,8 @@ struct master_mapent *master_find_mapent
>  
>  		entry = list_entry(p, struct master_mapent, list);
>  
> -		if (!strcmp(entry->path, path)) {
> -			master_mutex_unlock();
> +		if (!strcmp(entry->path, path))
>  			return entry;
> -		}
>  	}
>  
>  	return NULL;

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Deadlock in automount caused by AB/BA lock ordering
On Thu, 2011-05-19 at 20:21 -0300, Leonardo Chiquitto wrote:
> Hello,
>
> We received a support request from a customer reporting a hang in the
> automount daemon. Analyzing the core dump, it looks like automount can
> deadlock if two threads execute in the following order:
>
> (All the line numbers are adjusted to match the latest version from Git)
>
> Thread 7 (Thread 25043):                        Thread 6 (Thread 24007):
> #9 start_thread() libpthread.so.0               #3 start_thread() libpthread.so.0
> #8 do_read_master() at automount.c:1259         #2 do_readmap() at state.c:479
> #7 master_read_master() at master.c:853            -- master_mutex_lock() [state.c:462]
>                                                    -- master_mutex_unlock() [state.c:466]
>    -- master_mutex_lock() [master.c:836]
>    -- cache_writelock() [master.c:838 or :849]
>                                                    -- master_source_readlock() [state.c:477]
> #6 lookup_nss_read_master() at lookup.c:229
> #5 do_read_master() at lookup.c:96
> #4 lookup_read_master() at lookup_ldap.c:1676
> #3 master_parse_entry() at master_parse.y:829
> #2 master_add_map_source() at master.c:192
> #1 master_source_writelock() at master.c:543    #1 cache_readlock() at cache.c:60
> #0 pthread_rwlock_wrlock() libpthread.so.0      #0 pthread_rwlock_rdlock() libpthread.so.0
>
> At this point:
>
> Thread 7: locked(master_mutex_lock)
>           locked(cache_writelock)          A
>           waits(master_source_writelock)   B
>
> Thread 6: locked(master_source_readlock)   B
>           waits(cache_readlock)            A
>
> The AutoFS version is 5.0.5 plus all upstream patches up to
> autofs-5.0.5-fix-submount-shutdown-wait.patch plus
>
>   autofs-5.0.5-fix-out-of-order-locking-in-readmap.patch
>   autofs-5.0.5-fix-next-task-list-update.patch
>   autofs-5.0.5-fix-stale-map-read.patch
>
> At first sight, moving pthread_cleanup_pop(1) (and consequently
> master_mutex_unlock()) to the end of do_readmap() could avoid the
> problem, but I didn't test this yet (please see untested patch below).
>
> Any insight would be much appreciated!
>
> Thanks,
> Leonardo
>
> Index: autofs/daemon/state.c
> ===================================================================
> --- autofs.orig/daemon/state.c
> +++ autofs/daemon/state.c
> @@ -463,7 +463,6 @@ static void *do_readmap(void *arg)
>  	status = lookup_nss_read_map(ap, NULL, now);
>  	if (!status)
>  		pthread_exit(NULL);
> -	pthread_cleanup_pop(1);
>  
>  	if (ap->type == LKP_INDIRECT) {
>  		lookup_prune_cache(ap, now);
> @@ -504,6 +503,7 @@ static void *do_readmap(void *arg)
>  	}
>  
>  	pthread_cleanup_pop(1);
> +	pthread_cleanup_pop(1);
>  
>  	return NULL;
>  }

This might be all we need, since once the master map is read the null
cache is set up and can't change while we hold the read lock.

autofs-5.0.5 - fix null cache deadlock

From: Ian Kent ik...@redhat.com

---
 daemon/state.c |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/daemon/state.c b/daemon/state.c
index 3645440..51809a1 100644
--- a/daemon/state.c
+++ b/daemon/state.c
@@ -473,11 +473,11 @@ static void *do_readmap(void *arg)
 
 	mnts = tree_make_mnt_tree(_PROC_MOUNTS, "/");
 	pthread_cleanup_push(tree_mnts_cleanup, mnts);
-	pthread_cleanup_push(master_source_lock_cleanup, ap->entry);
-	master_source_readlock(ap->entry);
 	nc = ap->entry->master->nc;
 	cache_readlock(nc);
 	pthread_cleanup_push(cache_lock_cleanup, nc);
+	master_source_readlock(ap->entry);
+	pthread_cleanup_push(master_source_lock_cleanup, ap->entry);
 	map = ap->entry->maps;
 	while (map) {
 		/* Is map source up to date or no longer valid */

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] failed to mount offset
On Wed, 2011-06-08 at 09:43 -0400, Greg Wooledge wrote: On Wed, Jun 08, 2011 at 09:17:35PM +0800, Ian Kent wrote: On Wed, 2011-06-08 at 05:06 -0500, Mag Gam wrote: Trying to mount a newly created volume on a fileserver (appliance) and nis. Able to see the volume using showmount and able to mount so, I don't believe its a permission problem. I'm guessing you are using the hosts map? You will get that until the mount tree under /net/appliance expires away so the exported entries can be updated. The exported entries can't be sanely updated while the mount tree under the server directory is being used since export list can be hierarchical and so can have order of mount/umount dependencies. When you add a new file system (or a new export at least) to an NFS server that is being accessed through the hosts map, there is no way to tell autofs on the clients to re-read the list of exports. As Ian says, it can't be updated while autofs is still running. The only workaround is to reboot the client systems. Sorry. It affects us too. Or get the mount to expire away, which, as you observe is hard to do on a busy system. I've been thinking about this for a while now as I do need to improve the situation. I should be able to check for dependent mount sub-trees and avoid updating only those until they aren't in use, since they should be handled as sub-trees (for both mounting and expiring) at points in the tree that introduce dependencies. But I suspect the sub-tree handling code doesn't actually work how I originally wanted it to, so that will also make it harder. Consequently, it's going to be fairly difficult to implement so I won't start working on it until I have a clearer picture of how I'll do it. And these the failed to mount offset messages sound like they need work as well. Ian ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] failed to mount offset
On Wed, 2011-06-08 at 05:06 -0500, Mag Gam wrote:
> Hello
>
> Trying to mount a newly created volume on a fileserver (appliance) and
> NIS. Able to see the volume using showmount and able to mount it, so I
> don't believe it's a permission problem.
>
> The error I get is,
>
>   -zsh: cd: /net/appliance/newvol No such file or directory

I'm guessing you are using the hosts map?

You will get that until the mount tree under /net/appliance expires away
so the exported entries can be updated.

The exported entries can't be sanely updated while the mount tree under
the server directory is being used, since the export list can be
hierarchical and so can have order of mount/umount dependencies.

> kernel version: 2.6.18-238.el5
> autofs version: autofs-5.0.1-0.rc2.143.el5_5.6.x86_64
> Running RHEL 5.6 (Enterprise edition)
>
> Autofs is in debugging mode but does not show up anything.
>
> After I restart autofs many times I only see this message in syslog
> (debugging enabled):
>
>   mount_multi_triggers: mount offset /net/appliancev/newvol at /net/appliance failed to mount offset

At this point autofs is probably completely confused as it tries to
reconstruct the tree of mounts under /net/appliance.

That message is usually an unpleasant sign that things are messed up.
About the only thing that is worth doing is a postmortem analysis to see
if there is anything that can be changed to make automount more tolerant.

Ian

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Returning issues.
On Tue, 2011-06-07 at 11:15 +0200, Stef Bon wrote: On 06/07/2011 07:23 AM, Ian Kent wrote: On Sun, 2011-06-05 at 19:01 +0200, Stef Bon wrote: hi, I see that issues that automounter doing unnecessary mounting like: http://linux.kernel.org/pipermail/autofs/2011-May/006568.html I think you misunderstood the report here. Neither of the points below were related to it. and here http://linux.kernel.org/pipermail/autofs/2011-April/006542.html some more documentation here would prevent users asking the same questions. This behaviour is mostly caused by: A. ls is a alias (in de shell) to ls --color=auto which will follow targets, and this makes the automounter mount. This mostly isn't a problem these days as most user space utilities behave better with automount these days. We will need to wait and see how the user space utilities behave with the new vfs automount kernel infrastructure and work out what needs to be done as we go. I hope you're right. How do apps know a directory is a autofs managed mountpoint? B. extended attributes called by the environment, or an app. I see here a sollution is proposed to fix this in coreutils and the acl package. Hmm, is this right? Yes, this was related to inconsistent use of system calls within the user space utility, an lgetxattr(2) call was being made instead of a getxattr(2) call (inconsistent with respect to previous calls in the same utility code path), which was causing a mount to occur on the follow, as required by the lgetxattr(2) call. So, in this case, it needed to be fixed in the utility. You're mixing things here I think. A lgetxattr will normally not follow a symlink, as the call getxattr does. Isn't it the other way around? (compare stat and lstat: lstat looks at the symlink self, stat follows the symlink and takes the target) Oh ... of course you are right and I think I'll need to re-visit the report for case B. 
Don't forget that we are talking about directories that have follow link semantics and that is the source of the semantic behavior. Also we are talking about automount directories themselves not the thing that may be mounted on top of them. Once mounted upon, the autofs directory should be followed when looking up a path unless it is in the process of expiring or has an in progress mount occurring. Lets try that again with the test in the kernel and referring to your problem with stat in the previous mail. if (!(flags LOOKUP_FOLLOW) !(flags (LOOKUP_CONTINUE | LOOKUP_DIRECTORY | LOOKUP_OPEN | LOOKUP_CREATE))) return -EISDIR; If we aren't using an l variant (such as stat(2)) LOOKUP_FOLLOW will be set so this test won't prevent a mount from being attempted. The EISDIR is a little misleading. It is used internally by the kernel automount code to say don't automount just use this directory. The comment: /* We want to mount if someone is trying to open/create a file of any * type under the mountpoint, wants to traverse through the mountpoint * or wants to open the mounted directory. snip ... * We don't want to mount if someone's just doing a stat and they've * set AT_SYMLINK_NOFOLLOW - unless they're stat'ing a directory and * appended a '/' to the name. essentially says, don't mount for l variants of system calls that are trying to get information unless a / is appended to the end of the path. That sounds like it is the opposite to what you observed and the opposite of what used to happen. Previously the autofs module did not distinguish between LOOKUP_FOLLOW being set or not and would always not perform a mount. So, once again we have the age old problem of what to do to prevent mount storms without compromising the semantics of system calls. But we don't always see the mount storms occurring nowadays so a lot has changed in user space, which is good. I did miss this during development though, *sigh*. 
Note that LOOKUP_CONTINUE is set if the path component is not the last component in the path so a mount will always be attempted in that case. Anyhow, when a userspace utility does something like that, it's a serious error. Utilities should do exactly what is requested, and never anything else. | autofs directories don't have extended attributes. You mean only the autofs managed mount points? The contents of the share can have Xattr right? Sure, yes, autofs must get out of the road once it has a mount on it. Can't an autofs managed directory have Xattr?? That does not sound right. No, the autofs fs doesn't support extended attributes. Do you really think we need extended attributes? If you do then a patch, which includes some reasoning of why we need them, would be welcomed. No, it suprise me only, since the contents of an autofs managed mountpoint can have Xatttr (see above) and I consider
Re: [autofs] bug doing simple stat call to mountpoint
On Mon, 2011-06-06 at 09:07 +0200, Stef Bon wrote:
> On 06/05/2011 11:37 PM, Stef Bon wrote:
>> On 06/05/2011 06:44 PM, Stef Bon wrote:
>>> Hi,
>>>
>>> I'm using autofs 5.0.5, kernel 2.6.38.6.
>>>
>>> In the construction I'm working on I got the following error:
>>>
>>>   stat /mnt/mount.md5key/sbon/0/0a8805e89f4cc6653185a4ec4335cca1
>>>   stat: cannot stat `/mnt/mount.md5key/sbon/0/0a8805e89f4cc6653185a4ec4335cca1': No such file or directory
>>>
>>>   sbon [ /tmp/mount.md5key/users/sbon/cache/0a8805e89f4cc6653185a4ec4335cca1 ]$ ls /mnt/mount.md5key/sbon/0/0a8805e89f4cc6653185a4ec4335cca1
>>>
>>> gives contents as expected
>>
>> I've tried linux 2.6.39.1, and same behaviour. I'll try an older kernel
>> (2.6.37)
>
> Oef I have found the cause of this. I had to add a trailing slash, like:
>
>   stat /mnt/mount.md5key/sbon/0/0a8805e89f4cc6653185a4ec4335cca/
>
> this makes the automount mount.

That's right.

There are significant changes to the kernel automount infrastructure from
2.6.38, most notably what you saw here. Be aware that 2.6.38 has some
problems and 2.6.39 or later should be used if at all possible.

Hopefully there won't be too many problems resulting from the changes.

For information, the comment in the VFS function
fs/namei.c:follow_automount() is:

	/* We want to mount if someone is trying to open/create a file of any
	 * type under the mountpoint, wants to traverse through the mountpoint
	 * or wants to open the mounted directory.
	 *
	 * We don't want to mount if someone's just doing a stat and they've
	 * set AT_SYMLINK_NOFOLLOW - unless they're stat'ing a directory and
	 * appended a '/' to the name.
	 *
	 * An exception to this is autofs. It needs to tell lies on stat(2)
	 * to prevent mount storms for things like color ls so it can set a
	 * dentry flag to provide for this.
	 */

Although I think the last paragraph is actually not correct, since there
is no distinction between autofs and other execution threads in the test
that follows the comment.

Ian

_______________________________________________
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
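The two cases described in the follow_automount() comment can be expressed from user space roughly as follows. This is a sketch only: both function names are hypothetical, and whether a mount is actually triggered depends on the kernel version (the thread concerns 2.6.38 and 2.6.39) and on autofs being mounted at the path, neither of which the code can check.

```c
#define _POSIX_C_SOURCE 200809L

#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>

/* lstat-style query: AT_SYMLINK_NOFOLLOW is set, so per the comment
 * quoted above the kernel is not expected to trigger an automount. */
int stat_without_trigger(const char *path, struct stat *st)
{
    assert(path != NULL && st != NULL);
    return fstatat(AT_FDCWD, path, st, AT_SYMLINK_NOFOLLOW);
}

/* Appending '/' asks for the directory itself, which is the case the
 * comment lists as wanting the mount, as seen in the report above. */
int stat_with_trigger(const char *path, struct stat *st)
{
    char buf[4096];
    int n;

    assert(path != NULL && st != NULL);
    n = snprintf(buf, sizeof(buf), "%s/", path);
    if (n < 0 || (size_t)n >= sizeof(buf))
        return -1;          /* path too long for this sketch */
    return stat(buf, st);
}
```

On a path that is not an automount point the two calls behave identically apart from symlink handling, so the difference only shows up on an autofs-managed directory.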
Re: [autofs] Returning issues.
On Sun, 2011-06-05 at 19:01 +0200, Stef Bon wrote:
> hi,
>
> I see issues where the automounter is doing unnecessary mounting, like:
>
> http://linux.kernel.org/pipermail/autofs/2011-May/006568.html

I think you misunderstood the report here. Neither of the points below was related to it.

> and here
>
> http://linux.kernel.org/pipermail/autofs/2011-April/006542.html
>
> Some more documentation here would prevent users asking the same questions.
>
> This behaviour is mostly caused by:
>
> A. ls is an alias (in the shell) to ls --color=auto which will follow targets, and this makes the automounter mount.

This mostly isn't a problem these days as most user space utilities now behave better with automount. We will need to wait and see how the user space utilities behave with the new vfs automount kernel infrastructure and work out what needs to be done as we go.

> B. extended attributes called by the environment, or an app. I see here a solution is proposed to fix this in coreutils and the acl package. Hmm, is this right?

Yes, this was related to inconsistent use of system calls within the user space utility, an lgetxattr(2) call was being made instead of a getxattr(2) call (inconsistent with respect to previous calls in the same utility code path), which was causing a mount to occur on the follow, as required by the lgetxattr(2) call. So, in this case, it needed to be fixed in the utility.

We may see more of these over the coming months, we will need to wait and see and work out what we need to do on a case by case basis.

> If an app wants to get the xattr of an autofs managed directory, what is the solution proposed here then?

autofs directories don't have extended attributes.

> Can't an autofs managed directory have Xattr?? That does not sound right.

No, the autofs fs doesn't support extended attributes. Do you really think we need extended attributes? If you do then a patch, which includes some reasoning of why we need them, would be welcomed.

> In my construction I'm blocking the xattr calls to autofs managed dirs, unless it's already mounted, and I consider this as an ugly hack.

How?

Since the autofs fs inode operations do not define a getxattr operation these calls always return -EOPNOTSUPP. So they don't cause a callback to user space. User space should handle the EOPNOTSUPP return since extended attributes may not be available for a file system?

OTOH the lgetxattr(2) call requires the kernel follow the mount point (by definition) and so a mount should be attempted and the call made upon the mounted file system if it succeeds.

The bottom line is that the kernel has changed quite significantly underneath you, sorry. Although the behavior should be substantially the same it isn't exactly the same and we will need to work out if we need to change anything and what we need to change. Keep in mind that we need to try and adhere to normal system call semantics where we can, if at all possible.

Ian
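To illustrate the point about user space handling the EOPNOTSUPP return, here is a minimal sketch, assuming a Linux system with <sys/xattr.h>. getxattr(2) is the real system call; the helper names xattr_errno_is_benign() and xattr_size_or_none() are hypothetical and are not taken from coreutils or the acl package.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/xattr.h>

/*
 * Hypothetical helper: does this errno value from an xattr call only
 * mean "there is nothing to read here"?  ENOTSUP/EOPNOTSUPP is what a
 * file system without extended attribute support returns; ENODATA
 * means the named attribute does not exist.
 */
static int xattr_errno_is_benign(int err)
{
	return err == ENOTSUP || err == EOPNOTSUPP || err == ENODATA;
}

/*
 * Hypothetical helper: return the size of the named attribute, 0 if
 * the file system has no xattr support or the attribute is absent,
 * and -1 (with errno set) on any other error.
 */
static ssize_t xattr_size_or_none(const char *path, const char *name)
{
	ssize_t len;

	assert(path != NULL && name != NULL);

	/* A zero-sized buffer asks only for the attribute's length. */
	len = getxattr(path, name, NULL, 0);
	if (len >= 0)
		return len;
	if (xattr_errno_is_benign(errno))
		return 0;
	return -1;
}
```

A caller that treats a 0 return as "no attributes" never needs to special-case autofs managed directories; only a -1 return indicates a real failure such as ENOENT or EACCES.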
Re: [autofs] [PATCH] FedFS file system client support
On Tue, 2011-05-31 at 11:14 -0400, Chuck Lever wrote:
> On May 30, 2011, at 12:15 AM, Ian Kent wrote:
> > On Thu, 2011-05-26 at 12:39 -0400, Chuck Lever wrote:
> > > Hi-
> > >
> > > The next release of fedfs-utils will provide all necessary components for a Linux NFS client to participate in a FedFS domain as a file system client. This new utility is intended to be a part of the upcoming release.
> > >
> > > I'm interested in comments on this approach. Ian had suggested a new lookup module, but a program map looked simpler to prototype and has the great advantage of no dependencies between fedfs-utils and autofs. If program maps have some nasty problem that require the use of a lookup module, we can take that next step.
> >
> > The only difficulty with using a program map is that to use it you need to know the name of the key (aka. /nfs4/key) since, at the moment, autofs can't enumerate program map keys.
>
> If I understand your comment correctly, the problem is how the client discovers FedFS domain names to use as keys.
>
> As I understand it, FedFS does not currently have a mechanism for clients to discover FedFS domain names, like, say, AFS did with CellServDB. Also, FedFS file system clients don't belong to a particular domain, so they don't have any idea what might be their local domain. Any client can access any domain; security is provided by the underlying file system protocols. The domains are just a way to organize the name spaces, without any regard to security administrative domains.
>
> Thus, right now users and applications have to know a priori the whole pathname (or, Globally Useful Name, in FedFS parlance) in order to access a file resource via FedFS.
>
> Are you suggesting there is a better design we could adopt that might prepare FedFS for a time when there is a mechanism for discovering FedFS domain names?

Only in so much as, to do this we would need to use a lookup module or add autofs support for program maps to enumerate their keys and that only makes sense if there is a way to enumerate domains, perhaps by pre-configuring names.

Adding the ability for program maps to enumerate their keys has been discussed before. Although it hasn't been done yet, using a NULL key to ask the program for a list of its known keys should be straightforward. Of course that assumes that the FedFS program map will allow the use of a local list of domains to prime its cache and that autofs will function ok for existing and not yet existing keys (it should, if not I need to fix it). The other catch is that we don't know if others using program maps handle the case of a NULL key since they don't get them now. From memory, a SEGV in a program map doesn't kill autofs and the request just fails, as it should, but I'd need to check that.

To allow the use of external lookup modules I would need to formalize and document the lookup module interface and provide a means to configure a path to look in for modules or a way to specify an explicit path to an external lookup module. I think that will require a fair amount of change to separate the internal autofs bits for modules from the internals independent bits. I've been thinking about that for another request. I'm still not sure about it since we have had some difficult problems relating to the order of module dependent shared libraries being closed (or being released anyway) and this likely would open that can of worms.

Mmm ...
Re: [autofs] [PATCH] FedFS file system client support
On Thu, 2011-05-26 at 12:39 -0400, Chuck Lever wrote:
> Hi-
>
> The next release of fedfs-utils will provide all necessary components for a Linux NFS client to participate in a FedFS domain as a file system client. This new utility is intended to be a part of the upcoming release.
>
> I'm interested in comments on this approach. Ian had suggested a new lookup module, but a program map looked simpler to prototype and has the great advantage of no dependencies between fedfs-utils and autofs. If program maps have some nasty problem that require the use of a lookup module, we can take that next step.

The only difficulty with using a program map is that to use it you need to know the name of the key (aka. /nfs4/key) since, at the moment, autofs can't enumerate program map keys.

Ian
Re: [autofs] BUG() in shrink_dcache_for_umount_subtree on nfs4 mount
On Thu, 2011-05-26 at 09:49 -0400, Jeff Layton wrote:
> On Wed, 25 May 2011 16:08:15 -0400 Jeff Layton jlay...@redhat.com wrote:
> > On Wed, 27 Apr 2011 16:23:07 -0700 Mark Moseley moseleym...@gmail.com wrote:
> > > I posted this to bugzilla a while back but I figured I'd paste it here too:
> > >
> > > -
> > > I've been getting bit by the exact same bug and been bisecting for the past couple of weeks. It's slow going as it can sometimes take a day for the BUG() to show up (though can also at times take 10 minutes). And I've also seen it more than once where something was good after a day and then BUG()'d later on, just to make things more complicated. So the upshot is that while I feel confident enough about this latest batch of bisecting to post it here, I wouldn't bet my life on it. I hope this isn't a case where bisecting just shows where the bug gets exposed but not where it actually got planted :)
> > >
> > > Incidentally, I tried the patch from the top of this thread and it didn't seem to make a difference. I still got bit.
> > >
> > > I've posted on the linux-fsdevel thread that Jeff Layton started about it, http://www.spinics.net/lists/linux-nfs/msg20280.html if you need more details on my setup (though I'll be happy to provide anything else you need). Though in that thread you'll see that I'm not using autofs explicitly, the Netapp GX cluster NFS appears to use autofs to do the implicit submounts (I'm not 100% sure that's the correct terminology, so hopefully you know what I mean).
> > >
> > > Here's my bisect log, ending up at commit e61da20a50d21725ff27571a6dff9468e4fb7146
> > >
> > > git bisect start 'fs'
> > > # good: [3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5] Linux 2.6.37
> > > git bisect good 3c0eee3fe6a3a1c745379547c7e7c904aa64f6d5
> > > # bad: [c56eb8fb6dccb83d9fe62fd4dc00c834de9bc470] Linux 2.6.38-rc1
> > > git bisect bad c56eb8fb6dccb83d9fe62fd4dc00c834de9bc470
> > > # good: [7c955fca3e1d8132982148267d9efcafae849bb6] Merge branch 'for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-udf-2.6
> > > git bisect good 7c955fca3e1d8132982148267d9efcafae849bb6
> > > # good: [c32b0d4b3f19c2f5d29568f8b7b72b61693f1277] fs/mpage.c: consolidate code
> > > git bisect good c32b0d4b3f19c2f5d29568f8b7b72b61693f1277
> > > # bad: [f8206b925fb0eba3a11839419be118b09105d7b1] Merge branch 'for-linus' of git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs-2.6
> > > git bisect bad f8206b925fb0eba3a11839419be118b09105d7b1
> > > # good: [a8f2800b4f7b76cecb7209cb6a7d2b14904fc711] nfsd4: fix callback restarting
> > > git bisect good a8f2800b4f7b76cecb7209cb6a7d2b14904fc711
> > > # bad: [6651149371b842715906311b4631b8489cebf7e8] autofs4: Clean up autofs4_free_ino()
> > > git bisect bad 6651149371b842715906311b4631b8489cebf7e8
> > > # good: [0ad53eeefcbb2620b6a71ffdaad4add20b450b8b] afs: add afs_wq and use it instead of the system workqueue
> > > git bisect good 0ad53eeefcbb2620b6a71ffdaad4add20b450b8b
> > > # good: [01c64feac45cea1317263eabc4f7ee1b240f297f] CIFS: Use d_automount() rather than abusing follow_link()
> > > git bisect good 01c64feac45cea1317263eabc4f7ee1b240f297f
> > > # good: [b5b801779d59165c4ecf1009009109545bd1f642] autofs4: Add d_manage() dentry operation
> > > git bisect good b5b801779d59165c4ecf1009009109545bd1f642
> > > # bad: [e61da20a50d21725ff27571a6dff9468e4fb7146] autofs4: Clean up inode operations
> > > git bisect bad e61da20a50d21725ff27571a6dff9468e4fb7146
> > > # good: [8c13a676d5a56495c350f3141824a5ef6c6b4606] autofs4: Remove unused code
> > > git bisect good 8c13a676d5a56495c350f3141824a5ef6c6b4606
> >
> > I can more or less reproduce this at will now, I think even with very few NFS operations on an automounted nfsv4 mount. Here's an oops from a 2.6.39 kernel:
> >
> > [ 119.419789] tun0: Features changed: 0x4800 -> 0x4000
> > [ 178.242917] FS-Cache: Netfs 'nfs' registered for caching
> > [ 178.269980] SELinux: initialized (dev 0:2c, type nfs4), uses genfs_contexts
> > [ 178.282296] SELinux: initialized (dev 0:2d, type nfs4), uses genfs_contexts
> > [ 523.953284] BUG: Dentry 8801f3084180{i=2,n=} still in use (1) [unmount of nfs4 0:2c]
> > [ 523.953306] ------------[ cut here ]------------
> > [ 523.954013] kernel BUG at fs/dcache.c:925!
> > [ 523.954013] invalid opcode: [#1] SMP
> > [ 523.954013] last sysfs file: /sys/devices/virtual/bdi/0:45/uevent
> > [ 523.954013] CPU 1
> > [ 523.954013] Modules linked in: nfs lockd auth_rpcgss nfs_acl tun fuse ip6table_filter ip6_tables ebtable_nat ebtables sunrpc cachefiles fscache cpufreq_ondemand powernow_k8 freq_table mperf it87 adt7475 hwmon_vid xfs snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel raid1 snd_hda_codec snd_usb_audio snd_usbmidi_lib snd_hwdep snd_seq snd_rawmidi snd_seq_device snd_pcm snd_timer snd uvcvideo ppdev videodev soundcore media
Re: [autofs] Automount /home for all the users when a user try to use /home
On Thu, 2011-05-19 at 09:10 +0200, giggzounet wrote:
> Hi,
>
> I've been using autofs for years and on different configurations. It works great! So thx for your great work.
>
> But I have a little question/problem:
>
> We have a cluster (with CentOS 5.5). It's a little cluster, so the /home is on the master. And this /home is exported to the nodes. I have installed autofs in order to automount this /home when a user wants to use his /home. So at the moment the configuration is simple:
>
> cat auto.master
> /home /etc/auto.home
>
> cat auto.home
> * nfs_oscar:/home/&
>
> So when a user comes autofs mounts only the /home for the user. So at the moment it works, but I find it not optimal, because when 2 users try to connect from a node I get 2 mounts.
>
> I would like to get this behaviour: when a user tries to connect to his /home, autofs mounts all of the exported /home.
>
> Is it possible?

It might be possible to come up with a method to mount all the keys in an indirect map on first access but that sounds like it would be more difficult than it's worth.

Do you really need to mount these separately? Could you use a direct mount map instead?

You could do this by using the following in the master map (possibly /etc/auto.master):

/- /etc/auto.direct.homes

and /etc/auto.direct.homes could contain:

/home oscar:/home

This way there is only one mount, not many, and it will always make all home directories available on first access.

Ian
Re: [autofs] Deadlock in automount caused by AB/BA lock ordering
On Thu, 2011-05-19 at 20:21 -0300, Leonardo Chiquitto wrote:
> Hello,
>
> We received a support request from a customer reporting a hang in the automount daemon. Analyzing the core dump, it looks like automount can deadlock if two threads execute in the following order:

This looks like a rather interesting mistake on my part. It doesn't look simple to fix either, anyway I'm working on it.

> (All the line numbers are adjusted to match the latest version from Git)
>
> Thread 7 (Thread 25043):                         Thread 6 (Thread 24007):
> #9 start_thread() libpthread.so.0                #3 start_thread() libpthread.so.0
> #8 do_read_master() at automount.c:1259          #2 do_readmap() at state.c:479
> #7 master_read_master() at master.c:853             -- master_mutex_lock() [state.c:462]
>                                                     -- master_mutex_unlock() [state.c:466]
>    -- master_mutex_lock() [master.c:836]
>    -- cache_writelock() [master.c:838 or :849]
>                                                     -- master_source_readlock() [state.c:477]
> #6 lookup_nss_read_master() at lookup.c:229
> #5 do_read_master() at lookup.c:96
> #4 lookup_read_master() at lookup_ldap.c:1676
> #3 master_parse_entry() at master_parse.y:829
> #2 master_add_map_source() at master.c:192
> #1 master_source_writelock() at master.c:543     #1 cache_readlock() at cache.c:60
> #0 pthread_rwlock_wrlock() libpthread.so.0       #0 pthread_rwlock_rdlock() libpthread.so.0
>
> At this point:
>
> Thread 7: locked(master_mutex_lock)
>           locked(cache_writelock)          A
>           waits(master_source_writelock)   B
>
> Thread 6: locked(master_source_readlock)   B
>           waits(cache_readlock)            A
>
> The AutoFS version is 5.0.5 plus all upstream patches up to autofs-5.0.5-fix-submount-shutdown-wait.patch plus
>
> autofs-5.0.5-fix-out-of-order-locking-in-readmap.patch
> autofs-5.0.5-fix-next-task-list-update.patch
> autofs-5.0.5-fix-stale-map-read.patch
>
> At first sight, moving pthread_cleanup_pop(1) (and consequently master_mutex_unlock()) to the end of do_readmap() could avoid the problem, but I didn't test this yet (please see untested patch below).
>
> Any insight would be much appreciated!
>
> Thanks,
> Leonardo
>
> Index: autofs/daemon/state.c
> ===
> --- autofs.orig/daemon/state.c
> +++ autofs/daemon/state.c
> @@ -463,7 +463,6 @@ static void *do_readmap(void *arg)
>  	status = lookup_nss_read_map(ap, NULL, now);
>  	if (!status)
>  		pthread_exit(NULL);
> -	pthread_cleanup_pop(1);
>
>  	if (ap->type == LKP_INDIRECT) {
>  		lookup_prune_cache(ap, now);
> @@ -504,6 +503,7 @@ static void *do_readmap(void *arg)
>  	}
>
>  	pthread_cleanup_pop(1);
> +	pthread_cleanup_pop(1);
>
>  	return NULL;
>  }
Re: [autofs] Question/request about host mapping
On Thu, 2011-04-07 at 15:29 -0600, Michael Coffman wrote:
> On Thu, Apr 7, 2011 at 8:56 AM, Ian Kent ra...@themaw.net wrote:
> > On Wed, 2011-04-06 at 12:52 -0600, Michael Coffman wrote:
> > > Hello,
> > >
> > > I am trying to use the -host map for global access to servers in my environment. I have noted on this list that after the first mount of a server via host mapping, the server is never again probed for mount points. This seems like a major flaw in the host map functionality. This is a problem when adding new file systems to our NFS servers. It means systems have to be completely idled in order to get them to unmount and re-mount the servers to see the new mount points (not possible very often in my environment).
> > >
> > > Is the assumption that if you use host mapping, your server will either never change, or you have the ability to 'reboot' your environment when a new file system is added to a server?
> >
> > Neither.
> >
> > The problem is that exports can be nested and frequently are (think of the nohide export option).
> >
> > For example, with exports like:
> >
> > /export/vol1/data1
> > /export/vol1/data2
> >
> > if you then add:
> >
> > /export/vol1
> >
> > and update the map while either or both of /export/vol1/data1 and /export/vol1/data2 are mounted, then when /export/vol1 is accessed it will cover these mounts. Now that might not seem like a problem but when you end up with multiple layers of mounts mounted multiple times everything starts to get confused really fast.
>
> First off, thanks for the reply and the details on the reasoning for not automatically re-scanning servers that have been mounted via host mapping.
>
> I guess in my simple world I would never export a higher level directory if sub-directories were already exported. Do lots of people really nest exports to the same sets of client systems? I will have to think about some scenarios and some possibilities of algorithms on how to deal with them and get back with you...
>
> In any case, it seems having a way to ping autofs and have it re-read would imply that the admin would have some knowledge of what was being requested.
>
> > > Is there any way that functionality could be added that would allow for sending a signal ( say USR2 ) that would cause the automounter to re-run /etc/auto.net to re-query -host managed servers for new file systems?
> >
> > Well, if you have sensible, realistic, workable ideas on how to handle the nesting problem then share. And I don't mean "just do this" type throwaway comments that have no workable basis in fact.
>
> Hopefully this will not come across as me saying 'just do it' :) We have used AMD for years and I would really like to switch to autofs and was planning on doing so until this issue came up. I believe that some kind of knob that could be used to refresh the maps on my clients so they see new exports when added to the servers would be very useful. Not an automatic re-querying of the servers, but a signal that I can bump automount with when I know what changes have been made on the other end and want to refresh.

I'll think more about it. Perhaps, if I can work out some way to check for active mounts within a nested set of exports, I could ignore the nested ones on update. The problem then will be people complaining that suddenly it stopped seeing new mounts when I added them.

> Thanks again for taking the time to reply to my message.
>
> > Ian
>
> --
> -MichaelC
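As a small illustration of the nesting problem, the sketch below shows one way to test whether one export path lies below another on a path-component boundary. The function name export_is_nested() is hypothetical and this is not autofs code; it only covers the string comparison and assumes absolute, already-normalised paths.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return 1 if 'child' lies strictly below
 * 'parent' in the path hierarchy, 0 otherwise.  Both arguments are
 * assumed to be absolute paths without "." or ".." components.
 */
static int export_is_nested(const char *parent, const char *child)
{
	size_t plen;

	assert(parent != NULL && child != NULL);

	/* Ignore trailing slashes on the parent, but keep a lone "/". */
	plen = strlen(parent);
	while (plen > 1 && parent[plen - 1] == '/')
		plen--;

	if (strncmp(parent, child, plen) != 0)
		return 0;

	/* The root directory is above every other absolute path. */
	if (plen == 1 && parent[0] == '/')
		return child[1] != '\0';

	/*
	 * Require a component boundary so that "/export/vol1" does not
	 * match "/export/vol10", and require something after the slash.
	 */
	return child[plen] == '/' && child[plen + 1] != '\0';
}
```

With the exports from the example, export_is_nested("/export/vol1", "/export/vol1/data1") is true, so a map update that adds /export/vol1 while /export/vol1/data1 is mounted could be detected before the new mount covers the old one.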
Re: [autofs] [PATCH] Sanity checks for brackets (escaped or otherwise) in server name
On Fri, 2011-04-01 at 14:43 +0530, Siddhesh Poyarekar wrote:
> When autofs is configured as follows:
>
> * -nodev,nosuid,intr,soft,retry=10,proto=tcp :/tmp1
>
> One could make a mount request as follows:
>
> df /autom/tmp1/som\(efile
>
> and crash automount, since automount tries to parse the brackets to get the weight for the server. Automount should not parse these brackets if they're escaped.
>
> Also throw a syntax error in case of mismatched brackets instead of crashing. Sample configuration for this:
>
> * -nodev,nosuid,intr,soft,retry=10,proto=tcp foo(2:/tmp1
>
> Signed-off-by: Siddhesh Poyarekar siddhesh.poyare...@gmail.com

Thanks for being thorough, but it's already in the commit queue, due to the RHEL and Fedora bugs you logged. The patch remains attributed to you, of course.

Can't say when the bunch of patches I have in the queue will be committed and posted. Also, I've still got a couple of must fix bugs before I roll them up into the next release and I'm also quite busy.

Ian
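The kind of check described in the patch summary can be sketched as follows. This is only an illustration of the idea (skip escaped brackets, reject mismatched ones) and not the code from the queued patch; the function name brackets_are_sane() and the rule that brackets may not nest are assumptions.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper: return 1 if every unescaped '(' in 's' is
 * closed by a later unescaped ')' with no nesting, 0 otherwise.
 * A backslash escapes the character that follows it.
 */
static int brackets_are_sane(const char *s)
{
	int open = 0;

	assert(s != NULL);

	for (; *s != '\0'; s++) {
		if (*s == '\\') {
			if (s[1] == '\0')
				break;	/* trailing backslash, nothing to escape */
			s++;		/* skip the escaped character */
			continue;
		}
		if (*s == '(') {
			if (open)
				return 0;	/* nested '(' */
			open = 1;
		} else if (*s == ')') {
			if (!open)
				return 0;	/* ')' without '(' */
			open = 0;
		}
	}

	return !open;	/* an unclosed '(' is a syntax error */
}
```

With this check the sample location foo(2:/tmp1 is rejected as a syntax error, while a key such as som\(efile is accepted because its bracket is escaped and so is never treated as the start of a weight.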
Re: [autofs] Question/request about host mapping
On Wed, 2011-04-06 at 12:52 -0600, Michael Coffman wrote:
> Hello,
>
> I am trying to use the -host map for global access to servers in my environment. I have noted on this list that after the first mount of a server via host mapping, the server is never again probed for mount points. This seems like a major flaw in the host map functionality. This is a problem when adding new file systems to our NFS servers. It means systems have to be completely idled in order to get them to unmount and re-mount the servers to see the new mount points (not possible very often in my environment).
>
> Is the assumption that if you use host mapping, your server will either never change, or you have the ability to 'reboot' your environment when a new file system is added to a server?

Neither.

The problem is that exports can be nested and frequently are (think of the nohide export option).

For example, with exports like:

/export/vol1/data1
/export/vol1/data2

if you then add:

/export/vol1

and update the map while either or both of /export/vol1/data1 and /export/vol1/data2 are mounted, then when /export/vol1 is accessed it will cover these mounts. Now that might not seem like a problem but when you end up with multiple layers of mounts mounted multiple times everything starts to get confused really fast.

> Is there any way that functionality could be added that would allow for sending a signal ( say USR2 ) that would cause the automounter to re-run /etc/auto.net to re-query -host managed servers for new file systems?

Well, if you have sensible, realistic, workable ideas on how to handle the nesting problem then share. And I don't mean "just do this" type throwaway comments that have no workable basis in fact.

Ian
Re: [autofs] different behaviour of autofs
On Mon, 2011-03-28 at 09:53 +0200, hugo tempo wrote:
> Hi,
>
> We (@vu.nl) use the automounter with ldap, a.o. to automount home-directories. Now under squeeze if we do a wildcard access (ls /home/* or even ls /home) we get a listing of (seemingly) all the user directories from the mount map (in ldap), which are certainly not all mounted under /home on that specific machine. Same story with /net where we automount packages. With lenny we never saw this behaviour, which we honestly do not appreciate.
>
> Anybody knowing the specifics about this?

Specifics about what? Perhaps you expect me to just guess the version history and the changes you have made to your autofs configuration.

Ian
Re: [autofs] stat -L triggering mount (behavior change starting with 2.6.38-rc1)
On Mon, 2011-03-28 at 19:53 -0300, Leonardo Chiquitto wrote:
> Hello,
>
> After the update to 2.6.38, I noticed that AutoFS started to mount volumes at times it normally wouldn't. More specifically, ls -la inside an AutoFS mount point will trigger the mount of all available maps.
>
> This can be reproduced with a simple indirect mount setup:
>
> # cat /etc/auto.master
> /data /etc/auto.data
>
> # cat /etc/auto.data
> isos -fstype=nfs4,ro,rsize=8192,wsize=8192,intr,nolock,nosuid libre:/isos
>
> # egrep '(autofs|nfs4)' /proc/mounts
> /etc/auto.data /data autofs rw,relatime,fd=7,pgrp=4300,timeout=600,minproto=5,\
> maxproto=5,indirect 0 0
>
> # stat -L /data/isos > /dev/null

How about an strace of this please?

> # egrep '(autofs|nfs4)' /proc/mounts
> /etc/auto.data /data autofs rw,relatime,fd=7,pgrp=4300,timeout=600,minproto=5,\
> maxproto=5,indirect 0 0
> libre:/isos /data/isos nfs4 ro,nosuid,relatime,vers=4,(.. more mount options ..) 0 0
>
> Bisecting the changes between 2.6.37 and 2.6.38, I found that 2.6.38-rc1 exhibits the new behavior already. I went as near 2.6.37 as I could and verified that commit b650c858c2 (autofs4: Merge the remaining dentry ops tables) also shows the new behavior. I tried to bisect the remaining ~18 commits but everything I tested/built ended up unbootable.
>
> I also tested mainline and verified that the problem is still there.

There were two significant patch series merged in 2.6.38 which were to similar parts of the VFS. One was the vfs-scale series and the other was the vfs-automount series. So 2.6.38 had significant autofs changes.

I think the mount storm you're seeing would be due to this check not catching lstat(2) calls, but AFAICS it should catch this case:

	if (!(flags & LOOKUP_FOLLOW) &&
	    !(flags & (LOOKUP_CONTINUE | LOOKUP_DIRECTORY |
		       LOOKUP_OPEN | LOOKUP_CREATE)))
		return -EISDIR;

What do you think David?

Ian
Re: [autofs] stat -L triggering mount (behavior change starting with 2.6.38-rc1)
On Mon, 2011-03-28 at 19:53 -0300, Leonardo Chiquitto wrote:
> Bisecting the changes between 2.6.37 and 2.6.38, I found that 2.6.38-rc1 exhibits the new behavior already. I went as near 2.6.37 as I could and verified that commit b650c858c2 (autofs4: Merge the remaining dentry ops tables) also shows the new behavior. I tried to bisect the remaining ~18 commits but everything I tested/built ended up unbootable.
>
> I also tested mainline and verified that the problem is still there.

I suspect it will but does it also happen with 2.6.39-rc1?

Ian
Re: [autofs] stat -L triggering mount (behavior change starting with 2.6.38-rc1)
On Fri, 2011-04-01 at 12:27 +0800, Ian Kent wrote:
> On Mon, 2011-03-28 at 19:53 -0300, Leonardo Chiquitto wrote:
>
> There were two significant patch series merged in 2.6.38 which were to similar parts of the VFS. One was the vfs-scale series and the other was the vfs-automount series. So 2.6.38 had significant autofs changes.
>
> I think the mount storm you're seeing would be due to this check not catching lstat(2) calls, but AFAICS it should catch this case:

Oh, hang on, we have this comment in the VFS code!

	/* We want to mount if someone is trying to open/create a file of any
	 * type under the mountpoint, wants to traverse through the mountpoint
	 * or wants to open the mounted directory.
	 *
	 * We don't want to mount if someone's just doing a stat and they've
	 * set AT_SYMLINK_NOFOLLOW - unless they're stat'ing a directory and
	 * appended a '/' to the name.
	 */
	if (!(flags & LOOKUP_FOLLOW) &&
	    !(flags & (LOOKUP_CONTINUE | LOOKUP_DIRECTORY |
		       LOOKUP_OPEN | LOOKUP_CREATE)))
		return -EISDIR;

But stat -L says de-reference the symlink (a stat(2) call, not an lstat(2) call) so AT_SYMLINK_NOFOLLOW is not set and consequently LOOKUP_FOLLOW will be set. Since automounting modules no longer see the lookup flags autofs doesn't know about this.

I missed this case, oops!

I'm not even sure what we can do about it either since this code is shared by file systems other than autofs that don't need to tell lies when stat()ing a directory to prevent mount storms.

mmm ...
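To make the effect of that check concrete, here is a small user space model of it. The flag names follow the kernel code quoted above, but the bit values are illustrative only (they are not the kernel's actual values), and the function automount_is_skipped() is a hypothetical wrapper that returns 1 where the kernel code would return -EISDIR.

```c
#include <assert.h>

/* Illustrative bit values only; NOT the kernel's actual definitions. */
enum {
	LOOKUP_FOLLOW    = 1 << 0,
	LOOKUP_DIRECTORY = 1 << 1,
	LOOKUP_CONTINUE  = 1 << 2,
	LOOKUP_OPEN      = 1 << 3,
	LOOKUP_CREATE    = 1 << 4
};

#define MODEL_FLAGS (LOOKUP_FOLLOW | LOOKUP_DIRECTORY | LOOKUP_CONTINUE | \
		     LOOKUP_OPEN | LOOKUP_CREATE)

/*
 * Hypothetical model of the quoted test: returns 1 when the walk
 * would stop without triggering a mount, 0 when a mount would be
 * attempted.
 */
static int automount_is_skipped(unsigned int flags)
{
	assert((flags & ~(unsigned int) MODEL_FLAGS) == 0);

	if (!(flags & LOOKUP_FOLLOW) &&
	    !(flags & (LOOKUP_CONTINUE | LOOKUP_DIRECTORY |
		       LOOKUP_OPEN | LOOKUP_CREATE)))
		return 1;
	return 0;
}
```

In this model a plain lstat(2) style lookup (no flags set) is skipped, a lookup with a trailing slash (LOOKUP_DIRECTORY) mounts, and a stat(2) style lookup such as stat -L (LOOKUP_FOLLOW set) also mounts, which is the case described as missed above.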
[autofs] [PATCH 1/2] autofs-5.0.5 - fix next task list update
When the state queue task manager transferred an automount point pending task to its task queue for execution the state queue was mistakenly being seen as empty when the completing task was the only task in the state queue.
---

 CHANGELOG      |    1 +
 daemon/state.c |    8 +++++---
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index 347d7d7..a9687b7 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -62,6 +62,7 @@
 - fix mountd vers retry.
 - fix expire race.
 - replace GPLv3 code.
+- fix next task list update.

 03/09/2009 autofs-5.0.5
 -----------------------
diff --git a/daemon/state.c b/daemon/state.c
index 38617c3..85587bd 100644
--- a/daemon/state.c
+++ b/daemon/state.c
@@ -1150,11 +1150,13 @@ remove:
 				next = list_entry((&task->pending)->next,
 							struct state_queue, pending);

-				list_del_init(&next->pending);
-				list_add_tail(&next->list, p);
-
 				list_del(&task->list);
 				free(task);
+
+				list_del_init(&next->pending);
+				list_add_tail(&next->list, head);
+				if (p == head)
+					p = head->next;
 			}

 			if (list_empty(head))
[autofs] [PATCH 2/2] autofs-5.0.5 - fix stale map read
A previous patch to fix direct maps not updating on re-read has a side effect of causing maps to always be re-read on lookup. This is because, following the application of the previous patch, the map stale status is no longer being updated on a successful map read.
---

 CHANGELOG        |    1 +
 daemon/lookup.c  |    1 +
 daemon/state.c   |    1 -
 include/master.h |    1 +
 lib/master.c     |   37 +++++++++++++++++++++++++------------
 5 files changed, 28 insertions(+), 13 deletions(-)

diff --git a/CHANGELOG b/CHANGELOG
index a9687b7..fcf9145 100644
--- a/CHANGELOG
+++ b/CHANGELOG
@@ -63,6 +63,7 @@
 - fix expire race.
 - replace GPLv3 code.
 - fix next task list update.
+- fix stale map read.

 03/09/2009 autofs-5.0.5
 -----------------------
diff --git a/daemon/lookup.c b/daemon/lookup.c
index 36e60c9..0f7051b 100644
--- a/daemon/lookup.c
+++ b/daemon/lookup.c
@@ -1139,6 +1139,7 @@ int lookup_prune_cache(struct autofs_point *ap, time_t age)
 		cache_readlock(map->mc);
 		lookup_prune_one_cache(ap, map->mc, age);
 		pthread_cleanup_pop(1);
+		clear_stale_instances(map);
 		map->stale = 0;
 		map = map->next;
 	}
diff --git a/daemon/state.c b/daemon/state.c
index 85587bd..3645440 100644
--- a/daemon/state.c
+++ b/daemon/state.c
@@ -501,7 +501,6 @@ static void *do_readmap(void *arg)
 		pthread_cleanup_pop(1);
 		pthread_cleanup_pop(1);
 		pthread_cleanup_pop(1);
-		lookup_prune_cache(ap, now);
 	}

 	pthread_cleanup_pop(1);
diff --git a/include/master.h b/include/master.h
index bef59d3..1c1a7d5 100644
--- a/include/master.h
+++ b/include/master.h
@@ -89,6 +89,7 @@ struct map_source *
 master_find_source_instance(struct map_source *, const char *, const char *, int, const char **);
 struct map_source *
 master_add_source_instance(struct map_source *, const char *, const char *, time_t, int, const char **);
+void clear_stale_instances(struct map_source *);
 void send_map_update_request(struct autofs_point *);
 void master_source_writelock(struct master_mapent *);
 void master_source_readlock(struct master_mapent *);
diff --git a/lib/master.c b/lib/master.c
index 95bd3fb..4b48883 100644
--- a/lib/master.c
+++ b/lib/master.c
@@ -465,7 +465,26 @@ master_add_source_instance(struct map_source *source, const char *type, const ch
 	return new;
 }

-static void check_stale_instances(struct map_source *source)
+static int check_stale_instances(struct map_source *source)
+{
+	struct map_source *map;
+
+	if (!source)
+		return 0;
+
+	map = source->instance;
+	while (map) {
+		if (map->stale)
+			return 1;
+		if (check_stale_instances(map))
+			return 1;
+		map = map->next;
+	}
+
+	return 0;
+}
+
+void clear_stale_instances(struct map_source *source)
 {
 	struct map_source *map;

@@ -474,11 +493,9 @@ static void check_stale_instances(struct map_source *source)

 	map = source->instance;
 	while (map) {
-		if (map->stale) {
-			source->stale = 1;
-			break;
-		}
-		check_stale_instances(map->instance);
+		clear_stale_instances(map);
+		if (map->stale)
+			map->stale = 0;
 		map = map->next;
 	}

@@ -496,12 +513,8 @@ void send_map_update_request(struct autofs_point *ap)

 	map = ap->entry->maps;
 	while (map) {
-		check_stale_instances(map);
-		map = map->next;
-	}
-
-	map = ap->entry->maps;
-	while (map) {
+		if (check_stale_instances(map))
+			map->stale = 1;
 		if (map->stale) {
 			need_update = 1;
 			break;
[autofs] [PATCH 0/2] Series for the non-expiring mounts problem
These two patches should fix the problem you reported where a USR1 signal fails to trigger an expire. They should also fix the unnecessary map reload problem.

Please test these for me.

---

Ian Kent (2):
      autofs-5.0.5 - fix stale map read
      autofs-5.0.5 - fix next task list update

 CHANGELOG        |    2 ++
 daemon/lookup.c  |    1 +
 daemon/state.c   |    9 +++++----
 include/master.h |    1 +
 lib/master.c     |   37 +++++++++++++++++++++++++------------
 5 files changed, 34 insertions(+), 16 deletions(-)

--
Ian
Re: [autofs] 5.0.5 non-expiring mounts
On Thu, 2011-03-24 at 19:03 -0300, Leonardo Chiquitto wrote: I finally had time to return to this issue. To avoid the confusion caused by old kernels, I reproduced the problem on openSUSE Factory (kernel 2.6.38 and autofs 5.0.5 with all kernel.org patches applied). Here's the configuration used: # cat /etc/auto.master /data /etc/auto.data # cat /etc/auto.data isos -fstype=nfs,ro,rsize=8192,wsize=8192,intr,nolock,nosuid libre:/isos # cat /etc/sysconfig/autofs | grep -v '^#' AUTOFS_OPTIONS= LOCAL_OPTIONS= APPEND_OPTIONS=yes USE_MISC_DEVICE=yes DEFAULT_MASTER_MAP_NAME=auto.master DEFAULT_TIMEOUT=600 DEFAULT_BROWSE_MODE=yes DEFAULT_LOGGING=debug DEFAULT_MAP_OBJECT_CLASS=nisMap DEFAULT_ENTRY_OBJECT_CLASS=nisObject DEFAULT_MAP_ATTRIBUTE=nisMapName DEFAULT_ENTRY_ATTRIBUTE=cn DEFAULT_VALUE_ATTRIBUTE=nisMapEntry DEFAULT_AUTH_CONF_FILE=etc/autofs_ldap_auth.conf MAP_HASH_TABLE_SIZE=1024 I'm attaching the automount debug logs showing the following sequence: - automount startup - mount of an NFS volume (/data/isos) - failed attempt to trigger the expiration of the mounted volume (sending SIGUSR1) - successful attempt to trigger the expiration of the mounted volume (sending SIGUSR1 again) I also confirmed that the problem no longer happens if I revert the following commit: commit 08aafab4c1d0ab6227c80f8cd1086ae78556a370 Author: Ian Kent ra...@themaw.net Date: Thu Sep 9 11:10:47 2010 +0800 autofs-5.0.5 - fix direct map not updating on reread Philip, do you think you could try to revert it in your setup/package just to confirm this works? The problem was actually introduced by autofs-5.0.5-remove-state-machine-timed-wait.patch. There had been a long standing bug in the state queue handling which I thought was a pthreads problem. When I added the above patch everything appeared to work OK. But the change you mentioned above exposed the bug and the log you provided allowed me to work out what was broken and fix it. 
But the stale map processing has also been broken somewhere along the way and I'm also working on fixing that. Once that's done there's not much more needed for 5.0.6, at last. Ian
Re: [autofs] autofs problem
On Mon, 2011-03-14 at 10:09 +, hpc.ad...@uea.ac.uk wrote: Hello Ian, Thank you for your reply. The contents of /etc/sysconfig/autofs are: [root@head00 ~]# grep -v '^#' /etc/sysconfig/autofs TIMEOUT=300 BROWSE_MODE=no USE_MISC_DEVICE=yes [root@head00 ~]# When the machine boots up, it completely fails to mount /home/username; it just complains that /home/username doesn't exist. The contents of /var/log/messages then show the message given below, thus I assume that they are related. Do let me know if you require further information. Googling around for the error message hasn't been very successful :-( Failing to open a control file handle on an automount is fatal. I don't know why that's happening. We would probably need to add some log prints to get more information. Thanks for your help, Wadud. -Original Message- From: hpc.admin-boun...@uea.ac.uk [mailto:hpc.admin-boun...@uea.ac.uk] On Behalf Of Ian Kent Sent: Sunday, March 13, 2011 1:29 AM To: hpc.ad...@uea.ac.uk Cc: autofs@linux.kernel.org Subject: Re: [autofs] autofs problem On Tue, 2011-03-08 at 11:24 +, hpc.ad...@uea.ac.uk wrote: Hello, I am experiencing autofs problems with my Centos 5.5 system. Upon boot, the mount fails with the following error messages (created by passing the -d option in the automount): Mar 8 10:38:19 cn024 automount[6395]: do_notify_state: signal 15 Mar 8 10:38:19 cn024 automount[6395]: master_notify_state_change: sig 15 switching /home from 1 to 5 Mar 8 10:38:19 cn024 automount[6395]: st_prepare_shutdown: state 1 path /home Mar 8 10:38:19 cn024 automount[6395]: expire_proc: exp_proc = 1090562368 path /home Mar 8 10:38:19 cn024 automount[6395]: expire_cleanup: got thid 1090562368 path /home stat 0 Mar 8 10:38:19 cn024 automount[6395]: expire_cleanup: sigchld: exp 1090562368 finished, switching from 5 to 7 Mar 8 10:38:19 cn024 automount[6395]: st_shutdown: state 5 path /home Mar 8 10:38:19 cn024 smartd[7258]: smartd has fork()ed into background mode. New PID=7258. 
Mar 8 10:38:19 cn024 automount[6395]: umount_multi: path /home incl 0 Mar 8 10:38:19 cn024 automount[6395]: umounted indirect mount /home Mar 8 10:38:19 cn024 automount[6395]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-home Mar 8 10:38:19 cn024 automount[6395]: shut down path /home Mar 8 10:38:19 cn024 automount[6395]: autofs stopped Mar 8 10:38:28 cn024 automount[7334]: Starting automounter version 5.0.1-0.rc2.143.el5, master map auto.master Mar 8 10:38:28 cn024 automount[7334]: using kernel protocol version 5.01 Mar 8 10:38:28 cn024 automount[7334]: lookup_nss_read_master: reading master files auto.master Mar 8 10:38:28 cn024 automount[7334]: parse_init: parse(sun): init gathered global options: (null) Mar 8 10:38:28 cn024 automount[7334]: lookup_read_master: lookup(file): read entry /home Mar 8 10:38:28 cn024 automount[7334]: master_do_mount: mounting /home Mar 8 10:38:28 cn024 automount[7334]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-home Mar 8 10:38:28 cn024 automount[7334]: lookup_nss_read_map: reading map file /etc/auto.home Mar 8 10:38:28 cn024 automount[7334]: parse_init: parse(sun): init gathered global options: rw,intr Mar 8 10:38:28 cn024 automount[7334]: do_mount_autofs_indirect: failed to create ioctl fd for autofs path /home Mar 8 10:38:28 cn024 automount[7334]: handle_mounts: mount of /home failed! Mar 8 10:38:28 cn024 automount[7334]: master_do_mount: failed to startup mount Mar 8 10:38:28 cn024 automount[7334]: no mounts in table The error messages to note are the last four lines. When I restart the daemon, the automount works, but fails after a day or two. The version of autofs I am using is: What makes you think these are the same problem? Did you check to see if /home was already mounted when it failed? What is in your /etc/sysconfig/autofs? Name: autofs Relocations: (not relocatable) Version : 5.0.1Vendor: CentOS Release : 0.rc2.143.el5_5.6 Any help will be greatly appreciated. Thanks in advance. 
-- Wadud Miah High Performance Computing Systems Developer Research Computing Services, University of East Anglia Telephone: 01603 593856 Information Services -- ___ Hpc.admin mailing list hpc.ad...@uea.ac.uk http://www.uea.ac.uk/mailman21/listinfo/hpc.admin
Re: [autofs] autofs problem
On Mon, 2011-03-14 at 22:59 +0800, Ian Kent wrote: On Mon, 2011-03-14 at 10:09 +, hpc.ad...@uea.ac.uk wrote: Hello Ian, Thank you for your reply. The contents of /etc/sysconfig/autofs are: [root@head00 ~]# grep -v '^#' /etc/sysconfig/autofs TIMEOUT=300 BROWSE_MODE=no USE_MISC_DEVICE=yes [root@head00 ~]# When the machine boots up, it completely fails to mount /home/username; it just complains that /home/username doesn't exist. The contents of /var/log/messages then show the message given below, thus I assume that they are related. Do let me know if you require further information. Googling around for the error message hasn't been very successful :-( Failing to open a control file handle on an automount is fatal. I don't know why that's happening. We would probably need to add some log prints to get more information. Where does the map come from, is it network based? If it is and you use NetworkManager, have you checked to see if automount is starting up during NetworkManager initialization? In the log you show why do we see automount shutting down, then starting up, you say the problem is at boot but the example log doesn't show an error at boot? And, did you check if /home was already mounted after the shutdown and before that startup in the log. Thanks for your help, Wadud. -Original Message- From: hpc.admin-boun...@uea.ac.uk [mailto:hpc.admin-boun...@uea.ac.uk] On Behalf Of Ian Kent Sent: Sunday, March 13, 2011 1:29 AM To: hpc.ad...@uea.ac.uk Cc: autofs@linux.kernel.org Subject: Re: [autofs] autofs problem On Tue, 2011-03-08 at 11:24 +, hpc.ad...@uea.ac.uk wrote: Hello, I am experiencing autofs problems with my Centos 5.5 system. 
Upon boot, the mount fails with the following error messages (created by passing the -d option in the automount): Mar 8 10:38:19 cn024 automount[6395]: do_notify_state: signal 15 Mar 8 10:38:19 cn024 automount[6395]: master_notify_state_change: sig 15 switching /home from 1 to 5 Mar 8 10:38:19 cn024 automount[6395]: st_prepare_shutdown: state 1 path /home Mar 8 10:38:19 cn024 automount[6395]: expire_proc: exp_proc = 1090562368 path /home Mar 8 10:38:19 cn024 automount[6395]: expire_cleanup: got thid 1090562368 path /home stat 0 Mar 8 10:38:19 cn024 automount[6395]: expire_cleanup: sigchld: exp 1090562368 finished, switching from 5 to 7 Mar 8 10:38:19 cn024 automount[6395]: st_shutdown: state 5 path /home Mar 8 10:38:19 cn024 smartd[7258]: smartd has fork()ed into background mode. New PID=7258. Mar 8 10:38:19 cn024 automount[6395]: umount_multi: path /home incl 0 Mar 8 10:38:19 cn024 automount[6395]: umounted indirect mount /home Mar 8 10:38:19 cn024 automount[6395]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-home Mar 8 10:38:19 cn024 automount[6395]: shut down path /home Mar 8 10:38:19 cn024 automount[6395]: autofs stopped Mar 8 10:38:28 cn024 automount[7334]: Starting automounter version 5.0.1-0.rc2.143.el5, master map auto.master Mar 8 10:38:28 cn024 automount[7334]: using kernel protocol version 5.01 Mar 8 10:38:28 cn024 automount[7334]: lookup_nss_read_master: reading master files auto.master Mar 8 10:38:28 cn024 automount[7334]: parse_init: parse(sun): init gathered global options: (null) Mar 8 10:38:28 cn024 automount[7334]: lookup_read_master: lookup(file): read entry /home Mar 8 10:38:28 cn024 automount[7334]: master_do_mount: mounting /home Mar 8 10:38:28 cn024 automount[7334]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-home Mar 8 10:38:28 cn024 automount[7334]: lookup_nss_read_map: reading map file /etc/auto.home Mar 8 10:38:28 cn024 automount[7334]: parse_init: parse(sun): init gathered global options: rw,intr Mar 8 10:38:28 cn024 
automount[7334]: do_mount_autofs_indirect: failed to create ioctl fd for autofs path /home Mar 8 10:38:28 cn024 automount[7334]: handle_mounts: mount of /home failed! Mar 8 10:38:28 cn024 automount[7334]: master_do_mount: failed to startup mount Mar 8 10:38:28 cn024 automount[7334]: no mounts in table The error messages to note are the last four lines. When I restart the daemon, the automount works, but fails after a day or two. The version of autofs I am using is: What makes you think these are the same problem? Did you check to see if /home was already mounted when it failed? What is in your /etc/sysconfig/autofs? Name: autofs Relocations: (not relocatable) Version : 5.0.1Vendor: CentOS Release : 0.rc2.143.el5_5.6 Any help will be greatly appreciated. Thanks in advance. -- Wadud Miah High Performance Computing Systems Developer Research Computing Services, University of East Anglia Telephone: 01603 593856
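One way to answer the "was /home already mounted?" question raised in this thread is to scan the mount table before automount starts. The following is a minimal sketch only, not part of autofs: the function name is_mount_point and its parameters are made up for illustration, and it assumes a glibc system where the mount table (for example /proc/mounts) can be read with the getmntent(3) family.

```c
#include <assert.h>
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not an autofs function).
 * Returns 1 if `dir` is listed as a mount point in the mount table file
 * `table` (for example "/proc/mounts"), 0 if it is not listed, and -1 if
 * the table cannot be opened.
 */
int is_mount_point(const char *table, const char *dir)
{
    assert(table != NULL && dir != NULL);

    FILE *fp = setmntent(table, "r");
    if (fp == NULL)
        return -1;

    int found = 0;
    struct mntent *ent;

    /* Walk every entry and compare its mount directory with `dir`. */
    while ((ent = getmntent(fp)) != NULL) {
        if (strcmp(ent->mnt_dir, dir) == 0) {
            found = 1;
            break;
        }
    }

    endmntent(fp);
    return found;
}
```

Calling is_mount_point("/proc/mounts", "/home") between the shutdown and the startup shown in the log would tell whether something was still mounted on /home when the new daemon tried to create its ioctl fd. Whether a leftover mount is really the cause here is not established in the thread.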
Re: [autofs] Recovering from the loss of a NFS Server
On Sat, 2011-03-12 at 23:27 -0500, Breitman, Jason wrote: OS Linux hostname 2.6.18-238.el5 #1 SMP Sun Dec 19 14:22:44 EST 2010 x86_64 x86_64 x86_64 GNU/Linux autofs package autofs-5.0.1-0.rc2.148.bz579312.1.el5 Mount options $ cat /etc/auto.master # Master map for automounter # /home auto_home -hard,intr,retry=10 $ cat /etc/sysconfig/autofs TIMEOUT=86400 - we have a long TIMEOUT to avoid mount storms. What am I trying to do? Prior to a disaster recovery test, my home directory will be mounted from my-nfs-server.domainname:/home/jbreitma. At this point my-nfs-server.domainname points to 1.1.1.1. There are active reads and writes to my home directory. Let's say I have a subdirectory called htdocs and am running apache. Now we are cut off from 1.1.1.1 because the Data Center where 1.1.1.1 lives is no longer accessible. We simulate this with an ACL. We now repoint my-nfs-server.domainname to 2.2.2.2. The NFS Clients where /home/jbreitma is mounted are now confused. What is my best course of action? umount -l /home/jbreitma /etc/init.d/autofs restart fuser -k /home/jbreitma kill -USR1 `pgrep automount` etc ... That's about all you can do. The umount -l has its own set of problems. In particular any process that has an active mount must do a cd . (I believe that will work) to recover from the changed mount otherwise getcwd(3) will fail and /proc/pid/cwd will point to / instead of a valid working directory. Also, there is pretty much no way to get the RPC layer to give up on those outstanding IOs which will cause ongoing problems. How do I recover from this situation? There's not much you can do for read/write mounts and even read only fail over hasn't been implemented within the Linux kernel NFS client. I am open to a new approach if that is required. The only way I think high availability NFS can work today is when the backend deals with the change such as in Clustered environments.
I have had some success with umount -l /home/jbreitma followed by a /etc/init.d/autofs restart, but this does not always work. It specifically fails when active writes and/or reads are occurring to /home/jbreitma. Jason Breitman AT-Tech-GTI jason.breit...@blackrock.com BlackRock
Re: [autofs] autofs problem
On Tue, 2011-03-08 at 11:24 +, hpc.ad...@uea.ac.uk wrote: Hello, I am experiencing autofs problems with my Centos 5.5 system. Upon boot, the mount fails with the following error messages (created by passing the -d option in the automount): Mar 8 10:38:19 cn024 automount[6395]: do_notify_state: signal 15 Mar 8 10:38:19 cn024 automount[6395]: master_notify_state_change: sig 15 switching /home from 1 to 5 Mar 8 10:38:19 cn024 automount[6395]: st_prepare_shutdown: state 1 path /home Mar 8 10:38:19 cn024 automount[6395]: expire_proc: exp_proc = 1090562368 path /home Mar 8 10:38:19 cn024 automount[6395]: expire_cleanup: got thid 1090562368 path /home stat 0 Mar 8 10:38:19 cn024 automount[6395]: expire_cleanup: sigchld: exp 1090562368 finished, switching from 5 to 7 Mar 8 10:38:19 cn024 automount[6395]: st_shutdown: state 5 path /home Mar 8 10:38:19 cn024 smartd[7258]: smartd has fork()ed into background mode. New PID=7258. Mar 8 10:38:19 cn024 automount[6395]: umount_multi: path /home incl 0 Mar 8 10:38:19 cn024 automount[6395]: umounted indirect mount /home Mar 8 10:38:19 cn024 automount[6395]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-home Mar 8 10:38:19 cn024 automount[6395]: shut down path /home Mar 8 10:38:19 cn024 automount[6395]: autofs stopped Mar 8 10:38:28 cn024 automount[7334]: Starting automounter version 5.0.1-0.rc2.143.el5, master map auto.master Mar 8 10:38:28 cn024 automount[7334]: using kernel protocol version 5.01 Mar 8 10:38:28 cn024 automount[7334]: lookup_nss_read_master: reading master files auto.master Mar 8 10:38:28 cn024 automount[7334]: parse_init: parse(sun): init gathered global options: (null) Mar 8 10:38:28 cn024 automount[7334]: lookup_read_master: lookup(file): read entry /home Mar 8 10:38:28 cn024 automount[7334]: master_do_mount: mounting /home Mar 8 10:38:28 cn024 automount[7334]: automount_path_to_fifo: fifo name /var/run/autofs.fifo-home Mar 8 10:38:28 cn024 automount[7334]: lookup_nss_read_map: reading map file 
/etc/auto.home Mar 8 10:38:28 cn024 automount[7334]: parse_init: parse(sun): init gathered global options: rw,intr Mar 8 10:38:28 cn024 automount[7334]: do_mount_autofs_indirect: failed to create ioctl fd for autofs path /home Mar 8 10:38:28 cn024 automount[7334]: handle_mounts: mount of /home failed! Mar 8 10:38:28 cn024 automount[7334]: master_do_mount: failed to startup mount Mar 8 10:38:28 cn024 automount[7334]: no mounts in table The error messages to note are the last four lines. When I restart the daemon, the automount works, but fails after a day or two. The version of autofs I am using is: What makes you think these are the same problem? Did you check to see if /home was already mounted when it failed? What is in your /etc/sysconfig/autofs? Name: autofs Relocations: (not relocatable) Version : 5.0.1Vendor: CentOS Release : 0.rc2.143.el5_5.6 Any help will be greatly appreciated. Thanks in advance. -- Wadud Miah High Performance Computing Systems Developer Research Computing Services, University of East Anglia Telephone: 01603 593856 Information Services -- ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] autofs hangs
On Wed, 2011-03-09 at 14:11 -0500, Jaskaran Singh wrote: I am running into an issue here: I am using autofs with LDAP to mount /home directories. This is my OS version: Description: Ubuntu 10.10 Release: 10.10. My /etc/auto.master is commented out. In /etc/nsswitch.conf the last line is automount: ldap. I can automount my home directory from the NFS server using LDAP without an issue. But after a few minutes of inactivity, when I try to log in to the machine and run my application, it hangs. My application tries to read /etc/config.txt, but rather than reading /etc/config.txt it tries to mount /home/etc/config.txt; I can see that in the syslog, which is attached below. If I were to log in and move to another location besides /home/*, the command will run fine, for example if I cd from /home to any other part of the file system like /tmp or /var/log. Somehow you need to get an strace when the access to /etc/config.txt is happening, that might help. Mar 9 14:02:02 console automount[23227]: do_bind: lookup(ldap): ldap simple bind returned 0 Mar 9 14:02:02 console automount[23227]: lookup_one: lookup(ldap): searching for ((objectclass=automount)(|(cn=repository)(cn=/)(cn= \2A))) under ou=auto.home,dc=domain,dc=org Mar 9 14:02:02 console automount[23227]: lookup_one: lookup(ldap): getting first entry for cn=repository Mar 9 14:02:02 console automount[23227]: lookup_one: lookup(ldap): got answer, but no entry for ((objectclass=automount)(|(cn=repository)(cn=/)(cn=\2A))) Mar 9 14:02:02 console automount[23227]: key repository not found in map source(s). 
Mar 9 14:02:02 console automount[23227]: ioctl_send_fail: token = 548 Mar 9 14:02:02 console automount[23227]: failed to mount /home/repository Mar 9 14:02:02 console automount[23227]: handle_packet: type = 3 Mar 9 14:02:02 console automount[23227]: handle_packet_missing_indirect: token 549, name repository, request pid 23381 Mar 9 14:02:02 console automount[23227]: attempting to mount entry /home/repository Mar 9 14:02:02 console automount[23227]: lookup_mount: lookup(ldap): looking up repository Mar 9 14:02:02 console automount[23227]: ioctl_send_fail: token = 549 Mar 9 14:02:02 console automount[23227]: failed to mount /home/repository Mar 9 14:02:19 console automount[23227]: st_expire: state 1 path /home Mar 9 14:02:19 console automount[23227]: expire_proc: exp_proc = 3057638256 path /home Mar 9 14:02:19 console automount[23227]: expire_proc_indirect: expire /home/jsingh Mar 9 14:02:19 console automount[23227]: 1 remaining in /home Mar 9 14:02:19 console automount[23227]: expire_cleanup: got thid 3057638256 path /home stat 3 Mar 9 14:02:19 console automount[23227]: expire_cleanup: sigchld: exp 3057638256 finished, switching from 2 to 1 Mar 9 14:02:19 console automount[23227]: st_ready: st_ready(): state = 2 path /home Mar 9 14:03:34 console automount[23227]: st_expire: state 1 path /home Mar 9 14:03:34 console automount[23227]: expire_proc: exp_proc = 3057638256 path /home Mar 9 14:03:34 console automount[23227]: expire_proc_indirect: expire /home/jsingh Mar 9 14:03:34 console automount[23227]: 1 remaining in /home Mar 7 17:04:39 console automount[1105]: lookup_one: lookup(ldap): got answer, but no entry for ((objectclass=automount)(|(cn=etc)(cn=/)(cn= \2A))) Mar 7 17:04:39 console automount[1105]: key etc not found in map source(s). 
Mar 7 17:04:39 console automount[1105]: ioctl_send_fail: token = 2 Mar 7 17:04:39 console automount[1105]: failed to mount /home/etc Mar 7 17:04:39 console automount[1105]: handle_packet: type = 3 Mar 7 17:04:39 console automount[1105]: handle_packet_missing_indirect: token 3, name repository, request pid 2011 Mar 7 17:04:39 console automount[1105]: ioctl_send_fail: token = 3 Mar 7 17:04:39 console automount[1105]: failed to mount /home/repository Mar 7 17:04:39 console automount[1105]: handle_packet: type = 3 Mar 7 17:04:39 console automount[1105]: handle_packet_missing_indirect: token 4, name repository, request pid 2011 Mar 7 17:04:39 console automount[1105]: attempting to mount entry /home/repository Mar 7 17:04:39 console automount[1105]: lookup_mount: lookup(ldap): looking up repository Mar 7 17:04:39 console automount[1105]: ioctl_send_fail: token = 4 Mar 7 17:04:39 console automount[1105]: failed to mount /home/repository Mar 7 17:05:34 console automount[1105]: handle_packet: type = 3 Mar 7 17:05:34 console automount[1105]: handle_packet_missing_indirect: token 5, name etc, request pid 2126 Mar 7 17:05:34 console automount[1105]: attempting to mount entry /home/etc Mar 7 17:05:34 console automount[1105]: lookup_mount: lookup(ldap): looking up etc Mar 7 17:05:34 console automount[1105]: ioctl_send_fail: token = 5 Mar 7 17:05:34 console automount[1105]: failed to mount /home/etc autofs: Installed: 5.0.5-0ubuntu2 Candidate: 5.0.5-0ubuntu2 Version table: *** 5.0.5-0ubuntu2 0 500
Re: [autofs] Use bind instead of nfs if host is localhost feature is broken? removed?
On Thu, 2011-03-10 at 15:59 -0800, Nye Liu wrote: I have been relying on the behavior that foo host:/export/foo uses 'bind' if host is localhost, and 'nfs' otherwise... i.e. if I am on the machine host, /export/foo is mounted on /mnt/foo using bind, but if I am not on host, host:/export/foo is mounted on /mnt/foo using nfs. But the latest version of autofs no longer does that. What latest version? Do I have to use this everywhere now? foo (1),host(2):/export/foo The behavior is more like what I expect, but it seems wasteful.
Re: [autofs] AutoFS creates unmountable directories when ghosting is enabled
On Sun, 2011-03-06 at 23:37 -0300, Leonardo Chiquitto wrote: On Wed, Dec 15, 2010 at 12:49 AM, Ian Kent ra...@themaw.net wrote: On Fri, 2010-12-10 at 17:10 -0200, Leonardo Chiquitto wrote: Hello Ian and list, I'd like to forward a bug report we received on openSUSE's Bugzilla [1]. Please consider the following setup to reproduce the problem: host:~ # grep automount /etc/nsswitch.conf automount:files host:~ # cat /etc/auto.master /vol /etc/auto.vol host:~ # cat /etc/auto.vol data1 -fstype=nfs,ro,rsize=8192,wsize=8192,intr,nolock,nosuid srv:/data1 data2 -fstype=nfs,ro,rsize=8192,wsize=8192,intr,nolock,nosuid srv:/data2 data3 -fstype=nfs,ro,rsize=8192,wsize=8192,intr,nolock,nosuid srv:/data3 host:~ # cat /etc/sysconfig/autofs AUTOFS_OPTIONS= LOCAL_OPTIONS= APPEND_OPTIONS=yes DEFAULT_MASTER_MAP_NAME=auto.master DEFAULT_TIMEOUT=600 DEFAULT_BROWSE_MODE=yes DEFAULT_LOGGING=debug USE_MISC_DEVICE=yes host:~ # ls -F /vol data1/ data2/ data3/ The problem depends on ghosting being enabled (ie, BROWSE_MODE=yes). When we try to access a non-existent key/entry, AutoFS will fail to mount it but will still create the mount point: host:~ # ls -d /vol/invalid; sleep 10; ls -d /vol/invalid ls: cannot access /vol/invalid: No such file or directory /vol/invalid host:~ # ls -F /vol data1/ data2/ data3/ invalid/ The problem happens because lookup_ghost() iterates over all cached mapents and creates directories for each entry in the cache. Since the cache also stores entries for failed mounts (negative entries), it ends up creating directories for mount points that don't exist in the map. Yes, an obvious problem. I tested the patch below and it resolves the problem for me, but I'm not sure if this is the best (or even the correct) way to fix the bug. I'd appreciate if you could review and comment. Yeah, at first I thought there were a few other cases. There still might be so let me think about it further. 
Hello Ian, I bisected this problem today and discovered that it appeared after the following commit: commit 08aafab4c1d0ab6227c80f8cd1086ae78556a370 Author: Ian Kent ra...@themaw.net Date: Thu Sep 9 11:10:47 2010 +0800 autofs-5.0.5 - fix direct map not updating on reread If the map type is explicitly specified for a map the map isn't properly updated when a re-read is requested. This is because the map stale flag is incorrectly cleared after the lookup module reads the map instead of at the completion of the update procedure. The map stale flag should only be cleared if the map read fails for some reason, otherwise it is updated when the refresh is completed. Does this ring a bell? If not, I'll see if I can debug it further tomorrow. Not really. I'm still not sure your original patch deals with the problem fully. Could you also have a look at and try (from kernel.org): autofs-5.0.5-fix-prune-cache-valid-check.patch Ian
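The ghosting problem described earlier in this thread (directories being created for cached entries of failed lookups) can be sketched in a few lines. This is an illustration only and not the autofs code or the patch under discussion: struct cache_entry, its fields and ghost_candidates are hypothetical, and the assumption that a negative entry can be recognised by a missing map entry string is made purely for the example.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a cached map entry. */
struct cache_entry {
    const char *key;     /* name looked up under the autofs mount point */
    const char *mapent;  /* map entry text, NULL for a failed (negative) lookup */
};

/*
 * Collect the keys that should get a ghost directory.
 * Negative entries are skipped so that a failed lookup such as
 * /vol/invalid never turns into a directory.
 * Returns the number of keys written to `out` (at most `out_len`).
 */
size_t ghost_candidates(const struct cache_entry *cache, size_t n,
                        const char **out, size_t out_len)
{
    assert(cache != NULL && out != NULL);

    size_t count = 0;

    for (size_t i = 0; i < n && count < out_len; i++) {
        if (cache[i].mapent == NULL)
            continue;  /* negative entry: do not create a directory */
        out[count++] = cache[i].key;
    }

    return count;
}
```

With a cache holding data1, a negative entry for invalid, and data2, only data1 and data2 are returned. As noted in the replies, there may be other code paths that need the same check, so a filter like this might not be the complete fix.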
Re: [autofs] AutoFS creates unmountable directories when ghosting is enabled
On Wed, 2011-03-09 at 12:15 +0800, Ian Kent wrote: Hello Ian, I bisected this problem today and discovered that it appeared after the following commit: commit 08aafab4c1d0ab6227c80f8cd1086ae78556a370 Author: Ian Kent ra...@themaw.net Date: Thu Sep 9 11:10:47 2010 +0800 autofs-5.0.5 - fix direct map not updating on reread If the map type is explicitly specified for a map the map isn't properly updated when a re-read is requested. This is because the map stale flag is incorrectly cleared after the lookup module reads the map instead of at the completion of the update procedure. The map stale flag should only be cleared if the map read fails for some reason, otherwise it is updated when the refresh is completed. Does this ring a bell? If not, I'll see if I can debug it further tomorrow. Not really. I'm still not sure your original patch deals with the problem fully. Could you also have a look at and try (from kernel.org): autofs-5.0.5-fix-prune-cache-valid-check.patch Although, looking again, maybe we need both patches. Ian
Re: [autofs] 5.0.5 non-expiring mounts
On Fri, 2011-03-04 at 17:10 -0300, Leonardo Chiquitto wrote: On Wed, Feb 16, 2011 at 5:08 AM, Ian Kent ra...@themaw.net wrote: On Tue, 2011-02-15 at 16:11 -0200, Leonardo Chiquitto wrote: On Tue, Feb 15, 2011 at 10:28 AM, Ian Kent ra...@themaw.net wrote: On Mon, 2011-02-14 at 21:28 -0800, Mike Marion wrote: On Mon, Feb 14, 2011 at 07:37:01PM -0800, Ian Kent wrote: That is kernel revision and autofs revision? 2.6.16.60-0.59.1 (Sles10 sp3 with an updated, but not bleeding edge, patch). autofs 5.0.5 with most of the patches up to a couple months ago. It's hard to get exacts because it's a PTF from Novell (we really pushed them to upgrade to 5.0.5) but it should be pretty much equal to the patch they just released for sle 11 sp1 that they're recommending as they default going forward. That make it hard, as you know. But I wouldn't mind spending a bit of time on it, if you can also. Let's assume that it's a user space problem for now. Here are the call traces for all automount processes on the kernel side: I think it's a user space problem. snip ... 
And here are the call traces from the user land daemon: Thread 9 (Thread 4017): #0 0x2b56e465d6a8 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0 #1 0x2b56e46599fb in _L_mutex_lock_92 () from /lib64/libpthread.so.0 #2 0x2b56e4659455 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x555746cd in master_mutex_lock () at master.c:49 #4 0xd260 in do_hup_signal (master=0x5568d010, age=1296063258) at automount.c:1276 #5 0x55560bd3 in statemachine (arg=value optimized out) at automount.c:1354 #6 main (arg=value optimized out) at automount.c:2142 Thread 8 (Thread 20702): #0 0x2b56e4dd62a7 in brk () from /lib64/libc.so.6 #1 0x55577dfe in expire (logopt=2, cmd=value optimized out, fd=21, ioctlfd=21, path=0x5569ca20 /usr2, arg=0x41c27ef4) at dev-ioctl-lib.c:657 #2 0x55577ebe in ioctl_expire (logopt=21, ioctlfd=-1, path=0x5569ca20 /usr2, when=0) at dev-ioctl-lib.c:701 #3 0x55561e4e in expire_proc_indirect (arg=value optimized out) at indirect.c:545 #4 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0 #5 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6 #6 0x in ?? 
() Thread 7 (Thread 7060): #0 0x2b56e465ac77 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0 #1 0x555752ea in master_source_writelock (entry=value optimized out) at master.c:527 #2 0x55575f8f in master_add_map_source (entry=0x556a10b0, type=0x0, format=0x0, age=1296059657, argc=1, argv=value optimized out) at master.c:191 #3 0x55579ee3 in master_parse_entry (buffer=value optimized out, default_timeout=86400, logging=value optimized out, age=1296059657) at master_parse.y:823 #4 0x2aab83fe in lookup_read_master (master=value optimized out, age=1296059657, context=value optimized out) at lookup_ldap.c:1625 #5 0x55569052 in do_read_master (master=0x5568d010, type=value optimized out, age=1296059657) at lookup.c:96 #6 0x5556aa3c in lookup_nss_read_master (master=0x5568d010, age=1296059657) at lookup.c:229 #7 0x55575c28 in master_read_master (master=0x5568d010, age=1296059657, readall=1) at master.c:831 #8 0xd844 in do_read_master (arg=value optimized out) at automount.c:1259 #9 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0 #10 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6 #11 0x in ?? () Thread 6 (Thread 6851): #0 0x2b56e465aa3d in pthread_rwlock_rdlock () from /lib64/libpthread.so.0 #1 0x5556deb6 in cache_readlock (mc=0x5568e5b8) at cache.c:60 #2 0x5556baff in do_readmap (arg=value optimized out) at state.c:479 #3 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0 #4 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6 #5 0x in ?? 
() Thread 5 (Thread 4026): #0 0x2b56e465d6a8 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0 #1 0x2b56e46599fb in _L_mutex_lock_92 () from /lib64/libpthread.so.0 #2 0x2b56e4659455 in pthread_mutex_lock () from /lib64/libpthread.so.0 #3 0x555746cd in master_mutex_lock () at master.c:49 #4 0x55560ff9 in handle_packet_missing_indirect (ap=0x5569c940, pkt=0x41823ec0) at indirect.c:808 #5 0xfa32 in handle_packet (ap=value optimized out) at automount.c:1026 #6 handle_mounts (ap=value optimized out) at automount.c:1551 #7 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0 #8 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6 #9 0x
Re: [autofs] Before I start carving wheels....
On Sat, 2011-02-26 at 15:24 -0500, Vincent Liggio wrote: Ok, that works. I will put a bug report into redhat so they hopefully will integrate that into the latest code. Why they are releasing a -18 for F15 I don't get, when it doesn't even have any code changes. You didn't look: * Mon Feb 07 2011 Fedora Release Engineering rel-...@lists.fedoraproject.org - 5:6.1.5-18 - Rebuilt for https://fedoraproject.org/wiki/Fedora_15_Mass_Rebuild Standard practice when branching a release, to ensure packages that may not have been re-built don't have broken dependencies. So to sum up, amd 6.1.5 patched as below works fine on F14 kernels with autofs4. The amd redhat supplies does NOT work, in fact, it doesn't even load properly with autofs enabled. What are you saying? If you add this patch to Fedora am-utils it then works? If that's not the case then there is more work to do! Vince On 02/25/2011 12:39 PM, Ion Badulescu wrote: On Thu, 24 Feb 2011, Ian Kent wrote: But the autofs4 module should be able to be used for autofs kernel protocol version 3. It may require some changes in user space and, since the v3 protocol in autofs4 hasn't been tested for so long, there may be some other bugs that need fixing. Problem is that amd I believe specifically looks for autofs3 (even though the code says minimum autofs version 3, it fails to work with autofs4). And since no one is responding to the bug I put in about that, and the last time code was released for amd was in 2005, amd using autofs on the current F14 kernel seems to be dead. I'll grab the amd source and have a quick look. Where is the right place to get it? It looks like amd should work with autofs protocol version v4. You should also try modprobe autofs4 before starting amd and see what happens. No, amd will happily work with either autofs3 or autofs4. The problem is that it (optimistically) tries to use the highest version that the kernel supports, which is autofs5 these days. 
But it itself doesn't have support for autofs5, so it fails miserably.

The patch (copy/pasted so it might not apply cleanly) fixes autofs:

commit 5cefcd3e1c7cb4943697e48996b8b1cbc7a9e7de
Author: Ion Badulescu io...@buggy.badula.org
Date: Tue Nov 30 07:14:23 2010 -0500

    max supported autofs version is 4

diff --git a/conf/autofs/autofs_linux.c b/conf/autofs/autofs_linux.c
index af61804..e901da7 100644
--- a/conf/autofs/autofs_linux.c
+++ b/conf/autofs/autofs_linux.c
@@ -59,7 +59,7 @@
  */
 #define AUTOFS_MIN_VERSION 3
-#define AUTOFS_MAX_VERSION AUTOFS_MAX_PROTO_VERSION
+#define AUTOFS_MAX_VERSION 4
 /*

-Ion
___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
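The idea behind the patch can be sketched as follows. This is only an illustration, not am-utils code: the function name is made up, and the only values taken from the patch are the minimum of 3 and the maximum of 4. The daemon asks for whatever the kernel offers, capped at the highest protocol version it actually implements.

```c
/* Range the daemon implements; 3 and 4 mirror the patch. */
#define DAEMON_MIN_VERSION 3
#define DAEMON_MAX_VERSION 4

/*
 * Hypothetical helper: choose the protocol version to request given the
 * highest version the kernel module offers. Returns -1 when the kernel
 * offers nothing the daemon can speak.
 */
static int pick_protocol_version(int kernel_max)
{
	if (kernel_max < DAEMON_MIN_VERSION)
		return -1;
	if (kernel_max > DAEMON_MAX_VERSION)
		return DAEMON_MAX_VERSION;	/* kernel is newer: cap, don't follow */
	return kernel_max;
}
```

With a kernel offering version 5, this picks 4 rather than following the kernel to a protocol the daemon cannot parse.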
Re: [autofs] Before I start carving wheels....
On Fri, 2011-02-25 at 12:39 -0500, Ion Badulescu wrote:
On Thu, 24 Feb 2011, Ian Kent wrote:

But the autofs4 module should be able to be used for autofs kernel protocol version 3. It may require some changes in user space and, since the v3 protocol in autofs4 hasn't been tested for so long, there may be some other bugs that need fixing.

Problem is that amd I believe specifically looks for autofs3 (even though the code says minimum autofs version 3, it fails to work with autofs4). And since no one is responding to the bug I put in about that, and the last time code was released for amd was in 2005, amd using autofs on the current F14 kernel seems to dead.

I'll grab the amd source and have a quick look. Where is the right place to get it?

It looks like amd should work with autofs protocol version v4. You should also try modprobe autofs4 before starting amd and see what happens.

No, amd will happily work with either autofs3 or autofs4.

My point above was that, in the past we've seen the wrong module loaded or no module loaded because of name match failure. Pre-loading the module uncovers those sorts of difficulties.

The problem is that it (optimistically) tries to use the highest version that the kernel supports, which is autofs5 these days. But it itself doesn't have support for autofs5, so it fails miserably.

The patch (copy/pasted so it might not apply cleanly) fixes autofs:

commit 5cefcd3e1c7cb4943697e48996b8b1cbc7a9e7de
Author: Ion Badulescu io...@buggy.badula.org
Date: Tue Nov 30 07:14:23 2010 -0500

    max supported autofs version is 4

diff --git a/conf/autofs/autofs_linux.c b/conf/autofs/autofs_linux.c
index af61804..e901da7 100644
--- a/conf/autofs/autofs_linux.c
+++ b/conf/autofs/autofs_linux.c
@@ -59,7 +59,7 @@
  */
 #define AUTOFS_MIN_VERSION 3
-#define AUTOFS_MAX_VERSION AUTOFS_MAX_PROTO_VERSION
+#define AUTOFS_MAX_VERSION 4
 /*

-Ion
Re: [autofs] Before I start carving wheels....
On Thu, 2011-02-24 at 14:42 +0800, Ian Kent wrote: On Wed, 2011-02-23 at 23:35 -0500, Vincent Liggio wrote: On Thu, 24 Feb 2011, Ian Kent wrote: I reported on both kernel.org and on am-utils.org that autofs and amd do not work with F12's kernel in March/April of 2010 (kernel.org bug 15878 and am-utils bug 639). No one has acknowledged or worked on the bug, as far as I can tell on the respective bugzillas. I don't remember seeing any mail on that bug even though I'm on the cc list for it. In any case, it's asking for the autofs module to be built as default which isn't likely to happen since, even at that time, the autofs module was going to be removed from the kernel. It (autofs3) still exists as an option in F14 (if we build our own kernel), but doesn't seem to work either as a module or compiled into the kernel (it did work in F12). If there were a way for amd to work with autofs4, that'd be great, but it doesn't. But the autofs4 module should be able to be used for autofs kernel protocol version 3. It may require some changes in user space and, since the v3 protocol in autofs4 hasn't been tested for so long, there may be some other bugs that need fixing. Problem is that amd I believe specifically looks for autofs3 (even though the code says minimum autofs version 3, it fails to work with autofs4). And since no one is responding to the bug I put in about that, and the last time code was released for amd was in 2005, amd using autofs on the current F14 kernel seems to dead. I'll grab the amd source and have a quick look. Where is the right place to get it? It looks like amd should work with autofs protocol version v4. You should also try modprobe autofs4 before starting amd and see what happens. Ian ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Before I start carving wheels....
On Thu, 2011-02-24 at 15:11 -0500, Vincent Liggio wrote:
On 02/24/2011 05:42 AM, Ian Kent wrote:

It looks like amd should work with autofs protocol version v4. You should also try modprobe autofs4 before starting amd and see what happens.

We tried that before, and this is what happens:

Feb 24 14:40:39 canton_64 amd[1504]: initializing amd.conf map amd.net of type nis
Feb 24 14:40:39 canton_64 amd[1504]: amd.net mounted fstype toplvl on /net
Feb 24 14:40:39 canton_64 amd[1504]: autofs: using protocol version 5

This is wrong, amd hasn't used the correct mount options or I'm not parsing the options correctly. Not sure that I am parsing the options incorrectly since autofs version 4 needs to use the version 4 protocol and it seems to be able to request that OK.

Feb 24 14:40:39 canton_64 amd[1504]: /net set to never timeout

Then we change to a directory and we get:

Feb 24 14:40:57 canton_64 amd[1504]: Unknown autofs packet type 3

lsmod
Module                  Size  Used by
autofs4                22687  13
ipv6                  278339  24
ppdev                   7925  0
parport_pc             21081  0
parport                31509  2 ppdev,parport_pc
e1000                  92062  0
i2c_piix4              11998  0
shpchp                 29568  0
i2c_core               26926  1 i2c_piix4
mptspi                 14609  3
mptscsih               28444  1 mptspi
mptbase                74728  2 mptspi,mptscsih
scsi_transport_spi     22211  1 mptspi
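For readers decoding the "Unknown autofs packet type 3" message, here is a small sketch that maps packet type numbers to names and to the lowest protocol version that uses them. The numeric values are quoted from memory of the kernel's auto_fs.h and auto_fs4.h headers, so check them against your own headers; the two function names are invented for this example.

```c
/*
 * Name for an autofs kernel packet type. Values as recalled from
 * linux/auto_fs.h and linux/auto_fs4.h (an assumption, verify locally).
 */
static const char *ptype_name(int type)
{
	switch (type) {
	case 0: return "missing";
	case 1: return "expire";
	case 2: return "expire_multi";
	case 3: return "missing_indirect";
	case 4: return "expire_indirect";
	case 5: return "missing_direct";
	case 6: return "expire_direct";
	default: return "unknown";
	}
}

/* Lowest protocol version that defines the packet type, or -1. */
static int ptype_min_protocol(int type)
{
	if (type == 0 || type == 1)
		return 3;
	if (type == 2)
		return 4;
	if (type >= 3 && type <= 6)
		return 5;
	return -1;
}
```

Read this way, a daemon that only understands protocol versions 3 and 4 has no handler for type 3, which fits the "using protocol version 5" line in the log.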
Re: [autofs] Before I start carving wheels....
On Wed, 2011-02-23 at 23:35 -0500, Vincent Liggio wrote: On Thu, 24 Feb 2011, Ian Kent wrote: I reported on both kernel.org and on am-utils.org that autofs and amd do not work with F12's kernel in March/April of 2010 (kernel.org bug 15878 and am-utils bug 639). No one has acknowledged or worked on the bug, as far as I can tell on the respective bugzillas. I don't remember seeing any mail on that bug even though I'm on the cc list for it. In any case, it's asking for the autofs module to be built as default which isn't likely to happen since, even at that time, the autofs module was going to be removed from the kernel. It (autofs3) still exists as an option in F14 (if we build our own kernel), but doesn't seem to work either as a module or compiled into the kernel (it did work in F12). If there were a way for amd to work with autofs4, that'd be great, but it doesn't. But the autofs4 module should be able to be used for autofs kernel protocol version 3. It may require some changes in user space and, since the v3 protocol in autofs4 hasn't been tested for so long, there may be some other bugs that need fixing. Problem is that amd I believe specifically looks for autofs3 (even though the code says minimum autofs version 3, it fails to work with autofs4). And since no one is responding to the bug I put in about that, and the last time code was released for amd was in 2005, amd using autofs on the current F14 kernel seems to dead. I'll grab the amd source and have a quick look. Where is the right place to get it? What exactly are the error messages you get when trying to use autofs4? When you have tried this have you added an alias to the modprobe config so that autofs4 will be loaded instead of autofs (although compiling in autofs4 only, not autofs, should also work)? Ian ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] automounting replicated servers using NFSv4 fails
On Mon, 2011-02-21 at 10:26 +0100, Nico De Ranter wrote: On Sat, 2011-02-19 at 11:48 +0800, Ian Kent wrote: On Fri, 2011-02-18 at 14:09 +0100, Nico De Ranter wrote: Hi, I have a number of Linux clients (Ubuntu 10.04) that mount a (read-only) directory from 3 replicated servers using NFSv3. I am now in the process of moving to NFSv4. I can mount the directories using NFSv4 manually. I can mount the directories using autofs over NFSv4 when I specify only 1 server (any of the 3 will work), but when I add all 3 servers to the automount configuration file the mount fails. Works OK for me, on Fedora 12. Like I said, that's F14. Which version of autofs are you using? Fedora 14 has 5.0.5 with, essentially all upstream patches except: autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch autofs-5.0.5-add-external-bind-method.patch autofs-5.0.5-fix-add-simple-bind-auth.patch autofs-5.0.5-add-dump-maps-option.patch autofs-5.0.5-fix-submount-shutdown-wait.patch And kernel current kernel revision is 2.6.35.10-74.fc14. Ian ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Problem of concurrent mount/umount call
On Thu, 2011-02-17 at 12:01 +0100, Erwan Loaëc wrote: Hello, I'm going to generate a new autofs package for squeeze for our IT, and I notice that the patch you've done about my problem does not seems to be available on ftp://ftp.kernel.org/pub/linux/daemons/autofs/v5/ Is there a reason ? No, it's in the queue waiting to get pushed, when I do push the patches I have in the queue. Some will get dropped, I rearrange the order fairly often, but eventually I'll push some of the patches. I don't have any reason to drop your patch so it'll stay in the queue until I reach a point where I'm ready to push them. Is the problem has been solved elsewhere ? Nope. -- Erwan Ian Kent wrote: On Tue, 2010-12-07 at 09:54 +0100, Erwan Loaëc wrote: Hello! For information, since I've updated our autofs with the suggested patch, the problem has not occurred, and I don't see any CPU usage problem. Thanks for letting me know. I'll commit that change soon as I get a chance. Thanks!! I hope this problem is definitively solved :o) -- Erwan Loaec Ian Kent wrote: On Tue, 2010-10-26 at 15:22 +0200, Erwan Loaëc wrote: Hi, First, thank you for spending time to analyse the problem. I've apply the patch in our dev environnement, and defined a slow timeout for the mount (60s) FYI, in production, in order to limit this problem, the timeout is set to 2 hours. When it will be possible, I'll update every autofs. The problem is quite rare, so I won't be able to attest that the problem is solve or not immediatly. Right, that sounds sensible since I can't work out how it gets through. I also need to know that the change doesn't cause any unwanted side effects so let me know how things go. Adding an is_mounted() check in a frequently called function like this can have an overhead if your using the old ioctl interface, which you are by the look of the log. This should only show up if you have a largish number of mounts. 
How many is a large number is very much site dependent but if the CPU usage of the daemon is acceptable to you then you don't need to worry about it. In addition, can you tell me why the patch is autofs-5.0.3 - fix expire race and not autofs-5.0.5 ? That's just a typo, I'll rename the patch before committing it. Thanks, Erwan Ian Kent wrote: On Tue, 2010-10-26 at 09:22 +0200, Erwan Loaëc wrote: Sorry for my previous mail with missing parts... Ian Kent wrote: On Fri, 2010-10-22 at 17:19 +0200, Erwan Loaëc wrote: Hello, Sorry for the time, but before posting again I've upgrade every production servers with newest autofs4 module (with last patch) and last What does newest autofs4 module mean exactly? Yes it is not the newest module but the module recompiled with the patch autofs4-2.6.26-v5-update-20090903.patch OK. automount with all patch EXCEPT these: autofs-5.0.5-fix-restart.patch autofs-5.0.5-fix-status-privilege-error.patch autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch autofs-5.0.5-add-external-bind-method.patch autofs-5.0.5-fix-add-simple-bind-auth.patch autofs-5.0.5-add-dump-maps-option.patch autofs-5.0.5-fix-submount-shutdown-wait.patch Today I had the same behaviour than the issue explained my previous mail. Oct 22 16:44:06 SERVERNAME automount[2665]: umount_autofs_offset: couldn't get ioctl fd for offset /cifs/XXX/volume: No such file or directory Oct 22 16:44:06 SERVERNAME automount[2665]: handle_packet_missing_direct:1363: can't find map entry for (20,3548545) This could be caused by a umount returning success when in fact it didn't succeed with the umount. Are you sure umount is returning correct status? Unfortunaly the last case occured on production server with debug disable. I can't find more information in logfile... 
But, as I've explained in my previous mail, this could be logical with the bad sequence found in my previous case:

* Call to the share /cifs/XXX/volume
* Mount /cifs/XXX/volume expiring
* New call to the share /cifs/XXX/volume
* Umount /cifs/XXX/volume

But that isn't quite what the problem is. I've had a closer look at the log and:

Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_direct: token 1526, name /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path /cifs/SERV2/TM_termoz incl 1
Aug 26 10:11:42 bacchus automount[17827]: umount_subtree_mounts: unmounting dir = /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1526

/cifs/SERV2/TM_termoz is umounted
Re: [autofs] automounting replicated servers using NFSv4 fails
On Fri, 2011-02-18 at 14:09 +0100, Nico De Ranter wrote:

Hi,

I have a number of Linux clients (Ubuntu 10.04) that mount a (read-only) directory from 3 replicated servers using NFSv3. I am now in the process of moving to NFSv4. I can mount the directories using NFSv4 manually. I can mount the directories using autofs over NFSv4 when I specify only 1 server (any of the 3 will work), but when I add all 3 servers to the automount configuration file the mount fails.

Works OK for me, on Fedora 12.

The configuration files look as follows:

# /etc/automaster
/- /etc/auto.local

# /etc/auto.local
/usr/local -fstype=nfs4,ro,nodev,nosuid,nonstrict,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft server1:/local/ubuntu64 server2:/local/ubuntu64 server3:/local/ubuntu64

(Note: the content of /etc/auto.local is on 1 line but my e-mail application is splitting the content over multiple lines)

As I said above a similar setup using nfs instead of nfs4 works fine, specifying only 1 server works fine too.
If I run automount manually with verbose and debugging enabled I see the following output when trying to access /usr/local:

#
handle_packet: type = 5
handle_packet_missing_direct: token 296, name /usr/local, request pid 2887
attempting to mount entry /usr/local
lookup_mount: lookup(file): looking up /usr/local
lookup_mount: lookup(file): /usr/local -> -fstype=nfs4,ro,nodev,nosuid,nonstrict,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft server1:/local/ubuntu64 server2:/local/ubuntu64
parse_mount: parse(sun): expanded entry: -fstype=nfs4,ro,nodev,nosuid,nonstrict,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft server1:/local/ubuntu64 server2:/local/ubuntu64
parse_mount: parse(sun): gathered options: fstype=nfs4,ro,nodev,nosuid,nonstrict,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft
parse_mount: parse(sun): dequote(server1:/local/ubuntu64) -> server1:/local/ubuntu64
parse_mount: parse(sun): dequote(server2:/local/ubuntu64) -> server2:/local/ubuntu64
parse_mount: parse(sun): core of entry: options=fstype=nfs4,ro,nodev,nosuid,nonstrict,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft, loc=server1:/local/ubuntu64 server2:/local/ubuntu64
sun_mount: parse(sun): mounting root /usr/local, mountpoint /usr/local, what server1:/local/ubuntu64 server2:/local/ubuntu64, fstype nfs4, options ro,nodev,nosuid,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft
mount_mount: mount(nfs): root=/usr/local name=/usr/local what=server1:/local/ubuntu64 server2:/local/ubuntu64, fstype=nfs4, options=ro,nodev,nosuid,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft
mount_mount: mount(nfs): nfs options=ro,nodev,nosuid,nodev,sync,_netdev,proto=tcp,retry=10,rsize=8192,wsize=8192,soft, nosymlink=0, ro=1
get_nfs_info: called for host server2 proto tcp version 0x40
get_nfs_info: called for host server1 proto tcp version 0x40
mount(nfs): no hosts available
dev_ioctl_send_fail: token = 296
failed to mount /usr/local

Any idea what might be going wrong?

Thanks in advance,
Nico
Re: [autofs] 5.0.5 non-expiring mounts
On Wed, 2011-02-16 at 15:08 +0800, Ian Kent wrote: On Tue, 2011-02-15 at 16:11 -0200, Leonardo Chiquitto wrote: On Tue, Feb 15, 2011 at 10:28 AM, Ian Kent ra...@themaw.net wrote: On Mon, 2011-02-14 at 21:28 -0800, Mike Marion wrote: On Mon, Feb 14, 2011 at 07:37:01PM -0800, Ian Kent wrote: That is kernel revision and autofs revision? 2.6.16.60-0.59.1 (Sles10 sp3 with an updated, but not bleeding edge, patch). autofs 5.0.5 with most of the patches up to a couple months ago. It's hard to get exacts because it's a PTF from Novell (we really pushed them to upgrade to 5.0.5) but it should be pretty much equal to the patch they just released for sle 11 sp1 that they're recommending as they default going forward. Now I'm confused? I thought that Mike had mention he had seen hangs, similar to Steve, and this backtrace was an example of that. But the mail thread doesn't read like that and oddly enough I seem to have identified a locking problem from looking at the code based on the backtrace. A backtrace generally doesn't do us any good when were trying to find an expire problem, the debug log is where we have to start on these. That make it hard, as you know. But I wouldn't mind spending a bit of time on it, if you can also. Let's assume that it's a user space problem for now. Here are the call traces for all automount processes on the kernel side: I think it's a user space problem. snip ... 
And here are the call traces from the user land daemon:

Thread 9 (Thread 4017):
#0 0x2b56e465d6a8 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1 0x2b56e46599fb in _L_mutex_lock_92 () from /lib64/libpthread.so.0
#2 0x2b56e4659455 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x555746cd in master_mutex_lock () at master.c:49
#4 0xd260 in do_hup_signal (master=0x5568d010, age=1296063258) at automount.c:1276
#5 0x55560bd3 in statemachine (arg=value optimized out) at automount.c:1354
#6 main (arg=value optimized out) at automount.c:2142

Thread 8 (Thread 20702):
#0 0x2b56e4dd62a7 in brk () from /lib64/libc.so.6
#1 0x55577dfe in expire (logopt=2, cmd=value optimized out, fd=21, ioctlfd=21, path=0x5569ca20 /usr2, arg=0x41c27ef4) at dev-ioctl-lib.c:657
#2 0x55577ebe in ioctl_expire (logopt=21, ioctlfd=-1, path=0x5569ca20 /usr2, when=0) at dev-ioctl-lib.c:701
#3 0x55561e4e in expire_proc_indirect (arg=value optimized out) at indirect.c:545
#4 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0
#5 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6
#6 0x in ?? ()

Thread 7 (Thread 7060):
#0 0x2b56e465ac77 in pthread_rwlock_wrlock () from /lib64/libpthread.so.0
#1 0x555752ea in master_source_writelock (entry=value optimized out) at master.c:527
#2 0x55575f8f in master_add_map_source (entry=0x556a10b0, type=0x0, format=0x0, age=1296059657, argc=1, argv=value optimized out) at master.c:191
#3 0x55579ee3 in master_parse_entry (buffer=value optimized out, default_timeout=86400, logging=value optimized out, age=1296059657) at master_parse.y:823
#4 0x2aab83fe in lookup_read_master (master=value optimized out, age=1296059657, context=value optimized out) at lookup_ldap.c:1625
#5 0x55569052 in do_read_master (master=0x5568d010, type=value optimized out, age=1296059657) at lookup.c:96
#6 0x5556aa3c in lookup_nss_read_master (master=0x5568d010, age=1296059657) at lookup.c:229
#7 0x55575c28 in master_read_master (master=0x5568d010, age=1296059657, readall=1) at master.c:831
#8 0xd844 in do_read_master (arg=value optimized out) at automount.c:1259
#9 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0
#10 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6
#11 0x in ?? ()

Thread 6 (Thread 6851):
#0 0x2b56e465aa3d in pthread_rwlock_rdlock () from /lib64/libpthread.so.0
#1 0x5556deb6 in cache_readlock (mc=0x5568e5b8) at cache.c:60
#2 0x5556baff in do_readmap (arg=value optimized out) at state.c:479
#3 0x2b56e4657193 in start_thread () from /lib64/libpthread.so.0
#4 0x2b56e4ddcdfd in sysctl () from /lib64/libc.so.6
#5 0x in ?? ()

Thread 5 (Thread 4026):
#0 0x2b56e465d6a8 in __lll_mutex_lock_wait () from /lib64/libpthread.so.0
#1 0x2b56e46599fb in _L_mutex_lock_92 () from /lib64/libpthread.so.0
#2 0x2b56e4659455 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3 0x555746cd in master_mutex_lock () at master.c:49
#4 0x55560ff9 in handle_packet_missing_indirect (ap=0x5569c940, pkt
Re: [autofs] Wrong network
On Wed, 2011-02-16 at 17:00 -0500, Steve Thompson wrote:
On Tue, 15 Feb 2011, Ian Kent wrote:

As far as the hang you have seen, I don't know why that's happening, the patches that were added between el5_5.4 and el5_5.6 have been around for quite a while, upstream and in Fedora and tested by more than one customer, so I didn't expect to hear of a problem.

While thinking about this I noticed that the automount hangs all occurred on systems that had SElinux set to enforcing mode; there were no hangs on systems with SElinux either disabled or in permissive mode. So for now I have disabled SElinux on most of the systems on which a hang was observed. So far, so good: no more hangs. But it's only been a couple of days, so I will keep an eye on it.

Right, but to help me a gdb backtrace (the thr a a bt) of a hung automount process would be useful. Also, an avc log from a system with selinux in permissive mode is needed to find out what is happening.

I don't think the changes to the package caused this breakage and QA testing is always done with selinux in enforcing mode. So maybe there has been an selinux policy change too.

Ian
Re: [autofs] 5.0.5 non-expiring mounts
On Wed, 2011-02-16 at 17:08 -0800, Deke Clinger wrote:
On Wed Feb 16 12:18:15 UTC 2011 Ian Kent wrote:

A backtrace generally doesn't do us any good when we're trying to find an expire problem, the debug log is where we have to start on these.

I've got a debug log from a sled10sp3 machine running the Novell autofs 5.0.5 update. Ian - could I mail you this personally? I'd rather not have this log with usernames, paths, hostnames, etc. on a public archive.

Yes please.

FWIW, I did do a test with autofs5.0.5 built from source with all the patches in the patch order list from kernel.org and it demonstrated the same behavior: a USR1 signal unmounted the direct map entries but not the indirect. I changed the maps to files, pruned them to a few hundred entries and converted the indirect entries to direct and re-ran the test. Configured like this all entries unmounted upon a USR1 signal so I do believe this has something to do with direct vs indirect mounts.

There are a couple of possibilities. I'm always working with the current source and basic testing includes expiring both direct and indirect mounts as a matter of course and I'm not seeing this. So there has to be more to it or I have one or other patches already in the queue that resolve the problem.

Does this happen straight away or start after some time of running? Do all indirect mounts stop expiring or only some? What is the form of the indirect map entries, are they multi-mount entries?

Obviously the debug log will probably answer most of these questions.

Ian
Re: [autofs] Wrong network
On Tue, 2011-02-15 at 07:40 -0500, Steve Thompson wrote:
On Tue, 15 Feb 2011, Ian Kent wrote:

Fair call, but it comes over as though you don't want to contribute at all, which isn't good.

That's not the impression I meant to give at all. I _am_ building some new test machines (about ten of them), and I _will_ do some testing. My test machines will all be 32-bit, though, as there is no spare 64-bit hardware. I just can't go testing on a production network; what with this and the hang problem I described in a separate post, I'm close enough to being lynched as it is. I have written my own applications of a similar level of complexity to autofs, and I certainly appreciate that it's not easy, and I definitely appreciate the job that you're doing.

We're good then. I appreciate you are in a difficult spot, but spare a thought for the pressures I may have that tend to make me a bit short from time to time, ;)

As far as the hang you have seen, I don't know why that's happening, the patches that were added between el5_5.4 and el5_5.6 have been around for quite a while, upstream and in Fedora and tested by more than one customer, so I didn't expect to hear of a problem. Like I said, send me a gdb backtrace so that I can see where it's happening.

Just to confirm, this is autofs-5.0.1-0.rc2.143.el5_5.6, and what kernel revision, I guess the same as RHEL-5.5, which should be fine.

I can post the previous RedHat package to people.redhat.com so you can revert back while we work on this, if that will help.

Ian
Re: [autofs] autofs5.0.5 patches for old kernels.
On Tue, 2011-02-01 at 21:21 +0530, ki...@serc.iisc.ernet.in wrote:

Hi,

In the autofs-5.0.5 distribution there are no kernel V5 patches for old kernels (2.6.9 to 2.6.17), but the INSTALL document says:

Applying The Kernel Patch
=

Patches that can be applied to the kernel are located in the `patches' directory. If you have installed autofs from an rpm then they can be found in the packages' doc directory after install. They consist of a kernel patch for each kernel from 2.6.9 thru 2.6.16 (the patches are in the 2.6.17-rc series so patching a 2.6.17 or above kernel shouldn't be needed).

=

Where can we find patches for old kernels?

There aren't any. As changes have accumulated, back porting the patches has become more and more difficult due to changes in other parts of the kernel. So I had to stop back porting them, especially since I can't properly test them.

Is v5.0.5 not supported on old kernels?

2.6.18 is an old kernel.

You can use the patches from older v5 source tarballs and report any problems but if you use autofs heavily you probably should be using a more recent kernel. There are back ports of the patch series in those tars. You won't get the new device ioctl interface changes (that was about the time I had to stop) but autofs will know the kernel doesn't have this and use the old ioctl interface.

I could spend a little time having a look at what I have but not for a while yet. What problem are you trying to resolve exactly?

Thanking you,
regards,
kiran
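The fallback mentioned above can be pictured roughly like this. It is a simplified sketch, not the daemon's code: it assumes the newer interface is reached through a control device node (commonly /dev/autofs) and that a failed open means the kernel lacks it. The enum and function names are invented here, and a real check would presumably also verify the interface version after opening the device.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical labels for the two control interfaces. */
enum ctl_iface {
	IFACE_OLD_IOCTL,	/* ioctls on the mount point itself */
	IFACE_DEV_IOCTL		/* ioctls through the control device */
};

/*
 * Try the control device; if it cannot be opened, fall back to the
 * old per-mount ioctl interface.
 */
static enum ctl_iface pick_ctl_iface(const char *ctl_path)
{
	int fd = open(ctl_path, O_RDONLY);

	if (fd < 0)
		return IFACE_OLD_IOCTL;
	close(fd);
	return IFACE_DEV_IOCTL;
}
```

On a kernel without the device ioctl changes the open fails and the old interface is selected, which is the behaviour described for unpatched older kernels.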
Re: [autofs] autofs4_d_automount() can change path->dentry param
On Sat, 2011-01-15 at 11:11 +0000, David Howells wrote:

Hi Ian,

I've just noticed that autofs4_d_automount() can change the dentry pointer in the path parameter (via autofs4_mountpoint_changed()). Is this just doing a straight substitution of one dentry for its equivalent?

I don't think it'll be a problem for follow_automount() and follow_managed(), provided the dentry stays in the same namespace - but if we eliminate the vfsmount pointer and just pass the dentry pointer in to d_automount(), you won't be able to do this anymore.

Would it work to simply return NULL here and hope the recheck picks up the substitution?

I don't think so.

It happens if the mount point dentry is removed and recreated during a callback to the daemon so a d_lookup() then returns the replacement dentry. If the vfsmount isn't available we would need to be able to return the new dentry in much the same way as ->lookup() will use a replacement dentry if it is returned. I guess that's still a problem if we need to return a vfsmount or ERR_PTR or NULL.

Ian
Re: [autofs] WARNING: at fs/dcache.c:1359 d_set_d_op [was: mmotm 2011-01-06-15-41 uploaded]
On Thu, 2011-01-13 at 16:40 +0100, Jiri Slaby wrote:
On 01/13/2011 04:33 PM, valdis.kletni...@vt.edu wrote:
On Thu, 13 Jan 2011 10:52:22 +0100, Jiri Slaby said:
On 01/07/2011 12:41 AM, a...@linux-foundation.org wrote:

The mm-of-the-moment snapshot 2011-01-06-15-41 has been uploaded to

Hi, after some uptime and several suspend/resume cycles, I got:

WARNING: at fs/dcache.c:1359 d_set_d_op+0x82/0xb0()
Hardware name: To Be Filled By O.E.M.
Modules linked in: dvb_usb_af9015 tda18271 af9013 dvb_usb dvb_core
Pid: 3474, comm: automount Tainted: GW 2.6.37-mm1_64+ #1344
Call Trace:
[8106bd2a] ? warn_slowpath_common+0x7a/0xb0
[8106bd75] ? warn_slowpath_null+0x15/0x20
[81125a32] ? d_set_d_op+0x82/0xb0
[8120d829] ? autofs4_dir_mkdir+0x169/0x180

Wow. So it wasn't just configfs that trips over this one. I'm now hoping that Al audited all the pseudo file systems for this...

Well, CCing Al. I don't see any recent change in fs/autofs4 in:
http://git.kernel.org/?p=linux/kernel/git/viro/vfs-2.6.git;a=history;f=fs/autofs4;hb=refs/heads/for-next

So maybe not all?

Did you see: https://lkml.org/lkml/2011/1/12/394

Haven't had any feedback on this yet, odd or maybe no news is good news?

Ian
Re: [autofs] [PATCH 09/18] autofs4: Add d_manage() dentry operation [ver #4]
On Thu, 2011-01-13 at 21:54 +, David Howells wrote: From: Ian Kent ra...@themaw.net This patch required a previous patch to add the -d_automount() dentry operation. Add a function to use the newly defined -d_manage() dentry operation for blocking during mount and expire. Whether the VFS calls the dentry operations d_automount() and d_manage() is controled by the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags. autofs uses the d_automount() operation to callback to user space to request mount operations and the d_manage() operation to block walks into mounts that are under construction or destruction. In order to prevent these functions from being called unnecessarily the DMANAGED_* flags are cleared for cases which would cause this. In the common case the DMANAGED_AUTOMOUNT and DMANAGED_TRANSIT flags are both set for dentrys waiting to be mounted. The DMANAGED_TRANSIT flag is cleared upon successful mount request completion and set during expire runs, both during the dentry expire check, and if selected for expire, is left set until a subsequent successful mount request completes. The exception to this is the so-called rootless multi-mount which has no actual mount at its base. In this case the DMANAGED_AUTOMOUNT flag is cleared upon successful mount request completion as well and set again after a successful expire. 
Signed-off-by: Ian Kent ra...@themaw.net
Signed-off-by: David Howells dhowe...@redhat.com
---
 fs/autofs4/autofs_i.h |   50 -
 fs/autofs4/expire.c   |   51 +
 fs/autofs4/inode.c    |    3 +
 fs/autofs4/root.c     |  100 +++--
 4 files changed, 164 insertions(+), 40 deletions(-)

diff --git a/fs/autofs4/autofs_i.h b/fs/autofs4/autofs_i.h
index 1ebfe53..7eff538 100644
--- a/fs/autofs4/autofs_i.h
+++ b/fs/autofs4/autofs_i.h
@@ -99,7 +99,6 @@ struct autofs_info {
 };
 
 #define AUTOFS_INF_EXPIRING	(1<<0) /* dentry is in the process of expiring */
-#define AUTOFS_INF_MOUNTPOINT	(1<<1) /* mountpoint status for direct expire */
 #define AUTOFS_INF_PENDING	(1<<2) /* dentry pending mount */
 
 struct autofs_wait_queue {
@@ -221,6 +220,7 @@ extern const struct file_operations autofs4_root_operations;
 
 /* Operations methods */
 struct vfsmount *autofs4_d_automount(struct path *);
+int autofs4_d_manage(struct path *, bool);
 
 /* VFS automount flags management functions */
 
@@ -248,6 +248,54 @@ static inline void managed_dentry_clear_automount(struct dentry *dentry)
 	spin_unlock(&dentry->d_lock);
 }
 
+static inline void __managed_dentry_set_transit(struct dentry *dentry)
+{
+	dentry->d_flags |= DCACHE_MANAGE_TRANSIT;
+}
+
+static inline void managed_dentry_set_transit(struct dentry *dentry)
+{
+	spin_lock(&dentry->d_lock);
+	__managed_dentry_set_transit(dentry);
+	spin_unlock(&dentry->d_lock);
+}
+
+static inline void __managed_dentry_clear_transit(struct dentry *dentry)
+{
+	dentry->d_flags &= ~DCACHE_MANAGE_TRANSIT;
+}
+
+static inline void managed_dentry_clear_transit(struct dentry *dentry)
+{
+	spin_lock(&dentry->d_lock);
+	__managed_dentry_clear_transit(dentry);
+	spin_unlock(&dentry->d_lock);
+}
+
+static inline void __managed_dentry_set_managed(struct dentry *dentry)
+{
+	dentry->d_flags |= (DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT);
+}
+
+static inline void managed_dentry_set_managed(struct dentry *dentry)
+{
+	spin_lock(&dentry->d_lock);
+	__managed_dentry_set_managed(dentry);
+	spin_unlock(&dentry->d_lock);
+}
+
+static inline void __managed_dentry_clear_managed(struct dentry *dentry)
+{
+	dentry->d_flags &= ~(DCACHE_NEED_AUTOMOUNT|DCACHE_MANAGE_TRANSIT);
+}
+
+static inline void managed_dentry_clear_managed(struct dentry *dentry)
+{
+	spin_lock(&dentry->d_lock);
+	__managed_dentry_clear_managed(dentry);
+	spin_unlock(&dentry->d_lock);
+}
+
 /* Initializing function */
 
 int autofs4_fill_super(struct super_block *, void *, int);
diff --git a/fs/autofs4/expire.c b/fs/autofs4/expire.c
index 0571ec8..3ed79d7 100644
--- a/fs/autofs4/expire.c
+++ b/fs/autofs4/expire.c
@@ -26,10 +26,6 @@ static inline int autofs4_can_expire(struct dentry *dentry,
 	if (ino == NULL)
 		return 0;
 
-	/* No point expiring a pending mount */
-	if (ino->flags & AUTOFS_INF_PENDING)
-		return 0;
-
 	if (!do_now) {
 		/* Too young to die */
 		if (!timeout || time_after(ino->last_used + timeout, now))
@@ -283,6 +279,7 @@ struct dentry *autofs4_expire_direct(struct super_block *sb,
 	unsigned long timeout;
 	struct dentry *root = dget(sb->s_root);
 	int do_now = how & AUTOFS_EXP_IMMEDIATE;
+	struct autofs_info *ino;
 
 	if (!root)
 		return NULL;
@@ -291,20 +288,21 @@ struct dentry *autofs4_expire_direct(struct super_block *sb
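Note: the patch above pairs each flag helper with an unlocked variant (for callers that already hold the lock) and a locked wrapper. The following is only a userspace sketch of that pattern, not kernel code. The names `struct node`, `FLAG_NEED_AUTOMOUNT`, `FLAG_MANAGE_TRANSIT` and the `node_*` functions are hypothetical, and a pthread mutex stands in for the dentry spinlock.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical flag bits; the real DCACHE_* values are not reproduced here. */
#define FLAG_NEED_AUTOMOUNT (1u << 0)
#define FLAG_MANAGE_TRANSIT (1u << 1)

struct node {
    pthread_mutex_t lock;  /* stands in for dentry->d_lock */
    unsigned int flags;    /* stands in for dentry->d_flags */
};

static void node_init(struct node *n)
{
    pthread_mutex_init(&n->lock, NULL);
    n->flags = 0;
}

/* Unlocked variants: the caller must already hold n->lock. */
static void node_set_managed_locked(struct node *n)
{
    n->flags |= (FLAG_NEED_AUTOMOUNT | FLAG_MANAGE_TRANSIT);
}

static void node_clear_transit_locked(struct node *n)
{
    /* Clearing needs "&= ~mask"; a plain "= ~mask" would set every other bit. */
    n->flags &= ~FLAG_MANAGE_TRANSIT;
}

/* Locked wrappers: take the lock, call the unlocked variant, drop the lock. */
static void node_set_managed(struct node *n)
{
    pthread_mutex_lock(&n->lock);
    node_set_managed_locked(n);
    pthread_mutex_unlock(&n->lock);
}

static void node_clear_transit(struct node *n)
{
    pthread_mutex_lock(&n->lock);
    node_clear_transit_locked(n);
    pthread_mutex_unlock(&n->lock);
}
```

Calling `node_set_managed()` and then `node_clear_transit()` leaves only the automount bit set, which corresponds to the "successful mount request completion" case described in the changelog.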
Re: [autofs] autofs 5.0.4 and 5.0.5 do not remove ghost entries upon SIGHUP
On Thu, 2011-01-13 at 15:50 +0800, Ian Kent wrote:
> On Tue, 2011-01-11 at 18:34 +0200, Michael Orlov wrote:
> > Hi,
> > (I have sent this message on Oct 27, but nothing appears in the archives for that month - hopefully it's not a duplicate message.)
> > I think this is a bug - if it is not, I will be glad if someone corrects me. I am using autofs on Gentoo.
>
> You will need to duplicate this with all the current 5.0.5 patches on kernel.org applied to get serious interest in this here.

So I've done that and this patch seems to resolve the problem, but I doubt it will apply to the source you are using without some other dependent patches from kernel.org.

autofs-5.0.5 - fix prune cache valid check

From: Ian Kent ra...@themaw.net

During a map reload, when pruning the cache we look for a valid map entry in another map. In lookup_prune_one_cache() there is a missing check for the entry being in the current map, which prevents the directory cleanup code from doing its job.
---
 daemon/lookup.c |    8 ++++++++
 1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/daemon/lookup.c b/daemon/lookup.c
index a9a1f4d..36e60c9 100644
--- a/daemon/lookup.c
+++ b/daemon/lookup.c
@@ -1060,6 +1060,14 @@ void lookup_prune_one_cache(struct autofs_point *ap, struct mapent_cache *mc, ti
 		 * cache entry.
 		 */
 		valid = lookup_source_valid_mapent(ap, key, LKP_DISTINCT);
+		if (valid && valid->mc == mc) {
+			 /*
+			  * We've found a map entry that has been removed from
+			  * the current cache so it isn't really valid.
+			  */
+			cache_unlock(valid->mc);
+			valid = NULL;
+		}
 		if (!valid &&
 		    is_mounted(_PATH_MOUNTED, path, MNTS_REAL)) {
 			debug(ap->logopt,

___
autofs mailing list
autofs@linux.kernel.org
http://linux.kernel.org/mailman/listinfo/autofs
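Note: the check added by the patch above can be illustrated in isolation. This is a simplified sketch only: `struct cache`, `struct entry` and `valid_in_other_map()` are hypothetical stand-ins, and the cache locking that the real code performs (the `cache_unlock()` call) is left out.

```c
#include <assert.h>
#include <stddef.h>

struct cache { int id; };            /* hypothetical stand-in for a map cache */
struct entry { struct cache *mc; };  /* hypothetical stand-in for a map entry */

/*
 * Treat a lookup result as valid only if it lives in a cache other than
 * the one currently being pruned. An entry found in the current cache is
 * the stale one that is about to be removed, so it must not block cleanup.
 */
static struct entry *valid_in_other_map(struct entry *found, struct cache *current)
{
    if (found && found->mc == current)
        return NULL;
    return found;
}
```

With this check, a key that exists only in the map being pruned yields no valid entry, so the ghost directory cleanup can proceed; a key that is also defined in a different map still counts as valid.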
Re: [autofs] autofs 5.0.4 and 5.0.5 do not remove ghost entries upon SIGHUP
On Thu, 2011-01-13 at 16:36 +0200, Michael Orlov wrote:
> > So I've done that and this patch seems to resolve the problem but I doubt it will apply to the source you are using without some other dependent patches from kernel.org.
>
> Gentoo applies the patches listed in http://kernel.org/pub/linux/daemons/autofs/v5/patch_order-5.0.5 - is that what you mean?

Yes, but there are a few others I have here. The patches against this part of the code should all be available on kernel.org so you should be OK.

> > autofs-5.0.5 - fix prune cache valid check
>
> Thanks! Will you be adding the patch to http://kernel.org/pub/linux/daemons/autofs/v5/ ?

Not until I get feedback as to whether it fixes the issue and a little think time on my part.

Ian
Re: [autofs] Handling mount attempts to down NFS servers
On Tue, 2011-01-11 at 12:51 -0500, Paul Raines wrote:
> I found many users asking this on google search, but no solutions. Is there a way to make the '/bin/mount -t nfs' calls autofs makes time out faster when the server being accessed is down?
>
> Barring that, is there a way to make it such that when there is a '/bin/mount -t nfs' attempt going on that is hung, other mount attempts to perfectly working servers during that time do not also hang? This is what is really a killer.
>
> One thought I had is to add 'bg' to the mount options in the map files but I am scared that might cause 100s of hung mount attempts to pile up.
>
> I am using a mix of CentOS4.8 (autofs-4.1.4-2) and CentOS5.5 (autofs-5.0.1-0.rc2.143.el5_5.6) boxes.

Did you look in /etc/sysconfig/autofs for the CentOS5 boxes?

# MOUNT_WAIT - time to wait for a response from umount(8).
#              Setting this timeout can cause problems when
#              mount would otherwise wait for a server that
#              is temporarily unavailable, such as when it's
#              restarting. The default of waiting for mount(8)
#              usually results in a wait of around 3 minutes.
#
#MOUNT_WAIT=-1

Admittedly there is a typo, umount(8) should be mount(8). This has been present since revision 133.

It isn't the best, as it is difficult to locate the actual mount process so it is left to time out, but it does allow autofs to continue.

I don't think your chances are very good of getting anything into CentOS4.

Ian
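Note: the behaviour described for MOUNT_WAIT (stop waiting after a limit, leave the helper process to finish or time out on its own, and carry on) can be sketched in plain C. This is not how autofs implements it; `spawn_wait()`, `elapsed_secs()` and `WAIT_TIMED_OUT` are hypothetical names, and a negative `wait_secs` is assumed to mean "wait indefinitely", mirroring the -1 default shown above.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define WAIT_TIMED_OUT (-2)

/* Seconds elapsed on the monotonic clock since *start. */
static double elapsed_secs(const struct timespec *start)
{
    struct timespec now;
    clock_gettime(CLOCK_MONOTONIC, &now);
    return (double)(now.tv_sec - start->tv_sec) +
           (double)(now.tv_nsec - start->tv_nsec) / 1e9;
}

/*
 * Run argv[0] and wait at most wait_secs seconds for it (forever if
 * wait_secs is negative). Returns the child's exit status, -1 on error,
 * or WAIT_TIMED_OUT if the child is still running at the deadline. On
 * timeout the child is neither killed nor reaped; the caller moves on.
 */
static int spawn_wait(char *const argv[], int wait_secs)
{
    struct timespec start;
    clock_gettime(CLOCK_MONOTONIC, &start);

    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        execvp(argv[0], argv);
        _exit(127);  /* exec failed */
    }

    for (;;) {
        int status;
        pid_t r = waitpid(pid, &status, WNOHANG);
        if (r == pid)
            return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
        if (r < 0 && errno != EINTR)
            return -1;
        if (wait_secs >= 0 && elapsed_secs(&start) >= wait_secs)
            return WAIT_TIMED_OUT;
        struct timespec nap = { 0, 50L * 1000 * 1000 };  /* poll every 50 ms */
        nanosleep(&nap, NULL);
    }
}
```

The trade-off matches the caveat in the reply: the caller is unblocked, but the abandoned child keeps running (and later lingers as a zombie) because nothing tracks it afterwards.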
Re: [autofs] autofs 5.0.4 and 5.0.5 do not remove ghost entries upon SIGHUP
On Tue, 2011-01-11 at 18:34 +0200, Michael Orlov wrote:
> Hi,
> (I have sent this message on Oct 27, but nothing appears in the archives for that month - hopefully it's not a duplicate message.)
> I think this is a bug - if it is not, I will be glad if someone corrects me. I am using autofs on Gentoo.

You will need to duplicate this with all the current 5.0.5 patches on kernel.org applied to get serious interest in this here.

> In auto.master, put
>     /mnt/test /etc/auto.test --ghost
> In auto.test, put
>     floppy -fstype=auto :/dev/fd0
> (or anything similar)
>
> 1. start autofs
> 2. see that /mnt/test/floppy exists
> 3. remove or comment out the floppy line in auto.test
> 4. send SIGHUP to autofs
>
> After these actions, in autofs 5.0.4 / 5.0.5 (but not 5.0.3), /mnt/test/floppy still exists - it is not removed after SIGHUP. This is problematic, for example, when auto.test is maintained via udev. Full restart of a service from a udev-invoked script is very error-prone.
>
> Thanks,
> Michael
Re: [autofs] autofs misbehaves when DNS RRs returns more ldap servers
On Fri, 2011-01-07 at 14:12 +0100, Ondrej Valousek wrote:
> On 06.01.2011 15:07, Ian Kent wrote:
> > Thanks for the suggestions. I'm still on leave so things are still going slowly for now, but I'll get to it.
>
> Hi Ian,
> Please find the attached patch which fixes the problem for me. I also changed the dclist structure definition from:
>
> struct dclist {
>         time_t expire;
>         const char *uri;
> };
>
> into:
>
> struct dclist {
>         time_t expire;
>         char **uri;
>         int cnt;
> };
>
> Hope you'll find it useful :-) .

Sorry I didn't get onto this earlier. I could have saved you quite a bit of time. Now that I've had a look I see I've already worked on this problem. See:
http://www.kernel.org/pub/linux/daemons/autofs/v5/autofs-5.0.5-check-each-dc-server.patch

Ian
Re: [autofs] Autofs SMBFS no write in files
On Tue, 2010-12-28 at 19:51 +0100, Issa wrote:
> Hello,
> Is it impossible to write to the disk when using autofs with smbfs?

A bit off topic, but is there some reason you need to use smbfs? It's deprecated and I expect it will go away some time soon. You should try using CIFS if you have a reasonably recent (or even not so recent should be ok) kernel.

> I'm using autofs with Ubuntu 10.10, set up like this:
>
> sudo nano /etc/auto.master
>     +auto.master
>     /mnt/smb /etc/auto.auto --timeout=60 --ghost
>
> and the file:
>
> sudo nano /etc/auto.auto
>     #directory name option for mount device to mount
>     win1 -fstype=smbfs,rw,credentials=/etc/smb.auth ://win1/docs/
>
> Authentication file:
>
> sudo nano /etc/smb.auth
>     username=users1
>     password=motDePasse
>     domain=windowsDomaine
>
> Now with this I can only read from /mnt/smb/win1. My question: how do I add write access? Because I cannot write at the moment.
>
> Thanks
Re: [autofs] autofs misbehaves when DNS RRs returns more ldap servers
On Thu, 2011-01-06 at 09:48 +0100, Ondrej Valousek wrote:
> On 06.01.2011 08:09, Ian Kent wrote:
> > > LDAP_URI="ldap://server1 ldap://server2"
> >
> > You are supposed to be able to do this.
>
> Ok, I have found the problem. The construction above is working well, indeed. The problem is that you call get_dc_list() directly in the while loop in function find_server(), where its output is not parsed (normally the LDAP_URI config parameter is parsed fine). I think that to fix it we would need to:
>
> 1. call get_dc_list() before the main while loop
> 2. fix get_dc_list() so that, rather than strcat-ing the LDAP URIs into a single string, it returns the plain list, so that we do not have to parse it again. This way it can be directly processed in the main while loop.
>
> But I do not know how it would behave if we had something like this: LDAP_URI="ldap:///something ldap:///something_else". Maybe two nested loops would be better - anyway I am sure you know where I am pointing now :-)

Thanks for the suggestions. I'm still on leave so things are still going slowly for now, but I'll get to it.

> Ondrej
>
> The information contained in this e-mail and in any attachments is confidential and is designated solely for the attention of the intended recipient(s). If you are not an intended recipient, you must not use, disclose, copy, distribute or retain this e-mail or any part thereof. If you have received this e-mail in error, please notify the sender by return e-mail and delete all copies of this e-mail from your computer system(s). Please direct any additional queries to: communicati...@s3group.com. Thank You. Silicon and Software Systems Limited. Registered in Ireland no. 378073. Registered Office: Whelan House, South County Business Park, Leopardstown, Dublin 18
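Note: point 2 of the suggestion above (return a list of URIs instead of one concatenated string) could look roughly like the sketch below. It is not the attached patch and not autofs code; `struct uri_list`, `uri_list_split()` and `uri_list_free()` are hypothetical names, and whitespace is assumed to be the only separator.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct uri_list {
    char **uri;  /* NULL-terminated array, one server URI per slot */
    int cnt;     /* number of URIs stored */
};

static void uri_list_free(struct uri_list *l)
{
    if (!l)
        return;
    for (int i = 0; i < l->cnt; i++)
        free(l->uri[i]);
    free(l->uri);
    free(l);
}

/* Split a whitespace separated URI string into individual entries. */
static struct uri_list *uri_list_split(const char *uris)
{
    struct uri_list *l = calloc(1, sizeof(*l));
    char *copy = strdup(uris);  /* strtok_r() modifies its input */
    char *save = NULL;

    if (!l || !copy)
        goto fail;

    for (char *tok = strtok_r(copy, " \t", &save); tok;
         tok = strtok_r(NULL, " \t", &save)) {
        /* Room for the new entry plus the terminating NULL. */
        char **grown = realloc(l->uri, (l->cnt + 2) * sizeof(*grown));
        if (!grown)
            goto fail;
        l->uri = grown;
        l->uri[l->cnt] = strdup(tok);
        if (!l->uri[l->cnt])
            goto fail;
        l->uri[++l->cnt] = NULL;
    }
    free(copy);
    return l;

fail:
    free(copy);
    uri_list_free(l);
    return NULL;
}
```

A caller can then walk `uri[0]` through `uri[cnt - 1]` in its main loop without parsing the string a second time.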
Re: [autofs] autofs misbehaves when DNS RRs returns more ldap servers
On Mon, 2011-01-03 at 11:14 +0100, Ondrej Valousek wrote: On 28.12.2010 03:24, Ian Kent wrote: That's right. I'm supposed to break that list into individual server entries and attempt a connection to each in turn. Can you get a debug log for me please. Please find the debug log attached. I believe it has primarily nothing to do with DNS SRV support - the problem in general is that autofs man page claims that you can do something like this: LDAP_URI=ldap://server1 ldap://server2; You are supposed to be able to do this. but in fact this does not work (at least the source code does not look like supporting it). So in general you have 2 options how to resolve this: 1) fix the autofs man page and say that the construction above is not possible. DNS SRV lookups must be fixed separately then. 2) fix the automounter so that the construction above works as described in the 'man auto.master' - DNS SRV lookups will then start working automatically, too. I'd prefer to fix it so I'll start by checking automount. Here is the debug log: Dec 27 12:44:46 dorado_v1 automount[2712]: Starting automounter version 5.0.1-0.rc2.143.el5_5.6, master map auto.master.ldap Dec 27 12:44:46 dorado_v1 automount[2712]: using kernel protocol version 5.01 Dec 27 12:44:46 dorado_v1 automount[2712]: lookup_nss_read_master: reading master files auto.master.ldap Dec 27 12:44:46 dorado_v1 automount[2712]: lookup(file): file map /etc/auto.master.ldap missing or not readable Dec 27 12:44:46 dorado_v1 automount[2712]: lookup_nss_read_master: reading master ldap auto.master.ldap Dec 27 12:44:46 dorado_v1 automount[2712]: parse_server_string: lookup(ldap): Attempting to parse LDAP information from string auto.master.ldap. 
Dec 27 12:44:46 dorado_v1 automount[2712]: parse_server_string: lookup(ldap): mapname auto.master.ldap Dec 27 12:44:46 dorado_v1 automount[2712]: parse_ldap_config: lookup(ldap): ldap authentication configured with the following options: Dec 27 12:44:46 dorado_v1 automount[2712]: parse_ldap_config: lookup(ldap): use_tls: 0, tls_required: 0, auth_required: 2, sasl_mech: GSSAPI Dec 27 12:44:46 dorado_v1 automount[2712]: parse_ldap_config: lookup(ldap): user: (null), secret: unspecified, client principal: dorado_...@dublin.ad.s3group.com credential cache: (null) Dec 27 12:44:46 dorado_v1 automount[2712]: parse_init: parse(sun): init gathered global options: (null) Dec 27 12:44:46 dorado_v1 automount[2712]: get_dc_list: doing lookup of SRV RRs for domain dublin.ad.s3group.com Dec 27 12:44:46 dorado_v1 automount[2712]: dns_lookup_srv: 10 records returned in the answer section. Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dccorka.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dclisaa.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcdub1.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcduba.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcdubb.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcpra1.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcsjc1.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcsjca.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dcwro1.dublin.ad.s3group.com [0, 100, 389] Dec 27 12:44:46 dorado_v1 automount[2712]: dns_parse_rr_srv: Parsed dccork1.dublin.ad.s3group.com [0, 100, 389] 
Dec 27 12:44:46 dorado_v1 automount[2712]: find_server: trying server uri ldap://dccorka.dublin.ad.s3group.com:389 ldap://dclisaa.dublin.ad.s3group.com:389 ldap://dcdub1.dublin.ad.s3group.com:389 ldap://dcduba.dublin.ad.s3group.com:389 ldap://dcdubb.dublin.ad.s3group.com:389 ldap://dcpra1.dublin.ad.s3group.com:389 ldap://dcsjc1.dublin.ad.s3group.com:389 ldap://dcsjca.dublin.ad.s3group.com:389 ldap://dcwro1.dublin.ad.s3group.com:389 ldap://dccork1.dublin.ad.s3group.com:389 Dec 27 12:44:46 dorado_v1 automount[2712]: do_bind: lookup(ldap): auth_required: 2, sasl_mech GSSAPI Dec 27 12:44:46 dorado_v1 automount[2712]: sasl_do_kinit: initializing kerberos ticket: client principal dorado_...@dublin.ad.s3group.com Dec 27 12:44:46 dorado_v1 automount[2712]: sasl_do_kinit: calling krb5_parse_name on client principal dorado_...@dublin.ad.s3group.com Dec 27 12:44:46 dorado_v1 automount[2712]: sasl_do_kinit: Using tgs name krbtgt/dublin.ad.s3group@dublin.ad.s3group.com Dec 27 12:44:46 dorado_v1 pcscd: winscard.c:304:SCardConnect() Reader E-Gate 0 0 Not Found Dec 27 12:44:46 dorado_v1 last message repeated 3 times Dec 27 12:44:46 dorado_v1 automount[2712
Re: [autofs] autofs misbehaves when DNS RRs returns more ldap servers
On Mon, 2010-12-27 at 13:36 +0100, Ondrej Valousek wrote:
> Hi Ian,
> I just found out that when using DNS RRs to find a suitable LDAP server to connect to, we do not correctly handle the situation where multiple servers are found. We end up with something like this:
>
> automount[2712]: find_server: trying server uri ldap://dccorka.dublin.ad.s3group.com:389 ldap://dclisaa.dublin.ad.s3group.com:389 ldap://dcdub1.dublin.ad.s3group.com:389 ldap://dcduba.dublin.ad.s3group.com:389 ldap://dcdubb.dublin.ad.s3group.com:389 ldap://dcpra1.dublin.ad.s3group.com:389 ldap://dcsjc1.dublin.ad.s3group.com:389 ldap://dcsjca.dublin.ad.s3group.com:389 ldap://dcwro1.dublin.ad.s3group.com:389 ldap://dccork1.dublin.ad.s3group.com:389
>
> Looking at the source code, this uri does not look valid to me. OK, man auto.master says that LDAP_URI might contain "A space seperated list of server uris of the form <proto>://<server>[/]", but ldap_initialize() does not look like it actually supports this - a single server is assumed instead.

That's right. I'm supposed to break that list into individual server entries and attempt a connection to each in turn. Can you get a debug log for me please.

> Can you clarify this? In my case above, if connection to ldap://dccorka.dublin.ad.s3group.com:389 fails, autofs never tries the other servers in the list.
>
> Many thanks and happy new year!
> Ondrej
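Note: "break that list into individual server entries and attempt a connection to each in turn" amounts to a simple fallback loop. The sketch below shows only the control flow; `first_reachable()`, `connect_fn` and `fake_connect()` are hypothetical, and the real connection attempt (initialising and binding an LDAP handle) is deliberately replaced by a callback.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Callback that tries one server; returns 0 on success, non-zero on failure. */
typedef int (*connect_fn)(const char *uri);

/*
 * Try each URI in order and return the first one that connects,
 * or NULL if every attempt fails.
 */
static const char *first_reachable(const char *const uris[], size_t n,
                                   connect_fn try_connect)
{
    for (size_t i = 0; i < n; i++) {
        if (try_connect(uris[i]) == 0)
            return uris[i];
    }
    return NULL;
}

/* Stand-in for a real connection attempt: only "ldap://server2" answers. */
static int fake_connect(const char *uri)
{
    return strcmp(uri, "ldap://server2") == 0 ? 0 : -1;
}
```

The point of the loop is that a failure on the first server is not fatal, which is the behaviour the report above says is missing when the whole list is passed along as one URI.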
Re: [autofs] [PATCH] autofs4: Do not potentially dereference NULL pointer returned by fget() in autofs_dev_ioctl_setpipefd()
On Sat, 2010-12-18 at 22:43 +0100, Jesper Juhl wrote:

Hi,

In fs/autofs4/dev-ioctl.c::autofs_dev_ioctl_setpipefd() we call fget(), which may return NULL, but we do not explicitly test for that NULL return so we may end up dereferencing a NULL pointer - bad.

When I originally submitted this patch I had chosen EBUSY as the return value to use if this happens. Ian Kent was kind enough to explain why that would most likely be wrong and why EBADF should most likely be used instead. This version of the patch uses EBADF.

Signed-off-by: Jesper Juhl j...@chaosbits.net
Acked-by: Ian Kent ra...@themaw.net
---
 dev-ioctl.c |    4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/autofs4/dev-ioctl.c b/fs/autofs4/dev-ioctl.c
index eff9a41..a650d7e 100644
--- a/fs/autofs4/dev-ioctl.c
+++ b/fs/autofs4/dev-ioctl.c
@@ -372,6 +372,10 @@ static int autofs_dev_ioctl_setpipefd(struct file *fp,
 		return -EBUSY;
 	} else {
 		struct file *pipe = fget(pipefd);
+		if (!pipe) {
+			err = -EBADF;
+			goto out;
+		}
 		if (!pipe->f_op || !pipe->f_op->write) {
 			err = -EPIPE;
 			fput(pipe);
Re: [autofs] NFSv4 to be a default on RHEL-6
On Thu, 2010-12-09 at 07:51 -0500, Steve Dickson wrote: On 12/08/2010 09:56 AM, Ondrej Valousek wrote: On 08.12.2010 14:45, Ian Kent wrote: On Wed, 2010-12-08 at 13:51 +0100, Ondrej Valousek wrote: On 08.12.2010 13:36, Ian Kent wrote: The default should be determined by mount.nfs(8) since that's what autofs uses to perform mounts. I see, but it works only if the nfs4 root export is the same as /. It does not work otherwise. Example: Server 'dorado' exporting directory /exports which is also fsid=0 for nfs4. There is (also shared) subdirectory 'ext1' in this one. When I do: cd /net/dorado/exports/ext1 ... the export is mounted using NFSv3. Theoretically if I did: cd /net/dorado/ext1 ... I should have the same mounted via NFSv4, right? Unfortunately it does not work. But it should because: mount dorado:/ext1 /mnt works (giving nfs4 mount) The only information the hosts map has to go on is the export list received from the server. There is no way for autofs to know that /exports is the global root from the mountd exports information. It only knows that /exports is an export and neither does it know what NFS version the server may offer for this mount. That information just isn't available. As I said, I thought in recent Fedora releases (not sure now when this should have happened), that mount server:/exports /some mount point and mount server:/ /some mount point should both work and mount as NFSv4. Correct me if I'm wrong but your point is that it is mounting as NFSv3 so perhaps we should log a bug and see what the experts have to say on this. Ian Hi Ian, I understand that autofs can not know that /exports is a global root. I just thought that mine: cd /net/dorado/ext1 ... (i.e. omitting the exports word) should work as this way should autofs pass something like this : mount.nfs dorado:/ext1 /net/dorado/ext1 which would succeed then, resulting in a nfs4 mount. 
What happens now is (obviously) that autofs can not find 'ext1' in the exports information from 'dorado' and so it fails even without trying It depends on your server... If your server does not support v4, the mount will roll back to v3. An rpcinfo -p server will show if the server support v4. Knowing that the server supports NFSv4 doesn't tell me which export is a global root though (does it?). I guess I could code the rpcinfo calls and rewrite the mount code to retry if v4 is supported. Not sure I like that idea though. My problem is knowing for sure what the global root is and then I also don't know what mount.nfs(8) will do with it. If I did have some way of knowing what the global root was then trying to accommodate it will cause mounts that fall back to v3 to fail. I thought that the recent NFS server changes were meant to allow both the mount paths above to function, like I think they do on other OSes (I'll check on that if you really want me to)? Ian ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
Re: [autofs] Automouter crashed, I have the core
On Tue, 2010-12-07 at 16:05 +0100, Ondrej Valousek wrote: On 07.12.2010 13:21, Ian Kent wrote: Unfortunately not but there certainly is a mistake in this area somewhere. I have another report of a hang with very similar symptoms but a thread that should have existed in the back trace had simply disappeared. That thread should have been an expire callback just like the one you have here. So this is really useful to know but I haven't worked it out yet. Hi Ian, Ok, so how can I help now? Submit a bugzilla request/Redhat support case/forget about it? My problem is working out what is going wrong. There are two things we need to do. First, any information about what is happening at the time this occurs might give us a clue of where to look. Second, a debug log might tell us what is happening, especially if we can spot the time of the problem and relate it back to the time in the log, although turning debugging on might stop the bug from occurring. If that happens then we need to go over what we do have and guess, then put some selective logging in to try and nail down what is happening. There's no question this process is difficult for us both and will require a fair amount of effort but that's the way it goes. Ian ___ autofs mailing list autofs@linux.kernel.org http://linux.kernel.org/mailman/listinfo/autofs
[autofs] [PATCH] autofs4 - remove ioctl mutex (bz23142)
With the recent changes to remove the BKL a mutex was added to the ioctl entry point for calls to the old ioctl interface. This mutex needs to be removed because of the need for the expire ioctl to call back to the daemon to perform a umount and receive a completion status (via another ioctl).

This should be fine as the new ioctl interface uses much of the same code and it has been used without a mutex for around a year without issue, as was the original intention.

Ref: Bugzilla bug 23142

Signed-off-by: Ian Kent ra...@themaw.net
Acked-by: Arnd Bergmann a...@arndb.de
---
 fs/autofs4/root.c |   12 +-----------
 1 files changed, 1 insertions(+), 11 deletions(-)

diff --git a/fs/autofs4/root.c b/fs/autofs4/root.c
index d5c1401..d34896c 100644
--- a/fs/autofs4/root.c
+++ b/fs/autofs4/root.c
@@ -980,19 +980,11 @@ static int autofs4_root_ioctl_unlocked(struct inode *inode, struct file *filp,
 	}
 }
 
-static DEFINE_MUTEX(autofs4_ioctl_mutex);
-
 static long autofs4_root_ioctl(struct file *filp,
 			       unsigned int cmd, unsigned long arg)
 {
-	long ret;
 	struct inode *inode = filp->f_dentry->d_inode;
-
-	mutex_lock(&autofs4_ioctl_mutex);
-	ret = autofs4_root_ioctl_unlocked(inode, filp, cmd, arg);
-	mutex_unlock(&autofs4_ioctl_mutex);
-
-	return ret;
+	return autofs4_root_ioctl_unlocked(inode, filp, cmd, arg);
 }
 
 #ifdef CONFIG_COMPAT
@@ -1002,13 +994,11 @@ static long autofs4_root_compat_ioctl(struct file *filp,
 	struct inode *inode = filp->f_path.dentry->d_inode;
 	int ret;
 
-	mutex_lock(&autofs4_ioctl_mutex);
 	if (cmd == AUTOFS_IOC_READY || cmd == AUTOFS_IOC_FAIL)
 		ret = autofs4_root_ioctl_unlocked(inode, filp, cmd, arg);
 	else
 		ret = autofs4_root_ioctl_unlocked(inode, filp, cmd,
 			(unsigned long)compat_ptr(arg));
-	mutex_unlock(&autofs4_ioctl_mutex);
 
 	return ret;
 }
Re: [autofs] autofs4 hang in 2.6.37-rc1
On Mon, 2010-11-15 at 15:27 +0200, Avi Kivity wrote:
> On 11/15/2010 03:22 PM, Ian Kent wrote:
> > > Ian, if you can prove that the lock is not needed, I think we should just remove it.
> >
> > I don't think I can prove it but I will have a long look at the code. I don't think it is needed and I expect I'll recommend it be removed.
>
> I've been running with the lock removed for a while with no ill effect. Of course it doesn't prove anything but at least it's a workaround for me.

Yeah, I tried pretty hard over quite a long time, with the expectation that the BKL would be removed, to try and make the code independent of it. At one point I patched the kernel to use the unlocked ioctl entry point during some development testing and found only one fix that was needed, although a lot has changed since then too.

Ian
Re: [autofs] autofs4 hang in 2.6.37-rc1
On Mon, 2010-11-15 at 21:38 +0800, Ian Kent wrote:
> On Mon, 2010-11-15 at 15:27 +0200, Avi Kivity wrote:
> > On 11/15/2010 03:22 PM, Ian Kent wrote:
> > > > Ian, if you can prove that the lock is not needed, I think we should just remove it.
> > >
> > > I don't think I can prove it but I will have a long look at the code. I don't think it is needed and I expect I'll recommend it be removed.
> >
> > I've been running with the lock removed for a while with no ill effect. Of course it doesn't prove anything but at least it's a workaround for me.
>
> Yeah, I tried pretty hard over quite a long time, with the expectation that the BKL would be removed, to try and make the code independent of it. At one point I patched the kernel to use the unlocked ioctl entry point during some development testing and found only one fix that was needed, although a lot has changed since then too.

Hahaha, although as you say, I won't really know if there are races until I get people really hammering autofs. But, since that's where this is at, maybe that's reason enough to remove it so we can get people to start applying pressure to the code so we find and fix any problems.

Ian
Re: [autofs] autofs4 hang in 2.6.37-rc1
On Sun, 2010-11-14 at 17:34 +0200, Avi Kivity wrote:
> On 11/14/2010 05:15 PM, Arnd Bergmann wrote:
> > On Sunday 14 November 2010 14:51:04 Avi Kivity wrote:
> > > automount S 88012a28a680 0 399 1 0x
> > >  88012a07bd08 0082 88012a07a010 88012a07bfd8 00011800 88012693c260 88012693c5d0 88012693c5c8 00011800 00011800
> > > Call Trace:
> > >  [81056197] ? prepare_to_wait+0x67/0x74
> > >  [811b23eb] autofs4_wait+0x5a4/0x6d5
> > >  [81055f25] ? autoremove_wake_function+0x0/0x34
> > >  [811b2ba5] autofs4_do_expire_multi+0x5b/0xa3
> > >  [811b2c39] autofs4_expire_multi+0x4c/0x54
> > >  [811b1750] autofs4_root_ioctl_unlocked+0x23e/0x252
> > >  [811b1808] autofs4_root_ioctl+0x39/0x53
> > >  [810f5e5c] do_vfs_ioctl+0x557/0x5bb
> > >  [810ca644] ? remove_vma+0x6e/0x76
> > >  [810cb6a2] ? do_munmap+0x31c/0x33e
> > >  [810f5f02] sys_ioctl+0x42/0x65
> > >  [81002b42] system_call_fastpath+0x16/0x1b
> > >
> > > Shouldn't we drop autofs4_ioctl_mutex while we wait?
> >
> > If the ioctl can sleep for multiple seconds, the mutex should indeed be dropped, and that would be safe because we used to do the same with the BKL. The question is why this would sleep for more than 120 seconds.
>
> Let's fix first and ask questions later.

You can't hold an exclusive mutex during an autofs expire because the daemon will start by calling the ioctl to check for a dentry to expire, then call back to the daemon to perform the umount and wait for a status return (also an ioctl). From memory the expire is the only ioctl that is sensitive to this deadlock.

So, either the mutex must be released while waiting for the status return or get rid of the autofs4_ioctl_mutex altogether.

Ian
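Note: the deadlock described above can be shown with a small userspace model. This is only a sketch under simplifying assumptions: `ioctl_mutex`, `expire_enter()`, `expire_leave()` and `ready_can_enter()` are hypothetical names, a pthread mutex stands in for the kernel mutex, and `pthread_mutex_trylock()` is used so the example reports the blocked state instead of actually hanging.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* One exclusive lock taken at the top of every ioctl, as in the removed code. */
static pthread_mutex_t ioctl_mutex = PTHREAD_MUTEX_INITIALIZER;

/* The expire ioctl enters and then sleeps waiting for a status return. */
static void expire_enter(void)
{
    pthread_mutex_lock(&ioctl_mutex);
}

static void expire_leave(void)
{
    pthread_mutex_unlock(&ioctl_mutex);
}

/*
 * The status return arrives through another ioctl, which needs the same
 * lock. Returns 1 if it could get in, 0 if it would block (EBUSY).
 */
static int ready_can_enter(void)
{
    int rc = pthread_mutex_trylock(&ioctl_mutex);
    if (rc == 0) {
        pthread_mutex_unlock(&ioctl_mutex);
        return 1;
    }
    return 0;
}
```

While the expire call is between `expire_enter()` and `expire_leave()`, `ready_can_enter()` reports 0: the completion that the expire is waiting for can never be delivered. Dropping the lock before waiting, or having no such lock at all, removes the cycle.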
Re: [autofs] autofs4 hang in 2.6.37-rc1
On Sun, 2010-11-14 at 16:15 +0100, Arnd Bergmann wrote:
> On Sunday 14 November 2010 14:51:04 Avi Kivity wrote:
> > automount S 88012a28a680 0 399 1 0x
> >  88012a07bd08 0082 88012a07a010 88012a07bfd8 00011800 88012693c260 88012693c5d0 88012693c5c8 00011800 00011800
> > Call Trace:
> >  [81056197] ? prepare_to_wait+0x67/0x74
> >  [811b23eb] autofs4_wait+0x5a4/0x6d5
> >  [81055f25] ? autoremove_wake_function+0x0/0x34
> >  [811b2ba5] autofs4_do_expire_multi+0x5b/0xa3
> >  [811b2c39] autofs4_expire_multi+0x4c/0x54
> >  [811b1750] autofs4_root_ioctl_unlocked+0x23e/0x252
> >  [811b1808] autofs4_root_ioctl+0x39/0x53
> >  [810f5e5c] do_vfs_ioctl+0x557/0x5bb
> >  [810ca644] ? remove_vma+0x6e/0x76
> >  [810cb6a2] ? do_munmap+0x31c/0x33e
> >  [810f5f02] sys_ioctl+0x42/0x65
> >  [81002b42] system_call_fastpath+0x16/0x1b
> >
> > Shouldn't we drop autofs4_ioctl_mutex while we wait?
>
> If the ioctl can sleep for multiple seconds, the mutex should indeed be dropped, and that would be safe because we used to do the same with the BKL. The question is why this would sleep for more than 120 seconds.

umount against a server that isn't responding can easily take more than 2 minutes.

Ian
Re: [autofs] Problem of concurrent mount/umount call
On Tue, 2010-10-26 at 09:22 +0200, Erwan Loaëc wrote:
Sorry for my previous mail with missing parts...

Ian Kent wrote:
On Fri, 2010-10-22 at 17:19 +0200, Erwan Loaëc wrote:

Hello, Sorry for the delay, but before posting again I've upgraded every production server with the newest autofs4 module (with the last patch) and the last

What does newest autofs4 module mean exactly?

Yes, it is not the newest module but the module recompiled with the patch autofs4-2.6.26-v5-update-20090903.patch

OK.

automount with all patches EXCEPT these:
autofs-5.0.5-fix-restart.patch
autofs-5.0.5-fix-status-privilege-error.patch
autofs-5.0.4-always-read-file-maps-mount-lookup-map-read-fix.patch
autofs-5.0.5-fix-direct-map-not-updating-on-reread.patch
autofs-5.0.5-add-external-bind-method.patch
autofs-5.0.5-fix-add-simple-bind-auth.patch
autofs-5.0.5-add-dump-maps-option.patch
autofs-5.0.5-fix-submount-shutdown-wait.patch

Today I had the same behaviour as the issue explained in my previous mail.

Oct 22 16:44:06 SERVERNAME automount[2665]: umount_autofs_offset: couldn't get ioctl fd for offset /cifs/XXX/volume: No such file or directory
Oct 22 16:44:06 SERVERNAME automount[2665]: handle_packet_missing_direct:1363: can't find map entry for (20,3548545)

This could be caused by a umount returning success when in fact it didn't succeed with the umount. Are you sure umount is returning correct status?

Unfortunately the last case occurred on a production server with debug disabled. I can't find more information in the logfile... But, as I explained in my previous mail, this would be consistent with the bad sequence found in my previous case:
* Call to the share /cifs/XXX/volume
* Mount /cifs/XXX/volume expiring
* New call to the share /cifs/XXX/volume
* Umount /cifs/XXX/volume

But that isn't quite what the problem is.
I've had a closer look at the log and:

Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_direct: token 1526, name /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: umount_multi: path /cifs/SERV2/TM_termoz incl 1
Aug 26 10:11:42 bacchus automount[17827]: umount_subtree_mounts: unmounting dir = /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: expired /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: ioctl_send_ready: token = 1526

/cifs/SERV2/TM_termoz is umounted.

Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 5
Aug 26 10:11:42 bacchus automount[17827]: handle_packet_missing_direct: token 1527, name /cifs/SERV2/TM_termoz, request pid 29523
Aug 26 10:11:42 bacchus automount[17827]: attempting to mount entry /cifs/SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: lookup_mount: lookup(program): /cifs/SERV2/TM_termoz -> -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile ://SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): expanded entry: -fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile ://SERV2/TM_termoz
Aug 26 10:11:42 bacchus automount[17827]: parse_mount: parse(sun): gathered options: fstype=cifs,file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
Aug 26 10:11:42 bacchus automount[17827]: sun_mount: parse(sun): mounting root /tmp/auto1I7gB7, mountpoint /cifs/SERV2/TM_termoz, what //SERV2/TM_termoz, fstype cifs, options file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile
Aug 26 10:11:42 bacchus automount[17827]: do_mount: //SERV2/TM_termoz /cifs/SERV2/TM_termoz type cifs options file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile using module generic
Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): calling mkdir_path /tmp/auto1I7gB7
Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): calling mount -t cifs -s -o file_mode=0644,dir_mode=0755,uid=uniok,gid=uniok,credentials=/etc/mycredfile //SERV2/TM_termoz /tmp/auto1I7gB7
Aug 26 10:11:42 bacchus automount[17827]: mount_mount: mount(generic): mounted //SERV2/TM_termoz type cifs on /tmp/auto1I7gB7
Aug 26 10:11:42 bacchus automount[17827]: move_mount: moved mount tree from /tmp/auto1I7gB7 to /cifs/SERV2/TM_termoz

And is then mounted again here, but the request is not complete and we get an expire, which means that the kernel saw the tree of mounts below /cifs/SERV2 as not busy. This means the dentry was selected for expire before the mount, and in fact before the path walk which caused it even started.

Aug 26 10:11:42 bacchus automount[17827]: handle_packet: type = 4
Aug 26 10:11:42 bacchus automount[17827]: handle_packet_expire_indirect: token 1528, name SERV2
Aug 26 10:11:42 bacchus automount[17827]: expiring path /cifs/SERV2
Aug 26 10
Re: [autofs] Problem of concurrent mount/umount call
On Tue, 2010-10-26 at 15:22 +0200, Erwan Loaëc wrote:

Hi, First, thank you for spending time to analyse the problem. I've applied the patch in our dev environment and set a short timeout for the mount (60s). FYI, in production, in order to limit this problem, the timeout is set to 2 hours. When possible, I'll update every autofs. The problem is quite rare, so I won't be able to confirm right away whether the problem is solved.

Right, that sounds sensible since I can't work out how it gets through.

I also need to know that the change doesn't cause any unwanted side effects so let me know how things go. Adding an is_mounted() check in a frequently called function like this can have an overhead if you're using the old ioctl interface, which you are by the look of the log. This should only show up if you have a largish number of mounts. How many is a large number is very much site dependent but if the CPU usage of the daemon is acceptable to you then you don't need to worry about it.

In addition, can you tell me why the patch is autofs-5.0.3 - fix expire race and not autofs-5.0.5 ?

That's just a typo, I'll rename the patch before committing it.

Thanks, Erwan
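On the cost of an is_mounted() check mentioned in this thread: one plausible reason it grows with the number of mounts is that, without a direct kernel query, the check has to walk the mount table. The sketch below is an assumption for illustration only, not the autofs implementation; `path_is_mounted()` is a hypothetical helper built on the glibc `setmntent()`/`getmntent()` calls.

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not the autofs is_mounted(): returns 1 if `path`
 * appears as a mount point in the mount table file `table` (for example
 * "/proc/mounts"), 0 if it does not, and -1 if the table cannot be opened.
 */
int path_is_mounted(const char *table, const char *path)
{
    FILE *tab = setmntent(table, "r");
    struct mntent *ent;
    int found = 0;

    if (!tab)
        return -1;

    /* Linear scan: each call may read every entry in the table. */
    while ((ent = getmntent(tab)) != NULL) {
        if (strcmp(ent->mnt_dir, path) == 0) {
            found = 1;
            break;
        }
    }
    endmntent(tab);
    return found;
}
```

Called as `path_is_mounted("/proc/mounts", "/cifs/SERV2/TM_termoz")`, the work per call is proportional to the size of the mount table, so a frequently called function that does this will only be noticeable on hosts with many mounts.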
Re: [autofs] [PATCH 06/17] Add an AT_NO_AUTOMOUNT flag to suppress terminal automount
On Thu, 2010-09-30 at 19:15 +0100, David Howells wrote:

Add an AT_NO_AUTOMOUNT flag to suppress terminal automounting of directories with follow_link semantics. This can be used by fstatat() users to permit the gathering of attributes on an automount point and also prevent mass-automounting of a directory of automount points by ls.

Signed-off-by: David Howells dhowe...@redhat.com
Acked-by: Ian Kent ra...@themaw.net
---
 fs/namei.c            |    6 ++
 fs/stat.c             |    4 +++-
 include/linux/fcntl.h |    1 +
 include/linux/namei.h |    2 ++
 4 files changed, 12 insertions(+), 1 deletions(-)

diff --git a/fs/namei.c b/fs/namei.c
index 86421f9..74bce3a 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -625,6 +625,12 @@ static int follow_automount(struct path *path, unsigned flags,
 	if (!path->dentry->d_op || !path->dentry->d_op->d_automount)
 		return -EREMOTE;

+	/* We don't want to mount if someone supplied AT_NO_AUTOMOUNT
+	 * and this is the terminal part of the path.
+	 */
+	if ((flags & LOOKUP_NO_AUTOMOUNT) && !(flags & LOOKUP_CONTINUE))
+		return -EXDEV; /* we actually want to stop here */

Oops, we missed this -EXDEV when we made the change to -EISDIR.

Ian
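From user space, the flag in this patch is meant to be passed to fstatat(). A minimal sketch of such a caller, assuming a kernel and C library that know AT_NO_AUTOMOUNT; `is_dir_no_automount()` is a hypothetical helper name, and the fallback define uses the Linux value 0x800 in case the header does not expose the flag.

```c
#define _GNU_SOURCE             /* glibc exposes AT_NO_AUTOMOUNT under this */
#include <fcntl.h>
#include <sys/stat.h>

#ifndef AT_NO_AUTOMOUNT
#define AT_NO_AUTOMOUNT 0x800   /* Linux value, used if the header hides it */
#endif

/*
 * Hypothetical helper: stat `path` without triggering an automount of its
 * final component. Returns 1 for a directory, 0 for anything else and -1
 * on error (errno is set by fstatat()).
 */
int is_dir_no_automount(const char *path)
{
    struct stat st;

    if (fstatat(AT_FDCWD, path, &st, AT_NO_AUTOMOUNT) != 0)
        return -1;
    return S_ISDIR(st.st_mode) ? 1 : 0;
}
```

A tool like ls could use a call of this shape to collect attributes for each entry in a directory of automount points without mounting them all.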
Re: [autofs] [PATCH 00/17] Introduce automounter dentry ops
work for autofs, however, since the dentry might not have an inode, hence why a dentry flag also.

S_AUTOMOUNT and d_automount() are introduced in patch 1; d_manage(), d_managed and DMANAGED_* are introduced in patch 7.

David
---
David Howells (8):
      Make follow_down() handle d_manage()
      Make dentry::d_mounted into a more general field for special function dirs
      Add an AT_NO_AUTOMOUNT flag to suppress terminal automount
      Remove the automount through follow_link() kludge code from pathwalk
      CIFS: Use d_automount() rather than abusing follow_link()
      NFS: Use d_automount() rather than abusing follow_link()
      AFS: Use d_automount() rather than abusing follow_link()
      Add a dentry op to handle automounting rather than abusing follow_link()

Ian Kent (9):
      autofs4 - bump version
      autofs4 - add v4 pseudo direct mount support
      autofs4 - fix wait validation
      autofs4: cleanup autofs4_free_ino()
      autofs4: cleanup dentry operations
      autofs4: cleanup inode operations
      autofs4: removed unused code
      autofs4: add d_manage() dentry operation
      autofs4: add d_automount() dentry operation

 Documentation/filesystems/Locking |    2
 Documentation/filesystems/vfs.txt |   22 +
 fs/afs/dir.c                      |    1
 fs/afs/inode.c                    |    3
 fs/afs/internal.h                 |    1
 fs/afs/mntpt.c                    |   47 +--
 fs/autofs/dirhash.c               |    5
 fs/autofs4/autofs_i.h             |  100 --
 fs/autofs4/dev-ioctl.c            |    2
 fs/autofs4/expire.c               |   42 ++
 fs/autofs4/inode.c                |   28 --
 fs/autofs4/root.c                 |  668 -
 fs/autofs4/waitq.c                |   17 +
 fs/cifs/cifs_dfs_ref.c            |  134 ---
 fs/cifs/cifsfs.h                  |    6
 fs/cifs/dir.c                     |    2
 fs/cifs/inode.c                   |    8
 fs/dcache.c                       |    7
 fs/namei.c                        |  243 +++--
 fs/namespace.c                    |   20 +
 fs/nfs/dir.c                      |    4
 fs/nfs/inode.c                    |    4
 fs/nfs/internal.h                 |    1
 fs/nfs/namespace.c                |   87 ++---
 fs/nfsd/vfs.c                     |    5
 fs/stat.c                         |    4
 include/linux/auto_fs4.h          |    2
 include/linux/dcache.h            |   19 +
 include/linux/fcntl.h             |    1
 include/linux/fs.h                |    2
 include/linux/namei.h             |    5
 include/linux/nfs_fs.h            |    1
 32 files changed, 836 insertions(+), 657 deletions(-)
Re: [autofs] autmount hangs occasionally on bind-mounts
On Tue, 2010-09-28 at 12:11 +0200, Sebastian Hetze wrote:
On Tue, Sep 28, 2010 at 11:30:55AM +0800, Ian Kent wrote:
On Mon, 2010-09-27 at 07:55 +0200, Sebastian Hetze wrote:

Hi *, we are suffering from some sort of race condition that causes automount to hang:

[351841.568061] INFO: task automount:22055 blocked for more than 120 seconds.
[351841.568689] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[351841.569717] automount D b983e7f6 0 22055 1 0x
[351841.570252] e0ca7ef4 0082 f3c38000 b983e7f6 00013fde eaed6000 f63af880 f5037c00
[351841.571308] c0863320 c0863320 f30de480 f30de718 c5589320 0002 b9841648 00013fde
[351841.572316] f30de718 f72ceff4 f72ceff0 e0ca7f20 c059fd3e e0ca7f14 f30de480
[351841.573364] Call Trace:
[351841.573686] [c059fd3e] __mutex_lock_slowpath+0xbe/0x120
[351841.574130] [c059fc60] mutex_lock+0x20/0x40
[351841.574496] [c0202732] do_rmdir+0x52/0xe0
[351841.574878] [c04b67ad] ? sys_socketcall+0x1cd/0x2a0
[351841.575266] [c0202820] sys_rmdir+0x10/0x20
[351841.575781] [c010968c] syscall_call+0x7/0xb

This is only half the story. I think you'll find another process that is waiting on the expire via autofs4_revalidate() and holds the mutex that the above process is waiting on.

Actually, there is another blocked process:

While that does look a little like what I'd expect to see, I don't think that is the process you're looking for.

[351961.584408] INFO: task install:22804 blocked for more than 120 seconds.
[351961.584913] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[351961.585545] install D e268c4fc 0 22804 22798 0x
[351961.586100] f442fed8 0086 c02000b1 e268c4fc 00013fec f442fee8 e04efc00
[351961.587180] c0863320 c0863320 f3a19920 f3a19bb8 c55a9320 0004 f442ff30 c101
[351961.588255] f3a19bb8 f72ceff4 f72ceff0 f442ff04 c059fd3e f547be58 f3a19920
[351961.589550] Call Trace:
[351961.589864] [c02000b1] ? path_to_nameidata+0x31/0x50
[351961.590286] [c059fd3e] __mutex_lock_slowpath+0xbe/0x120
[351961.590793] [c059fc60] mutex_lock+0x20/0x40
[351961.591140] [c01ffc4f] lookup_create+0x1f/0xa0
[351961.591569] [c020287c] sys_mkdirat+0x4c/0x100
[351961.591996] [c020e48a] ? mntput_no_expire+0x1a/0xd0
[351961.592427] [c0202950] sys_mkdir+0x20/0x30
[351961.592912] [c010968c] syscall_call+0x7/0xb

This is a known problem that has been present for years and cannot be resolved using the current automount framework. I don't know why we're suddenly seeing people get caught by it recently but we are. Assuming you are seeing the problem I think you are, you should be able to work around it by using the browse option on your autofs mounts. This should work OK as long as your maps are not too large.

We will try this option. Thanks for your explanation. Can you point me to a kernel bug report number that I can follow for further development on that subject?

I don't think there is one. Keep your eye on either the autofs mailing list, linux-fsdevel or the Linux Kernel Mailing List; the series will be posted to those lists. It may not mention the deadlock issue since the VFS automount implementation is meant to address slightly different issues with autofs, AFS, CIFS and NFS. But for autofs a side effect of the implementation is that the deadlock is gone.

Ian
Re: [autofs] dump map option support
On 23/09/2010, at 10:16 PM, Ondrej Valousek webs...@s3group.cz wrote:

Hi Ian, Any news about our new dump map option? I just found out that F-13 still does not have it :-( I am also wondering if you have any plans about supporting sssd (just a quick question).

My bad, I'll get onto it. I'll probably need to wait until after F14 is released to add it to that. Rawhide will get it when I get the next release out, which is long overdue.

Ian