[OpenAFS] Re: MacOS AppleDouble excretions

2010-10-21 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 On Wed, Oct 13, 2010 at 12:18 AM, Adam Megacz a...@megacz.com wrote:

 Brandon S Allbery KF8NH allb...@ece.cmu.edu writes:
 No, he wants AFS to simply refuse to create resource forks,

 Almost.  I would like the AFS CLIENT to refuse, if the user has
 explicitly requested this behavior.

 The AFS client does what the Darwin kernel tells it to. If the kernel
 honors whatever setting, we'd never see the request.

Yes, I agree that it would be preferable if Apple added a setting to do
this.

But I don't think that's going to happen.

  - a

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Re: MacOS AppleDouble excretions

2010-10-21 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 Since you suggest your first comments are what we misinterpret:

I do not suggest that.

  - a



[OpenAFS] Re: MacOS AppleDouble excretions

2010-10-21 Thread Adam Megacz

omall...@msu.edu writes:
 I can understand where large sites don't want to go this route
 globally since it could break something. ...

 I can understand where AFS Team doesn't want to make it a global
 default option.

 I can understand where a user would want the ability to just not
 create the files

I am in complete agreement with these three statements -- and was before
I started this thread (assuming an error code is returned to the
userspace application when the files are not created).


Booker Bense bbe...@slac.stanford.edu writes:
 I see this as a complete waste of time.

Actually I was going to volunteer to write the patch for the Mac client.
It's not a waste of _my_ time if it stands a reasonable chance of being
included.  But, based on this thread, that appears not to be the case.


Derrick Brashear sha...@gmail.com writes:
 POSIX extended attributes are stored in the files. Until we deal with
 them natively (which requires new RPCs) deleting them actively loses
 data.

Look, this fuss about losing data is a real distraction; can we handle
it and stick to the important issues?

An error code should ALWAYS be returned to the userspace application by
a write operation if the AFS client has declined to perform that action
due to end-user configuration choices.  I don't think anybody in their
right mind (or on this list) is proposing that the resource forks be
silently discarded.  I think I was pretty clear about this in my
original post.  Construing my proposal as discarding the forks is not
helpful at all, and muddies the issue a lot.

If the filesystem reports an error, it has not taken responsibility for
the data, so it does not actively lose data.


Jeffrey Altman jalt...@secure-endpoints.com writes:
 The fix for the "I don't like DoubleFiles" issue is to find the financial
 or development resources necessary to implement support for EAs

I agree, although that is a fix, not the fix.

I don't have access to those kinds of resources, but I do have sufficient
resources to add the client-side option.

  - a



[OpenAFS] Re: MacOS AppleDouble excretions

2010-10-12 Thread Adam Megacz

Steve Simmons s...@umich.edu writes:
 Is there any chance of a setting being included in the MacOS client that
^^

 Doing this at our site would result in a firestorm of complaints from


You and I seem to be talking about different things.

 It sounds like you're suggesting we modify afs so it understands
 resource forks properly and generate an error message if someone
 attempts to create a file whose name might be mistaken for a resource
 fork.

No.

  - a




[OpenAFS] Re: MacOS AppleDouble excretions

2010-10-12 Thread Adam Megacz

Brandon S Allbery KF8NH allb...@ece.cmu.edu writes:
 No, he wants AFS to simply refuse to create resource forks,

Almost.  I would like the AFS CLIENT to refuse, if the user has
explicitly requested this behavior.

 because in his world they never have any use whatsoever.

In my world they have no use whatsoever for the kinds of files I
personally happen to store in /afs/.  So I would like to be able to
instruct the AFS client on my personal laptop not to store those files.

 (And apparently his use case is to be considered the common one.)

I'm having trouble parsing this.

  - a



[OpenAFS] Re: MacOS AppleDouble excretions

2010-10-12 Thread Adam Megacz

Adam Megacz a...@megacz.com writes:
 There's a MacOS setting to disable the first kind of litter.
  ^^^
 Is there any chance of a setting being included in the MacOS client that
   ^^^ ^^

It appears that everybody who replied to this thread somehow got the
impression that I was asking for a change in the *default* behavior of the
client, or a change in the behavior of the server.

But I guess jumping up and down hollering with indignation isn't quite
as much fun unless you first misinterpret the proposal! ;)

  - a




[OpenAFS] Re: bos killed fileserver before it was shut down cleanly.

2010-10-10 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 The problem is that it's also not uncommon for the fileserver to
 completely or nearly completely stall when shutting down,

Just curious, is this stall a bug in the fileserver, or something
which happens for a good reason?  If so, what is the reason?

In general, I find that these sorts of unexplained stalls (both in the
client and on the server components) are the sorts of problems I have
the most trouble understanding.  It's probably asking too much to hope
for a simple FAQ answer to "when AFS goes out to lunch, what is it eating?".

  - a




[OpenAFS] MacOS AppleDouble excretions

2010-10-10 Thread Adam Megacz

MacOS seems to litter network shares with two kinds of files:

   .DS_Store   (Finder data)
   ._filename  (AppleDouble resource fork)

There's a MacOS setting to disable the first kind of litter.

Unfortunately it seems like there is no way to get MacOS to refrain from
writing the second kind of file, and it seems like Apple deliberately
doesn't want there to be one.

Is there any chance of a setting being included in the MacOS client that
stops this from happening?  The crude way would be to simply refuse to
create files whose names start with the prefix "._", reporting
permission-denied or something like that.

The more sophisticated approach would probably be to claim to MacOS that
/afs/ supports resource forks, and report permission-denied when an
attempt to write a resource fork is made.  This has the advantage of not
being filename-based and not breaking programs which access the
filesystem through the POSIX APIs.
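
The crude, filename-based variant described above can be sketched in a few lines. This is only an illustration of the proposed policy, not code from any AFS client: the function name `check_create` and the `refuse_appledouble` flag are hypothetical, and a real implementation would live in the kernel module rather than in Python.

```python
import errno


def check_create(name: str, refuse_appledouble: bool) -> None:
    """Raise PermissionError if the opt-in policy forbids creating `name`.

    `name` is the final path component of the file being created.
    `refuse_appledouble` stands in for a hypothetical per-client setting
    that is off by default.
    """
    if refuse_appledouble and name.startswith("._"):
        # Fail loudly: the caller gets EACCES, nothing is silently discarded.
        raise PermissionError(errno.EACCES, "AppleDouble files disabled", name)


check_create("notes.txt", refuse_appledouble=True)     # ordinary file: allowed
check_create("._notes.txt", refuse_appledouble=False)  # option off: allowed
try:
    check_create("._notes.txt", refuse_appledouble=True)
except PermissionError as exc:
    print("refused:", exc.filename)
```

Because the check is purely name-based, it would also refuse an ordinary file that a user deliberately named with a "._" prefix, which is the drawback the second approach avoids.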

  - a



[OpenAFS] Re: SYNC_connect: temporary failure on circuit 'FSSYNC' (will retry)

2010-10-07 Thread Adam Megacz

Andrew Deason adea...@sinenomine.net writes:
 Refusing to start means... they start and do nothing?

Precisely.

Unfortunately I couldn't wait any longer and had to downgrade to 1.4.11.

=(

  - a




[OpenAFS] Re: OpenAFS on ext4?

2010-10-05 Thread Adam Megacz

Jaap Winius jwin...@umrk.nl writes:
 Yeah, I've been running OpenAFS (v1.4.12) with ext4 on my private
 server (Debian squeeze) since June. Its workload and specs are nothing
 compared to the systems that others have described here, but it's seen
 almost constant use and has so far not given me any problems.

Same here, fairly small/simple installation has been running 1.4.11
server with /vicepX on ext4 for at least six months now.

No problems, except this one, which caused total and catastrophic data
loss (which was not even remotely close to being OpenAFS's fault):

  
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commit;h=2ec0ae3a

So glad I run nightly backups.

  - a



[OpenAFS] Re: SYNC_connect: temporary failure on circuit 'FSSYNC' (will retry)

2010-10-05 Thread Adam Megacz

Andrew Deason adea...@sinenomine.net writes:
 If you see it a lot, I'd like to know more about what's going
 on with your server, but otherwise it's not anything to worry about.

Well, the FileServer and VolServer are both refusing to start, and this
message is the only thing in their logs that seems to explain the
situation...

 What's a circuit?

 Just a name someone gave for the communication mechanism being used, I
 guess. I don't associate much meaning with the term.

I humbly suggest that using standard Unix terminology like "named pipe"
might be a good idea.

  - a




[OpenAFS] Re: is this what windows folks call integrated login? (but with local hashed password)

2010-08-29 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 My laptop has a local copy of my password in hashed form, so it can let

 Oh. You're not typing a password, so this won't help you.

Er, sorry, I should have been more clear about that.  I am typing in my
password physically at the keyboard.  My laptop has a copy of that
password on the disk in hashed format so that it can verify that I typed
in the correct password, but if somebody steals my laptop they can't
simply read my password off the disk (at least I assume MacOS does this
like all good unices do -- it would be a shame if it didn't; this is the
only reason I consider it safe to use the same password for both my
laptop's local login and my Kerberos principal).

So, anyways, lack of network access will not delay the local operating
system's decision about whether or not to let me proceed with my login.
But it may delay the acquisition of tickets.  But if I'm not on the
network, then ending up logged in locally without tickets is no big deal
-- especially if there's a daemon sitting around waiting for the network
to come back.  I guess it would need to be holding my unhashed password
in memory, but with encrypted swap and a screensaver password that's
still not a huge concern.

  - a





[OpenAFS] Re: is this what windows folks call integrated login? (but with local hashed password)

2010-08-23 Thread Adam Megacz

Dale Pontius pont...@btv.ibm.com writes:
 Be very careful of an integrated login on a laptop.

I think this might be where Windows integrated login and what I'm
looking for are different.

My laptop has a local copy of my password in hashed form, so it can let
me log in even when there's no network access.  I just want it to spawn
a separate background thread that tries to get tickets for me.  If I'm
not on the network it's no big deal; I won't care that I have no tickets
because I won't be able to do anything with them.

  - a



[OpenAFS] Re: is this what windows folks call integrated login?

2010-08-23 Thread Adam Megacz

Derrick Brashear sha...@dementia.org writes:
 Did you try the feature that 'obtains tickets at login' in the prefs pane?

Hey, that's pretty nifty; when was it added?  Is there documentation
anywhere?

It's not quite what I wanted, though... it makes me type in my username
(er, principal) and password a second time.  I was hoping I could just
tell it "hey, assume that my kerberos principal is the same as ThisCell
(but upcased), and use the username and password I typed into the MacOSX
Login Dialog instead of asking me again."

Anything out there that can do this?
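
To make the naming convention I have in mind concrete, here is a minimal sketch. It assumes the Kerberos realm is simply the cell name upcased and that the local username equals the principal name; the function name `principal_from_cell` is made up for illustration, and the cell name would in practice be read from the client's ThisCell file.

```python
def principal_from_cell(username: str, cell_name: str) -> str:
    """Build `user@REALM`, assuming the realm is the cell name upcased."""
    # strip() drops the trailing newline a ThisCell file would contain.
    return f"{username}@{cell_name.strip().upper()}"


print(principal_from_cell("megacz", "megacz.com\n"))  # megacz@MEGACZ.COM
```

Sites whose realm does not match the cell name would need an explicit mapping instead.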

  - a



[OpenAFS] is this what windows folks call integrated login?

2010-08-21 Thread Adam Megacz

I have a MacOS laptop.  My username and local password on the laptop
happen to match my kerberos username and password.  My kerberos tickets
expire after 10 hours, but are renewable for 10 *days*.

It occurred to me that it would be nifty if my laptop acquired kerberos
tickets for me when I logged in (during the brief window when my
un-hashed password is present in laptop RAM), and made an attempt to
renew them once an hour (if connected to the network).  This would save
me having to do a separate kinit after logging in, and having to
re-kinit every 10 hours.  I've got a screensaver lock and encrypt my
swapfile, so I'm not too worried about physical theft issues resulting
in ticket theft.

Is there a piece of software that does this?  It's been a long, long
time since I used Windows, but it sounds like this feature is what the
Windows client calls "integrated login".  Or maybe not.  Either way, is
there a way to get MacOS to do this?

Thanks,

  - a



[OpenAFS] ERROR: Cache dir check failed (cannot use tmpfs as cache partition)

2010-06-20 Thread Adam Megacz

Is there a reason why tmpfs isn't supported (1.5.74)?

I was really hoping to start using it for my cache partitions with 1.5.x
and 1.6.  It has all the advantages of memcache plus it can be pushed out
to swap when there's memory pressure (which IIRC memcache cannot).

Currently I use dd to make a huge image in a tmpfs partition and then
loopback-mount that, but aside from all the overhead of passing through
the VFS layer twice this also means that each page is in memory twice
(once in tmpfs's memory and once in the buffercache).  Indeed,
eliminating this double buffering is one of the big selling points of
tmpfs.

Thanks,

  - a



[OpenAFS] Re: Starting AFS cache scan...

2010-06-20 Thread Adam Megacz

Andrew Deason adea...@sinenomine.net writes:
 If afsd is being backgrounded, it is not by us. afsd only exits after it
 tries to mount /afs. From what Russ says and from what I see in the
 Debian init scripts, the Debian init scripts do not background it
 either.

Yikes, I tried putting "ls /afs/megacz.com" in the init script right
after the line that launches afsd, and I got this:

r...@mute:~#/etc/init.d/openafs-client- start
Starting AFS services: openafs afsd.
afsd: All AFS daemons started.
afs started; ls /afs/megacz.com:
/etc/init.d/openafs-client-: line 156:  2152 Segmentation fault  ls /afs/megacz.com/

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.595464] Oops:  [#1] SMP 

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013] Stack:      
   

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  0001 c62b19e0 c62b19fc 
d1ce4007 581b0001  

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]0001 c62b19c0   0001 
01cd c716fe64 c62b19c0 

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013] Call Trace:

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1ce4007] afs_GetServer+0x4f4/0x52d [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cfafb6] InstallUVolumeEntry+0x334/0x38f [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cfb5f0] afs_SetupVolume+0x2cf/0x3a7 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cfbb3c] afs_NewVolumeByName+0x474/0x515 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cf061a] EvalMountData+0x298/0x43d [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cf081f] EvalMountPoint+0x60/0x129 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c01b92a8] request_key+0x28/0x50

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cf09cc] afs_EvalFakeStat_int+0xe4/0x34b [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cf0c43] afs_EvalFakeStat+0x7/0x9 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cf3774] afs_open+0x98/0x50c [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1cf0c3a] afs_TryEvalFakeStat+0x7/0x9 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1d097ec] afs_linux_open+0x0/0xd6 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c01731a3] __dentry_open+0x10d/0x1fc

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c01732ae] nameidata_to_filp+0x1c/0x2c

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [d1d0984f] afs_linux_open+0x63/0xd6 [openafs]

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013] Process ls (pid: 2152, ti=c716e000 task=cc71c900 task.ti=c716e000)

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c017d9fa] do_filp_open+0x34f/0x684

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c0172fc0] do_sys_open+0x40/0xb0

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c0103857] sysenter_past_esp+0x78/0xb1

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  ===

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013] Code: 19 01 00 00 89 e0 c7 44 24 24 00 00 00 00 c7 44 24 20 00 00 00 00 e8 5a 70 fd ff 85 c0 0f 85 fa 00 00 00 8b 46 20 b9 04 00 00 00 8b 50 10 8b 04 24 e8 fb 1e 00 00 85 c0 89 c3 0f 84 dd 00 00 00

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c0173074] sys_open+0x1e/0x23

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013]  [c02b] quirk_vt8235_acpi+0x10/0x7a

Message from sysl...@mute at Jun 20 21:40:11 ...
 kernel:[  157.597013] EIP: [d1ce317f] afs_GetCapabilities+0x45/0x13e [openafs] SS:ESP 0068:c716fc28





[OpenAFS] gerrit now has bugzilla integration

2010-06-19 Thread Adam Megacz

FYI

  http://code.google.com/p/gerrit/issues/detail?id=124



[OpenAFS] Re: Starting AFS cache scan...

2010-06-19 Thread Adam Megacz

Derrick Brashear sha...@dementia.org writes:
 AFS can't mount any faster than it does with dynroot.

That's okay!  I don't need it to mount any faster; I just need it to not
background itself until it is done mounting.

 is afsd being backgrounded or is the issue that /afs/(something) isn't
 up yet?

Er, they are the same issue.  Obviously afsd needs to get backgrounded
eventually (otherwise the boot process would not continue).  The problem
is that afsd is being backgrounded too early.

  - a



[OpenAFS] Re: Starting AFS cache scan...

2010-06-13 Thread Adam Megacz

Derrick Brashear sha...@dementia.org writes:
 Is this dynroot or not?

It is dynroot.

  - a



[OpenAFS] compile error with 1.5.74 on kernel 2.6.28.10

2010-06-12 Thread Adam Megacz

This is using openafs_1.5.74.1-1.dsc (Debian).  I can probably kludge my
way around it, but I figured in the run-up to 1.6 y'all would like to
know about it.

  - a

  CC [M]  /usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.o
/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.c: In function 'afs_osi_proc2cred':
/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.c:111: error: implicit declaration of function 'set_cr_group_info'
make[5]: *** [/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP/osi_gcpags.o] Error 1
make[4]: *** [_module_/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP] Error 2
make[4]: Leaving directory `/usr/src/linux-2.6.28.10'
make[3]: *** [openafs.ko] Error 2
make[3]: Leaving directory `/usr/src/modules/openafs/src/libafs/MODLOAD-2.6.28.10gentzen-SP'
make[2]: *** [linux_compdirs] Error 2
make[2]: Leaving directory `/usr/src/modules/openafs/src/libafs'
make[1]: *** [all] Error 2
make[1]: Leaving directory `/usr/src/modules/openafs'
make: *** [build-stamp] Error 2




[OpenAFS] Re: Starting AFS cache scan...

2010-06-12 Thread Adam Megacz

Derrick Brashear sha...@dementia.org writes:
 does your rc file include an afs post inst hook?

Yep

 have it run rxdebug localhost 7001 in a while loop until it succeeds

Hrm, okay, I guess that will work.

  - a




[OpenAFS] Re: Starting AFS cache scan...

2010-06-12 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 so set it to a script, and provide a script which just runs rxdebug
 localhost 7001 in a loop until it succeeds, then exits.

Hrm, actually, this doesn't seem to be working.  Perhaps I'm doing it
wrong?  Apparently the RX server will respond to debug requests before
/afs is mounted.

  - a

r...@mute:~#rxdebug localhost 7001 && echo x
Trying 127.0.0.1 (port 7001):
Free packets: 215/97, packet reclaims: 0, calls: 0, used FDs: 64
not waiting for packets.
0 calls waiting for a thread
1 threads are idle
0 calls have waited for a thread
Done.
x
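
Since the RX server answers before /afs is mounted, a wait loop would have to test for the mount itself rather than for rxdebug succeeding. A rough sketch of such a loop follows; the helper names `wait_until` and `afs_is_mounted` are made up here, and using `os.path.ismount` as the readiness test is an assumption I have not verified against a dynroot client.

```python
import os
import time
from typing import Callable


def wait_until(ready: Callable[[], bool], timeout: float = 60.0,
               interval: float = 1.0) -> bool:
    """Poll `ready` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if ready():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def afs_is_mounted(mount_point: str = "/afs") -> bool:
    """Assumed readiness test: the path is a mount point, not a bare dir."""
    return os.path.ismount(mount_point)


# In a post-init hook one might call:
#     ok = wait_until(afs_is_mounted, timeout=120.0)
# and let dependent startup scripts proceed only when ok is True.
```

The timeout matters: without it, a client that never mounts /afs would hang the boot sequence forever.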



[OpenAFS] Re: Starting AFS cache scan...

2010-06-10 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 Is there any reason not to do this?  If not, I can just make this change
 in the Debian package.  I don't recall why start-stop-daemon was used
 there in the first place.

+1

(but then again I'm a runit guy)

  - a



[OpenAFS] Starting AFS cache scan...

2010-06-09 Thread Adam Megacz

I've got my openafs cache in tmpfs (via a loopback-mount), so every time
the machine comes up it sees an empty cache directory and spends some
time "Starting AFS cache scan..." -- creating the directory structure.

Unfortunately the AFS startup script returns before this process
finishes.  This means that other startup scripts -- which depend on /afs
being mounted -- will end up running before /afs has been mounted.

Is there any way to change this behavior so that
/etc/init.d/openafs-client doesn't yield control until it has at least
attempted to mount /afs?

  - a



[OpenAFS] Re: group prefix doesn't match owner

2010-05-03 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 When creating a group foo:bar as admin, I often find that I have to
 use the -owner parameter to set the owner to foo(something).

 I see.  Is it official AFS policy that this usage is supported?

 Which usage? I'm not sure what you're asking.

Sorry, let me rephrase.  The following sequence of commands generates an
error, but appears to work -- by which I mean that it leaves me in a
state where there is a group named blah:booh but no user named blah.

  $pts cu blah
  $pts creategroup blah:booh -owner blah
  $pts delete blah
  $pts ex blah:booh

Is it official AFS policy that this is supposed to work this way, and
will continue to work this way in the future?

If so, perhaps we should consider changing the error "Badly formed name
(group prefix doesn't match owner)" into a warning if it's being invoked
by system:administrators (who could just use the sequence of commands
above instead).  Or maybe let -force override the error.

  - a



[OpenAFS] Re: experience of SQLite on AFS

2010-05-02 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 Please notify me once this is fixed in the Linux CM and I will test
 it for you.

 Derrick has pushed changes 

Still working on this.  It's been a long time since I had this setup
running, so I have to reproduce the corruption first before I can see if
it has been fixed.

  - a



[OpenAFS] group prefix doesn't match owner

2010-05-01 Thread Adam Megacz

Is there any reason why pts won't let system:administrators create groups
whose prefix does not match any user?

  $pts ex blah
  pts: User or group doesn't exist so couldn't look up id for blah
  $pts creategroup blah:booh
  pts: Badly formed name (group prefix doesn't match owner?) ; unable to create group blah:booh

Clearly this can be circumvented by system:administrators:

  $pts cu blah
  User blah has id 100015
  $pts creategroup blah:booh -owner blah
  group blah:booh has id -1012
  $pts delete blah
  $pts ex blah:booh
  Name: blah:booh, id: -1012, owner: 0, creator: megacz,
membership: 0, flags: S-M--, group quota: 0.

Is there a danger in doing this, other than perhaps confusion?
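
For reference, the rule that the error message alludes to can be modeled roughly as below. This is my reading of the message, not the protection server's actual code, and the function name `prefix_matches_owner` is hypothetical.

```python
def prefix_matches_owner(group_name: str, owner: str) -> bool:
    """Return True if the part of `group_name` before ':' equals `owner`."""
    prefix, sep, _ = group_name.partition(":")
    return sep == ":" and prefix == owner


print(prefix_matches_owner("blah:booh", "blah"))    # True
print(prefix_matches_owner("blah:booh", "megacz"))  # False
```

Under this model the check only runs at creation time, which is why deleting the owner afterwards leaves blah:booh in place with no matching user.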

  - a



[OpenAFS] Re: experience of SQLite on AFS

2010-04-26 Thread Adam Megacz

Jeffrey Altman jalt...@secure-endpoints.com writes:
 When a whole file lock is write-held, all of the dirty data in the cache
 must be written back to the file server before the lock is released.
 This is currently not being done and as a result, the database becomes
 corrupted.

 I suspect this will be fixed shortly.

Please notify me once this is fixed in the Linux CM and I will test
it for you.
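
The ordering described in the quoted text (write dirty data back, then release the lock) can be sketched from the application side as follows. This is only an illustration of the ordering, not of the cache manager fix: the helper name `whole_file_write_lock` is made up, and treating `flock` as the whole-file lock and `fsync` as the write-back step are assumptions on my part.

```python
import fcntl
import os
from contextlib import contextmanager


@contextmanager
def whole_file_write_lock(path: str):
    """Hold an exclusive whole-file lock; flush to storage before unlocking."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.flock(fd, fcntl.LOCK_EX)      # blocks until the lock is granted
        try:
            yield fd
            os.fsync(fd)                    # write dirty data back first ...
        finally:
            fcntl.flock(fd, fcntl.LOCK_UN)  # ... and only then release
    finally:
        os.close(fd)


# Example use (the file name is arbitrary):
#     with whole_file_write_lock("example.db") as fd:
#         os.write(fd, b"record\n")
```

If the body raises, the flush is skipped but the lock is still released, so a failed write never leaves the file locked.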

  - a



[OpenAFS] Re: experience of SQLite on AFS

2010-04-24 Thread Adam Megacz

Ken Dreyer ktdre...@ktdreyer.com writes:
 SQLite has an option in os_unix.c (SQLITE_ENABLE_LOCKING_STYLE) to
 automatically figure out the database's filesystem type and use the
 most appropriate locking mechanism for that filesystem. Adam Megacz
 wrote a patch to SQLite back in 2006 that added AFS to this list of
 filesystems SQLite could detect. I'm not certain, but I think this
 only works for OSX (Adam, correct me if I'm wrong :-)

IIRC that is correct.  Also, DRHipp never merged the patch (even though
I sent him the legal papers he asked for).

 Additionally, SQLite also has the (undocumented?) ability to define a
 fixed locking style at compile-time with SQLITE_FIXED_LOCKING_STYLE.

I must hasten to add that I have never been able to get sqlite working
in a scenario where multiple client machines are concurrently accessing
the same database -- even when whole-file locking is in use.  I
originally thought that using whole-file locks only (and no byte-range
locks) would work, but as far as I have been able to determine, it *does
not*.

 We hope we can make use of byte-range locking some day when OpenAFS
 supports this on *nix.

Me too, but my hopes are not high.  The fact that the databases become
corrupted when using whole-file locks only suggests that there is a more
subtle problem lurking here.

  - a



[OpenAFS] sqlite on AFS will not work, even with whole-file locking

2010-04-11 Thread Adam Megacz

Brandon Simmons brandon.m.simm...@gmail.com writes:
 Thanks for the response. It seems like whole-file locking in sqlite
 would be a good choice for me in any case,

 In a situation where the whole-file locking scheme is used, would AFS
 be an acceptable choice? Would it be better than NFS?

I had the same idea, and tried it.  It does not work.  Your databases
will get corrupted.  I never figured out why, although I did confirm
that sqlite was in fact requesting only whole-file locks.

It would be nice if it worked, though.  There are a lot of applications
out there where writes to the database are extremely rare, so
invalidating all the clients' caches is not a problem.

  - a




[OpenAFS] why PAGs?

2010-03-01 Thread Adam Megacz

I recently found out that Coda does not have PAGs, and deliberately
omits them (it's not just that they haven't had time to implement them).

This got me to wondering: why does AFS have PAGs?  Restricting the focus
to UNIX for a moment, if we assume that there is a local userid for
every PTS identity, are PAGs really necessary?  Even for something like
mod_waklog, it should be possible to use local userids for credential
isolation.

Just curious.  I'm not seriously proposing getting rid of PAGs or
anything like that.  Just trying to understand things.

  - a



[OpenAFS] Re: another MacOS cache manager wedging

2010-03-01 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 I'd be interested to know if this still happens with 1.5.72,

No, it does not still happen.

I've been using 1.5.72 for at least a week now and I am VERY happy with
it.  On MacOS it is a massive improvement over the 1.4.x series.  Thank
you!

  - a



[OpenAFS] more kernel panics

2010-02-14 Thread Adam Megacz

Hope these reports are helpful...

  - a

meg...@quine:~$sudo ~/bin/openafs-decode.pl -i /Library/Logs/DiagnosticReports/Kernel_2010-02-14-170743_Adam-Megaczs-MacBook.panic
/Library/OpenAFS/Tools/root.client/usr/vice/etc/afs.kext appears to be loadable (not including linkage for on-disk libraries).
/tmp/afsdebugb4JqKK/gdb.input:6: Error in sourced command file:
Cannot access memory at address 0x0

Interval Since Last Panic Report:  299638 sec
Panics Since Last Report:  2
Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0

Sun Feb 14 17:07:43 2010
panic(cpu 1 caller 0x2a7ac2): Kernel trap at 0x, type 14=page fault, registers:
CR0: 0x8001003b, CR2: 0x, CR3: 0x00101000, CR4: 0x06e0
EAX: 0x4613b080, EBX: 0x, ECX: 0x, EDX: 0x051dea04
CR2: 0x, EBP: 0x3513bcc0, ESI: 0x00b8, EDI: 0x
EFL: 0x00010216, EIP: 0x, CS:  0x0004, DS:  0x3513000c
Error code: 0x0010

Backtrace (CPU 1), Frame : Return Address (4 potential args on stack)
0x3513ba78 : 0x21b2bd (0x5cf868 0x3513baac 0x223719 0x0) 
0x3513bac8 : 0x2a7ac2 (0x591c30 0x0 0xe 0x591dfa) 
0x3513bba8 : 0x29d968 (0x3513bbc0 0x3513bbf8 0x3513bcc0 0x0) 
0x3513bbb8 : 0x0 (0xe 0x35130048 0x2a000c 0xc) 
0x3513bcc0 : 0x460bedca (0x45e56864 0x5e43004 0xb8 0x0) 
0x3513bdd0 : 0x460c3746 (0x5e43004 0x45e56864 0x3513bf5c 0x4) 
0x3513bf18 : 0x460e80e1 (0x45e56864 0x3513bf5c 0x4 0x3513bf3c) 
0x3513bf3c : 0x460a6c38 (0x45e56864 0x3513bf5c 0x227595 0x4613bd00) 
0x3513bf80 : 0x460a72c1 (0x4613bd00 0x0 0x3fc 0x57e0390) 
0x3513bfac : 0x46127f87 (0x4613ba48 0x3513bfc8 0x227595 0x2) 
0x3513bfc8 : 0x29d68c (0x35173ad0 0x0 0x29d69b 0x3fe1078) 
  Kernel Extensions in backtrace (with dependencies):
 org.openafs.filesystems.afs(1.5.71)@0x4609a000-0x46148fff

BSD process name corresponding to current thread: kernel_task

Mac OS version:
10C540

Kernel version:
Darwin Kernel Version 10.2.0: Tue Nov  3 10:37:10 PST 2009; 
root:xnu-1486.2.11~1/RELEASE_I386
System model name: MacBook1,1 (Mac-F4208CC8)

System uptime in nanoseconds: 10802080978895
unloaded kexts:
com.apple.iokit.IOUSBMassStorageClass   2.5.1 (addr 0x34e81000, size 0x45056) - 
last unloaded 3666399322849
loaded kexts:
org.virtualbox.kext.VBoxNetAdp  3.0.6 - last loaded 3537690793655
org.virtualbox.kext.VBoxNetFlt  3.0.6
org.virtualbox.kext.VBoxUSB 3.0.6
org.virtualbox.kext.VBoxDrv 3.0.6
com.cisco.nke.ipsec 2.0.1
org.openafs.filesystems.afs 1.5.71
com.apple.driver.IOBluetoothBNEPDriver  2.2.4f3
com.apple.filesystems.autofs2.1.0
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.iokit.CHUDUtils   201
com.apple.driver.AppleIntelYonahProfile 14
com.apple.iokit.CHUDProf214
com.apple.driver.AudioIPCDriver 1.1.2
com.apple.driver.AppleHDA   1.7.9a4
com.apple.driver.AppleUpstreamUserClient3.1.0
com.apple.driver.AppleIntelGMA950   6.0.6
com.apple.driver.SMCMotionSensor3.0.0d4
com.apple.iokit.AppleYukon2 3.1.14b1
com.apple.driver.AirPort.Atheros421.19.8
com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0
com.apple.driver.AppleLPC   1.4.9
com.apple.driver.AppleBacklight 170.0.14
com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6
com.apple.driver.AppleUSBTrackpad   1.8.0b4
com.apple.driver.AppleUSBTCKeyEventDriver   1.8.0b4
com.apple.driver.AppleUSBTCKeyboard 1.8.0b4
com.apple.driver.AppleIRController  251.1.4
com.apple.driver.AppleRAID  4.0.6
com.apple.BootCache 31
com.apple.iokit.IOAHCIBlockStorage  1.6.0
com.apple.driver.AppleUSBHub3.8.4
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1
com.apple.driver.AppleAHCIPort  2.0.1
com.apple.driver.AppleUSBEHCI   3.7.5
com.apple.driver.AppleEFINVRAM  1.3.0
com.apple.driver.AppleFWOHCI4.4.0
com.apple.driver.AppleUSBUHCI   3.7.5
com.apple.driver.AppleIntelPIIXATA  2.5.0
com.apple.driver.AppleRTC   1.3
com.apple.driver.AppleHPET  1.4
com.apple.driver.AppleSmartBatteryManager   160.0.0
com.apple.driver.AppleACPIButtons   1.3
com.apple.driver.AppleSMBIOS1.4
com.apple.driver.AppleACPIEC1.3
com.apple.driver.AppleAPIC  1.4
com.apple.driver.AppleIntelCPUPowerManagementClient 96.0.0
com.apple.security.sandbox  0
com.apple.security.quarantine   0
com.apple.nke.applicationfirewall   2.1.11
com.apple.driver.AppleIntelCPUPowerManagement   96.0.0
com.apple.driver.AppleProfileReadCounterAction  17
com.apple.driver.AppleProfileTimestampAction10
com.apple.driver.AppleProfileThreadInfoAction   14
com.apple.driver.AppleProfileRegisterStateAction10
com.apple.driver.AppleProfileKEventAction   10
com.apple.driver.AppleProfileCallstackAction20
com.apple.iokit.IOSurface   73.0
com.apple.iokit.IOBluetoothSerialManager2.2.4f3
com.apple.iokit.CHUDKernLib 207
com.apple.driver.DspFuncLib 1.7.9a4
com.apple.iokit.IOSerialFamily  10.0.3
com.apple.iokit.IOFireWireIP2.0.3
com.apple.iokit.IO80211Family   

[OpenAFS] another kernel panic with 1.5.71

2010-02-07 Thread Adam Megacz

meg...@quine:~$sudo ~/bin/openafs-decode.pl -i
/Library/Logs/DiagnosticReports/Kernel_2010-02-07-174609_Adam-Megaczs-MacBook.panic
/Library/OpenAFS/Tools/root.client/usr/vice/etc/afs.kext appears to be
loadable (not including linkage for on-disk libraries).
/tmp/afsdebuggvaRGC/gdb.input:6: Error in sourced command file:
Cannot access memory at address 0x0


Interval Since Last Panic Report:  1655062 sec
Panics Since Last Report:  9
Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0

Sun Feb  7 17:46:09 2010
panic(cpu 1 caller 0x2a7ac2): Kernel trap at 0x, type 14=page
fault, registers:
CR0: 0x8001003b, CR2: 0x, CR3: 0x00101000, CR4: 0x06e0
EAX: 0x45f54080, EBX: 0x, ECX: 0x, EDX: 0x094a6c04
CR2: 0x, EBP: 0x3467bbfc, ESI: 0x0657, EDI: 0x
EFL: 0x00010206, EIP: 0x, CS:  0x0004, DS:  0x3467000c
Error code: 0x0010

Backtrace (CPU 1), Frame : Return Address (4 potential args on stack)
0x3467b9b8 : 0x21b2bd (0x5cf868 0x3467b9ec 0x223719 0x0) 
0x3467ba08 : 0x2a7ac2 (0x591c30 0x0 0xe 0x591dfa) 
0x3467bae8 : 0x29d968 (0x3467bafc 0x3467bbfc 0x0 0xe) 
0x3467baf4 : 0x0 (0xe 0x34670048 0x2a000c 0xc) 
0x3467bbfc : 0x45ed7dca (0x45ce4664 0x5ade004 0x657 0x0) 
0x3467bd0c : 0x45edc746 (0x5ade004 0x45ce4664 0x3467be88 0x1) 
0x3467be54 : 0x45f02fcd (0x45ce4664 0x3467be88 0x1 0x0) 
0x3467beb0 : 0x45f3eebc (0x45ce4664 0x42d8280 0x43f53d4 0x88336f0) 
0x3467bed8 : 0x2f6d6c (0x3467befc 0x3711fcb9 0x0 0x0) 
0x3467bf28 : 0x2e3e3c (0x88336f0 0x1 0x5aee7c4 0x3467bf5c) 
0x3467bf78 : 0x4ee5dc (0x49f6000 0x5aee6c0 0x5aee704 0x0) 
0x3467bfc8 : 0x29deb8 (0x4fc3014 0x0 0x4 0x4fc3014) 
No mapping exists for frame pointer
Backtrace terminated-invalid frame pointer 0xbfffe5d8
  Kernel Extensions in backtrace (with dependencies):
 org.openafs.filesystems.afs(1.5.71)@0x45eb3000-0x45f61fff

BSD process name corresponding to current thread: emacs

Mac OS version:
10C540

Kernel version:
Darwin Kernel Version 10.2.0: Tue Nov  3 10:37:10 PST 2009;
root:xnu-1486.2.11~1/RELEASE_I386
System model name: MacBook1,1 (Mac-F4208CC8)

System uptime in nanoseconds: 240706717768
unloaded kexts:
com.apple.driver.AppleFileSystemDriver  2.0 (addr 0x2ebb2000, size
0x12288) - last unloaded 146893998952
loaded kexts:
org.virtualbox.kext.VBoxNetAdp  3.0.6 - last loaded 52888618916
org.virtualbox.kext.VBoxNetFlt  3.0.6
org.virtualbox.kext.VBoxUSB 3.0.6
org.virtualbox.kext.VBoxDrv 3.0.6
com.cisco.nke.ipsec 2.0.1
org.openafs.filesystems.afs 1.5.71
com.FTDI.driver.FTDIUSBSerialDriver 2.2.14
com.apple.driver.IOBluetoothBNEPDriver  2.2.4f3
com.apple.filesystems.autofs2.1.0
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.iokit.CHUDUtils   201
com.apple.driver.AppleIntelYonahProfile 14
com.apple.iokit.CHUDProf214
com.apple.driver.AppleIntelGMA950   6.0.6
com.apple.driver.AudioIPCDriver 1.1.2
com.apple.driver.AppleHDA   1.7.9a4
com.apple.driver.AppleUpstreamUserClient3.1.0
com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6
com.apple.driver.SMCMotionSensor3.0.0d4
com.apple.iokit.AppleYukon2 3.1.14b1
com.apple.driver.AirPort.Atheros421.19.8
com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0
com.apple.driver.AppleLPC   1.4.9
com.apple.driver.AppleBacklight 170.0.14
com.apple.iokit.SCSITaskUserClient  2.6.0
com.apple.driver.AppleIRController  251.1.4
com.apple.driver.AppleUSBTrackpad   1.8.0b4
com.apple.driver.AppleUSBTCKeyEventDriver   1.8.0b4
com.apple.driver.AppleUSBTCKeyboard 1.8.0b4
com.apple.iokit.IOAHCIBlockStorage  1.6.0
com.apple.driver.AppleRAID  4.0.6
com.apple.driver.AppleUSBHub3.8.4
com.apple.BootCache 31
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1
com.apple.driver.AppleUSBEHCI   3.7.5
com.apple.driver.AppleFWOHCI4.4.0
com.apple.driver.AppleAHCIPort  2.0.1
com.apple.driver.AppleIntelPIIXATA  2.5.0
com.apple.driver.AppleUSBUHCI   3.7.5
com.apple.driver.AppleEFINVRAM  1.3.0
com.apple.driver.AppleRTC   1.3
com.apple.driver.AppleHPET  1.4
com.apple.driver.AppleSmartBatteryManager   160.0.0
com.apple.driver.AppleACPIButtons   1.3
com.apple.driver.AppleSMBIOS1.4
com.apple.driver.AppleACPIEC1.3
com.apple.driver.AppleAPIC  1.4
com.apple.driver.AppleIntelCPUPowerManagementClient  

[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?

2010-02-06 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 Ok, so, can you gather rxdebug (hungclient) 7001 and perhaps a couple
 minutes of tcpdump -s 1500 -n -w /tmp/packets host (hungclient) and
 port 7001 at this point?  (specify an ethernet interface with -i if
 it's not the default that's your upstream)

I can't predict when the hangs will happen, so I can't start the tcpdump
until it's already hung.  Is this still going to be useful?  If so, I
will gather the data you request.

By the way, I tried upgrading to 1.5.71 which -- except for the issue
I'm just about to post -- has been really great.  The blocking issue has
not happened yet (although I've only been using it a week), and fs
precache is a godsend for streaming media -- works beautifully.

  - a



[OpenAFS] reproducible kernel panic with 1.5.71 on MacOS 10.6

2010-02-06 Thread Adam Megacz

I can get this to happen consistently.

  - a

meg...@quine:/tmp$./decode.pl -i
/Library/Logs/DiagnosticReports/Kernel_2010-02-06-135623_Adam-Megaczs-MacBook.panic
/Library/OpenAFS/Tools/root.client/usr/vice/etc/afs.kext appears to be
loadable (not including linkage for on-disk libraries).
/var/folders/I6/I6COgALlF4K3h0E+O9vmUE+++TI/-Tmp-/afsdebugI8kfyu/gdb.input:6:
Error in sourced command file:
Cannot access memory at address 0x0
Can't write to folder /var/db/openafs/logs. at ./decode.pl line 268
  main::write_dump_file('/var/db/openafs/logs/crash.dump',
  'HASH(0x811c10)', 'add symbol table from file
  /var/folders/I6/I6COgALlF4K3h0E+O...') called at ./decode.pl line
  101

..
Interval Since Last Panic Report:  1586223 sec
Panics Since Last Report:  2
Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0

Sat Feb  6 13:56:23 2010
panic(cpu 0 caller 0x2a7ac2): Kernel trap at 0x, type 14=page
fault, registers:
CR0: 0x8001003b, CR2: 0x, CR3: 0x00101000, CR4: 0x06e0
EAX: 0x45fc7080, EBX: 0x, ECX: 0x, EDX: 0x04b94404
CR2: 0x, EBP: 0x34f63cc0, ESI: 0x180c, EDI: 0x
EFL: 0x00010216, EIP: 0x, CS:  0x0004, DS:  0x34f6000c
Error code: 0x0010

Backtrace (CPU 0), Frame : Return Address (4 potential args on stack)
0x34f63a78 : 0x21b2bd (0x5cf868 0x34f63aac 0x223719 0x0) 
0x34f63ac8 : 0x2a7ac2 (0x591c30 0x0 0xe 0x591dfa) 
0x34f63ba8 : 0x29d968 (0x34f63bc0 0x34f63bf8 0x34f63cc0 0x0) 
0x34f63bb8 : 0x0 (0xe 0x34f60048 0x2a000c 0xc) 
0x34f63cc0 : 0x45f4adca (0x45ce2b94 0x9089004 0x180c 0x0) 
0x34f63dd0 : 0x45f4f746 (0x9089004 0x45ce2b94 0x34f63f5c 0x4) 
0x34f63f18 : 0x45f740e1 (0x45ce2b94 0x34f63f5c 0x4 0x34f63f3c) 
0x34f63f3c : 0x45f32c38 (0x45ce2b94 0x34f63f5c 0x246 0x45fc7d00) 
0x34f63f80 : 0x45f332c1 (0x45fc7d00 0x34f63fac 0x487ab6 0x34f73ad0) 
0x34f63fac : 0x45fb3f87 (0x34f73ad0 0x34f63fcc 0x227595 0x2) 
0x34f63fc8 : 0x29d68c (0x34f73ad0 0x0 0x8 0x3dec334) 
  Kernel Extensions in backtrace (with dependencies):
 org.openafs.filesystems.afs(1.5.71)@0x45f26000-0x45fd4fff

BSD process name corresponding to current thread: kernel_task

Mac OS version:
10C540

Kernel version:
Darwin Kernel Version 10.2.0: Tue Nov  3 10:37:10 PST 2009;
root:xnu-1486.2.11~1/RELEASE_I386
System model name: MacBook1,1 (Mac-F4208CC8)

System uptime in nanoseconds: 217490384250
unloaded kexts:
com.apple.driver.AppleFileSystemDriver  2.0 (addr 0x2ebaa000, size
0x12288) - last unloaded 197935465108
loaded kexts:
org.virtualbox.kext.VBoxNetAdp  3.0.6 - last loaded 54522995514
org.virtualbox.kext.VBoxNetFlt  3.0.6
org.virtualbox.kext.VBoxUSB 3.0.6
org.virtualbox.kext.VBoxDrv 3.0.6
com.cisco.nke.ipsec 2.0.1
org.openafs.filesystems.afs 1.5.71
com.FTDI.driver.FTDIUSBSerialDriver 2.2.14
com.apple.driver.IOBluetoothBNEPDriver  2.2.4f3
com.apple.filesystems.autofs2.1.0
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.iokit.CHUDUtils   201
com.apple.driver.AppleIntelYonahProfile 14
com.apple.iokit.CHUDProf214
com.apple.driver.AppleIntelGMA950   6.0.6
com.apple.driver.AudioIPCDriver 1.1.2
com.apple.driver.AppleHDA   1.7.9a4
com.apple.driver.AppleUpstreamUserClient3.1.0
com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6
com.apple.driver.SMCMotionSensor3.0.0d4
com.apple.iokit.AppleYukon2 3.1.14b1
com.apple.driver.AirPort.Atheros421.19.8
com.apple.driver.AppleLPC   1.4.9
com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0
com.apple.driver.AppleBacklight 170.0.14
com.apple.iokit.SCSITaskUserClient  2.6.0
com.apple.driver.AppleIRController  251.1.4
com.apple.driver.AppleUSBTrackpad   1.8.0b4
com.apple.driver.AppleUSBTCKeyEventDriver   1.8.0b4
com.apple.driver.AppleUSBTCKeyboard 1.8.0b4
com.apple.iokit.IOAHCIBlockStorage  1.6.0
com.apple.driver.AppleRAID  4.0.6
com.apple.driver.AppleFWOHCI4.4.0
com.apple.driver.AppleUSBHub3.8.4
com.apple.driver.AppleAHCIPort  2.0.1
com.apple.driver.AppleUSBEHCI   3.7.5
com.apple.BootCache 31
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1
com.apple.driver.AppleIntelPIIXATA  2.5.0
com.apple.driver.AppleEFINVRAM  1.3.0
com.apple.driver.AppleUSBUHCI   3.7.5
com.apple.driver.AppleRTC   1.3
com.apple.driver.AppleHPET  1.4
com.apple.driver.AppleSmartBatteryManager

[OpenAFS] bug report: MacOS installer says a newer version is installed even if it was uninstalled

2010-02-06 Thread Adam Megacz

To reproduce:

  1. Install OpenAFS 1.5.71
  2. Reboot
  3. Run OpenAFS 1.5.71 Uninstall.command script
  4. Reboot
  5. Attempt to install OpenAFS 1.4.12rc2

It will refuse to install.

  - a





[OpenAFS] Re: bug report: MacOS installer says a newer version is installed even if it was uninstalled

2010-02-06 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 that's true. 1.5.x > 1.4.x for their rules. the 1.5 installers do let
 you backrev. we could backport it to 1.4

Actually, I think the problem is that the 1.4.x uninstaller does not run
pkgutil --forget org.openafs.OpenAFS.pkg.  Adding that ought to fix it.

  - a



[OpenAFS] Re: another MacOS cache manager wedging

2010-02-02 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 On Sun, Jan 31, 2010 at 2:23 PM, Adam Megacz a...@megacz.com wrote:
       server 169.229.3.178 partition /vicepa RO Site  -- New release
       server gentzen.megacz.com partition /vicepa RO Site  -- Old release
       server gentzen.megacz.com partition /vicepa RW Site  -- New release
       server gentzen.megacz.com partition /vicepa RO Site  -- New release

 Uh. This isn't right. You have 2 RO sites on the same server and partition.

Ok, fixed, but the problem persists (see below).  I'm upgrading to
1.4.12rc2 now, I will see if that solves the issue.

  - a

meg...@quine:~$cmdebug localhost
Lock afs_xvcache status: (upgrade_waiting, write_locked(pid:30499 at:335), 1 
waiters)
Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 1 waiters)




[OpenAFS] Re: another MacOS cache manager wedging

2010-01-31 Thread Adam Megacz

Thanks for taking the time to check this out.  This issue with the CM
blocking unnecessarily seems to have been happening intermittently ever
since I upgraded from 1.4.6 to 1.4.11 (and continued with 1.4.12 while I
was using it).  Unfortunately I can't downgrade to see if it goes away
because I run Snow Leopard now.  I'm pretty darn sure this is an issue
with the client -- either something that changed in OpenAFS or else
something that changed in MacOS, because it was all humming quite nicely
beforehand.

Derrick Brashear sha...@gmail.com writes:
 Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
 Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 2
 waiters)
 ** Cache entry @ 0x45dee6c4 for 200.536870919.44.36878 [megacz.com]
    locks: (none_waiting, write_locked(pid:10589 at:54))

 Which server is 536870919 on,

The vos ex output is below; let me know if you need additional
information.

 and why (from the client's perspective) would it not be answering?

I don't know.  There is a significant possibility of very small, steady
(like 3%) packet loss on the path between the client and the server, but
otherwise both servers (169.229.3.178 and gentzen.megacz.com) are up,
healthy, and responsive.

root.cell.readonly536870919 RO 28 K  On-line
169.229.3.178 /vicepa 
RWrite  536870918 ROnly  0 Backup  0 
MaxQuota   5000 K 
CreationWed Jan 13 19:36:04 2010
CopySun May 11 11:42:30 2008
Backup  Wed Jan 13 17:00:11 2010
Last Update Wed Jan 13 19:34:12 2010
89 accesses in the past day (i.e., vnode references)

root.cell.readonly536870919 RO 28 K  On-line
gentzen.megacz.com /vicepa 
RWrite  536870918 ROnly  536880267 Backup  536870920 
MaxQuota   5000 K 
CreationWed Jan 13 19:36:04 2010
CopyWed Jan 13 19:36:04 2010
Backup  Wed Jan 13 17:00:11 2010
Last Update Wed Jan 13 19:34:12 2010
205 accesses in the past day (i.e., vnode references)

RWrite: 536870918 ROnly: 536870919 Backup: 536870920
RClone: 536870919 
number of sites - 4
   server 169.229.3.178 partition /vicepa RO Site  -- New release
   server gentzen.megacz.com partition /vicepa RO Site  -- Old release
   server gentzen.megacz.com partition /vicepa RW Site  -- New release
   server gentzen.megacz.com partition /vicepa RO Site  -- New release




[OpenAFS] Re: another MacOS cache manager wedging

2010-01-31 Thread Adam Megacz

Here's another one:

** Cache entry @ 0x45eb543c for 200.536879758.1.1 [megacz.com]
locks: (none_waiting, 1 read_locks(pid:6896))
6144 bytes  DV  438  refcnt 1
callback 0677db04   expires 1265005388
0 opens  0 writers
volume root
states (0x1), stat'd
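
The "expires" field in a cache entry like the one above looks like a Unix
timestamp. Assuming that is what it is (the thread does not confirm it), a
short Python sketch can turn it into a readable UTC time. The function name
callback_expiry is made up for this example.

```python
from datetime import datetime, timezone


def callback_expiry(expires):
    """Interpret a cmdebug 'expires' value as seconds since the Unix epoch.

    Assumption: the field is an absolute epoch time in seconds.
    """
    return datetime.fromtimestamp(expires, tz=timezone.utc)


if __name__ == "__main__":
    # Value copied from the cache entry quoted above.
    print(callback_expiry(1265005388).isoformat())
```

Under that assumption the callback on this entry would expire early on
2010-02-01 UTC, a few hours after the message was sent.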



[OpenAFS] Re: another MacOS cache manager wedging

2010-01-30 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 ** Cache entry @ 0x45e0f004 for 1.1.1.1 [dynroot]
 locks: (writer_waiting, write_locked(pid:2870 at:54), 2 waiters)

 I don't even have to look at this one. 54 is FetchStatus. Oddly, it's
 dynroot, so there's something off here.

Here's another:

Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 2
waiters)
** Cache entry @ 0x45dee6c4 for 200.536870919.44.36878 [megacz.com]
locks: (none_waiting, write_locked(pid:10589 at:54))
   0 bytes  DV0  refcnt 1
callback    expires 0
0 opens  0 writers
normal file
states (0x4), read-only
** Cache entry @ 0x45deeafc for 200.536880268.1.1 [megacz.com]
locks: (none_waiting, 1 read_locks(pid:10548))
   14336 bytes  DV  325  refcnt 1
callback 07dc1284   expires 1264896318
3 opens  0 writers
volume root
states (0x1), stat'd



[OpenAFS] Re: Recommended way to start up OpenAFS on Solaris 10?

2010-01-30 Thread Adam Megacz

Atro Tossavainen atro.tossavainen+open...@helsinki.fi writes:
 Is everybody still writing their own SMF bits to start OpenAFS on
 Solaris 10 without /etc/init.d bits, or is there already a Received
 Way of doing this?

For the server component, I use the script below (with runit, sort of
like SMF for Linux).

If my "make bosserver handle SIGTERM properly" patch is merged, this
mess will get a lot simpler (and more reliable):

#!/bin/bash
DAEMON=openafs-fileserver
mkdir -p /etc/service/$DAEMON/control
# Build the runit "t" (TERM) control script in a temporary file first.
echo '#!/bin/bash' > /etc/service/.tmpfile-$DAEMON
echo '/usr/bin/bos shutdown -wait -localauth `hostname`' >> /etc/service/.tmpfile-$DAEMON
echo 'kill `cat /etc/service/$DAEMON/supervise/pid`' >> /etc/service/.tmpfile-$DAEMON
chmod +x /etc/service/.tmpfile-$DAEMON
mv /etc/service/.tmpfile-$DAEMON /etc/service/$DAEMON/control/t

exec /usr/sbin/bosserver -nofork




[OpenAFS] another MacOS cache manager wedging

2010-01-29 Thread Adam Megacz

So, I successfully demultihomed all servers in the cell in question.

Unfortunately the random blocking still seems to be happening.  The one
shown below was particularly nasty: it did not resolve after any
reasonable approximation to the timeout value (stayed stuck for well
over 30 minutes before I gave up and rebooted the client).

  - a



hosed
Description: Binary data


hosed.long
Description: Binary data


[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?

2010-01-27 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 I might be able to try that, but it will take a few days.

 if true, you should see output in cmdebug now

Okay, I just caught it red-handed.  Can anybody help with reading the
tea leaves here?

  meg...@quine:~$cmdebug localhost
  Lock afs_xvcache status: (none_waiting, write_locked(pid:11013 at:335))
  Lock afs_xserver status: (none_waiting, 1 read_locks(pid:0))
  Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 1 waiters)

  - a




[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?

2010-01-27 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
  Lock afs_xvcache status: (none_waiting, write_locked(pid:11013
 at:335))

Ah, so I am to interpret the thing after the comma as the name of a
function somewhere within the openafs source code.  Knowing that helps a
lot!
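
For reading cmdebug lock lines like the ones quoted in this thread, a small
parser can pull out the lock name, the wait state, the write-lock holder's
pid, and the at: number. This is only a sketch in Python: the regular
expressions are guessed from the lines quoted in this thread rather than taken
from cmdebug's source, and the dictionary keys are invented for illustration.

```python
import re

# Shape guessed from the output in this thread, e.g.
#   Lock afs_xvcb status: (writer_waiting, write_locked(pid:0 at:273), 1 waiters)
LOCK_RE = re.compile(r"Lock (\S+) status: \((\w+)(?:, (.*))?\)$")
HOLDER_RE = re.compile(r"write_locked\(pid:(\d+) at:(\d+)\)")
WAITERS_RE = re.compile(r"(\d+) waiters")


def parse_lock_line(line):
    """Parse one 'Lock ... status: (...)' line; return None if it does not match."""
    match = LOCK_RE.match(line.strip())
    if match is None:
        return None
    name, wait_state, rest = match.groups()
    rest = rest or ""
    holder = HOLDER_RE.search(rest)
    waiters = WAITERS_RE.search(rest)
    return {
        "lock": name,
        "wait_state": wait_state,
        "writer_pid": int(holder.group(1)) if holder else None,
        "at": int(holder.group(2)) if holder else None,
        "waiters": int(waiters.group(1)) if waiters else 0,
    }


if __name__ == "__main__":
    sample = "Lock afs_xvcache status: (none_waiting, write_locked(pid:11013 at:335))"
    print(parse_lock_line(sample))
```

The sketch expects each lock entry on a single line, so output that a mail
client has wrapped would need to be rejoined first.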

 assuming you're not running disconnected and actively trying to
 disconnect,

Correct.

 So then the question is why FlushVCBs is blocking you. well, you said
 you had multihomed fileservers.

To be completely precise, one of my fileservers is a machine with two IP
addresses, with a one-line NetInfo file.  By "multihomed" did you mean
"on a machine with two public IPs" or "the AFS server somehow knows
about both IPs"?

 RXAFS_GiveUpCallBacks is called here. you didn't perchance grab
 rxdebug output for the client at this point?

Sorry, no; I will do that next time.

 could we address this? yes! how? well, i suppose we could on network
 events (macos has support for this) and when a new server is
 discovered, probe all addresses, so any unreachable addresses are
 marked down in advance.

How do I ask the cache manager to tell me what IPs it thinks a
particular server has?

  - a



[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?

2010-01-27 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 You don't. You can ask the vlserver, which is how the CM found out anyhow:
 vos listaddrs -printuuid -noresolve

Yikes, that list is full of incorrect addresses.  How on earth is the
list compiled?

  - a



[OpenAFS] advice on troubleshooting blocked cache manager on MacOS?

2010-01-21 Thread Adam Megacz

Hi, lately I've been encountering a lot of situations where a process
seems to block for a really long time trying to access something in
/afs; it usually succeeds, but only after several minutes.  This seems
to happen only on MacOS (1.4.11, although I saw it with 1.4.10 too).

Can anybody give me some advice on how to go about discovering exactly
which file access is blocked, and perhaps why?  I have control of the
fileserver in this situation.

Thanks!

  - a



[OpenAFS] Re: advice on troubleshooting blocked cache manager on MacOS?

2010-01-21 Thread Adam Megacz

Ken Hornstein k...@cmf.nrl.navy.mil writes:
 If it matters at all, I saw the exact same thing.  It seemed to be
 caused by a combination of a multihomed fileserver and AFS client
 behind a NAT (yeah, it's easy to see how that would be an issue).

Wow, that is really interesting, *both* of those factors are in play in
my situation as well.

 Once the multihomed fileserver went away, it all fixed itself.

I might be able to try that, but it will take a few days.

  - a



[OpenAFS] Re: 1.4.12fc1 kernel panics

2010-01-16 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 However, there is a tool that will help you do this - decode-panic.
 I'm not sure if we're installing it in the 1.4.x series,

No, you aren't.

I very very very strongly urge the gatekeepers to arrange for the
installers to always install all tools necessary to create a complete
bug report.  I take the time to gather the information you guys ask for,
but most users won't.

  - a




[OpenAFS] Re: 1.4.12fc1 kernel panics

2010-01-16 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 http://git.openafs.org/?p=openafs.git;a=blob_plain;f=src/packaging/MacOS/decode-panic;h=a775b9a82b1deea7abdc2c0f109dc04446371e60;hb=HEAD

That did not work.

meg...@quine:/tmp$sudo ./x.pl
Can't find panic file: /Library/Logs/panic.log!
 at ./x.pl line 75
meg...@quine:/tmp$ls /Library/Logs/panic.log
ls: /Library/Logs/panic.log: No such file or directory

  - a



[OpenAFS] Re: 1.4.12fc1 kernel panics

2010-01-16 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 Unfortunately, just a bare panic log isn't that much use when it comes
 to tracking the problem down. Unless we've got exactly the same
 kernel,

Mac OS X 10.6.2, build 10C540 (general release)

 architecture,

i386 (32-bit)

 OpenAFS build,

Check the subject line.

  - a




[OpenAFS] Re: 1.4.12fc1 kernel panics

2010-01-16 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 Instead, give it some arguments, like -i
 /Library/Logs/DiagnosticReports/(path to kernel report).

Ok,

Panic Date:  Interval Since Last Panic Report:  472905 sec
Kernel Version:  Darwin Kernel Version 10.2.0: Tue Nov  3 10:37:10 PST
2009; root:xnu-1486.2.11~1/RELEASE_I386
OpenAFS Version: org.openafs.filesystems.afs(1.4.12fc1)
=
add symbol table from file
/tmp/afsdebugLAjeJl/org.openafs.filesystems.afs.sym? 0x21b2bd
panic+445: mov0x8011d0,%eax
0x2a7ac2 kernel_trap+1530:jmp0x2a7ade kernel_trap+1558
0x29d968 lo_alltraps+712: mov%edi,%esp
0x4607e500 afs_GetDCache+7832:mov0x64(%edx),%ebx
0x46078a18 BPrefetch+144: mov%eax,-0x3c(%ebp)
0x4607928d afs_BackgroundDaemon+573:  jmp0x460792cb
afs_BackgroundDaemon+635
0x460e76a7 afsd_thread+719:   call   0x2a013e current_thread
0x29d68c call_continuation+28:add$0x10,%esp

  - a
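
Even without the decode script, the raw panic report for this crash (the
2010-01-15 message in this thread) says which frames belong to OpenAFS: it
prints the load range of the afs kext, and any return address inside that
range is kext code. The Python sketch below does that arithmetic with the
numbers copied from that report. The function name kext_offsets is made up
for the example, and mapping an offset to a symbol still needs the matching
kext binary, which this does not attempt.

```python
# Load range of org.openafs.filesystems.afs(1.4.12fc1), as printed in the
# "Kernel Extensions in backtrace" section of the 2010-01-15 panic report.
KEXT_START = 0x4606C000
KEXT_END = 0x4610DFFF

# Return addresses from the "Backtrace (CPU 1)" section of that report, in order.
RETURN_ADDRESSES = [
    0x21B2BD, 0x2A7AC2, 0x29D968, 0x4607E500,
    0x46078A18, 0x4607928D, 0x460E76A7, 0x29D68C,
]


def kext_offsets(addresses, start=KEXT_START, end=KEXT_END):
    """Return (address, offset from start) for each address inside [start, end]."""
    return [(addr, addr - start) for addr in addresses if start <= addr <= end]


if __name__ == "__main__":
    for addr, offset in kext_offsets(RETURN_ADDRESSES):
        print(f"{addr:#x} = afs kext base + {offset:#x}")
```

The four addresses it keeps are the same four that the decoded output above
attributes to afs_GetDCache, BPrefetch, afs_BackgroundDaemon and afsd_thread.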




[OpenAFS] Re: 1.4.12fc1 kernel panics

2010-01-16 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 If you're using the dmg's from the OpenAFS website,

yes



[OpenAFS] 1.4.12fc1 kernel panics

2010-01-15 Thread Adam Megacz

Interval Since Last Panic Report:  1043811 sec
Panics Since Last Report:  2
Anonymous UUID:06FDBD1C-6E37-46D3-92E6-009678548AC0

Fri Jan 15 20:40:17 2010
panic(cpu 1 caller 0x2a7ac2): Kernel trap at 0x4607e500, type 14=page
fault, registers:
CR0: 0x8001003b, CR2: 0x0064, CR3: 0x00101000, CR4: 0x06e0
EAX: 0x0010, EBX: 0x, ECX: 0x460870e2, EDX: 0x
CR2: 0x0064, EBP: 0x34cabf1c, ESI: 0x0bff4004, EDI: 0x
EFL: 0x00010297, EIP: 0x4607e500, CS:  0x0004, DS:  0x000c
Error code: 0x

Backtrace (CPU 1), Frame : Return Address (4 potential args on stack)
0x34cabc48 : 0x21b2bd (0x5cf868 0x34cabc7c 0x223719 0x0) 
0x34cabc98 : 0x2a7ac2 (0x591c30 0x4607e500 0xe 0x591dfa) 
0x34cabd78 : 0x29d968 (0x34cabd90 0x460b0d6d 0x34cabf1c 0x4607e500) 
0x34cabd88 : 0x4607e500 (0xe 0xbff0048 0x34ca000c 0x34ca000c) 
0x34cabf1c : 0x46078a18 (0x45da800c 0xf10 0x0 0x34cabf58) 
0x34cabf80 : 0x4607928d (0x46100420 0x34cabfac 0x487ab6 0x34883ab8) 
0x34cabfac : 0x460e76a7 (0x34883ab8 0x34cabfc8 0x227595 0x2) 
0x34cabfc8 : 0x29d68c (0x34883ab8 0x0 0x8 0x50ba33c) 
  Kernel Extensions in backtrace (with dependencies):
 org.openafs.filesystems.afs(1.4.12fc1)@0x4606c000-0x4610dfff

BSD process name corresponding to current thread: kernel_task

Mac OS version:
10C540

Kernel version:
Darwin Kernel Version 10.2.0: Tue Nov  3 10:37:10 PST 2009;
root:xnu-1486.2.11~1/RELEASE_I386
System model name: MacBook1,1 (Mac-F4208CC8)

System uptime in nanoseconds: 16634816639383
unloaded kexts:
com.apple.driver.AppleFileSystemDriver  2.0 (addr 0x2eda2000, size
0x12288) - last unloaded 161942628549
loaded kexts:
org.virtualbox.kext.VBoxNetAdp  3.0.6 - last loaded 34152370293
org.virtualbox.kext.VBoxNetFlt  3.0.6
org.virtualbox.kext.VBoxUSB 3.0.6
org.virtualbox.kext.VBoxDrv 3.0.6
com.cisco.nke.ipsec 2.0.1
org.openafs.filesystems.afs 1.4.12fc1
com.FTDI.driver.FTDIUSBSerialDriver 2.2.14
com.apple.driver.IOBluetoothBNEPDriver  2.2.4f3
com.apple.filesystems.autofs2.1.0
com.apple.Dont_Steal_Mac_OS_X   7.0.0
com.apple.iokit.CHUDUtils   201
com.apple.driver.AppleIntelYonahProfile 14
com.apple.iokit.CHUDProf214
com.apple.driver.AppleHDA   1.7.9a4
com.apple.driver.AudioIPCDriver 1.1.2
com.apple.driver.AppleUpstreamUserClient3.1.0
com.apple.driver.AppleIntelGMA950   6.0.6
com.apple.driver.SMCMotionSensor3.0.0d4
com.apple.iokit.AppleYukon2 3.1.14b1
com.apple.driver.AirPort.Atheros421.19.8
com.apple.driver.ACPI_SMC_PlatformPlugin4.0.1d0
com.apple.driver.AppleLPC   1.4.9
com.apple.driver.AppleIntelIntegratedFramebuffer6.0.6
com.apple.driver.AppleBacklight 170.0.14
com.apple.driver.AppleIRController  251.1.4
com.apple.driver.AppleUSBTrackpad   1.8.0b4
com.apple.driver.AppleUSBTCKeyEventDriver   1.8.0b4
com.apple.driver.AppleUSBTCKeyboard 1.8.0b4
com.apple.iokit.IOAHCIBlockStorage  1.6.0
com.apple.driver.AppleRAID  4.0.6
com.apple.driver.AppleUSBHub3.8.4
com.apple.driver.AppleUSBEHCI   3.7.5
com.apple.BootCache 31
com.apple.AppleFSCompression.AppleFSCompressionTypeZlib 1.0.0d1
com.apple.driver.AppleFWOHCI4.4.0
com.apple.driver.AppleUSBUHCI   3.7.5
com.apple.driver.AppleEFINVRAM  1.3.0
com.apple.driver.AppleAHCIPort  2.0.1
com.apple.driver.AppleIntelPIIXATA  2.5.0
com.apple.driver.AppleRTC   1.3
com.apple.driver.AppleHPET  1.4
com.apple.driver.AppleSmartBatteryManager   160.0.0
com.apple.driver.AppleACPIButtons   1.3
com.apple.driver.AppleSMBIOS1.4
com.apple.driver.AppleACPIEC1.3
com.apple.driver.AppleAPIC  1.4
com.apple.driver.AppleIntelCPUPowerManagementClient 96.0.0
com.apple.security.sandbox  0
com.apple.security.quarantine   0
com.apple.nke.applicationfirewall   2.1.11
com.apple.driver.AppleIntelCPUPowerManagement   96.0.0
com.apple.iokit.IOSCSIArchitectureModelFamily   2.6.0
com.apple.driver.AppleProfileReadCounterAction  17
com.apple.driver.AppleProfileTimestampAction10
com.apple.driver.AppleProfileThreadInfoAction   14
com.apple.driver.AppleProfileRegisterStateAction10
com.apple.driver.AppleProfileKEventAction   10
com.apple.driver.AppleProfileCallstackAction20
com.apple.iokit.IOSurface  

[OpenAFS] Re: afs/c...@realm vs a...@realm vs 1.5.68

2009-12-29 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 did you ask aklog what it's doing? -d (the debug switch) should tell
 you exactly what it's doing.

Yep

gentzen:/usr/src# aklog -d -c research.cs.berkeley.edu
Authenticating to cell  (server afs.research.cs.berkeley.EDU).
Trying to authenticate to user's realm RESEARCH.CS.BERKELEY.EDU.
Getting tickets: a...@research.cs.berkeley.edu
We've deduced that we need to authenticate to realm RESEARCH.CS.BERKELEY.EDU.
Getting tickets: a...@research.cs.berkeley.edu
Getting tickets: a...@research.cs.berkeley.edu
Kerberos error code returned by get_cred : -1765328377
aklog: Couldn't get  AFS tickets:
aklog: Server not found in Kerberos database while getting AFS tickets
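
The numeric code in the output above can be mapped back to a Kerberos
protocol error number.  A minimal sketch, assuming the MIT krb5 convention
that library error codes are offsets from the table base -1765328384
(KRB5KDC_ERR_NONE); the function name is hypothetical and only the one
code seen in this output is named:

```python
# Base of the MIT krb5 protocol error table (KRB5KDC_ERR_NONE).
KRB5_ERROR_TABLE_BASE = -1765328384

# Only the entry relevant to this output; the real table has many more.
KNOWN_ERRORS = {
    7: "KDC_ERR_S_PRINCIPAL_UNKNOWN (Server not found in Kerberos database)",
}


def describe_krb5_error(code):
    """Return (protocol error number, description) for a krb5 library code."""
    offset = code - KRB5_ERROR_TABLE_BASE
    return offset, KNOWN_ERRORS.get(offset, "not in this sketch's table")


print(describe_krb5_error(-1765328377))
```

The offset works out to 7, which is consistent with the "Server not found
in Kerberos database" text that aklog prints on the following line.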




[OpenAFS] Re: afs/c...@realm vs a...@realm vs 1.5.68

2009-12-29 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 gentzen:/usr/src# aklog -d -c research.cs.berkeley.edu
 Authenticating to cell  (server afs.research.cs.berkeley.EDU).

 Authenticating to cell %s

 So for some reason the cell configuration it's getting back from
 get_cellconfig doesn't include a cell name in it.

Why does it care?  I specified the cell to use on the command line,
explicitly, like I always do.

  - a



[OpenAFS] afs/c...@realm vs a...@realm vs 1.5.68

2009-12-28 Thread Adam Megacz

Many AFS tools will accept either of these two principals for the
vlserver/dbserver/fileserver:

  afs/c...@realm
   a...@realm

Is one preferred over the other for new cells?

Moreover, there seems to be some sort of change in the behavior of the
1.5.68 aklog relative to 1.4.11; the new aklog only appears to attempt
the latter one.
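
A minimal sketch of the two naming conventions, assuming they are the
conventional afs/<cell>@<REALM> and afs@<REALM> forms.  The helper name
and the example cell are hypothetical, and this is not aklog's actual
lookup code:

```python
def afs_principal_candidates(cell, realm):
    """Return the two conventional AFS service principal names for a cell.

    Assumption: the cell-qualified form is listed first; the order a given
    aklog version actually tries them in is not established here.
    """
    return [f"afs/{cell}@{realm}", f"afs@{realm}"]


for name in afs_principal_candidates("example.org", "EXAMPLE.ORG"):
    print(name)
```

A KDC that holds only one of the two names will answer "Server not found
in Kerberos database" to a request for the other.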

  - a



[OpenAFS] Re: afs/c...@realm vs a...@realm vs 1.5.68

2009-12-28 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 Moreover, there seems to be some sort of change in the behavior of the
 1.5.68 aklog relative to 1.4.11; the new aklog only appears to attempt
 the latter one.

 There seems not to be.

Well, when I hold tickets for afsad...@research.cs.berkeley.edu and
attempt to aklog to research.cs.berkeley.edu (which uses principal
afs/research.cs.berkeley@research.cs.berkeley.edu), I see this on
RESEARCH.CS.BERKELEY.EDU's KDC:

  2009-12-28_19:16:48.25167 Dec 28 11:16:48 research.cs.berkeley.edu 
krb5kdc[2979](info): TGS_REQ (1 etypes {1}) 65.23.129.159: UNKNOWN_SERVER: 
authtime 1262027795,  afsad...@research.cs.berkeley.edu for 
a...@research.cs.berkeley.edu, Server not found in Kerberos database
  2009-12-28_19:16:48.39314 Dec 28 11:16:48 research.cs.berkeley.edu 
krb5kdc[2979](info): TGS_REQ (1 etypes {1}) 65.23.129.159: UNKNOWN_SERVER: 
authtime 1262027795,  afsad...@research.cs.berkeley.edu for 
a...@research.cs.berkeley.edu, Server not found in Kerberos database
  2009-12-28_19:16:48.53461 Dec 28 11:16:48 research.cs.berkeley.edu 
krb5kdc[2979](info): TGS_REQ (1 etypes {1}) 65.23.129.159: UNKNOWN_SERVER: 
authtime 1262027795,  afsad...@research.cs.berkeley.edu for 
a...@research.cs.berkeley.edu, Server not found in Kerberos database

When I downgrade the client (65.23.129.159) to 1.4.11, everything
works fine.

I'm sure this is a configuration error on my part, and I've just
lucked out in some way that the 1.4.11 client is more forgiving about.

  - a



[OpenAFS] More EINVAL's with 1.5.68

2009-12-27 Thread Adam Megacz

Strange, I can cat the file in question, but for some reason Ruby
programs (I don't know Ruby) choke on it.

  - a

/usr/lib/ruby/1.8/fileutils.rb:1039:in `read': Invalid argument - /afs/megacz.com/user/m/me/megacz/.netstiff/store.new (Errno::EINVAL)
   from /usr/lib/ruby/1.8/fileutils.rb:1039:in `fu_copy_stream0'
   from /usr/lib/ruby/1.8/fileutils.rb:470:in `copy_stream'
   from /usr/lib/ruby/1.8/pstore.rb:369:in `commit_new'
   from /usr/lib/ruby/1.8/pstore.rb:368:in `open'
   from /usr/lib/ruby/1.8/pstore.rb:368:in `commit_new'
   from /usr/lib/ruby/1.8/pstore.rb:297:in `transaction'
   from /usr/bin/netstiff:245:in `each'
   from /usr/bin/netstiff:737:in `cleanup'
   from /usr/bin/netstiff:733:in `initialize'
   from /usr/bin/netstiff:1104:in `initialize'
   from /usr/bin/netstiff:1350:in `new'
   from /usr/bin/netstiff:1350

/usr/lib/ruby/1.8/fileutils.rb:1039:in `read': Invalid argument - /afs/megacz.com/user/m/me/megacz/.netstiff/store.new (Errno::EINVAL)
   from /usr/lib/ruby/1.8/fileutils.rb:1039:in `fu_copy_stream0'
   from /usr/lib/ruby/1.8/fileutils.rb:470:in `copy_stream'
   from /usr/lib/ruby/1.8/pstore.rb:369:in `commit_new'
   from /usr/lib/ruby/1.8/pstore.rb:368:in `open'
   from /usr/lib/ruby/1.8/pstore.rb:368:in `commit_new'
   from /usr/lib/ruby/1.8/pstore.rb:297:in `transaction'
   from /usr/bin/netstiff:245:in `each'
   from /usr/bin/netstiff:737:in `cleanup'
   from /usr/bin/netstiff:733:in `initialize'
   from /usr/bin/netstiff:1049:in `initialize'
   from /usr/bin/netstiff:1337:in `new'
   from /usr/bin/netstiff:1337




[OpenAFS] Re: More EINVAL's with 1.5.68

2009-12-27 Thread Adam Megacz

Andrew Deason adea...@sinenomine.net writes:
 They are probably performing a different sequence of syscalls on the
 file, or using different flags, etc. Is this reliably reproducible?
 fstrace dumps could be more enlightening,

Yes, but only until I reboot.  I posted the fstrace (see previous
thread) but was told that I had to grab one shortly after reboot.

Unfortunately this problem doesn't manifest itself until the machine
has been up for a while.  So we're kinda stuck.

  - a



[OpenAFS] Re: Issue report: invalid argument with 1.5.68 but not with 1.4.11

2009-12-24 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 Still no sign of anything failing in that log. There's also not much
 that looks like it's accessing git data, either. Derrick has had a
 look over it and can't see anything either.

 Could you try rebooting your machine, then immediately start fstrace
 and try to reproduce the bug - we might get a bit more information
 that way.

Hrm, disturbing news: the problem went away after rebooting.

  - a



[OpenAFS] Re: Current OpenAFS Backup Recommendations

2009-12-21 Thread Adam Megacz

Holger Rauch holger.ra...@empic.de writes:
 do you have any scripts available that you'd be willing to/allowed to
 share? I'd highly appreciate that.

Sure.  I think I posted them here before, but here's where it lives:

  /afs/megacz.com/srv/bin/dump.sh

Also handy for manipulating the incrementals are:

  /afs/megacz.com/srv/bin/compress.sh
  /afs/megacz.com/srv/bin/expand.sh

The backups are kept as a full dump of the most recent nightly plus a
chain of backwards diffs.  Only nightlies are supported right now
(no concept of monthlies/annuals).
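
As an illustration of how a chain of backwards diffs is walked at restore
time, here is a minimal sketch.  It is not the contents of dump.sh; the
function name and the date labels are hypothetical, and it only plans
which diffs to apply rather than applying them:

```python
def restore_plan(nightlies, target):
    """Return the backward diffs to apply, in order, to rebuild `target`.

    `nightlies` lists the backup dates, oldest first.  The newest one is
    assumed to be stored as a full dump; for every earlier night there is
    a backward diff turning the following night's dump into that night's.
    Each step is a (from_date, to_date) pair.
    """
    if target not in nightlies:
        raise ValueError(f"no backup recorded for {target}")
    idx = nightlies.index(target)
    steps = []
    # Walk from the newest full dump back toward the target, one night at a time.
    for i in range(len(nightlies) - 1, idx, -1):
        steps.append((nightlies[i], nightlies[i - 1]))
    return steps


nights = ["2009-12-18", "2009-12-19", "2009-12-20", "2009-12-21"]
print(restore_plan(nights, "2009-12-19"))
```

Restoring the most recent nightly needs no diffs at all, which is the
main attraction of keeping the diffs pointing backwards.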

  - a



[OpenAFS] Re: Issue report: invalid argument with 1.5.68 but not with 1.4.11

2009-12-21 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
   *) The kernel's message log with any messages output when the git
 clone failed

None.

   *) A fstrace log of the git clone operation (see
 http://blob.inf.ed.ac.uk/sxw/2009/01/24/using-fstrace-to-debug-the-afs-cache-manager/
  )

Ok, it's here: /afs/megacz.com/.pub/afs-cm-dump

   *) A packet trace of traffic between the client and the fileserver,
 showing the last 10 or so RPCs before the git clone fails

Hrm, that might be trickier.  If the fstrace log doesn't have the
information you need, could you recommend a particular tcpdump
invocation I should use?

Thanks Simon!

  - a




[OpenAFS] Re: Current OpenAFS Backup Recommendations

2009-12-21 Thread Adam Megacz

Holger Rauch holger.ra...@empic.de writes:
 Which entry do I have to add to my CellServDB file in order to be
 able to create the mount point for the cell megacz.com?

Make sure you are passing the -afsdb argument to afsd when you start
it.  Then you won't need to hardwire the IP address into your
configuration files.
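
With -afsdb the client finds a cell's database servers through DNS AFSDB
records (RFC 1183) instead of CellServDB.  A minimal sketch of reading
such records, assuming each input line is the record data in the usual
"<subtype> <hostname>" presentation form; the function name and the
hostnames are hypothetical:

```python
def parse_afsdb(lines):
    """Extract AFS database server hostnames from AFSDB record data lines.

    Subtype 1 marks an AFS cell database server; other subtypes (such as
    2, used for DCE) are ignored.
    """
    servers = []
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            continue  # skip blank or malformed lines
        subtype, host = fields
        if subtype == "1":
            servers.append(host.rstrip("."))  # drop the trailing root dot
    return servers


sample = ["1 afsdb1.example.org.", "2 dce.example.org.", ""]
print(parse_afsdb(sample))
```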

  - a




[OpenAFS] Re: Issue report: invalid argument with 1.5.68 but not with 1.4.11

2009-12-21 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 The Debian packaging may still be putting it one level up, without the C
 subdirectory, which may be the problem.

Yep, that was it.

Simon Wilkinson s...@inf.ed.ac.uk writes:
 If you could install that, and try the whole thing again, that would
 be great!

Done.  New dump is at /afs/megacz.com/.pub/afs-cm-dump

Thanks again!

  - a



[OpenAFS] Issue report: invalid argument with 1.5.68 but not with 1.4.11

2009-12-20 Thread Adam Megacz

I just upgraded one of my machines to the 1.5.68 client, and I've been
experiencing some mysterious behavior when using git.  The one
reproducible oddity is this:

  $ git clone /afs/megacz.com/debian/openafs/1.5/openafs-debian.git
  Initialize openafs-debian/.git
  Initialized empty Git repository in /tmp/openafs-debian/.git/
  error: copy-fd: read returned Invalid argument
  fatal: failed to copy file to openafs-debian/.git/objects/60/ae957b9b417ed139ccc0156ba9eca9542a48a6

Executing the same command on a machine with the 1.4.11 client works
fine.  The directory above is world-readable, so feel free to try (but
I may need to vos_release later this week, so please try soon if you
can).

Thanks,

  - a



[OpenAFS] Re: Linux tmpfs

2009-12-19 Thread Adam Megacz

Simon Wilkinson s...@inf.ed.ac.uk writes:
 Crikey - this is a thread from almost 12 months ago - talk about
 necromancy!

Ah yes, the magic of gmane ;)

 Anyway, the first point is solved by Marc Dionne's LINUX_USE_FH
 patches in the 1.5.x series - these let you use pretty much any
 filesystem as a disk cache, and are automatically enabled for kernels
 that are new enough to lack the iget interface - check to see if
 LINUX_USE_FH is defined for your build and, if it isn't, define it.

That sounds great; looks like you've got another 1.5.x beta tester
now.

Thanks Simon!

  - a



[OpenAFS] Re: Current OpenAFS Backup Recommendations

2009-12-19 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 You can get remarkably good compression by using xdelta3 on dump
 files.

 Don't you have to do full dumps in order to calculate the xdelta3
 differences?

I pipe the output of vos dump directly to xdelta3, so the full dump
never hits the disk.  If you perform the dump and diff on the
fileserver itself, then the network traffic is also proportional to
the size of the diff (rather than the size of the whole dump).  You
can then send the diff over the network to secondary storage.
Secondary storage needs to keep a copy of the full dump and reapply
the diff, so the amount of effort is proportional to the size of the
dump, but the network traffic is proportional to the diff.
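
The pipe described above can be sketched as follows.  This is a sketch
under assumptions, not my actual script: it assumes a reference dump
(for example the previous full dump) is already on disk at
reference_dump, and the function names, volume name and paths are
hypothetical.  The vos and xdelta3 options used are -id and -time for
vos dump, and -e (encode), -c (write to stdout) and -s (source file) for
xdelta3:

```python
import subprocess


def dump_delta_commands(volume, reference_dump):
    """Build the argv lists for the equivalent of `vos dump | xdelta3`."""
    # With no -file argument, vos dump writes to stdout; -time 0 is a full dump.
    vos = ["vos", "dump", "-id", volume, "-time", "0"]
    # xdelta3 reads stdin when no input file is given and encodes it
    # against the reference named by -s.
    xdelta = ["xdelta3", "-e", "-c", "-s", reference_dump]
    return vos, xdelta


def run_dump_delta(volume, reference_dump, delta_path):
    """Stream the dump straight into xdelta3 so only the delta is written."""
    vos_cmd, xdelta_cmd = dump_delta_commands(volume, reference_dump)
    with open(delta_path, "wb") as out:
        vos = subprocess.Popen(vos_cmd, stdout=subprocess.PIPE)
        xdelta = subprocess.Popen(xdelta_cmd, stdin=vos.stdout, stdout=out)
        vos.stdout.close()  # so vos gets SIGPIPE if xdelta3 exits early
        xdelta_rc = xdelta.wait()
        vos_rc = vos.wait()
    if vos_rc != 0 or xdelta_rc != 0:
        raise RuntimeError(f"vos exited {vos_rc}, xdelta3 exited {xdelta_rc}")
```

Calling run_dump_delta("user.example", "/backup/user.example.dump",
"/backup/user.example.xdelta") on the fileserver would leave only the
delta to ship to secondary storage.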

It takes a few tries to set it up, but works really smoothly once
you've got it going.  IMHO it's the best incremental backup solution
for AFS that doesn't involve the risk of your proprietary software
suddenly becoming unsupported.

The ideal situation would be to have the volserver emit an RFC3284
file directly using its knowledge of which file blocks have changed
since a given date.  This would give you block-granularity incremental
dumps instead of the file-granularity dumps without inventing a new
dump format; just use an RFC3284-of-dumpfiles.

  - a



[OpenAFS] Re: Linux tmpfs

2009-12-18 Thread Adam Megacz

Rainer Toebbicke r...@pclella.cern.ch writes:
 1. tmpfs on linux just works fine, if you have a (small) patch that
 glues the inode-centric file opens to the dentry-centric tmpfs
 files.

 2. AFS files end up twice in memory, once in the mapping of the AFS
 file itself and then in the mapping of the cache chunk. We've
 addressed this by short-circuiting the VM layer for the AFS file, a
 relatively straightforward mod, but which gets messy as you still need
 that layer for everything that is memory-mapped, such as executables.

Hi, does anybody have copies of one or both of these patches?  I'd be
quite interested in trying them out.

  - a



[OpenAFS] Re: Current OpenAFS Backup Recommendations

2009-12-18 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 Given the low churn of most of our AFS space and compression of the vos
 dump files, the backups actually take slightly less space than the
 entirety of our cell, with 30 day retention.

You can get remarkably good compression by using xdelta3 on dump
files.  I have almost two years' worth of *daily* backups and they
consume only 10x the space of the active data itself (and live on
much cheaper/slower storage).

I should probably only keep monthlies, but I haven't gotten around to
adjusting the scripts.

  - a



[OpenAFS] libnss-afs v2.0 released

2009-12-08 Thread Adam Megacz

I am pleased to announce the availability of libnss-afs v2.0:

  http://www.megacz.com/software/libnss-afs.html

This version will return "not found" if nscd is not running.  This
resolves numerous minor issues including some which could cause
extraordinarily long delays when shutting down or rebooting.

  http://git.megacz.com/?p=libnss-afs.git;a=commitdiff;h=b5f46b9

A Name Service Switch (NSS) plugin is a shared library used by glibc
to -- among other things -- translate between usernames and numeric
userids and between group names and numeric groupids.

The libnss-afs library is an NSS plugin which answers these queries
using the information stored in the AFS ptserver, avoiding the need to
duplicate (and update) this information in /etc/passwd or LDAP.  The
library also synthesizes the name AfsPag- for the fake group ids
that are used to represent AFS PAGs.
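
glibc chooses NSS plugins per database from /etc/nsswitch.conf, loading
libnss_<service>.so.2 for each service named on the line.  A minimal
sketch of reading that configuration, assuming the plugin is listed
under the service name "afs" (matching libnss_afs.so); the function name
and the sample configuration are hypothetical, not the recommended setup:

```python
import re


def nss_services(nsswitch_text, database):
    """Return the ordered service names configured for one NSS database."""
    for raw in nsswitch_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments
        if ":" not in line:
            continue
        name, _, rest = line.partition(":")
        if name.strip() == database:
            # Remove action items such as [NOTFOUND=return].
            rest = re.sub(r"\[[^\]]*\]", " ", rest)
            return rest.split()
    return []


sample = """
# hypothetical configuration
passwd: files afs
group:  files [NOTFOUND=continue] afs
"""
print(nss_services(sample, "passwd"))
print(nss_services(sample, "group"))
```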

  - a



[OpenAFS] Re: LDAP backend for PTS?

2009-11-17 Thread Adam Megacz

Holger Rauch holger.ra...@empic.de writes:
 Is there a PTS backend for OpenLDAP available and actively maintained
 (in the sense that it can be used in conjunction with OpenAFS 1.4.x or
 1.5.x)?

What you want is actually an NSS module, not an LDAP module.  You're
probably using the LDAP module for NSS; you want to augment that with
a ptserver module.  Here's where you can get it:

  http://www.megacz.com/software/libnss-afs.html

  - a



[OpenAFS] Re: Combined AFS/Kerberos Apache 2 module

2009-09-03 Thread Adam Megacz

Hey, neat!

Kevin Hildebrand ke...@umd.edu writes:
 In addition, when obtaining AFS tokens, it's possible to do so before
 the Apache directory walk phase, which is a current limitation of
 mod_waklog.

Well, not entirely...

  http://article.gmane.org/gmane.comp.file-systems.afs.modwaklog.devel/114

 When using this module, the use of mod_waklog is not required.

What is the equivalent for mod_waklog's WaklogLocationPrincipal
directive?

  - a




[OpenAFS] Re: problems with cygwin

2009-08-31 Thread Adam Megacz

Same here; all of Cygwin has lost access to AFS.  Sadly I too will
need to downgrade.

  - a

Lars Schimmer l.schim...@cgv.tugraz.at writes:
 Hi!

 One of my students sent me this regression:

 After upgrading OpenAFS Client on WindowsXP (32 bit)
 from openafs-en_US-1-5-61.msi
 to openafs-en_US-1-5-62.msi
 I can't read files inside afs with any cygwin app.

 Example:
 $ cat test.txt
 cat: test.txt: Invalid request code

 I also tested other apps: nano, less, lighttpd. None can read any file.
 Tested on several different PCs with different users, files and paths.
 After downgrading to 1.5.61 things work again.


 MfG,
 Lars Schimmer
 --
 -
 TU Graz, Institut für ComputerGraphik  WissensVisualisierung
 Tel: +43 316 873-5405   E-Mail: l.schim...@cgv.tugraz.at
 Fax: +43 316 873-5402   PGP-Key-ID: 0x4A9B1723

-- 



[OpenAFS] Re: OpenAFS 1.5.61 released

2009-08-09 Thread Adam Megacz

Derrick Brashear sha...@openafs.org writes:
 MacOS:
  * GUI installer now asks for local cell information.

The "no local cell" option seems to be missing (IMHO it should be the
default).

  - a



[OpenAFS] Re: Trouble with libnss-afs (on amd64)

2009-07-07 Thread Adam Megacz

Derrick Brashear sha...@gmail.com writes:
 Krull gkli...@cs.uni-goettingen.de wrote:
 the latest version 1.08 of libnss-afs cannot be built on an amd64 Ubuntu
 Jaunty.

Hi Krull.  I don't have root access to any amd64 machines, so
unfortunately I can't reproduce this.  If you find a solution, please
let me know.

  - a



[OpenAFS] Re: Trouble with libnss-afs (on amd64)

2009-07-07 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 Adam, on Debian you want to link with -lafsauthent_pic -lafsrpc_pic.  At
 least if they're available, which for lenny may require using a backport
 since they only went in the 1.4.10 release.

I see.

Russ, as a Debian guru, can you advise me on the officially-approved
way to pick the right libraries at deb-install-time?  It's more
important that libnss-afs continue to work properly with etch's native
version of openafs than that it work on amd64.

Thanks,

  - a



[OpenAFS] Re: Trouble with libnss-afs

2009-07-07 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 Adam, on Debian you want to link with -lafsauthent_pic -lafsrpc_pic.  At
 least if they're available, which for lenny may require using a backport
 since they only went in the 1.4.10 release.

One possibility: I could require >=1.4.10 in order to build, but allow
installation on earlier versions of OpenAFS.

Will this work?  In other words, if libnss_afs.so statically links to
libafsauthent_pic.a/libafsrpc_pic.a from 1.4.10 and then the shared
library is introduced to (say) a 1.4.2 system, will it cause problems?

  - a



[OpenAFS] Re: Trouble with libnss-afs

2009-07-07 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 One possibility: I could require >=1.4.10 in order to build, but allow
 installation on earlier versions of OpenAFS.

 Will this work?

 Yup.

Okay, libnss-afs 1.09 has been released.  Debs are in the usual place.

  http://git.hcoop.net/?p=megacz/libnss-afs.git;h=tags/1.09

Please let me know if this works on amd64 (can't test it myself).

  - a



[OpenAFS] Re: Trouble with libnss-afs

2009-07-06 Thread Adam Megacz

mozafar roshany m.rosh...@gmail.com writes:
 I am working on OpenAFS through this document on Debian Lenny:
 ...
 I should mention that I've installed the libnss-afs_1.08_i386.deb package or
 its 1.07 version.

Hi, Mozafar.  Could you please try upgrading to libnss-afs 1.08?  I
recently fixed a linking problem that only causes trouble for users of
newer libc's:

  http://git.hcoop.net/?p=megacz/libnss-afs.git;a=commit;h=491cdaa05effc3

  /afs/hcoop.net/user/m/me/megacz/public/libnss-afs/libnss-afs_1.08_i386.deb

Note that the debs with 1.08 in their name were rebuilt recently; git
is the authoritative source of version numbers.

Thanks,

  - a



[OpenAFS] Re: afs: Lost contact with file server on the same machine?

2009-06-15 Thread Adam Megacz

Jason Edgecombe ja...@rampaginggeek.com writes:
 Have you increased the fileserver logging to the maximum level?

Yes, we tried that a while back.  Apparently throttling doesn't get logged.

  - a



[OpenAFS] Re: afs: Lost contact with file server on the same machine?

2009-06-15 Thread Adam Megacz

Andrew Deason adea...@sinenomine.net writes:
 No, there's no log message that indicates that this is happening;

How unfortunate.

 but having logs for this would somewhat defeat the purpose of the
 throttling. If you're triggering the throttling behavior, logging it
 would almost certainly really slow down the fileserver.

Presumably the fileserver would only log this message at most once per
client IP per hour or something like that.
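
A minimal sketch of the kind of rate limit I have in mind, written in
Python purely for illustration; it is not fileserver code, no such
option is known to exist, and the class name and default interval are
hypothetical:

```python
import time


class ThrottleLogLimiter:
    """Allow at most one log message per client address per interval."""

    def __init__(self, interval_seconds=3600.0, clock=time.monotonic):
        self.interval = interval_seconds
        self.clock = clock
        self._last_logged = {}  # client address -> time of last message

    def should_log(self, client_ip):
        """Return True if a throttling message for this client is due."""
        now = self.clock()
        last = self._last_logged.get(client_ip)
        if last is not None and now - last < self.interval:
            return False  # already logged for this client recently
        self._last_logged[client_ip] = now
        return True


limiter = ThrottleLogLimiter()
if limiter.should_log("192.0.2.10"):
    print("throttling connections from 192.0.2.10")
```

The cost per throttled call is one dictionary lookup, so the logging
itself would not add meaningful load while throttling is active.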

 The least disruptive way to see if it's happening is probably
 correlating kernel error messages with the problem, as Russ mentioned
 earlier.

I don't think he ever explained what server-side event he was
correlating the client-side kernel messages with.

 Although, if it's tolerable, it may be easiest to just disable
 throttling by passing -abortthreshold 0 to the fileserver, and see if
 the problem goes away.

THANK YOU.  That's the sort of solution I was looking for.  We will
implement this immediately.

  - a



[OpenAFS] Re: afs: Lost contact with file server on the same machine?

2009-06-14 Thread Adam Megacz

Adam Megacz meg...@hcoop.net writes:
  - Have you tried using rxdebug to see if the fileserver is getting
 caught up on something?  Try running it when one of the clients claims
 it's lost contact with the server.

 Unfortunately we can't reproduce the bug on demand.  It tends to
 happen when nobody's looking, and goes away quickly enough that by the
 time somebody gets an admin's attention it has gone away.

I should add, however, that I am eager to add or enable any sort of
instrumentation that might help in determining the cause of these
problems and/or fix them.

If there's any sort of logging I can/should turn on, that would be
great.  But an action that needs to be performed manually while the
bug is in progress isn't really feasible.

  - a



[OpenAFS] Re: afs: Lost contact with file server on the same machine?

2009-06-14 Thread Adam Megacz

Russ Allbery r...@stanford.edu writes:
 This sounds identical to the problem that we were having with our web
 servers that was mostly caused by CGI script tokens expiring and then
 scripts continuing to try to access AFS until the file server started
 throttling Rx connections.

Can I get the fileserver to log a message indicating that it has decided
to throttle connections from a host?

  - a



[OpenAFS] Re: afs: Lost contact with file server on the same machine?

2009-06-13 Thread Adam Megacz

Esther Filderman mizmo...@gmail.com writes:
  - Does the lost contact with server occur on all clients at the
 same time?  Or is it scattered which one loses contact?

It is definitely scattered; we've seen situations where one client
lost contact while another seemed to be having no troubles.

  - For how long does the lost contact occur?  Is it seconds or
 minutes or longer?

Around 10-15 minutes, or until the next fs checks, whichever comes
first.  Some users know to run fs checks to make this go away, but
most don't.  Others are seeing unsupervised cron/at jobs fail as a
result of this.

  - Simple, stupid question: Have you confirmed your hardware is OK and
 not causing hiccups in the system?

Yes.

  - Have you tried using rxdebug to see if the fileserver is getting
 caught up on something?  Try running it when one of the clients claims
 it's lost contact with the server.

Unfortunately we can't reproduce the bug on demand.  It tends to
happen when nobody's looking, and goes away quickly enough that by the
time somebody gets an admin's attention it has gone away.

Thanks for your help,

  - a


[OpenAFS] Re: Connection Timed Out errors occasionally when accessing openafs drive

2009-06-07 Thread Adam Megacz

FWIW, we are still experiencing this problem as well after upgrading
to 1.4.10, although it seems to occur less often than it did before.

  - a

Ken Elkabany k...@elkabany.com writes:
 I upgraded our server and client to 1.4.10. Unfortunately, I am still
 receiving Connection Timed Out errors. They rarely occur, but when
 they do they are a severe hindrance. My use case is as follows:

 Three different unix user accounts (root, www-data, aux) are all
 running multiple background processes (~9 total) which access the afs
 mount. They each automatically acquire, or re-acquire tickets and
 tokens, and then proceed to read, copy, and write files. Occasionally,
 upon creating a directory using a python os command similar to mkdir
 -p (os.makedirs), I receive a Connection Timed Out error. The
 processes must then be restarted.

 Any other suggestions?

 Ken

 On Sun, May 10, 2009 at 7:41 PM, Derrick Brashear sha...@gmail.com wrote:
 it probably matters in the server here, but both.

 Derrick


 On May 10, 2009, at 10:35 PM, Ken Elkabany k...@elkabany.com wrote:

 Is this bug fixed in the client or the server? Thanks.

 Ken

 On Sun, May 10, 2009 at 7:22 PM, Derrick Brashear sha...@gmail.com
 wrote:

 I'd venture this is a bug fixed in 1.4.10, with idle dead time
 computation
 in rx.

 Derrick


 On May 10, 2009, at 9:53 PM, Ken Elkabany k...@elkabany.com wrote:

 Hello,

 I have openafs 1.4.9 client and server running on two separate
 machines across a WAN. The client has scripts that access the
 /afs/our.cell/ directory. Occasionally, the script will fail to
 complete, and the logs will report Connection Timed Out on a
 mkdir -p /afs/our.cell/x/y/z command. The frequency of the errors
 is approximately 1 in 100, small enough to not be easily reproducible
 manually, but enough to hamper our project. The scripts run as the
 root user, and are guaranteed to have the proper ticket and token. It's
 also important to note that these scripts often run in parallel (4 at
 a time, all root, modifying our cell). When one fails, all scripts
 running concurrently will fail with the same error, and I typically
 either unlog;kdestroy or restart the openafs-client (I am unsure which
 of those solutions is necessary or sufficient). I will soon have an
 additional LAN setup, and will determine if the same error occurs. Has
 anyone dealt with this issue before?

 Thank you for the assistance,

 Ken



-- 


[OpenAFS] Re: afs: Lost contact with file server on the same machine?

2009-06-07 Thread Adam Megacz

For the benefit of the mailing list archives, I'd just like to mention
that upgrading from 1.4.6 to 1.4.10 seems to have helped quite a bit,
but the problem remains.  It just happens less frequently.

  - a

Adam Megacz meg...@hcoop.net writes:
 Hello,

 We've got a situation where clients seem to be encountering "afs: Lost
 contact with file server" fairly frequently (at least once a week).
 This is happening both for a client machine which is on the same
 ethernet switch as the fileserver (no NAT going on) as well as the
 OpenAFS client running on the server machine losing contact with the
 fileserver process running on the very same machine (so it's unlikely
 to actually be the network).

 Sending kill -TSTP to the fileserver to increase the logging level
 hasn't revealed anything interesting happening at the time that
 contact is lost.

 Is there any way to get more detailed information about the reason why
 the client decided that it had lost contact?  For example, whether the
 failure was due to a timeout, an ICMP unreachable, or
 no-route-to-host, etc?

 All machines in question are running OpenAFS 1.4.6 (client and
 server), using the debian packages.

 The fileserver is running with these arguments:

   -p 23 -busyat 600 -rxpck 400 -s 1200 -l 1200 -cb 65535 -b 240 -vc 1200

 Thanks for any suggestions...

   - a

-- 


[OpenAFS] Re: Notes on how to get filedrawers minimally working with R/W AFS access on Debian (at least).

2009-06-07 Thread Adam Megacz

David Boyes dbo...@sinenomine.net writes:
 A bit of googling reveals that Adam Megacz has actually done a Debian
 package of filedrawers.  This saves some time:

 /afs/hcoop.net/user/m/me/megacz/public/filedrawers/

FWIW, this is probably way out of date.  If the filedrawers folks are
interested in distributing my debianization as part of the package
(so it remains up to date), I will update it for them.

 So, it turns out that Adam Megacz expects you to still have Apache 1
 installed in order to build the damn thing.

Yeah, this is because both the Apache1 module and the Apache2 module
are built from the same source package.

If there are any Debian wizards out there who know how to make a
multiple-binary-packages-from-one-source-package that lets you
selectively build some of the binary packages but not others, and
doesn't require you to have the Build-Depends's for the packages you
haven't selected, please let me know.  This might not be possible.

  - a



[OpenAFS] Re: Notes on how to get filedrawers minimally working with R/W AFS access on Debian (at least).

2009-06-07 Thread Adam Megacz

Jason Edgecombe ja...@rampaginggeek.com writes:
 So, it turns out that Adam Megacz expects you to still have Apache 1
 installed in order to build the damn thing.

 Yeah, this is because both the Apache1 module and the Apache2 module
 are built from the same source package.

 If there are any Debian wizards out there who know how to make a
 multiple-binary-packages-from-one-source-package that lets you
 selectively build some of the binary packages but not others,

 Why not build all of the packages and let the user choose at apt-get time?

That's what it currently does.

 - a



[OpenAFS] Re: LDAP-AFS interaction (best practice?)

2009-05-17 Thread Adam Megacz

Stephen Joyce step...@physics.unc.edu writes:
 We currently use cfengine and custom scripts to manage /etc/passwd
 by sourcing a central file and checking AFS PTS group memberships to
 build the local file hourly.

You might want to check out libnss-afs.

  http://deleuze.hcoop.net/~megacz/software/libnss-afs.html

  - a



[OpenAFS] afs: Lost contact with file server on the same machine?

2009-04-09 Thread Adam Megacz

Hello,

We've got a situation where clients seem to be encountering "afs: Lost
contact with file server" fairly frequently (at least once a week).
This is happening both for a client machine which is on the same
ethernet switch as the fileserver (no NAT going on) as well as the
OpenAFS client running on the server machine losing contact with the
fileserver process running on the very same machine (so it's unlikely
to actually be the network).

Sending kill -TSTP to the fileserver to increase the logging level
hasn't revealed anything interesting happening at the time that
contact is lost.

Is there any way to get more detailed information about the reason why
the client decided that it had lost contact?  For example, whether the
failure was due to a timeout, an ICMP unreachable, or
no-route-to-host, etc?

All machines in question are running OpenAFS 1.4.6 (client and
server), using the debian packages.

The fileserver is running with these arguments:

  -p 23 -busyat 600 -rxpck 400 -s 1200 -l 1200 -cb 65535 -b 240 -vc 1200

Thanks for any suggestions...

  - a





[OpenAFS] Re: retaining AFS-specific nameless group IDs (PAG) in `id' and `groups'

2008-04-22 Thread Adam Megacz

Jim Meyering [EMAIL PROTECTED] writes:
 Since you guys are interested in AFS, I'm hoping one of you will
 respond to the above.

http://lists.openafs.org/pipermail/openafs-info/2008-April/029132.html

 I'll wait a few days, after which, if I don't hear anything, I'll
 just revert to the old behavior.

If "old behavior" means "no special action for GIDs that might be PAGs",
I think that is the right course of action.

  - a



Re: [OpenAFS] Re: coreutils-6.11 released

2008-04-21 Thread Adam Megacz

Didi [EMAIL PROTECTED] writes:
 the main problem is that through this the 'groups'
 command becomes utterly useless and confused quite a lot of users.
 $ groups
 users id: cannot find name for group ID 1091323188

If you would like that numeric groupid to resolve to some alphanumeric
group name, the right way to do that is to use the NSS:

  http://www.hcoop.net/~megacz/software/libnss-afs.html

 If someone can provide code to determine efficiently whether a
 nameless GID is a PAG then we can probably make everyone happy.

The code you are looking for appears in libnss-afs, but it is based on
assumptions that are only valid on a system known to be running the
OpenAFS client.

In other words, unless coreutils somehow detects the presence of the
OpenAFS client (pioctls?), it probably shouldn't be trying to guess at
what is or isn't a PAG GID the way libnss-afs does.
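
For illustration only, here is a minimal sketch of the kind of guess being
discussed. It assumes that a PAG appears as a GID whose top byte is 0x41,
which is consistent with the 1091323188 (0x410C4534) in the example above.
The function names are hypothetical, this is not the libnss-afs code, and
the check means nothing on a machine that is not running the OpenAFS client.

```python
# Assumption: on an OpenAFS client, a PAG is exposed as a GID whose
# most significant byte is 0x41 ('A'). This is a heuristic, not an API.
PAG_HIGH_BYTE = 0x41


def looks_like_pag_gid(gid: int) -> bool:
    """Return True if a 32-bit GID falls in the assumed PAG range."""
    return 0 <= gid <= 0xFFFFFFFF and (gid >> 24) == PAG_HIGH_BYTE


def split_groups(gids):
    """Separate ordinary GIDs from GIDs that look like PAGs."""
    ordinary = [g for g in gids if not looks_like_pag_gid(g)]
    pags = [g for g in gids if looks_like_pag_gid(g)]
    return ordinary, pags
```

A tool could pass the result of os.getgroups() to split_groups() and skip
name lookups for the second list, but only after it has established by some
other means that the OpenAFS client is actually present.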

  - a




[OpenAFS] do the servers pick up CellServDB changes without a restart?

2008-04-16 Thread Adam Megacz

When a new server is added to a cell, is it necessary to bos restart
the existing servers to make them notice the changes to CellServDB, or
are these changes picked up automatically by vlserver/ptserver/etc?

  - a



[OpenAFS] Re: maildir on openafs [new faq entry]

2008-04-08 Thread Adam Megacz

David Bear [EMAIL PROTECTED] writes:
 I seem to distantly recall some discussion about storing maildir directories
 on openafs, but I don't remember if it was safe, discouraged, or otherwise
 problematic. Any one see problems with putting maildir in afs?

HCoop is doing this with courier, and it works, although the gadgetry
currently used to acquire tokens is really, really sketchy.

Many users really like having shell access to their mailbox, backup
volumes of their mail, the ability to grep their mail, etc.

The (sole) SMTP server, (sole) IMAP server, and AFS fileserver all
happen to be the same (fairly powerful) machine, so we may be dodging
some of the performance issues that other people see.


Robert Banz [EMAIL PROTECTED] writes:
 It's a mess.  AFS is not for mail. Unix user accounts are not for
 mail.  Use an actual mail system and do it right ;)

This sentiment comes up here often, and although there is much truth
in it, I think that stating it so dogmatically might not be the most
productive route to take.  Mail uses storage; AFS provides storage;
so, let's not imply that putting mail in AFS is obviously stupid!
(perhaps it's only non-obviously stupid).

The problems, as I am aware of them, are:

  - AFS does not perform well under the sort of multiple-machine
    concurrent access scenario that certain mail architectures
    (large-site Cyrus) use.

  - Unlike POP, the IMAP protocol offers many features which are best
    implemented by backend storage which is more database-like than
    filesystem-like in nature.  AFS is a less than ideal storage
    medium for databases, for reasons explained elsewhere.

Anyways, I've started a FAQ entry to collect concrete reasons why mail
should or should not be stored in AFS.

  http://www.dementia.org/twiki/bin/view/AFSLore/AdminFAQ#3_52_Is_it_a_good_idea_to_store

I will update the FAQ entry with the proceeds of this thread; please
share your reasoning for encouraging or discouraging mail in AFS.

  - a


[OpenAFS] openafs-devel not processing messages?

2008-04-07 Thread Adam Megacz

I've posted two messages to openafs-devel lately (one yesterday, one
today) and neither has come through... could somebody perhaps check on
that list to see if it is operating correctly?

Thanks,

  - a



[OpenAFS] Re: other-realm groups in ACLs?

2008-03-17 Thread Adam Megacz

Jeffrey Altman [EMAIL PROTECTED] writes:
 Please clarify what you are asking.  Are you asking if you can use
 the group definitions from cell A on ACLs in cell B?

Yes.

Derrick Brashear [EMAIL PROTECTED] writes:
 No. And my server has no creds to do a lookup in your realm

Sorry, I should have indicated that I was assuming a cross-realm trust
between the home kerberos realms of the two cells.

 (and it could be hazardous if i lost contact with you)

Ah, very good point.  Now I see why this isn't practical.

  - a




[OpenAFS] other-realm groups in ACLs?

2008-03-16 Thread Adam Megacz

Does AFS support the use of pts groups in a remote (trusted) cell on
ACLs in the home cell?  In other words, is there a way to get this to
work?

  fs sa /afs/home.edu/xyz/ somebody:[EMAIL PROTECTED] rli

Or is system:[EMAIL PROTECTED] the only group that is allowed to
have an @ in it?

  - a


