[OpenAFS] Tired of sec tools recursively traversing /afs?

2018-06-19 Thread Jeff Blaine
Hello,

df --local shows /afs in the listing.

Many security tools use 'df --local' to determine local filesystems to
traverse recursively.

If you're like me, you're tired of security tools traversing the
local-but-NOT-LOCAL /afs mountpoint.
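
One way a scanning tool could avoid this is to select filesystems by type rather than trusting a "local" listing. The sketch below is illustrative only: it assumes mount data in the /proc/mounts layout and assumes the AFS client shows up with filesystem type "afs"; the function name and the exclusion set are made up for this example.

```python
# Filesystem types treated as non-local. "afs" is assumed to be the type
# an OpenAFS client registers; the rest of the set is illustrative.
NON_LOCAL_FSTYPES = {"afs", "nfs", "nfs4", "cifs"}


def local_mount_points(mounts_text, excluded=NON_LOCAL_FSTYPES):
    """Return mount points whose filesystem type is not in `excluded`.

    `mounts_text` uses the /proc/mounts layout:
    device mountpoint fstype options dump pass
    """
    result = []
    for line in mounts_text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        mount_point, fstype = fields[1], fields[2]
        if fstype not in excluded:
            result.append(mount_point)
    return result


SAMPLE = """\
/dev/sda1 / ext4 rw 0 0
/dev/sda2 /home xfs rw 0 0
AFS /afs afs rw 0 0
"""
print(local_mount_points(SAMPLE))
```

A tool would then recurse only into the returned mount points, so /afs is never walked.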

I've opened a ticket with the Center for Internet Security (CIS, whose
"benchmark" documents are the basis for myriad security tools' check
scripts) at https://workbench.cisecurity.org/community/17/tickets/6518
but do not personally intend to follow up much on said ticket, as our AFS
days number fewer than 100 or so.

So I got the ball rolling... please consider joining said benchmark
community to add your voice on the ticket if you care about getting this
fixed at its root.

Jeff
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] same-server partition moves?

2015-12-09 Thread Jeff Blaine
If you had to bulk migrate online volumes across partitions on the same
server, would you just stick to 'vos move'? Other options?
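
If sticking with 'vos move', the bulk part is mostly a matter of generating one command per volume. A minimal sketch, assuming the volume list is already known; the volume names, server name, and helper name are hypothetical, and nothing is executed here.

```python
def vos_move_commands(volumes, server, from_part, to_part):
    """Build one `vos move` argv per volume, source and target on the same server."""
    return [
        ["vos", "move", "-id", vol,
         "-fromserver", server, "-frompartition", from_part,
         "-toserver", server, "-topartition", to_part]
        for vol in volumes
    ]


# Hypothetical volume and server names, for illustration only.
for argv in vos_move_commands(["u.alice", "u.bob"], "fs1.example.org", "a", "b"):
    print(" ".join(argv))
```

Printing the commands first makes it easy to review the batch before handing each argv to something like subprocess.run.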


Re: [OpenAFS] Migrating existing data onto vice partition on the fly

2014-12-30 Thread Jeff Blaine
 First I would set up the cell and everything, then just run a
 
 vos create -server athlas -partition /vicepa -name root.afs -cell
 cellname -noauth
 
 ..right on top of the existing partition...

Hmm? Describe this more. On top of what existing partition?

But, ignoring that odd info above, all you have to do is:

  rsync -va /my-xfs/data/ /afs/yourcell/huge-empty-volume

^
|- trailing slash relevant, read rsync(1)

If /my-xfs/data is writable space, you *must* stop all writes to it
(re-mount it read-only) and then run that command again to finalize
things. This may or may not be downtime for you.
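
Why the trailing slash matters can be modeled in a few lines. This is a sketch of the documented rsync behavior, not of rsync itself; the function name is made up for illustration.

```python
import posixpath


def rsync_destination(src, dest, relative_file):
    """Model where `rsync -a src dest` places `relative_file` (a path under src)."""
    if src.endswith("/"):
        # Trailing slash: the *contents* of src are copied into dest.
        return posixpath.join(dest, relative_file)
    # No trailing slash: the directory itself is created under dest.
    return posixpath.join(dest, posixpath.basename(src), relative_file)


volume = "/afs/yourcell/huge-empty-volume"
print(rsync_destination("/my-xfs/data/", volume, "a/b.txt"))
print(rsync_destination("/my-xfs/data", volume, "a/b.txt"))
```

With the slash, files land directly in the volume; without it, everything ends up one level down in a "data" directory.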

-- 
Jeff Blaine
kickflop.net
PGP/GnuPG Key ID: 0x0C8EDD02


Re: [OpenAFS] Re: System resources requirements and performance tuning for AFS file servers

2014-08-15 Thread Jeff Blaine
 On Thu, 14 Aug 2014 18:22:17 -0500
 Brian Sebby <se...@anl.gov> wrote:
 
 I’m starting a project to migrate our AFS cell from the ancient
 Solaris servers that it currently lives on to a number of RHEL VMs in
 our VMware infrastructure.  One of the significant issues we’ve had
 for a long time is that performance is lousy on our current servers,
 and I’d like to make that better.

We did exactly this for our DB servers (and left them there) and tested
it with 1 fileserver for a bit.

Your best bet is to quantify the current "slow" with data, then stand up a
few test VMs (2core, 4core, various fileserver params) and gather their
performance data under the same tests.

Unfortunately, our fileservers are still left on the old Suns while we
figure out whether or not we're willing to stoop to XFS on RHEL (support
contract, known choice of OS) or add complexity to our world by using
OmniOS or FreeBSD to retain ZFS.

-- 
Jeff Blaine
kickflop.net
PGP/GnuPG Key ID: 0x0C8EDD02


[OpenAFS] Pre-built packages: build options?

2014-04-09 Thread Jeff Blaine
First, thank you very much for those who donate time and/or resources to
provide builds of OpenAFS.

How does one determine how these packages were built? What configure
args? Are they all done with a bare ./configure && make dest ?


Re: [OpenAFS] Re: OpenAFS client crashes on RHEL 5.10 and RHEL 6.5

2014-03-29 Thread Jeff Blaine
FYI

From our open RH case for 5.x. Quote is from RH support:

We have requested this regression be repaired in RHEL 5.11
 under Bug 1080606, we have also requested that the fix be
 considered for backport into 5.10.z.



[OpenAFS] ZFS-on-Linux on production fileservers?

2013-10-04 Thread Jeff Blaine
[ For those running ext3/ext4, a question further down for you as ]
[ well!   ]

We're still a 100% Solaris + ZFS file server shop. We're EOLing
our Sun SPARC hardware (with tears in our eyes) this year.

Before we spend a significant amount of time evaluating this, I
figured I'd ask first. Any brief response would be greatly
appreciated. The longer, the better :)

* Are you using ZFS-on-Linux in production for file servers?
* If not, and you looked into it, what stopped you?
* If you are, how is it working out for you?

ext3/ext4 people: What is your fsck strategy?


[OpenAFS] Volume type mapping to certain partitions

2013-01-15 Thread Jeff Blaine

Are people still doing things like mapping user home directory
volumes to certain partitions on certain servers, keeping track
in a database, etc?

What does this buy, assuming all data served from storage comes
from like hardware (speed, capacity, etc)?

We've kept up this practice and I'm not real sure why we bother.
I cannot see any case where it has helped us in any significant
way in the last 15 years (my hire date, this practice was already
in place then) and am looking to decomplexificate our environment
where possible.

Thoughts?


[OpenAFS] Distro vs. @sys. Round 1: FIGHT!

2012-08-23 Thread Jeff Blaine

RHEL 5 vs. RHEL 6

Both have the same @sys currently.

Due to drastic differences in OS libraries present, those (like us),
who use @sys in PATH, get bitten. That is, our build of AppX for
'amd64_linux26' that was built on RHEL 5 will not work on RHEL 6,
and we need to support both.

We had trouble with this once in the past. We solved it by
forcing the newer machines to set a custom sysname in afs.rc (like
amd64_linux26_v2).

Any other options, or is this the standard thing everyone does?
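
The custom-sysname workaround can be modeled as an ordered lookup. The sketch below assumes the client is given an ordered list of sysnames and that the first expansion that exists wins; the paths and the helper name are hypothetical, and the plain string replace is a simplification of how @sys is expanded as a path component.

```python
def resolve_at_sys(path, sysnames, existing_paths):
    """Return the first expansion of '@sys' in `path` that exists, else None."""
    for name in sysnames:
        candidate = path.replace("@sys", name)
        if candidate in existing_paths:
            return candidate
    return None


# Hypothetical directory layout in the cell.
EXISTING = {
    "/afs/example.org/amd64_linux26/bin",
    "/afs/example.org/amd64_linux26_v2/bin",
}

rhel5 = ["amd64_linux26"]
rhel6 = ["amd64_linux26_v2", "amd64_linux26"]  # custom name first, fallback second

print(resolve_at_sys("/afs/example.org/@sys/bin", rhel5, EXISTING))
print(resolve_at_sys("/afs/example.org/@sys/bin", rhel6, EXISTING))
```

Under this model the RHEL 6 hosts pick up the _v2 build when one exists and fall back to the RHEL 5 build otherwise, which only helps for binaries that actually run on both.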


[OpenAFS] How's 1.6 on Solaris 10 SPARC?

2012-05-31 Thread Jeff Blaine

Are people actively using 1.6 on Solaris 10 SPARC?
As client?
As file server?
As DB server?
Anything to note?


Re: [OpenAFS] New Keyfile and strange behaviour on clients

2012-05-11 Thread Jeff Blaine

> - klist gives only the krbtgt ticket

As it should, unless you've gotten a token.

> - tokens gives this output:
>
> Tokens held by the Cache Manager:
>
> Tokens for a...@dia.uniroma3.it [Expires May 10 22:50]
> --End of list--

Shows no tokens.

> - aklog works fine and after this command I have a new kerberos ticket
> (afs/dia.uniroma3...@dia.uniroma3.it) and the right token:
> $ tokens
>
> Tokens held by the Cache Manager:
>
> User's (AFS ID 10001) tokens for a...@dia.uniroma3.it [Expires May 10 22:50]
> --End of list--

Shows a token for 10001.


Re: [OpenAFS] Re: New Keyfile and strange behaviour on clients

2012-05-11 Thread Jeff Blaine

On 5/11/2012 10:03 AM, Andrew Deason wrote:

> No, it shows tokens for the 'dia.uniroma3.it' cell, but the vice id for
> the tokens is unknown.


I'm clearly not awake yet. Sorry.
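
The distinction Andrew points out (a token with an unknown vice id versus a token with a known AFS ID) can be read mechanically from the two line shapes quoted in this thread. A minimal sketch, assuming 'tokens' output always follows those two shapes; the regular expression and function name are illustrative, and the obfuscated principal is kept exactly as it appears in the archive.

```python
import re

# Matches both line shapes seen in this thread:
#   Tokens for <principal> [Expires ...]
#   User's (AFS ID <n>) tokens for <principal> [Expires ...]
TOKEN_LINE = re.compile(
    r"^(?:User's \(AFS ID (?P<afs_id>\d+)\) tokens|Tokens)"
    r" for (?P<principal>\S+) \[Expires (?P<expires>[^\]]+)\]$"
)


def parse_tokens(output):
    """Return (afs_id or None, principal, expiry) for each token line."""
    tokens = []
    for line in output.splitlines():
        match = TOKEN_LINE.match(line.strip())
        if match:
            afs_id = match.group("afs_id")
            tokens.append((int(afs_id) if afs_id else None,
                           match.group("principal"),
                           match.group("expires")))
    return tokens


BEFORE = """Tokens held by the Cache Manager:

Tokens for a...@dia.uniroma3.it [Expires May 10 22:50]
--End of list--
"""
print(parse_tokens(BEFORE))
```

An entry whose first field is None is a token for the cell whose vice id is unknown, not an absence of tokens.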


[OpenAFS] WARNING: may leave AFS storage and metadata in indeterminate state

2012-04-09 Thread Jeff Blaine

Hi all,

Can anyone explain why it is possible for the interruption of
a 'vos move' to leave AFS storage and metadata in indeterminate
state?

Dumping from clone 2023894170 on source to volume 2023891400 on 
destination ...^C

SIGINT handler: vos move operation in progress
WARNING: may leave AFS storage and metadata in indeterminate state
enter second control-c to exit

I assume since I am only dumping from the clone to destination
that this warning is unnecessarily alarming at this stage of
the move, and all would be fine if I continued with another Ctrl-C.

Comments?


Re: [OpenAFS] Re: Restoring a RW volume that had replicas

2012-04-03 Thread Jeff Blaine

On 4/3/2012 3:35 PM, Andrew Deason wrote:

> On Tue, 03 Apr 2012 15:27:37 -0400
> Jeff Blaine <jbla...@kickflop.net> wrote:
>
>> You restore the RW myvol to fs2:c as myvol.R just fine.
>>
>> vos rename myvol.R myvol fails with "Already exists"
>
> Let's say that, for example:
>
> myvol has volume id 12340
> myvol.R has volume id 12349
>
> 'vos rename' just changes the name, not the volume id. So by running
> that 'vos rename' command you're saying you want volume id 12349 to have
> name 'myvol', but 'myvol' already exists with volume id 12340. That's
> the error you're getting.
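
The name-versus-id point can be shown with a toy model. This is not the real VLDB or its API; the class and its methods are made up to illustrate why a rename onto an existing name fails while the volume id stays attached to its entry.

```python
class ToyVLDB:
    """Maps a volume name to its RW volume id; a model, not the real VLDB."""

    def __init__(self):
        self.entries = {}

    def create(self, name, volume_id):
        if name in self.entries:
            raise ValueError(f"{name}: already exists")
        self.entries[name] = volume_id

    def rename(self, old, new):
        # Only the name changes; the volume id travels with the entry.
        if new in self.entries:
            raise ValueError(f"{new}: already exists")
        self.entries[new] = self.entries.pop(old)


vldb = ToyVLDB()
vldb.create("myvol", 12340)
vldb.create("myvol.R", 12349)
try:
    vldb.rename("myvol.R", "myvol")
except ValueError as err:
    print("rename failed:", err)
```

In this model the rename only succeeds once the existing 'myvol' entry is gone, and the surviving entry then carries id 12349, not 12340.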


I was trying to be simple with my explanation, but this detail
is surely too relevant now to leave out: fs1 was brought up
empty post-crash, and vos syncvldb fs1 was run.

There should have been no myvol (or id 12340) in the VLDB when
the 'vos rename' ran, from what I understand.


> If you want to restore under the original volume name and id number,
> 'vos restore' to 'myvol' directly with -name and -id.


Let's say I must restore to myvol.R


Re: [OpenAFS] Re: Restoring a RW volume that had replicas

2012-04-03 Thread Jeff Blaine

On 4/3/2012 4:16 PM, Andrew Deason wrote:

> On Tue, 03 Apr 2012 15:50:53 -0400
> Jeff Blaine <jbla...@kickflop.net> wrote:
>
>> There should have been no myvol (or id 12340) in the VLDB when
>> the 'vos rename' ran, from what I understand.
>
> But you still had replicas on other sites, right? If you have

Yes.

> 'myvol.readonly' vols, then 'myvol' also exists in the vldb. Volumes
> like 'myvol', 'myvol.backup', 'myvol.readonly' etc aren't really
> separate entries. There is one entry in the vldb for 'myvol', and the
> vlserver records the RW, RO, BK, etc volume ids for it.
>
> I think the RW id is always set and you can't get rid of it (even if
> there are no sites where the RW is present), but I'm not sure.

Ah HA.

>>> If you want to restore under the original volume name and id number,
>>> 'vos restore' to 'myvol' directly with -name and -id.
>>
>> Let's say I must restore to myvol.R
>
> Well, I don't think we provide any way to change the volume id number,
> and I'm not sure how feasible/advisable doing that would be, since a lot
> of things can go wrong.
>
> But you have some options. You can remove the replicas (you may need a
> 'vos delentry' as well; I'm not sure), then rename the volume, and add
> the replicas back and release. The volume ID number will have changed,
> though, and any clients using that volume will need an 'fs checkv'
> before they can use it again (or wait 2 hours).

This is what I did, and then dealt with the ensuing "Oh crap,
/usr/rcf/bin/ALL_USER_SHELLS just went away on a bunch of
hosts ...", while hastily feeding an 'fs checkvol' into our
bi-hourly config management tool which runs on all hosts
... then waiting for it to run.

Ahem. Live and learn.

> Or you can 'vos dump myvol.R | vos restore -name myvol -id <theid>'. If
> you're doing this to a server that has a replica, you really want to do
> it on the same partition as the extant RO (we try to prevent you from
> doing otherwise, but I'm not sure if all edge cases are caught; in past
> versions we have missed some). Note that when you release, this should
> cause a full release, since doing a restore can screw up our tracking of
> the incremental data to send, etc.

That would have likely been more pleasant.

Thank you for the replies!


[OpenAFS] HA RW vols

2012-04-02 Thread Jeff Blaine

What have people had success with (existing solutions
in practice) for making RW volumes highly available?


[OpenAFS] Combo DB + file servers

2012-03-29 Thread Jeff Blaine

Perhaps someone can jog my memory :)

Remind me why it was the right thing to do when I separated
all DB server functionality from fileserver functionality
9 years ago?

  Site A
fs1
fs2
fs3
db1
db2
db3

  Site B
fs4
fs5
db4

Strongly considering folding the DB servers back onto the
fileservers.


Re: [OpenAFS] Can't get tokens since upgrading to 1.7.6 and Heimdal

2012-03-16 Thread Jeff Blaine

This is why we strongly recommend that the afs/cell@REALM form of
service tickets be used in all cases.  afs/cell can be used with
Kerberos referrals and when dns realm hierarchies must be searched.


A sanity check on this would be greatly appreciated.

I've shot myself in the foot before here (a few times).

So then to migrate from afs@REALM to afs/cell@REALM without
interruption:

1. Create afs/cell@REALM just as afs@REALM was
2. Extract keytab for afs/cell@REALM
3. Add key(s) for afs/cell@REALM to OpenAFS KeyFile on
   etc upserver
4. After at least max ticket lifetime, remove the old
   key from KeyFile and also remove the principal from KDC.
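
The wait in step 4 is simple date arithmetic. A sketch under the assumption that the clock starts once the new key is in place on every server; the function name, timestamp, and 25-hour lifetime are hypothetical values for illustration.

```python
from datetime import datetime, timedelta


def earliest_old_key_removal(new_key_in_place, max_ticket_lifetime):
    """Return the first moment step 4 could be carried out."""
    return new_key_in_place + max_ticket_lifetime


# Hypothetical values: new key installed everywhere at 09:00, 25-hour tickets.
installed = datetime(2012, 3, 16, 9, 0)
print(earliest_old_key_removal(installed, timedelta(hours=25)))
```

Waiting longer than this minimum costs nothing, so a site may prefer to add a safety margin.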


Re: [OpenAFS] Can't get tokens since upgrading to 1.7.6 and Heimdal

2012-02-22 Thread Jeff Blaine

The problem isn't that it's not finding afs/sub.my@sub.my.org

The problem is: it's not looking for a...@sub.my.org

It should do that.

OpenAFS Quick Start Guide:
...
Begin by creating the following two entries in your site's Kerberos
database:
...

The entry for AFS server processes, called either afs or afs/cell.
...
  ^^^

On 2/22/2012 10:15 AM, David Goldberg wrote:

It should have it. The exact same krb.conf file except for the
allow_weak_crypto line worked fine before when I was using MIT kerberos.

I will check with the admin, though.
Thanks
--
Dave Goldberg
david.goldbe...@verizon.net

Ken Dreyer <ktdre...@ktdreyer.com> wrote:

On Wed, Feb 22, 2012 at 6:44 AM, David Goldberg
<david.goldbe...@verizon.net> wrote:
  $ aklog -d
  Authenticating to cell sub.my.org.
  Getting v5 tickets: afs/sub.my.org@SUB.MY.ORG
  Getting v5 tickets: afs/sub.my.org@MY.ORG
  Getting v5 tickets: a...@my.org
  Kerberos error code returned by get_cred: -1765328377
  aklog.exe: Couldn't get sub.my.org AFS tickets: UNKNOWN_SERVER

Looks like aklog is asking for the Kerberos service principal
afs/sub.my.org@SUB.MY.ORG (and variations), but the KDC is saying
that it doesn't know that principal. Are you sure it is present in
your KDC's database? Is DES enabled on this principal and on the KDC?

-
Ken



[OpenAFS] insmod failure

2012-02-21 Thread Jeff Blaine

RHEL 5.8 x86_64 with OpenAFS 1.6.0 built just now:

-bash-3.2# /sbin/insmod /usr/vice/etc/modload/libafs-2.6.18-308.el5.mp.ko
insmod: error inserting '/usr/vice/etc/modload/libafs-2.6.18-308.el5.mp.ko': -1 Unknown symbol in module

-bash-3.2# uname -a
Linux rcf-linux-beta.our.org 2.6.18-308.el5 #1 SMP Fri Jan 27 17:17:51 EST 2012 x86_64 x86_64 x86_64 GNU/Linux




Re: [OpenAFS] insmod failure

2012-02-21 Thread Jeff Blaine

On 2/21/2012 11:41 AM, Simon Wilkinson wrote:

> On 21 Feb 2012, at 16:25, Jeff Blaine <jbla...@kickflop.net> wrote:
>
>> -bash-3.2# /sbin/insmod /usr/vice/etc/modload/libafs-2.6.18-308.el5.mp.ko
>> insmod: error inserting '/usr/vice/etc/modload/libafs-2.6.18-308.el5.mp.ko': -1 Unknown symbol in module
>
> You either need to insmod exportfs first, or use depmod and modprobe.


Thanks.  I see the mod to afs.rc now.


Re: [OpenAFS] Server (file) host wedge: WARNING: osi_NetIfPoller: ldi_open_by_name failed: 19

2011-12-25 Thread Jeff Blaine

Thanks all.  Happy holiday(s) of choice.

On 12/24/2011 4:49 PM, Jeffrey Altman wrote:

I'm fairly sure this is a Solaris bug.  The error indicates that
/dev/udp is an unknown device.  OpenAFS used to panic when this
condition was reached.  The versions you are using will continue
to operate and simply fail to update the current interface list.

However, the root cause of the problem is outside of OpenAFS.  You
should contact Oracle for a fix.

Jeffrey Altman


On 12/24/2011 2:15 PM, Jeff Blaine wrote:

I'm pretty sure this is the 2nd time we've seen this
now.

AFS fileserver ur.our.org wedged today.  Our monitoring
shows CPU usage pegged at 100% right when the problem
happened (didn't escalate over hours...).

SunOS ur.our.org 5.10 Generic_144488-13 sun4u sparc SUNW,Sun-Fire-V240

/:ur # strings /kernel/fs/sparcv9/afs | grep OpenAFS
@(#) OpenAFS 1.4.14 built  2011-07-07
/:ur # strings /usr/afs/bin/fileserver | grep OpenAFS
@(#) OpenAFS 1.4.11 built  2009-07-14
/:ur #

It had been up 20 days (almost exactly).

The console showed repeating:

 WARNING: osi_NetIfPoller: ldi_open_by_name failed: 19

No console login possible, no SSH possible.  Had to
force-stop the OS.  Issuing 'sync' at the 'ok' prompt
to force a crash dump generated tons of SCSI reset
errors,




[OpenAFS] Server (file) host wedge: WARNING: osi_NetIfPoller: ldi_open_by_name failed: 19

2011-12-24 Thread Jeff Blaine

I'm pretty sure this is the 2nd time we've seen this
now.

AFS fileserver ur.our.org wedged today.  Our monitoring
shows CPU usage pegged at 100% right when the problem
happened (didn't escalate over hours...).

SunOS ur.our.org 5.10 Generic_144488-13 sun4u sparc SUNW,Sun-Fire-V240

/:ur # strings /kernel/fs/sparcv9/afs | grep OpenAFS
@(#) OpenAFS 1.4.14 built  2011-07-07
/:ur # strings /usr/afs/bin/fileserver | grep OpenAFS
@(#) OpenAFS 1.4.11 built  2009-07-14
/:ur #

It had been up 20 days (almost exactly).

The console showed repeating:

WARNING: osi_NetIfPoller: ldi_open_by_name failed: 19

No console login possible, no SSH possible.  Had to
force-stop the OS.  Issuing 'sync' at the 'ok' prompt
to force a crash dump generated tons of SCSI reset
errors,




Re: [OpenAFS] Happy Holidays -- Another year in the life of OpenAFS

2011-12-23 Thread Jeff Blaine

[ Cue discussion devolving from documentation into ]
[ document processing tools/formats after 2 posts. ]

Ad...

ACTION!


[OpenAFS] RHEL6 allow_weak_crypto in client krb5.conf

2011-11-22 Thread Jeff Blaine

I'm a little confused.  I just had to turn on
allow_weak_crypto in a RHEL6 kerberos client's
/etc/krb5.conf to be able to aklog.

My understanding was that this setting was only
needed on the KDCs, which until now, has been
working fine since we upgraded our KDCs to 1.9.

Is that just because our other clients are (they
are) running sub-1.9 MIT Kerberos so we didn't hit
this?


[OpenAFS] VL server prefs

2011-11-08 Thread Jeff Blaine

The Cache Manager sets default VL Server preference ranks
as it initializes, randomly assigning a rank from the range
10,000 to 10,126 to each of the machines listed in the
local /usr/vice/etc/CellServDB file.

Does anyone have info about what happens after the initial
VL server preference is set?  Does anything happen?  Or is
the control point purely 'fs setserverprefs -vlservers'?
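
The initial assignment described in the quoted documentation can be sketched as follows. It models only the random starting ranks and the usual rule that a lower rank is preferred; it says nothing about what the Cache Manager does afterwards, which is the open question here. Server names and function names are hypothetical.

```python
import random


def initial_vlserver_ranks(servers, rng=random):
    """Assign each server a random rank in the documented 10,000-10,126 range."""
    return {server: rng.randint(10000, 10126) for server in servers}


def preference_order(ranks):
    """Sort servers so the lowest rank (most preferred) comes first."""
    return sorted(ranks, key=lambda server: (ranks[server], server))


# Hypothetical DB server names, as they might be listed in CellServDB.
ranks = initial_vlserver_ranks(["db1", "db2", "db3"])
print(ranks)
print(preference_order(ranks))
```

Because the ranks are random, two clients in the same cell will generally prefer different VL servers unless the ranks are set explicitly.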


[OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeff Blaine

Can anyone point me at the docs where quorum election, IP
address numbering as it pertains to election, etc... lives?
I can't find what I am looking for on openafs.org

I seem to recall that the "highest IP is sync site" (if I
have that right) nonsense was addressed, but again, cannot
find the modern info about the election logic.

Thanks for any info!


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeff Blaine

> There are two sources of documentation that I know about: A long-ago paper
> by Mike Kazar, and the source code (which actually has reasonable comments).
> I actually have a copy of the paper if you care.
>
> The key source code you want is ${OPENAFS}/src/ubik/vote.c.  And in my
> reading other than the support for clone servers nothing has changed in
> terms of the quorum selection (it's the lowest IP address, actually).


Thanks Ken,

Yes, lowest, of course (sorry).

I can't view the .PS documents yet, but I'm not sure it's
necessary to view them if nothing has changed (I was sure
it had).

The lowest IP address favoritism decision is totally
arbitrary, no?

We're kind of screwed unless there's a way around it,
and really would not like to have to apply a local patch
with every rollout.

Andrew, Simon, Jeffrey, Derrick, et al...

Would a "favor highest" patch be accepted if it was controlled
via configure script, defaulting to the traditional behavior?
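
The favoritism rule and the proposed switch can be sketched in a few lines. This models only which address is favored, not Ubik's voting in vote.c; the numeric comparison, the flag name, and the addresses are assumptions for illustration.

```python
import ipaddress


def favored_site(addresses, favor_highest=False):
    """Pick the address the favoritism rule would single out.

    favor_highest=False models the traditional lowest-address rule;
    True models the hypothetical build-time switch proposed above.
    """
    parsed = [ipaddress.ip_address(a) for a in addresses]
    chosen = max(parsed) if favor_highest else min(parsed)
    return str(chosen)


# Hypothetical DB server addresses.
servers = ["10.0.0.20", "10.0.0.3", "10.0.1.1"]
print(favored_site(servers))
print(favored_site(servers, favor_highest=True))
```

Addresses are compared as numbers, so 10.0.0.3 sorts below 10.0.0.20 even though a plain string comparison would say otherwise.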


Re: [OpenAFS] 1.4.x quorum election process?

2011-10-26 Thread Jeff Blaine

> Think about what you would need to do if you were running with this
> patch locally.  Every sysadmin that upgrades these servers must remember
> that the patch is in place (or how the servers were built/configured)
> and not forget.  If you leave tomorrow, is the next sysadmin going to be
> burned by this change when s/he attempts to install openafs distributed
> binaries in your cell?


You could make the same argument (that you're making) with
at least 5 other existing OpenAFS command-line or build-time
options.  Example: --enable-namei-fileserver vs. not, dropped
onto a server with existing vice partitions in the wrong
style.

Build/implementation decisions are encapsulated in build
scripts of ours.  Additionally, those decisions are documented
in our wiki.  If he/she hasn't read our internal documentation
about our cell, which is extensive and clear in our wiki, then
yes, he/she will get burned.

Just like he/she would with any other option for cell
or server configuration.


Re: [OpenAFS] Re: Kernel panic RHEL 5

2011-10-17 Thread Jeff Blaine

On 10/15/2011 2:08 PM, Andrew Deason wrote:

Thanks for the reply, Andrew.


>> Rebooting to single user, the insmod works fine and shows:
>
> So, I assume the insmod always works fine, but it panics as soon as
> afsd is started?

Yes, that's what I'd assume as well.

>> What I can see of the panic on the console is shown in the
>> screenshot here:
>>
>> http://dl.dropbox.com/u/15519230/panic.jpg
>
> The more useful part is right above that. If you can't see any more
> lines, you can configure the box to dump core on panic, and you (or I,
> or whatever) can then get all of the messages in the dumped vmcore.

>> If I build 1.4.14.1 from source, it works fine on this box
>> it seems.
>>
>> I cannot explain how 1.4.14 is working fine on our other
>> similar boxes, but not this one.
>
> Anything different in the config on the box? (does the cache dir exist
> and look the same?) The only code changes between 1.4.14 and 1.4.14.1 I

Not that I found.

> think were for Solaris and Linux 2.6.38, so nothing relevant was
> _supposed_ to have changed...


I can no longer even reproduce the problem.

*SIGH*

The panics were found as part of 20-30 iterative Kickstarts
while developing our new OS imaging process and just went
away while working on it over the weekend.  I *hate* when
things are left this way, but unless I can reproduce it
again, I suspect this is a dead thread.


Re: [OpenAFS] Kernel panic RHEL 5

2011-10-15 Thread Jeff Blaine

Is it possible that ext4 is not allowed for my cache
partition?

On 10/15/2011 12:47 AM, Jeff Blaine wrote:

This has to be something really dumb on my part, but I can't
make sense of it.

RHEL 5.7 x86_64 2.6.18-274.3.1.el5 SMP on a brand new box.

I've tried both of the following, separately, with the
same result:

1. OpenAFS 1.4.14 binaries built from source 20 days ago, copied
verbatim from a working RHEL 5.7 x86_64 2.6.18-274.3.1.el5 SMP
box.

2. Fresh OpenAFS 1.4.14 build from source *on* this box,
then installed

sh /etc/init.d/afs.rc start => kernel panic

Rebooting to single user, the insmod works fine and shows:

Oct 14 23:36:34 rcf-monitor kernel: libafs: module license
'http://www.openafs.org/dl/license10.html' taints kernel.
Oct 14 23:36:34 rcf-monitor kernel: Found system call table at
0x8028ff40 (pattern scan)
Oct 14 23:36:34 rcf-monitor kernel: Using keyrings, rather than hooking
system calls
Oct 14 23:36:34 rcf-monitor kernel: Found 32-bit system call table at
0x80291280 (pattern scan)
Oct 14 23:36:34 rcf-monitor kernel: Using keyrings, rather than hooking
system calls

What I can see of the panic on the console is shown in the
screenshot here:

http://dl.dropbox.com/u/15519230/panic.jpg

If I build 1.4.14.1 from source, it works fine on this box
it seems.

I cannot explain how 1.4.14 is working fine on our other
similar boxes, but not this one.


[OpenAFS] Kernel panic RHEL 5

2011-10-14 Thread Jeff Blaine

This has to be something really dumb on my part, but I can't
make sense of it.

RHEL 5.7 x86_64 2.6.18-274.3.1.el5 SMP on a brand new box.

I've tried both of the following, separately, with the
same result:

1. OpenAFS 1.4.14 binaries built from source 20 days ago, copied
   verbatim from a working RHEL 5.7 x86_64 2.6.18-274.3.1.el5 SMP
   box.

2. Fresh OpenAFS 1.4.14 build from source *on* this box,
   then installed

sh /etc/init.d/afs.rc start => kernel panic

Rebooting to single user, the insmod works fine and shows:

Oct 14 23:36:34 rcf-monitor kernel: libafs: module license 
'http://www.openafs.org/dl/license10.html' taints kernel.
Oct 14 23:36:34 rcf-monitor kernel: Found system call table at 
0x8028ff40 (pattern scan)
Oct 14 23:36:34 rcf-monitor kernel: Using keyrings, rather than hooking 
system calls
Oct 14 23:36:34 rcf-monitor kernel: Found 32-bit system call table at 
0x80291280 (pattern scan)
Oct 14 23:36:34 rcf-monitor kernel: Using keyrings, rather than hooking 
system calls


What I can see of the panic on the console is shown in the
screenshot here:

http://dl.dropbox.com/u/15519230/panic.jpg

If I build 1.4.14.1 from source, it works fine on this box
it seems.

I cannot explain how 1.4.14 is working fine on our other
similar boxes, but not this one.


Re: [OpenAFS] Monitoring performance of fileservers using cacti or munin

2011-10-13 Thread Jeff Blaine

> Am I missing something from the manual pages or openafs documentation?


Aside from scout, afsmonitor and xstat_*_test


[OpenAFS] Clear offlinemsg?

2011-10-13 Thread Jeff Blaine

How does one clear a volume's offlinemsg as set by
'fs setvol /afs/blah -offlinemsg' ?

~ : ADMIN# fs setvol /afs/rcf/user/jblaine -offlinemsg 
~ : ADMIN# fs examine /afs/rcf/user/jblaine
File /afs/rcf/user/jblaine (536887760.1.1) contained in volume 536887760
Volume status for vid = 536887760 named u.jblaine
Current offline message is foo
...




Re: [OpenAFS] Windows file locking do not work on IFS client.

2011-09-28 Thread Jeff Blaine

> This would be a bug.  Please file bugs to openafs-info@openafs.org.


Or ideally openafs-b...@openafs.org


Re: [OpenAFS] Re: Solaris 10 SPARC hang on shutdown

2011-09-26 Thread Jeff Blaine

FWIW, not that anyone expected it to change really, but this
problem persists with the new Solaris 10 08/11 release
and latest Recommended patchset.

On 2/28/2011 5:50 PM, Andrew Deason wrote:

On Mon, 28 Feb 2011 16:31:49 -0600
Andrew Deason <adea...@sinenomine.net>  wrote:


On Mon, 28 Feb 2011 22:18:22 +
Derrick Brashear <sha...@dementia.org>  wrote:


I'm not surprised, tho given Oracle has not bothered to give OpenAFS
anything I guess they expect us to take your word for it.

Yes, afsd is not really interested in exiting and would prefer unmount
to succeed


This would be rather gross, but: do you think it possible to try to
detect if we've got a pending KILL periodically, and signal an upcall
through afsd to try and umount?


Or actually, it may be less work to just actually spawn kernel
threads... it's probably less work than I've been thinking it is. afsd
just exits if the daemon pioctls return anyway, so we could just have
them spawn a kernel proc and then return. We'd lose any per-process
priority goo that afsd sets for the proc, but I don't think we do any of
that on Solaris anyway.

It's still incredibly annoying, though. And it doesn't seem good to
change something like that in the middle of a stable series.




Re: [OpenAFS] OpenAFS-1.7.1 on Windows 7 32 bit

2011-09-24 Thread Jeff Blaine

To close off this thread for archival sake, this was my
error, as determined (again) by Jeffrey Altman.

RxMaxMTU was set to 1431 and not 1400 as thought. This is
necessary over our VPN setup, which is where this problem
was happening.

On 9/16/2011 4:13 PM, Jeffrey Altman wrote:

On 9/16/2011 2:33 PM, Jeff Blaine wrote:

Thank you for all of the effort getting this released.


You're welcome although after 1606 days of development
the best thanks would be a month not looking at the code
again. :-)


Steps forward for me, but I'm not having as much luck
as everyone else yet.  Important to note, probably, is
that the private beta IFS release worked fine for me
last I tried 2 months ago or so.  Quick grunts as to
where to start debugging are welcome.

At any rate, today:

Uninstalled KfW and OpenAFS, including wiping out dangling
dirs on disk and deleting registry keys, and rebooted.

Dropped our CellServDB and krb5.ini in the proper
places.

Installed 64-bit and 32-bit OpenAFS, no integrated logon,
and don't use DNS for cell lookup, then rebooted.

Installed 64-bit and 32-bit KfW 3.2.2 from Secure
Endpoints' website.

Started NIM, it knew my realm and username, got ticket
and AFS token.


NIM and KFW are completely independent of OpenAFS.
There is no reason to touch their configurations.
Since you were attempting to create a clean slate, did
you delete the %windir%\temp\afscache file?


tokens.exe shows this.

aklog.exe -d -force (for kicks) shows all is fine.


you had tokens and forcibly set them again.  not sure why that would do
anything.


fs.exe checkservers reports all is fine.


fs checkservers -all -fast would be more useful.


fs.exe lsmount \\AFS\our.org\user\jblaine hangs indefinitely
and cannot be Ctrl-C'd.  Trying to kill the process via Task
Manager appears to do nothing.  I've waited several minutes
now.


fs minidump will generate a minidump in %windir%\temp\ for
afsd_service.exe.  This will permit a developer to see what the process
is stopped waiting for.

Jeffrey Altman



___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] OpenAFS-1.7.1 on Windows 7 32 bit

2011-09-16 Thread Jeff Blaine

Thank you for all of the effort getting this released.

Steps forward for me, but I'm not having as much luck
as everyone else yet.  Important to note, probably, is
that the private beta IFS release worked fine for me
last I tried 2 months ago or so.  Quick grunts as to
where to start debugging are welcome.

At any rate, today:

Uninstalled KfW and OpenAFS, including wiping out dangling
dirs on disk and deleting registry keys, and rebooted.

Dropped our CellServDB and krb5.ini in the proper
places.

Installed 64-bit and 32-bit OpenAFS, no integrated logon,
and don't use DNS for cell lookup, then rebooted.

Installed 64-bit and 32-bit KfW 3.2.2 from Secure
Endpoints' website.

Started NIM, it knew my realm and username, got ticket
and AFS token.

tokens.exe shows this.

aklog.exe -d -force (for kicks) shows all is fine.

fs.exe checkservers reports all is fine.

fs.exe lsmount \\AFS\our.org\user\jblaine hangs indefinitely
and cannot be Ctrl-C'd.  Trying to kill the process via Task
Manager appears to do nothing.  I've waited several minutes
now.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: 1.4.14 with 2.6.18-274.3.1.el5?

2011-09-14 Thread Jeff Blaine

On 9/13/2011 11:48 PM, Andrew Deason wrote:

On Tue, 13 Sep 2011 21:07:04 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


-bash-3.2# time /afs/rcf/user/jblaine/afs-exercise.sh
find: WARNING: Hard link count is wrong for .: this may be a bug in your
filesystem driver.  Automatically turning on find's -noleaf option.
Earlier results may have failed to include directories that should have
been searched.


Is the problem just this message? This is known:

-noleaf
   Do  not  optimize  by  assuming that directories contain 2 fewer
   subdirectories than their  hard  link  count.   This  option  is
   needed  when  searching  filesystems that do not follow the Unix
   directory-link convention, such as CD-ROM or MS-DOS  filesystems
   or  AFS  volume  mount  points.


Interesting.  We've never seen this warning before.  I've
added -noleaf to address that.

I'm not sure yet if there is another problem.  Now that I've
gotten past this, it's on to determining that.  The user of
the box indicated he had turned it off months ago because
AFS was too slow on it (sigh).  So now we're investigating
and starting fresh.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] 1.4.14 with 2.6.18-274.3.1.el5?

2011-09-13 Thread Jeff Blaine

Any ideas here?  Known problem?  What would you like to have
for debugging info?

OpenAFS 1.4.14
Linux 2.6.18-274.3.1.el5

Reboot with no AFS
Remove entire cache directory contents
Start AFS

First test run, then immediate problem on 2nd test run
of same code:

-bash-3.2# time /afs/rcf/user/jblaine/afs-exercise.sh

real    0m11.423s
user    0m0.004s
sys     0m0.023s
-bash-3.2# time /afs/rcf/user/jblaine/afs-exercise.sh
find: WARNING: Hard link count is wrong for .: this may be a bug in your 
filesystem driver.  Automatically turning on find's -noleaf option. 
Earlier results may have failed to include directories that should have 
been searched.

[... after 1m24s I ^C ]

The script is, in essence:

cd /afs/ourcell/someplace
for every file found with 'find'
cat file to /dev/null
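
A minimal sketch of a script along those lines, not the actual
afs-exercise.sh. The function name exercise_tree and its directory
argument are made up for illustration. It assumes a POSIX shell and GNU
find (for -noleaf), and it will miscount file names that contain
newlines.

```sh
#!/bin/sh
# Hypothetical stand-in for afs-exercise.sh: read every regular file under
# a directory, discard the data, and print how many files were read.
exercise_tree() {
    dir=${1:-/afs/ourcell/someplace}
    # -noleaf (GNU find) turns off the hard-link-count optimization that
    # produces the warning on AFS directories.
    find "$dir" -noleaf -type f -print | {
        count=0
        while IFS= read -r f; do
            cat "$f" > /dev/null && count=$((count + 1))
        done
        echo "$count"
    }
}
```

Running it as `time exercise_tree /afs/ourcell/someplace` twice in a row
gives the same first-run versus second-run comparison as above.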
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: Performance issues

2011-08-24 Thread Jeff Blaine

For read/write data, if the cache is too small, the cache manager is
required to flush data to the file server sooner than it would prefer.
Since many files used today are in the GB range, it is not unusual to
have caches sizes of 10GB to 20GB.  The local disk is cheap; network
bandwidth is not.


http://wiki.openafs.org/

This page is in Indonesian.  Would you like to translate it?

No?

Continuing...

[ Page displays in what I assume to be Indonesian ]

Search: cache

Click 1st link ConfiguringTheCache

http://openafs-wiki.stanford.edu/AFSLore/ConfiguringtheCache/

Machines serving multiple users usually perform better with
 a cache of at least 60 to 70 MB.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] screen loses tokens - Solaris 10

2011-08-15 Thread Jeff Blaine

How might I go about debugging this?  This happens
on a host with Generic_142900-03 but not on a host
with Generic_144488-17 (nor ever on this latter host
at any patch rev -- I have been using/resuming screen
on it for years).

1. Connect to host with PuTTY
2. Confirm krb5 creds and tokens gotten from PAM
3. Start screen
4. Confirm krb5 creds and tokens in screen shell
5. Close PuTTY, Yes, disconnect
6. Connect to host with PuTTY
7. Confirm krb5 creds and tokens gotten from PAM
8. Resume screen session
9. Tokens and krb5 creds in screen shell are gone

Common
--
OpenAFS 1.4.14
MIT Kerberos 1.6.3
Screen 4.00.02
sshd_config
pam.conf
pam_afs_session
pam_krb5RA (Russ Allbery's)
No kdestroy in shell dot files

Different
-
SunOS faron.our.org 5.10 Generic_142900-03 sun4u sparc SUNW,Sun-Fire-V490

SunOS cairo.our.org 5.10 Generic_144488-17 sun4u sparc SUNW,Sun-Fire-280R

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] screen loses tokens - Solaris 10

2011-08-15 Thread Jeff Blaine

On 8/15/2011 6:13 PM, Russ Allbery wrote:

Jeff Blaine <jbla...@kickflop.net> writes:


Thanks Russ (and Kevin!).  Both hosts are using that option.



Identical /etc/pam.conf and /etc/krb5.conf files on both
the working and failing hosts.



 login session optional pam_krb5RA.so minimum_uid=92 retain_after_close



I'll play around though.


You need it for pam_afs_session as well.  Try running with debug set for
both and make sure that syslog says that it's not deleting tickets and
tokens during the logout.


That solved it.

Now I wish I could explain why it worked fine on
the one box and not the other.
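
For anyone hitting the same thing, a rough way to spot session lines
that lack the option. This is only a sketch: the function name
pam_missing_retain is made up, it assumes the Solaris-style pam.conf
layout (service, type, control, module, options), and it matches module
names by substring.

```sh
# Print session lines for pam_krb5* or pam_afs_session that do not carry
# retain_after_close. Takes an optional path; defaults to /etc/pam.conf.
pam_missing_retain() {
    awk '$1 !~ /^#/ && $2 == "session" &&
         ($0 ~ /pam_krb5/ || $0 ~ /pam_afs_session/) &&
         $0 !~ /retain_after_close/' "${1:-/etc/pam.conf}"
}
```

Empty output means both modules already have the option on every
session line.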

Thanks.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] ETA for 1.4.15?

2011-08-11 Thread Jeff Blaine

Do we have an ETA for 1.4.15 by any chance?  Last
I heard it was March/April 2011.  Looking to have
an official/bundled fix for the Solaris 10 hang
at shutdown thing.

Anything I can do to help the cause?
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] patch : AFS-Monitor (Perl)

2011-07-07 Thread Jeff Blaine

On 7/6/2011 8:26 PM, Steven Jenkins wrote:

I talked with Alf, and I'll be taking over ownership of the module.
If there are other patches, feel free to let me know.


Excellent.  Thanks for stepping up!
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] Incredibly simple ways to contribute

2011-06-17 Thread Jeff Blaine

All, please consider reviewing this new list of items
which (currently) require zero code knowledge, zero
programming, zero protocol knowledge, etc.

http://openafs-wiki.stanford.edu/AFSLore/afslore/tinysimpletasks/

If everyone can muster 5 minutes a week or only even
10 minutes per month, it would greatly help overall.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Solaris 10 deadlock issue

2011-06-14 Thread Jeff Blaine

Sweet.  I can reproduce this, BTW.  Exact same
appearance as the problems I reported in the
last month.

I'll patch this test box to latest recommended
and try it again with that too.

On 6/14/2011 5:56 PM, Aaron Knister wrote:

Good afternoon!

I'm writing to report a deadlock issue I'm seeing on Solaris 10.

What I've observed is that when a file larger than the configured size
of the cache is copied out of AFS the cache manager deadlocks and all
access to /afs on the affected system hangs until the system is
rebooted. The issue occurs with a memory cache as well as a disk cache.

The issue can be mitigated if the cache size is raised to the value of
roughly half of the physical memory in the given system. The issue
appeared somewhere between Solaris 10 u8 and u9.
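
A quick way to tell in advance whether a copy falls into this case is to
compare the file size with the configured cache size. The sketch below
is an illustration only: the function names are made up, it assumes the
usual cacheinfo layout (mountpoint:cachedir:size in 1K blocks) under
/usr/vice/etc, and it ignores any size override passed to afsd on the
command line.

```sh
# Print the configured cache size in KB (third field of cacheinfo).
cache_kb() {
    awk -F: 'NR == 1 { print $3 }' "${1:-/usr/vice/etc/cacheinfo}"
}

# Succeed if a size in KB ($1) is larger than the configured cache.
# $2 is an optional path to the cacheinfo file.
larger_than_cache() {
    [ "$1" -gt "$(cache_kb "$2")" ]
}
```

For example, `larger_than_cache 2097152 && echo "bigger than the cache"`
checks a 2 GB file against the local cache setting.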

I've reproduced the problem using OpenAFS 1.4.14.1, 1.5.78 and 1.6.0pre6
and a Solaris 10 u8 system with all of the latest patches applied.

I've put together a tar file containing:

- An fstrace dump starting a few seconds before I initiated the copy
- A stack trace of the hung cp command
- The output of cmdebug -long -server localhost run after AFS hangs

The individual files as well as a tar file of them can be found here:
http://userpages.umbc.edu/~aaronk/afs/solaris10-deadlock-issue.

Any help would be greatly appreciated.

Best,
Aaron

--
Aaron Knister
Systems Administrator
Division of Information Technology
University of Maryland, Baltimore County
aar...@umbc.edu

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: Solaris 10 deadlock issue

2011-06-14 Thread Jeff Blaine

Solaris 10 SPARC

Failure:

Latest recommended and security patches as of 1 hour ago
OpenAFS 1.4.11

Failure:

Latest recommended and security patches as of 1 hour ago
OpenAFS 1.4.14

On 6/14/2011 7:47 PM, Derrick Brashear wrote:

That's one kernel context. I'd like to see what the afsds are doing, so yes, 
besides that.
Sorry I'm being terse, I'm using a mobile device

Derrick

On Jun 14, 2011, at 4:07 PM, Andrew Deason <adea...@sinenomine.net> wrote:


On Tue, 14 Jun 2011 18:17:22 -0400
Derrick Brashear <sha...@gmail.com> wrote:


the backtrace from a kernel dump would be far more useful, if you have
a way to collect one.


You mean besides cp_stack_trace.txt ? I think the fstrace is pretty
clear in that afs_GetDownD is not sufficiently clearing space or
something, though.

--
Andrew Deason
adea...@sinenomine.net

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info



Re: [OpenAFS] Re: Debugging opportunity (time-sensitive)

2011-06-07 Thread Jeff Blaine

I was unable to get a shell this time, but tonight
we experienced what I believe to be the same exact
thing (total /afs wedge for all processes) on
a different Solaris 10 SPARC host with 272 day
uptime.

[ for the record ]

On 5/18/2011 3:59 PM, Jeff Blaine wrote:

On 5/18/2011 3:03 PM, Andrew Deason wrote:

On Wed, 18 May 2011 13:51:06 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


0 - afs_osi_Sleep
0 | afs_osi_Sleep:entry event 705ac1bc = 1023, 1,
1, 1, 0, 0, 0, 2062683024, 2062683824, 0, 2062684288


This is looking a little weird, but I'm not really used to looking at a
lock structure like this. Are you running a 32-bit kernel module?


bash-3.00# file /kernel/fs/sparcv9/afs
/kernel/fs/sparcv9/afs: ELF 64-bit MSB relocatable SPARCV9 Version 1
bash-3.00#


If you run that again, do these values change?


I ran it once just after receiving this email, and yes,
it did more stuff then hung with a similar line.

Now when I run it over and over, the trace shows the same
~25 lines as reported above, and hangs there as well.
The values shown for afs_osi_Sleep:entry do not change.

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-06-01 Thread Jeff Blaine

On 5/31/2011 6:40 PM, Andrew Deason wrote:

On Tue, 31 May 2011 18:10:53 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


I then rebooted and got the same result upon trying modload
again.

I edited:

src/cf/osconf.m4
src/libuafs/MakefileProto.SOLARIS.in
src/libafs/MakefileProto.SOLARIS.in


Well, you need to re-configure each time (or modify the Makefiles


I am doing a make distclean, configure, and make dest for
every build as part of this thread.


directly). If you look at the command run for afs_dynroot.c you'll see
what we're actually running. If there's a -O2 in there and you're still
getting the error, then something is wrong.


-O2 is there


I'll look at running through the whole build process later tonight to
see what specifically you need to do, if not that.


Thanks
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Help: aklog cannot work properly

2011-06-01 Thread Jeff Blaine

On 6/1/2011 1:03 AM, Lee Eric wrote:

Hi,

It seems aklog cannot work well in my server.


[root@server ~]# klist
Ticket cache: FILE:/tmp/krb5cc_0
Default principal: admin@HERDINGCAT.INTERNAL

Valid starting     Expires            Service principal
06/01/11 00:55:12  06/02/11 00:55:10
krbtgt/HERDINGCAT.INTERNAL@HERDINGCAT.INTERNAL
renew until 06/01/11 00:55:12
[root@server ~]# aklog -d -c herdingcat.internal
Authenticating to cell herdingcat.internal (server server.herdingcat.internal).
Trying to authenticate to user's realm HERDINGCAT.INTERNAL.
Getting tickets: afs/herdingcat.internal@HERDINGCAT.INTERNAL


Does this principal exist?  ^^^


Kerberos error code returned by get_cred : -1765328370
aklog: Couldn't get herdingcat.internal AFS tickets:
aklog: unknown RPC error (-1765328370) while getting AFS tickets
[root@server ~]# ls /afs
ls: cannot access /afs/herdingcat.internal: No such device
herdingcat.internal
[root@server ~]# fs wscell
This workstation belongs to cell 'openafs.org'
[root@server ~]#

And I noticed that the client belongs to openafs.org. How could this be?


What does your 'ThisCell' file say?
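
One way to compare the two, as a sketch: wscell_name is a made-up helper
that pulls the cell name out of the 'fs wscell' output shown above, and
the ThisCell path in the usage line assumes Transarc-style paths (it may
live elsewhere, for example under /etc/openafs, on packaged installs).

```sh
# Extract the cell name from "fs wscell" output read on stdin.
wscell_name() {
    sed -n "s/^This workstation belongs to cell '\(.*\)'\$/\1/p"
}
```

Usage:

  [ "$(fs wscell | wscell_name)" = "$(cat /usr/vice/etc/ThisCell)" ] \
      || echo "running client and ThisCell disagree"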
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] IBM published a guide to configuring Kerberos v5 authentication for OpenAFS

2011-06-01 Thread Jeff Blaine

* The server config uses the old -noauth way to bootstrap


Of course. That's the documented way from Quick Beginnings.

That's how I just did it in a new testbed cell, too.

Where was the new way documented when it was developed?

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] modload failing, Sol10 SPARC, 1.4.14

2011-05-31 Thread Jeff Blaine

Maybe this is something?

/usr/lib/abi/appcert/*

# grep memset etc.alt etc.scoped
etc.alt:ALT_USAGE:inadvertant_static_linking:static linking 
inadevertantly brings in private 
symbols:*:__getcontext|__sigaction|__threaded|_bufsync|_cerror|_dgettext|_doprnt|_doscan|_ecvt|_fcvt|_findbuf|_findiop|_getsp|_memcmp|_memmove|_memset|_mutex_unlock|_psignal|_realbufend|_setbufend|_siguhandler|_smbuf|_thr_getspecific|_thr_keycreate|_thr_main|_thr_setspecific|_xflsbuf|gtty|stty:

etc.scoped:SCOPED_SYMBOL|SunOS_5.6|ld.so.1|_memset
etc.scoped:SCOPED_SYMBOL|SunOS_5.6|ld.so.1|memset
#

On 5/31/2011 11:06 AM, Derrick Brashear wrote:

Worked with Jeff offline on this. So,
1) *only* afs_dynroot.o has the reference to _memset. no other object
does. other objects reference memset, and rx_knet references
bzero also.
2) the preprocessed output of afs_dynroot.o, using the cc command
libafs uses, includes only:

grep memset /tmp/memset
extern void *memset(void *, int, size_t);
extern void *memset(void *, int, size_t);
 memset(cellHosts, 0, sizeof(cellHosts));
 memset(status, 0, sizeof(struct AFSFetchStatus));
 memset(status, 0, sizeof(struct AFSFetchStatus));

That's from:
/opt/SUNWspro/bin/cc -I. -I.. -I../nfs  -I/var/tmp/openafs-1.4.14/src
-I/var/tmp/openafs-1.4.14/src/afs
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS
-I/var/tmp/openafs-1.4.14/src/config
-I/var/tmp/openafs-1.4.14/src/rx/SOLARIS
-I/var/tmp/openafs-1.4.14/src/rxkad
-I/var/tmp/openafs-1.4.14/src/rxkad/domestic
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src
-I/var/tmp/openafs-1.4.14/src/afs
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS
-I/var/tmp/openafs-1.4.14/src/util
-I/var/tmp/openafs-1.4.14/src/rxkad
-I/var/tmp/openafs-1.4.14/src/config
-I/var/tmp/openafs-1.4.14/src/fsint
-I/var/tmp/openafs-1.4.14/src/vlserver
-I/var/tmp/openafs-1.4.14/include
-I/var/tmp/openafs-1.4.14/include/afs  -O -I. -I..
-I/var/tmp/openafs-1.4.14/src/config  -DAFSDEBUG -DKERNEL -DAFS -DVICE
-DNFS -DUFS -DINET -DQUOTA -DGETMOUNT -D_KERNEL -DSYSV -dn -m64
-xbuiltin=%none -o afs_dynroot.o -c
/var/tmp/openafs-1.4.14/src/afs/afs_dynroot.c
transmuted to:
/opt/SUNWspro/bin/cc -I. -I.. -I../nfs  -I/var/tmp/openafs-1.4.14/src
-I/var/tmp/openafs-1.4.14/src/afs
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS
-I/var/tmp/openafs-1.4.14/src/config
-I/var/tmp/openafs-1.4.14/src/rx/SOLARIS
-I/var/tmp/openafs-1.4.14/src/rxkad
-I/var/tmp/openafs-1.4.14/src/rxkad/domestic
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src
-I/var/tmp/openafs-1.4.14/src/afs
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS
-I/var/tmp/openafs-1.4.14/src/util
-I/var/tmp/openafs-1.4.14/src/rxkad
-I/var/tmp/openafs-1.4.14/src/config
-I/var/tmp/openafs-1.4.14/src/fsint
-I/var/tmp/openafs-1.4.14/src/vlserver
-I/var/tmp/openafs-1.4.14/include
-I/var/tmp/openafs-1.4.14/include/afs  -O -I. -I..
-I/var/tmp/openafs-1.4.14/src/config  -DAFSDEBUG -DKERNEL -DAFS -DVICE
-DNFS -DUFS -DINET -DQUOTA -DGETMOUNT -D_KERNEL -DSYSV -dn -m64
-xbuiltin=%none   -E /var/tmp/openafs-1.4.14/src/afs/afs_dynroot.c

So I'm not sure what I'm missing.


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] modload failing, Sol10 SPARC, 1.4.14

2011-05-31 Thread Jeff Blaine

Could _memset be defined in one of the Sun header files on Jeff's computer?
cd /usr/include
find . -type f -exec grep _memset {} \; -print

Does not show it on mine.


# cd /usr/include/
# find . -type f | xargs grep -l _memset
./mlib_sys_proto.h
./libpng10/png.h
./libpng10/pngconf.h
./libpng12/png.h
./libpng12/pngconf.h
./unicode/urename.h
./unicode/ustring.h
./firefox/Containers.h
./firefox/Native.h
./firefox/RegAlloc.h
./firefox/avmplus.h
./firefox/mozpngconf.h
./firefox/png.h
./firefox/pngconf.h
#

FWIW, this is a brand new Solaris 10 09/10 install with
all Recommended and Security patches installed via
Patch Check Advanced.
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


[OpenAFS] patch : AFS-Monitor (Perl)

2011-05-31 Thread Jeff Blaine

In case Alf never gets to integrating this patch and
releasing 0.3.3, here is what is needed to get
AFS-Monitor to *build* with modern OpenAFS.  I have
not tested anything other than building yet, and I
am not a Perl extension author of any sort.

Original is here:

http://www.cpan.org/authors/id/A/AL/ALFW/

Or here, though this may go away at some point
as I understand he has changed jobs:

http://www.slac.stanford.edu/~alfw/AFS-Monitor/

diff -r -u AFS-Monitor-0.3.2/src/Monitor.xs AFS-Monitor-0.3.3/src/Monitor.xs
--- AFS-Monitor-0.3.2/src/Monitor.xs    2006-09-19 14:00:50.01000 -0400
+++ AFS-Monitor-0.3.3/src/Monitor.xs    2011-05-31 13:32:48.01000 -0400
@@ -164,7 +164,7 @@
  */

 static void
-myPrintTheseStats(HV *RXSTATS, struct rx_stats *rxstats)
+myPrintTheseStats(HV *RXSTATS, struct rx_statistics *rxstats)
 {
    HV *PACKETS;
    HV *TYPE;
@@ -8910,9 +8910,9 @@
    warn("WARNING: Server doesn't support retrieval of Rx statistics\n");
  }
  else {
-    struct rx_stats rxstats;
+    struct rx_statistics rxstats;

-    /* should gracefully handle the case where rx_stats grows */
+    /* should gracefully handle the case where rx_statistics grows */
    code = rx_GetServerStats(s, host, port, &rxstats, &supportedStatValues);
    if (code < 0) {
      sprintf(buffer, "rxstats call failed with code %d", code);


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-31 Thread Jeff Blaine

Also, Jeff, if you want a quick workaround, you can change -O to -O2 or
just leave out the -O option. I think changing the value of KERN_OPTMZ
in src/cf/osconf.m4 should be enough...


That didn't do it for me.

Trying now with -O2 in MakefileProto.SOLARIS.in instead of -O
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-31 Thread Jeff Blaine

FWIW, I can't get any workaround to work.  Iterative
setting of -O to -O2 where I could find it across
various builds got me finally to here where I gave up:

bash-3.00# /usr/sbin/modload 
sun4x_510/dest/root.client/usr/vice/etc/modload/libafs64.o

can't load module: Out of memory or no room in system tables

May 31 18:01:47 rcf-afs-test.our.org genunix: [ID 104096 kern.warning] 
WARNING: system call missing from bind file


I then rebooted and got the same result upon trying modload
again.

I edited:

src/cf/osconf.m4
src/libuafs/MakefileProto.SOLARIS.in
src/libafs/MakefileProto.SOLARIS.in

On 5/31/2011 2:13 PM, Andrew Deason wrote:

On Tue, 31 May 2011 12:14:31 -0500
Andrew Deason <adea...@sinenomine.net> wrote:


Or I can just find it by commenting stuff out and seeing when the
_memset ref goes away. It appears to be this loop that's causing it, in
afs_RebuildDynroot lines 378/379:

 for (i = 0; i < NHASHENT; i++)
     dirHeader->hashTable[i] = 0;

which makes sense; that's pretty easily optimizable into a memset.

I'll get a simpler demonstration together to submit to Oracle.


Also, Jeff, if you want a quick workaround, you can change -O to -O2 or
just leave out the -O option. I think changing the value of KERN_OPTMZ
in src/cf/osconf.m4 should be enough...

And now I'm not completely sure if this is a bug or if we're just
missing the magic incantation to make this not happen. A simple test
case:

void
foo(short *arr)
{
 int i;
 for (i = 0; i < 256; i++)
 arr[i] = 0;
}

If you compile with 'cc foo.c -c -o foo.o -O3', you get a reference to
_memset. If you compile with -O2 or below, you don't. Passing
-xbuiltin=%none, any of the -xno*lib or -xc99 etc options don't seem to
change anything. With older versions of Sun/Solaris Studio, it never
seems to call _memset.
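
To check a given object file for the stray reference, the undefined
symbols from nm can be filtered for the name. A small sketch:
refs_symbol is a made-up helper name, and it assumes a grep that
supports -w (whole-word match), so that memset and _memset are told
apart.

```sh
# Print "yes" if the symbol named in $1 appears as a whole word in a
# symbol listing read from stdin (for example `nm -u foo.o`), else "no".
refs_symbol() {
    if grep -w "$1" > /dev/null; then
        echo yes
    else
        echo no
    fi
}
```

Usage: `nm -u foo.o | refs_symbol _memset` after compiling the test case
at each optimization level.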

The Oracle documentation on this is puzzling to me:
http://download.oracle.com/docs/cd/E19205-01/821-1384/gjzku/index.html

It says "The following table lists runtime support functions that may be
called in code compiled to run in the Solaris kernel, as a result of
source code translation by the C compiler." The table includes _memset,
_memcpy, et al. Then it says

Note that some versions of the kernel do not provide _memmove(),
_memcpy(), or _memset(), but do provide kernel mode analogues of the
user mode routines memmove(), memcpy(), and memset().

But it doesn't say how to avoid it. I'm not sure if there's a compiler
flag we're missing here, or if it's not supported to use -O3 for kernel
modules, or... ? Or it's just a bug. It's also interesting that this
doesn't happen on amd64, though I assume that's just because it uses
different arch-specific optimizations.

I don't know, should I just try to file a bug anyway, or should we try
to get someone with a support contract to say something?


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-28 Thread Jeff Blaine

On 5/27/2011 4:36 PM, Andrew Deason wrote:

On Fri, 27 May 2011 16:21:44 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


cc: Warning: Option xmodel=kernel is not available on SPARC Solaris
platform, ignored


Oh, duh. Try -xbuiltin=%none instead of -xmodel=kernel


Nope, same old.

bash-3.00# grep xbuiltin src/libafs/Make*
src/libafs/MakefileProto.SOLARIS.in:KDEFS_64 = -m64 -xbuiltin=%none
bash-3.00# ./configure --enable-namei-fileserver --disable-afsdb 
--enable-transarc-paths --with-krb5-conf=/usr/rcf-krb5/bin/krb5-config 
2>&1 | tee c.log

...
bash-3.00# grep xbuiltin src/libafs/Make*
src/libafs/MakefileProto.SOLARIS:KDEFS_64 = -m64 -xbuiltin=%none
src/libafs/MakefileProto.SOLARIS.in:KDEFS_64 = -m64 -xbuiltin=%none
bash-3.00#
bash-3.00# make dest 2>&1 | tee makedest.log
...
bash-3.00# cp sun4x_510/dest/root.client/usr/vice/etc/modload/libafs64.o 
/kernel/fs/sparcv9/afs

bash-3.00# modload /kernel/fs/sparcv9/afs
can't load module: Invalid argument
bash-3.00#

May 28 12:24:08 rcf-afs-test.our.org unix: [ID 819705 kern.notice] 
/var/tmp/openafs-1.4.14-src/sun4x_510/dest/root.client/usr/vice/etc/modload/libafs64.o: 
undefined symbol

May 28 12:24:08 rcf-afs-test.our.org unix: [ID 826211 kern.notice] '_memset'
May 28 12:24:08 rcf-afs-test.our.org unix: [ID 472681 kern.notice] 
WARNING: mod_load: cannot load module 'libafs64.o'



/opt/SUNWspro/bin/cc -I. -I.. -I../nfs  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/config 
-I/var/tmp/openafs-1.4.14/src/rx/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/rxkad 
-I/var/tmp/openafs-1.4.14/src/rxkad/domestic 
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src/rxkad 
 -I/var/tmp/openafs-1.4.14/src/config 
-I/var/tmp/openafs-1.4.14/src/fsint 
-I/var/tmp/openafs-1.4.14/src/vlserver -I/var/tmp/openafs-1.4.14/include 
 -I/var/tmp/openafs-1.4.14/include/afs  -I. -I.. 
-I/var/tmp/openafs-1.4.14/src/config  -DAFSDEBUG -DKERNEL -DAFS -DVICE 
-DNFS -DUFS -DINET -DQUOTA -DGETMOUNT -D_KERNEL -DSYSV -dn -m64 
-xbuiltin=%none
 -DAFS_NONFSTRANS -DAFS_WRAPPER=libafs.nonfs.o_wrapper 
-DAFS_CONF_DATA=libafs.nonfs.o_conf_data -o osi_vfsops.o -c 
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 220:
warning: implicit function declaration: afs_osi_vget
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 235:
warning: old-style declaration or incorrect type for: afs_mountroot
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 248:
warning: old-style declaration or incorrect type for: afs_swapvp
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 279:
warning: initialization type mismatch
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 282:
warning: initialization type mismatch
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 331:
warning: no explicit type given
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 338:
warning: improper pointer/integer combination: op "="
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 348:
warning: old-style declaration or incorrect type for: afsinit
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 358:
warning: assignment type mismatch:
pointer to function() returning int "=" pointer to function() returning long
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 360:
warning: assignment type mismatch:
pointer to function() returning long "=" pointer to function() returning int
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 364:
warning: assignment type mismatch:
pointer to function() returning int "=" pointer to function() returning long
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 366:
warning: assignment type mismatch:
pointer to function() returning long "=" pointer to function() returning int
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 509:
warning: old-style declaration or incorrect type for: _init
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 597:
warning: old-style declaration or incorrect type for: _info
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 606:
warning: old-style declaration or incorrect type for: _fini
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 614:
warning: assignment type mismatch:
pointer to function() returning long "=" pointer to function() returning int
"/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c", line 617:
warning: assignment type mismatch:
pointer to function() returning long "=" pointer to function() returning int




/opt/SUNWspro/bin/cc -I. -I.. -I../nfs  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/config 

[OpenAFS] KDC has no support for encryption type

2011-05-27 Thread Jeff Blaine

Okay, what did I do wrong?

MIT Kerberos 1.9.1 and OpenAFS 1.4.14

For kicks, tried this:

export PATH=/usr/rcf-krb5/bin:$PATH

bash-3.00# kvno afs/rcf-afs-test.our.org
kvno: KDC has no support for encryption type while getting credentials 
for afs/rcf-afs-test.our@rcf-afs-test.our.org

bash-3.00#

kadmin.local:  getprinc afs/rcf-afs-test.our.org
Principal: afs/rcf-afs-test.our@rcf-afs-test.our.org
Expiration date: [never]
Last password change: Fri May 27 11:57:19 EDT 2011
Password expiration date: [none]
Maximum ticket life: 7 days 00:00:00
Maximum renewable life: 14 days 00:00:00
Last modified: Fri May 27 11:57:19 EDT 2011 
(admin/ad...@rcf-afs-test.our.org)

Last successful authentication: [never]
Last failed authentication: [never]
Failed password attempts: 0
Number of keys: 1
Key: vno 2, des-cbc-crc, no salt
MKey: vno 1
Attributes:
Policy: [none]
kadmin.local:

--
Jeff Blaine  |  G06A/ATCC/RCF

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] KDC has no support for encryption type

2011-05-27 Thread Jeff Blaine

Ah, I had allow_weak_crypto = yes

On 5/27/2011 12:23 PM, Brandon Allbery wrote:

On Fri, May 27, 2011 at 12:13, Jeff Blaine <jbla...@kickflop.net> wrote:

Okay, what did I do wrong?
MIT Kerberos 1.9.1 and OpenAFS 1.4.14


Recent Kerberos (both MIT and heimdal) disables DES by default; recent
OpenAFS knows how to defeat this, but for kinit or kvno you'll need to
do so in /etc/krb5.conf

[libdefaults]
allow_weak_crypto = true

___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] KDC has no support for encryption type

2011-05-27 Thread Jeff Blaine

On 5/27/2011 1:55 PM, Brandon Allbery wrote:

On Fri, May 27, 2011 at 13:01, Jeff Blaine <jbla...@kickflop.net> wrote:

Ah, I had allow_weak_crypto = yes


Then that's not the problem (yes, true, 1, etc. should all work).  If
that's not it then there may be something else; kvno is an MIT thing
and I'm mostly Heimdal, so I get to defer to someone else at this
point.


Indeed.  The problem is that the OpenAFS QuickStart Guide
has incorrect information indicating that one can run this,
but not mentioning krb5 creds are required unless a keytab
is specified.  Both of these work:

kvno -k /etc/afs.keytab afs/rcf-afs-test.our.org

or:

kinit someprinc-with-privs

kvno afs/rcf-afs-test.our.org

I'll update the document.


[OpenAFS] modload failing, Sol10 SPARC, 1.4.14

2011-05-27 Thread Jeff Blaine

I'm stumped.

bash-3.00# uname -a
SunOS rcf-afs-test.our.org 5.10 Generic_144488-12 sun4u sparc SUNW,Sun-Fire-280R
bash-3.00# export PATH=/opt/SUNWspro/bin:/usr/ccs/bin:/usr/sfw/bin:/usr/bin:/bin

bash-3.00# /opt/SUNWspro/bin/cc -V
cc: Sun C 5.11 SunOS_sparc 2010/08/13
usage: cc [ options ] files.  Use 'cc -flags' for details
bash-3.00#
bash-3.00# cd /var/tmp/openafs-1.4.14-src
bash-3.00# ./configure --enable-transarc-paths --enable-namei-fileserver --disable-afsdb --with-krb5-conf=/usr/rcf-krb5/bin/krb5-config
...
bash-3.00# make dest 2>&1 | tee makedest.log
...
bash-3.00# ls -l sun4x_510/dest/root.client/usr/vice/etc/modload/
total 7626
-rw-r--r--   1 root root    4618 Dec 17 10:58 afs.rc
-rw-r--r--   1 root root 1907992 May 27 13:34 libafs64.nonfs.o
-rw-r--r--   1 root root 1970568 May 27 13:34 libafs64.o
bash-3.00# cp sun4x_510/dest/root.client/usr/vice/etc/modload/libafs64.o /kernel/fs/sparcv9/afs
bash-3.00# chmod 755 /kernel/fs/sparcv9/afs
bash-3.00# /usr/sbin/modload /kernel/misc/sparcv9/nfssrv
bash-3.00# /usr/sbin/modload /kernel/fs/sparcv9/afs
can't load module: Invalid argument
bash-3.00# file /kernel/fs/sparcv9/afs
/kernel/fs/sparcv9/afs: ELF 64-bit MSB relocatable SPARCV9 Version 1
bash-3.00# ls -ld /kernel/fs/sparcv9/afs
-rwxr-xr-x   1 root root 1970568 May 27 14:02 /kernel/fs/sparcv9/afs
bash-3.00#

--
Jeff Blaine  |  G06A/ATCC/RCF



Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-27 Thread Jeff Blaine

On 5/27/2011 2:35 PM, Andrew Deason wrote:

On Fri, 27 May 2011 14:27:03 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


bash-3.00# /usr/sbin/modload /kernel/misc/sparcv9/nfssrv
bash-3.00# /usr/sbin/modload /kernel/fs/sparcv9/afs
can't load module: Invalid argument


dmesg | tail


May 27 14:23:25 rcf-afs-test.our.org unix: [ID 819705 kern.notice] /kernel/fs/sparcv9/afs: undefined symbol
May 27 14:23:25 rcf-afs-test.our.org unix: [ID 826211 kern.notice] '_memset'
May 27 14:23:25 rcf-afs-test.our.org unix: [ID 472681 kern.notice] WARNING: mod_load: cannot load module 'afs'


Ah, this again.

And my previous report of this problem, the solution to which
is not even an option anymore as we don't even have the
ancient Solaris Studio anymore:

https://lists.openafs.org/pipermail/openafs-info/2011-February/035520.html


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-27 Thread Jeff Blaine

On 5/27/2011 3:11 PM, Andrew Deason wrote:

On Fri, 27 May 2011 14:41:20 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


May 27 14:23:25 rcf-afs-test.our.org unix: [ID 819705 kern.notice]
/kernel/fs/sparcv9/afs: undefined symbol
May 27 14:23:25 rcf-afs-test.our.org unix: [ID 826211 kern.notice] '_memset'
May 27 14:23:25 rcf-afs-test.our.org unix: [ID 472681 kern.notice]
WARNING: mod_load: cannot load module 'afs'


I'll submit a real patch when I have time to look at what changed, but
try the attached patch to a fresh tree and tell me if it changes
anything?


Thanks Andrew.  Compiling now.


And my previous report of this problem, the solution to which is not
even an option anymore as we don't even have the ancient Solaris
Studio anymore:

https://lists.openafs.org/pipermail/openafs-info/2011-February/035520.html


Well, this previous report went completely overlooked by me and possibly
others because I thought that was a sig or something, and was mixed up
with talking about warnings.


Yeah, dumb on my part.  I should have filed a bug report
to openafs-bugs.


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-27 Thread Jeff Blaine

On 5/27/2011 3:11 PM, Andrew Deason wrote:

On Fri, 27 May 2011 14:41:20 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


May 27 14:23:25 rcf-afs-test.our.org unix: [ID 819705 kern.notice]
/kernel/fs/sparcv9/afs: undefined symbol
May 27 14:23:25 rcf-afs-test.our.org unix: [ID 826211 kern.notice] '_memset'
May 27 14:23:25 rcf-afs-test.our.org unix: [ID 472681 kern.notice]
WARNING: mod_load: cannot load module 'afs'


I'll submit a real patch when I have time to look at what changed, but
try the attached patch to a fresh tree and tell me if it changes
anything?


No change.

Same error.

Note, too, that I am using -m64 instead of -xarch=sparcv9
per http://rt.central.org/rt/Ticket/Display.html?id=129947

I had the same modload problem when using -xarch=sparcv9
instead of -m64


Re: [OpenAFS] Re: modload failing, Sol10 SPARC, 1.4.14

2011-05-27 Thread Jeff Blaine

On 5/27/2011 4:13 PM, Andrew Deason wrote:

On Fri, 27 May 2011 15:58:51 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


No change.

Same error.


Did you save a log of the build? Can I see the commands for, say,
osi_vfsops.c? (there will be a few instances of it)


/opt/SUNWspro/bin/cc -I. -I.. -I../nfs  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/config 
-I/var/tmp/openafs-1.4.14/src/rx/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/rxkad 
-I/var/tmp/openafs-1.4.14/src/rxkad/domestic 
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src/rxkad 
 -I/var/tmp/openafs-1.4.14/src/config 
-I/var/tmp/openafs-1.4.14/src/fsint 
-I/var/tmp/openafs-1.4.14/src/vlserver -I/var/tmp/openafs-1.4.14/include 
 -I/var/tmp/openafs-1.4.14/include/afs  -I. -I.. 
-I/var/tmp/openafs-1.4.14/src/config  -DAFSDEBUG -DKERNEL -DAFS -DVICE 
-DNFS -DUFS -DINET -DQUOTA -DGETMOUNT -D_KERNEL -DSYSV -dn -m64 
-xmodel=kernel
-DAFS_NONFSTRANS -DAFS_WRAPPER=libafs.nonfs.o_wrapper 
-DAFS_CONF_DATA=libafs.nonfs.o_conf_data -o osi_vfsops.o -c 
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c
cc: Warning: Option xmodel=kernel is not available on SPARC Solaris 
platform, ignored
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 220: 
warning: implicit function declaration: afs_osi_vget
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 235: 
warning: old-style declaration or incorrect type for: afs_mountroot
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 248: 
warning: old-style declaration or incorrect type for: afs_swapvp
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 279: 
warning: initialization type mismatch
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 282: 
warning: initialization type mismatch
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 331: 
warning: no explicit type given
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 338: 
warning: improper pointer/integer combination: op =
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 348: 
warning: old-style declaration or incorrect type for: afsinit
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 358: 
warning: assignment type mismatch:
pointer to function() returning int = pointer to function() returning long
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 360: 
warning: assignment type mismatch:
pointer to function() returning long = pointer to function() 
returning int
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 364: 
warning: assignment type mismatch:
pointer to function() returning int = pointer to function() returning long
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 366: 
warning: assignment type mismatch:
pointer to function() returning long = pointer to function() 
returning int
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 509: 
warning: old-style declaration or incorrect type for: _init
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 597: 
warning: old-style declaration or incorrect type for: _info
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 606: 
warning: old-style declaration or incorrect type for: _fini
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 614: 
warning: assignment type mismatch:
pointer to function() returning long = pointer to function() 
returning int
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 617: 
warning: assignment type mismatch:
pointer to function() returning long = pointer to function() 
returning int



/opt/SUNWspro/bin/cc -I. -I.. -I../nfs  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/config 
-I/var/tmp/openafs-1.4.14/src/rx/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/rxkad 
-I/var/tmp/openafs-1.4.14/src/rxkad/domestic 
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src 
-I/var/tmp/openafs-1.4.14/src/afs 
-I/var/tmp/openafs-1.4.14/src/afs/SOLARIS 
-I/var/tmp/openafs-1.4.14/src/util  -I/var/tmp/openafs-1.4.14/src/rxkad 
 -I/var/tmp/openafs-1.4.14/src/config 
-I/var/tmp/openafs-1.4.14/src/fsint 
-I/var/tmp/openafs-1.4.14/src/vlserver -I/var/tmp/openafs-1.4.14/include 
 -I/var/tmp/openafs-1.4.14/include/afs  -I. -I.. 
-I/var/tmp/openafs-1.4.14/src/config  -DAFSDEBUG -DKERNEL -DAFS -DVICE 
-DNFS -DUFS -DINET -DQUOTA -DGETMOUNT -D_KERNEL -DSYSV -dn -m64 
-xmodel=kernel
-DAFS_WRAPPER=libafs.o_wrapper -DAFS_CONF_DATA=libafs.o_conf_data -o 
osi_vfsops_nfs.o -c /var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c
cc: Warning: Option xmodel=kernel is not available on SPARC Solaris 
platform, ignored
/var/tmp/openafs-1.4.14/src/afs/SOLARIS/osi_vfsops.c, line 220: 

[OpenAFS] Debugging opportunity (time-sensitive)

2011-05-18 Thread Jeff Blaine

[ not subscribing to -dev to post just this ]

We have a Solaris 10 SPARC client running 1.4.11 which
hangs any process accessing our cell.  Before we
announce downtime (sadly, this is a server that is now
hosed), if anyone has any interest in figuring out what
went wrong toward possibly killing off a bug, please
quickly let me know what you'd like me to run.

Right now the box is still functional (NFS) to end users,
so there is no emergency *yet*.


Re: [OpenAFS] Re: Debugging opportunity (time-sensitive)

2011-05-18 Thread Jeff Blaine

On 5/18/2011 11:03 AM, Andrew Deason wrote:

On Wed, 18 May 2011 10:36:20 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


We have a Solaris 10 SPARC client running 1.4.11 which
hangs any process accessing our cell.  Before we
announce downtime (sadly, this is a server that is now
hosed), if anyone has any interest in figuring out what
went wrong toward possibly killing off a bug, please
quickly let me know what you'd like me to run.


Does 'cmdebug <client>' return anything?


Nope.


Re: [OpenAFS] Re: Debugging opportunity (time-sensitive)

2011-05-18 Thread Jeff Blaine

cmdebug produced no output and returned within 2 secs.

Not dynroot.

ls -ld /afs did not hang

ls -ld /afs/our.org hung here:

dtrace: script 'traceafs.d' matched 2594 probes
CPU FUNCTION
  0  -> afs_root
  0  <- afs_root
  0  -> gafs_lookup
  0    -> afs_lookup
  0      -> afs_InitFakeStat
  0      <- afs_InitFakeStat
  0      -> afs_InitReq
  0        -> PagInCred
  0        <- PagInCred
  0      <- afs_InitReq
  0      -> afs_EvalFakeStat
  0        -> afs_EvalFakeStat_int
  0        <- afs_EvalFakeStat_int
  0      <- afs_EvalFakeStat
  0      -> afs_AccessOK
  0        -> afs_GetAccessBits
  0        <- afs_GetAccessBits
  0      <- afs_AccessOK
  0      -> Check_AtSys
  0      <- Check_AtSys
  0      -> osi_dnlc_lookup
  0      <- osi_dnlc_lookup
  0      -> afs_GetDCache
  0        -> afs_MemGetDSlot
  0          -> Afs_Lock_ReleaseR
  0            -> afs_osi_Wakeup
  0              -> afs_getevent
  0              <- afs_getevent
  0            <- afs_osi_Wakeup
  0          <- Afs_Lock_ReleaseR
  0        <- afs_MemGetDSlot
  0        -> afs_osi_Sleep
  0          -> afs_getevent
  0          <- afs_getevent
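
A trace like this can be read mechanically: any function that was
entered but never returned is still pending at the point of the hang.
The sketch below is not from the thread. It assumes the trace was saved
to a file with dtrace's usual flowindent markers ("->" for entry, "<-"
for return), and the helper name `unreturned` is made up for
illustration.

```shell
# unreturned: read dtrace flowindent output on stdin and print the
# functions that were entered but never returned, outermost first.
# Assumes lines of the form "CPU -> func" / "CPU <- func" and
# well-nested calls; the header and "|" probe lines are ignored.
unreturned() {
  awk '
    $2 == "->"              { stack[++depth] = $3 }  # entry: push
    $2 == "<-" && depth > 0 { depth-- }              # return: pop
    END { for (i = 1; i <= depth; i++) print stack[i] }
  '
}
```

Run it on a saved copy of the trace, for example
`unreturned < trace.txt` (file name hypothetical). The last name
printed is the innermost call that never came back.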

--
Jeff Blaine  |  G06A/ATCC/RCF



On 5/18/2011 11:39 AM, Derrick Brashear wrote:

On Wed, May 18, 2011 at 11:25 AM, Andrew Deason <adea...@sinenomine.net> wrote:

On Wed, 18 May 2011 11:11:37 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


Does 'cmdebug <client>' return anything?


Nope.


As in, it hangs, or it exits without any output?

But okay, to see where in libafs you're hanging, you can

dtrace -s traceafs.d -c "ls -ld /afs"

(as root) and give the output, or at least around the spot where it
hangs. I'm assuming 'ls -ld /afs' hangs, though. Just put some other
command in there otherwise.



actually, a relevant question in that vein, is this machine dynroot,
and what is the uppermost path component that hangs?

but you should still run the dtrace command regardless.





Re: [OpenAFS] Re: Debugging opportunity (time-sensitive)

2011-05-18 Thread Jeff Blaine

On 5/18/2011 1:25 PM, Andrew Deason wrote:

On Wed, 18 May 2011 11:42:45 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


  0      -> afs_GetDCache
  0        -> afs_MemGetDSlot
  0          -> Afs_Lock_ReleaseR
  0            -> afs_osi_Wakeup
  0              -> afs_getevent
  0              <- afs_getevent
  0            <- afs_osi_Wakeup
  0          <- Afs_Lock_ReleaseR
  0        <- afs_MemGetDSlot
  0        -> afs_osi_Sleep
  0          -> afs_getevent
  0          <- afs_getevent


So, waiting on tdc-lock, I think?

Try the same thing with the attached D script; it may say who's holding
it.


dtrace: script 'traceafs2.d' matched 2597 probes
CPU FUNCTION
  0  -> afs_root
  0  <- afs_root
  0  -> gafs_lookup
  0    -> afs_lookup
  0      -> afs_InitFakeStat
  0      <- afs_InitFakeStat
  0      -> afs_InitReq
  0        -> PagInCred
  0        <- PagInCred
  0      <- afs_InitReq
  0      -> afs_EvalFakeStat
  0        -> afs_EvalFakeStat_int
  0        <- afs_EvalFakeStat_int
  0      <- afs_EvalFakeStat
  0      -> afs_AccessOK
  0        -> afs_GetAccessBits
  0        <- afs_GetAccessBits
  0      <- afs_AccessOK
  0      -> Check_AtSys
  0      <- Check_AtSys
  0      -> osi_dnlc_lookup
  0      <- osi_dnlc_lookup
  0      -> afs_GetDCache
  0        -> afs_MemGetDSlot
  0          -> Afs_Lock_ReleaseR
  0            -> afs_osi_Wakeup
  0              -> afs_getevent
  0              <- afs_getevent
  0            <- afs_osi_Wakeup
  0          <- Afs_Lock_ReleaseR
  0        <- afs_MemGetDSlot
  0        -> afs_osi_Sleep
  0          | afs_osi_Sleep:entry event 705ac1bc = 1023, 1, 1, 1, 0, 0, 0, 2062683024, 2062683824, 0, 2062684288
  0          -> afs_getevent
  0          <- afs_getevent


Re: [OpenAFS] Re: Debugging opportunity (time-sensitive)

2011-05-18 Thread Jeff Blaine

On 5/18/2011 3:03 PM, Andrew Deason wrote:

On Wed, 18 May 2011 13:51:06 -0400
Jeff Blaine <jbla...@kickflop.net> wrote:


  0        -> afs_osi_Sleep
  0          | afs_osi_Sleep:entry event 705ac1bc = 1023, 1, 1, 1, 0, 0, 0, 2062683024, 2062683824, 0, 2062684288


This is looking a little weird, but I'm not really used to looking at a
lock structure like this. Are you running a 32-bit kernel module?


bash-3.00# file /kernel/fs/sparcv9/afs
/kernel/fs/sparcv9/afs: ELF 64-bit MSB relocatable SPARCV9 Version 1
bash-3.00#


If you run that again, do these values change?


I ran it once just after receiving this email, and yes,
it did more stuff then hung with a similar line.

Now when I run it over and over, the trace shows the same
~25 lines as reported above, and hangs there as well.
The values shown for afs_osi_Sleep:entry do not change.


[OpenAFS] vldb_check

2011-05-17 Thread Jeff Blaine

Okay, what does all of this *mean*? :)

syncsite# vldb_check vldb.DB0
Header's maximum volume id is 2023892829 and largest id found in VLDB is 2023892825

Name Hash 225: Bad entry at 318748: Not a valid vlentry
Name Hash 524: Bad entry at 237940: Not a valid vlentry
Name Hash 532: Bad entry at 188360: Not a valid vlentry
Name Hash 1350: Bad entry at 279380: Not a valid vlentry
Name Hash 2575: Bad entry at 226248: Not a valid vlentry
Name Hash 2899: Bad entry at 141148: Not a valid vlentry
Name Hash 3733: Bad entry at 250668: Not a valid vlentry
Name Hash 3829: Bad entry at 264876: Not a valid vlentry
Name Hash 3971: Bad entry 'src.amake.011': Already in the name hash
Name Hash 4196: Bad entry at 167788: Not a valid vlentry
Name Hash 4428: Bad entry at 331180: Not a valid vlentry
Name Hash 4663: Bad entry at 139668: Not a valid vlentry
Name Hash 5165: Bad entry at 192060: Not a valid vlentry
Name Hash 5861: Bad entry at 160092: Not a valid vlentry
Name Hash 5886: Bad entry at 274496: Not a valid vlentry
Name Hash 5897: Bad entry at 158760: Not a valid vlentry
Name Hash 6728: Bad entry at 161572: Not a valid vlentry
Name Hash 7085: Bad entry at 248004: Not a valid vlentry
Name Hash 7266: Bad entry 'u.ltal': Incorrect name hash chain (should be in 8179)
Name Hash 7322: Bad entry 'u.cmag': Already in the name hash
Name Hash 7913: Bad entry at 199460: Not a valid vlentry
Name Hash 8179: Bad entry 'u.ltal': Already in the name hash
bk Id Hash 518: Bad entry 'u.thar': Incorrect Id hash chain (should be in 4053)
906: 4f0f1
bk Id Hash 559: Bad entry at 141592: Not a valid vlentry
bk Id Hash 4053: Bad entry 'u.thar': Already in the hash table
Free vlentry at 133748 not on free chain
Volume 'u.thar' id 536891337 also found on other chains (0x4f0f1)
Free vlentry at 134340 not on free chain
Free vlentry at 134784 not on free chain
Free vlentry at 135820 not on free chain
Free vlentry at 136856 not on free chain
Free vlentry at 137892 not on free chain
Free vlentry at 138336 not on free chain
Free vlentry at 138484 not on free chain
Free vlentry at 138632 not on free chain
Free vlentry at 138780 not on free chain
Free vlentry at 138928 not on free chain
Free vlentry at 139076 not on free chain
Free vlentry at 139372 not on free chain
Free vlentry at 139668 not on free chain
Free vlentry at 140112 not on free chain
Free vlentry at 140260 not on free chain
Free vlentry at 140408 not on free chain
Free vlentry at 141148 not on free chain
Free vlentry at 141592 not on free chain
Free vlentry at 142480 not on free chain
Free vlentry at 142628 not on free chain
Free vlentry at 143072 not on free chain
Free vlentry at 144108 not on free chain
Free vlentry at 144256 not on free chain
Free vlentry at 144848 not on free chain
Free vlentry at 145736 not on free chain
Free vlentry at 146772 not on free chain
Free vlentry at 148400 not on free chain
Free vlentry at 148548 not on free chain
Free vlentry at 148844 not on free chain
Free vlentry at 149732 not on free chain
Free vlentry at 150620 not on free chain
Free vlentry at 153284 not on free chain
Free vlentry at 153876 not on free chain
Free vlentry at 154764 not on free chain
Free vlentry at 155060 not on free chain
Free vlentry at 155504 not on free chain
Free vlentry at 155652 not on free chain
Free vlentry at 156096 not on free chain
Free vlentry at 156244 not on free chain
Free vlentry at 158020 not on free chain
Free vlentry at 158760 not on free chain
Free vlentry at 158908 not on free chain
Free vlentry at 159944 not on free chain
Free vlentry at 160092 not on free chain
Free vlentry at 160684 not on free chain
Free vlentry at 161276 not on free chain
Free vlentry at 161572 not on free chain
Free vlentry at 161720 not on free chain
Free vlentry at 161868 not on free chain
Free vlentry at 163644 not on free chain
Free vlentry at 165124 not on free chain
Free vlentry at 166308 not on free chain
Free vlentry at 166456 not on free chain
Free vlentry at 167788 not on free chain
Free vlentry at 167936 not on free chain
Free vlentry at 169268 not on free chain
Free vlentry at 169416 not on free chain
Free vlentry at 174596 not on free chain
Free vlentry at 175040 not on free chain
Free vlentry at 175188 not on free chain
Free vlentry at 175336 not on free chain
Free vlentry at 175484 not on free chain
Free vlentry at 175632 not on free chain
Free vlentry at 176224 not on free chain
Free vlentry at 176372 not on free chain
Free vlentry at 176816 not on free chain
Free vlentry at 176964 not on free chain
Free vlentry at 177260 not on free chain
Free vlentry at 178000 not on free chain
Free vlentry at 178148 not on free chain
Free vlentry at 178296 not on free chain
Free vlentry at 178444 not on free chain
Free vlentry at 179184 not on free chain
Free vlentry at 179332 not on free chain
Free vlentry at 180812 not on free chain
Free vlentry at 180960 not on free chain
Free vlentry at 181256 not on free chain
Free vlentry at 183476 not on free chain
Free vlentry at 183920 not 
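
With this many lines, it can help to count the report by message type
before reading individual entries. The sketch below is not from the
thread: `vldb_summary` is a hypothetical helper, the category labels
are mine, and the patterns match only the message wording visible in
the output above. Lines that match none of them are ignored.

```shell
# vldb_summary: read vldb_check output on stdin and print a count per
# message type, largest first. Labels are illustrative only.
vldb_summary() {
  awk '
    /Not a valid vlentry/     { n["invalid vlentry"]++ }
    /Already in the/          { n["duplicate hash entry"]++ }
    /Incorrect .* hash chain/ { n["wrong hash chain"]++ }
    /not on free chain/       { n["free entry off free chain"]++ }
    END { for (k in n) print n[k], k }
  ' | sort -rn
}
```

Something like `vldb_check vldb.DB0 2>&1 | vldb_summary` would feed it
the full report. The counts only describe the shape of the damage; they
say nothing about its cause.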

Re: [OpenAFS] When to publish security advisories?

2011-04-15 Thread Jeff Blaine

My proposal, going forwards, is to not produce security advisories or
releases for these local denial of service attacks. Local issues that
can result in privilege escalation, or denial of service attacks that
can be performed by those outside a site's infrastructure would still
result in advisories.


That sounds sane to me.


My supplemental question, is just how much use the security
releases actually are. Most of our packagers ignore them, in favour
of pulling the patches that we release with the advisory into their
packaging. Is just providing these patches sufficient? Is there
actually a demand for a super-stable point update that just
contains the security code, or is it acceptable to provide the
security fix as part of a normal stable release?


Patches are fine, IMO, but I think the download page should then
indicate the recommended patches in a new (top!) section.

Then again, you're still possibly providing binary downloads of
a product with known security vulnerabilities, which means
ideally yanking all binary links until there are updated packages,
which means a maintenance chore... and it likely would have been
just as easy to release 1.X.N+1


Re: [OpenAFS] Future of 1.4 release series with regards to new Linux kernels

2011-04-15 Thread Jeff Blaine

As you know, the release of OpenAFS 1.6.0 is imminent. Currently we
expect to release OpenAFS 1.4.14.1 with support for Linux kernels
through 2.6.38.
Going forward, it appears that substantial changes would be needed to
support kernels 2.6.39 onwards. To that end, it's our expectation that,
for the continued stability of the 1.4 release series, kernels beyond
2.6.38 would not be supported, and sites wishing to deploy newer
kernels would require a 1.6 series release.

If you have concerns on this topic, I'd like to hear from you. (reply
to openafs-gatekeepers or openafs-info as you feel appropriate).


A list of what that means to 1.4 users (or link)
would help me comment. I know nothing of 1.6.


[OpenAFS] SNMP?

2011-03-17 Thread Jeff Blaine

Is there anything queryable in OpenAFS via SNMP?  I can only
find ancient mailing list comments about it (1998) when
searching openafs.org ... and a sad note about Kevin McBride's
passing in 2008 when searching Google :(

And "No matches found" via

http://git.openafs.org/?p=openafs.git;a=search;h=HEAD;st=grep;s=SNMP


[OpenAFS] Validity testing reads, writes, etc.

2011-03-03 Thread Jeff Blaine

Who has a client-side test suite of sorts to perform common
client-side operations and confirm expected outcomes?  We
could really use something to exercise I/O (not really
concerned about performance, but integrity), perform
volume creations, volume fills, whatever.


Re: [OpenAFS] Re: 1.4.14+patches = panic

2011-03-02 Thread Jeff Blaine



On 3/1/2011 9:08 PM, Andrew Deason wrote:

On Tue, 01 Mar 2011 19:30:06 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


I'm a TOTAL git newbie, so for the sake of full disclosure, here
is how I did the patching:

  git clone http://git.openafs.org/git/openafs.git
  git branch openafs-stable-1_4_14
  git checkout 514256cd403c15da7acf6601aa11371504f856fe

[...]

...not exactly :)

After you clone, you do

git checkout openafs-stable-1_4_14
git cherry-pick <commit1>
git cherry-pick <commit2>
...
git cherry-pick <commitN>

Or you can go in to the gitweb interface, get the patch for each of
those commits, and apply them manually. But that's only if you're scared
of git :)

In any case, that's not your problem, though. By chance, you checked out
code pretty close to the head of the 1.4.x branch, and it has all of the
patches I mentioned. The reason you have a panic is that the patches I
mentioned are not sufficient (I apologize, but the road to getting the
Solaris client stoppable has been long, and I forget what's where).

What you want to do is do the above steps, and then apply two patches
that I forgot to mention that aren't in 1.4.x yet:

http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=6b6064ccacc60eb5a1fe45cc69c65fb621e8980c
http://git.openafs.org/?p=openafs.git;a=commitdiff_plain;h=885dfd0e9d0cb6b4e2e32280a9266d1776ea6859


Okay, I give up.  Diffs + patch, here I come.

tmp:cairo rm -rf openafs-1.4.14-PATCHED
tmp:cairo git clone http://git.openafs.org/git/openafs.git openafs-1.4.14-PATCHED
Initialized empty Git repository in /tmp/openafs-1.4.14-PATCHED/.git/
Checking out files: 100% (5359/5359), done.
tmp:cairo cd openafs-1.4.14-PATCHED/
openafs-1.4.14-PATCHED:cairo git branch openafs-stable-1_4_14
openafs-1.4.14-PATCHED:cairo git cherry-pick 6b6064ccacc60eb5a1fe45cc69c65fb621e8980c
warning: too many files (created: 930 deleted: 984), skipping inexact
rename detection
Automatic cherry-pick failed.  After resolving the conflicts,
mark the corrected paths with 'git add <paths>' or 'git rm <paths>'
and commit the result.
When commiting, use the option '-c 6b6064c' to retain authorship and
message.
openafs-1.4.14-PATCHED:cairo


Re: [OpenAFS] Re: 1.4.14+patches = panic

2011-03-02 Thread Jeff Blaine

On 3/2/2011 9:38 AM, Simon Wilkinson wrote:

On 2 Mar 2011, at 14:23, Jeff Blaine <jbla...@kickflop.net> wrote:


On 3/1/2011 9:08 PM, Andrew Deason wrote:


...not exactly :)

After you clone, you do

git checkout openafs-stable-1_4_14


But you typed:


openafs-1.4.14-PATCHED:cairo  git branch openafs-stable-1_4_14


git checkout != git branch


That's because yesterday I pasted:

git clone http://git.openafs.org/git/openafs.git
git branch openafs-stable-1_4_14
...

And Andrew said:

What you want to do is do the above steps, and then
apply two patches that I forgot to mention that aren't
in 1.4.x yet:



Correction taken, though, and thank you for it (though I'm
already past the manual patching stage now).
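
The difference between the two commands is easy to see in a throwaway
repository. This is only an illustrative sketch, not something from the
thread: the function name, file name, tag name and branch name are all
made up, and it assumes git and mktemp are available.

```shell
# Demo in a scratch repository: "git branch NAME" only creates a
# branch at the current commit, while "git checkout REF" actually
# switches the working tree to REF. All names here are made up.
demo_branch_vs_checkout() {
  repo=$(mktemp -d) || return 1
  (
    cd "$repo" || exit 1
    # Supply an identity inline so commits work on an unconfigured box.
    g() { git -c user.name=demo -c user.email=demo@example.invalid "$@"; }
    git init -q . || exit 1
    echo old > f && git add f && g commit -q -m "tagged release"
    git tag release-tag
    echo new > f && g commit -q -am "later development"

    git branch stable-copy        # new branch at HEAD; nothing switches
    echo "after branch:   $(cat f)"
    git checkout -q release-tag   # detached HEAD at the tagged commit
    echo "after checkout: $(cat f)"
  )
  status=$?
  rm -rf "$repo"
  return "$status"
}
```

After `git branch` the file still holds the newest content; only
`git checkout` brings back the tagged state. That would be consistent
with the failed cherry-pick in the earlier transcript, which ran against
whatever the fresh clone had checked out by default.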


Re: [OpenAFS] Re: 1.4.14+patches = panic

2011-03-02 Thread Jeff Blaine

On 3/2/2011 10:41 AM, Andrew Deason wrote:

On Wed, 02 Mar 2011 09:44:42 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


And Andrew said:

  What you want to do is do the above steps, and then
  apply two patches that I forgot to mention that aren't
  in 1.4.x yet:



The above steps being the steps I said to follow. Which were

git checkout openafs-stable-1_4_14
git cherry-pick <commit1>
git cherry-pick <commit2>
...
git cherry-pick <commitN>


Ah.

FWIW, that sequence fails as follows:

tmp:cairo rm -rf openafs*
tmp:cairo git clone http://git.openafs.org/git/openafs.git openafs-1.4.14-PATCHED
Initialized empty Git repository in /tmp/openafs-1.4.14-PATCHED/.git/
Checking out files: 100% (5359/5359), done.
tmp:cairo cd openafs-1.4.14-PATCHED/
openafs-1.4.14-PATCHED:cairo git checkout openafs-stable-1_4_14
Checking out files: 100% (4352/4352), done.
Note: moving to 'openafs-stable-1_4_14' which isn't a local branch
If you want to create a new branch from this checkout, you may do so
(now or later) by using -b with the checkout command again. Example:
  git checkout -b new_branch_name
HEAD is now at 97cfb3e... openafs 1.4.14
openafs-1.4.14-PATCHED:cairo git cherry-pick 6b6064ccacc60eb5a1fe45cc69c65fb621e8980c
Finished one cherry-pick.
[detached HEAD 0a3a9e2] libafs: consistently hold vnode refs
 11 files changed, 14 insertions(+), 18 deletions(-)
openafs-1.4.14-PATCHED:cairo git cherry-pick 885dfd0e9d0cb6b4e2e32280a9266d1776ea6859
fatal: Could not find 885dfd0e9d0cb6b4e2e32280a9266d1776ea6859
openafs-1.4.14-PATCHED:cairo


Re: [OpenAFS] Re: 1.4.14+patches = panic

2011-03-02 Thread Jeff Blaine

On 3/2/2011 10:56 AM, Andrew Deason wrote:

On Wed, 02 Mar 2011 10:49:42 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


FWIW, that sequence fails as follows:
[...]
openafs-1.4.14-PATCHED:cairo  git cherry-pick
885dfd0e9d0cb6b4e2e32280a9266d1776ea6859
fatal: Could not find 885dfd0e9d0cb6b4e2e32280a9266d1776ea6859


You can't (easily) cherry-pick that one. I said to cherry-pick these
commits:

514256cd403c15da7acf6601aa11371504f856fe
b90f32d8cac7d2e5185e75740b0cf167d370ddb4
7d187f131bf3937b5a299eecb32d237a34c6bbee
b89a9e4fa001b453a3ef5f041ac7978ba696b8e3
d933e5ca54c486d52ed8766e4407987650c903e5
f59e45e2bdf1b2f0b9fd2edf10476bd5e463226d


It's clear now that I read a completely alternate interpretation
of your message yesterday then :)

=

After you clone, you do

git checkout openafs-stable-1_4_14
git cherry-pick commit1
git cherry-pick commit2
...
git cherry-pick commitN

Or you can go in to the gitweb interface, get the patch for each of
those commits, and apply them manually. But that's only if you're scared
of git :)
=

Interpretation: You're not using git right.  Here's
how you use git to retrieve commits
given a hash.

=
In any case, that's not your problem, though. By chance, you checked out
code pretty close to the head of the 1.4.x branch, and it has all of the
patches I mentioned. The reason you have a panic is that the patches I
mentioned are not sufficient (I apologize, but the road to getting the
Solaris client stoppable has been long, and I forget what's where).
=

Interpretation: What you did (clone openafs-stable-1_4_14)
actually includes the 6 commits I mentioned,
however, I realize now they're not enough.

=
What you want to do is do the above steps, and then apply two patches
that I forgot to mention that aren't in 1.4.x yet:
=

Interpretation: You also need these 2 patches.
clone openafs-stable-1_4_14 and apply these
2 extra patches.

[ My mistake here was equating 'patches' ]
[ with 'commits' ]

Off to try again.
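
For reference, the checkout-then-cherry-pick sequence quoted above can
be written as a loop. This is a sketch under assumptions, not a recipe
confirmed by the thread: the variable and function names are made up,
and nothing here shows that each pick applies cleanly. The two extra
patches referenced by URL are not covered and would still be applied
separately.

```shell
# The six commits listed in the quoted message, in the order given.
COMMITS="514256cd403c15da7acf6601aa11371504f856fe
b90f32d8cac7d2e5185e75740b0cf167d370ddb4
7d187f131bf3937b5a299eecb32d237a34c6bbee
b89a9e4fa001b453a3ef5f041ac7978ba696b8e3
d933e5ca54c486d52ed8766e4407987650c903e5
f59e45e2bdf1b2f0b9fd2edf10476bd5e463226d"

# pick_all: check out the 1.4.14 ref, then cherry-pick each commit,
# stopping at the first failure so conflicts can be resolved by hand.
# GIT defaults to the real git; set GIT=echo for a dry run that only
# prints the arguments each command would receive.
pick_all() {
  git_cmd=${GIT:-git}
  "$git_cmd" checkout openafs-stable-1_4_14 || return 1
  for c in $COMMITS; do
    "$git_cmd" cherry-pick "$c" || return 1
  done
}
```

Running `GIT=echo pick_all` first prints the seven commands without
touching the repository, which is a cheap way to check the order before
running it for real inside the clone.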


[OpenAFS] 1.4.14+patches = panic

2011-03-01 Thread Jeff Blaine

1.4.14 with the Solaris 10 (SPARC) patches Andrew Deason mentioned
the other day for the shutdown problem.

I saw this yesterday and just thought maybe I was doing something
odd.  Since it happened again today, twice, I am reporting it.

I'm a TOTAL git newbie, so for the sake of full disclosure, here
is how I did the patching:

git clone http://git.openafs.org/git/openafs.git
git branch openafs-stable-1_4_14
git checkout 514256cd403c15da7acf6601aa11371504f856fe
git checkout b90f32d8cac7d2e5185e75740b0cf167d370ddb4
git checkout 7d187f131bf3937b5a299eecb32d237a34c6bbee
git checkout b89a9e4fa001b453a3ef5f041ac7978ba696b8e3
git add  .
git commit
git checkout b89a9e4fa001b453a3ef5f041ac7978ba696b8e3
git checkout d933e5ca54c486d52ed8766e4407987650c903e5
git checkout f59e45e2bdf1b2f0b9fd2edf10476bd5e463226d

Probably totally wrong.

Please let me know what else you need if you can help.

panic[cpu0]/thread=300036d6520: recursive mutex_enter, lp=704bce30 owner=300036d6520 thread=300036d6520

02a1009271a0 unix:mutex_vector_enter+350 (18402c8, 1, 704bce30, 300036d6520, 2a10001f878, 0)
  %l0-3: 018c08d8 0180c2e0
  %l4-7: 0300036d6520  01815048 01815040
02a100927250 afs:gafs_freevfs+18 (300039df800, , 1, 1, 3, 7b2c1ec0)
  %l0-3: 704bce30 030002606d68 0001
  %l4-7:  030002606d68
02a100927310 genunix:vfs_rele+1c (300039df800, 3cce2c8, 5306, 704b6000, 5305, 704b6)
  %l0-3: 0001 2006  2000
  %l4-7: 03048f58 03048f80 03048f30 0300036acff8
02a1009273c0 afs:afs_inactive+f8 (30003ba99c8, 3cce2c8, 0, 30003ba9bc8, 70400, 30003ba99c8)
  %l0-3: 704a4000 030003ba9bf0 704bce70 030003ba99c8
  %l4-7: 0187cb30 0187c800 3006 3000
02a100927490 afs:gafs_inactive+20 (30003ba99c8, 3cce2c8, 1286000, 1, 2, 7b2af1b0)
  %l0-3: 704bce30  2000
  %l4-7:  012c 704aef48 704aef60
02a100927550 afs:afs_CheckVolumeNames+52c (704bcb69, 704bc, 300036ad064, 1, 300036acff8, 0)
  %l0-3: 0004 704bcee0 704bcee1 1000
  %l4-7: fffe fff7 704b3eb0 030003ba99c8
02a100927620 afs:afs_Daemon+54c (4d6d862b, 4d6d7fbe, 4d6d6b84, 4d6d8887, 0, 4d6d8887)
  %l0-3:  704a4000  704bcee0
  %l4-7: 01846800   4d6d8874
02a100927710 afs:afs_syscall_call+294 (1, 63614, 0, 0, ff235960, 0)
  %l0-3: 0001 704a4000 0002 0001
  %l4-7: 704bce30 0300038a8770 0003 7ffc43c8
02a100927860 afs:Afs_syscall+84 (2a100927bd0, 2a100927bd0, 2a100927a28, 1c, 186f400, 0)
  %l0-3: 0001 704b5000 0300038a8770 02a100927760
  %l4-7: 0300038ba160  00052000
02a100927970 genunix:syscall_ap+58 (820, 1, 1871d50, 7b2b1020, 41, 18)
  %l0-3:  0003  0006826cff05
  %l4-7: 2b9e 03000389e7a8 02a100927b90 0006
02a100927a30 genunix:loadable_syscall+6c (1c, 1, 63614, 0, 0, ff235960)
  %l0-3: 0001  030d2568 8639
  %l4-7: 0041 0820 0041 01871d50


syncing file systems... 1 1 done
dumping to /dev/dsk/c0t0d0s1, offset 429588480, content: kernel
  0:09 100% done
100% done: 20403 pages dumped, dump succeeded
rebooting...
Resetting ...



[OpenAFS] Solaris 10 SPARC hang on shutdown

2011-02-28 Thread Jeff Blaine

Has anyone experienced hangs at OS shutdown with OpenAFS 1.4.11
and higher on Solaris 10 SPARC and recent recommended patch
clusters (recent = the last 2 months)?

We experienced this while upgrading (9 to 10) a production
server last week and just moved past it for now to get the
box back up.

I have replicated it on a test box now, thankfully.

The console shows nothing past "syslogd: going down on
signal 15" and just stays there forever (from what we
can tell).

Forcing a savecore dump via 'sync' at the {ok} prompt,
then looking, shows the following processes remaining
at that time:

S PID PPID PGID SID UID FLAGS ADDR NAME
R 0 0 0 0 0 0x0001 018387c0 sched
R 3 0 0 0 0 0x00020001 060010b29848 fsflush
R 2 0 0 0 0 0x00020001 060010b2a468 pageout
R 1 0 0 0 0 0x4a024000 060010b2b088 init
R 1327 1 1327 329 0 0x4a024002 0600176ab0c0 reboot
R 747 1 7 7 0 0x42020001 060017f9d0e0 afsd
R 749 1 7 7 0 0x42020001 0600180104d0 afsd
R 752 1 7 7 0 0x42020001 060017cb44b8 afsd
R 754 1 7 7 0 0x42020001 060017fc8068 afsd
R 756 1 7 7 0 0x42020001 060017fcb0e8 afsd
R 760 1 7 7 0 0x42020001 0600177f4048 afsd
R 762 1 7 7 0 0x42020001 06001800f8b0 afsd
R 764 1 7 7 0 0x42020001 06001800ec90 afsd
R 378 1 378 378 0 0x4202 060013aee480 inetd
R 373 1 373 373 0 0x4202 060013b1cc48 ypbind
R 7 1 7 7 0 0x4202 060010b28008 svc.startd
R 329 7 329 329 0 0x4a024000 0600110ff850 sh
Z 317 7 317 317 0 0x4a014002 060013b3a490 sac



Re: [OpenAFS] Solaris 10 SPARC hang on shutdown

2011-02-28 Thread Jeff Blaine

Sorry:

Both are AFS clients and not AFS servers

Production server hang was 1.4.11 with 10_Recommended cluster
from ~2 months ago.

Test box hang is 1.4.14 (same exact hang) with 10_Recommended
cluster from 3 days ago.

On 2/28/2011 1:13 PM, Jeff Blaine wrote:

Has anyone experienced hangs at OS shutdown with OpenAFS 1.4.11
and higher on Solaris 10 SPARC and recent recommended patch
clusters (recent = the last 2 months)?

We experienced this while upgrading (9 to 10) a production
server last week and just moved past it for now to get the
box back up.

I have replicated it on a test box now, thankfully.

The console shows nothing past "syslogd: going down on
signal 15" and just stays there forever (from what we
can tell).

Forcing a savecore dump via 'sync' at the {ok} prompt,
then looking, shows the following processes remaining
at that time:

S PID PPID PGID SID UID FLAGS ADDR NAME
R 0 0 0 0 0 0x0001 018387c0 sched
R 3 0 0 0 0 0x00020001 060010b29848 fsflush
R 2 0 0 0 0 0x00020001 060010b2a468 pageout
R 1 0 0 0 0 0x4a024000 060010b2b088 init
R 1327 1 1327 329 0 0x4a024002 0600176ab0c0 reboot
R 747 1 7 7 0 0x42020001 060017f9d0e0 afsd
R 749 1 7 7 0 0x42020001 0600180104d0 afsd
R 752 1 7 7 0 0x42020001 060017cb44b8 afsd
R 754 1 7 7 0 0x42020001 060017fc8068 afsd
R 756 1 7 7 0 0x42020001 060017fcb0e8 afsd
R 760 1 7 7 0 0x42020001 0600177f4048 afsd
R 762 1 7 7 0 0x42020001 06001800f8b0 afsd
R 764 1 7 7 0 0x42020001 06001800ec90 afsd
R 378 1 378 378 0 0x4202 060013aee480 inetd
R 373 1 373 373 0 0x4202 060013b1cc48 ypbind
R 7 1 7 7 0 0x4202 060010b28008 svc.startd
R 329 7 329 329 0 0x4a024000 0600110ff850 sh
Z 317 7 317 317 0 0x4a014002 060013b3a490 sac



Re: [OpenAFS] Re: Solaris 10 SPARC hang on shutdown

2011-02-28 Thread Jeff Blaine

On 2/28/2011 1:31 PM, Andrew Deason wrote:

On Mon, 28 Feb 2011 13:13:24 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


Has anyone experienced hangs at OS shutdown with OpenAFS 1.4.11 and
higher on Solaris 10 SPARC and recent recommended patch clusters
(recent = the last 2 months)?


Yes. Oracle in update 9 has changed something with the uadmin() system
call, and it _looks_ like Solaris now waits forever trying to kill all
processes during shutdown for whatever reason. Since a few AFS processes
are unkillable (and deliberately so), it makes the shutdown hang.

The way around this is to stop the AFS client before shutdown. This is
not currently safe with any 1.4 release (on Solaris), but there are
patches in the 1.4 tree that make it so. But by 'not safe' I mean it may
panic the machine; if you stop AFS as late as possible before reboot, it
makes it less likely.

Complain to Oracle, if you like. I know they have already been told
about this, but the more the merrier. In the meantime, you can try to
umount /afs in the init scripts for runlevel 6/5/0 (and/or SMF, etc).


Thanks Andrew

Glad I asked before wasting more time trying to figure out
what it was.

Unmounting /afs let the test box go down for us.

How does one gauge which workaround to use?

Patty's saying the patches don't work?
___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info


Re: [OpenAFS] Re: Solaris 10 SPARC hang on shutdown

2011-02-28 Thread Jeff Blaine

On 2/28/2011 3:18 PM, Andrew Deason wrote:

On Mon, 28 Feb 2011 12:10:54 -0800
Patricia O'Reilly <orei...@qualcomm.com> wrote:


Even with the patch the wait is about an hour with the init script.


To be clear, you mean it takes that long for all of the scripts to run,
right? The OpenAFS script itself doesn't take an hour.


Patty,

FWIW, I applied the patches just now to 1.4.14 and
shutdown -g0 -y -i6 works properly for us (comes down
properly within 1 minute).

Devs: What's the timeframe to see these patches in an
official 1.4.x release?  Any idea?

Thanks again.


[OpenAFS] aklog build failure, 1.4.14, Solaris 10, Solaris Studio cc

2011-02-25 Thread Jeff Blaine

I swear I hit something like this a few years ago (2008),
but cannot for the life of me find any info on the problem
or solution.

Solaris Studio 12.2
Solaris 10 SPARC
OpenAFS 1.4.14
MIT Kerberos 1.6.3 in /usr/rcf-krb5

make dest
...
/opt/SUNWspro/bin/cc  -I/usr/rcf-krb5/include -DALLOW_REGISTER 
-I/tmp/openafs-1.4.14/src/config -I. -I. -I/tmp/openafs-1.4.14/include 
-I/tmp/openafs-1.4.14/include/afs -I/tmp/openafs-1.4.14/include/rx 
-I/tmp/openafs-1.4.14 -I/tmp/openafs-1.4.14/src 
-I/tmp/openafs-1.4.14/src -dy -Bdynamic  -c aklog_main.c
/usr/rcf-krb5/include/kerberosIV/des.h, line 145: warning: macro redefined: ENCRYPT
/usr/rcf-krb5/include/kerberosIV/des.h, line 146: warning: macro redefined: DECRYPT
aklog_main.c, line 231: #error: Must have either keyblock or session member of krb5_creds

cc: acomp failed for aklog_main.c
*** Error code 2



Re: [OpenAFS] Re: aklog build failure, 1.4.14, Solaris 10, Solaris Studio cc

2011-02-25 Thread Jeff Blaine

Weird.  I did a make distclean and tried again and it
was all fine.  I must have started something before
that build and didn't clean up.

Builds fine, sorry for the noise.

On 2/25/2011 2:30 PM, Andrew Deason wrote:

On Fri, 25 Feb 2011 14:20:41 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


I swear I hit something like this a few years ago (2008),
but cannot for the life of me find any info on the problem
or solution.

Solaris Studio 12.2
Solaris 10 SPARC
OpenAFS 1.4.14
MIT Kerberos 1.6.3 in /usr/rcf-krb5


What's your ./configure line? What do the tests in config.log say for
krb5_princ_size and krb5_principal_get_comp_string ?




[OpenAFS] 3900+ warnings during build for Solaris 10 SPARC

2011-02-25 Thread Jeff Blaine

I've noticed both with Solaris Studio 12 and with
Sun Studio 11 that the build is loaded with warnings.

1800+ implicit function declaration warnings

Is there no concern about these?  Unimportant?

[ As an aside, my successful Solaris Studio 12 build ]
[ of 1.4.14 throws a _memset undefined reference     ]
[ error when afsd tries to load on my test box.      ]
[                                                    ]
[ Building with Sun Studio v11 now, which is what I  ]
[ did the previous build with.                       ]


Re: [OpenAFS] Re: Listing all volume mount points

2011-02-24 Thread Jeff Blaine

On 2/24/2011 7:24 PM, Andrew Deason wrote:

On Thu, 24 Feb 2011 17:09:33 -0700
Thomas Smith <theitsm...@gmail.com> wrote:


fs listq <dir> provides the information that I need, but I have been
unable to determine a way to script this without knowing every mount
point beforehand.


If you just want to see the usage vs quota, 'vos examine' can tell you
that. You need to run that on every volume in the cell, but you can get
a list of all volumes that clients know about by running 'vos listvldb'.
By gluing them together with a bit of scripting, you can know the usage
vs quota of all volumes.


http://ats.sourceforge.net/

- README

  quota_partinfo - Like 'vos partinfo server partition' but
                   instead of reporting on K disk space free, it
                   reports on K uncommitted quota-wise (REAL
                   and proper free AFS space).  Optionally caches
                   results (see the top of the script)


[OpenAFS] Revival: Recommended way to start up OpenAFS on Solaris 10?

2011-02-21 Thread Jeff Blaine

Best I can tell, the thread ended with this message from
David Boyes @ SNA:

http://www.openafs.org/pipermail/openafs-info/2010-January/032816.html

Anything?  Anyone?  Did we get anywhere?  Just looking to
snarf someone's SMF stuff that works.


Re: [OpenAFS] KfW on Windows 7 / 64 bit

2011-02-02 Thread Jeff Blaine

Have you confirmed that the krb5.ini is correct?

On 2/2/2011 12:39 PM, John Tang Boyland wrote:

I have a student who is trying to get Kerberos/OpenAFS working on
Windows 7 (64 bit).  But not even NIM works, it says that
validity of identity couldn't be determined
When they run kinit in a command.com window they get the same error
with one (I am typing this from memory) about not being able to
contact a KDC for the desired realm.

And yet,
ping kerberos.cs.uwm.edu
works just fine.
They are not aware of any firewall issues that would be
preventing kerberos from getting through.
But that's the only thing I could think of,
since the server is accessible to everyone else,
and is accessible from their computer using ping.

We still haven't solved earlier problems either.
I find it bizarre how four people running the latest
OpenAFS on Windows 7 on 64 bit machines can get four
completely different results.

John Boyland


Re: [OpenAFS] Re: Need volume state / fileserver / salvage knowledge

2011-02-01 Thread Jeff Blaine

Wed Jan 26 12:28:13 2011: upclientetc exited on signal 15
Wed Jan 26 12:28:13 2011: upclientbin exited on signal 15
Wed Jan 26 12:28:24 2011: fs:vol exited on signal 15
Wed Jan 26 12:58:19 2011: bos shutdown: fileserver failed to shutdown within 1800 seconds
Wed Jan 26 12:58:37 2011: fs:file exited on signal 9


Thanks for the replies.

I can't at all fathom that our issue is one of existing
client connections and callback break completion (timing out).

 Also, in this specific case, it may not be just that shutting down
 volumes took too long. 1.4.11 has known problems that can cause this
 (e.g. the host list gets a loop in it, and something spins forever
 trying to traverse the whole list).

That's this, I think?:

- Fixes to avoid issues cleaning up deleted hosts in
  the fileserver (126454)

Let's assume this issue is what caused our problem.  I'm sort
of at a loss as to how to approach OpenAFS versions.  On one
hand, expectations of more effort to make it clear in the
release notes what items could cause something like unclean
server shutdowns (kind of a big deal, IMO) are not really
justifiable.  It's open source, etc.  On the other hand,
it's not acceptable to blindly upgrade to the latest stable
release every time it comes out.  I understand that the most
obvious take-away is just, "You got bit.  Move on.", but
if anything can improve on our end, I'd like to do that.

I welcome any suggestions for how others are approaching this.

Jeff Blaine


[OpenAFS] Need volume state / fileserver / salvage knowledge

2011-01-28 Thread Jeff Blaine

OpenAFS 1.4.11 on Solaris 10 SPARC servers with *ZFS* vice
partitions

The last time we brought our fileservers down (cleanly, according
to shutdown info via bos status), it struck me as odd that
salvages were needed once it came up.  I sort of brushed it off.

We've done it again, and the same situation is presenting itself,
and I'm really confused as to how that is and what is happening
incorrectly.  One of the three cleanly shutdown fileservers came
up with hundreds of unattachable volumes, and is salvaging now
by our hand.

If anyone has any ideas, please share!  I don't see anything in
the 1.4.12 or 1.4.14 release notes indicating anything that would
be causing this in 1.4.11 (which is the first release we've
used on our upgraded Solaris 10 + ZFS fileservers).  This has
cost us hours of downtime for these particular volumes.

In the meantime, I am going to start scouring openafs.org and
the wiki for as much information as I can about how the entire
fileserver/clean/dirty/salvage process works (finally).

Below you can (if you care to) see that the ZFS properties for
the fileservers are the same (no salvage needed vs. salvage needed).

===
Fileserver with NO Salvage Needed on Clean Shutdown
===

Showing 1 partition, all are confirmed to be configured the same
as this.

BosConfig Info

bnode fs fs 1
parm /usr/afs/bin/fileserver
parm /usr/afs/bin/volserver
parm /usr/afs/bin/salvager -tmpdir /usr/tmp -parallel all4 -DontSalvage
end

ZFS Info

NAME              PROPERTY              VALUE                  SOURCE
pool-vice/vicepa  type                  filesystem             -
pool-vice/vicepa  creation              Wed Jul 15 11:23 2009  -
pool-vice/vicepa  used                  30.0G                  -
pool-vice/vicepa  available             146G                   -
pool-vice/vicepa  referenced            30.0G                  -
pool-vice/vicepa  compressratio         1.00x                  -
pool-vice/vicepa  mounted               yes                    -
pool-vice/vicepa  quota                 176G                   local
pool-vice/vicepa  reservation           none                   default
pool-vice/vicepa  recordsize            32K                    local
pool-vice/vicepa  mountpoint            /vicepa                local
pool-vice/vicepa  sharenfs              off                    local
pool-vice/vicepa  checksum              on                     default
pool-vice/vicepa  compression           off                    local
pool-vice/vicepa  atime                 off                    local
pool-vice/vicepa  devices               on                     default
pool-vice/vicepa  exec                  on                     local
pool-vice/vicepa  setuid                on                     local
pool-vice/vicepa  readonly              off                    default
pool-vice/vicepa  zoned                 off                    default
pool-vice/vicepa  snapdir               hidden                 default
pool-vice/vicepa  aclmode               groupmask              default
pool-vice/vicepa  aclinherit            restricted             default
pool-vice/vicepa  canmount              on                     default
pool-vice/vicepa  shareiscsi            off                    default
pool-vice/vicepa  xattr                 on                     local
pool-vice/vicepa  copies                1                      default
pool-vice/vicepa  version               3                      -
pool-vice/vicepa  utf8only              off                    -
pool-vice/vicepa  normalization         none                   -
pool-vice/vicepa  casesensitivity       sensitive              -
pool-vice/vicepa  vscan                 off                    default
pool-vice/vicepa  nbmand                off                    default
pool-vice/vicepa  sharesmb              off                    default
pool-vice/vicepa  refquota              none                   default
pool-vice/vicepa  refreservation        none                   default
pool-vice/vicepa  primarycache          all                    default
pool-vice/vicepa  secondarycache        all                    default
pool-vice/vicepa  usedbysnapshots       0                      -
pool-vice/vicepa  usedbydataset         0                      -
pool-vice/vicepa  usedbychildren        0                      -
pool-vice/vicepa  usedbyrefreservation  0                      -
pool-vice/vicepa  logbias               latency                default


===
Fileserver with Salvage Needed on Clean Shutdown
===


Showing 1 partition (which is 1 that did have volumes on it
that needed salvaging), all are confirmed to be configured
the same as this.

BosConfig Info

bnode fs fs 1
parm /usr/afs/bin/fileserver
parm /usr/afs/bin/volserver
parm /usr/afs/bin/salvager -tmpdir /usr/tmp 

Re: [OpenAFS] Re: Need volume state / fileserver / salvage knowledge

2011-01-28 Thread Jeff Blaine

On 1/28/2011 12:33 PM, Andrew Deason wrote:

On Fri, 28 Jan 2011 12:10:38 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


The last time we brought our fileservers down (cleanly, according to
shutdown info via bos status), it struck me as odd that salvages
were needed once it came up.  I sort of brushed it off.


As in, it salvaged everything automatically when it came back up, or
volumes were not attached when it came back up, and you needed to
salvage to bring them online?


The latter.


We've done it again, and the same situation is presenting itself,
and I'm really confused as to how that is and what is happening
incorrectly.  One of the three cleanly shutdown fileservers came
up with hundreds of unattachable volumes, and is salvaging now
by our hand.


Well, why are they not attaching? FileLog should tell you. And the
salvage logs should say what they fixed, if anything, to bring them back
online.


Yes, I am waiting on that to all finish before I examine and reply.


Also, salvaging an entire partition at once may be quite a bit faster
than salvaging volumes individually, depending on how many volumes you
have. The fileserver needs to be shutdown for that to happen, though.


I didn't trust it at all and forced a salvage of the whole server.
There were many unattachable volumes on every partition.


Re: [OpenAFS] Re: Need volume state / fileserver / salvage knowledge

2011-01-28 Thread Jeff Blaine

Examples from FileLog.old:

Fri Jan 28 10:02:48 2011 VAttachVolume: volume /vicepf/V2023864046.vol needs to be salvaged; not attached.

Fri Jan 28 10:02:49 2011 VAttachVolume: volume salvage flag is ON for /vicepa//V2023886583.vol; volume needs salvage
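
For anyone wanting to pull the affected volumes out of a FileLog in bulk rather than by eye, here is a rough Python sketch. It is not from the thread: the function name and the regular expression are my own, and it assumes volume header files are always named V<id>.vol under a /vicepXX partition, as in the two lines quoted above.

```python
import re

# The two FileLog lines quoted above, with the mail wrapping undone.
FILELOG = """\
Fri Jan 28 10:02:48 2011 VAttachVolume: volume /vicepf/V2023864046.vol needs to be salvaged; not attached.
Fri Jan 28 10:02:49 2011 VAttachVolume: volume salvage flag is ON for /vicepa//V2023886583.vol; volume needs salvage
"""

# Assumed layout: partition path, one or more slashes, then V<digits>.vol
VOL_RE = re.compile(r"(/vicep[a-z]+)/+V(\d+)\.vol")


def volumes_needing_salvage(text):
    """Return sorted (partition, volume_id) pairs from VAttachVolume salvage lines."""
    found = set()
    for line in text.splitlines():
        # Only look at attach failures that mention salvaging.
        if "VAttachVolume" not in line or "salvage" not in line:
            continue
        match = VOL_RE.search(line)
        if match:
            found.add((match.group(1), int(match.group(2))))
    return sorted(found)


if __name__ == "__main__":
    for partition, volume_id in volumes_needing_salvage(FILELOG):
        print(partition, volume_id)
```

The resulting list is only a starting point for deciding what to salvage; it says nothing about why the salvage flag was set.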


Examples from SalvageLog.old pretty much run the gamut (it's
a 4MB file...).

01/28/2011 10:30:50 Found 13 orphaned files and directories (approx. 26 KB)

01/28/2011 10:30:52 Volume uniquifier is too low; fixed

01/28/2011 10:31:11 Vnode 34: version < inode version; fixed (old status)

01/28/2011 12:54:15 Volume 536872710 (src.local) mount point ./flex/011 to '#src.flex.011#' invalid, converted to symbolic link

01/28/2011 12:27:30 dir vnode 15: special old unlink-while-referenced file .__afs9803 is deleted (vnode 2248)

01/28/2011 12:28:22 dir vnode 1075: ./.gconfd/lock/ior (vnode 4272): unique changed from 54370 to 57920

01/28/2011 12:28:22 dir vnode 1077: ./.gconf/%gconf-xml-backend.lock/ior already claimed by directory vnode 1 (vnode 4278, unique 54373) -- deleted

01/28/2011 12:28:28 dir vnode 607: invalid entry: ./.gconfd/lock/ior (vnode 1114, unique 132811)

01/28/2011 12:37:28 dir vnode 1: invalid entry deleted: ./.ab_library.lock (vnode 50816, unique 25535)


On 1/28/2011 12:33 PM, Andrew Deason wrote:

On Fri, 28 Jan 2011 12:10:38 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


The last time we brought our fileservers down (cleanly, according to
shutdown info via bos status), it struck me as odd that salvages
were needed once it came up.  I sort of brushed it off.


As in, it salvaged everything automatically when it came back up, or
volumes were not attached when it came back up, and you needed to
salvage to bring them online?


We've done it again, and the same situation is presenting itself,
and I'm really confused as to how that is and what is happening
incorrectly.  One of the three cleanly shutdown fileservers came
up with hundreds of unattachable volumes, and is salvaging now
by our hand.


Well, why are they not attaching? FileLog should tell you. And the
salvage logs should say what they fixed, if anything, to bring them back
online.

Also, salvaging an entire partition at once may be quite a bit faster
than salvaging volumes individually, depending on how many volumes you
have. The fileserver needs to be shutdown for that to happen, though.




Re: [OpenAFS] Re: Need volume state / fileserver / salvage knowledge

2011-01-28 Thread Jeff Blaine

Do you have the FileLog from that shutdown?


No, it was cycled out by me salvaging :|


And there isn't anything in play that would cause an old version of the
vice partition or something weird like that, is there? (ZFS snapshots,
liveupgrade misconfiguration, etc)


No.


Re: [OpenAFS] Re: Need volume state / fileserver / salvage knowledge

2011-01-28 Thread Jeff Blaine

On 1/28/2011 1:52 PM, Derrick Brashear wrote:

did shutdown perchance take 30min?


Yes.  I found this in BosLog.old just now:

Wed Jan 26 12:28:13 2011: upclientetc exited on signal 15
Wed Jan 26 12:28:13 2011: upclientbin exited on signal 15
Wed Jan 26 12:28:24 2011: fs:vol exited on signal 15
Wed Jan 26 12:58:19 2011: bos shutdown: fileserver failed to shutdown within 1800 seconds
Wed Jan 26 12:58:37 2011: fs:file exited on signal 9
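
To put a number on how long that shutdown really took, the BosLog timestamps can be subtracted mechanically. This is a small Python sketch, not anything from the thread: the helper names are mine, and it assumes every BosLog line starts with a 24-character ctime-style stamp followed by a colon, as the lines above do.

```python
from datetime import datetime

# BosLog.old lines quoted above, with the wrapped line rejoined.
BOSLOG = """\
Wed Jan 26 12:28:13 2011: upclientetc exited on signal 15
Wed Jan 26 12:28:13 2011: upclientbin exited on signal 15
Wed Jan 26 12:28:24 2011: fs:vol exited on signal 15
Wed Jan 26 12:58:19 2011: bos shutdown: fileserver failed to shutdown within 1800 seconds
Wed Jan 26 12:58:37 2011: fs:file exited on signal 9
"""

STAMP_LEN = len("Wed Jan 26 12:28:13 2011")


def parse_boslog(text):
    """Split each line into (timestamp, message); skip lines without a stamp."""
    events = []
    for line in text.splitlines():
        if len(line) <= STAMP_LEN or line[STAMP_LEN] != ":":
            continue
        when = datetime.strptime(line[:STAMP_LEN], "%a %b %d %H:%M:%S %Y")
        events.append((when, line[STAMP_LEN + 1:].strip()))
    return events


def fileserver_shutdown(events):
    """Return (seconds since first event, signal) for the fs:file exit, or None."""
    start = events[0][0]
    for when, message in events:
        if message.startswith("fs:file exited on signal"):
            signal = int(message.rsplit(" ", 1)[1])
            return int((when - start).total_seconds()), signal
    return None


if __name__ == "__main__":
    print(fileserver_shutdown(parse_boslog(BOSLOG)))
```

On the excerpt above this reports 1824 seconds and signal 9, i.e. the fileserver process ran past the 1800-second limit and was killed rather than exiting on its own.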



Derrick


On Jan 28, 2011, at 1:50 PM, Jeff Blaine <jbla...@kickflop.net> wrote:


Do you have the FileLog from that shutdown?


No, it was cycled out by me salvaging :|


And there isn't anything in play that would cause an old version of the
vice partition or something weird like that, is there? (ZFS snapshots,
liveupgrade misconfiguration, etc)


No.


Re: [OpenAFS] Re: asetkey: failed to set key, code 70354694

2011-01-07 Thread Jeff Blaine

This was solved by getting the responsible person to
finally upgrade this box to Solaris 10 and OpenAFS
1.4.11 via upclientbin.

On 1/6/2011 10:30 AM, Jeff Blaine wrote:

It's talking to a Solaris 9 OpenAFS 1.4.6 server (the only
one like that in our cell). Solaris 10 and OpenAFS 1.4.11
on all other servers.

I rebooted it though after the KeyFile update due to it
seeming a little out of whack (AFS DB server only).

On 1/6/2011 9:46 AM, Derrick Brashear wrote:

Same AFS version everywhere? Some older version had a bug and would
hang when rereading KeyFile, but it shouldn't cause this.
Use tcpdump and figure out which server is returning that error, or,
install a 1.5.78 client and see which server it logs the error about?

On Thu, Jan 6, 2011 at 8:50 AM, Jeff Blaine <jbla...@kickflop.net> wrote:

Hmm, not so fast I guess. *Some* hosts are still doing
this, others are fine (???).

All /usr/afs/etc/KeyFile files checksum the same on our
servers.

rcf-smtp% ssh vegas
Password:
Last login: Thu Jan 6 08:04:52 2011 from rcf-smtp.our.
afs: Tokens for user of AFS id 26560 for cell rcf.our.org are discarded
(rxkad error=19270408)
%
% translate_et 19270408
19270408 (rxk).8 = ticket contained unknown key version number
% kinit
Password for jbla...@rcf.our.org:
% aklog
% logout

rcf-smtp% ssh vegas
Password:
Last login: Thu Jan 6 08:28:51 2011 from rcf-smtp.our.
afs: Tokens for user of AFS id 26560 for cell rcf.our.org are discarded
(rxkad error=19270408)
%


On 1/5/2011 8:37 PM, Jeff Blaine wrote:


Thanks all -- that did it.

On 1/5/2011 5:47 PM, Andrew Deason wrote:


On Wed, 05 Jan 2011 17:36:57 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


etc-upserver-host# asetkey add 17 /etc/krb5.keytab afs
asetkey: failed to set key, code 70354694.
etc-upserver-host#


$ translate_et 70354694
70354694 (acfg).6 = no more entries

aka AFSCONF_FULL. You can only have 8 keys at once iirc; how many do you
have in there?
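
The two translate_et results quoted in this thread, (acfg).6 and (rxk).8, can be reproduced by hand. Below is a hedged Python sketch of the usual com_err packing, where the low 8 bits are the offset within an error table and the remaining bits hold the table name at 6 bits per character. The function name is mine, and lowercasing the name is an assumption made to match the translate_et output shown here.

```python
ERRCODE_RANGE = 8   # low 8 bits: message offset within the error table
BITS_PER_CHAR = 6   # table name is packed 6 bits per character
CHARSET = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789_"


def split_error_code(code):
    """Return (table_name, table_base, offset) for a com_err style code."""
    offset = code & ((1 << ERRCODE_RANGE) - 1)
    base = code - offset
    packed = code >> ERRCODE_RANGE
    name = []
    # Up to five 6-bit characters, most significant first; 0 means "no char".
    for i in range(4, -1, -1):
        ch = (packed >> (BITS_PER_CHAR * i)) & ((1 << BITS_PER_CHAR) - 1)
        if ch:
            name.append(CHARSET[ch - 1])
    # Assumption: lowercase to match what translate_et printed in this thread.
    return "".join(name).lower(), base, offset


if __name__ == "__main__":
    for code in (70354694, 19270408):
        print(code, split_error_code(code))
```

This only recovers the table name and offset; the message text itself ("no more entries", "ticket contained unknown key version number") still has to come from translate_et or the error table sources.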




Re: [OpenAFS] Re: asetkey: failed to set key, code 70354694

2011-01-07 Thread Jeff Blaine

I lied, again!  It's BACK.

All file + DB servers report the exact same data for
'bos listkeys'

All DB servers have been 'bos restart <server> -all'

Various clients upon login throw the

afs: Tokens for user of AFS id 26560 for cell rcf.our.org
are discarded (rxkad error=19270408)

error for various users.  Some hosts work, some don't.

Some that don't are 1.4.11 just like the servers.  This
is the communication after entering a password via
SSH + pam_krb5 + pam_afs_session on a Solaris 10 SPARC
box running 1.4.11:

client1.our.org -> afsdb2.our.org UDP D=7004 S=32965 LEN=84
afsdb2.our.org -> client1.our.org UDP D=32965 S=7004 LEN=180
client1.our.org -> afsdb2.our.org UDP D=7004 S=32965 LEN=73
client1.our.org -> afsdb1.our.org UDP D=7004 S=32966 LEN=84
afsdb1.our.org -> client1.our.org UDP D=32966 S=7004 LEN=180
client1.our.org -> afsdb1.our.org UDP D=7004 S=32966 LEN=73
client1.our.org -> afsdb2.our.org UDP D=7004 S=32966 LEN=156
afsdb2.our.org -> client1.our.org UDP D=32966 S=7004 LEN=140
client1.our.org -> afsdb2.our.org UDP D=7004 S=32966 LEN=73
client1.our.org -> afsdb2.our.org UDP D=7002 S=32966 LEN=300
afsdb2.our.org -> client1.our.org UDP D=32966 S=7002 LEN=44
client1.our.org -> afsdb2.our.org UDP D=7002 S=32966 LEN=73
client1.our.org -> afsfs1.our.org UDP D=7000 S=7001 LEN=52
afsfs1.our.org -> client1.our.org UDP D=7001 S=7000 LEN=52
client1.our.org -> afsfs1.our.org UDP D=7000 S=7001 LEN=132
afsfs1.our.org -> client1.our.org UDP D=7001 S=7000 LEN=74
afsfs1.our.org -> client1.our.org UDP D=7001 S=7000 LEN=40
client1.our.org -> afsfs1.our.org UDP D=7000 S=7001 LEN=52
afsfs1.our.org -> client1.our.org UDP D=7001 S=7000 LEN=40
client1.our.org -> afsfs1.our.org UDP D=7000 S=7001 LEN=476
afsfs1.our.org -> client1.our.org UDP D=7001 S=7000 LEN=73
afsfs1.our.org -> client1.our.org UDP D=7001 S=7000 LEN=156
client1.our.org -> afsfs1.our.org UDP D=7000 S=7001 LEN=73

FWIW, none of the hosts above are the so-called previously
problematic box, which we have actually halted for now
to see if it affects anything.
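
To make the trace a little easier to read, the destination ports can be mapped to AFS service names. The sketch below is Python and is not part of the original message: the port table is the standard afs3-* port assignment as I understand it, the function name is made up, and the sample is a handful of lines from the trace above rather than the whole thing.

```python
import re

# Assumed standard AFS UDP port assignments (afs3-* services).
AFS_PORTS = {
    7000: "fileserver",
    7001: "cache manager (callbacks)",
    7002: "ptserver",
    7003: "vlserver",
    7004: "kaserver",
    7005: "volserver",
    7007: "bosserver",
}

# A few representative lines from the trace above.
TRACE = """\
client1.our.org -> afsdb2.our.org UDP D=7004 S=32965 LEN=84
afsdb2.our.org -> client1.our.org UDP D=32965 S=7004 LEN=180
client1.our.org -> afsdb1.our.org UDP D=7004 S=32966 LEN=84
client1.our.org -> afsdb2.our.org UDP D=7002 S=32966 LEN=300
client1.our.org -> afsfs1.our.org UDP D=7000 S=7001 LEN=52
"""

# Accepts "->" as well as a bare "-" between the two host names.
LINE_RE = re.compile(r"^(\S+)\s+-+>?\s+(\S+)\s+UDP\s+D=(\d+)\s+S=(\d+)\s+LEN=(\d+)")


def services_contacted(trace, client):
    """Return (server, service) pairs, in first-seen order, for packets sent by client."""
    seen = []
    for line in trace.splitlines():
        match = LINE_RE.match(line.strip())
        if not match or match.group(1) != client:
            continue
        service = AFS_PORTS.get(int(match.group(3)), "unknown")
        pair = (match.group(2), service)
        if pair not in seen:
            seen.append(pair)
    return seen


if __name__ == "__main__":
    for server, service in services_contacted(TRACE, "client1.our.org"):
        print(server, service)
```

This only labels which services the client spoke to and in what order; it does not decode the Rx packets, so it cannot show which server returned the rxkad error.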

Can't make any sense of this.

On 1/7/2011 12:15 PM, Jeff Blaine wrote:

This was solved by getting the responsible person to
finally upgrade this box to Solaris 10 and OpenAFS
1.4.11 via upclientbin.

On 1/6/2011 10:30 AM, Jeff Blaine wrote:

It's talking to a Solaris 9 OpenAFS 1.4.6 server (the only
one like that in our cell). Solaris 10 and OpenAFS 1.4.11
on all other servers.

I rebooted it though after the KeyFile update due to it
seeming a little out of whack (AFS DB server only).

On 1/6/2011 9:46 AM, Derrick Brashear wrote:

Same AFS version everywhere? Some older version had a bug and would
hang when rereading KeyFile, but it shouldn't cause this.
Use tcpdump and figure out which server is returning that error, or,
install a 1.5.78 client and see which server it logs the error about?

On Thu, Jan 6, 2011 at 8:50 AM, Jeff Blaine <jbla...@kickflop.net> wrote:

Hmm, not so fast I guess. *Some* hosts are still doing
this, others are fine (???).

All /usr/afs/etc/KeyFile files checksum the same on our
servers.

rcf-smtp% ssh vegas
Password:
Last login: Thu Jan 6 08:04:52 2011 from rcf-smtp.our.
afs: Tokens for user of AFS id 26560 for cell rcf.our.org are discarded
(rxkad error=19270408)
%
% translate_et 19270408
19270408 (rxk).8 = ticket contained unknown key version number
% kinit
Password for jbla...@rcf.our.org:
% aklog
% logout

rcf-smtp% ssh vegas
Password:
Last login: Thu Jan 6 08:28:51 2011 from rcf-smtp.our.
afs: Tokens for user of AFS id 26560 for cell rcf.our.org are discarded
(rxkad error=19270408)
%


On 1/5/2011 8:37 PM, Jeff Blaine wrote:


Thanks all -- that did it.

On 1/5/2011 5:47 PM, Andrew Deason wrote:


On Wed, 05 Jan 2011 17:36:57 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


etc-upserver-host# asetkey add 17 /etc/krb5.keytab afs
asetkey: failed to set key, code 70354694.
etc-upserver-host#


$ translate_et 70354694
70354694 (acfg).6 = no more entries

aka AFSCONF_FULL. You can only have 8 keys at once iirc; how many do you
have in there?




Re: [OpenAFS] Re: asetkey: failed to set key, code 70354694

2011-01-07 Thread Jeff Blaine

I should also point out that 'kinit; aklog' works for all
users who report problems.

How could it be that pam_krb5 (Russ's) and pam_afs_session
are broken due to a key change?

On 1/7/2011 2:38 PM, Jeff Blaine wrote:

I lied, again! It's BACK.

All file + DB servers report the exact same data for
'bos listkeys'

All DB servers have been 'bos restart server -all'

Various clients upon login throw the

afs: Tokens for user of AFS id 26560 for cell rcf.our.org
are discarded (rxkad error=19270408)

error for various users. Some hosts work, some don't.

Some that don't are 1.4.11 just like the servers. This
is the communication after entering a password via
SSH + pam_krb5 + pam_afs_session on a Solaris 10 SPARC
box running 1.4.11:

client1.our.org - afsdb2.our.org UDP D=7004 S=32965 LEN=84
afsdb2.our.org - client1.our.org UDP D=32965 S=7004 LEN=180
client1.our.org - afsdb2.our.org UDP D=7004 S=32965 LEN=73
client1.our.org - afsdb1.our.org UDP D=7004 S=32966 LEN=84
afsdb1.our.org - client1.our.org UDP D=32966 S=7004 LEN=180
client1.our.org - afsdb1.our.org UDP D=7004 S=32966 LEN=73
client1.our.org - afsdb2.our.org UDP D=7004 S=32966 LEN=156
afsdb2.our.org - client1.our.org UDP D=32966 S=7004 LEN=140
client1.our.org - afsdb2.our.org UDP D=7004 S=32966 LEN=73
client1.our.org - afsdb2.our.org UDP D=7002 S=32966 LEN=300
afsdb2.our.org - client1.our.org UDP D=32966 S=7002 LEN=44
client1.our.org - afsdb2.our.org UDP D=7002 S=32966 LEN=73
client1.our.org - afsfs1.our.org UDP D=7000 S=7001 LEN=52
afsfs1.our.org - client1.our.org UDP D=7001 S=7000 LEN=52
client1.our.org - afsfs1.our.org UDP D=7000 S=7001 LEN=132
afsfs1.our.org - client1.our.org UDP D=7001 S=7000 LEN=74
afsfs1.our.org - client1.our.org UDP D=7001 S=7000 LEN=40
client1.our.org - afsfs1.our.org UDP D=7000 S=7001 LEN=52
afsfs1.our.org - client1.our.org UDP D=7001 S=7000 LEN=40
client1.our.org - afsfs1.our.org UDP D=7000 S=7001 LEN=476
afsfs1.our.org - client1.our.org UDP D=7001 S=7000 LEN=73
afsfs1.our.org - client1.our.org UDP D=7001 S=7000 LEN=156
client1.our.org - afsfs1.our.org UDP D=7000 S=7001 LEN=73

FWIW, none of the hosts above are the so-called previously
problematic box, which we have actually halted for now
to see if it affects anything.

Can't make any sense of this.
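
A trace like the one above is easier to read once it is tallied per server and per AFS service. This is a minimal sketch, assuming the one-line-per-packet layout shown in this message (source, separator, destination, then D=, S= and LEN= fields) and the IANA-registered AFS3 port numbers; the function name and the service labels are illustrative, not taken from any OpenAFS tool.

```python
import re
from collections import Counter

# IANA-registered AFS3 server ports. 7001 (the client's callback port) is
# deliberately left out so the server side of each packet is unambiguous.
SERVER_PORTS = {
    7000: "fileserver",
    7002: "ptserver",
    7003: "vlserver",
    7004: "kaserver",
    7005: "volserver",
    7007: "bosserver",
}

# Accepts both "a - b" (as archived here) and "a -> b" separators.
LINE_RE = re.compile(
    r"^\s*(?P<src>\S+)\s+-+>?\s+(?P<dst>\S+)\s+UDP\s+"
    r"D=(?P<dport>\d+)\s+S=(?P<sport>\d+)\s+LEN=(?P<length>\d+)"
)


def summarize(trace_text):
    """Count packets per (server host, service), in either direction."""
    counts = Counter()
    for line in trace_text.splitlines():
        m = LINE_RE.match(line)
        if not m:
            continue  # skip anything that is not a packet line
        dport, sport = int(m["dport"]), int(m["sport"])
        if dport in SERVER_PORTS:      # request: server is the destination
            counts[(m["dst"], SERVER_PORTS[dport])] += 1
        elif sport in SERVER_PORTS:    # reply: server is the source
            counts[(m["src"], SERVER_PORTS[sport])] += 1
    return counts


if __name__ == "__main__":
    import sys
    for (host, service), n in sorted(summarize(sys.stdin.read()).items()):
        print("%-20s %-12s %d packets" % (host, service, n))
```

Feeding the trace on stdin prints one line per server and service, which shows at a glance which daemons the login actually contacted.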

On 1/7/2011 12:15 PM, Jeff Blaine wrote:

This was solved by getting the responsible person to
finally upgrade this box to Solaris 10 and OpenAFS
1.4.11 via upclientbin.

On 1/6/2011 10:30 AM, Jeff Blaine wrote:

It's talking to a Solaris 9 OpenAFS 1.4.6 server (the only
one like that in our cell). Solaris 10 and OpenAFS 1.4.11
on all other servers.

I rebooted it though after the KeyFile update due to it
seeming a little out of whack (AFS DB server only).

On 1/6/2011 9:46 AM, Derrick Brashear wrote:

Same AFS version everywhere? Some older version had a bug and would
hang when rereading KeyFile, but it shouldn't cause this.
Use tcpdump and figure out which server is returning that error, or,
install a 1.5.78 client and see which server it logs the error about?

On Thu, Jan 6, 2011 at 8:50 AM, Jeff Blaine <jbla...@kickflop.net> wrote:

Hmm, not so fast I guess. *Some* hosts are still doing
this, others are fine (???).

All /usr/afs/etc/KeyFile files checksum the same on our
servers.

rcf-smtp% ssh vegas
Password:
Last login: Thu Jan 6 08:04:52 2011 from rcf-smtp.our.
afs: Tokens for user of AFS id 26560 for cell rcf.our.org are discarded
(rxkad error=19270408)
%
% translate_et 19270408
19270408 (rxk).8 = ticket contained unknown key version number
% kinit
Password for jbla...@rcf.our.org:
% aklog
% logout

rcf-smtp% ssh vegas
Password:
Last login: Thu Jan 6 08:28:51 2011 from rcf-smtp.our.
afs: Tokens for user of AFS id 26560 for cell rcf.our.org are discarded
(rxkad error=19270408)
%


On 1/5/2011 8:37 PM, Jeff Blaine wrote:


Thanks all -- that did it.

On 1/5/2011 5:47 PM, Andrew Deason wrote:


On Wed, 05 Jan 2011 17:36:57 -0500
Jeff Blaine <jbla...@kickflop.net> wrote:


etc-upserver-host# asetkey add 17 /etc/krb5.keytab afs
asetkey: failed to set key, code 70354694.
etc-upserver-host#


$ translate_et 70354694
70354694 (acfg).6 = no more entries

aka AFSCONF_FULL. You can only have 8 keys at once iirc; how many do you
have in there?


___
OpenAFS-info mailing list
OpenAFS-info@openafs.org
https://lists.openafs.org/mailman/listinfo/openafs-info

