On 04/30/2012 08:37 PM, Tim Serong wrote:
> On 05/01/2012 08:04 AM, Seth Galitzer wrote:
>> This was a bit trickier to get worked out, but I have made some
>> progress.  It turns out just putting the metadata on a shared disk
>> resource and symlinking wasn't quite enough.  nmbd (the NetBIOS name
>> server daemon that Samba uses) complained that the symlink to its
>> working directory wasn't a real directory.  On top of that, you can
>> specify the path for the nmbd working dir, but only at compile time, not
>> at run time.  To work around this, I added a bind mount for that dir
>> (/var/run/samba for debian/ubuntu) and now samba will start.  It will
>> even fail over if I put the primary into standby.  So there's the progress.
>>
>> However, a client still can't reconnect to the share once the node has
>> failed over until I rerun "net ads join" on the secondary (new primary).
>> I've been running the join command using the DNS name for the floating
>> IP, but maybe that's not good enough.  I'll look more deeply into net
>> tomorrow, and see if I can specify the IP, too.
>
> Have you got "/var/lib/samba" on shared storage (or linked to, or
> "private dir" in smb.conf set to some directory on shared storage)?
> IIRC when you do "net ads join", various secrets and whatnot are saved
> somewhere in that directory.  If that's not persistent across failover,
> it'd explain what you're seeing.

The following dirs are all on shared storage:
/var/cache/samba
/var/lib/samba
/var/log/samba
/var/run/samba

The last is a bind mount; the rest are symlinks.  It turns out that in
Debian, /var/run is a symlink to /run.  In my fs resource for the bind
mount, I specified /var/run/samba as the target, but for some reason
the system mounted it at /run/samba instead.  This meant that when I
tried to fail over the resource, it wouldn't unmount and would fail
silently.  I changed the resource to use /run/samba as the target, and
now it fails over smoothly.  Not sure who to blame for this behavior,
but I've at least got it working now.
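
For anyone hitting the same thing, here is a minimal sketch of how a
mount target could be checked for symlinked components before it goes
into the fs resource.  It assumes Python is available on the node; the
function names are hypothetical and not part of any cluster tool.

```python
import os


def resolve_mount_target(path):
    """Return the path with every symlinked component resolved."""
    return os.path.realpath(path)


def has_symlinked_component(path):
    """Return True if resolving symlinks changes the absolute path."""
    return os.path.realpath(path) != os.path.abspath(path)


if __name__ == "__main__":
    # /var/run/samba is the target discussed above; adjust as needed.
    target = "/var/run/samba"
    if has_symlinked_component(target):
        print(f"{target} resolves to {resolve_mount_target(target)}")
    else:
        print(f"{target} has no symlinked components")
```

On a system where /var/run is a symlink to /run, this should report
/run/samba, which is the path to use as the target in the resource.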

>
>>
>> The other new oddity is that after I've put the primary into standby and
>> everything has failed over to the secondary, as soon as I bring the
>> primary back online, the resources try to switch back, i.e. they don't
>> stay on the secondary (new primary) as expected.  Granted, if I set up
>> STONITH, this shouldn't be an immediate problem, but it still will be
>> when I go to bring the node back online.  I believe this is only the
>> case with the samba resource enabled, but I'll test this more tomorrow
>> to make sure.
>
> Do you have any constraints that make the resources prefer one node?
> Also look at resource stickiness.

Thanks for the tip.  I set the stickiness on the LVM+fs+samba+exportfs 
group to 100 and that seems to have done the trick.

>
>>
>> I'm starting to wonder if samba is practical for failover or not.  I
>> don't really have much choice about using it.  Because of my mixed
>> environment, I need to be able to export nfs and samba shares from this
>> server.  Manual failover is better than what I have now, which is no
>> redundancy at all.  At least I'd be able to get my users back up more
>> quickly on the cloned node.  It just won't be as smooth as I'd like with
>> automated failover.  It still seems like it should be doable, I just
>> haven't found the proper incantation just yet.
>>
>> Any further advice is welcome.
>
> It is (or should be) ultimately possible.  I have actually done it
> before, just not for rather a while, which is why I'm being a bit vague
> (sorry!)
>
> Regards,
>
> Tim
>
>

Thanks for the help.  I'm still plugging away at it.

Seth

-- 
Seth Galitzer
Systems Coordinator
Computing and Information Sciences
Kansas State University
http://www.cis.ksu.edu/~sgsax
[email protected]
785-532-7790
_______________________________________________
Linux-HA mailing list
[email protected]
http://lists.linux-ha.org/mailman/listinfo/linux-ha
See also: http://linux-ha.org/ReportingProblems
