Hi,

On Thu, May 06, 2010 at 07:32:51PM +0200, Lars Ellenberg wrote:
> On Thu, May 06, 2010 at 04:55:18PM +0200, Dejan Muhamedagic wrote:
> > Hi,
> > 
> > On Thu, May 06, 2010 at 04:01:45PM +0200, Florian Haas wrote:
> > > On 2010-05-06 15:59, Dejan Muhamedagic wrote:
> > > > Hi,
> > > > 
> > > > On Thu, May 06, 2010 at 03:47:21PM +0200, Florian Haas wrote:
> > > >> Brad,
> > > 
> > > Apologies to Dag for addressing him as Brad. I was distracted.
> > > 
> > > >> sorry, I'll have to drop this one too as it's doing the wrong thing.
> > > >>
> > > >> On 2010-05-06 01:49, Dag Stenstad wrote:
> > > > [...]
> > > >>> +        fi
> > > >>>          target_node=$(host $OCF_RESKEY_CRM_meta_migrate_target | \
> > > >>>          awk '{ print $1; }')
> 
> You are aware that there are several flavours of "host" around,
> each with a different output?
> Also, just as an example of what the output *may* look like (with a
> particular flavour of host):
> host www.google.com
> www.google.com is an alias for www.l.google.com.
> www.l.google.com has address 74.125.39.105
> www.l.google.com has address 74.125.39.106
> www.l.google.com has address 74.125.39.147
> www.l.google.com has address 74.125.39.99
> www.l.google.com has address 74.125.39.103
> www.l.google.com has address 74.125.39.104
> www.l.google.com has IPv6 address 2a00:1450:8007::68
> www.l.google.com has IPv6 address 2a00:1450:8007::69
> www.l.google.com has IPv6 address 2a00:1450:8007::6a
> www.l.google.com has IPv6 address 2a00:1450:8007::93
> www.l.google.com has IPv6 address 2a00:1450:8007::63
> www.l.google.com has IPv6 address 2a00:1450:8007::67
> 
> If you go down that route, you have to deal with such things, too ;-)
> 
> > > >>> +        if [ ${target_node} = "Host" ]; then
> 
> (from package "host")       host doesnotexist.linbit.
> doesnotexist.linbit does not exist (Authoritative answer)
> 
> (from package "bind9-host") host doesnotexist.linbit.
> Host doesnotexist.linbit. not found: 3(NXDOMAIN)
> 
> and that's only the two popular Debian flavours.
> (and don't ask about the exit codes ;->)
> 
> 
> how about dig? (only one flavour, to the best of my knowledge)
> or getent hosts?
> or gethostip (syslinux package)?
> or put it into the cib again, explicitly, using "migrate_append_domain"?
> 
> Actually, none of this should be done, IMO.
> It needs to be fixed not in the RA, but in whatever entity
> currently is erroneously rejecting the certificate as invalid.
> 
> That entity needs to be fixed, to do the proper lookup, and possibly
> iterate over entries found by whatever resolving mechanism has to be used.

Indeed.

> I presume that there are some parts of libvirt and xen involved?
> It's open source?
> Right, then let's fix it there.
> Not add ugly workarounds that may or
> may not work depending on other factors.

Still, if it's going to take time for that to be fixed upstream,
we can provide a workaround. Of course all your notes about the
host output are valid. Hope that the author and Florian will take
care of that :)
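
For illustration, here is a rough sketch of what such a workaround could
look like if it used "getent hosts" instead of parsing host(1) output,
retried a few times, and failed with OCF_ERR_GENERIC. The function names
(lookup_once, resolve_target), the retry count and the delay are made up
for this example and are not from the posted patch. It also assumes that
the second field of the getent output is the FQDN, which depends on how
/etc/hosts and the resolver are set up. Untested on a real cluster.

```sh
#!/bin/sh
# Sketch only: lookup_once/resolve_target are hypothetical helpers,
# not part of the Xen RA or of the patch under discussion.

# Normally set by the OCF shell functions; defaulted here so the
# snippet is self-contained.
: "${OCF_ERR_GENERIC:=1}"

# Print the canonical name getent reports for $1 (second field of the
# first output line). Returns non-zero if getent printed nothing.
lookup_once() {
    getent hosts "$1" |
        awk 'NR == 1 { print $2; found = 1 } END { exit !found }'
}

# resolve_target NAME [TRIES] [DELAY_SECONDS]
# Retry the lookup, since the failure may be temporary. On success the
# name is printed; after the last failed attempt OCF_ERR_GENERIC is
# returned.
resolve_target() {
    name=$1
    tries=${2:-5}
    delay=${3:-1}
    while [ "$tries" -gt 0 ]; do
        if fqdn=$(lookup_once "$name") && [ -n "$fqdn" ]; then
            printf '%s\n' "$fqdn"
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep "$delay"
        fi
    done
    return "$OCF_ERR_GENERIC"
}
```

The RA would then call something like
target_node=$(resolve_target "$OCF_RESKEY_CRM_meta_migrate_target")
and exit with $OCF_ERR_GENERIC if that fails, so a transient resolver
problem leads to local recovery rather than forcing the resource away.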

Cheers,

Dejan

> > > >>> +            ocf_log error "Unable to resolve migrate target FQDN."
> > > >>> +            ocf_log debug "This is probably due to a misconfigured /etc/resolv.conf, missing"
> > > >>> +            ocf_log debug "entries in /etc/hosts or a generally broken DNS setup."
> > > >>> +            exit $OCF_ERR_CONFIGURED
> > > >>> +        fi
> > > >> Hmmm. I'm ambivalent about this part. What this means is that if we
> > > >> initiate migration, and we can't resolve the host name at that time, we
> > > >> don't just stop the migration and recover the resource through an
> > > >> in-place shutdown and restart, but we're forcing it away, without any VM
> > > >> migration, to a different host. I think this should be
> > > >> $OCF_ERR_GENERIC... what do others think?
> > > > 
> > > > Agreed. The problem may even be temporary. I'd probably try to
> > > > avoid using host resolving in resources anyway.
> > > 
> > > Yes, but in the use case Dag describes it makes good sense.
> > > 
> > > So what is your suggestion? Loop on resolving and time out if it
> > > ultimately fails?
> > 
> > Yes, it seems preferable to do that. Even better would be to
> > catch the output (or exit code) of host(1) and see if the error
> > is intermittent or definite. At any rate, the exit code should be
> > OCF_ERR_GENERIC.
> 
> -- 
> : Lars Ellenberg
> : LINBIT | Your Way to High Availability
> : DRBD/HA support and consulting http://www.linbit.com
> 
> DRBD® and LINBIT® are registered trademarks of LINBIT, Austria.
> _______________________________________________________
> Linux-HA-Dev: [email protected]
> http://lists.linux-ha.org/mailman/listinfo/linux-ha-dev
> Home Page: http://linux-ha.org/
