Hi Dave.
Thanks for reviewing this document and thanks for your phone help on
Friday. I really appreciate it...
Responses inline...
On 03/30/09 08:31, Dave Miner wrote:
> Jack Schwartz wrote:
>> Hi everyone.
>>
>> I am tasked with fixing bug
>> 6252 /etc/nsswitch.dns issues with AI setup if you install from livecd
>>
>> Here is a description of the problem, a reduction of options for
>> fixing it, and some details on what I think is the right solution.
>>
>> Problem:
>> ========
>> The liveCD image sets up a system as a client. It uses NWAM for
>> network setup, which means the client's IP address can float or even
>> be the loopback IP address. This is not a problem for a client as
>> other systems don't initiate communication with clients, and clients
>> don't respond to requests to serve information to other systems.
>>
>> The AI server is a server. Other systems rely on a non-changing,
>> accessible address for getting to that server, as other systems need
>> to know how to find that server. The IP address must not change, and
>> cannot be a loopback address. These server-oriented characteristics
>> of the AI server's IP address are assumed by DHCP as well, since it
>> is setting itself up on a server.
>>
>> What we are trying to accomplish:
>> =================================
>> To keep a user from running installadm create-service only to have it
>> fail.
>>
>> Options for fixing:
>> ===================
>> 1) Have installadm correct the problems it finds.
>> This is the ideal solution but is not feasible, because of the many
>> possible network scenarios a system could be a part of, in
>> particular which name services the network is using.
>>
>> 2) Alert the user to obvious setup problems, and have the user fix
>> them. Fix any fixable problems upon user's OK.
>>
>
> I don't think we are in a position to fix things very well. At this
> point, I would err on the side of caution and merely catch the
> problems, but not correct them.
OK.
>
>> This way, we don't steer the user down an incorrect path by telling
>> them there is a problem where there isn't one. It is possible that
>> there are other problems which are missed, but the more complete the
>> checklist, the more reliable the checking is.
>>
>> Checks for obvious problems we can make:
>> ========================================
>>
>> Summary:
>> --------
>> Check the following always:
>> 1) Check that IP address is not a loopback address and that
>> getent(1M) returns the same value as ifconfig does.
>> 2) The only network services running are those appropriate for a server.
>>
>> Check the following if setting up DHCP server:
>> 3) netmask: Validate that getent returns the same netmask as ifconfig.
>> 4) Verify /etc/resolv.conf exists and at least one nameserver entry
>> in it is pingable. (Should it be that *all* nameservers are
>> pingable, not just one?)
>> 5) Verify the server knows of the default gateway for the client's
>> subnet.
>>
>> No need to check the following:
>> 1) Hostname.
Meaning: don't check that the hostname is unique across the network.
>>
>>
>> Gory Details, explanations (Read if you can't go to sleep):
>> -----------------------------------------------------------
>>
>> 1) IP address: We can check that it is not a loopback address, and
>> that getent(1M) returns the same value as ifconfig does.
>>
>
> That would be "getent hosts <hostname>" returns an IP address that's
> non-loopback.
Correct:
getent hosts `/usr/bin/hostname`
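A minimal sketch of the non-loopback test, in Python, assuming the getent
output has already been captured as text with the address in the first
column. The function name non_loopback_addresses is hypothetical, not part
of installadm.

```python
import ipaddress


def non_loopback_addresses(getent_output):
    """Return the non-loopback addresses from `getent hosts <hostname>` output.

    Assumes each line has the form "<address> <name> [aliases...]".
    """
    addrs = []
    for line in getent_output.splitlines():
        fields = line.split()
        if not fields:
            continue
        try:
            addr = ipaddress.ip_address(fields[0])
        except ValueError:
            continue  # first column is not an IP address; skip the line
        if not addr.is_loopback:
            addrs.append(str(addr))
    return addrs
```

An empty result would mean the hostname resolves only to loopback (or not
at all), which is the case to warn about.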
>
>> We cannot check that it is static (meaning it will be the same on
>> reboot), since we don't know where the system is getting it from. It
>> is possible that DHCP gave the system its address on bootup and would
>> give it a different address the next time; there's no way to tell.
>> We can't look in /etc/hosts because some other name service may have
>> supplied the address. All we can check for is that it is not a
>> loopback address.
>>
>> 2) The only network services running are those appropriate for a server.
>>
>> Check that:
>> svc:/network/physical:nwam is disabled
>> svc:/network/physical:default is online
>> Ask the user if the program should set up these services, if required.
>>
>
> No, you shouldn't modify the service state, only alert to the problem.
> Turning on network/physical:default requires configuring the interface
> statically using /etc/hostname.interface files. If you just turn on
> network/physical:default and turn off nwam without setting up a
> configuration, you'll end up with a system with no networking at all.
Oops. OK.
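So the check only reports. A rough sketch in Python, assuming the service
states are read from two-column "state fmri" text such as
"svcs -H -o state,fmri" would print; the helper name service_warnings and
the message wording are hypothetical. Nothing here changes service state.

```python
# Expected states for a server, taken from the check described above.
EXPECTED = {
    "svc:/network/physical:nwam": "disabled",
    "svc:/network/physical:default": "online",
}


def service_warnings(svcs_output):
    """Return warning strings for services not in their expected state.

    Assumes each input line is "<state> <fmri>" (an assumed output format).
    """
    states = {}
    for line in svcs_output.splitlines():
        fields = line.split()
        if len(fields) == 2:
            state, fmri = fields
            states[fmri] = state
    warnings = []
    for fmri, wanted in EXPECTED.items():
        actual = states.get(fmri, "unknown")
        if actual != wanted:
            warnings.append("%s is %s, expected %s" % (fmri, actual, wanted))
    return warnings
```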
>
>> In addition, if installadm is using the AI server to be a DHCP server...
>> ------------------------------------------------------------------------
>>
>> 3) netmask: Validate that getent returns the same netmask as ifconfig.
>>
>
> You should be specific about what you're running getent to do.
getent netmasks <IP addr>
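A sketch of the comparison in Python, assuming the two netmask strings have
already been pulled out of the getent and ifconfig output. It assumes getent
gives a dotted quad and ifconfig gives hex (e.g. ffffff00), and accepts
either form on both sides. The function names are hypothetical.

```python
import ipaddress


def normalize_netmask(text):
    """Convert a dotted-quad or hex IPv4 netmask to dotted-quad form."""
    text = text.strip().lower()
    if "." in text:
        return str(ipaddress.IPv4Address(text))
    if text.startswith("0x"):
        text = text[2:]
    # Hex form such as "ffffff00": interpret as a 32-bit integer.
    return str(ipaddress.IPv4Address(int(text, 16)))


def netmasks_match(getent_mask, ifconfig_mask):
    """True if both strings describe the same IPv4 netmask."""
    return normalize_netmask(getent_mask) == normalize_netmask(ifconfig_mask)
```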
>
>> Some utilities (dhcp, I think) query getent for info, whereas the
>> system may be working with different info. ifconfig returns what the
>> kernel is currently using.
>>
>> 4) Verify /etc/resolv.conf exists and at least one nameserver entry
>> in it is pingable. (Should it be that *all* nameservers are
>> pingable, not just one?)
>>
>
> You cannot use "pingable" as a reliable indicator of validity. There
> is no requirement that such a server respond to pings (or that the
> network along the way even allow them to get there), and there's no
> reliable way to be sure that any failure you see is anything other
> than a transient problem.
OK, so just check that at least one nameserver entry exists in
/etc/resolv.conf? Is that the best we can do?
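For reference, the entry check itself is small. A minimal Python sketch,
assuming the file contents are passed in as text and that "#" and ";" start
comments; the function name nameservers is hypothetical. It does not try to
contact the servers.

```python
def nameservers(resolv_conf_text):
    """Return the addresses listed on nameserver lines of resolv.conf text."""
    servers = []
    for line in resolv_conf_text.splitlines():
        # Drop comments before splitting into fields.
        line = line.split("#", 1)[0].split(";", 1)[0]
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "nameserver":
            servers.append(fields[1])
    return servers
```

An empty list (or a missing file) would be the condition to warn about.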
>
>> The server has DNS info to pass to client. dhcpconfig(1M) supplies
>> the client with configuration info at boot time, such as IP address.
>> DNS info is part of this configuration information needed by the
>> client, as the client will use DNS to resolve where the pkg repos
>> are. It is not a requirement that the server be utilizing dns to
>> resolve its own services; the server could be using NIS or some
>> other name service. For this reason, dns shouldn't have to be listed
>> for hosts and ipnodes entries in /etc/nsswitch.conf.
>>
>> It is a requirement that the server have a /etc/resolv.conf with
>> valid nameserver entries. Unlike other configuration files (like
>> /etc/hosts) which may or may not be used depending on other
>> nameservices used, if /etc/resolv.conf doesn't exist on the system,
>> there is no nameserver information available to that system.
>>
>> 5) Verify the server knows of the default gateway for the client's
>> subnet.
>>
>> Check via netstat -rn and look for "default" on the same line as the
>> proper subnet is listed. This is so the DHCP configuration info
>> which gets propagated to the client has this information. (The
>> client needs it.)
> All you should do in this case is warn that no router information will
> be supplied to clients and that the DHCP setup may need modification
> to provide an appropriate router.
... if netstat -rn shows no "default" line, or if no "default" entry is
listed for the subnet the clients are on.
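A sketch of that warning condition in Python, assuming netstat -rn output
has been captured as text with "default" in the first column and the
gateway in the second, and that the client subnet is known in CIDR form.
The function names and the warning text are hypothetical.

```python
import ipaddress


def default_gateways(netstat_output):
    """Return the gateway column of every "default" line."""
    gateways = []
    for line in netstat_output.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "default":
            gateways.append(fields[1])
    return gateways


def router_warning(netstat_output, client_subnet):
    """Return a warning string, or None if a default router is on the subnet."""
    net = ipaddress.ip_network(client_subnet)
    for gw in default_gateways(netstat_output):
        try:
            if ipaddress.ip_address(gw) in net:
                return None
        except ValueError:
            continue  # gateway column was not an IP address
    return ("no default router found for %s; DHCP setup may need "
            "modification to provide one" % client_subnet)
```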
Thanks again,
Jack
>
> Dave