On Fri, Feb 18, 2011 at 11:13 AM, Michael Hanselmann <[email protected]> wrote: > Am 17. Februar 2011 16:27 schrieb René Nussbaumer <[email protected]>: >> +def _RunWhenNodesReachable(node_list, action_cb, interval, >> + ping_fn=netutils.TcpPing, sleep_fn=time.sleep): >> + client = GetClient() >> + cluster_info = client.QueryClusterInfo() >> + if cluster_info["primary_ip_version"] == constants.IP4_VERSION: >> + family = netutils.IPAddress.family >> + else: >> + family = netutils.IP6Address.family >> + >> + nodes = [netutils.GetHostname(node, family=family) for node in node_list] > > You'll have to use a callback for this as well. How about doing the IP > address stuff outside this function and just passing a list of IP > addresses? > >> + node_states = dict((name, False) for name in node_list) > > dict.fromkeys(node_list, False) > >> + success = True >> + >> + while nodes: >> + timeout = utils.RunningTimeout(interval, True) >> + new = False >> + for (idx, node) in enumerate(nodes): >> + if ping_fn(node.ip, constants.DEFAULT_NODED_PORT, timeout=1, > > Make the timeout at least a function-level constant, please. > >> + live_port_needed=True): >> + ToStdout("Node %s became available", node.name) >> + node_name = node_list[idx] >> + node_states[node_name] = True >> + nodes.remove(node) >> + new = True > > I think you should break out of the loop if a node became available > instead of retrying all. Doing so will start the instances as soon as > their nodes become available. > >> + >> + if new: >> + if not action_cb(node_states): > > Just pass the callback the list (not a generator) of online nodes. No > need to expose internal details here. > > action_cb([name for (name, up) in node_states.items()]) > >> + success = False >> + >> + time_remaining = timeout.Remaining() >> + if nodes and time_remaining > 0: >> + sleep_fn(time_remaining) > > Please use utils.Retry with RETRY_REMAINING_TIME. > >> + return success >> + >> + >> +def _MaybeInstanceStartup(opts, inst_map, node_states): >> + nodes_online = set(node for (node, on) in node_states.items() if on) >> + >> + start_inst_list = [] >> + for (inst, nodes) in inst_map.items(): >> + if not (nodes - nodes_online): >> + # All nodes the instance lives on are back online >> + start_inst_list.append(inst) >> + del inst_map[inst] > > Unless it's been designed for that case, modifying a container while > iterating over it is bad if not dangerous.
After some back and forth and local discussion, we agreed on a new design which hopefully addresses all of your concernes here. > >> + if start_inst_list: >> + return _InstanceStart(opts, start_inst_list, True) > > If you make this a callback, too, you can easily test this function. Done. René
