On Fri, Feb 18, 2011 at 11:13 AM, Michael Hanselmann <[email protected]> wrote:
> Am 17. Februar 2011 16:27 schrieb René Nussbaumer <[email protected]>:
>> +def _RunWhenNodesReachable(node_list, action_cb, interval,
>> +                           ping_fn=netutils.TcpPing, sleep_fn=time.sleep):
>> +  client = GetClient()
>> +  cluster_info = client.QueryClusterInfo()
>> +  if cluster_info["primary_ip_version"] == constants.IP4_VERSION:
>> +    family = netutils.IPAddress.family
>> +  else:
>> +    family = netutils.IP6Address.family
>> +
>> +  nodes = [netutils.GetHostname(node, family=family) for node in node_list]
>
> You'll have to use a callback for this as well. How about doing the IP
> address stuff outside this function and just passing a list of IP
> addresses?
>
>> +  node_states = dict((name, False) for name in node_list)
>
> dict.fromkeys(node_list, False)
>
>> +  success = True
>> +
>> +  while nodes:
>> +    timeout = utils.RunningTimeout(interval, True)
>> +    new = False
>> +    for (idx, node) in enumerate(nodes):
>> +      if ping_fn(node.ip, constants.DEFAULT_NODED_PORT, timeout=1,
>
> Make the timeout at least a function-level constant, please.
>
>> +                 live_port_needed=True):
>> +        ToStdout("Node %s became available", node.name)
>> +        node_name = node_list[idx]
>> +        node_states[node_name] = True
>> +        nodes.remove(node)
>> +        new = True
>
> I think you should break out of the loop if a node became available
> instead of retrying all. Doing so will start the instances as soon as
> their nodes become available.
>
>> +
>> +    if new:
>> +      if not action_cb(node_states):
>
> Just pass the callback the list (not a generator) of online nodes. No
> need to expose internal details here.
>
> action_cb([name for (name, up) in node_states.items()])
>
>> +        success = False
>> +
>> +    time_remaining = timeout.Remaining()
>> +    if nodes and time_remaining > 0:
>> +      sleep_fn(time_remaining)
>
> Please use utils.Retry with RETRY_REMAINING_TIME.
>
>> +  return success
>> +
>> +
>> +def _MaybeInstanceStartup(opts, inst_map, node_states):
>> +  nodes_online = set(node for (node, on) in node_states.items() if on)
>> +
>> +  start_inst_list = []
>> +  for (inst, nodes) in inst_map.items():
>> +    if not (nodes - nodes_online):
>> +      # All nodes the instance lives on are back online
>> +      start_inst_list.append(inst)
>> +      del inst_map[inst]
>
> Unless it's been designed for that case, modifying a container while
> iterating over it is bad if not dangerous.

After some back and forth and local discussion, we agreed on a new
design which hopefully addresses all of your concernes here.

>
>> +  if start_inst_list:
>> +    return _InstanceStart(opts, start_inst_list, True)
>
> If you make this a callback, too, you can easily test this function.

Done.

René

Reply via email to