A customer is having a problem using --distribution=arbitrary with a 
hostfile. Hosts can be represented in the hostfile either individually 
(node1,node2,...) or as a range or set of ranges between square brackets 
(node[1-10], node[20-30,50-100], ...). His job appears to fail after 
processing about 13090 hosts, with the following errors:
srun: error: hostlist.c:1747 Too many ranges, can't process entire list: Invalid argument
and then
srun: error: Hostlist is too long for the allocate RPC!

The constants defined in hostlist.h and hostlist.c suggest a limit much 
larger than 13090 hosts.  Not sure where the RPC limit is defined.  All 
the hosts in the job's hostfile are listed individually. There are no 
ranges. Does anyone know the following limits regarding the hostfile?

Maximum number of hosts. 
Maximum number of host ranges.
Maximum size of the hostfile in bytes.
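
For anyone who wants to experiment with the two hostfile forms mentioned 
above, here is a minimal Python sketch that collapses individually listed 
hosts into the bracketed range form. The helper name compress_hosts is 
hypothetical and not part of Slurm, and the sketch assumes host names end 
in a numeric suffix without zero padding. It also sorts and de-duplicates 
the names, so it may not preserve the task order that an arbitrary 
distribution relies on. Whether the range form avoids the errors above is 
untested.

```python
import re
from itertools import groupby


def compress_hosts(hosts):
    """Collapse names such as node1,node2,node3 into node[1-3].

    Assumption: names end in a numeric suffix without zero padding.
    Names without a numeric suffix are passed through unchanged.
    """
    by_prefix = {}   # prefix -> set of numeric suffixes
    result = []
    for host in hosts:
        match = re.fullmatch(r"(.*?)(\d+)", host)
        if match:
            by_prefix.setdefault(match.group(1), set()).add(int(match.group(2)))
        else:
            result.append(host)

    for prefix, numbers in by_prefix.items():
        ordered = sorted(numbers)
        if len(ordered) == 1:
            result.append(f"{prefix}{ordered[0]}")
            continue
        ranges = []
        # Consecutive numbers share the same (value - index) difference.
        for _, group in groupby(enumerate(ordered), key=lambda p: p[1] - p[0]):
            run = [value for _, value in group]
            ranges.append(str(run[0]) if len(run) == 1 else f"{run[0]}-{run[-1]}")
        result.append(f"{prefix}[{','.join(ranges)}]")
    return result


if __name__ == "__main__":
    print(",".join(compress_hosts(["node1", "node2", "node3", "node5"])))
```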

Thanks,
Martin
