Hi Martin,

The original logic could not process a hostlist longer than 64k bytes.
The hostlist functions should be able to handle up to 64k nodes. The
patch below will be in v2.6 and should handle a hostlist of any size:

https://github.com/SchedMD/slurm/commit/41ba94015036d34ab0cb32de07f2e42f39409ec7
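
To see why the byte length matters when every host is listed
individually, here is a rough Python sketch (not part of Slurm; the
function name fold_hostlist and the node names are made up for
illustration). It folds adjacent, consecutively numbered hosts into the
bracketed range form and compares the byte length of the two
representations. Listing order and repeated hosts are left untouched.
Whether a shorter expression avoids the limit on versions before v2.6
is an assumption and has not been verified here.

```python
import re


def fold_hostlist(hosts):
    """Fold adjacent, consecutively numbered hosts into bracketed ranges.

    The numeric suffix is matched without leading zeros, so zero padding
    stays in the prefix: "node08", "node09" fold to "node0[8-9]".
    """
    runs = []  # each entry: [prefix, first, last]; first is None for verbatim names
    for host in hosts:
        m = re.fullmatch(r"(.*?)([1-9]\d*|0)", host)
        if m is None:
            runs.append([host, None, None])  # no numeric suffix, keep as-is
            continue
        prefix, num = m.group(1), int(m.group(2))
        prev = runs[-1] if runs else None
        if prev and prev[0] == prefix and prev[2] is not None and prev[2] + 1 == num:
            prev[2] = num  # extends the current run
        else:
            runs.append([prefix, num, num])

    parts = []
    for prefix, first, last in runs:
        if first is None:
            parts.append(prefix)
        elif first == last:
            parts.append(f"{prefix}{first}")
        else:
            parts.append(f"{prefix}[{first}-{last}]")
    return ",".join(parts)


# Hypothetical hostfile contents: 13090 individually listed nodes.
hosts = [f"node{i}" for i in range(1, 13091)]
flat = ",".join(hosts)
folded = fold_hostlist(hosts)
print(len(flat.encode()), "bytes when listed individually")
print(len(folded.encode()), "bytes when folded:", folded)
```

The sketch only merges neighbours, so a hostfile whose order is
significant keeps that order in the folded form.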


Quoting [email protected]:

> A customer is having a problem using --distribution=arbitrary with a
> hostfile. Hosts can be represented in the hostfile either individually
> (node1,node2,...) or as a range or set of ranges between square brackets
> (node[1-10], node[20-30,50-100], ...). His job appears to fail after
> processing about 13090 hosts, with the following errors:
> srun: error: hostlist.c:1747 Too many ranges, can't process entire list:
> Invalid argument
> and then
> srun: error: Hostlist is too long for the allocate RPC!
>
> The constants defined in hostlist.h and hostlist.c suggest a limit much
> larger than 13090 hosts.  Not sure where the RPC limit is defined.  All
> the hosts in the job's hostfile are listed individually. There are no
> ranges. Does anyone know the following limits regarding the hostfile?
>
> Maximum number of hosts.
> Maximum number of host ranges.
> Maximum size of the hostfile in bytes.
>
> Thanks,
> Martin
>
