Questions about standalone cluster configuration:

  1.  Is it considered bad practice to have standby JobManagers co-located on 
the same machines as TaskManagers?
  2.  Is it considered bad practice to have ZooKeeper installed on the same 
machines as the JobManager leader and standby machines? (The docs say "In 
production setups, it is recommended to manage your own ZooKeeper 
installation.", but I'm assuming it's still okay to co-locate ZooKeeper with 
the JobManagers?)
  3.  In another thread, I read that the rule of thumb is 
taskmanager.numberOfTaskSlots = number of cores. Doesn't this ignore cases 
where threads spend a high proportion of their time idle (e.g. waiting on an 
I/O call)? If the total number of task slots limits my degree of parallelism, 
but most parallel copies of a subtask are idle at any given time, it seems 
that I would want the number of task slots to be some multiple of the number 
of cores.
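
To make question 3 concrete, here is the rough arithmetic I have in mind. It is only a sketch: the function name and the idle-fraction estimate are my own hypothetical inputs, not anything from the Flink docs, and the formula (cores divided by the fraction of time a thread is actually busy) is a generic thread-sizing heuristic that I am assuming carries over to task slots.

```python
import math


def suggested_task_slots(cores: int, idle_fraction: float) -> int:
    """Back-of-the-envelope slot count for one TaskManager.

    cores         -- number of CPU cores on the machine
    idle_fraction -- assumed share of time a subtask thread spends waiting
                     (e.g. on I/O), in the range [0, 1)

    Hypothetical helper: if each thread is busy only (1 - idle_fraction)
    of the time, roughly cores / (1 - idle_fraction) threads are needed
    to keep every core occupied.
    """
    if cores < 1:
        raise ValueError("cores must be at least 1")
    if not 0.0 <= idle_fraction < 1.0:
        raise ValueError("idle_fraction must be in [0, 1)")
    # Never go below the documented rule of thumb (one slot per core).
    return max(cores, math.floor(cores / (1.0 - idle_fraction)))


# CPU-bound job: falls back to slots = cores.
print(suggested_task_slots(8, 0.0))   # 8
# Subtasks waiting on I/O 75% of the time: four slots per core.
print(suggested_task_slots(8, 0.75))  # 32
```

With numbers like these, I would set taskmanager.numberOfTaskSlots to 32 rather than 8 on an 8-core machine, assuming memory per slot is not the limiting factor.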

Thanks,
Edward
