Might I suggest: https://svn.open-mpi.org/trac/ompi/ticket/1073
It deals with some of these issues and explains the boundaries of the
problem.

As for what a string param can contain, I have no opinion. I only note that
it must handle special characters such as ';', '/', etc. that are typically
found in uri's. I cannot think of any reason it should have a quote in it.

Ralph

On 11/8/07 12:25 PM, "Tim Prins" <tpr...@cs.indiana.edu> wrote:

> The alias option you presented does not work. I think we do some weird
> things to find the absolute path for ssh, instead of just issuing the
> command.
>
> I would spend some time fixing this, but I don't want to do it wrong. We
> could quote all the param values, and change the parser to remove the
> quotes, but this is assuming that a mca param does not contain quotes.
>
> So I guess there are 2 questions that need to be answered before a fix
> is made:
>
> 1. What exactly can a string mca param contain? Can it have quotes or
> spaces or?
>
> 2. Which mca parameters should be forwarded? Should it be just the ones
> from the command line? From the environment? From config files?
>
> Tim
>
> Ralph Castain wrote:
>> What changed is that we never passed mca params to the orted before - they
>> always went to the app, but it's the orted that has the issue. There is a
>> bug ticket thread on this subject - I forget the number immediately.
>>
>> Basically, the problem was that we cannot generally pass the local
>> environment to the orteds when we launch them. However, people needed
>> various mca params to get to the orteds to control their behavior. The only
>> way to resolve that problem was to pass the params via the command line,
>> which is what was done.
>>
>> Except for a very few cases, all of our mca params are single values that do
>> not include spaces, so this is not a problem that is causing widespread
>> issues. As I said, I already had to deal with one special case that didn't
>> involve spaces, but did have special characters that required quoting, which
>> identified the larger problem of dealing with quoted strings.
>>
>> I have no objection to a more general fix. Like I said in my note, though,
>> the general fix will take a larger effort. If someone is willing to do so,
>> that is fine with me - I was only offering solutions that would fill the
>> interim time as I haven't heard anyone step up to say they would fix it
>> anytime soon.
>>
>> Please feel free to jump in and volunteer! ;-) I'm willing to put the quotes
>> around things if you will fix the mca cmd line parser to cleanly remove them
>> on the other end.
>>
>> Ralph
>>
>> On 11/7/07 5:50 PM, "Tim Prins" <tpr...@cs.indiana.edu> wrote:
>>
>>> I'm curious what changed to make this a problem. How were we passing mca
>>> param from the base to the app before, and why did it change?
>>>
>>> I think that options 1 & 2 below are no good, since we, in general, allow
>>> string mca params to have spaces (as far as I understand it). So a more
>>> general approach is needed.
>>>
>>> Tim
>>>
>>> On Wednesday 07 November 2007 10:40:45 am Ralph H Castain wrote:
>>>> Sorry for delay - wasn't ignoring the issue.
>>>>
>>>> There are several fixes to this problem - ranging in order from least to
>>>> most work:
>>>>
>>>> 1. just alias "ssh" to be "ssh -Y" and run without setting the mca param.
>>>> It won't affect anything on the backend because the daemon/procs don't use
>>>> ssh.
>>>>
>>>> 2. include "pls_rsh_agent" in the array of mca params not to be passed to
>>>> the orted in orte/mca/pls/base/pls_base_general_support_fns.c, the
>>>> orte_pls_base_orted_append_basic_args function. This would fix the specific
>>>> problem cited here, but I admit that listing every such param by name would
>>>> get tedious.
>>>>
>>>> 3. we could easily detect that a "problem" character was in the mca param
>>>> value when we add it to the orted's argv, and then put "" around it. The
>>>> problem, however, is that the mca param parser on the far end doesn't
>>>> remove those "" from the resulting string. At least, I spent over a day
>>>> fighting with a problem only to discover that was happening. Could be an
>>>> error in the way I was doing things, or could be a real characteristic of
>>>> the parser. Anyway, we would have to ensure that the parser removes any
>>>> surrounding "" before passing along the param value or this won't work.
>>>>
>>>> Ralph
>>>>
>>>> On 11/5/07 12:10 PM, "Tim Prins" <tpr...@cs.indiana.edu> wrote:
>>>>> Hi,
>>>>>
>>>>> Commit 16364 broke things when using multiword mca param values. For
>>>>> instance:
>>>>>
>>>>> mpirun --debug-daemons -mca orte_debug 1 -mca pls rsh -mca pls_rsh_agent "ssh -Y" xterm
>>>>>
>>>>> Will crash and burn, because the value "ssh -Y" is being stored into the
>>>>> argv orted_cmd_line in orterun.c:1506. This is then added to the launch
>>>>> command for the orted:
>>>>>
>>>>> /usr/bin/ssh -Y odin004 PATH=/san/homedirs/tprins/usr/rsl/bin:$PATH ;
>>>>> export PATH ;
>>>>> LD_LIBRARY_PATH=/san/homedirs/tprins/usr/rsl/lib:$LD_LIBRARY_PATH ;
>>>>> export LD_LIBRARY_PATH ; /san/homedirs/tprins/usr/rsl/bin/orted --debug
>>>>> --debug-daemons --name 0.1 --num_procs 2 --vpid_start 0 --nodename
>>>>> odin004 --universe tpr...@odin.cs.indiana.edu:default-universe-27872
>>>>> --nsreplica
>>>>> "0.0;tcp://129.79.240.100:40907;tcp6://2001:18e8:2:240:2e0:81ff:fe2d:21a0:40908"
>>>>> --gprreplica
>>>>> "0.0;tcp://129.79.240.100:40907;tcp6://2001:18e8:2:240:2e0:81ff:fe2d:21a0:40908"
>>>>> -mca orte_debug 1 -mca pls_rsh_agent ssh -Y -mca
>>>>> mca_base_param_file_path
>>>>> /u/tprins/usr/rsl/share/openmpi/amca-param-sets:/san/homedirs/tprins/rsl/examples
>>>>> -mca mca_base_param_file_path_force /san/homedirs/tprins/rsl/examples
>>>>>
>>>>> Notice that in this command we now have "-mca pls_rsh_agent ssh -Y". So
>>>>> the quotes have been lost, as we die a horrible death.
>>>>>
>>>>> So we need to add the quotes back in somehow, or pass these options
>>>>> differently. I'm not sure what the best way to fix this.
>>>>>
>>>>> Thanks,
>>>>>
>>>>> Tim
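
Option 3 from the 11/7 message (quote any value that contains a "problem" character when building the orted's argv, and have the far-end parser strip the surrounding quotes) could be sketched roughly as below. This is a Python illustration only, not Open MPI code: `quote_if_needed`, `strip_surrounding_quotes`, `build_remote_command`, and the `PROBLEM_CHARS` set are hypothetical names, `shlex.split` merely stands in for the remote shell's word splitting, and the sketch assumes (as suggested in the thread) that a param value never itself contains a double quote.

```python
import shlex

# Hypothetical set of characters that force quoting. The thread mentions
# spaces and URI-style characters such as ';' but gives no complete list.
PROBLEM_CHARS = set(" \t;")


def quote_if_needed(value):
    """Sender side: wrap the value in double quotes if it has a problem character."""
    if '"' in value:
        # Assumption from the thread: values never contain a quote.
        raise ValueError("sketch assumes values never contain a double quote")
    if any(ch in PROBLEM_CHARS for ch in value):
        return '"' + value + '"'
    return value


def strip_surrounding_quotes(value):
    """Receiver side: remove one pair of surrounding double quotes, if present."""
    if len(value) >= 2 and value[0] == '"' and value[-1] == '"':
        return value[1:-1]
    return value


def build_remote_command(argv):
    """Join argv into the single string handed to the remote shell."""
    return " ".join(argv)


# The failing case from the thread: the value "ssh -Y" joined without quotes.
params = ["-mca", "pls_rsh_agent", "ssh -Y"]
broken = build_remote_command(params)
fixed = build_remote_command(params[:2] + [quote_if_needed(params[2])])

# Simulate the remote shell splitting the command string into words.
broken_argv = shlex.split(broken)  # the value is split into two words
fixed_argv = shlex.split(fixed)    # the value arrives as a single word

# If the quotes survive to the parser, the receiver strips them; if the
# shell already removed them, stripping leaves the value unchanged.
received = [strip_surrounding_quotes(arg) for arg in fixed_argv]
```

Stripping on the receiving side is written to be harmless when the quotes have already been removed, so it covers both the case where a shell consumes them and the case described in the thread where they reach the parser intact.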