Hi,
The JobClient (0.18) / Job (0.20) class APIs should help you achieve this.

Amogh

On 11/19/09 1:40 AM, "Gang Luo" <[email protected]> wrote:

Hi all,
I am going to execute multiple MapReduce jobs in sequence, but whether or not to execute a given job in that sequence cannot be determined beforehand; it depends on the result of the previous job. Does anyone have ideas on how to do this "dynamically"?

P.S. I guess Cascading could help, but I have not got the point of Cascading yet. I would appreciate it if someone could give me some hints on this.

Gang Luo
---------
Department of Computer Science
Duke University
(919)316-0993
[email protected]

----- Original Message ----
From: Edward Capriolo <[email protected]>
To: [email protected]
Sent: 2009/11/18 (Wed) 1:02:35 PM
Subject: Re: names or ips in rack awareness script?

On Wed, Nov 18, 2009 at 11:28 AM, Michael Thomas <[email protected]> wrote:
> IPs are passed to the rack awareness script. We use 'dig' to do the reverse
> lookup to find the hostname, as we also embed the rack id in the worker node
> hostnames.
>
> --Mike
>
> On 11/18/2009 08:20 AM, David J. O'Dell wrote:
>>
>> I'm trying to figure out if I should use ip addresses or dns names in my
>> rack awareness script.
>>
>> It's easier for me to use dns names because we have the row and rack
>> number in the name which means I can dynamically determine the rack
>> without having to manually update the list when adding nodes.
>>
>> However this won't work if the script is passed ips as arguments.
>> Does anyone know what is being passed on to the script (ips or dns names)?
>>
>> Relevant docs:
>>
>> http://hadoop.apache.org/common/docs/r0.20.1/cluster_setup.html#Hadoop+Rack+Awareness
>>
>> and
>>
>> http://hadoop.apache.org/common/docs/r0.20.1/api/org/apache/hadoop/net/DNSToSwitchMapping.html#resolve(java.util.List)
>>
>

It was never clear to me which would be needed, ip or hostname. I specified ip, short hostnames, and long hostnames just to be safe.
And you know things sometimes change with hadoop ::wink-wink::

I have been meaning to plug my topology script for a while (as I think it is pretty cool). I separated my topology script and my topology data like so..

topology.sh

    HADOOP_CONF=/etc/hadoop/conf

    while [ $# -gt 0 ] ; do
      nodeArg=$1
      exec< ${HADOOP_CONF}/topology.data
      result=""
      while read line ; do
        ar=( $line )
        if [ "${ar[0]}" = "$nodeArg" ] ; then
          result="${ar[1]}"
        fi
      done
      shift
      if [ -z "$result" ] ; then
        echo -n "/default-rack "
      else
        echo -n "$result "
      fi
    done

topology.data

    hadoopdata1.ec.com /dc1/rack1
    hadoopdata1 /dc1/rack1
    10.1.1.1 /dc1/rack1

It is great if your hostname reflects the rackname in some parsable format! Then you do not need to maintain a topology data file like I have. As of now I generate it from our asset db.

Good luck!
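
For the case mentioned above, where the hostname already encodes the rack, a topology script might derive the rack directly instead of reading a data file. The following is only a rough sketch under stated assumptions: the naming scheme `rowN-rackM-...` is hypothetical (nobody in this thread gave their actual convention), the function names are made up for illustration, and the reverse lookup with `dig` follows what Mike describes, since IPs are what get passed to the script.

```shell
#!/bin/sh
# Sketch only. Assumes a HYPOTHETICAL hostname convention such as
# "row3-rack12-node07.example.com"; adjust the pattern to your own naming.

# If the argument looks like a dotted IPv4 address, try a reverse lookup
# with dig; otherwise (or if the lookup yields nothing) return it unchanged.
to_hostname() {
  case $1 in
    *[!0-9.]*) printf '%s' "$1" ;;
    *) name=$(dig +short -x "$1" 2>/dev/null | head -n 1)
       printf '%s' "${name:-$1}" ;;
  esac
}

# Map one hostname to a rack path, falling back to /default-rack.
rack_for_host() {
  host=${1%%.*}                  # keep only the short hostname
  case $host in
    row[0-9]*-rack[0-9]*-*)
      row=${host%%-*}            # e.g. "row3"
      rest=${host#*-}            # e.g. "rack12-node07"
      rack=${rest%%-*}           # e.g. "rack12"
      printf '/%s/%s' "$row" "$rack"
      ;;
    *)
      printf '/default-rack'
      ;;
  esac
}

# Print one rack path per argument, space separated, on a single line.
resolve_all() {
  for arg in "$@"; do
    printf '%s ' "$(rack_for_host "$(to_hostname "$arg")")"
  done
  echo
}

# Only act when called with arguments, as Hadoop would call it.
if [ "$#" -gt 0 ]; then
  resolve_all "$@"
fi
```

Any host that does not match the assumed pattern, including an IP that cannot be reverse-resolved, lands in /default-rack, which mirrors the fallback in topology.sh above.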
