Hi Marcos, executed it from shell.. returns immediately, no delay and the output is just hduser..
hduser@hmaster-asia:~/installations/hadoop-0.23.5$ bash -c "id -Gn hduser" hduser one important thing I noticed on logs is (posted in my first mail), it always get stuck *twice for exactly 60secs*. This looks like some timeout. Went through settings which are close to 60secs, but no luck yet. Output of strace shows futex as the last call after which it takes 2 mins and then print the ls output. Not clear if it is stuck on futex there as that seems to be the last system call. I tried with 0.23.6 but got the same behavior. * * *strace -t bin/hadoop dfs -ls / ; date* ....... 15:09:29 openat(AT_FDCWD, "/home/hduser/installations/hadoop-0.23.5/share/hadoop/mapreduce/", O_RDON LY|O_NONBLOCK|O_DIRECTORY|O_CLOEXEC) = 3 15:09:29 getdents(3, /* 11 entries */, 32768) = 584 15:09:29 getdents(3, /* 0 entries */, 32768) = 0 15:09:29 close(3) = 0 15:09:29 mmap(NULL, 1052672, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS|MAP_STACK, -1, 0) = 0x7 ff716300000 15:09:29 mprotect(0x7ff716300000, 4096, PROT_NONE) = 0 15:09:29 clone(child_stack=0x7ff7163ffff0, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_T HREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID, parent_tidptr=0x7ff716400 9d0, tls=0x7ff716400700, child_tidptr=0x7ff7164009d0) = 42720 *15:09:29 futex*(0x7ff7164009d0, FUTEX_WAIT, 42720, NULLFound 5 items drwxr-xr-x - hduser supergroup 0 2013-03-26 12:28 /aea8d83f-c0a6-4646-a88d-f9a89be92486 drwxr-xr-x - hduser supergroup 0 2013-03-06 12:43 /in drwxr-xr-x - hduser supergroup 0 2013-03-04 12:57 /out1 drwxrwx--- - hduser supergroup 0 2013-03-04 12:32 /tmp drwx------ - hduser supergroup 0 2013-03-04 12:49 /user <unfinished ... exit status 0> Thu Apr 4* 15:11:32 UTC 2013* ~/installations/hadoop-0.23.5$ On Thu, Apr 4, 2013 at 6:36 PM, MARCOS MEDRADO RUBINELLI < [email protected]> wrote: > Gopi, > > The namenode is essentially running bash -c "id -Gn hduser" and waiting > for the response. You could try executing from the shell it to see if it > does take a long time in Azure, or if the output is too complex to parse. > > Regards, > Marcos > > > In 04-04-2013 09:33, Gopi Krishna M wrote: > > in case anybody has some idea on what I should look for in debugging > this.. below is a strace of the simple bin/hadoop dfs -ls / command run on > the machine running namenode > > % time seconds usecs/call calls errors syscall > ------ ----------- ----------- --------- --------- ---------------- > 49.56 0.000853 45 19 14 execve > 36.20 0.000623 1 925 rt_sigprocmask > 5.75 0.000099 2 62 brk > 5.29 0.000091 9 10 getgid > 3.20 0.000055 1 81 rt_sigaction > 0.00 0.000000 0 71 read > 0.00 0.000000 0 3 write > 0.00 0.000000 0 49 11 open > 0.00 0.000000 0 87 close > 0.00 0.000000 0 71 31 stat > 0.00 0.000000 0 36 fstat > 0.00 0.000000 0 12 lseek > 0.00 0.000000 0 72 mmap > 0.00 0.000000 0 36 mprotect > 0.00 0.000000 0 9 munmap > 0.00 0.000000 0 19 rt_sigreturn > 0.00 0.000000 0 2 2 ioctl > 0.00 0.000000 0 40 31 access > 0.00 0.000000 0 19 pipe > 0.00 0.000000 0 8 dup2 > 0.00 0.000000 0 2 getpid > 0.00 0.000000 0 20 clone > 0.00 0.000000 0 38 19 wait4 > 0.00 0.000000 0 2 uname > 0.00 0.000000 0 24 2 fcntl > 0.00 0.000000 0 16 getdents > 0.00 0.000000 0 2 readlink > 0.00 0.000000 0 2 gettimeofday > 0.00 0.000000 0 5 getrlimit > 0.00 0.000000 0 10 getuid > 0.00 0.000000 0 10 geteuid > 0.00 0.000000 0 10 getegid > 0.00 0.000000 0 2 getppid > 0.00 0.000000 0 2 getpgrp > 0.00 0.000000 0 5 arch_prctl > 0.00 0.000000 0 2 1 futex > 0.00 0.000000 0 1 set_tid_address > 0.00 0.000000 0 10 2 openat > 0.00 0.000000 0 1 set_robust_list > ------ ----------- ----------- --------- --------- ---------------- > 100.00 0.001721 1795 113 total > > > On Sun, Mar 31, 2013 at 11:47 AM, Gopi Krishna M <[email protected]> wrote: > >> Hi.. >> >> I have installed hadoop 0.23.5 and is working fine on two of my >> installations. In a new installation on Windows Azure VMs, I am seeing an >> inordinate delay (of 1 minute) when doing a simple dfs ls command. >> >> This is a 3 node cluster, running 64 bit Ubuntu 12.04.1 LTS, Sun jre >> >> command : bin/hadoop dfs -ls / >> >> With log set to DEBUG, following is the output on the console. The >> logs in bold show the sleep/timeout/delay of *1 minute*. Any pointers >> on what to check for / what is the reason for this delay? >> >> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field >> org.apache.hadoop.metrics2.lib.MutableRate >> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginSuccess >> with annotation @org.apache. >> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of >> successful kerberos logins >> and latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops) >> 13/03/31 05:48:55 DEBUG lib.MutableMetricsFactory: field >> org.apache.hadoop.metrics2.lib.MutableRate >> org.apache.hadoop.security.UserGroupInformation$UgiMetrics.loginFailure >> with annotation @org.apache. >> hadoop.metrics2.annotation.Metric(valueName=Time, about=, value=[Rate of >> failed kerberos logins and >> latency (milliseconds)], always=false, type=DEFAULT, sampleName=Ops) >> 13/03/31 05:48:55 DEBUG impl.MetricsSystemImpl: UgiMetrics, User and >> group related metrics >> 13/03/31 05:49:55 DEBUG security.Groups: Creating new Groups object >> *13/03/31 05:49:55 DEBUG security.Groups: Group mapping >> impl=org.apache.hadoop.security.ShellBasedUni* >> *xGroupsMapping; cacheTimeout=300000* >> *13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login* >> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: hadoop login commit >> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: using local >> user:UnixPrincipal: hduser >> 13/03/31 05:50:55 DEBUG security.UserGroupInformation: UGI >> loginUser:hduser (auth:SIMPLE) >> 13/03/31 05:50:56 DEBUG hdfs.DFSClient: Short circuit read is false >> 13/03/31 05:50:56 DEBUG ipc.Client: The ping interval is 60000 ms. >> 13/03/31 05:50:56 DEBUG ipc.Client: Use SIMPLE authentication for >> protocol ClientProtocol >> 13/03/31 05:50:56 DEBUG ipc.Client: Connecting to XXXXXX:9000 >> >> >> Thanks >> Gopi >> >> > >
