Re: rcs configure hang

2020-11-05 Thread Paul Eggert

On 11/5/20 2:28 PM, Kelly Wang (kellythw) wrote:

> When strace hangs, I run 'ps -elf | grep strace' from another terminal and
> then use kill -9.
>
> kill -s INT $(ps -o pid= -C a.out) does not seem to work on my server.


Assuming you're using the Linux kernel signal numbers, you should be able to get 
a process ID (say, 4729) and use this:


kill -2 4729

instead of the fancier 'kill' command I suggested. Also, try this in another 
session:


kill -14 4729

which sends the ALRM signal instead of the INT signal. Either way, see what 'tr' 
says.
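
The numbers 2 and 14 are the Linux values for INT and ALRM. A quick way to confirm the mapping on a given machine is to ask kill itself for the names. This is only a minimal sketch, assuming a POSIX sh such as bash or dash; the helper name signal_name is made up for illustration. Shell functions are sh syntax, so from csh or tcsh the lines would need to be saved to a file and run with 'sh file'.

```shell
#!/bin/sh
# signal_name NUMBER: print the symbolic name of signal NUMBER.
# (signal_name is a hypothetical helper, not part of any tool in this thread.)
signal_name() {
    kill -l "$1"
}

signal_name 2     # INT on Linux
signal_name 14    # ALRM on Linux
```

If the names printed differ from INT and ALRM, the numeric 'kill -2' and 'kill -14' forms above would be sending some other signal on that platform.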




Re: rcs configure hang

2020-11-05 Thread Kelly Wang (kellythw)
Hi Paul,

When strace hangs, I run 'ps -elf | grep strace' from another terminal and
then use kill -9.
kill -s INT $(ps -o pid= -C a.out) does not seem to work on my server:

% kill -s INT $(ps -o pid= -C a.out)
Illegal variable name.

Thanks,
Kelly
 
If you need support for DevX Tools:   http://devxsupport.cisco.com/
Specifically, for NXOS, see -
https://wiki.cisco.com/display/NEXUSPMO/ContactingNexusOpsAndTools
 

On 11/5/20, 1:36 PM, "Paul Eggert"  wrote:

On 11/5/20 1:18 PM, Kelly Wang (kellythw) wrote:
> With the conftest.c you provided, strace still hangs.
> Counting the chdir("confdir3") calls, I see only 110; it then hangs after
> mkdir("confdir3", 0700 ...
> Is there some directory limit that can be set on a server?

Wow, that's a more-serious kernel (or filesystem) bug than I thought: the
mkdir system call is hanging and does not appear to be interruptible via
SIGALRM.

When the program hangs, how do you terminate it? Do you use Control-C from
a terminal? If so, what happens if you instead use 'kill'? Something like
this:

rm -fr confdir3
gcc conftest.c
strace -o tr ./a.out &
sleep 1
kill -s INT $(ps -o pid= -C a.out)

That last line should send the SIGINT signal to the a.out command; does
this cause a.out to exit? (You can look at 'tr' to see.) If it exits,
perhaps we can modify conftest.c to do the same thing to itself when it is
running on a buggy kernel.

Also, what happens if you do the same recipe as above, but use 'ALRM'
rather than 'INT'? Again, look at the end of 'tr'.



Re: rcs configure hang

2020-11-05 Thread Paul Eggert

On 11/5/20 1:18 PM, Kelly Wang (kellythw) wrote:

> With the conftest.c you provided, strace still hangs.
> Counting the chdir("confdir3") calls, I see only 110; it then hangs after
> mkdir("confdir3", 0700 ...
> Is there some directory limit that can be set on a server?


Wow, that's a more-serious kernel (or filesystem) bug than I thought: the mkdir 
system call is hanging and does not appear to be interruptible via SIGALRM.
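
To check whether the hang depends on nesting depth rather than on conftest.c itself, the mkdir/chdir sequence seen in the trace can be replayed from the shell. The following is a rough sketch, assuming a POSIX sh and the widely available mktemp command; nest_dirs is a hypothetical helper, and the depth of 20 is an arbitrary demo value. On an affected filesystem it may hang in the same way as the test program, which would itself be informative.

```shell
#!/bin/sh
# nest_dirs BASE DEPTH: under a fresh scratch directory inside BASE, repeat
# "mkdir confdir3; cd confdir3" up to DEPTH times, then print how many
# levels were created. The body runs in a subshell, so the caller's working
# directory is left untouched.
# (nest_dirs is a hypothetical helper written for this illustration.)
nest_dirs() (
    base=$1
    depth=$2
    top=$(mktemp -d "$base/nest.XXXXXX") || exit 1
    cd "$top" || exit 1
    top=$(pwd)              # absolute path, needed for the cleanup below
    i=0
    while [ "$i" -lt "$depth" ]; do
        mkdir confdir3 || break
        cd confdir3 || break
        i=$((i + 1))
    done
    echo "$i"
    # Remove the scratch tree again.
    cd / && rm -rf "$top"
)

nest_dirs "${TMPDIR:-/tmp}" 20    # prints 20 if every level succeeds
```

Passing the development directory as BASE and a depth above 110 (for example 'nest_dirs . 200') would exercise the same filesystem and roughly the same depth at which the trace stopped.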


When the program hangs, how do you terminate it? Do you use Control-C from a 
terminal? If so, what happens if you instead use 'kill'? Something like this:


rm -fr confdir3
gcc conftest.c
strace -o tr ./a.out &
sleep 1
kill -s INT $(ps -o pid= -C a.out)

That last line should send the SIGINT signal to the a.out command; does this 
cause a.out to exit? (You can look at 'tr' to see.) If it exits, perhaps we can 
modify conftest.c to do the same thing to itself when it is running on a buggy 
kernel.


Also, what happens if you do the same recipe as above, but use 'ALRM' rather 
than 'INT'? Again, look at the end of 'tr'.
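
The same start, wait, signal sequence can be scripted without the ps lookup by using the shell's $! variable. This is a sketch under stated assumptions, not a replacement for the recipe: it assumes a POSIX sh, signal_and_wait is a hypothetical helper, and 'sleep 30' merely stands in for the test program. Two caveats apply. First, in a non-interactive sh script, background commands start with SIGINT ignored, so the demo sends ALRM; testing INT is better done by typing the commands at an interactive prompt. Second, if the command is wrapped in strace, $! is the PID of strace rather than of a.out, which is why the recipe above looks up a.out by name. Since $(...) and functions are sh syntax (csh-family shells typically answer "Illegal variable name."), the file should be run with 'sh file'.

```shell
#!/bin/sh
# signal_and_wait SIGNAL COMMAND [ARGS...]
# Start COMMAND in the background, wait one second, send it SIGNAL, and
# return COMMAND's exit status as reported by the shell.
# (signal_and_wait is a hypothetical helper written for this illustration.)
signal_and_wait() {
    sig=$1
    shift
    "$@" &
    pid=$!
    sleep 1
    kill -s "$sig" "$pid"
    wait "$pid"
}

status=0
signal_and_wait ALRM sleep 30 || status=$?
# A status above 128 means the command was killed by a signal
# (128 plus the signal number in bash and dash).
echo "exit status: $status"
```

If the command survives the signal, the wait call blocks until the command finishes on its own, which mirrors the uninterruptible hang being investigated here.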




Re: rcs configure hang

2020-11-05 Thread Kelly Wang (kellythw)
Hi Paul,

With the conftest.c you provided, strace still hangs.
Counting the chdir("confdir3") calls, I see only 110; it then hangs after
mkdir("confdir3", 0700 ...
Is there some directory limit that can be set on a server?
  
sjc-ads-7913:/ws/kellythw-sjc/rcs_try/getcwd-test% tail tr
chdir("confdir3")   = 0
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0
mkdir("confdir3", 0700

% grep 'chdir("confdir3")' tr | wc -l
110

Thanks,
Kelly
 
 

On 11/5/20, 9:57 AM, "Paul Eggert"  wrote:

On 10/27/20 8:36 AM, Kelly Wang (kellythw) wrote:
> You are right: after removing confdir3 and rerunning, strace hangs.
> I checked the tr output: it stops after a bunch of mkdir and chdir calls,
> with no further steps after that.
> mkdir("confdir3", 0700) = 0
> chdir("confdir3")   = 0

How many chdir("confdir3") calls were there, exactly? On my platform there
were 1367.

My guess is that the getcwd system call hung on your platform, which
suggests a kernel or filesystem bug somewhere.

What happens if you run the attached conftest.c instead? It's the same as
before, except with an 'alarm (10)' call. As before, run it like this in
your development directory:

rm -fr confdir3
gcc conftest.c
strace -o tr ./a.out

and see how 'tr' ends if it hangs (which I hope it doesn't).



Re: rcs configure hang

2020-11-05 Thread Paul Eggert

On 10/27/20 8:36 AM, Kelly Wang (kellythw) wrote:

> You are right: after removing confdir3 and rerunning, strace hangs.
> I checked the tr output: it stops after a bunch of mkdir and chdir calls,
> with no further steps after that.
> mkdir("confdir3", 0700) = 0
> chdir("confdir3")   = 0


How many chdir("confdir3") calls were there, exactly? On my platform there were 
1367.
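
One way to get that count, and to see which call was in progress when the trace stopped, is sketched below. It assumes a POSIX sh with the standard grep, tail and mktemp utilities; count_chdir and last_call are hypothetical helper names, and the small log in the demo is hand-made sample data, not real strace output.

```shell
#!/bin/sh
# count_chdir LOG: number of lines in LOG that record a chdir("confdir3") call.
count_chdir() {
    grep -c 'chdir("confdir3")' "$1"
}

# last_call LOG: the final line of LOG, i.e. the last call strace wrote.
last_call() {
    tail -n 1 "$1"
}

# Demo on a tiny hand-made log (sample data for illustration only).
log=$(mktemp)
cat > "$log" <<'EOF'
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0
mkdir("confdir3", 0700
EOF
count_chdir "$log"    # prints 2
last_call "$log"      # prints mkdir("confdir3", 0700
rm -f "$log"
```

For the real trace the calls would be 'count_chdir tr' and 'last_call tr'. A final line with no closing parenthesis and no return value is a call that had not returned when strace was stopped.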


My guess is that the getcwd system call hung on your platform, which suggests a 
kernel or filesystem bug somewhere.


What happens if you run the attached conftest.c instead? It's the same as 
before, except with an 'alarm (10)' call. As before, run it like this in your 
development directory:


rm -fr confdir3
gcc conftest.c
strace -o tr ./a.out

and see how 'tr' ends if it hangs (which I hope it doesn't).
/* confdefs.h */
#define PACKAGE_NAME "dummy"
#define PACKAGE_TARNAME "dummy"
#define PACKAGE_VERSION "0"
#define PACKAGE_STRING "dummy 0"
#define PACKAGE_BUGREPORT ""
#define PACKAGE_URL ""
#define PACKAGE "dummy"
#define VERSION "0"
#define STDC_HEADERS 1
#define HAVE_SYS_TYPES_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_STDLIB_H 1
#define HAVE_STRING_H 1
#define HAVE_MEMORY_H 1
#define HAVE_STRINGS_H 1
#define HAVE_INTTYPES_H 1
#define HAVE_STDINT_H 1
#define HAVE_UNISTD_H 1
#define __EXTENSIONS__ 1
#define _ALL_SOURCE 1
#define _DARWIN_C_SOURCE 1
#define _GNU_SOURCE 1
#define _NETBSD_SOURCE 1
#define _OPENBSD_SOURCE 1
#define _POSIX_PTHREAD_SEMANTICS 1
#define __STDC_WANT_IEC_60559_ATTRIBS_EXT__ 1
#define __STDC_WANT_IEC_60559_BFP_EXT__ 1
#define __STDC_WANT_IEC_60559_DFP_EXT__ 1
#define __STDC_WANT_IEC_60559_FUNCS_EXT__ 1
#define __STDC_WANT_IEC_60559_TYPES_EXT__ 1
#define __STDC_WANT_LIB_EXT2__ 1
#define __STDC_WANT_MATH_SPEC_FUNCS__ 1
#define _TANDEM_SOURCE 1
#define _HPUX_ALT_XOPEN_SOCKET_API 1
#define HAVE_SYS_SOCKET_H 1
#define HAVE_ARPA_INET_H 1
#define HAVE_FEATURES_H 1
#define HAVE_UNISTD_H 1
#define HAVE_SYS_PARAM_H 1
#define HAVE_DIRENT_H 1
#define HAVE_SYS_STAT_H 1
#define HAVE_SYS_TIME_H 1
#define HAVE_NETDB_H 1
#define HAVE_NETINET_IN_H 1
#define HAVE_LIMITS_H 1
#define HAVE_WCHAR_H 1
#define HAVE_STDINT_H 1
#define HAVE_INTTYPES_H 1
#define HAVE_THREADS_H 1
#define HAVE_SYS_MMAN_H 1
#define HAVE_SYS_SELECT_H 1
#define HAVE_PTHREAD_H 1
#define HAVE_SYS_CDEFS_H 1
#define HAVE_SYS_IOCTL_H 1
#define HAVE_SYS_UIO_H 1
#define restrict __restrict
#define HAVE_SHUTDOWN 1
#define HAVE_STRUCT_SOCKADDR_STORAGE 1
#define HAVE_SA_FAMILY_T 1
#define HAVE_STRUCT_SOCKADDR_STORAGE_SS_FAMILY 1
#define HAVE_ALLOCA_H 1
#define HAVE_ALLOCA 1
#define HAVE_FCHDIR 1
#define HAVE_FCNTL 1
#define HAVE_SYMLINK 1
#define HAVE_FDOPENDIR 1
#define HAVE_MEMPCPY 1
#define HAVE_FSTATAT 1
#define HAVE_FTRUNCATE 1
#define HAVE_GETDTABLESIZE 1
#define HAVE_GETTIMEOFDAY 1
#define HAVE_ISBLANK 1
#define HAVE_LSTAT 1
#define HAVE_MPROTECT 1
#define HAVE_OPENAT 1
#define HAVE_STRERROR_R 1
#define HAVE___XPG_STRERROR_R 1
#define HAVE_PIPE 1
#define HAVE_SIGACTION 1
#define HAVE_SIGALTSTACK 1
#define HAVE_SIGINTERRUPT 1
#define HAVE_SLEEP 1
#define HAVE_CATGETS 1
#define HAVE_SNPRINTF 1
#define HAVE_USLEEP 1
#define HAVE_ENVIRON_DECL 1
#define HAVE_DECL_STRERROR_R 1
#define HAVE_STRERROR_R 1
#define STRERROR_R_CHAR_P 1
#define HAVE_DECL_FCHDIR 1
#define HAVE_WORKING_O_NOATIME 1
#define HAVE_WORKING_O_NOFOLLOW 1
#define LSTAT_FOLLOWS_SLASHED_SYMLINK 1
#define HAVE_DECL_GETCWD 1
#define HAVE_DECL_GETDTABLESIZE 1
#define HAVE_IPV4 1
#define HAVE_IPV6 1
#define HAVE_WINT_T 1
#define HAVE_LONG_LONG_INT 1
#define HAVE_UNSIGNED_LONG_LONG_INT 1
#define HAVE_WEAK_SYMBOLS 1
#define HAVE_PTHREAD_API 1
#define USE_POSIX_THREADS 1
#define USE_POSIX_THREADS_WEAK 1
#define MALLOC_0_IS_NONNULL 1
#define HAVE_MAP_ANONYMOUS 1
#define HAVE_DECL_MEMRCHR 1
#define HAVE_DECL_ALARM 1
#define PROMOTED_MODE_T mode_t
#define HAVE_DECL_STRERROR_R 1
#define HAVE_SIGSET_T 1
#define HAVE__BOOL 1
#define HAVE_WCHAR_T 1
#define HAVE_DECL_STRDUP 1
#define _USE_STD_STAT 1
#define HAVE_DECL_UNSETENV 1
#define GNULIB_TEST_ACCEPT 1
#define HAVE_ALLOCA 1
#define GNULIB_TEST_BIND 1
#define GNULIB_TEST_CHDIR 1
#define GNULIB_TEST_CLOEXEC 1
#define GNULIB_TEST_CLOSE 1
#define HAVE_CLOSEDIR 1
#define GNULIB_TEST_CLOSEDIR 1
#define GNULIB_TEST_CONNECT 1
#define D_INO_IN_DIRENT 1
#define HAVE_DIRFD 1
#define HAVE_DECL_DIRFD 1
#define GNULIB_TEST_DIRFD 1
#define GNULIB_TEST_DUP 1
#define GNULIB_TEST_DUP2 1
#define GNULIB_TEST_ENVIRON 1
#define GNULIB_TEST_FCHDIR 1
#define GNULIB_TEST_FCNTL 1
#define GNULIB_FD_SAFER_FLAG 1
#define GNULIB_TEST_FDOPEN 1
#define HAVE_DECL_FDOPENDIR 1
#define GNULIB_TEST_FDOPENDIR 1
#define GNULIB_FDOPENDIR 1
#define GNULIB_TEST_FSTAT 1
#define GNULIB_TEST_FSTATAT 1
#define GNULIB_TEST_FTRUNCATE 1
/* end confdefs.h.  */

#include 
#include 
#if HAVE_UNISTD_H
# include 
#else
# include 
#endif
#include 
#include 
#include 
#include 
#include 


/* Arrange to define PATH_MAX, like "pathmax.h" does. */
#if HAVE_UNISTD_H
# include 
#endif
#include 
#if defined 

Re: rcs configure hang

2020-11-05 Thread Kelly Wang (kellythw)
Hi Paul or gnulib guru,

Can you share any thoughts on the configure hang I see when configuring rcs?

+ ./configure

checking whether fcntl handles F_DUPFD correctly... yes
checking whether fcntl understands F_DUPFD_CLOEXEC... needs runtime check
checking whether conversion from 'int' to 'long double' works... yes
checking whether getcwd handles long file names properly...

Thanks,
Kelly
 
 

On 10/27/20, 8:36 AM, "Kelly Wang (kellythw)"  wrote:

Hi Paul,

You are right: after removing confdir3 and rerunning, strace hangs.
I checked the tr output: it stops after a bunch of mkdir and chdir calls,
with no further steps after that.
mkdir("confdir3", 0700) = 0
chdir("confdir3")   = 0

Thanks,
Kelly



On 10/26/20, 3:56 PM, "Paul Eggert"  wrote:

On 10/26/20 9:13 AM, Kelly Wang (kellythw) wrote:

> [Kelly] The strace step does not hang, and I have the tr file generated.

Looking at the tr file, it appears that there was already a directory
confdir3 when you ran the strace step, and this directory messed up the
test. Please remove that directory (or rename it) and then re-run the
"strace -o tr ./a.out". As before, the strace should also hang so you may
need to type control-C to exit it after a while. Look at the resulting 'tr'
file and compare it to the compressed file tr.gz I sent you earlier.

> [Kelly] The difference in the tr output starts at:
>
> openat(AT_FDCWD, "/lib/x86_64-linux-gnu/libc.so.6", O_RDONLY|O_CLOEXEC) = 3  ==> your output
>
> openat(AT_FDCWD, "/lib64/libc.so.6", O_RDONLY|O_CLOEXEC) = 3  ==> my output

That difference is unimportant. I'm concerned more about what happens after
the long string of mkdir/chdir calls, which should occur once you get
confdir3 out of the way.