Issac,
> -----Original Message-----
> From: Isaac Dunham [mailto:[email protected]]
> Sent: Friday, May 02, 2014 8:39 PM
> To: Bryan Evenson
> Cc: Denys Vlasenko; busybox; [email protected]
> Subject: Re: [PATCH 1/1] hwclock: Verify RTC file descriptor; use reentrant
> functions
>
> On Fri, May 02, 2014 at 06:11:41PM +0000, Bryan Evenson wrote:
> > Denys,
> >
> > > -----Original Message-----
> > > From: Denys Vlasenko [mailto:[email protected]]
> > > Sent: Friday, May 02, 2014 3:10 AM
> > > To: Bryan Evenson
> > > Cc: [email protected]; busybox; [email protected]
> > > Subject: Re: [PATCH 1/1] hwclock: Verify RTC file descriptor; use
> > > reentrant functions
> > >
> > >
> > > I don't see how this can happen. Do you have evidence rtc_xopen()
> > > did return a result less than zero?
> >
> > I don't have evidence, but that seemed to be the most likely result through
> code inspection. rtc_xopen() returns the result of xopen(), which returns
> the result of open(). open() returns -1 if an error occurred, meaning
> rtc_xopen() could return -1 if an error occurred opening the rtc file.
> >
>
> ...except that xopen() does not just return the result of open().
> xopen() calls xopen3(), which has this:
> ret = open(pathname, flags, mode);
> if (ret < 0) {
> bb_perror_msg_and_die("can't open '%s'", pathname);
> }
>
> bb_perror_msg_and_die should do exactly what it says: exit verbosely.
>
> So no, xopen() cannot ever return -1 unless you have a broken compiler.
> rtc_xopen() similarly cannot return -1.
>
> As a rule of thumb, all the x*() functions will abort if they do not succeed.
Thanks for the explanation. I misread that section and was thinking there was
a way for it to return the result of open().
I did get strace loaded on my system and I used that to get a better
understanding of where the error occurs. I modified my test script as follows:
----------
#!/bin/sh
i=0
while [ 1 ]; do
strace -f -o /home/root/hwclock_output_"$$_$i".txt /sbin/hwclock -w -u
: $((i++))
sleep 1;
done
----------
Again, on my system I was able to get an instance of hwclock to enter an
uninterruptable sleep and hang within seconds when I ran two instances of the
test script. By checking the PID of the hung process and the PIDs in the
output files, I found the strace of the hung process, which is shown below.
----------
1808 execve("/sbin/hwclock", ["/sbin/hwclock", "-w", "-u"], [/* 13 vars */]) = 0
1808 brk(0) = 0x9c000
1808 uname({sys="Linux", node="at91sam9x5ek", ...}) = 0
1808 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
1808 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
1808 fstat64(3, {st_mode=S_IFREG|0644, st_size=7053, ...}) = 0
1808 mmap2(NULL, 7053, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f8b000
1808 close(3) = 0
1808 open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
1808 read(3,
"\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\354\225JJ4\0\0\0"..., 512) = 512
1808 fstat64(3, {st_mode=S_IFREG|0755, st_size=1245200, ...}) = 0
1808 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0xb6f8a000
1808 mmap2(0x4a490000, 1284496, PROT_READ|PROT_EXEC,
MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4a490000
1808 mprotect(0x4a5bc000, 32768, PROT_NONE) = 0
1808 mmap2(0x4a5c4000, 16384, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12c) = 0x4a5c4000
1808 mmap2(0x4a5c8000, 6544, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4a5c8000
1808 close(3) = 0
1808 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0xb6f89000
1808 set_tls(0xb6f894c0, 0xb6f89b98, 0x4a487058, 0xb6f894c0, 0x4a487058) = 0
1808 mprotect(0x4a5c4000, 8192, PROT_READ) = 0
1808 mprotect(0x4a486000, 4096, PROT_READ) = 0
1808 munmap(0xb6f8b000, 7053) = 0
1808 getuid32() = 0
1808 open("/dev/rtc", O_WRONLY|O_LARGEFILE) = 3
1808 gettimeofday({1399298707, 765884}, NULL) = 0
1808 brk(0) = 0x9c000
1808 brk(0xbd000) = 0xbd000
1808 open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
1808 fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
1808 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1808 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1808 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0xb6f8c000
1808 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"...,
4096) = 3519
1808 _llseek(4, -24, [3495], SEEK_CUR) = 0
1808 read(4, "\nEST5EDT,M3.2.0,M11.1.0\n", 4096) = 24
1808 close(4) = 0
1808 munmap(0xb6f8c000, 4096) = 0
1808 ioctl(3, RTC_SET_TIME, {tm_sec=7, tm_min=5, tm_hour=14, tm_mday=5,
tm_mon=4, tm_year=114, ...}
----------
And here is the output from the hwclock instance that occurred from the other
test script just before this one:
----------
1799 execve("/sbin/hwclock", ["/sbin/hwclock", "-w", "-u"], [/* 13 vars */]) = 0
1799 brk(0) = 0x9c000
1799 uname({sys="Linux", node="at91sam9x5ek", ...}) = 0
1799 access("/etc/ld.so.preload", R_OK) = -1 ENOENT (No such file or directory)
1799 open("/etc/ld.so.cache", O_RDONLY|O_CLOEXEC) = 3
1799 fstat64(3, {st_mode=S_IFREG|0644, st_size=7053, ...}) = 0
1799 mmap2(NULL, 7053, PROT_READ, MAP_PRIVATE, 3, 0) = 0xb6f98000
1799 close(3) = 0
1799 open("/lib/libc.so.6", O_RDONLY|O_CLOEXEC) = 3
1799 read(3,
"\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0(\0\1\0\0\0\354\225JJ4\0\0\0"..., 512) = 512
1799 fstat64(3, {st_mode=S_IFREG|0755, st_size=1245200, ...}) = 0
1799 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0xb6f97000
1799 mmap2(0x4a490000, 1284496, PROT_READ|PROT_EXEC,
MAP_PRIVATE|MAP_DENYWRITE, 3, 0) = 0x4a490000
1799 mprotect(0x4a5bc000, 32768, PROT_NONE) = 0
1799 mmap2(0x4a5c4000, 16384, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_DENYWRITE, 3, 0x12c) = 0x4a5c4000
1799 mmap2(0x4a5c8000, 6544, PROT_READ|PROT_WRITE,
MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x4a5c8000
1799 close(3) = 0
1799 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0xb6f96000
1799 set_tls(0xb6f964c0, 0xb6f96b98, 0x4a487058, 0xb6f964c0, 0x4a487058) = 0
1799 mprotect(0x4a5c4000, 8192, PROT_READ) = 0
1799 mprotect(0x4a486000, 4096, PROT_READ) = 0
1799 munmap(0xb6f98000, 7053) = 0
1799 getuid32() = 0
1799 open("/dev/rtc", O_WRONLY|O_LARGEFILE) = 3
1799 gettimeofday({1399298707, 387953}, NULL) = 0
1799 brk(0) = 0x9c000
1799 brk(0xbd000) = 0xbd000
1799 open("/etc/localtime", O_RDONLY|O_CLOEXEC) = 4
1799 fcntl64(4, F_GETFD) = 0x1 (flags FD_CLOEXEC)
1799 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1799 fstat64(4, {st_mode=S_IFREG|0644, st_size=3519, ...}) = 0
1799 mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0)
= 0xb6f99000
1799 read(4, "TZif2\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\0\4\0\0\0\4\0\0\0\0"...,
4096) = 3519
1799 _llseek(4, -24, [3495], SEEK_CUR) = 0
1799 read(4, "\nEST5EDT,M3.2.0,M11.1.0\n", 4096) = 24
1799 close(4) = 0
1799 munmap(0xb6f99000, 4096) = 0
1799 ioctl(3, RTC_SET_TIME, {tm_sec=7, tm_min=5, tm_hour=14, tm_mday=5,
tm_mon=4, tm_year=114, ...}) = 0
1799 exit_group(0) = ?
1799 +++ exited with 0 +++
----------
So on my system, hwclock is hanging on the ioctl to set the time. Both the
instance that set the time and the one that could not set the time are trying
to set the RTC to the same time. So I'm assuming from this output that there
is something going wrong with the ioctl on my RTC driver when two separate
processes attempt to set the time at the same time. If I'm interpreting this
correctly, I need to track down what the RTC driver is doing wrong with the
RTC_SET_TIME ioctl.
>
> HTH,
> Isaac Dunham
Thanks,
Bryan
_______________________________________________
busybox mailing list
[email protected]
http://lists.busybox.net/mailman/listinfo/busybox