Isaac,

> -----Original Message-----
> From: Isaac Dunham [mailto:[email protected]]
> Sent: Monday, May 05, 2014 4:41 PM
> To: Bryan Evenson
> Cc: Denys Vlasenko; busybox; [email protected]
> Subject: Re: [PATCH 1/1] hwclock: Verify RTC file descriptor; use reentrant
> functions
>
> On Mon, May 05, 2014 at 03:45:28PM +0000, Bryan Evenson wrote:
> > I did get strace loaded on my system and I used that to get a better
> > understanding of where the error occurs. I modified my test script as
> > follows:
> >
> > ----------
> > #!/bin/sh
> > i=0
> > while [ 1 ]; do
> >     strace -f -o /home/root/hwclock_output_"$$_$i".txt /sbin/hwclock -w -u
> >     : $((i++))
> >     sleep 1;
> > done
> > ----------
> >
> > Again, on my system I was able to get an instance of hwclock to enter an
> > uninterruptible sleep and hang within seconds when I ran two instances of
> > the test script. By checking the PID of the hung process and the PIDs in the
> > output files, I found the strace of the hung process, which is shown below.
> >
> <snip>
> > 1808 ioctl(3, RTC_SET_TIME, {tm_sec=7, tm_min=5, tm_hour=14,
> > tm_mday=5, tm_mon=4, tm_year=114, ...}
> > ----------
> >
> > And here is the output from the hwclock instance that occurred from the
> > other test script just before this one:
> >
> > ----------
> <snip>
> > 1799 ioctl(3, RTC_SET_TIME, {tm_sec=7, tm_min=5, tm_hour=14,
> > tm_mday=5, tm_mon=4, tm_year=114, ...}) = 0
> > 1799 exit_group(0) = ?
> > 1799 +++ exited with 0 +++
> > ----------
> >
> > So on my system, hwclock is hanging on the ioctl to set the time.
> > Both the instance that set the time and the one that could not set the
> > time are trying to set the RTC to the same time. So I'm assuming from
> > this output that there is something going wrong with the ioctl on my
> > RTC driver when two separate processes attempt to set the time at the
> > same time. If I'm interpreting this correctly, I need to track down
> > what the RTC driver is doing wrong with the RTC_SET_TIME ioctl.
>
> Well, having two processes trying to set time is rather nonsensical.

I agree that it is nonsensical, but it can occur. On the target system, ntpd is running. There were some rare instances in which hwclock was run at the same time ntpd attempted its "11 minute mode" update. The test script was an attempt to reproduce the problem quickly.

> I suppose that the _right_ thing for the kernel to do would be to have one block
> until the other finished, but...it makes sense to lock the fd.
>
> So does the patch I'm attaching work for you?

Well, this is leading into some interesting questions about responsibility. I reported my results on the linux-arm-kernel mailing list, and it is possible I uncovered a bug in the Atmel RTC driver. I'll know a little more by this time tomorrow on that issue.

I'd say Busybox should be able to assume that the kernel interfaces act as documented. If that's the case, I'm not sure there is a need for this patch. Any thoughts? Either way, I'll report back once I find out whether an RTC driver bug was the root cause of my problems.

> HTH,
> Isaac Dunham

Regards,
Bryan
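
For the case of two test scripts running side by side, one possible userspace workaround is to serialize the hwclock calls on a lock file. This is only a rough sketch, assuming a flock applet or util-linux flock(1) is available on the target. The run_locked name and the /tmp/hwclock.lock path are made up for illustration and are not part of the patch under discussion. It only coordinates callers that go through the wrapper, so it would not cover an RTC write made by the kernel itself for ntpd's "11 minute mode", and it does not fix whatever the driver may be doing wrong.

```shell
#!/bin/sh
# Hypothetical wrapper: run a command while holding an exclusive lock on a
# lock file, so that cooperating callers never overlap.
# The default lock file path is an illustrative assumption.
LOCKFILE="${LOCKFILE:-/tmp/hwclock.lock}"

run_locked() {
    # flock opens $LOCKFILE, waits for an exclusive lock, runs the given
    # command, and drops the lock when that command exits. The command's
    # exit status is passed through.
    flock -x "$LOCKFILE" "$@"
}

# Intended use inside the test loop (not executed here):
#   run_locked /sbin/hwclock -w -u
```

With both test scripts calling run_locked, the second hwclock would wait for the first to finish instead of issuing RTC_SET_TIME concurrently. If a locked hwclock ever did hang, later callers would block behind it, so this is a way to narrow down the problem and not a fix.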
