On 10/31/2011 03:26 PM, Vincent Pelletier wrote:
> Hi.
>
> A short update first: I haven't hit this problem on any later suspend attempt
> (~4 so far, ranging from a few dozen minutes of suspend to several hours).
>
> And a disclaimer: my kernel is tainted. Nvidia proprietary driver. Yuck.
> Feel free to blame the problems on it, I need a motivation to switch this
> box to nouveau ;) .
>
> On Mon, Oct 31, 2011 at 7:07 PM, Mike Christie <[email protected]> wrote:
>> Are these coming from accesses to the iscsi disk that is root? If so
>> when the replacement timeout fires, do you get IO errors for the root
>> partition or are you using something like multipath over iscsi (I see
>> you have only one path but are you using it to just temporarily queue IO)?
>
> I don't use multipath (...at least, if lsmod | grep "multipath" -> nothing is
> enough to tell I'm not). I've not configured a thing to use it.
>
>> Could you send the /var/log/messages?
>
> Attached (gzipped, as it's 250k+ extracted).
> Limited from boot to shutdown (...for reboot).
> Weirdly enough: the last lines before suspend have a timestamp from wakeup time.
> Also, the error output from wakeup is truncated, as seen on line 670:
> Oct 30 06:23:22 localhost kernel: [ 1138.769133] Restarting tasks ...
> 95606] [<ffffffff8100a9ef>] ? do_softirq+0x3f/0x84
Is the log you attached from the case where it took hours, or from one where
it now sort of works? I did not see the 120 sec issue or any soft lockups.
When you hit the problem, is the network accessible and is iscsid up and
running? Can you ping the initiator box from another box on the network?
When you said "then a reconnection" occurs, what did you mean? Did you
see an iscsi message indicating that we were reconnected to the target?
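
To make those checks concrete, here is a rough Python sketch of one way to
collect the same information on the initiator right after resume. It is not
part of open-iscsi: the function names are made up for illustration, and it
assumes the kernel exposes each session under /sys/class/iscsi_session with a
"state" file and that /proc/<pid>/comm is readable.

```python
from pathlib import Path


def iscsid_running(proc_root="/proc"):
    """Return True if any process under proc_root reports the name 'iscsid'."""
    for comm in Path(proc_root).glob("[0-9]*/comm"):
        try:
            if comm.read_text().strip() == "iscsid":
                return True
        except OSError:
            # The process may have exited between listing and reading.
            continue
    return False


def session_states(sys_root="/sys"):
    """Map each iSCSI session name to the contents of its sysfs 'state' file.

    Assumption: sessions appear as <sys_root>/class/iscsi_session/session*/state.
    """
    states = {}
    for state_file in Path(sys_root).glob("class/iscsi_session/session*/state"):
        try:
            states[state_file.parent.name] = state_file.read_text().strip()
        except OSError:
            continue
    return states


if __name__ == "__main__":
    print("iscsid running:", iscsid_running())
    for name, state in sorted(session_states().items()):
        print(name, state)
```

If iscsid is running but a session stays in something other than LOGGED_IN
long after resume, that would suggest looking at the network side first.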
>
> Note: I accidentally hit ctrl-scroll lock while trying to make the console
> flood stop to get time to read, and discovered it somehow dumped scheduler
> status. Sorry for the data it pushed out of the buffer.
>
>>> For the moment, I don't have an idea on how to make resume happen
>>> gracefully:
>
> A note on my setup for that boot: actually, I wasn't completely netbooting at
> that point: grub2 & /boot were on local disk, but initrd was initiating iscsi
> connection. It was an intermediate setting, and I am now completely booting
> off iscsi (+ TFTP):
> BIOS + embedded PXE (because I don't want to reflash) ->
> iPXE ("sanboot iscsi:..." maps iscsi to bios disk 0x80) -> grub2 -> linux ->
> initrd reconnects to iscsi to mount /
>
> Maybe this could explain the problem I had (maybe the kernel/suspend tools
> weren't treating network gently enough for a clean resume).
You are doing the suspend after we have pivoted from the initramfs to
the real /, right? If so, that should not be an issue.