Thank you, Ashu. Do you know how we can get minimal support for OpenSolaris? From the website I seem to be going in circles trying to find this information. Any pointers in the right direction are highly appreciated.
On Mon, Mar 22, 2010 at 12:37 PM, Ashutosh Tripathi <Ashutosh.Tripathi at sun.com> wrote:

> Hi Satya,
>
> Hmmm... it indeed sounds like the people from ZFS and SC should be providing you with a little more justification/information about your situation than they have.
>
> If I may, I think that apart from looking at this from the storage perspective, on the Solaris/ZFS/SC side you might want to open an escalation with Sun using your support contract, assuming you have one. Otherwise you are liable to just be bounced around, given that this is a rather deep issue to debug.
>
> Sun (now Oracle) support engineers are trained for situations like these and know how to collect detailed data from the system to help with deeper analysis. Just looking at kernel thread stacks takes you only so far. Brute-force kernel coredump analysis takes you a little further, at the cost of a LOT of effort. Doing targeted debugging (with DTrace scripts for one example, debug kernel modules for another), with several iterations of back and forth, is what it takes to nail down deeper issues like this.
>
> Hope that didn't sound too much like a pushback. It was intended as good-faith feedback on how you are trying to go about this problem.
>
> Regards,
>
> -ashu
>
> opensolaris_user hello wrote:
>
>> Hi Ashu,
>>
>> Thank you very much for the response.
>>
>> In http://defect.opensolaris.org/bz/show_bug.cgi?id=15058 I indicated that the "zfs list -t all" thread was the oldest idle thread and it seems to be stuck for whatever reason. All the threads that are in biowait() appear much later than that thread. The arc_read_no_lock() thread could be the cause of the other I/O waits. While I agree that biowait() on the scdpmd could indicate something got stuck at the SCSI layer, which may or may not be because of the zfs list thread, since it was the oldest idle thread I was interested in knowing why it is stuck there forever.
>>
>> The cluster team had indicated it is an invalid configuration, but did not give any further details as to why that is the case or how we can modify the configuration to prevent this. If you think the ZFS team needs to take a look, please assign it to ZFS and I can follow up on zfs-discuss.
>>
>> Once again, I appreciate your response. We will try to follow up from the storage perspective as well.
>>
>> Regards,
>> Satya
>>
>> On Fri, Mar 19, 2010 at 4:57 PM, Ashutosh Tripathi <Ashutosh.Tripathi at sun.com> wrote:
>>
>> Hi Satya,
>>
>> While I don't know why the ZFS I/O is hung in biowait(), from past experience I can tell you that biowait() issues tend to be very hard to debug. In many cases these actually turn out to be issues related to storage, i.e. the storage or storage driver simply takes too long with (or loses track of) a given I/O. At the upper layer (Solaris/filesystem) there is nothing the system can do except wait for the I/O to complete.
>>
>> Note that this is different from a SCSI timeout, i.e. a SCSI packet sent by the server to the storage gets lost, so the host never gets an ACK back; in that case the SCSI command is retried. Here I am talking about a case where the SCSI command has been ACKed properly by the storage, but the storage just never comes back with the completed I/O.
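As an illustration of the kind of targeted probing described above, here is a minimal sketch, assuming DTrace's fbt provider can attach to biowait() on the affected kernel build and that the standard kernel mdb dcmds are available; the probe and module filters are illustrative choices, not commands taken from this thread:

    # Aggregate the kernel stacks of threads entering biowait(), to see
    # which callers are piling up behind un-completed I/Os:
    dtrace -n 'fbt::biowait:entry { @waiters[execname, stack()] = count(); }'

    # Group kernel thread stacks by module to spot the oldest idle thread
    # sitting under zfs (e.g. the "zfs list -t all" worker), or dump them all:
    echo "::stacks -m zfs" | mdb -k
    echo "::threadlist -v" | mdb -k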
>>
>> While your mention of the "zfs list -t all" command sounds a bit suspicious, when I actually look at the thread stack you posted in the CR, it lists a bunch of Java threads and scdpmd threads stuck behind a biowait(). So, at least in that case, the hang could be independent of the zfs list command (it is always possible that the zfs list is triggering a particular pattern of I/O which leads to this...).
>>
>> Anyhow, where does that leave you... Have you tried approaching your storage vendor with this problem? The leading question to them would be: why isn't the storage completing this particular I/O request from the host?
>>
>> HTH,
>> -ashu
>>
>> opensolaris_user hello wrote:
>>
>> Hi Cluster Team,
>>
>> We are currently running into a ZFS hang regularly, and after this happens the node can end up with a corrupted pool, causing complete data loss. On top of that, since reboot doesn't work, we end up with a corrupted boot partition that causes a panic at boot. We are using clustering in a single-node configuration and aim to expand to a two-node HA configuration. Looking at the stack traces and our application logs, it is clear that a "zfs list -t all" command causes the pool to get stuck. The system works without any issues except when we run into this hang.
>>
>> I tried to analyze the root cause and I see that the zfs list thread was stuck in I/O wait. It seems that this is a ZFS hang and not related to clustering. We even tried disabling cluster disk path monitoring and still run into this issue.
>>
>> If we can get some insight into why this is an invalid cluster configuration and how it can lead to the ZFS hang, we would appreciate that. I have filed 2 bugs, but the following bug explains the situation much better:
>>
>> http://defect.opensolaris.org/bz/show_bug.cgi?id=15058
>>
>> Any insight/help with this issue is highly appreciated.
>>
>> Thanks,
>> Satya
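To back up the question to the storage vendor, a small sketch of checks one might run on the affected node while the hang is in progress, assuming a stock OpenSolaris install; these are generic commands, not something quoted from the thread:

    # Any pool that ZFS itself considers unhealthy:
    zpool status -x

    # Per-device queues: a non-zero "actv" column with no throughput for
    # minutes suggests I/Os are outstanding at or below the driver layer:
    iostat -xnz 5

    # Recent FMA error telemetry from disks/HBAs, if any was generated:
    fmdump -e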