Re: [developer] Fix for "illumos#3821 Race in rollback, zil close, and zil flush" is incomplete

2017-08-21 Thread Prakash Surya
Sounds good. I'll sync up with George. Thanks!

On Mon, Aug 21, 2017 at 8:02 AM, Jerry Jelinek wrote:

> Hi Prakash,
>
> We have only seen this panic one time since the original fix was
> integrated into illumos. I have no way to properly test your new change or
> reproduce the problem from this bug. I have been working with George Wilson
> to try to get him access to the dump for debugging. Maybe the best path
> forward is for you and him to sync up. If both of you feel like this new
> patch is the right fix for this bug, then once it is integrated into
> illumos, we'll pull it over right away and it will start being used in
> production at that time. Otherwise, I am not sure how I can help here since
> I can't put your patch into production and I have no other way to do any
> meaningful testing on this change.
>
> Sorry,
> Jerry
>
>
> On Mon, Aug 21, 2017 at 7:51 AM, Prakash Surya wrote:
>
>> Hi Jerry,
>>
>> Would it be possible for you to try out the patch contained here:
>> https://github.com/openzfs/openzfs/pull/421 ? A direct link to the diff
>> (as opposed to the review/comments) is here:
>> https://patch-diff.githubusercontent.com/raw/openzfs/openzfs/pull/421.diff
>>
>> When developing that change, I remember hitting a panic that looks
>> similar to what you reproduced, but wasn't able to trigger that same panic
>> after my changes stabilized. Since I haven't root caused your panic, I
>> can't say that my change will definitely address the panic, but I think it
>> will.
>>
>> In addition to addressing the panic, it should also improve latency of
>> ZIL operations (that was the original motivation for the change). It'd be
>> great to get some additional testing from others before submitting the RTI
>> to get it into illumos/openzfs.
>>
>> On Fri, Aug 11, 2017 at 6:14 AM, Jerry Jelinek wrote:
>>
>>> I just opened the following bug on illumos:
>>> https://www.illumos.org/issues/8574
>>>
>>> I don't have enough zfs knowledge to know if the fix we had been using
>>> for this bug prior to the integration of illumos#3821 is the best way
>>> forward here, or if there is something better.
>>>
>>> I'd be interested to hear any input from someone who knows more about
>>> this, or if anyone would like me to add more details from the panic into
>>> the bug report.
>>>
>>> Thanks,
>>> Jerry
>>>
>>>

--
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/T55ce250b4e9eb1a9-Ma2bfac988226078aec249c97
Powered by Topicbox: https://topicbox.com


Re: [developer] Fix for "illumos#3821 Race in rollback, zil close, and zil flush" is incomplete

2017-08-21 Thread Jerry Jelinek
Hi Prakash,

We have only seen this panic one time since the original fix was integrated
into illumos. I have no way to properly test your new change or reproduce
the problem from this bug. I have been working with George Wilson to try to
get him access to the dump for debugging. Maybe the best path forward is
for you and him to sync up. If both of you feel like this new patch is the
right fix for this bug, then once it is integrated into illumos, we'll pull
it over right away and it will start being used in production at that time.
Otherwise, I am not sure how I can help here since I can't put your patch
into production and I have no other way to do any meaningful testing on
this change.

Sorry,
Jerry


On Mon, Aug 21, 2017 at 7:51 AM, Prakash Surya wrote:

> Hi Jerry,
>
> Would it be possible for you to try out the patch contained here:
> https://github.com/openzfs/openzfs/pull/421 ? A direct link to the diff
> (as opposed to the review/comments) is here:
> https://patch-diff.githubusercontent.com/raw/openzfs/openzfs/pull/421.diff
>
> When developing that change, I remember hitting a panic that looks similar
> to what you reproduced, but wasn't able to trigger that same panic after my
> changes stabilized. Since I haven't root caused your panic, I can't say
> that my change will definitely address the panic, but I think it will.
>
> In addition to addressing the panic, it should also improve latency of ZIL
> operations (that was the original motivation for the change). It'd be great
> to get some additional testing from others before submitting the RTI to get
> it into illumos/openzfs.
>
> On Fri, Aug 11, 2017 at 6:14 AM, Jerry Jelinek wrote:
>
>> I just opened the following bug on illumos:
>> https://www.illumos.org/issues/8574
>>
>> I don't have enough zfs knowledge to know if the fix we had been using
>> for this bug prior to the integration of illumos#3821 is the best way
>> forward here, or if there is something better.
>>
>> I'd be interested to hear any input from someone who knows more about
>> this, or if anyone would like me to add more details from the panic into
>> the bug report.
>>
>> Thanks,
>> Jerry
>>
>>

--
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/T55ce250b4e9eb1a9-M7f65fd130ce4e5dec560605e


Re: [developer] Fix for "illumos#3821 Race in rollback, zil close, and zil flush" is incomplete

2017-08-21 Thread Prakash Surya
Hi Jerry,

Would it be possible for you to try out the patch contained here:
https://github.com/openzfs/openzfs/pull/421 ? A direct link to the diff (as
opposed to the review/comments) is here:
https://patch-diff.githubusercontent.com/raw/openzfs/openzfs/pull/421.diff

When developing that change, I remember hitting a panic that looks similar
to what you reproduced, but wasn't able to trigger that same panic after my
changes stabilized. Since I haven't root caused your panic, I can't say
that my change will definitely address the panic, but I think it will.

In addition to addressing the panic, it should also improve latency of ZIL
operations (that was the original motivation for the change). It'd be great
to get some additional testing from others before submitting the RTI to get
it into illumos/openzfs.

On Fri, Aug 11, 2017 at 6:14 AM, Jerry Jelinek wrote:

> I just opened the following bug on illumos:
> https://www.illumos.org/issues/8574
>
> I don't have enough zfs knowledge to know if the fix we had been using for
> this bug prior to the integration of illumos#3821 is the best way forward
> here, or if there is something better.
>
> I'd be interested to hear any input from someone who knows more about
> this, or if anyone would like me to add more details from the panic into
> the bug report.
>
> Thanks,
> Jerry
>

--
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/T55ce250b4e9eb1a9-M4b3351f20b9c4aa8800e5536


[developer] Fix for "illumos#3821 Race in rollback, zil close, and zil flush" is incomplete

2017-08-11 Thread Jerry Jelinek
I just opened the following bug on illumos:
https://www.illumos.org/issues/8574

I don't have enough zfs knowledge to know if the fix we had been using for
this bug prior to the integration of illumos#3821 is the best way forward
here, or if there is something better.

I'd be interested to hear any input from someone who knows more about this,
or if anyone would like me to add more details from the panic into the bug
report.

Thanks,
Jerry

--
openzfs-developer
Archives: 
https://openzfs.topicbox.com/groups/developer/discussions/T55ce250b4e9eb1a9-Mc43dfcea45f7babb18538f9b