Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread vaughan
On 10/17/2013 06:41 AM, Douglas Gilbert wrote:
> That seems to be the case. Vaughan acknowledged the
> problem and forwarded it to me 8 days ago. Yes, it
> seems to be a "no-no" to hold a any kernel semaphore
> when returning to the user space; in this case from
> sg_open(). I was hoping a revised patch might
> appear from Vaughan but to date that has not been
> the case. So with only a few weeks to go before
> lk 3.12 is released, reverting the whole 4 patches
> in that series seems to be the safest course.
>
> Also without a new patch from Vaughan in the next few
> weeks he may also miss the opportunity of getting
> his improved O_EXCL logic into the lk 3.13 series.
>
>
> Thinking about how to solve this problem: a field could
> be added to 'struct sg_device' with one of three states:
> no_opens, non_excl_opens and excl_open. It could be
> manipulated by sg_open() and sg_release() like a
> read-write semaphore. And the faulty 'struct
> rw_semaphore o_sem' in sg_device could be replaced by a
> normal semaphore to protect the manipulation of the new
> three-state field.
> And the new three-state field would replace (or expand)
the 'char exclude' field in struct sg_device .
>
> Doug Gilbert
Hi Doug,

Thanks for providing advice on how to fix this.
However, it seems be still awkward somehow. We have to
1. hold a lock (maybe sg_index_lock or a new one)
2. check
   a) the new three-state field;
   b) if sfp list is empty;
   c) sdp->detached field;
if either condition fails, we may link the open process into o_excl_wait
queue and need wakeup.
if satisfied, we go on.
3. then we release at least sg_index_lock to malloc a new sfp and
initialize.
4. try to acquire sg_index_lock again and add this sfp into sfd_siblings
list if possible.  <== We still have to check at least sdp->detached field
5. update three-state field to reflect the result of Step 4, and wake up
processes waiting in o_excl_wait.

This uncomfortable is introduced by releasing the sg_index_lock in the
middle of check->malloc->add the new sfp struct.

I wanna ask if it is possible to split the sg_add_sfp() into two
functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
work in sg_init_sfp()
without any lock and let sg_add_sfp2() only serve lock-check-add in one
way. It seems more convenient for me to understand.
But there is still some questions on this approach:
1. memory consume can be very large if lots of sg_init_sfp in the same time;
2. some field are initialized according to the fields of scsi device sdp
points to, such as low_dma, sg_tablesize, max_sector, phys_segs.
I know scsi_device_get() would keep the underlying scsi_device
alive, however would these fields change in the gap of our initialize
and add?
The relationship of sg_device and scsi_device like above said
confuse me somehow...

Thanks,
Vaughan
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread Douglas Gilbert

On 13-10-16 09:24 AM, James Bottomley wrote:

On Tue, 2013-10-08 at 09:45 -0400, Douglas Gilbert wrote:

On 13-10-08 02:44 AM, vaughan wrote:

Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...


It appears you can not (allow sg_open() to hold a semaphore
then return to the user space). So you will need to do some
rework on that patch or revert it.


OK, there being no reply on this, I'll do the revert ... that's all four
patches, correct?


That seems to be the case. Vaughan acknowledged the
problem and forwarded it to me 8 days ago. Yes, it
seems to be a "no-no" to hold a any kernel semaphore
when returning to the user space; in this case from
sg_open(). I was hoping a revised patch might
appear from Vaughan but to date that has not been
the case. So with only a few weeks to go before
lk 3.12 is released, reverting the whole 4 patches
in that series seems to be the safest course.

Also without a new patch from Vaughan in the next few
weeks he may also miss the opportunity of getting
his improved O_EXCL logic into the lk 3.13 series.


Thinking about how to solve this problem: a field could
be added to 'struct sg_device' with one of three states:
no_opens, non_excl_opens and excl_open. It could be
manipulated by sg_open() and sg_release() like a
read-write semaphore. And the faulty 'struct
rw_semaphore o_sem' in sg_device could be replaced by a
normal semaphore to protect the manipulation of the new
three-state field.

Doug Gilbert


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread James Bottomley
On Tue, 2013-10-08 at 09:45 -0400, Douglas Gilbert wrote:
> On 13-10-08 02:44 AM, vaughan wrote:
> > Hi Madper,
> >
> > CC to Douglas to get comments.
> > I use the rw_semaphore o_sem to protect excl open, introduced in commit
> > 15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
> > Is it forbidden to do like that in kernel?...
> 
> It appears you can not (allow sg_open() to hold a semaphore
> then return to the user space). So you will need to do some
> rework on that patch or revert it.

OK, there being no reply on this, I'll do the revert ... that's all four
patches, correct?

James


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread James Bottomley
On Tue, 2013-10-08 at 09:45 -0400, Douglas Gilbert wrote:
 On 13-10-08 02:44 AM, vaughan wrote:
  Hi Madper,
 
  CC to Douglas to get comments.
  I use the rw_semaphore o_sem to protect excl open, introduced in commit
  15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
  Is it forbidden to do like that in kernel?...
 
 It appears you can not (allow sg_open() to hold a semaphore
 then return to the user space). So you will need to do some
 rework on that patch or revert it.

OK, there being no reply on this, I'll do the revert ... that's all four
patches, correct?

James


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread Douglas Gilbert

On 13-10-16 09:24 AM, James Bottomley wrote:

On Tue, 2013-10-08 at 09:45 -0400, Douglas Gilbert wrote:

On 13-10-08 02:44 AM, vaughan wrote:

Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...


It appears you can not (allow sg_open() to hold a semaphore
then return to the user space). So you will need to do some
rework on that patch or revert it.


OK, there being no reply on this, I'll do the revert ... that's all four
patches, correct?


That seems to be the case. Vaughan acknowledged the
problem and forwarded it to me 8 days ago. Yes, it
seems to be a no-no to hold a any kernel semaphore
when returning to the user space; in this case from
sg_open(). I was hoping a revised patch might
appear from Vaughan but to date that has not been
the case. So with only a few weeks to go before
lk 3.12 is released, reverting the whole 4 patches
in that series seems to be the safest course.

Also without a new patch from Vaughan in the next few
weeks he may also miss the opportunity of getting
his improved O_EXCL logic into the lk 3.13 series.


Thinking about how to solve this problem: a field could
be added to 'struct sg_device' with one of three states:
no_opens, non_excl_opens and excl_open. It could be
manipulated by sg_open() and sg_release() like a
read-write semaphore. And the faulty 'struct
rw_semaphore o_sem' in sg_device could be replaced by a
normal semaphore to protect the manipulation of the new
three-state field.

Doug Gilbert


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-16 Thread vaughan
On 10/17/2013 06:41 AM, Douglas Gilbert wrote:
 That seems to be the case. Vaughan acknowledged the
 problem and forwarded it to me 8 days ago. Yes, it
 seems to be a no-no to hold a any kernel semaphore
 when returning to the user space; in this case from
 sg_open(). I was hoping a revised patch might
 appear from Vaughan but to date that has not been
 the case. So with only a few weeks to go before
 lk 3.12 is released, reverting the whole 4 patches
 in that series seems to be the safest course.

 Also without a new patch from Vaughan in the next few
 weeks he may also miss the opportunity of getting
 his improved O_EXCL logic into the lk 3.13 series.


 Thinking about how to solve this problem: a field could
 be added to 'struct sg_device' with one of three states:
 no_opens, non_excl_opens and excl_open. It could be
 manipulated by sg_open() and sg_release() like a
 read-write semaphore. And the faulty 'struct
 rw_semaphore o_sem' in sg_device could be replaced by a
 normal semaphore to protect the manipulation of the new
 three-state field.
 And the new three-state field would replace (or expand)
the 'char exclude' field in struct sg_device .

 Doug Gilbert
Hi Doug,

Thanks for providing advice on how to fix this.
However, it seems be still awkward somehow. We have to
1. hold a lock (maybe sg_index_lock or a new one)
2. check
   a) the new three-state field;
   b) if sfp list is empty;
   c) sdp-detached field;
if either condition fails, we may link the open process into o_excl_wait
queue and need wakeup.
if satisfied, we go on.
3. then we release at least sg_index_lock to malloc a new sfp and
initialize.
4. try to acquire sg_index_lock again and add this sfp into sfd_siblings
list if possible.  == We still have to check at least sdp-detached field
5. update three-state field to reflect the result of Step 4, and wake up
processes waiting in o_excl_wait.

This uncomfortable is introduced by releasing the sg_index_lock in the
middle of check-malloc-add the new sfp struct.

I wanna ask if it is possible to split the sg_add_sfp() into two
functions, sg_init_sfp() and sg_add_sfp2(). We can do all initialize
work in sg_init_sfp()
without any lock and let sg_add_sfp2() only serve lock-check-add in one
way. It seems more convenient for me to understand.
But there is still some questions on this approach:
1. memory consume can be very large if lots of sg_init_sfp in the same time;
2. some field are initialized according to the fields of scsi device sdp
points to, such as low_dma, sg_tablesize, max_sector, phys_segs.
I know scsi_device_get() would keep the underlying scsi_device
alive, however would these fields change in the gap of our initialize
and add?
The relationship of sg_device and scsi_device like above said
confuse me somehow...

Thanks,
Vaughan
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-08 Thread Douglas Gilbert

On 13-10-08 02:44 AM, vaughan wrote:

Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...


It appears you can not (allow sg_open() to hold a semaphore
then return to the user space). So you will need to do some
rework on that patch or revert it.

Doug Gilbert

Reference: scsi-linux + kernel lists, title:
  [PATCH v6 0/4][SCSI] sg: fix race condition in sg_open
  20130828


On 10/08/2013 01:57 PM, Madper Xie wrote:

Howdy Vaughan Cao,
I can't meet this issue on both 3.11 and 3.11.4. There are only four
patches between 3.11 and 3.12-rc2 and you are the author. Will you
please check them if you have time.

c...@redhat.com writes:


Hi all,
With kernel3.12-rc2 the dmesg shows following logs:
[   12.864680] 
[   12.864681] [ BUG: lock held when returning to user space! ]
[   12.864682] 3.12.0-rc2 #1 Not tainted
[   12.864683] 
[   12.864684] iprinit/719 is leaving the kernel with locks still held!
[   12.864685] 1 lock held by iprinit/719:
[   12.864686]  #0:  (>o_sem){.+.+..}, at: [] 
sg_open+0x4b5/0x644 [sg]
[   12.934954] ath9k :01:00.0: enabling device ( -> 0002)
[   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x & 
0x0007 != 0x0004
[   12.943125] ath: EEPROM regdomain: 0x60
[   12.943127] ath: EEPROM indicates we should expect a direct regpair map
[   12.943129] ath: Country alpha2 being used: 00
[   12.943130] ath: Regpair used: 0x60
[   12.960202] r8169 :02:00.0 p3p1: link down
[   12.960236] r8169 :02:00.0 p3p1: link down
[   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
[   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
irq=16
[   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   13.055802] Ebtables v2.0 registered
[   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   15.906392] r8169 :02:00.0 p3p1: link up
[   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
[   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

I'm working on finding which version bring this bug in.







--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-08 Thread vaughan
Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...

Regards,
Vaughan

On 10/08/2013 01:57 PM, Madper Xie wrote:
> Howdy Vaughan Cao,
>I can't meet this issue on both 3.11 and 3.11.4. There are only four
>patches between 3.11 and 3.12-rc2 and you are the author. Will you
>please check them if you have time. 
>
> c...@redhat.com writes:
>
>> Hi all,
>>With kernel3.12-rc2 the dmesg shows following logs:
>> [   12.864680] ============
>> [   12.864681] [ BUG: lock held when returning to user space! ]
>> [   12.864682] 3.12.0-rc2 #1 Not tainted
>> [   12.864683] 
>> [   12.864684] iprinit/719 is leaving the kernel with locks still held!
>> [   12.864685] 1 lock held by iprinit/719:
>> [   12.864686]  #0:  (>o_sem){.+.+..}, at: [] 
>> sg_open+0x4b5/0x644 [sg]
>> [   12.934954] ath9k :01:00.0: enabling device ( -> 0002)
>> [   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x & 
>> 0x0007 != 0x0004
>> [   12.943125] ath: EEPROM regdomain: 0x60
>> [   12.943127] ath: EEPROM indicates we should expect a direct regpair map
>> [   12.943129] ath: Country alpha2 being used: 00
>> [   12.943130] ath: Regpair used: 0x60
>> [   12.960202] r8169 :02:00.0 p3p1: link down
>> [   12.960236] r8169 :02:00.0 p3p1: link down
>> [   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
>> [   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
>> [   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
>> irq=16
>> [   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
>> [   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
>> [   13.055802] Ebtables v2.0 registered
>> [   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
>> [   15.906392] r8169 :02:00.0 p3p1: link up
>> [   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
>> [   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left
>>
>> I'm working on finding which version bring this bug in.
>

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-08 Thread vaughan
Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...

Regards,
Vaughan

On 10/08/2013 01:57 PM, Madper Xie wrote:
 Howdy Vaughan Cao,
I can't meet this issue on both 3.11 and 3.11.4. There are only four
patches between 3.11 and 3.12-rc2 and you are the author. Will you
please check them if you have time. 

 c...@redhat.com writes:

 Hi all,
With kernel3.12-rc2 the dmesg shows following logs:
 [   12.864680] 
 [   12.864681] [ BUG: lock held when returning to user space! ]
 [   12.864682] 3.12.0-rc2 #1 Not tainted
 [   12.864683] 
 [   12.864684] iprinit/719 is leaving the kernel with locks still held!
 [   12.864685] 1 lock held by iprinit/719:
 [   12.864686]  #0:  (sdp-o_sem){.+.+..}, at: [a050de05] 
 sg_open+0x4b5/0x644 [sg]
 [   12.934954] ath9k :01:00.0: enabling device ( - 0002)
 [   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x  
 0x0007 != 0x0004
 [   12.943125] ath: EEPROM regdomain: 0x60
 [   12.943127] ath: EEPROM indicates we should expect a direct regpair map
 [   12.943129] ath: Country alpha2 being used: 00
 [   12.943130] ath: Regpair used: 0x60
 [   12.960202] r8169 :02:00.0 p3p1: link down
 [   12.960236] r8169 :02:00.0 p3p1: link down
 [   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
 [   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
 [   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
 irq=16
 [   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
 [   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
 [   13.055802] Ebtables v2.0 registered
 [   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
 [   15.906392] r8169 :02:00.0 p3p1: link up
 [   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
 [   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

 I'm working on finding which version bring this bug in.


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-08 Thread Douglas Gilbert

On 13-10-08 02:44 AM, vaughan wrote:

Hi Madper,

CC to Douglas to get comments.
I use the rw_semaphore o_sem to protect excl open, introduced in commit
15b06f9a02406e5460001db6d5af5c738cd3d4e7 since v3.12-rc1.
Is it forbidden to do like that in kernel?...


It appears you can not (allow sg_open() to hold a semaphore
then return to the user space). So you will need to do some
rework on that patch or revert it.

Doug Gilbert

Reference: scsi-linux + kernel lists, title:
  [PATCH v6 0/4][SCSI] sg: fix race condition in sg_open
  20130828


On 10/08/2013 01:57 PM, Madper Xie wrote:

Howdy Vaughan Cao,
I can't meet this issue on both 3.11 and 3.11.4. There are only four
patches between 3.11 and 3.12-rc2 and you are the author. Will you
please check them if you have time.

c...@redhat.com writes:


Hi all,
With kernel3.12-rc2 the dmesg shows following logs:
[   12.864680] 
[   12.864681] [ BUG: lock held when returning to user space! ]
[   12.864682] 3.12.0-rc2 #1 Not tainted
[   12.864683] 
[   12.864684] iprinit/719 is leaving the kernel with locks still held!
[   12.864685] 1 lock held by iprinit/719:
[   12.864686]  #0:  (sdp-o_sem){.+.+..}, at: [a050de05] 
sg_open+0x4b5/0x644 [sg]
[   12.934954] ath9k :01:00.0: enabling device ( - 0002)
[   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x  
0x0007 != 0x0004
[   12.943125] ath: EEPROM regdomain: 0x60
[   12.943127] ath: EEPROM indicates we should expect a direct regpair map
[   12.943129] ath: Country alpha2 being used: 00
[   12.943130] ath: Regpair used: 0x60
[   12.960202] r8169 :02:00.0 p3p1: link down
[   12.960236] r8169 :02:00.0 p3p1: link down
[   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
[   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
irq=16
[   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   13.055802] Ebtables v2.0 registered
[   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   15.906392] r8169 :02:00.0 p3p1: link up
[   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
[   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

I'm working on finding which version bring this bug in.







--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-07 Thread Madper Xie
Howdy Vaughan Cao,
   I can't meet this issue on both 3.11 and 3.11.4. There are only four
   patches between 3.11 and 3.12-rc2 and you are the author. Will you
   please check them if you have time. 

c...@redhat.com writes:

> Hi all,
>With kernel3.12-rc2 the dmesg shows following logs:
> [   12.864680] 
> [   12.864681] [ BUG: lock held when returning to user space! ]
> [   12.864682] 3.12.0-rc2 #1 Not tainted
> [   12.864683] 
> [   12.864684] iprinit/719 is leaving the kernel with locks still held!
> [   12.864685] 1 lock held by iprinit/719:
> [   12.864686]  #0:  (>o_sem){.+.+..}, at: [] 
> sg_open+0x4b5/0x644 [sg]
> [   12.934954] ath9k :01:00.0: enabling device ( -> 0002)
> [   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x & 
> 0x0007 != 0x0004
> [   12.943125] ath: EEPROM regdomain: 0x60
> [   12.943127] ath: EEPROM indicates we should expect a direct regpair map
> [   12.943129] ath: Country alpha2 being used: 00
> [   12.943130] ath: Regpair used: 0x60
> [   12.960202] r8169 :02:00.0 p3p1: link down
> [   12.960236] r8169 :02:00.0 p3p1: link down
> [   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
> [   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> [   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
> irq=16
> [   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
> [   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
> [   13.055802] Ebtables v2.0 registered
> [   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
> [   15.906392] r8169 :02:00.0 p3p1: link up
> [   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
> [   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left
>
> I'm working on finding which version bring this bug in.


-- 
Best,
Madper Xie.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [Bug] 12.864681 BUG: lock held when returning to user space!

2013-10-07 Thread Madper Xie
Howdy Vaughan Cao,
   I can't meet this issue on both 3.11 and 3.11.4. There are only four
   patches between 3.11 and 3.12-rc2 and you are the author. Will you
   please check them if you have time. 

c...@redhat.com writes:

 Hi all,
With kernel3.12-rc2 the dmesg shows following logs:
 [   12.864680] 
 [   12.864681] [ BUG: lock held when returning to user space! ]
 [   12.864682] 3.12.0-rc2 #1 Not tainted
 [   12.864683] 
 [   12.864684] iprinit/719 is leaving the kernel with locks still held!
 [   12.864685] 1 lock held by iprinit/719:
 [   12.864686]  #0:  (sdp-o_sem){.+.+..}, at: [a050de05] 
 sg_open+0x4b5/0x644 [sg]
 [   12.934954] ath9k :01:00.0: enabling device ( - 0002)
 [   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x  
 0x0007 != 0x0004
 [   12.943125] ath: EEPROM regdomain: 0x60
 [   12.943127] ath: EEPROM indicates we should expect a direct regpair map
 [   12.943129] ath: Country alpha2 being used: 00
 [   12.943130] ath: Regpair used: 0x60
 [   12.960202] r8169 :02:00.0 p3p1: link down
 [   12.960236] r8169 :02:00.0 p3p1: link down
 [   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
 [   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
 [   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
 irq=16
 [   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
 [   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
 [   13.055802] Ebtables v2.0 registered
 [   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
 [   15.906392] r8169 :02:00.0 p3p1: link up
 [   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
 [   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

 I'm working on finding which version bring this bug in.


-- 
Best,
Madper Xie.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Bug] 12.864681 BUG: lock held when returning to user space!

2013-09-30 Thread Madper Xie
Hi all,
   With kernel3.12-rc2 the dmesg shows following logs:
[   12.864680] 
[   12.864681] [ BUG: lock held when returning to user space! ]
[   12.864682] 3.12.0-rc2 #1 Not tainted
[   12.864683] 
[   12.864684] iprinit/719 is leaving the kernel with locks still held!
[   12.864685] 1 lock held by iprinit/719:
[   12.864686]  #0:  (>o_sem){.+.+..}, at: [] 
sg_open+0x4b5/0x644 [sg]
[   12.934954] ath9k :01:00.0: enabling device ( -> 0002)
[   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x & 
0x0007 != 0x0004
[   12.943125] ath: EEPROM regdomain: 0x60
[   12.943127] ath: EEPROM indicates we should expect a direct regpair map
[   12.943129] ath: Country alpha2 being used: 00
[   12.943130] ath: Regpair used: 0x60
[   12.960202] r8169 :02:00.0 p3p1: link down
[   12.960236] r8169 :02:00.0 p3p1: link down
[   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
[   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
irq=16
[   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   13.055802] Ebtables v2.0 registered
[   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   15.906392] r8169 :02:00.0 p3p1: link up
[   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
[   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

I'm working on finding which version bring this bug in.
-- 
Best,
Madper Xie.
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[Bug] 12.864681 BUG: lock held when returning to user space!

2013-09-30 Thread Madper Xie
Hi all,
   With kernel3.12-rc2 the dmesg shows following logs:
[   12.864680] 
[   12.864681] [ BUG: lock held when returning to user space! ]
[   12.864682] 3.12.0-rc2 #1 Not tainted
[   12.864683] 
[   12.864684] iprinit/719 is leaving the kernel with locks still held!
[   12.864685] 1 lock held by iprinit/719:
[   12.864686]  #0:  (sdp-o_sem){.+.+..}, at: [a050de05] 
sg_open+0x4b5/0x644 [sg]
[   12.934954] ath9k :01:00.0: enabling device ( - 0002)
[   12.940346] ath: phy0: timeout (1000 us) on reg 0x15f18: 0x  
0x0007 != 0x0004
[   12.943125] ath: EEPROM regdomain: 0x60
[   12.943127] ath: EEPROM indicates we should expect a direct regpair map
[   12.943129] ath: Country alpha2 being used: 00
[   12.943130] ath: Regpair used: 0x60
[   12.960202] r8169 :02:00.0 p3p1: link down
[   12.960236] r8169 :02:00.0 p3p1: link down
[   12.960256] IPv6: ADDRCONF(NETDEV_UP): p3p1: link is not ready
[   13.003523] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
[   13.003886] ieee80211 phy0: Atheros AR9485 Rev:1 mem=0xc9000bc8, 
irq=16
[   13.012120] ip6_tables: (C) 2000-2006 Netfilter Core Team
[   13.023667] IPv6: ADDRCONF(NETDEV_UP): wlan0: link is not ready
[   13.055802] Ebtables v2.0 registered
[   13.192291] nf_conntrack version 0.5.0 (16384 buckets, 65536 max)
[   15.906392] r8169 :02:00.0 p3p1: link up
[   15.906416] IPv6: ADDRCONF(NETDEV_CHANGE): p3p1: link becomes ready
[   17.121989] systemd-udevd (334) used greatest stack depth: 3352 bytes left

I'm working on finding which version bring this bug in.
-- 
Best,
Madper Xie.
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-08-01 Thread Davidlohr Bueso
On Sat, 2013-07-27 at 21:34 +0800, Fengguang Wu wrote:
> On Sat, Jul 20, 2013 at 09:46:45AM -0700, Davidlohr Bueso wrote:
> > On Sun, 2013-07-21 at 00:02 +0800, Xiaotian Feng wrote:
> > > On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu  
> > > wrote:
> > > > Greetings,
> > > >
> > > > I got the below dmesg and the first bad commit is
> > > >
> > > > commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
> > > > Author: Davidlohr Bueso 
> > > > Date:   Fri Jul 19 09:56:58 2013 +1000
> > > >
> > > > ipc,shm: shorten critical region for shmat
> > > >
> > > > Similar to other system calls, acquire the kern_ipc_perm lock after 
> > > > doing
> > > > the initial permission and security checks.
> > > >
> > > > Signed-off-by: Davidlohr Bueso 
> > > > Tested-by: Sedat Dilek 
> > > >     Cc: Rik van Riel 
> > > > Cc: Manfred Spraul 
> > > > Signed-off-by: Andrew Morton 
> > > >
> > > > [   20.702156]
> > > > [   20.702493] 
> > > > [   20.703511] [ BUG: lock held when returning to user space! ]
> > > > [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
> > > > [   20.705416] 
> > > > [   20.706425] trinity-child0/174 is leaving the kernel with locks 
> > > > still held!
> > > > [   20.707638] 1 lock held by trinity-child0/174:
> > > > [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [] 
> > > > do_shmat+0xe1/0x500
> > > >
> > > 
> > > 
> > > ns = current->nsproxy->ipc_ns;
> > > - shp = shm_lock_check(ns, shmid);
> > > + rcu_read_lock();
> > > + shp = shm_obtain_object_check(ns, shmid);
> > > if (IS_ERR(shp)) {
> > > err = PTR_ERR(shp);
> > > goto out;
> > > 
> > > 
> > > If shm_obtain_object_check() failed, goto out will return with
> > > rcu_read_lock() held.  I think following patch should cure this.
> > 
> > Yep that should solve it, sorry about that. Sasha Levin sent out a fix
> > for it yesterday (offline).
> 
> What's the patch's status? The bug is still there in linux-next 20130726.

Andrew, unless you have an objection (or have already done so), could
you pickup Sasha's fix?

Thanks,
Davidlohr

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-08-01 Thread Davidlohr Bueso
On Sat, 2013-07-27 at 21:34 +0800, Fengguang Wu wrote:
 On Sat, Jul 20, 2013 at 09:46:45AM -0700, Davidlohr Bueso wrote:
  On Sun, 2013-07-21 at 00:02 +0800, Xiaotian Feng wrote:
   On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu fengguang...@intel.com 
   wrote:
Greetings,
   
I got the below dmesg and the first bad commit is
   
commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
Author: Davidlohr Bueso davidlohr.bu...@hp.com
Date:   Fri Jul 19 09:56:58 2013 +1000
   
ipc,shm: shorten critical region for shmat
   
Similar to other system calls, acquire the kern_ipc_perm lock after 
doing
the initial permission and security checks.
   
Signed-off-by: Davidlohr Bueso davidlohr.bu...@hp.com
Tested-by: Sedat Dilek sedat.di...@gmail.com
Cc: Rik van Riel r...@redhat.com
Cc: Manfred Spraul manf...@colorfullife.com
Signed-off-by: Andrew Morton a...@linux-foundation.org
   
[   20.702156]
[   20.702493] 
[   20.703511] [ BUG: lock held when returning to user space! ]
[   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
[   20.705416] 
[   20.706425] trinity-child0/174 is leaving the kernel with locks 
still held!
[   20.707638] 1 lock held by trinity-child0/174:
[   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [814a8491] 
do_shmat+0xe1/0x500
   
   
   
   ns = current-nsproxy-ipc_ns;
   - shp = shm_lock_check(ns, shmid);
   + rcu_read_lock();
   + shp = shm_obtain_object_check(ns, shmid);
   if (IS_ERR(shp)) {
   err = PTR_ERR(shp);
   goto out;
   
   
   If shm_obtain_object_check() failed, goto out will return with
   rcu_read_lock() held.  I think following patch should cure this.
  
  Yep that should solve it, sorry about that. Sasha Levin sent out a fix
  for it yesterday (offline).
 
 What's the patch's status? The bug is still there in linux-next 20130726.

Andrew, unless you have an objection (or have already done so), could
you pickup Sasha's fix?

Thanks,
Davidlohr

--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-07-27 Thread Fengguang Wu
On Sat, Jul 20, 2013 at 09:46:45AM -0700, Davidlohr Bueso wrote:
> On Sun, 2013-07-21 at 00:02 +0800, Xiaotian Feng wrote:
> > On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu  
> > wrote:
> > > Greetings,
> > >
> > > I got the below dmesg and the first bad commit is
> > >
> > > commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
> > > Author: Davidlohr Bueso 
> > > Date:   Fri Jul 19 09:56:58 2013 +1000
> > >
> > > ipc,shm: shorten critical region for shmat
> > >
> > > Similar to other system calls, acquire the kern_ipc_perm lock after 
> > > doing
> > > the initial permission and security checks.
> > >
> > > Signed-off-by: Davidlohr Bueso 
> > > Tested-by: Sedat Dilek 
> > > Cc: Rik van Riel 
> > > Cc: Manfred Spraul 
> > >     Signed-off-by: Andrew Morton 
> > >
> > > [   20.702156]
> > > [   20.702493] 
> > > [   20.703511] [ BUG: lock held when returning to user space! ]
> > > [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
> > > [   20.705416] 
> > > [   20.706425] trinity-child0/174 is leaving the kernel with locks still 
> > > held!
> > > [   20.707638] 1 lock held by trinity-child0/174:
> > > [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [] 
> > > do_shmat+0xe1/0x500
> > >
> > 
> > 
> > ns = current->nsproxy->ipc_ns;
> > - shp = shm_lock_check(ns, shmid);
> > + rcu_read_lock();
> > + shp = shm_obtain_object_check(ns, shmid);
> > if (IS_ERR(shp)) {
> > err = PTR_ERR(shp);
> > goto out;
> > 
> > 
> > If shm_obtain_object_check() failed, goto out will return with
> > rcu_read_lock() held.  I think following patch should cure this.
> 
> Yep that should solve it, sorry about that. Sasha Levin sent out a fix
> for it yesterday (offline).

What's the patch's status? The bug is still there in linux-next 20130726.

Thanks,
Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-07-27 Thread Fengguang Wu
On Sat, Jul 20, 2013 at 09:46:45AM -0700, Davidlohr Bueso wrote:
 On Sun, 2013-07-21 at 00:02 +0800, Xiaotian Feng wrote:
  On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu fengguang...@intel.com 
  wrote:
   Greetings,
  
   I got the below dmesg and the first bad commit is
  
   commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
   Author: Davidlohr Bueso davidlohr.bu...@hp.com
   Date:   Fri Jul 19 09:56:58 2013 +1000
  
   ipc,shm: shorten critical region for shmat
  
   Similar to other system calls, acquire the kern_ipc_perm lock after 
   doing
   the initial permission and security checks.
  
   Signed-off-by: Davidlohr Bueso davidlohr.bu...@hp.com
   Tested-by: Sedat Dilek sedat.di...@gmail.com
   Cc: Rik van Riel r...@redhat.com
   Cc: Manfred Spraul manf...@colorfullife.com
   Signed-off-by: Andrew Morton a...@linux-foundation.org
  
   [   20.702156]
   [   20.702493] 
   [   20.703511] [ BUG: lock held when returning to user space! ]
   [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
   [   20.705416] 
   [   20.706425] trinity-child0/174 is leaving the kernel with locks still 
   held!
   [   20.707638] 1 lock held by trinity-child0/174:
   [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [814a8491] 
   do_shmat+0xe1/0x500
  
  
  
  ns = current-nsproxy-ipc_ns;
  - shp = shm_lock_check(ns, shmid);
  + rcu_read_lock();
  + shp = shm_obtain_object_check(ns, shmid);
  if (IS_ERR(shp)) {
  err = PTR_ERR(shp);
  goto out;
  
  
  If shm_obtain_object_check() failed, goto out will return with
  rcu_read_lock() held.  I think following patch should cure this.
 
 Yep that should solve it, sorry about that. Sasha Levin sent out a fix
 for it yesterday (offline).

What's the patch's status? The bug is still there in linux-next 20130726.

Thanks,
Fengguang
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-07-20 Thread Davidlohr Bueso
On Sun, 2013-07-21 at 00:02 +0800, Xiaotian Feng wrote:
> On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu  wrote:
> > Greetings,
> >
> > I got the below dmesg and the first bad commit is
> >
> > commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
> > Author: Davidlohr Bueso 
> > Date:   Fri Jul 19 09:56:58 2013 +1000
> >
> > ipc,shm: shorten critical region for shmat
> >
> > Similar to other system calls, acquire the kern_ipc_perm lock after 
> > doing
> > the initial permission and security checks.
> >
> > Signed-off-by: Davidlohr Bueso 
> > Tested-by: Sedat Dilek 
> > Cc: Rik van Riel 
> > Cc: Manfred Spraul 
> > Signed-off-by: Andrew Morton 
> >
> > [   20.702156]
> > [   20.702493] 
> > [   20.703511] [ BUG: lock held when returning to user space! ]
> > [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
> > [   20.705416] 
> > [   20.706425] trinity-child0/174 is leaving the kernel with locks still 
> > held!
> > [   20.707638] 1 lock held by trinity-child0/174:
> > [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [] 
> > do_shmat+0xe1/0x500
> >
> 
> 
> ns = current->nsproxy->ipc_ns;
> - shp = shm_lock_check(ns, shmid);
> + rcu_read_lock();
> + shp = shm_obtain_object_check(ns, shmid);
> if (IS_ERR(shp)) {
> err = PTR_ERR(shp);
> goto out;
> 
> 
> If shm_obtain_object_check() failed, goto out will return with
> rcu_read_lock() held.  I think following patch should cure this.

Yep that should solve it, sorry about that. Sasha Levin sent out a fix
for it yesterday (offline).

Thanks,
Davidlohr

> 
> diff --git a/ipc/shm.c b/ipc/shm.c
> index 59f2194..cb2ceda 100644
> --- a/ipc/shm.c
> +++ b/ipc/shm.c
> @@ -1093,7 +1093,7 @@ long do_shmat(int shmid, char __user *shmaddr,
> int shmflg, ulong *raddr,
>   shp = shm_obtain_object_check(ns, shmid);
>   if (IS_ERR(shp)) {
>   err = PTR_ERR(shp);
> - goto out;
> + goto out_unlock;
>   }
> 
>   err = -EACCES;
> 
> 
> 
> 
> > git bisect start c1f631b9a68251007a6353041ae90f9f7dca771c 
> > d03792f9db9b892f494d3aa19d767ddf0365d1ff --
> > git bisect good 10a3f1f902465ae1320cc95a3284fd3697e05dd8  # 11:14 65+  
> > binfmt_elf.c: use get_random_int() to fix entropy depleting
> > git bisect  bad dac28788378838efb63e37a7eabd7729d97aba6b  # 11:32  0-  
> > dcache: remove dentries from LRU before putting on dispose list
> > git bisect good 3140b2ed6dfe5c9e5eca371c77ca85dca05321d4  # 11:50 65+  
> > ipc,shm: introduce shmctl_nolock
> > git bisect  bad 48a91248649fa3327bd8a31c114ee9149a07f3a7  # 12:04  0-  
> > staging/lustre/ldlm: convert to shrinkers to count/scan API
> > git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 12:14 65+  
> > ipc,shm: cleanup do_shmat pasta
> > git bisect  bad 36ccfd799cad33e2edd5c14ac8776b33e63d195b  # 12:14  0-  
> > ipc: rename ids->rw_mutex
> > git bisect  bad c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2  # 12:14  0-  
> > ipc,shm: shorten critical region for shmat
> > git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 15:34195+  
> > ipc,shm: cleanup do_shmat pasta
> > git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 15:34  0-  
> > Add linux-next specific files for 20130719
> > git bisect good 709b465ee655387c4ec056383fa27f16c64f48db  # 18:21195+  
> > Revert "ipc,shm: shorten critical region for shmat"
> > git bisect good d471ce53b1fab60110e4e9f647a345cea31752de  # 18:44195+  
> > Merge branch 'for-linus' of 
> > git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
> > git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 18:44  0-  
> > Add linux-next specific files for 20130719
> >
> > Thanks,
> > Fengguang


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-07-20 Thread Xiaotian Feng
On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu  wrote:
> Greetings,
>
> I got the below dmesg and the first bad commit is
>
> commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
> Author: Davidlohr Bueso 
> Date:   Fri Jul 19 09:56:58 2013 +1000
>
> ipc,shm: shorten critical region for shmat
>
> Similar to other system calls, acquire the kern_ipc_perm lock after doing
> the initial permission and security checks.
>
> Signed-off-by: Davidlohr Bueso 
> Tested-by: Sedat Dilek 
> Cc: Rik van Riel 
> Cc: Manfred Spraul 
> Signed-off-by: Andrew Morton 
>
> [   20.702156]
> [   20.702493] ============
> [   20.703511] [ BUG: lock held when returning to user space! ]
> [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
> [   20.705416] 
> [   20.706425] trinity-child0/174 is leaving the kernel with locks still held!
> [   20.707638] 1 lock held by trinity-child0/174:
> [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [] 
> do_shmat+0xe1/0x500
>


ns = current->nsproxy->ipc_ns;
- shp = shm_lock_check(ns, shmid);
+ rcu_read_lock();
+ shp = shm_obtain_object_check(ns, shmid);
if (IS_ERR(shp)) {
err = PTR_ERR(shp);
goto out;


If shm_obtain_object_check() failed, goto out will return with
rcu_read_lock() held.  I think following patch should cure this.

diff --git a/ipc/shm.c b/ipc/shm.c
index 59f2194..cb2ceda 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1093,7 +1093,7 @@ long do_shmat(int shmid, char __user *shmaddr,
int shmflg, ulong *raddr,
  shp = shm_obtain_object_check(ns, shmid);
  if (IS_ERR(shp)) {
  err = PTR_ERR(shp);
- goto out;
+ goto out_unlock;
  }

  err = -EACCES;




> git bisect start c1f631b9a68251007a6353041ae90f9f7dca771c 
> d03792f9db9b892f494d3aa19d767ddf0365d1ff --
> git bisect good 10a3f1f902465ae1320cc95a3284fd3697e05dd8  # 11:14 65+  
> binfmt_elf.c: use get_random_int() to fix entropy depleting
> git bisect  bad dac28788378838efb63e37a7eabd7729d97aba6b  # 11:32  0-  
> dcache: remove dentries from LRU before putting on dispose list
> git bisect good 3140b2ed6dfe5c9e5eca371c77ca85dca05321d4  # 11:50 65+  
> ipc,shm: introduce shmctl_nolock
> git bisect  bad 48a91248649fa3327bd8a31c114ee9149a07f3a7  # 12:04  0-  
> staging/lustre/ldlm: convert to shrinkers to count/scan API
> git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 12:14 65+  
> ipc,shm: cleanup do_shmat pasta
> git bisect  bad 36ccfd799cad33e2edd5c14ac8776b33e63d195b  # 12:14  0-  
> ipc: rename ids->rw_mutex
> git bisect  bad c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2  # 12:14  0-  
> ipc,shm: shorten critical region for shmat
> git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 15:34195+  
> ipc,shm: cleanup do_shmat pasta
> git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 15:34  0-  
> Add linux-next specific files for 20130719
> git bisect good 709b465ee655387c4ec056383fa27f16c64f48db  # 18:21195+  
> Revert "ipc,shm: shorten critical region for shmat"
> git bisect good d471ce53b1fab60110e4e9f647a345cea31752de  # 18:44195+  
> Merge branch 'for-linus' of 
> git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
> git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 18:44  0-  
> Add linux-next specific files for 20130719
>
> Thanks,
> Fengguang
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-07-20 Thread Xiaotian Feng
On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu fengguang...@intel.com wrote:
 Greetings,

 I got the below dmesg and the first bad commit is

 commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
 Author: Davidlohr Bueso davidlohr.bu...@hp.com
 Date:   Fri Jul 19 09:56:58 2013 +1000

 ipc,shm: shorten critical region for shmat

 Similar to other system calls, acquire the kern_ipc_perm lock after doing
 the initial permission and security checks.

 Signed-off-by: Davidlohr Bueso davidlohr.bu...@hp.com
 Tested-by: Sedat Dilek sedat.di...@gmail.com
 Cc: Rik van Riel r...@redhat.com
 Cc: Manfred Spraul manf...@colorfullife.com
 Signed-off-by: Andrew Morton a...@linux-foundation.org

 [   20.702156]
 [   20.702493] 
 [   20.703511] [ BUG: lock held when returning to user space! ]
 [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
 [   20.705416] 
 [   20.706425] trinity-child0/174 is leaving the kernel with locks still held!
 [   20.707638] 1 lock held by trinity-child0/174:
 [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [814a8491] 
 do_shmat+0xe1/0x500



ns = current-nsproxy-ipc_ns;
- shp = shm_lock_check(ns, shmid);
+ rcu_read_lock();
+ shp = shm_obtain_object_check(ns, shmid);
if (IS_ERR(shp)) {
err = PTR_ERR(shp);
goto out;


If shm_obtain_object_check() failed, goto out will return with
rcu_read_lock() held.  I think following patch should cure this.

diff --git a/ipc/shm.c b/ipc/shm.c
index 59f2194..cb2ceda 100644
--- a/ipc/shm.c
+++ b/ipc/shm.c
@@ -1093,7 +1093,7 @@ long do_shmat(int shmid, char __user *shmaddr,
int shmflg, ulong *raddr,
  shp = shm_obtain_object_check(ns, shmid);
  if (IS_ERR(shp)) {
  err = PTR_ERR(shp);
- goto out;
+ goto out_unlock;
  }

  err = -EACCES;




 git bisect start c1f631b9a68251007a6353041ae90f9f7dca771c 
 d03792f9db9b892f494d3aa19d767ddf0365d1ff --
 git bisect good 10a3f1f902465ae1320cc95a3284fd3697e05dd8  # 11:14 65+  
 binfmt_elf.c: use get_random_int() to fix entropy depleting
 git bisect  bad dac28788378838efb63e37a7eabd7729d97aba6b  # 11:32  0-  
 dcache: remove dentries from LRU before putting on dispose list
 git bisect good 3140b2ed6dfe5c9e5eca371c77ca85dca05321d4  # 11:50 65+  
 ipc,shm: introduce shmctl_nolock
 git bisect  bad 48a91248649fa3327bd8a31c114ee9149a07f3a7  # 12:04  0-  
 staging/lustre/ldlm: convert to shrinkers to count/scan API
 git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 12:14 65+  
 ipc,shm: cleanup do_shmat pasta
 git bisect  bad 36ccfd799cad33e2edd5c14ac8776b33e63d195b  # 12:14  0-  
 ipc: rename ids-rw_mutex
 git bisect  bad c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2  # 12:14  0-  
 ipc,shm: shorten critical region for shmat
 git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 15:34195+  
 ipc,shm: cleanup do_shmat pasta
 git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 15:34  0-  
 Add linux-next specific files for 20130719
 git bisect good 709b465ee655387c4ec056383fa27f16c64f48db  # 18:21195+  
 Revert ipc,shm: shorten critical region for shmat
 git bisect good d471ce53b1fab60110e4e9f647a345cea31752de  # 18:44195+  
 Merge branch 'for-linus' of 
 git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
 git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 18:44  0-  
 Add linux-next specific files for 20130719

 Thanks,
 Fengguang
--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [ipc,shm] BUG: lock held when returning to user space!

2013-07-20 Thread Davidlohr Bueso
On Sun, 2013-07-21 at 00:02 +0800, Xiaotian Feng wrote:
 On Sat, Jul 20, 2013 at 9:13 PM, Fengguang Wu fengguang...@intel.com wrote:
  Greetings,
 
  I got the below dmesg and the first bad commit is
 
  commit c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2
  Author: Davidlohr Bueso davidlohr.bu...@hp.com
  Date:   Fri Jul 19 09:56:58 2013 +1000
 
  ipc,shm: shorten critical region for shmat
 
  Similar to other system calls, acquire the kern_ipc_perm lock after 
  doing
  the initial permission and security checks.
 
  Signed-off-by: Davidlohr Bueso davidlohr.bu...@hp.com
  Tested-by: Sedat Dilek sedat.di...@gmail.com
  Cc: Rik van Riel r...@redhat.com
  Cc: Manfred Spraul manf...@colorfullife.com
  Signed-off-by: Andrew Morton a...@linux-foundation.org
 
  [   20.702156]
  [   20.702493] 
  [   20.703511] [ BUG: lock held when returning to user space! ]
  [   20.704532] 3.11.0-rc1-next-20130719 #50 Not tainted
  [   20.705416] 
  [   20.706425] trinity-child0/174 is leaving the kernel with locks still 
  held!
  [   20.707638] 1 lock held by trinity-child0/174:
  [   20.708475]  #0:  (rcu_read_lock){.+.+..}, at: [814a8491] 
  do_shmat+0xe1/0x500
 
 
 
 ns = current-nsproxy-ipc_ns;
 - shp = shm_lock_check(ns, shmid);
 + rcu_read_lock();
 + shp = shm_obtain_object_check(ns, shmid);
 if (IS_ERR(shp)) {
 err = PTR_ERR(shp);
 goto out;
 
 
 If shm_obtain_object_check() failed, goto out will return with
 rcu_read_lock() held.  I think following patch should cure this.

Yep that should solve it, sorry about that. Sasha Levin sent out a fix
for it yesterday (offline).

Thanks,
Davidlohr

 
 diff --git a/ipc/shm.c b/ipc/shm.c
 index 59f2194..cb2ceda 100644
 --- a/ipc/shm.c
 +++ b/ipc/shm.c
 @@ -1093,7 +1093,7 @@ long do_shmat(int shmid, char __user *shmaddr,
 int shmflg, ulong *raddr,
   shp = shm_obtain_object_check(ns, shmid);
   if (IS_ERR(shp)) {
   err = PTR_ERR(shp);
 - goto out;
 + goto out_unlock;
   }
 
   err = -EACCES;
 
 
 
 
  git bisect start c1f631b9a68251007a6353041ae90f9f7dca771c 
  d03792f9db9b892f494d3aa19d767ddf0365d1ff --
  git bisect good 10a3f1f902465ae1320cc95a3284fd3697e05dd8  # 11:14 65+  
  binfmt_elf.c: use get_random_int() to fix entropy depleting
  git bisect  bad dac28788378838efb63e37a7eabd7729d97aba6b  # 11:32  0-  
  dcache: remove dentries from LRU before putting on dispose list
  git bisect good 3140b2ed6dfe5c9e5eca371c77ca85dca05321d4  # 11:50 65+  
  ipc,shm: introduce shmctl_nolock
  git bisect  bad 48a91248649fa3327bd8a31c114ee9149a07f3a7  # 12:04  0-  
  staging/lustre/ldlm: convert to shrinkers to count/scan API
  git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 12:14 65+  
  ipc,shm: cleanup do_shmat pasta
  git bisect  bad 36ccfd799cad33e2edd5c14ac8776b33e63d195b  # 12:14  0-  
  ipc: rename ids-rw_mutex
  git bisect  bad c5d0282a0405b0a81fa3390e4230e4cbb3ced7a2  # 12:14  0-  
  ipc,shm: shorten critical region for shmat
  git bisect good 98b78126a51aa5d3ee6d5dae5768e0d16deeeaa3  # 15:34195+  
  ipc,shm: cleanup do_shmat pasta
  git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 15:34  0-  
  Add linux-next specific files for 20130719
  git bisect good 709b465ee655387c4ec056383fa27f16c64f48db  # 18:21195+  
  Revert ipc,shm: shorten critical region for shmat
  git bisect good d471ce53b1fab60110e4e9f647a345cea31752de  # 18:44195+  
  Merge branch 'for-linus' of 
  git://git.kernel.org/pub/scm/linux/kernel/git/rw/uml
  git bisect  bad c1f631b9a68251007a6353041ae90f9f7dca771c  # 18:44  0-  
  Add linux-next specific files for 20130719
 
  Thanks,
  Fengguang


--
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [rtc-linux] Re: BUG: lock held when returning to user space

2007-10-29 Thread Alessandro Zummo
On Sun, 28 Oct 2007 00:47:15 +0200 (CEST)
Jiri Kosina <[EMAIL PROTECTED]> wrote:

> Yes, but the fact is that is really is invalid use of mutex -- because the 
> mutex owner could become seriously wrong after fork() or sending the 
> filedescriptor through unix socket ... this easily leads to broken 
> situation.
> 
> This seems to have been introduced in e824290e5d ... Alessandro, could you 
> convert this to test_and_set_bit()/clear_bit() semantics instead of a 
> mutex please?

 Hi Jiri,
   I was away for the weekend and just saw the whole discussion
 and your patch and I'm happy to ack it. Thank you very much
 a thanks to Gabriel for the bug report!
 

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Torino, Italy

  http://www.towertech.it

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-29 Thread Alessandro Zummo
On Sun, 28 Oct 2007 12:12:19 +0100 (CET)
Jiri Kosina <[EMAIL PROTECTED]> wrote:

> On Sat, 27 Oct 2007, Andrew Morton wrote:
> 
> > It's a fairly daft thing to do.  I think it'd be saner to teach rtc about
> > test_and_set_bit() personally..  
> 
> 
> From: Jiri Kosina <[EMAIL PROTECTED]>
> 
> RTC: convert mutex to bitfield
> 
> RTC code is using mutex to assure exclusive access to /dev/rtc.
> This is however wrong usage, as it leaves the mutex locked when
> returning into userspace, which is unacceptable.
> 
> Convert rtc->char_lock into bit operation.
> 
> Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>


 Acked-by: Alessandro Zummo <[EMAIL PROTECTED]>

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Torino, Italy

  http://www.towertech.it

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [rtc-linux] Re: BUG: lock held when returning to user space

2007-10-29 Thread Alessandro Zummo
On Sun, 28 Oct 2007 00:47:15 +0200 (CEST)
Jiri Kosina [EMAIL PROTECTED] wrote:

 Yes, but the fact is that is really is invalid use of mutex -- because the 
 mutex owner could become seriously wrong after fork() or sending the 
 filedescriptor through unix socket ... this easily leads to broken 
 situation.
 
 This seems to have been introduced in e824290e5d ... Alessandro, could you 
 convert this to test_and_set_bit()/clear_bit() semantics instead of a 
 mutex please?

 Hi Jiri,
   I was away for the weekend and just saw the whole discussion
 and your patch and I'm happy to ack it. Thank you very much
 a thanks to Gabriel for the bug report!
 

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Torino, Italy

  http://www.towertech.it

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-29 Thread Alessandro Zummo
On Sun, 28 Oct 2007 12:12:19 +0100 (CET)
Jiri Kosina [EMAIL PROTECTED] wrote:

 On Sat, 27 Oct 2007, Andrew Morton wrote:
 
  It's a fairly daft thing to do.  I think it'd be saner to teach rtc about
  test_and_set_bit() personally..  
 
 
 From: Jiri Kosina [EMAIL PROTECTED]
 
 RTC: convert mutex to bitfield
 
 RTC code is using mutex to assure exclusive access to /dev/rtc.
 This is however wrong usage, as it leaves the mutex locked when
 returning into userspace, which is unacceptable.
 
 Convert rtc-char_lock into bit operation.
 
 Signed-off-by: Jiri Kosina [EMAIL PROTECTED]


 Acked-by: Alessandro Zummo [EMAIL PROTECTED]

-- 

 Best regards,

 Alessandro Zummo,
  Tower Technologies - Torino, Italy

  http://www.towertech.it

-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-28 Thread Jiri Kosina
(CC list stripped)

On Sat, 27 Oct 2007, Andrew Morton wrote:

> It's a fairly daft thing to do.  I think it'd be saner to teach rtc about
> test_and_set_bit() personally..


From: Jiri Kosina <[EMAIL PROTECTED]>

RTC: convert mutex to bitfield

RTC code is using mutex to assure exclusive access to /dev/rtc.
This is however wrong usage, as it leaves the mutex locked when
returning into userspace, which is unacceptable.

Convert rtc->char_lock into bit operation.

Signed-off-by: Jiri Kosina <[EMAIL PROTECTED]>

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index de0da54..a4f56e9 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -293,7 +293,7 @@ int rtc_irq_register(struct rtc_device *rtc, struct 
rtc_task *task)
return -EINVAL;
 
/* Cannot register while the char dev is in use */
-   if (!(mutex_trylock(>char_lock)))
+   if (test_and_set_bit(RTC_DEV_BUSY, >flags))
return -EBUSY;
 
spin_lock_irq(>irq_task_lock);
@@ -303,7 +303,7 @@ int rtc_irq_register(struct rtc_device *rtc, struct 
rtc_task *task)
}
spin_unlock_irq(>irq_task_lock);
 
-   mutex_unlock(>char_lock);
+   clear_bit(RTC_DEV_BUSY, >flags);
 
return retval;
 }
diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c
index 814583b..ae1bf17 100644
--- a/drivers/rtc/rtc-dev.c
+++ b/drivers/rtc/rtc-dev.c
@@ -26,10 +26,7 @@ static int rtc_dev_open(struct inode *inode, struct file 
*file)
struct rtc_device, char_dev);
const struct rtc_class_ops *ops = rtc->ops;
 
-   /* We keep the lock as long as the device is in use
-* and return immediately if busy
-*/
-   if (!(mutex_trylock(>char_lock)))
+   if (test_and_set_bit(RTC_DEV_BUSY, >flags))
return -EBUSY;
 
file->private_data = rtc;
@@ -43,8 +40,8 @@ static int rtc_dev_open(struct inode *inode, struct file 
*file)
return 0;
}
 
-   /* something has gone wrong, release the lock */
-   mutex_unlock(>char_lock);
+   /* something has gone wrong */
+   clear_bit(RTC_DEV_BUSY, >flags);
return err;
 }
 
@@ -405,7 +402,7 @@ static int rtc_dev_release(struct inode *inode, struct file 
*file)
if (rtc->ops->release)
rtc->ops->release(rtc->dev.parent);
 
-   mutex_unlock(>char_lock);
+   clear_bit(RTC_DEV_BUSY, >flags);
return 0;
 }
 
@@ -440,7 +437,6 @@ void rtc_dev_prepare(struct rtc_device *rtc)
 
rtc->dev.devt = MKDEV(MAJOR(rtc_devt), rtc->id);
 
-   mutex_init(>char_lock);
 #ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
INIT_WORK(>uie_task, rtc_uie_task);
setup_timer(>uie_timer, rtc_uie_timer, (unsigned long)rtc);
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 6d5e4a4..f2d0d15 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -133,6 +133,9 @@ struct rtc_class_ops {
 #define RTC_DEVICE_NAME_SIZE 20
 struct rtc_task;
 
+/* flags */
+#define RTC_DEV_BUSY 0
+
 struct rtc_device
 {
struct device dev;
@@ -145,7 +148,7 @@ struct rtc_device
struct mutex ops_lock;
 
struct cdev char_dev;
-   struct mutex char_lock;
+   unsigned long flags;
 
unsigned long irq_data;
spinlock_t irq_lock;
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-28 Thread Jiri Kosina
(CC list stripped)

On Sat, 27 Oct 2007, Andrew Morton wrote:

 It's a fairly daft thing to do.  I think it'd be saner to teach rtc about
 test_and_set_bit() personally..


From: Jiri Kosina [EMAIL PROTECTED]

RTC: convert mutex to bitfield

RTC code is using mutex to assure exclusive access to /dev/rtc.
This is however wrong usage, as it leaves the mutex locked when
returning into userspace, which is unacceptable.

Convert rtc-char_lock into bit operation.

Signed-off-by: Jiri Kosina [EMAIL PROTECTED]

diff --git a/drivers/rtc/interface.c b/drivers/rtc/interface.c
index de0da54..a4f56e9 100644
--- a/drivers/rtc/interface.c
+++ b/drivers/rtc/interface.c
@@ -293,7 +293,7 @@ int rtc_irq_register(struct rtc_device *rtc, struct 
rtc_task *task)
return -EINVAL;
 
/* Cannot register while the char dev is in use */
-   if (!(mutex_trylock(rtc-char_lock)))
+   if (test_and_set_bit(RTC_DEV_BUSY, rtc-flags))
return -EBUSY;
 
spin_lock_irq(rtc-irq_task_lock);
@@ -303,7 +303,7 @@ int rtc_irq_register(struct rtc_device *rtc, struct 
rtc_task *task)
}
spin_unlock_irq(rtc-irq_task_lock);
 
-   mutex_unlock(rtc-char_lock);
+   clear_bit(RTC_DEV_BUSY, rtc-flags);
 
return retval;
 }
diff --git a/drivers/rtc/rtc-dev.c b/drivers/rtc/rtc-dev.c
index 814583b..ae1bf17 100644
--- a/drivers/rtc/rtc-dev.c
+++ b/drivers/rtc/rtc-dev.c
@@ -26,10 +26,7 @@ static int rtc_dev_open(struct inode *inode, struct file 
*file)
struct rtc_device, char_dev);
const struct rtc_class_ops *ops = rtc-ops;
 
-   /* We keep the lock as long as the device is in use
-* and return immediately if busy
-*/
-   if (!(mutex_trylock(rtc-char_lock)))
+   if (test_and_set_bit(RTC_DEV_BUSY, rtc-flags))
return -EBUSY;
 
file-private_data = rtc;
@@ -43,8 +40,8 @@ static int rtc_dev_open(struct inode *inode, struct file 
*file)
return 0;
}
 
-   /* something has gone wrong, release the lock */
-   mutex_unlock(rtc-char_lock);
+   /* something has gone wrong */
+   clear_bit(RTC_DEV_BUSY, rtc-flags);
return err;
 }
 
@@ -405,7 +402,7 @@ static int rtc_dev_release(struct inode *inode, struct file 
*file)
if (rtc-ops-release)
rtc-ops-release(rtc-dev.parent);
 
-   mutex_unlock(rtc-char_lock);
+   clear_bit(RTC_DEV_BUSY, rtc-flags);
return 0;
 }
 
@@ -440,7 +437,6 @@ void rtc_dev_prepare(struct rtc_device *rtc)
 
rtc-dev.devt = MKDEV(MAJOR(rtc_devt), rtc-id);
 
-   mutex_init(rtc-char_lock);
 #ifdef CONFIG_RTC_INTF_DEV_UIE_EMUL
INIT_WORK(rtc-uie_task, rtc_uie_task);
setup_timer(rtc-uie_timer, rtc_uie_timer, (unsigned long)rtc);
diff --git a/include/linux/rtc.h b/include/linux/rtc.h
index 6d5e4a4..f2d0d15 100644
--- a/include/linux/rtc.h
+++ b/include/linux/rtc.h
@@ -133,6 +133,9 @@ struct rtc_class_ops {
 #define RTC_DEVICE_NAME_SIZE 20
 struct rtc_task;
 
+/* flags */
+#define RTC_DEV_BUSY 0
+
 struct rtc_device
 {
struct device dev;
@@ -145,7 +148,7 @@ struct rtc_device
struct mutex ops_lock;
 
struct cdev char_dev;
-   struct mutex char_lock;
+   unsigned long flags;
 
unsigned long irq_data;
spinlock_t irq_lock;
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Jiri Kosina
On Sat, 27 Oct 2007, Peter Zijlstra wrote:

> > > [  592.752781] 
> > > [  592.753478] [ BUG: lock held when returning to user space! ]
> > > [  592.753880] 
> > > [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
> > > [  592.754655] 1 lock held by hwclock/1452:
> > > [  592.755007]  #0:  (>char_lock){--..}, at: [] 
> > > rtc_dev_open+0x2e/0x7e
> > Yes, this is because rtc keeps a char_lock mutex locked as long as the 
> > device is open, to avoid concurrent accessess.
> I think, in this case, the lock is associated with a kernel object that
> is properly cleaned up if the holding tasks gets a SIGKILL. But in
> general I'd like to see this kind of thing go away.

Yes, but the fact is that is really is invalid use of mutex -- because the 
mutex owner could become seriously wrong after fork() or sending the 
filedescriptor through unix socket ... this easily leads to broken 
situation.

This seems to have been introduced in e824290e5d ... Alessandro, could you 
convert this to test_and_set_bit()/clear_bit() semantics instead of a 
mutex please?

Thanks,

-- 
Jiri Kosina
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Arjan van de Ven
On Sat, 27 Oct 2007 18:09:29 +0200
Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> 
> 
> Right, the fd could be transferred using unix sockets or fork(). That
> would indeed seriously break a mutex.
>


or just multi-threaded app..
 
> 


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Linus Torvalds


On Sat, 27 Oct 2007, Peter Zijlstra wrote:
> 
> Now I could probably come up with an annotation to hide it, but what do
> other people think, Ingo, Linus, Andrew, do we want to keep kernel locks
> held over userspace?

Definitely not a sane thing to do. It should use ref-counting and/or a 
single bit to say "busy". 

Linus
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Peter Zijlstra

On Sat, 2007-10-27 at 08:47 -0700, Arjan van de Ven wrote:
> On Sat, 27 Oct 2007 17:12:41 +0200 (CEST)
> Jiri Kosina <[EMAIL PROTECTED]> wrote:
> 
> > On Sat, 27 Oct 2007, Gabriel C wrote:
> > 
> > > I found that today in dmesg after booting current git ( 
> > > ec3b67c11df42362ccda81261d62829042f223f0 ) :
> > > ...
> > > [  592.752777]
> > > [  592.752781] ============
> > > [  592.753478] [ BUG: lock held when returning to user space! ]
> > > [  592.753880] 
> > > [  592.754262] hwclock/1452 is leaving the kernel with locks still
> > > held! [  592.754655] 1 lock held by hwclock/1452:
> > > [  592.755007]  #0:  (>char_lock){--..}, at: []
> > > rtc_dev_open+0x2e/0x7e
> > 
> > Yes, this is because rtc keeps a char_lock mutex locked as long as
> > the device is open, to avoid concurrent accessess.
> > 
> > It could be easily substituted by some counting -- setting and
> > clearing bit in struct rtc_device instead of using char_lock, but
> > doing this just to shut the lockdep off is questionable imho.
> 
> it's not about lockdep; what this code doing is not valid use of a
> mutex:
> A mutex is required to have a clear process as owner, and in this case
> it doesn't have that... at all. This is a violation of the kernel mutex
> semantics.. and should be fixed.

Right, the fd could be transferred using unix sockets or fork(). That
would indeed seriously break a mutex.


-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Arjan van de Ven
On Sat, 27 Oct 2007 17:12:41 +0200 (CEST)
Jiri Kosina <[EMAIL PROTECTED]> wrote:

> On Sat, 27 Oct 2007, Gabriel C wrote:
> 
> > I found that today in dmesg after booting current git ( 
> > ec3b67c11df42362ccda81261d62829042f223f0 ) :
> > ...
> > [  592.752777]
> > [  592.752781] ========
> > [  592.753478] [ BUG: lock held when returning to user space! ]
> > [  592.753880] 
> > [  592.754262] hwclock/1452 is leaving the kernel with locks still
> > held! [  592.754655] 1 lock held by hwclock/1452:
> > [  592.755007]  #0:  (>char_lock){--..}, at: []
> > rtc_dev_open+0x2e/0x7e
> 
> Yes, this is because rtc keeps a char_lock mutex locked as long as
> the device is open, to avoid concurrent accessess.
> 
> It could be easily substituted by some counting -- setting and
> clearing bit in struct rtc_device instead of using char_lock, but
> doing this just to shut the lockdep off is questionable imho.

it's not about lockdep; what this code doing is not valid use of a
mutex:
A mutex is required to have a clear process as owner, and in this case
it doesn't have that... at all. This is a violation of the kernel mutex
semantics.. and should be fixed.

> Peter, what is the preferred way to annotate these kinds of locking
> for lockdep to express that it is intended?

the preferred method is to not use a mutex like this...


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Andrew Morton
On Sat, 27 Oct 2007 17:28:41 +0200 Peter Zijlstra <[EMAIL PROTECTED]> wrote:

> 
> On Sat, 2007-10-27 at 17:12 +0200, Jiri Kosina wrote:
> > On Sat, 27 Oct 2007, Gabriel C wrote:
> > 
> > > I found that today in dmesg after booting current git ( 
> > > ec3b67c11df42362ccda81261d62829042f223f0 ) :
> > > ...
> > > [  592.752777]
> > > [  592.752781] ================
> > > [  592.753478] [ BUG: lock held when returning to user space! ]
> > > [  592.753880] 
> > > [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
> > > [  592.754655] 1 lock held by hwclock/1452:
> > > [  592.755007]  #0:  (>char_lock){--..}, at: [] 
> > > rtc_dev_open+0x2e/0x7e
> > 
> > Yes, this is because rtc keeps a char_lock mutex locked as long as the 
> > device is open, to avoid concurrent accessess.
> > 
> > It could be easily substituted by some counting -- setting and clearing 
> > bit in struct rtc_device instead of using char_lock, but doing this just 
> > to shut the lockdep off is questionable imho.
> > 
> > Peter, what is the preferred way to annotate these kinds of locking for 
> > lockdep to express that it is intended?
> 
> Not sure, I'd not thought that anyone would actually want to do this.
> I'm also not sure how I stand on this, I'd prefer to say: don't do this!
> 
> I think, in this case, the lock is associated with a kernel object that
> is properly cleaned up if the holding tasks gets a SIGKILL. But in
> general I'd like to see this kind of thing go away.
> 
> Now I could probably come up with an annotation to hide it, but what do
> other people think, Ingo, Linus, Andrew, do we want to keep kernel locks
> held over userspace?
> 

It's a fairly daft thing to do.  I think it'd be saner to teach rtc about
test_and_set_bit() personally..
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Peter Zijlstra

On Sat, 2007-10-27 at 17:12 +0200, Jiri Kosina wrote:
> On Sat, 27 Oct 2007, Gabriel C wrote:
> 
> > I found that today in dmesg after booting current git ( 
> > ec3b67c11df42362ccda81261d62829042f223f0 ) :
> > ...
> > [  592.752777]
> > [  592.752781] ========
> > [  592.753478] [ BUG: lock held when returning to user space! ]
> > [  592.753880] 
> > [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
> > [  592.754655] 1 lock held by hwclock/1452:
> > [  592.755007]  #0:  (>char_lock){--..}, at: [] 
> > rtc_dev_open+0x2e/0x7e
> 
> Yes, this is because rtc keeps a char_lock mutex locked as long as the 
> device is open, to avoid concurrent accessess.
> 
> It could be easily substituted by some counting -- setting and clearing 
> bit in struct rtc_device instead of using char_lock, but doing this just 
> to shut the lockdep off is questionable imho.
> 
> Peter, what is the preferred way to annotate these kinds of locking for 
> lockdep to express that it is intended?

Not sure, I'd not thought that anyone would actually want to do this.
I'm also not sure how I stand on this, I'd prefer to say: don't do this!

I think, in this case, the lock is associated with a kernel object that
is properly cleaned up if the holding tasks gets a SIGKILL. But in
general I'd like to see this kind of thing go away.

Now I could probably come up with an annotation to hide it, but what do
other people think, Ingo, Linus, Andrew, do we want to keep kernel locks
held over userspace?



-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Jiri Kosina
On Sat, 27 Oct 2007, Gabriel C wrote:

> I found that today in dmesg after booting current git ( 
> ec3b67c11df42362ccda81261d62829042f223f0 ) :
> ...
> [  592.752777]
> [  592.752781] 
> [  592.753478] [ BUG: lock held when returning to user space! ]
> [  592.753880] 
> [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
> [  592.754655] 1 lock held by hwclock/1452:
> [  592.755007]  #0:  (>char_lock){--..}, at: [] 
> rtc_dev_open+0x2e/0x7e

Yes, this is because rtc keeps a char_lock mutex locked as long as the 
device is open, to avoid concurrent accessess.

It could be easily substituted by some counting -- setting and clearing 
bit in struct rtc_device instead of using char_lock, but doing this just 
to shut the lockdep off is questionable imho.

Peter, what is the preferred way to annotate these kinds of locking for 
lockdep to express that it is intended?

-- 
Jiri Kosina
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


BUG: lock held when returning to user space

2007-10-27 Thread Gabriel C
Hi,

I found that today in dmesg after booting current git ( 
ec3b67c11df42362ccda81261d62829042f223f0 ) :

...

[  592.752777]
[  592.752781] 
[  592.753478] [ BUG: lock held when returning to user space! ]
[  592.753880] 
[  592.754262] hwclock/1452 is leaving the kernel with locks still held!
[  592.754655] 1 lock held by hwclock/1452:
[  592.755007]  #0:  (>char_lock){--..}, at: [] 
rtc_dev_open+0x2e/0x7e

...

(gdb) l *rtc_dev_open+0x2e
0xc02a7ebb is in rtc_dev_open (drivers/rtc/rtc-dev.c:32).
27  const struct rtc_class_ops *ops = rtc->ops;
28
29  /* We keep the lock as long as the device is in use
30   * and return immediately if busy
31   */
32  if (!(mutex_trylock(>char_lock)))
33  return -EBUSY;
34
35  file->private_data = rtc;
36
(gdb)


config is attached.

Regards,

Gabriel
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc1
# Sat Oct 27 13:40:58 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config"

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=""
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_FAIR_USER_SCHED=y
# CONFIG_FAIR_CGROUP_SCHED is not set
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_LSF=y
CONFIG_BLK_DEV_BSG=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED="cfq"

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MCORE2 is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC

BUG: lock held when returning to user space

2007-10-27 Thread Gabriel C
Hi,

I found that today in dmesg after booting current git ( 
ec3b67c11df42362ccda81261d62829042f223f0 ) :

...

[  592.752777]
[  592.752781] 
[  592.753478] [ BUG: lock held when returning to user space! ]
[  592.753880] 
[  592.754262] hwclock/1452 is leaving the kernel with locks still held!
[  592.754655] 1 lock held by hwclock/1452:
[  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb] 
rtc_dev_open+0x2e/0x7e

...

(gdb) l *rtc_dev_open+0x2e
0xc02a7ebb is in rtc_dev_open (drivers/rtc/rtc-dev.c:32).
27  const struct rtc_class_ops *ops = rtc-ops;
28
29  /* We keep the lock as long as the device is in use
30   * and return immediately if busy
31   */
32  if (!(mutex_trylock(rtc-char_lock)))
33  return -EBUSY;
34
35  file-private_data = rtc;
36
(gdb)


config is attached.

Regards,

Gabriel
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.24-rc1
# Sat Oct 27 13:40:58 2007
#
CONFIG_X86_32=y
CONFIG_GENERIC_TIME=y
CONFIG_GENERIC_CMOS_UPDATE=y
CONFIG_CLOCKSOURCE_WATCHDOG=y
CONFIG_GENERIC_CLOCKEVENTS=y
CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y
CONFIG_LOCKDEP_SUPPORT=y
CONFIG_STACKTRACE_SUPPORT=y
CONFIG_SEMAPHORE_SLEEPERS=y
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_ZONE_DMA=y
CONFIG_QUICKLIST=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
CONFIG_GENERIC_BUG=y
CONFIG_GENERIC_HWEIGHT=y
CONFIG_ARCH_MAY_HAVE_PC_FDC=y
CONFIG_DMI=y
CONFIG_DEFCONFIG_LIST=/lib/modules/$UNAME_RELEASE/.config

#
# General setup
#
CONFIG_EXPERIMENTAL=y
CONFIG_LOCK_KERNEL=y
CONFIG_INIT_ENV_ARG_LIMIT=32
CONFIG_LOCALVERSION=
CONFIG_LOCALVERSION_AUTO=y
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_SYSVIPC_SYSCTL=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
# CONFIG_TASKSTATS is not set
# CONFIG_USER_NS is not set
# CONFIG_AUDIT is not set
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
CONFIG_LOG_BUF_SHIFT=17
# CONFIG_CGROUPS is not set
CONFIG_FAIR_GROUP_SCHED=y
CONFIG_FAIR_USER_SCHED=y
# CONFIG_FAIR_CGROUP_SCHED is not set
# CONFIG_SYSFS_DEPRECATED is not set
CONFIG_RELAY=y
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=
CONFIG_CC_OPTIMIZE_FOR_SIZE=y
CONFIG_SYSCTL=y
# CONFIG_EMBEDDED is not set
CONFIG_UID16=y
CONFIG_SYSCTL_SYSCALL=y
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_HOTPLUG=y
CONFIG_PRINTK=y
CONFIG_BUG=y
CONFIG_ELF_CORE=y
CONFIG_BASE_FULL=y
CONFIG_FUTEX=y
CONFIG_ANON_INODES=y
CONFIG_EPOLL=y
CONFIG_SIGNALFD=y
CONFIG_EVENTFD=y
CONFIG_SHMEM=y
CONFIG_VM_EVENT_COUNTERS=y
CONFIG_SLUB_DEBUG=y
# CONFIG_SLAB is not set
CONFIG_SLUB=y
# CONFIG_SLOB is not set
CONFIG_RT_MUTEXES=y
# CONFIG_TINY_SHMEM is not set
CONFIG_BASE_SMALL=0
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
CONFIG_BLOCK=y
CONFIG_LBD=y
CONFIG_BLK_DEV_IO_TRACE=y
CONFIG_LSF=y
CONFIG_BLK_DEV_BSG=y

#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_DEFAULT_AS is not set
# CONFIG_DEFAULT_DEADLINE is not set
CONFIG_DEFAULT_CFQ=y
# CONFIG_DEFAULT_NOOP is not set
CONFIG_DEFAULT_IOSCHED=cfq

#
# Processor type and features
#
CONFIG_TICK_ONESHOT=y
CONFIG_NO_HZ=y
CONFIG_HIGH_RES_TIMERS=y
CONFIG_GENERIC_CLOCKEVENTS_BUILD=y
CONFIG_SMP=y
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
CONFIG_SCHED_NO_NO_OMIT_FRAME_POINTER=y
# CONFIG_PARAVIRT_GUEST is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MCORE2 is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MGEODEGX1 is not set
# CONFIG_MGEODE_LX is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_MVIAC7 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_XADD=y
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
# CONFIG_ARCH_HAS_ILOG2_U32 is not set
# CONFIG_ARCH_HAS_ILOG2_U64 is not set
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_TSC=y
CONFIG_X86_CMOV=y
CONFIG_X86_MINIMUM_CPU_FAMILY=4

Re: BUG: lock held when returning to user space

2007-10-27 Thread Jiri Kosina
On Sat, 27 Oct 2007, Gabriel C wrote:

 I found that today in dmesg after booting current git ( 
 ec3b67c11df42362ccda81261d62829042f223f0 ) :
 ...
 [  592.752777]
 [  592.752781] 
 [  592.753478] [ BUG: lock held when returning to user space! ]
 [  592.753880] 
 [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
 [  592.754655] 1 lock held by hwclock/1452:
 [  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb] 
 rtc_dev_open+0x2e/0x7e

Yes, this is because rtc keeps a char_lock mutex locked as long as the 
device is open, to avoid concurrent accessess.

It could be easily substituted by some counting -- setting and clearing 
bit in struct rtc_device instead of using char_lock, but doing this just 
to shut the lockdep off is questionable imho.

Peter, what is the preferred way to annotate these kinds of locking for 
lockdep to express that it is intended?

-- 
Jiri Kosina
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Peter Zijlstra

On Sat, 2007-10-27 at 17:12 +0200, Jiri Kosina wrote:
 On Sat, 27 Oct 2007, Gabriel C wrote:
 
  I found that today in dmesg after booting current git ( 
  ec3b67c11df42362ccda81261d62829042f223f0 ) :
  ...
  [  592.752777]
  [  592.752781] 
  [  592.753478] [ BUG: lock held when returning to user space! ]
  [  592.753880] 
  [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
  [  592.754655] 1 lock held by hwclock/1452:
  [  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb] 
  rtc_dev_open+0x2e/0x7e
 
 Yes, this is because rtc keeps a char_lock mutex locked as long as the 
 device is open, to avoid concurrent accessess.
 
 It could be easily substituted by some counting -- setting and clearing 
 bit in struct rtc_device instead of using char_lock, but doing this just 
 to shut the lockdep off is questionable imho.
 
 Peter, what is the preferred way to annotate these kinds of locking for 
 lockdep to express that it is intended?

Not sure, I'd not thought that anyone would actually want to do this.
I'm also not sure how I stand on this, I'd prefer to say: don't do this!

I think, in this case, the lock is associated with a kernel object that
is properly cleaned up if the holding tasks gets a SIGKILL. But in
general I'd like to see this kind of thing go away.

Now I could probably come up with an annotation to hide it, but what do
other people think, Ingo, Linus, Andrew, do we want to keep kernel locks
held over userspace?



-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Andrew Morton
On Sat, 27 Oct 2007 17:28:41 +0200 Peter Zijlstra [EMAIL PROTECTED] wrote:

 
 On Sat, 2007-10-27 at 17:12 +0200, Jiri Kosina wrote:
  On Sat, 27 Oct 2007, Gabriel C wrote:
  
   I found that today in dmesg after booting current git ( 
   ec3b67c11df42362ccda81261d62829042f223f0 ) :
   ...
   [  592.752777]
   [  592.752781] 
   [  592.753478] [ BUG: lock held when returning to user space! ]
   [  592.753880] 
   [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
   [  592.754655] 1 lock held by hwclock/1452:
   [  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb] 
   rtc_dev_open+0x2e/0x7e
  
  Yes, this is because rtc keeps a char_lock mutex locked as long as the 
  device is open, to avoid concurrent accessess.
  
  It could be easily substituted by some counting -- setting and clearing 
  bit in struct rtc_device instead of using char_lock, but doing this just 
  to shut the lockdep off is questionable imho.
  
  Peter, what is the preferred way to annotate these kinds of locking for 
  lockdep to express that it is intended?
 
 Not sure, I'd not thought that anyone would actually want to do this.
 I'm also not sure how I stand on this, I'd prefer to say: don't do this!
 
 I think, in this case, the lock is associated with a kernel object that
 is properly cleaned up if the holding tasks gets a SIGKILL. But in
 general I'd like to see this kind of thing go away.
 
 Now I could probably come up with an annotation to hide it, but what do
 other people think, Ingo, Linus, Andrew, do we want to keep kernel locks
 held over userspace?
 

It's a fairly daft thing to do.  I think it'd be saner to teach rtc about
test_and_set_bit() personally..
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Arjan van de Ven
On Sat, 27 Oct 2007 17:12:41 +0200 (CEST)
Jiri Kosina [EMAIL PROTECTED] wrote:

 On Sat, 27 Oct 2007, Gabriel C wrote:
 
  I found that today in dmesg after booting current git ( 
  ec3b67c11df42362ccda81261d62829042f223f0 ) :
  ...
  [  592.752777]
  [  592.752781] 
  [  592.753478] [ BUG: lock held when returning to user space! ]
  [  592.753880] 
  [  592.754262] hwclock/1452 is leaving the kernel with locks still
  held! [  592.754655] 1 lock held by hwclock/1452:
  [  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb]
  rtc_dev_open+0x2e/0x7e
 
 Yes, this is because rtc keeps a char_lock mutex locked as long as
 the device is open, to avoid concurrent accessess.
 
 It could be easily substituted by some counting -- setting and
 clearing bit in struct rtc_device instead of using char_lock, but
 doing this just to shut the lockdep off is questionable imho.

it's not about lockdep; what this code doing is not valid use of a
mutex:
A mutex is required to have a clear process as owner, and in this case
it doesn't have that... at all. This is a violation of the kernel mutex
semantics.. and should be fixed.

 Peter, what is the preferred way to annotate these kinds of locking
 for lockdep to express that it is intended?

the preferred method is to not use a mutex like this...


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Peter Zijlstra

On Sat, 2007-10-27 at 08:47 -0700, Arjan van de Ven wrote:
 On Sat, 27 Oct 2007 17:12:41 +0200 (CEST)
 Jiri Kosina [EMAIL PROTECTED] wrote:
 
  On Sat, 27 Oct 2007, Gabriel C wrote:
  
   I found that today in dmesg after booting current git ( 
   ec3b67c11df42362ccda81261d62829042f223f0 ) :
   ...
   [  592.752777]
   [  592.752781] 
   [  592.753478] [ BUG: lock held when returning to user space! ]
   [  592.753880] 
   [  592.754262] hwclock/1452 is leaving the kernel with locks still
   held! [  592.754655] 1 lock held by hwclock/1452:
   [  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb]
   rtc_dev_open+0x2e/0x7e
  
  Yes, this is because rtc keeps a char_lock mutex locked as long as
  the device is open, to avoid concurrent accessess.
  
  It could be easily substituted by some counting -- setting and
  clearing bit in struct rtc_device instead of using char_lock, but
  doing this just to shut the lockdep off is questionable imho.
 
 it's not about lockdep; what this code doing is not valid use of a
 mutex:
 A mutex is required to have a clear process as owner, and in this case
 it doesn't have that... at all. This is a violation of the kernel mutex
 semantics.. and should be fixed.

Right, the fd could be transferred using unix sockets or fork(). That
would indeed seriously break a mutex.


-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Linus Torvalds


On Sat, 27 Oct 2007, Peter Zijlstra wrote:
 
 Now I could probably come up with an annotation to hide it, but what do
 other people think, Ingo, Linus, Andrew, do we want to keep kernel locks
 held over userspace?

Definitely not a sane thing to do. It should use ref-counting and/or a 
single bit to say busy. 

Linus
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Arjan van de Ven
On Sat, 27 Oct 2007 18:09:29 +0200
Peter Zijlstra [EMAIL PROTECTED] wrote:

 
 
 Right, the fd could be transferred using unix sockets or fork(). That
 would indeed seriously break a mutex.



or just multi-threaded app..
 
 


-- 
If you want to reach me at my work email, use [EMAIL PROTECTED]
For development, discussion and tips for power savings, 
visit http://www.lesswatts.org
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: BUG: lock held when returning to user space

2007-10-27 Thread Jiri Kosina
On Sat, 27 Oct 2007, Peter Zijlstra wrote:

   [  592.752781] 
   [  592.753478] [ BUG: lock held when returning to user space! ]
   [  592.753880] 
   [  592.754262] hwclock/1452 is leaving the kernel with locks still held!
   [  592.754655] 1 lock held by hwclock/1452:
   [  592.755007]  #0:  (rtc-char_lock){--..}, at: [c02a7ebb] 
   rtc_dev_open+0x2e/0x7e
  Yes, this is because rtc keeps a char_lock mutex locked as long as the 
  device is open, to avoid concurrent accessess.
 I think, in this case, the lock is associated with a kernel object that
 is properly cleaned up if the holding tasks gets a SIGKILL. But in
 general I'd like to see this kind of thing go away.

Yes, but the fact is that is really is invalid use of mutex -- because the 
mutex owner could become seriously wrong after fork() or sending the 
filedescriptor through unix socket ... this easily leads to broken 
situation.

This seems to have been introduced in e824290e5d ... Alessandro, could you 
convert this to test_and_set_bit()/clear_bit() semantics instead of a 
mutex please?

Thanks,

-- 
Jiri Kosina
-
To unsubscribe from this list: send the line unsubscribe linux-kernel in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/