Re: [PATCH 3/3] pch_gbe: Add MinnowBoard support

2013-07-12 Thread Darren Hart
On Fri, 2013-07-12 at 18:10 -0700, Joe Perches wrote:
> On Fri, 2013-07-12 at 17:58 -0700, Darren Hart wrote:
> > The MinnowBoard uses an AR803x PHY with the PCH GBE which requires
> > special handling. Use the MinnowBoard PCI Subsystem ID to detect this
> > and add a pci_device_id.driver_data structure and functions to handle
> > platform setup.
> 
> trivial comments only:
> 
> Please use scripts/checkpatch.pl

Always good advice. I did actually do that. Some of the reports
conflict with existing formatting throughout the file. I opted for
consistency.

As for the others (sigh), I did a last-minute cleanup and somehow introduced
the whitespace errors. I do know better and should have waited until
tonight instead of sending them out when I was rushed. Apologies.

Fixed in V3 and awaiting additional feedback.

> 
> []
> 
> diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c 
> b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c[]
> []
> > +static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
> []
> > +   if (ret){
> 
> Missing space before brace

Fixed.

> 
> > diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c 
> > b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c
> []
> > +int pch_gbe_phy_tx_clk_delay(struct pch_gbe_hw *hw)
> []
> > +   case PHY_AR803X_ID:
> > +   netdev_dbg(adapter->netdev,
> > +  "Configuring AR803X PHY for 2ns TX clock delay\n"); 
> []
> > +   netdev_err(adapter->netdev,
> > +  "Unknown PHY (%x), could not set TX clock delay.\n",
> > +  hw->phy.id);
> []
> > +   netdev_err(adapter->netdev,
> > +  "Could not configure tx clock delay for PHY.\n");
> []
> > +int pch_gbe_phy_disable_hibernate(struct pch_gbe_hw *hw)
> []
> > +   case PHY_AR803X_ID:
> > +   netdev_dbg(adapter->netdev,
> > +  "Disabling hibernation for AR803X PHY\n");
> 
> It'd be nice if no period before newline were used
> everywhere.

Indeed, fixed.

Thank you for the review.

> 
> > +   netdev_err(adapter->netdev,
> > +  "Unknown PHY (%x), could not disable hibernation\n",
> > +  hw->phy.id);
> []
> > +   netdev_err(adapter->netdev,
> > +  "Could not disable PHY hibernation.\n");
> 
> 

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH 3/3] pch_gbe: Add MinnowBoard support

2013-07-12 Thread Darren Hart
On Fri, 2013-07-12 at 18:17 -0700, Greg KH wrote:
> On Fri, Jul 12, 2013 at 05:58:07PM -0700, Darren Hart wrote:
> > The MinnowBoard uses an AR803x PHY with the PCH GBE which requires
> > special handling. Use the MinnowBoard PCI Subsystem ID to detect this
> > and add a pci_device_id.driver_data structure and functions to handle
> > platform setup.
> > 
> > The AR803x does not implement the RGMII 2ns TX clock delay in the trace
> > routing nor via strapping. Add a detection method for the board and the
> > PHY and enable the TX clock delay via the registers.
> > 
> > This PHY will hibernate without link for 10 seconds. Ensure the PHY is
> > awake for probe and then disable hibernation. A future improvement would
> > be to convert pch_gbe to using PHYLIB and making sure we can wake the
> > PHY at the necessary times rather than permanently disabling it.
> > 
> > Signed-off-by: Darren Hart 
> > Cc: "David S. Miller" 
> > Cc: "H. Peter Anvin" 
> > Cc: Peter Waskiewicz 
> > Cc: Andy Shevchenko 
> > Cc: net...@vger.kernel.org
> > Cc:  # 3.8.x: 5829e9b mfd: lpc_sch: Accomodate 
> > partial
> > Cc:  # 3.8.x: 3cbf182 gpio-sch: Allow for more than 
> > 8
> > Cc:  # 3.8.x: 91bbe92: PCI: Add CircuitCo vendor ID
> > Cc:  # 3.8.x: bd79680: pch_gbe: remove inline 
> > keyword
> > Cc:  # 3.8.x: 453ca93: pch_gbe: convert pr_* to
> > Cc:  # 3.8.x: 29cc436: pch_gbe: use managed 
> > functions
> > Cc:  # 3.8.x
> > Cc:  # 3.10.x: 91bbe92: PCI: Add CircuitCo vendor ID
> > Cc:  # 3.10.x: bd79680: pch_gbe: remove inline 
> > keyword
> > Cc:  # 3.10.x: 453ca93: pch_gbe: convert pr_* to
> > Cc:  # 3.10.x: 29cc436: pch_gbe: use managed 
> > functions
> > Cc:  # 3.10.x
> > Signed-off-by: Darren Hart 
> > ---
> >  drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h| 15 
> >  .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 48 +++
> >  .../net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c| 98 
> > ++
> >  .../net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.h|  2 +
> >  4 files changed, 163 insertions(+)
> 
> This is _far_ more than just a simple "add a new device id" for a stable
> kernel update.   Please go read Documentation/stable_kernel_rules.txt
> again for why there's no way I can take this type of thing.
>
> You know better than this.

I do appreciate the documentation that is there, and I have read it
(several times). The first two for 3.8 should be acceptable. The three
pre-reqs from pch_gbe are very unfortunate, but Andy pushed his in
response to my initial patch and they were merged first, making things
unnecessarily complicated for stable. This left me with the option of
massaging the patch (easy enough to do - I did this first actually) or
including them as pre-reqs. I opted for the latter after re-reading
the stable docs and various threads on -stable to try and figure out
what was preferable; in one of those threads you had agreed to take a
120 patch series. I've also seen patches longer than the 100 line limit
go in, so despite the documentation, it's sometimes difficult to tell
what is preferred, and if I don't include stable initially, it takes a
lot of effort and time to get things in afterward. I don't intend this
as a criticism, just an explanation of how I arrived here.

What would you prefer I do with this? I can break up the patch into
infrastructure and then MinnowBoard specific bits (I didn't think the
infrastructure-only patches would be well received on netdev, but maybe
I'm wrong there). I could massage the patch around Andy's three pch_gbe
cleanups which indeed are not stable candidates on their own. Or I could
drop the idea of trying to get Ethernet working on the MinnowBoard
outside of vendor trees and the next upstream release (I'd rather not do
that).
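
For anyone following along without the patch in front of them, here is a
minimal sketch of the two register updates the commit message describes
(enable the TX clock delay, disable hibernation). It is not the patch
itself: struct fake_phy and the phy_* helpers are hypothetical stand-ins
for the MDIO accessors, and the register numbers and bit masks are
assumptions based on the usual AR803x debug-port layout, so check them
against the datasheet before relying on them.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed AR803x debug-port layout; verify against the datasheet. */
#define AR803X_DBG_ADDR       0x1D    /* MII reg: debug address */
#define AR803X_DBG_DATA       0x1E    /* MII reg: debug data */
#define AR803X_DBG_SERDES     0x05    /* debug reg with the TX delay bit */
#define AR803X_TX_CLK_DLY     0x0100
#define AR803X_DBG_HIBERNATE  0x0B    /* debug reg with the hibernate bit */
#define AR803X_PS_HIB_EN      0x8000

/* Hypothetical model of the PHY: 32 MII registers plus a debug register
 * file that is only reachable through the address/data pair. */
struct fake_phy {
	uint16_t mii[32];
	uint16_t dbg[64];
};

static void phy_write(struct fake_phy *phy, unsigned int reg, uint16_t val)
{
	assert(reg < 32);
	if (reg == AR803X_DBG_DATA)
		phy->dbg[phy->mii[AR803X_DBG_ADDR] & 0x3F] = val;
	else
		phy->mii[reg] = val;
}

static uint16_t phy_read(const struct fake_phy *phy, unsigned int reg)
{
	assert(reg < 32);
	if (reg == AR803X_DBG_DATA)
		return phy->dbg[phy->mii[AR803X_DBG_ADDR] & 0x3F];
	return phy->mii[reg];
}

/* Read-modify-write of one debug register: select it, read it,
 * clear and set the requested bits, write it back. */
static void phy_dbg_update(struct fake_phy *phy, unsigned int dbg_reg,
			   uint16_t clear, uint16_t set)
{
	uint16_t val;

	phy_write(phy, AR803X_DBG_ADDR, (uint16_t)dbg_reg);
	val = phy_read(phy, AR803X_DBG_DATA);
	val = (uint16_t)((val & ~clear) | set);
	phy_write(phy, AR803X_DBG_DATA, val);
}

static void ar803x_enable_tx_clk_delay(struct fake_phy *phy)
{
	phy_dbg_update(phy, AR803X_DBG_SERDES, 0, AR803X_TX_CLK_DLY);
}

static void ar803x_disable_hibernate(struct fake_phy *phy)
{
	phy_dbg_update(phy, AR803X_DBG_HIBERNATE, AR803X_PS_HIB_EN, 0);
}
```

The read-modify-write helper is the point of the sketch: both settings
live in shared debug registers, so the other bits have to be preserved.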

Thanks,

-- 
Darren Hart
Intel Open Source Technology Center
Yocto Project - Technical Lead - Linux Kernel



Re: [RFC 4/4] Sparse initialization of struct page array.

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 10:31 PM, Yinghai Lu wrote:
> On Fri, Jul 12, 2013 at 9:39 PM, H. Peter Anvin  wrote:
>> On 07/12/2013 09:19 PM, Yinghai Lu wrote:
>>>> PG_reserved,
>>>> +   PG_uninitialized2mib,   /* Is this the right spot? ntz - Yes - rmh */
>>>> PG_private, /* If pagecache, has fs-private data */
>>
>> The comment here is WTF...
> 
> ntz: Nate Zimmer?
> rmh: Robin Holt?
> 

This kind of conversation doesn't really belong in a code comment, though.

-hpa



Re: [PATCH 1/2] i2c-designware: make *CNT values configurable

2013-07-12 Thread Shinya Kuribayashi

Hi,

Now I've had a look at the whole discussion.

Basically, DW I2C core provides a good enough (and quite direct) way
to control tHIGH and tLOW timing specs, *HCNT and *LCNT registers.

But from my experience (with a slightly old version of DW I2C core
around 2005, version 1.06a or so), DW I2C core apparently lacked an
appropriate hardware mechanism to meet the tHD;STA timing spec.  We
then found that we could meet tHD;STA by increasing *HCNT values, so
that's what we currently have in i2c-designware.c.  I know that with
that workaround applied, tHIGH is configured with a larger value
than necessary, but we have no choice.  For I2C bus systems, we must
meet every timing constraint strictly as required.  If DW I2C core
cannot meet tHD;STA without sacrificing tHIGH, I would call
it a hardware bug.

On Wed, Jul 10, 2013 at 06:56:35PM +0200, Christian Ruppert wrote:

If I understand the above, it leaves Tf and Tr to be PCB specific and then
these values are passed to the core driver from platform data, right?


That would be the idea: Calculate Th' and Tl' in function of the desired
clock frequency and duty cycle and then adapt these values using
measured transition times.


I think this would be a good solution.

On 7/8/13 10:42 PM, Christian Ruppert wrote:

Anyway, the HCNT, LCNT and SDA hold time values we get from ACPI SSCN/FMCN
methods are measured by our HW guys on a certain board and they have
verified that using those we meet all the I2C timing specs.

In order to take advantage of those we need some way to pass those values
to the i2c designware core. I have two suggestions:

   - Use the method outlined in this patch and let the interface driver
 override *CNT values.


With *HCNT and *LCNT registers, we can control tHIGH, tLOW, tHD;STA
quite accurately.  On the other hand, what we can't control with DW
I2C core is tr and tf.  I assumed that it could never be longer than
300nsec (it's defined as a max. in the I2C specification), so I used
it for calculation.

I agree that tr and tf are PCB specific, and it would be a good first
step toward timing optimization to make them configurable through
platform data.

The second step is that if the current i2c_dw_scl_hcnt and i2c_dw_scl_lcnt
calculations don't suit later DW I2C cores, then it would be
nice for someone who has access to the data book to update the formulas,
or provide alternative formulas and make them selectable depending
on DW core versions.

It always helps us to have a way to calculate *HCNT and *LCNT values
automatically.  As said above, DW I2C core can control tHIGH, tLOW,
tHD;STA, etc. quite _accurately_, if HCNT/LCNT values are calculated
with proper formulas.  It also helps hardware people to
provide reference HCNT/LCNT values.

And as a third step, if we want to use optimized HCNT/LCNT values
which can not be obtained from proper formulas + user-requested
tf/tr, or if we want to use HCNT/LCNT settings verified by vendors
or provided from hardware team, then I'm fine with having a way to
_override_ HCNT/LCNT values.  Such a direct way might be useful.


   - Allow interface drivers to override the function that does calculations
 for these values like:

if (dev->std_scl_cnt)
dev->std_scl_cnt(dev, , );
else
/* Fallback to the calculation based solely on iclk */


To reduce indentation and make it clearer that the *CNT values are
being overridden, something like this would be nice (add more
comments if necessary; I'll leave that to you):

/* set standard and fast speed deviders for high/low periods */
 
 	/* Standard-mode */

hcnt = i2c_dw_scl_hcnt(input_clock_khz,
40, /* tHD;STA = tHIGH = 4.0 us */
3,  /* tf = 0.3 us */
0,  /* 0: DW default, 1: Ideal */
0); /* No offset */
lcnt = i2c_dw_scl_lcnt(input_clock_khz,
47, /* tLOW = 4.7 us */
3,  /* tf = 0.3 us */
0); /* No offset */
+   if (dev->ss_hcnt && dev->ss_lcnt) {
+   /* give preference to immediate values over optimal ones */
+   hcnt = dev->ss_hcnt;
+   lcnt = dev->ss_lcnt;
+   }
dw_writel(dev, hcnt, DW_IC_SS_SCL_HCNT);
dw_writel(dev, lcnt, DW_IC_SS_SCL_LCNT);
dev_dbg(dev->dev, "Standard-mode HCNT:LCNT = %d:%d\n", hcnt, lcnt);
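
To make the arithmetic concrete, here is a small standalone sketch of
the same idea: compute default counts from the input clock, then let
board-provided values take precedence when both are set. The function
and struct names are hypothetical, and the rounding term and the -3 / -1
constants are assumptions about how the driver's formulas look, so
please check them against i2c-designware.c and the data book.

```c
#include <assert.h>
#include <stdint.h>

/* Times are in tenths of a microsecond (40 = 4.0 us), as in the
 * arguments passed to i2c_dw_scl_hcnt()/i2c_dw_scl_lcnt() above. */

/* Assumed constraint: HCNT + 3 >= ic_clk * (tHD;STA + tf) */
static uint32_t scl_hcnt(uint32_t ic_clk_khz, uint32_t t_symbol,
			 uint32_t tf, int offset)
{
	return (ic_clk_khz * (t_symbol + tf) + 5000) / 10000 - 3 + offset;
}

/* Assumed constraint: LCNT + 1 >= ic_clk * (tLOW + tf) */
static uint32_t scl_lcnt(uint32_t ic_clk_khz, uint32_t t_low,
			 uint32_t tf, int offset)
{
	return (ic_clk_khz * (t_low + tf) + 5000) / 10000 - 1 + offset;
}

/* Hypothetical per-device overrides; 0 means "not provided". */
struct dw_counts_cfg {
	uint32_t ss_hcnt;
	uint32_t ss_lcnt;
};

/* Standard-mode counts: calculated defaults, overridden only when the
 * platform supplies both values. */
static void pick_ss_counts(const struct dw_counts_cfg *cfg,
			   uint32_t ic_clk_khz,
			   uint32_t *hcnt, uint32_t *lcnt)
{
	assert(cfg && hcnt && lcnt);

	*hcnt = scl_hcnt(ic_clk_khz, 40, 3, 0); /* tHD;STA = tHIGH = 4.0 us */
	*lcnt = scl_lcnt(ic_clk_khz, 47, 3, 0); /* tLOW = 4.7 us */

	if (cfg->ss_hcnt && cfg->ss_lcnt) {
		/* give preference to immediate values over calculated ones */
		*hcnt = cfg->ss_hcnt;
		*lcnt = cfg->ss_lcnt;
	}
}
```

Requiring both values before overriding avoids mixing a measured HCNT
with a calculated LCNT, which would not correspond to any verified
timing.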

  Shinya


Re: [RFC 4/4] Sparse initialization of struct page array.

2013-07-12 Thread Yinghai Lu
On Fri, Jul 12, 2013 at 9:39 PM, H. Peter Anvin  wrote:
> On 07/12/2013 09:19 PM, Yinghai Lu wrote:
>>> PG_reserved,
>>> +   PG_uninitialized2mib,   /* Is this the right spot? ntz - Yes - rmh */
>>> PG_private, /* If pagecache, has fs-private data */
>
> The comment here is WTF...

ntz: Nate Zimmer?
rmh: Robin Holt?


Re: [PATCH RESEND 0/1] AHCI: Optimize interrupt processing

2013-07-12 Thread Nicholas A. Bellinger
On Fri, 2013-07-12 at 09:46 +0200, Alexander Gordeev wrote:
> On Thu, Jul 11, 2013 at 04:00:37PM -0700, Nicholas A. Bellinger wrote:
> > On Thu, 2013-07-11 at 12:26 +0200, Alexander Gordeev wrote:
> > > On Wed, May 22, 2013 at 07:03:05PM +0200, Jens Axboe wrote:
> > > > On Wed, May 22 2013, Alexander Gordeev wrote:
> > > > > On Wed, May 22, 2013 at 08:50:03AM +0900, Tejun Heo wrote:
> > > > > > > Hmm. I'd normally apply this patch but block layer is just
> > > > > > > growing multi-queue support and libata is likely to be converted
> > > > > > > to mq in foreseeable future, so I'm a bit hesitant to make irq
> > > > > > > handling more sophisticated right now.  Would you be interested in
> > > > > > > looking into converting libata to blk mq support?  I'm pretty sure
> > > > > > > it'd yield far better outcome if done properly.
> > > > > 
> > > > > I am not committing, but will look into it, sure.
> > > > 
> > > > Would be most awesome, I'm sure Nic would not mind a bit of help on the
> > > > SCSI/libata side :-)
> > > 
> > > Hi Nicholas,
> > > 
> > > Could you please clarify the status of SCSI MQ support? Is it usable now?
> > > 
> > 
> > Hi Alexander,
> > 
> > Thanks for taking a look.  I've not made further progress in the last
> > weeks on scsi-mq, but am still using virtio-scsi + scsi-mq <->
> > vhost-scsi + per-cpu-ida for quite a bit for benchmarking purposes.
> 
> Thanks for the clarification, Nicholas.
> 
> > > I tried 
> > > git://git.kernel.org/pub/scm/linux/kernel/git/nab/target-pending.git,
> > > but it does not appear working without (at least) changes below to SCSI 
> > > lib:
> > > 
> > 
> > The only scsi-mq LLD conversions so far have been to virtio-scsi +
> > scsi_debug to nop REQ_TYPE_FS.
> 
> I see. Do you think the changes I made are the right way to go?
> 
> I had to make them to avoid a NULL-pointer assignment and make BIO bounce
> buffers work, but I do not really understand the mixture of old and
> new code (or either one, in fact :) )
> 

Hi Alexander,

Comments below.

> > Just so I understand, your patch below is required in order to make what
> > LLD function with scsi-mq..?
> 
> ata_piix
> 



> > 
> > Thanks!
> > 
> > --nab
> > 
> > > Thanks!
> > > 
> > > diff --git a/drivers/scsi/scsi-mq.c b/drivers/scsi/scsi-mq.c
> > > index ca6ff67..d8cc7a4 100644
> > > --- a/drivers/scsi/scsi-mq.c
> > > +++ b/drivers/scsi/scsi-mq.c
> > > @@ -155,6 +155,7 @@ void scsi_mq_done(struct scsi_cmnd *sc)
> > >  static struct blk_mq_ops scsi_mq_ops = {
> > >   .queue_rq   = scsi_mq_queue_rq,
> > >   .map_queue  = blk_mq_map_queue,
> > > + .timeout= scsi_times_out,
> > >   .alloc_hctx = blk_mq_alloc_single_hw_queue,
> > >   .free_hctx  = blk_mq_free_single_hw_queue,
> > >  };

So you're actually triggering a blk-mq timeout with ata_piix..?

> > > diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> > > index 65360db..33aa373 100644
> > > --- a/drivers/scsi/scsi_lib.c
> > > +++ b/drivers/scsi/scsi_lib.c
> > > @@ -283,7 +283,10 @@ int scsi_execute(struct scsi_device *sdev, const 
> > > unsigned char *cmd,
> > >   /*
> > >* head injection *required* here otherwise quiesce won't work
> > >*/
> > > - blk_execute_rq(req->q, NULL, req, 1);
> > > + if (q->mq_ops)
> > > + blk_mq_execute_rq(req->q, req);
> > > + else
> > > + blk_execute_rq(req->q, NULL, req, 1);
> > >  

The scsi_execute() -> REQ_TYPE_BLOCK_RQ special case (scsi_scan +
scsi_ioctl) has a small issue preventing its conversion to use
blk_mq_execute_rq().

Namely that scsi_execute_cmd() currently expects to be able to check for
the existence of req->errors + req->resid_len *after*
blk_mq_execute_rq() -> blk_mq_finish_request() -> blk_mq_free_request()
has been called to mark the pre-allocated struct request as free for
blk-mq re-use.

That is why scsi-mq still uses blk_execute_rq(), and
this will need to be addressed in order to safely use blk_mq_execute_rq()
in the above context.
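
A toy model of the ordering problem, in plain userspace C, may make this
clearer. Nothing here is the real block layer API: struct fake_request,
fake_finish_request() and fake_execute() are hypothetical names, and
snapshotting the results before completion is only one possible way to
address it, not something that exists in scsi-mq today.

```c
#include <assert.h>

/* Hypothetical stand-in for a pre-allocated struct request. */
struct fake_request {
	int errors;
	unsigned int resid_len;
	int in_use;
};

/* The fields the caller wants to look at after execution. */
struct exec_result {
	int errors;
	unsigned int resid_len;
};

/* Models the completion path: once this runs, the request may be
 * re-used, so its fields must not be read anymore. */
static void fake_finish_request(struct fake_request *req)
{
	req->errors = 0;
	req->resid_len = 0;
	req->in_use = 0;
}

/* Execute synchronously and hand back a copy of the results that was
 * taken *before* the request was released for re-use. */
static struct exec_result fake_execute(struct fake_request *req,
				       int errors, unsigned int resid_len)
{
	struct exec_result res;

	assert(req->in_use);

	/* the "device" completes the command */
	req->errors = errors;
	req->resid_len = resid_len;

	/* snapshot first, release second */
	res.errors = req->errors;
	res.resid_len = req->resid_len;
	fake_finish_request(req);

	return res;
}
```

A caller that read req->errors after fake_execute() returned would see
the recycled value, which is the same class of bug as the scsi_execute()
case above.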

> > >   /*
> > >* Some devices (USB mass-storage in particular) may transfer
> > > @@ -298,12 +301,8 @@ int scsi_execute(struct scsi_device *sdev, const 
> > > unsigned char *cmd,
> > >   *resid = req->resid_len;
> > >   ret = req->errors;
> > >   out:
> > > - if (q->mq_ops) {
> > > - printk("scsi_execute(): Calling blk_mq_free_request >>>\n");
> > > - blk_mq_free_request(req);
> > > - } else {
> > > + if (!q->mq_ops)
> > >   blk_put_request(req);
> > > - }
> > >  

Do you have an OOPs backtrace handy to post w/o the last two changes in
place..?

Also, just a heads up that so far I've been using IDE (/dev/hdX) for the
root-device in order to make scsi-mq debugging safer and easier.

I *very* much recommend doing the same if at all possible for ata_piix
scsi-mq development + testing, as you'll want to be very careful when
using a real file-system on top of this early alpha code.

--nab


Re: [ 0/8] 3.0.86-stable review

2013-07-12 Thread Greg Kroah-Hartman
On Sat, Jul 13, 2013 at 01:17:36PM +0900, Satoru Takeuchi wrote:
> At Thu, 11 Jul 2013 15:20:29 -0700,
> Greg Kroah-Hartman wrote:
> > 
> > This is the start of the stable review cycle for the 3.0.86 release.
> > There are 8 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat Jul 13 22:18:05 UTC 2013.
> > Anything received after that time might be too late.
> 
> This kernel can be built and booted without any problem.
> Building a kernel with this kernel also works fine.
> 
>  - Build Machine: debian jessie x86_64
>CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz x 4
>memory: 8GB
> 
>  - Test machine: debian jessie x86_64 (KVM guest on the Build Machine)
>vCPU: x2
>memory: 2GB

Thanks for testing all 4 of these releases and letting me know.

greg k-h



Re: [RFC 4/4] Sparse initialization of struct page array.

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 09:19 PM, Yinghai Lu wrote:
>> PG_reserved,
>> +   PG_uninitialized2mib,   /* Is this the right spot? ntz - Yes - rmh */
>> PG_private, /* If pagecache, has fs-private data */

The comment here is WTF...

-hpa




[GIT PULL] Final round of SCSI updates for the 3.10+ merge window

2013-07-12 Thread James Bottomley
This is the remaining set of SCSI patches for the merge window.  It's
mostly driver updates (scsi_debug, qla2xxx, storvsc, mpt3sas).  There are
also several bug fixes in fcoe, libfc, and megaraid_sas.  We also have a
couple of core changes to try to make device destruction more
deterministic.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-for-linus

The short changelog is

Akinobu Mita (6):
  scsi_debug: reduce duplication between prot_verify_read and 
prot_verify_write
  scsi_debug: simplify offset calculation for dif_storep
  scsi_debug: invalidate protection info for unmapped region
  scsi_debug: fix NULL pointer dereference with parameters dif=0 dix=1
  scsi_debug: fix incorrectly nested kmap_atomic()
  scsi_debug: fix invalid address passed to kunmap_atomic()

Bart Van Assche (12):
  fcoe: Reduce number of sparse warnings
  enable destruction of blocked devices which fail LUN scanning
  qla2xxx: Fix a memory leak in an error path of qla2x00_process_els()
  qla2xxx: Remove an unused variable from qla2x00_remove_one().
  qla2xxx: Fix qla2xxx_check_risc_status().
  qla2xxx: Help Coverity with analyzing ct_sns_pkt initialization.
  qla2xxx: Remove redundant assignments.
  qla2xxx: Remove a dead assignment in qla24xx_build_scsi_crc_2_iocbs().
  qla2xxx: Remove two superfluous tests.
  qla2xxx: Remove dead code in qla2x00_configure_hba()
  qla2xxx: Clean up qla84xx_mgmt_cmd()
  qla2xxx: Clean up qla24xx_iidma()

Bjørn Mork (1):
  megaraid_sas: fix memory leak if SGL has zero length entries

Chad Dupuis (2):
  qla2xxx: Do not take a second firmware dump when intentionally generating 
one.
  qla2xxx: Do not query FC statistics during chip reset.

Dan Carpenter (1):
  megaraid_sas: fix a bug for 64 bit arches

Douglas Gilbert (1):
  scsi constants: command, sense key + additional sense strings

Giridhar Malavali (2):
  qla2xxx: Set the index in outstanding command array to NULL when cmd is 
aborted when the request timeout.
  qla2xxx: Clear the MBX_INTR_WAIT flag when the mailbox time-out happens.

James Bottomley (1):
  Fix race between starved list and device removal

K. Y. Srinivasan (3):
  storvsc: Increase the value of STORVSC_MAX_IO_REQUESTS
  storvsc: Support FC devices
  storvsc: Implement multi-channel support

Mark Rustad (2):
  fcoe: Stop fc_rport_priv structure leak
  libfc: Reject PLOGI from nodes with incompatible role

Neerav Parikh (1):
  fcoe: Fix smatch warning in fcoe_fdmi_info function

Robert Love (3):
  libfcoe: Fix meaningless log statement
  libfc: Differentiate echange timer cancellation debug statements
  libfc: Remove extra space in fc_exch_timer_cancel definition

Saurav Kashyap (2):
  qla2xxx: Fix sparse warning from qla_mr.c and qla_iocb.c.
  qla2xxx: Move qla2x00_free_device to the correct location.

Sreekanth Reddy (7):
  mpt3sas: Bump driver version to v02.100.00.00
  mpt3sas: when async scanning is enabled then while scanning, devices are 
removed but their transport layer entries are not removed
  mpt3sas: MPI2.5 Rev F v2.5.1.1 specification
  mpt3sas: Infinite loops can occur if MPI2_IOCSTATUS_CONFIG_INVALID_PAGE 
is not returned
  mpt3sas: fix for kernel panic when driver loads with HBA conected to non 
LUN 0 configured expander
  mpt3sas: Updated the Hardware timing requirements
  mpt3sas: 2013 source code copyright

Yi Zou (1):
  fcoe: fix the link error status block sparse warnings

Yijing Wang (1):
  pm8001: use pdev->pm_cap instead of pci_find_capability(..,PCI_CAP_ID_PM)

The diffstat is

 drivers/scsi/constants.c| 235 ++--
 drivers/scsi/fcoe/fcoe.c|  26 +--
 drivers/scsi/fcoe/fcoe_ctlr.c   |   4 +
 drivers/scsi/fcoe/fcoe_sysfs.c  |  24 +--
 drivers/scsi/fcoe/fcoe_transport.c  |  28 +---
 drivers/scsi/libfc/fc_exch.c|   4 +-
 drivers/scsi/libfc/fc_rport.c   |  27 
 drivers/scsi/megaraid/megaraid_sas_base.c   |  10 +-
 drivers/scsi/megaraid/megaraid_sas_fp.c |   4 +-
 drivers/scsi/mpt3sas/Kconfig|   2 +-
 drivers/scsi/mpt3sas/mpi/mpi2.h |  12 +-
 drivers/scsi/mpt3sas/mpi/mpi2_cnfg.h|  15 +-
 drivers/scsi/mpt3sas/mpi/mpi2_init.h|   2 +-
 drivers/scsi/mpt3sas/mpi/mpi2_ioc.h |  10 +-
 drivers/scsi/mpt3sas/mpi/mpi2_raid.h|  10 +-
 drivers/scsi/mpt3sas/mpi/mpi2_sas.h |   2 +-
 drivers/scsi/mpt3sas/mpi/mpi2_tool.h|  10 +-
 drivers/scsi/mpt3sas/mpi/mpi2_type.h|   2 +-
 drivers/scsi/mpt3sas/mpt3sas_base.c |  22 ++-
 drivers/scsi/mpt3sas/mpt3sas_base.h |   8 +-
 drivers/scsi/mpt3sas/mpt3sas_config.c   |   2 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.c  |   2 +-
 drivers/scsi/mpt3sas/mpt3sas_ctl.h  |   2 +-
 

Re: [RFC 4/4] Sparse initialization of struct page array.

2013-07-12 Thread Yinghai Lu
On Thu, Jul 11, 2013 at 7:03 PM, Robin Holt  wrote:
> During boot of large memory machines, a significant portion of boot
> is spent initializing the struct page array.  The vast majority of
> those pages are not referenced during boot.
>
> Change this over to only initializing the pages when they are
> actually allocated.
>
> Besides the advantage of boot speed, this allows us the chance to
> use normal performance monitoring tools to determine where the bulk
> of time is spent during page initialization.
>
> Signed-off-by: Robin Holt 
> Signed-off-by: Nate Zimmer 
> To: "H. Peter Anvin" 
> To: Ingo Molnar 
> Cc: Linux Kernel 
> Cc: Linux MM 
> Cc: Rob Landley 
> Cc: Mike Travis 
> Cc: Daniel J Blueman 
> Cc: Andrew Morton 
> Cc: Greg KH 
> Cc: Yinghai Lu 
> Cc: Mel Gorman 
> ---
>  include/linux/mm.h |  11 +
>  include/linux/page-flags.h |   5 +-
>  mm/nobootmem.c |   5 ++
>  mm/page_alloc.c| 117 
> +++--
>  4 files changed, 132 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index e0c8528..3de08b5 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1330,8 +1330,19 @@ static inline void __free_reserved_page(struct page 
> *page)
> __free_page(page);
>  }
>
> +extern void __reserve_bootmem_region(phys_addr_t start, phys_addr_t end);
> +
> +static inline void __reserve_bootmem_page(struct page *page)
> +{
> +   phys_addr_t start = page_to_pfn(page) << PAGE_SHIFT;
> +   phys_addr_t end = start + PAGE_SIZE;
> +
> +   __reserve_bootmem_region(start, end);
> +}
> +
>  static inline void free_reserved_page(struct page *page)
>  {
> +   __reserve_bootmem_page(page);
> __free_reserved_page(page);
> adjust_managed_page_count(page, 1);
>  }
> diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
> index 6d53675..79e8eb7 100644
> --- a/include/linux/page-flags.h
> +++ b/include/linux/page-flags.h
> @@ -83,6 +83,7 @@ enum pageflags {
> PG_owner_priv_1,/* Owner use. If pagecache, fs may use*/
> PG_arch_1,
> PG_reserved,
> +   PG_uninitialized2mib,   /* Is this the right spot? ntz - Yes - rmh */
> PG_private, /* If pagecache, has fs-private data */
> PG_private_2,   /* If pagecache, has fs aux data */
> PG_writeback,   /* Page is under writeback */
> @@ -211,6 +212,8 @@ PAGEFLAG(SwapBacked, swapbacked) 
> __CLEARPAGEFLAG(SwapBacked, swapbacked)
>
>  __PAGEFLAG(SlobFree, slob_free)
>
> +PAGEFLAG(Uninitialized2Mib, uninitialized2mib)
> +
>  /*
>   * Private page markings that may be used by the filesystem that owns the 
> page
>   * for its own purposes.
> @@ -499,7 +502,7 @@ static inline void ClearPageSlabPfmemalloc(struct page 
> *page)
>  #define PAGE_FLAGS_CHECK_AT_FREE \
> (1 << PG_lru | 1 << PG_locked| \
>  1 << PG_private | 1 << PG_private_2 | \
> -1 << PG_writeback | 1 << PG_reserved | \
> +1 << PG_writeback | 1 << PG_reserved | 1 << PG_uninitialized2mib | \
>  1 << PG_slab| 1 << PG_swapcache | 1 << PG_active | \
>  1 << PG_unevictable | __PG_MLOCKED | __PG_HWPOISON | \
>  __PG_COMPOUND_LOCK)
> diff --git a/mm/nobootmem.c b/mm/nobootmem.c
> index 3b512ca..e3a386d 100644
> --- a/mm/nobootmem.c
> +++ b/mm/nobootmem.c
> @@ -126,6 +126,9 @@ static unsigned long __init 
> free_low_memory_core_early(void)
> phys_addr_t start, end, size;
> u64 i;
>
> +   for_each_reserved_mem_region(i, , )
> +   __reserve_bootmem_region(start, end);
> +

What about holes that are not in memblock.reserved?

Before this patch,
free_area_init_node/free_area_init_core/memmap_init_zone
first marks every page in the node range as Reserved in struct page.

But those holes are not mapped via the kernel low mapping,
so it should be OK not to touch "struct page" for them.

Now you only mark memblock.reserved as reserved at first, and later
mark {memblock.memory} - {memblock.reserved} as available.
And that is OK.

But that change should be split into a separate patch, with a comment
and a changelog entry for it,
in case some user such as UEFI does something weird.
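
To spell out what {memblock.memory} - {memblock.reserved} means, here is
a standalone sketch of the range subtraction. It is only an illustration
under my own assumptions, not the memblock iterator code: struct range
and free_ranges() are hypothetical, and both input arrays are assumed to
be sorted and non-overlapping.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Half-open physical range [start, end). */
struct range {
	uint64_t start;
	uint64_t end;
};

/* Write the parts of mem[] that are not covered by rsv[] into out[]
 * and return how many ranges were written. Anything outside mem[]
 * (a hole) is never reported, whether or not it is reserved. */
static size_t free_ranges(const struct range *mem, size_t nmem,
			  const struct range *rsv, size_t nrsv,
			  struct range *out, size_t max_out)
{
	size_t n = 0, i, j;

	for (i = 0; i < nmem; i++) {
		uint64_t cur = mem[i].start;

		for (j = 0; j < nrsv && cur < mem[i].end; j++) {
			if (rsv[j].end <= cur)
				continue;	/* entirely before the cursor */
			if (rsv[j].start >= mem[i].end)
				break;		/* entirely after this range */
			if (rsv[j].start > cur) {
				assert(n < max_out);
				out[n].start = cur;
				out[n].end = rsv[j].start;
				n++;
			}
			cur = rsv[j].end;	/* skip the reserved part */
		}
		if (cur < mem[i].end) {
			assert(n < max_out);
			out[n].start = cur;
			out[n].end = mem[i].end;
			n++;
		}
	}
	return n;
}
```

With this view, the struct pages for a hole between two memory ranges
are never touched at all, which is the behaviour change I would like to
see called out in its own patch.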

> for_each_free_mem_range(i, MAX_NUMNODES, , , NULL)
> count += __free_memory_core(start, end);
>
> @@ -162,6 +165,8 @@ unsigned long __init free_all_bootmem(void)
>  {
> struct pglist_data *pgdat;
>
> +   memblock_dump_all();
> +

Not needed.

> for_each_online_pgdat(pgdat)
> reset_node_lowmem_managed_pages(pgdat);
>
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 635b131..fe51eb9 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -740,6 +740,54 @@ static void __init_single_page(struct page *page, 
> unsigned long zone, int nid, i
>  #endif
>  }
>
> +static void expand_page_initialization(struct page *basepage)
> +{
> +   unsigned long 

Re: [ 0/8] 3.0.86-stable review

2013-07-12 Thread Satoru Takeuchi
At Thu, 11 Jul 2013 15:20:29 -0700,
Greg Kroah-Hartman wrote:
> 
> This is the start of the stable review cycle for the 3.0.86 release.
> There are 8 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat Jul 13 22:18:05 UTC 2013.
> Anything received after that time might be too late.

This kernel can be built and booted without any problem.
Building a kernel with this kernel also works fine.

 - Build Machine: debian jessie x86_64
   CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz x 4
   memory: 8GB

 - Test machine: debian jessie x86_64 (KVM guest on the Build Machine)
   vCPU: x2
   memory: 2GB

Thanks,
Satoru

> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.0.86-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 
> -
> Pseudo-Shortlog of commits:
> 
> Greg Kroah-Hartman 
> Linux 3.0.86-rc1
> 
> Ben Hutchings 
> SCSI: sd: Fix parsing of 'temporary ' cache mode prefix
> 
> J. Bruce Fields 
> nfsd4: fix decoding of compounds across page boundaries
> 
> Greg Kroah-Hartman 
> MAINTAINERS: add stable_kernel_rules.txt to stable maintainer information
> 
> Kees Cook 
> crypto: sanitize argument for format string
> 
> Kees Cook 
> block: do not pass disk names as format strings
> 
> Mikulas Patocka 
> hpfs: better test for errors
> 
> Jonathan Salwan 
> drivers/cdrom/cdrom.c: use kzalloc() for failing hardware
> 
> Tyler Hicks 
> libceph: Fix NULL pointer dereference in auth client code
> 
> 
> -
> 
> Diffstat:
> 
>  MAINTAINERS| 1 +
>  Makefile   | 4 ++--
>  block/genhd.c  | 2 +-
>  crypto/algapi.c| 3 ++-
>  drivers/block/nbd.c| 3 ++-
>  drivers/cdrom/cdrom.c  | 2 +-
>  drivers/scsi/osd/osd_uld.c | 2 +-
>  drivers/scsi/sd.c  | 2 +-
>  fs/hpfs/map.c  | 3 ++-
>  fs/hpfs/super.c| 8 +++-
>  fs/nfsd/nfs4xdr.c  | 2 +-
>  net/ceph/auth_none.c   | 6 ++
>  12 files changed, 27 insertions(+), 11 deletions(-)
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe stable" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [ 00/15] 3.9.10-stable review

2013-07-12 Thread Satoru Takeuchi
At Thu, 11 Jul 2013 15:19:30 -0700,
Greg Kroah-Hartman wrote:
> 
> This is the start of the stable review cycle for the 3.9.10 release.
> There are 15 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat Jul 13 22:11:24 UTC 2013.
> Anything received after that time might be too late.

This kernel can be built and booted without any problem.
Building a kernel with this kernel also works fine.

 - Build Machine: debian jessie x86_64
   CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz x 4
   memory: 8GB

 - Test machine: debian jessy x86_64(KVM guest on the Build Machine)
   vCPU: x2
   memory: 2GB

Thanks,
Satoru
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.9.10-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 
> -
> Pseudo-Shortlog of commits:
> 
> Greg Kroah-Hartman 
> Linux 3.9.10-rc1
> 
> Michal Hocko 
> Revert "memcg: avoid dangling reference count in creation failure"
> 
> Ben Hutchings 
> SCSI: sd: Fix parsing of 'temporary ' cache mode prefix
> 
> Gleb Natapov 
> KVM: VMX: mark unusable segment as nonpresent
> 
> J. Bruce Fields 
> nfsd4: fix decoding of compounds across page boundaries
> 
> Greg Kroah-Hartman 
> Revert "serial: 8250_pci: add support for another kind of NetMos 
> Technology PCI 9835 Multi-I/O Controller"
> 
> Zhang Yi 
> futex: Take hugepages into account when generating futex_key
> 
> Greg Kroah-Hartman 
> MAINTAINERS: add stable_kernel_rules.txt to stable maintainer information
> 
> Kees Cook 
> crypto: sanitize argument for format string
> 
> Kees Cook 
> block: do not pass disk names as format strings
> 
> Mikulas Patocka 
> hpfs: better test for errors
> 
> Kees Cook 
> charger-manager: Ensure event is not used as format string
> 
> Rusty Russell 
> module: do percpu allocation after uniqueness check. No, really!
> 
> Jonathan Salwan 
> drivers/cdrom/cdrom.c: use kzalloc() for failing hardware
> 
> majianpeng 
> ceph: fix sleeping function called from invalid context.
> 
> Tyler Hicks 
> libceph: Fix NULL pointer dereference in auth client code
> 
> 
> -
> 
> Diffstat:
> 
>  MAINTAINERS                        |  1 +
>  Makefile                           |  4 ++--
>  arch/x86/kvm/vmx.c                 | 11 +--
>  block/genhd.c                      |  2 +-
>  crypto/algapi.c                    |  3 ++-
>  drivers/block/nbd.c                |  3 ++-
>  drivers/cdrom/cdrom.c              |  2 +-
>  drivers/power/charger-manager.c    |  2 +-
>  drivers/scsi/osd/osd_uld.c         |  2 +-
>  drivers/scsi/sd.c                  |  2 +-
>  drivers/tty/serial/8250/8250_pci.c |  4
>  fs/ceph/xattr.c                    |  9 +
>  fs/hpfs/map.c                      |  3 ++-
>  fs/hpfs/super.c                    |  8 +++-
>  fs/nfsd/nfs4xdr.c                  |  2 +-
>  include/linux/hugetlb.h            | 16
>  kernel/futex.c                     |  3 ++-
>  kernel/module.c                    | 34 ++
>  mm/hugetlb.c                       | 17 +
>  mm/memcontrol.c                    |  2 --
>  net/ceph/auth_none.c               |  6 ++
>  21 files changed, 95 insertions(+), 41 deletions(-)
> 
> 


Re: [ 00/11] 3.4.53-stable review

2013-07-12 Thread Satoru Takeuchi
At Thu, 11 Jul 2013 15:11:00 -0700,
Greg Kroah-Hartman wrote:
> 
> This is the start of the stable review cycle for the 3.4.53 release.
> There are 11 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat Jul 13 22:04:03 UTC 2013.
> Anything received after that time might be too late.

This kernel builds and boots without any problem.
Building a kernel while running this kernel also works fine.

 - Build Machine: Debian jessie x86_64
   CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz x 4
   memory: 8GB

 - Test machine: Debian jessie x86_64 (KVM guest on the Build Machine)
   vCPU: x2
   memory: 2GB

Thanks,
Satoru

> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.4.53-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 
> -
> Pseudo-Shortlog of commits:
> 
> Greg Kroah-Hartman 
> Linux 3.4.53-rc1
> 
> Greg Kroah-Hartman 
> Revert "sched: Add missing call to calc_load_exit_idle()"
> 
> Ben Hutchings 
> SCSI: sd: Fix parsing of 'temporary ' cache mode prefix
> 
> J. Bruce Fields 
> nfsd4: fix decoding of compounds across page boundaries
> 
> Greg Kroah-Hartman 
> Revert "serial: 8250_pci: add support for another kind of NetMos 
> Technology PCI 9835 Multi-I/O Controller"
> 
> Greg Kroah-Hartman 
> MAINTAINERS: add stable_kernel_rules.txt to stable maintainer information
> 
> Kees Cook 
> crypto: sanitize argument for format string
> 
> Kees Cook 
> block: do not pass disk names as format strings
> 
> Mikulas Patocka 
> hpfs: better test for errors
> 
> Kees Cook 
> charger-manager: Ensure event is not used as format string
> 
> Jonathan Salwan 
> drivers/cdrom/cdrom.c: use kzalloc() for failing hardware
> 
> Tyler Hicks 
> libceph: Fix NULL pointer dereference in auth client code
> 
> 
> -
> 
> Diffstat:
> 
>  MAINTAINERS                        | 1 +
>  Makefile                           | 4 ++--
>  block/genhd.c                      | 2 +-
>  crypto/algapi.c                    | 3 ++-
>  drivers/block/nbd.c                | 3 ++-
>  drivers/cdrom/cdrom.c              | 2 +-
>  drivers/power/charger-manager.c    | 2 +-
>  drivers/scsi/osd/osd_uld.c         | 2 +-
>  drivers/scsi/sd.c                  | 2 +-
>  drivers/tty/serial/8250/8250_pci.c | 4
>  fs/hpfs/map.c                      | 3 ++-
>  fs/hpfs/super.c                    | 8 +++-
>  fs/nfsd/nfs4xdr.c                  | 2 +-
>  kernel/time/tick-sched.c           | 1 -
>  net/ceph/auth_none.c               | 6 ++
>  15 files changed, 28 insertions(+), 17 deletions(-)
> 
> 


Re: [ 00/19] 3.10.1-stable review

2013-07-12 Thread Satoru Takeuchi
At Thu, 11 Jul 2013 15:01:17 -0700,
Greg Kroah-Hartman wrote:
> 
> 
>   I'm sitting on top of over 170 more patches that have been marked for
>   the stable releases right now that are not included in this set of
>   releases.  The fact that there are this many patches for stable stuff
>   that are waiting to be merged through the main -rc1 merge window cycle
>   is worrying to me.
> 
>   Why are subsystem maintainers holding on to fixes that are
>   _supposedly_ affecting all users?  I mean, 21 powerpc core changes
>   that I don't see until a -rc1 merge?  It's as if developers don't
>   expect people to use a .0 release and are relying on me to get the
>   fixes they have buried in their trees out to users.  That's not that
>   nice.  6 "core" iscsi-target fixes?  That's the sign of either a
>   broken subsystem maintainer, or a lack of understanding what the
>   normal -rc kernel releases are supposed to be for.
> 
>   So, I've picked through the patches and dug out only those that I've
>   "guessed" at being more important than others for the 3.10.1 release.
>   I'll get to the rest of these after 3.11-rc1 is out, and eventually
>   they will make it into the stable releases, but I am going to be much
>   more strict as to what is being added (carriage return changes for
>   debug messages, really ACPI developers?)
> 
> 
> 
> This is the start of the stable review cycle for the 3.10.1 release.
> There are 19 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat Jul 13 21:45:35 UTC 2013.
> Anything received after that time might be too late.
> 

This kernel builds and boots without any problem.
Building a kernel while running this kernel also works fine.

 - Build Machine: Debian jessie x86_64
   CPU: Intel(R) Core(TM) i5-2400 CPU @ 3.10GHz x 4
   memory: 8GB

 - Test machine: Debian jessie x86_64 (KVM guest on the Build Machine)
   vCPU: x2
   memory: 2GB

Thanks,
Satoru

> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.10.1-rc1.gz
> and the diffstat can be found below.
> 
> thanks,
> 
> greg k-h
> 
> -
> Pseudo-Shortlog of commits:
> 
> Greg Kroah-Hartman 
> Linux 3.10.1-rc1
> 
> Michal Hocko 
> Revert "memcg: avoid dangling reference count in creation failure"
> 
> Srivatsa S. Bhat 
> cpufreq: Fix cpufreq regression after suspend/resume
> 
> Ben Hutchings 
> SCSI: sd: Fix parsing of 'temporary ' cache mode prefix
> 
> Gleb Natapov 
> KVM: VMX: mark unusable segment as nonpresent
> 
> J. Bruce Fields 
> nfsd4: fix decoding of compounds across page boundaries
> 
> Andy Adamson 
> NFSv4.1 end back channel session draining
> 
> Greg Kroah-Hartman 
> Revert "serial: 8250_pci: add support for another kind of NetMos 
> Technology PCI 9835 Multi-I/O Controller"
> 
> Peter Hurley 
> tty: Reset itty for other pty
> 
> Zhang Yi 
> futex: Take hugepages into account when generating futex_key
> 
> Greg Kroah-Hartman 
> MAINTAINERS: add stable_kernel_rules.txt to stable maintainer information
> 
> Kees Cook 
> crypto: sanitize argument for format string
> 
> Kees Cook 
> block: do not pass disk names as format strings
> 
> Mikulas Patocka 
> hpfs: better test for errors
> 
> Kees Cook 
> charger-manager: Ensure event is not used as format string
> 
> Rusty Russell 
> module: do percpu allocation after uniqueness check. No, really!
> 
> Jonathan Salwan 
> drivers/cdrom/cdrom.c: use kzalloc() for failing hardware
> 
> Josh Durgin 
> libceph: fix invalid unsigned->signed conversion for timespec encoding
> 
> majianpeng 
> ceph: fix sleeping function called from invalid context.
> 
> Tyler Hicks 
> libceph: Fix NULL pointer dereference in auth client code
> 
> 
> -
> 
> Diffstat:
> 
>  MAINTAINERS                        |  1 +
>  Makefile                           |  4 ++--
>  arch/x86/kvm/vmx.c                 | 11 +--
>  block/genhd.c                      |  2 +-
>  crypto/algapi.c                    |  3 ++-
>  drivers/block/nbd.c                |  3 ++-
>  drivers/cdrom/cdrom.c              |  2 +-
>  drivers/cpufreq/cpufreq_stats.c    |  1 +
>  drivers/power/charger-manager.c    |  2 +-
>  drivers/scsi/osd/osd_uld.c         |  2 +-
>  drivers/scsi/sd.c                  |  2 +-
>  drivers/tty/serial/8250/8250_pci.c |  4
>  drivers/tty/tty_io.c               |  2 ++
>  fs/ceph/xattr.c                    |  9 +
>  fs/hpfs/map.c                      |  3 ++-
>  fs/hpfs/super.c                    |  8 +++-
>  fs/nfs/nfs4state.c                 | 23 +++
>  fs/nfsd/nfs4xdr.c                  |  2 +-
>  include/linux/ceph/decode.h        |  5 -
>  include/linux/hugetlb.h            | 16
>  kernel/futex.c                     |  3 ++-
>  kernel/module.c   

Re: linux-next: slab shrinkers: BUG at mm/list_lru.c:92

2013-07-12 Thread Dave Chinner
On Thu, Jul 11, 2013 at 06:42:03PM -0700, Hugh Dickins wrote:
> On Thu, 11 Jul 2013, Michal Hocko wrote:
> > On Thu 11-07-13 12:26:34, Dave Chinner wrote:
> > > > We are waiting for a page under writeback, but neither of the 2 paths starts
> > > > in xfs code. So I do not think waiting for PageWriteback causes a
> > > > deadlock here.
> > > 
> > > The problem is this: the page that we are waiting for IO on is in
> > > the IO completion queue, but the IO completion requires memory
> > > allocation to complete the transaction. That memory allocation is
> > > causing memcg reclaim, which then waits for IO completion on another
> > > page, which may or may not end up in the same IO completion queue.
> > > The CMWQ can continue to process new IO completions - up to a point
> > > - so slow progress will be made. In the worst case, it can deadlock.
> > 
> > OK, I thought something like that was going on but I just wanted to be
> > sure that I didn't manage to confuse you by the lockup messages.
> > > 
> > > GFP_NOFS allocation is the mechanism by which filesystems are
> > > supposed to be able to avoid this recursive deadlock...
> > 
> > Yes.
> > 
> > > > [...]
> > > > > ... is running IO completion work and trying to commit a transaction
> > > > > that is blocked in memory allocation which is waiting for IO
> > > > > completion. It's disappeared up its own fundamental orifice.
> > > > > 
> > > > > Ok, this has absolutely nothing to do with the LRU changes - this is
> > > > > a pre-existing XFS/mm interaction problem from around 3.2. The
> > > > > question is now this: how the hell do I get memory allocation to not
> > > > > block waiting on IO completion here? This is already being done in
> > > > > GFP_NOFS allocation context here
> > > > 
> > > > Just for reference: wait_on_page_writeback is issued only for memcg
> > > > reclaim because there is no other throttling mechanism to prevent
> > > > too many dirty pages from piling up on the list and triggering the
> > > > OOM killer prematurely. See e62e384e9d (memcg: prevent OOM with too
> > > > many dirty pages) for more details. The original patch relied on
> > > > may_enter_fs, but that check was removed later by c3b94f44fc (memcg:
> > > > further prevent OOM with too many dirty pages).
> > > 
> > > Aye. That's the exact code I was looking at yesterday and wondering
> > > "how the hell is waiting on page writeback valid in GFP_NOFS
> > > context?". It seems that memcg reclaim is intentionally ignoring
> > > GFP_NOFS to avoid OOM issues.  That's a memcg implementation problem,
> > > not a filesystem or LRU infrastructure problem
> > 
> > Agreed and until we have a proper per memcg dirty memory throttling we
> > will always be in a workaround mode. Which is sad but that is the
> > reality...
> > 
> > I am CCing Hugh (the discussion was long and started with a different
> > issue, but the above should explain the current xfs hang. It seems
> > that c3b94f44fc makes xfs hang).
> 
> The may_enter_fs test came and went several times as we prepared those
> patches: one set of problems with it in, another set with it out.
> 
> When I made c3b94f44fc, I was not imagining that I/O completion might
> have to wait on a further __GFP_IO allocation.  But I can see the sense
> of what XFS is doing there: after writing the data, it wants to perform
> (initiate?) a transaction; but if that happens to fail, wants to mark
> the written data pages as bad before reaching the end_page_writeback.
> I've toyed with reordering that, but its order does seem sensible.
> 
> I've always thought of GFP_NOFS as meaning "don't recurse into the
> filesystem" (and wondered what that amounts to since direct reclaim
> stopped doing filesystem writeback); but here XFS is expecting it
> to include "and don't wait for PageWriteback to be cleared".

Well, it's more general than that - my understanding of GFP_NOFS is
that it means "don't block reclaim on anything filesystem related
because a filesystem deadlock is possible from this calling
context". Even without direct reclaim doing writeback, there are
still shrinkers that need to avoid locking filesystem objects during
direct reclaim, and there is the fact that waiting on writeback for
specific pages to complete may (indirectly) block a memory
allocation required to complete the writeback of that page. It's the
latter case that is the problem here...
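
To make the reading of GFP_NOFS above concrete, here is a minimal
userspace sketch. It is not kernel code: the TOY_* flag bits and
toy_may_wait_on_writeback() are hypothetical names invented for
illustration, and the bit values are not the kernel's. It only encodes
the rule "reclaim may sleep on page writeback only if the caller allowed
filesystem recursion".

```c
#include <stdbool.h>

/* Toy flag bits for illustration only; these are not the kernel's values. */
#define TOY_GFP_WAIT (1u << 0)
#define TOY_GFP_IO   (1u << 1)
#define TOY_GFP_FS   (1u << 2)

/* Composite masks, mirroring how NOIO/NOFS/KERNEL nest inside each other. */
#define TOY_GFP_NOIO   (TOY_GFP_WAIT)
#define TOY_GFP_NOFS   (TOY_GFP_WAIT | TOY_GFP_IO)
#define TOY_GFP_KERNEL (TOY_GFP_WAIT | TOY_GFP_IO | TOY_GFP_FS)

/*
 * Hypothetical policy helper: may reclaim sleep on PageWriteback for an
 * allocation made with this mask?  Under the reading above, only callers
 * that allow filesystem recursion may, because anyone else might be the
 * very context that has to finish the writeback.
 */
static bool toy_may_wait_on_writeback(unsigned int gfp_mask)
{
	return (gfp_mask & TOY_GFP_FS) != 0;
}
```

With this rule a TOY_GFP_NOFS caller, such as an IO completion path
committing a transaction, is never made to wait on writeback, while a
TOY_GFP_KERNEL caller still can be throttled that way.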

> I've mused on this for a while, and haven't arrived at any conclusion;
> but do have several mutterings on different kinds of solution.
> 
> Probably the easiest solution, but not necessarily the right solution,
> would be for XFS to add a KM_NOIO akin to its KM_NOFS, and use KM_NOIO
> instead of KM_NOFS in xfs_iomap_write_unwritten() (anywhere else?).
> I'd find that more convincing if it were not so obviously designed
> to match an assumption I'd once made over in mm/vmscan.c.

I'd prefer not to have to start using KM_NOIO in specific places in
the filesystem layer. I can see how it may be relevant, though,

[tip:x86/urgent] x86: Make sure IDT is page aligned

2013-07-12 Thread tip-bot for Kees Cook
Commit-ID:  c0b3450f101523a49823fa93d155f1d258e5ac6f
Gitweb: http://git.kernel.org/tip/c0b3450f101523a49823fa93d155f1d258e5ac6f
Author: Kees Cook 
AuthorDate: Fri, 12 Jul 2013 15:50:17 -0700
Committer:  H. Peter Anvin 
CommitDate: Fri, 12 Jul 2013 16:14:08 -0700

x86: Make sure IDT is page aligned

Since the IDT is referenced from a fixmap, make sure it is page aligned.
Merge with 32-bit one, since it was already aligned to deal with F00F bug.
This avoids the risk of it ever being moved in the bss and having the
mapping be offset, resulting in calling incorrect handlers.

[ hpa: It isn't clear that this is a manifest bug in any way, but
  tagging for -stable because it shouldn't hurt and might avoid some
  very hard-to-debug breakages due to unrelated changes. ]
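
For readers who want to see the alignment property in isolation, a
minimal userspace sketch follows. It assumes a 4096-byte page and a GCC
or Clang style aligned attribute; struct toy_gate, toy_idt and
toy_page_offset() are hypothetical stand-ins, not the kernel's
definitions.

```c
#include <stddef.h>
#include <stdint.h>

#define TOY_PAGE_SIZE  4096u
#define TOY_NR_VECTORS 256

/* Stand-in for a 16-byte gate descriptor; the layout is illustrative. */
struct toy_gate {
	uint64_t lo;
	uint64_t hi;
};

/* Page-aligned table, in the spirit of __page_aligned_data. */
static struct toy_gate toy_idt[TOY_NR_VECTORS]
	__attribute__((aligned(4096)));

/* Offset of an address within its page; zero means page aligned. */
static size_t toy_page_offset(const void *addr)
{
	return (size_t)((uintptr_t)addr & (TOY_PAGE_SIZE - 1));
}
```

If the table started at a nonzero page offset, a mapping of the page
that contains it would place the first descriptor at that same offset
rather than at the start of the mapped page, which is the situation the
change is meant to rule out.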

Signed-off-by: Kees Cook 
Link: http://lkml.kernel.org/r/20130712225017.ga5...@www.outflux.net
Reported-by: PaX Team 
Cc: sta...@vger.kernel.org
Signed-off-by: H. Peter Anvin 
---
 arch/x86/kernel/head_64.S | 4 
 arch/x86/kernel/traps.c   | 7 ++-
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 5e4d8a8..317b8cc 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -514,10 +514,6 @@ ENTRY(phys_base)

.section .bss, "aw", @nobits
.align L1_CACHE_BYTES
-ENTRY(idt_table)
-   .skip IDT_ENTRIES * 16
-
-   .align L1_CACHE_BYTES
 ENTRY(debug_idt_table)
.skip IDT_ENTRIES * 16
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b0865e8..0952614 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -68,13 +68,10 @@
 #include 
 
 asmlinkage int system_call(void);
+#endif
 
-/*
- * The IDT has to be page-aligned to simplify the Pentium
- * F0 0F bug workaround.
- */
+/* The IDT has to be page-aligned to keep it aligned with its fixmap. */
 gate_desc idt_table[NR_VECTORS] __page_aligned_data = { { { { 0, 0 } } }, };
-#endif
 
 DECLARE_BITMAP(used_vectors, NR_VECTORS);
 EXPORT_SYMBOL_GPL(used_vectors);


[tip:x86/urgent] x86, suspend: Handle CPUs which fail to #GP on RDMSR

2013-07-12 Thread tip-bot for H. Peter Anvin
Commit-ID:  3a783f6e39cc6c89da8846312f29ca47affaa470
Gitweb: http://git.kernel.org/tip/3a783f6e39cc6c89da8846312f29ca47affaa470
Author: H. Peter Anvin 
AuthorDate: Fri, 12 Jul 2013 16:48:12 -0700
Committer:  H. Peter Anvin 
CommitDate: Fri, 12 Jul 2013 16:48:12 -0700

x86, suspend: Handle CPUs which fail to #GP on RDMSR

There are CPUs which have errata causing RDMSR of a nonexistent MSR to
not fault.  We would then try to WRMSR to restore the value of that
MSR, causing a crash.  Specifically, some Pentium M variants would
have this problem trying to save and restore the non-existent EFER,
causing a crash on resume.

Work around this by making sure we can write back the result at
suspend time.
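
The difference between the old and the new check can be sketched in
userspace as below. This is a toy model, not the kernel code: struct
toy_msr and all toy_* functions are hypothetical, and the model simply
assumes that a write to a nonexistent MSR always faults while a read
may not.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one MSR slot on a CPU. */
struct toy_msr {
	bool exists;      /* does the MSR really exist? */
	bool read_traps;  /* does reading it fault when it is missing? */
	uint64_t value;
};

/* Returns 0 on success, nonzero if the access faulted. */
static int toy_rdmsr_safe(const struct toy_msr *m, uint64_t *out)
{
	if (!m->exists && m->read_traps)
		return -1;
	/* With the erratum the read "succeeds" and returns junk (0 here). */
	*out = m->exists ? m->value : 0;
	return 0;
}

/* Returns 0 on success, nonzero if the access faulted. */
static int toy_wrmsr_safe(struct toy_msr *m, uint64_t val)
{
	if (!m->exists)
		return -1; /* assumed: writes to a missing MSR always fault */
	m->value = val;
	return 0;
}

/* Old check: restore on resume whenever the read did not fault. */
static bool toy_should_restore_old(struct toy_msr *m, uint64_t *saved)
{
	return !toy_rdmsr_safe(m, saved);
}

/* New check: additionally require that the value can be written back. */
static bool toy_should_restore_new(struct toy_msr *m, uint64_t *saved)
{
	return !toy_rdmsr_safe(m, saved) && !toy_wrmsr_safe(m, *saved);
}
```

For a missing MSR whose read does not trap, the old check asks for a
restore on resume and the new check does not, so the faulting write
happens at suspend time, where it is caught, instead of on resume.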

Huge thanks to Christian Sünkenberg for finding the offending erratum
that finally deciphered the mystery.

Reported-and-tested-by: Johan Heinrich 
Debugged-by: Christian Sünkenberg 
Acked-by: Rafael J. Wysocki 
Link: http://lkml.kernel.org/r/51ddc972.3010...@student.kit.edu
Cc:  # v3.7+
---
 arch/x86/kernel/acpi/sleep.c | 18 --
 1 file changed, 16 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kernel/acpi/sleep.c b/arch/x86/kernel/acpi/sleep.c
index 2a34aaf..3312010 100644
--- a/arch/x86/kernel/acpi/sleep.c
+++ b/arch/x86/kernel/acpi/sleep.c
@@ -48,9 +48,20 @@ int x86_acpi_suspend_lowlevel(void)
 #ifndef CONFIG_64BIT
native_store_gdt((struct desc_ptr *)&header->pmode_gdt);
 
+   /*
+* We have to check that we can write back the value, and not
+* just read it.  At least on 90 nm Pentium M (Family 6, Model
+* 13), reading an invalid MSR is not guaranteed to trap, see
+* Erratum X4 in "Intel Pentium M Processor on 90 nm Process
+* with 2-MB L2 Cache and Intel® Processor A100 and A110 on 90
+* nm process with 512-KB L2 Cache Specification Update".
+*/
if (!rdmsr_safe(MSR_EFER,
&header->pmode_efer_low,
-   &header->pmode_efer_high))
+   &header->pmode_efer_high) &&
+   !wrmsr_safe(MSR_EFER,
+   header->pmode_efer_low,
+   header->pmode_efer_high))
header->pmode_behavior |= (1 << WAKEUP_BEHAVIOR_RESTORE_EFER);
 #endif /* !CONFIG_64BIT */
 
@@ -61,7 +72,10 @@ int x86_acpi_suspend_lowlevel(void)
}
if (!rdmsr_safe(MSR_IA32_MISC_ENABLE,
&header->pmode_misc_en_low,
-   &header->pmode_misc_en_high))
+   &header->pmode_misc_en_high) &&
+   !wrmsr_safe(MSR_IA32_MISC_ENABLE,
+   header->pmode_misc_en_low,
+   header->pmode_misc_en_high))
header->pmode_behavior |=
(1 << WAKEUP_BEHAVIOR_RESTORE_MISC_ENABLE);
header->realmode_flags = acpi_realmode_flags;


[PATCH] ARM: Fix r7/r11 confusion when CONFIG_THUMB2_KERNEL=y

2013-07-12 Thread Jed Davis
There is currently some inconsistency about the "frame pointer" on ARM.
r11 is the register which assemblers recognize and disassemblers often
print as "fp", and which is sufficient for stack unwinding when using
the APCS frame pointer option; but when unwinding with the Exception
Handling ABI, the register GCC uses when a constant offset won't suffice
(or when -fno-omit-frame-pointer is used; see kernel/sched/Makefile in
particular) is r11 on ARM and r7 on Thumb.

Correspondingly, arch/arm/include/uapi/arm/ptrace.h defines ARM_fp to
refer to r11, but arch/arm/kernel/unwind.c uses "FP" to mean either r11
or r7 depending on Thumbness, and it is unclear what other cases such as
the "fp" in struct stackframe should be doing.

Effects of this are probably limited to failure of EHABI unwinding when
starting from a function that uses r7 to restore its stack pointer, but
the possibility for further breakage (which would be invisible on
non-Thumb kernels) is worrying.

With this change, it is hoped, r7 is consistently referred to as "r7",
and "fp" always means r11; this costs a few extra ifdefs, but it should
help prevent future issues.

Signed-off-by: Jed Davis 
---
 arch/arm/include/asm/stacktrace.h  |4 
 arch/arm/include/asm/thread_info.h |2 ++
 arch/arm/kernel/perf_event.c   |4 
 arch/arm/kernel/process.c  |4 
 arch/arm/kernel/time.c |4 
 arch/arm/kernel/unwind.c   |   27 ++-
 arch/arm/oprofile/common.c |4 
 7 files changed, 48 insertions(+), 1 deletion(-)

diff --git a/arch/arm/include/asm/stacktrace.h 
b/arch/arm/include/asm/stacktrace.h
index 4d0a164..5e546bf 100644
--- a/arch/arm/include/asm/stacktrace.h
+++ b/arch/arm/include/asm/stacktrace.h
@@ -2,7 +2,11 @@
 #define __ASM_STACKTRACE_H
 
 struct stackframe {
+#ifdef CONFIG_THUMB2_KERNEL
+   unsigned long r7;
+#else
unsigned long fp;
+#endif
unsigned long sp;
unsigned long lr;
unsigned long pc;
diff --git a/arch/arm/include/asm/thread_info.h 
b/arch/arm/include/asm/thread_info.h
index 214d415..ae3cd81 100644
--- a/arch/arm/include/asm/thread_info.h
+++ b/arch/arm/include/asm/thread_info.h
@@ -105,6 +105,8 @@ static inline struct thread_info *current_thread_info(void)
((unsigned long)(task_thread_info(tsk)->cpu_context.sp))
 #define thread_saved_fp(tsk)   \
((unsigned long)(task_thread_info(tsk)->cpu_context.fp))
+#define thread_saved_r7(tsk)   \
+   ((unsigned long)(task_thread_info(tsk)->cpu_context.r7))
 
 extern void crunch_task_disable(struct thread_info *);
 extern void crunch_task_copy(struct thread_info *, void *);
diff --git a/arch/arm/kernel/perf_event.c b/arch/arm/kernel/perf_event.c
index d9f5cd4..55776d6 100644
--- a/arch/arm/kernel/perf_event.c
+++ b/arch/arm/kernel/perf_event.c
@@ -601,7 +601,11 @@ perf_callchain_kernel(struct perf_callchain_entry *entry, 
struct pt_regs *regs)
return;
}
 
+#ifdef CONFIG_THUMB2_KERNEL
+   fr.r7 = regs->ARM_r7;
+#else
fr.fp = regs->ARM_fp;
+#endif
fr.sp = regs->ARM_sp;
fr.lr = regs->ARM_lr;
fr.pc = regs->ARM_pc;
diff --git a/arch/arm/kernel/process.c b/arch/arm/kernel/process.c
index d3ca4f6..aeb4c28 100644
--- a/arch/arm/kernel/process.c
+++ b/arch/arm/kernel/process.c
@@ -405,7 +405,11 @@ unsigned long get_wchan(struct task_struct *p)
if (!p || p == current || p->state == TASK_RUNNING)
return 0;
 
+#ifdef CONFIG_THUMB2_KERNEL
+   frame.r7 = thread_saved_r7(p);
+#else
frame.fp = thread_saved_fp(p);
+#endif
frame.sp = thread_saved_sp(p);
frame.lr = 0;   /* recovered from the stack */
frame.pc = thread_saved_pc(p);
diff --git a/arch/arm/kernel/time.c b/arch/arm/kernel/time.c
index 98aee32..80410d3 100644
--- a/arch/arm/kernel/time.c
+++ b/arch/arm/kernel/time.c
@@ -49,7 +49,11 @@ unsigned long profile_pc(struct pt_regs *regs)
if (!in_lock_functions(regs->ARM_pc))
return regs->ARM_pc;
 
+#ifdef CONFIG_THUMB2_KERNEL
+   frame.r7 = regs->ARM_r7;
+#else
frame.fp = regs->ARM_fp;
+#endif
frame.sp = regs->ARM_sp;
frame.lr = regs->ARM_lr;
frame.pc = regs->ARM_pc;
diff --git a/arch/arm/kernel/unwind.c b/arch/arm/kernel/unwind.c
index 00df012..dec03ae 100644
--- a/arch/arm/kernel/unwind.c
+++ b/arch/arm/kernel/unwind.c
@@ -74,7 +74,7 @@ struct unwind_ctrl_block {
 
 enum regs {
 #ifdef CONFIG_THUMB2_KERNEL
-   FP = 7,
+   R7 = 7,
 #else
FP = 11,
 #endif
@@ -317,8 +317,13 @@ static int unwind_exec_insn(struct unwind_ctrl_block *ctrl)
return -URC_FAILURE;
}
 
+#ifdef CONFIG_THUMB2_KERNEL
+   pr_debug("%s: r7 = %08lx sp = %08lx lr = %08lx pc = %08lx\n", __func__,
+ctrl->vrs[R7], ctrl->vrs[SP], ctrl->vrs[LR], ctrl->vrs[PC]);
+#else
pr_debug("%s: fp = %08lx sp = %08lx lr = %08lx pc = %08lx\n", __func__,
  

[PATCH] ARM: perf: Implement perf_arch_fetch_caller_regs

2013-07-12 Thread Jed Davis
We need a perf_arch_fetch_caller_regs for at least some software events
to be able to get a callchain; even user stacks won't work without
at least the CPSR bits for non-user-mode (see perf_callchain).  In
particular, profiling context switches needs this.

This records the state of the point at which perf_arch_fetch_caller_regs
is expanded, instead of that function activation's call site, because we
need SP and PC to be consistent for EHABI unwinding; hopefully nothing
will be inconvenienced by the extra stack frame.

Signed-off-by: Jed Davis 
---
 arch/arm/include/asm/perf_event.h |   43 +
 1 file changed, 43 insertions(+)

diff --git a/arch/arm/include/asm/perf_event.h 
b/arch/arm/include/asm/perf_event.h
index 7558775..2cc7255 100644
--- a/arch/arm/include/asm/perf_event.h
+++ b/arch/arm/include/asm/perf_event.h
@@ -12,6 +12,8 @@
 #ifndef __ARM_PERF_EVENT_H__
 #define __ARM_PERF_EVENT_H__
 
+#include 
+
 /*
  * The ARMv7 CPU PMU supports up to 32 event counters.
  */
@@ -28,4 +30,45 @@ extern unsigned long perf_misc_flags(struct pt_regs *regs);
 #define perf_misc_flags(regs)  perf_misc_flags(regs)
 #endif
 
+/*
+ * We can't actually get the caller's registers here; the saved PC and
+ * SP values have to be consistent or else EHABI unwinding won't work,
+ * and the only way to find the matching SP for the return address is
+ * to unwind the current function.  So we save the current state
+ * instead.
+ *
+ * Note that the ARM Exception Handling ABI allows unwinding to depend
+ * on the contents of any core register, but our unwinder is limited
+ * to the ones in struct stackframe (which are the only ones we expect
+ * GCC to need for kernel code), so we just record those.
+ */
+#ifdef CONFIG_THUMB2_KERNEL
+#define perf_arch_fetch_caller_regs(regs, ip)  \
+   do {\
+   __u32 _cpsr, _pc;   \
+   __asm__ __volatile__("str r7, [%[_regs], #(7 * 4)]\n\t" \
+"str r13, [%[_regs], #(13 * 4)]\n\t" \
+"str r14, [%[_regs], #(14 * 4)]\n\t" \
+"mov %[_pc],  r15\n\t" \
+"mrs %[_cpsr], cpsr\n\t"   \
+: [_cpsr] "=r" (_cpsr),\
+  [_pc] "=r" (_pc) \
+: [_regs] "r" (&(regs)->uregs) \
+: "memory");   \
+   (regs)->ARM_pc = _pc;   \
+   (regs)->ARM_cpsr = _cpsr;   \
+   } while (0)
+#else
+#define perf_arch_fetch_caller_regs(regs, ip)  \
+   do {\
+   __u32 _cpsr;\
+   __asm__ __volatile__("stmia %[_regs11], {r11 - r15}\n\t" \
+"mrs %[_cpsr], cpsr\n\t"   \
+: [_cpsr] "=r" (_cpsr) \
+: [_regs11] "r" (&(regs)->uregs[11]) \
+: "memory");   \
+   (regs)->ARM_cpsr = _cpsr;   \
+   } while (0)
+#endif
+
 #endif /* __ARM_PERF_EVENT_H__ */
-- 
1.7.10.4



Re: [RFC 2/4] Have __free_pages_memory() free in larger chunks.

2013-07-12 Thread Yinghai Lu
On Fri, Jul 12, 2013 at 12:45 AM, Robin Holt  wrote:

> At the very least, I think we could change to:
> static void __init __free_pages_memory(unsigned long start, unsigned long end)
> {
> int order;
>
> while (start < end) {
> order = ffs(start);
>
> while (start + (1UL << order) > end)
> order--;
>
> __free_pages_bootmem(start, order);
>
> start += (1UL << order);
> }
> }

should work, but need to make sure order < MAX_ORDER.
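
A userspace sketch of the loop with that cap applied might look like
the following. It is only an illustration under stated assumptions:
TOY_MAX_ORDER is an assumed value, toy_chunk_order() and
toy_split_range() are hypothetical helpers, and it uses the GCC/Clang
builtin __builtin_ctzl() so that the alignment order is zero-based and
pfn 0 is handled explicitly. Instead of freeing pages it just records
the order chosen for each chunk.

```c
#include <stddef.h>

#define TOY_MAX_ORDER 11 /* assumed value; the real one depends on config */

/*
 * Pick the largest order such that the block starting at pfn `start` is
 * naturally aligned, fits before `end`, and stays below TOY_MAX_ORDER.
 * The caller must guarantee start < end, so order 0 always fits.
 */
static int toy_chunk_order(unsigned long start, unsigned long end)
{
	int order = TOY_MAX_ORDER - 1;

	/* Alignment limit: trailing zero bits (pfn 0 is aligned to anything). */
	if (start != 0) {
		int align = __builtin_ctzl(start);

		if (align < order)
			order = align;
	}

	/* Size limit: shrink until the block fits in [start, end). */
	while (start + (1UL << order) > end)
		order--;

	return order;
}

/* Walk [start, end) and record the order chosen for each chunk. */
static size_t toy_split_range(unsigned long start, unsigned long end,
			      int *orders, size_t max)
{
	size_t n = 0;

	while (start < end && n < max) {
		int order = toy_chunk_order(start, end);

		orders[n++] = order;
		start += 1UL << order;
	}
	return n;
}
```

For the range [3, 20) this yields chunks of order 0, 2, 3 and 2, and for
[0, 4096) it yields four chunks of order 10, none reaching TOY_MAX_ORDER.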

Yinghai


Re: [RFC 3/4] Seperate page initialization into a separate function.

2013-07-12 Thread Yinghai Lu
On Thu, Jul 11, 2013 at 7:03 PM, Robin Holt  wrote:
> Currently, memmap_init_zone() has all the smarts for initializing a
> single page.  When we convert to initializing pages in a 2MiB chunk,
> we will need to do this equivalent work from two separate places
> so we are breaking out a helper function.
>
> Signed-off-by: Robin Holt 
> Signed-off-by: Nate Zimmer 
> To: "H. Peter Anvin" 
> To: Ingo Molnar 
> Cc: Linux Kernel 
> Cc: Linux MM 
> Cc: Rob Landley 
> Cc: Mike Travis 
> Cc: Daniel J Blueman 
> Cc: Andrew Morton 
> Cc: Greg KH 
> Cc: Yinghai Lu 
> Cc: Mel Gorman 
> ---
>  mm/mm_init.c|  2 +-
>  mm/page_alloc.c | 75 
> +
>  2 files changed, 45 insertions(+), 32 deletions(-)
>
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index c280a02..be8a539 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -128,7 +128,7 @@ void __init mminit_verify_pageflags_layout(void)
> BUG_ON(or_mask != add_mask);
>  }
>
> -void __meminit mminit_verify_page_links(struct page *page, enum zone_type 
> zone,
> +void mminit_verify_page_links(struct page *page, enum zone_type zone,
> unsigned long nid, unsigned long pfn)
>  {
> BUG_ON(page_to_nid(page) != nid);
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c3edb62..635b131 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -697,6 +697,49 @@ static void free_one_page(struct zone *zone, struct page 
> *page, int order,
> spin_unlock(&zone->lock);
>  }
>
> +static void __init_single_page(struct page *page, unsigned long zone, int 
> nid, int reserved)
> +{
> +   unsigned long pfn = page_to_pfn(page);
> +   struct zone *z = &NODE_DATA(nid)->node_zones[zone];
> +
> +   set_page_links(page, zone, nid, pfn);
> +   mminit_verify_page_links(page, zone, nid, pfn);
> +   init_page_count(page);
> +   page_mapcount_reset(page);
> +   page_nid_reset_last(page);
> +   if (reserved) {
> +   SetPageReserved(page);
> +   } else {
> +   ClearPageReserved(page);
> +   set_page_count(page, 0);
> +   }
> +   /*
> +* Mark the block movable so that blocks are reserved for
> +* movable at startup. This will force kernel allocations
> +* to reserve their blocks rather than leaking throughout
> +* the address space during boot when many long-lived
> +* kernel allocations are made. Later some blocks near
> +* the start are marked MIGRATE_RESERVE by
> +* setup_zone_migrate_reserve()
> +*
> +* bitmap is created for zone's valid pfn range. but memmap
> +* can be created for invalid pages (for alignment)
> +* check here not to call set_pageblock_migratetype() against
> +* pfn out of zone.
> +*/
> +   if ((z->zone_start_pfn <= pfn)
> +   && (pfn < zone_end_pfn(z))
> +   && !(pfn & (pageblock_nr_pages - 1)))
> +   set_pageblock_migratetype(page, MIGRATE_MOVABLE);
> +
> +   INIT_LIST_HEAD(&page->lru);
> +#ifdef WANT_PAGE_VIRTUAL
> +   /* The shift won't overflow because ZONE_NORMAL is below 4G. */
> +   if (!is_highmem_idx(zone))
> +   set_page_address(page, __va(pfn << PAGE_SHIFT));
> +#endif
> +}
> +
>  static bool free_pages_prepare(struct page *page, unsigned int order)
>  {
> int i;
> @@ -3934,37 +3977,7 @@ void __meminit memmap_init_zone(unsigned long size, 
> int nid, unsigned long zone,
> continue;
> }
> page = pfn_to_page(pfn);
> -   set_page_links(page, zone, nid, pfn);
> -   mminit_verify_page_links(page, zone, nid, pfn);
> -   init_page_count(page);
> -   page_mapcount_reset(page);
> -   page_nid_reset_last(page);
> -   SetPageReserved(page);
> -   /*
> -* Mark the block movable so that blocks are reserved for
> -* movable at startup. This will force kernel allocations
> -* to reserve their blocks rather than leaking throughout
> -* the address space during boot when many long-lived
> -* kernel allocations are made. Later some blocks near
> -* the start are marked MIGRATE_RESERVE by
> -* setup_zone_migrate_reserve()
> -*
> -* bitmap is created for zone's valid pfn range. but memmap
> -* can be created for invalid pages (for alignment)
> -* check here not to call set_pageblock_migratetype() against
> -* pfn out of zone.
> -*/
> -   if ((z->zone_start_pfn <= pfn)
> -   && (pfn < zone_end_pfn(z))
> -   && !(pfn & (pageblock_nr_pages - 1)))
> -   set_pageblock_migratetype(page, MIGRATE_MOVABLE);
> -
> -   INIT_LIST_HEAD(&page->lru);

Re: [PATCH] fnic: use simple_open instead of fnic_trace_ctrl_open

2013-07-12 Thread Hiral Patel (hiralpat)


On 7/11/13 11:50 PM, "Camelia Groza"  wrote:

>This removes the open coded fnic_trace_ctrl_open() function
>and replaces file operations references to the function
>with simple_open() instead.
>
>Found using coccinelle.
>
>Signed-off-by: Camelia Groza 
>---
> drivers/scsi/fnic/fnic_debugfs.c |   19 +--
> 1 file changed, 1 insertion(+), 18 deletions(-)
>
>diff --git a/drivers/scsi/fnic/fnic_debugfs.c
>b/drivers/scsi/fnic/fnic_debugfs.c
>index cbcb012..ddc7e94 100644
>--- a/drivers/scsi/fnic/fnic_debugfs.c
>+++ b/drivers/scsi/fnic/fnic_debugfs.c
>@@ -25,23 +25,6 @@ static struct dentry *fnic_trace_debugfs_file;
> static struct dentry *fnic_trace_enable;
> 
> /*
>- * fnic_trace_ctrl_open - Open the trace_enable file
>- * @inode: The inode pointer.
>- * @file: The file pointer to attach the trace enable/disable flag.
>- *
>- * Description:
>- * This routine opens a debugsfs file trace_enable.
>- *
>- * Returns:
>- * This function returns zero if successful.
>- */
>-static int fnic_trace_ctrl_open(struct inode *inode, struct file *filp)
>-{
>-  filp->private_data = inode->i_private;
>-  return 0;
>-}
>-
>-/*
>  * fnic_trace_ctrl_read - Read a trace_enable debugfs file
>  * @filp: The file pointer to read from.
>  * @ubuf: The buffer to copy the data to.
>@@ -222,7 +205,7 @@ static int fnic_trace_debugfs_release(struct inode
>*inode,
> 
> static const struct file_operations fnic_trace_ctrl_fops = {
>   .owner = THIS_MODULE,
>-  .open = fnic_trace_ctrl_open,
>+  .open = simple_open,
>   .read = fnic_trace_ctrl_read,
>   .write = fnic_trace_ctrl_write,
> };
>-- 
>1.7.10.4

Acked-by: Hiral Patel 


--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358

2013-07-12 Thread Dave Chinner
On Thu, Jul 11, 2013 at 10:39:30PM -0400, Dave Jones wrote:
> Just saw this during boot after an unclean shutdown. It hung afterwards.
> 
> [   97.162665] XFS: Assertion failed: xfs_dir2_sf_lookup(args) == ENOENT, file: fs/xfs/xfs_dir2_sf.c, line: 358

> [   97.173730]  [] xfs_dir2_sf_addname+0x43/0x760 [xfs]
> [   97.173743]  [] xfs_dir_createname+0x15c/0x1b0 [xfs]
> [   97.173754]  [] xfs_create+0x4cc/0x710 [xfs]
> [   97.173764]  [] xfs_vn_mknod+0x9a/0x1c0 [xfs]
> [   97.173773]  [] xfs_vn_create+0x13/0x20 [xfs]
> [   97.173776]  [] vfs_create+0x9d/0x100
> [   97.173778]  [] do_last+0x925/0xe00
> [   97.173780]  [] path_openat+0xbe/0x6f0
> [   97.173783]  [] ? local_clock+0x3f/0x50
> [   97.173785]  [] ? __alloc_fd+0xaf/0x200
> [   97.173787]  [] do_filp_open+0x3a/0x90
> [   97.173789]  [] ? __alloc_fd+0xaf/0x200
> [   97.173790]  [] do_sys_open+0x10b/0x200
> [   97.173792]  [] ? syscall_trace_enter+0x18/0x290
> [   97.173794]  [] SyS_open+0x1e/0x20
> 
> This trace repeated a few times, then the same assertion was triggered from
> sys_renameat.

That's rather curious. What this means is that there is either an
EIO or EEXIST error being returned from xfs_dir2_sf_lookup() when
we're about to add the new entry. There are two things here - EIO
can only be returned if a shutdown has occurred - are there any
signs of a shutdown in the logs? If there is a shutdown in progress,
then this is just unlucky to shutdown with an inode in an
inconsistent state in memory that triggers this validity check
failure.

And EEXIST means that the initial lookup of the name during the open
failed to find the entry we are now trying to create. i.e. the
initial path walk failed to do the correct lookup on the directory,
and so never got down to xfs_dir2_sf_lookup() to find the directory
entry (perhaps a problem with a cached negative dentry?). Hence it
was decided during the open(O_CREAT) call that the directory entry
needed to be created, we get down to XFS to create it, and then get
EEXIST because the name already exists...

So, it's not clear what has caused this yet. Is it reproducible? It
would be good to get a trace of lookup vs addname events from XFS,
too (i.e. all the xfs_dir* and xfs_da* events) so we can see if the
correct lookups were done prior to the failing addname operation...
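
As a rough userspace illustration of the check being discussed (not XFS code): the `toy_dir` type and the `toy_*` functions below are hypothetical, and only the ENOENT/EEXIST convention is taken from the explanation above. The add path repeats the lookup and expects ENOENT; anything else means an earlier lookup and the directory contents disagree.

```c
#include <errno.h>
#include <string.h>

#define TOY_MAX_ENTRIES 8
#define TOY_NAME_LEN 32

/* Hypothetical stand-in for a short-form directory: a flat array of names. */
struct toy_dir {
	char names[TOY_MAX_ENTRIES][TOY_NAME_LEN];
	int count;
};

/* Same convention as discussed above: EEXIST if found, ENOENT if not. */
static int toy_lookup(const struct toy_dir *dir, const char *name)
{
	for (int i = 0; i < dir->count; i++)
		if (strcmp(dir->names[i], name) == 0)
			return EEXIST;
	return ENOENT;
}

/*
 * Add a name. The caller is expected to have done its own lookup first,
 * so finding the name here means that earlier lookup missed it. A debug
 * build would assert at this point; this sketch returns the error
 * instead so that the case can be exercised.
 */
static int toy_addname(struct toy_dir *dir, const char *name)
{
	int error = toy_lookup(dir, name);

	if (error != ENOENT)
		return error;
	if (dir->count == TOY_MAX_ENTRIES || strlen(name) >= TOY_NAME_LEN)
		return ENOSPC;
	strcpy(dir->names[dir->count++], name);
	return 0;
}
```

Calling `toy_addname()` twice with the same name returns EEXIST the second time, which corresponds to the "name already exists" case described above.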

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com


Re: Yet more softlockups.

2013-07-12 Thread Vince Weaver
On Fri, 12 Jul 2013, Dave Jones wrote:

> 
> Here's a fun trick:
> 
> trinity -c perf_event_open -C4 -q -l off
> 
> Within about a minute, that brings any of my boxes to its knees.
> The softlockup detector starts going nuts, and then the box wedges solid.

are you running with the patch

 [PATCH 1/2] perf: Clone child context from parent context pmu

 https://lkml.org/lkml/2013/7/9/310

It hasn't hit Linus git yet.

It fixes a bug that my perf_fuzzer would hit within seconds but it took me 
over a month of trace bisecting and kernel bisecting to isolate it.

The symptoms were stuck processes and NMI tracebacks leading to hard 
locks.  

With the patch applied my perf_fuzzer (which uses the same perf_event_open 
syscall generator as trinity) runs for hours w/o problems.

Vince


[PATCH RFC 2/2] x86 qrwlock: Enable x86 to use queue read/write lock

2013-07-12 Thread Waiman Long
This patch makes the necessary changes at the x86 architecture specific
layer to enable the presence of the CONFIG_QUEUE_RWLOCK kernel option
to replace the plain read/write lock by the queue read/write lock.

Signed-off-by: Waiman Long 
---
 arch/x86/Kconfig  |3 +++
 arch/x86/include/asm/spinlock.h   |2 ++
 arch/x86/include/asm/spinlock_types.h |4 
 3 files changed, 9 insertions(+), 0 deletions(-)

diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
index b32ebf9..638dbaa 100644
--- a/arch/x86/Kconfig
+++ b/arch/x86/Kconfig
@@ -2344,6 +2344,9 @@ config X86_DMA_REMAP
bool
depends on STA2X11
 
+config ARCH_QUEUE_RWLOCK
+   def_bool y
+
 source "net/Kconfig"
 
 source "drivers/Kconfig"
diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h
index 33692ea..613a4ff 100644
--- a/arch/x86/include/asm/spinlock.h
+++ b/arch/x86/include/asm/spinlock.h
@@ -137,6 +137,7 @@ static inline void arch_spin_unlock_wait(arch_spinlock_t 
*lock)
cpu_relax();
 }
 
+#ifndef CONFIG_QUEUE_RWLOCK
 /*
  * Read-write spinlocks, allowing multiple readers
  * but only one writer.
@@ -219,6 +220,7 @@ static inline void arch_write_unlock(arch_rwlock_t *rw)
asm volatile(LOCK_PREFIX WRITE_LOCK_ADD(%1) "%0"
 : "+m" (rw->write) : "i" (RW_LOCK_BIAS) : "memory");
 }
+#endif /* CONFIG_QUEUE_RWLOCK */
 
 #define arch_read_lock_flags(lock, flags) arch_read_lock(lock)
 #define arch_write_lock_flags(lock, flags) arch_write_lock(lock)
diff --git a/arch/x86/include/asm/spinlock_types.h 
b/arch/x86/include/asm/spinlock_types.h
index ad0ad07..afacd36 100644
--- a/arch/x86/include/asm/spinlock_types.h
+++ b/arch/x86/include/asm/spinlock_types.h
@@ -28,6 +28,10 @@ typedef struct arch_spinlock {
 
 #define __ARCH_SPIN_LOCK_UNLOCKED  { { 0 } }
 
+#ifdef CONFIG_QUEUE_RWLOCK
+#include 
+#else
 #include 
+#endif
 
 #endif /* _ASM_X86_SPINLOCK_TYPES_H */
-- 
1.7.1



[PATCH RFC 0/2] qrwlock: Introducing a queue read/write lock implementation

2013-07-12 Thread Waiman Long
This patch set introduces a queue-based read/write lock implementation that
is both faster and fairer than the current read/write lock. It can
also be used as a replacement for ticket spinlocks that are highly
contended if lock size increase is not an issue.

There is no change in the interface. By just replacing the current
read/write lock with the queue read/write lock, we can have a faster
and more deterministic system.

Signed-off-by: Waiman Long 

Waiman Long (2):
  qrwlock: A queue read/write lock implementation
  x86 qrwlock: Enable x86 to use queue read/write lock

 arch/x86/Kconfig  |3 +
 arch/x86/include/asm/spinlock.h   |2 +
 arch/x86/include/asm/spinlock_types.h |4 +
 include/asm-generic/qrwlock.h |  124 +
 lib/Kconfig   |   11 ++
 lib/Makefile  |1 +
 lib/qrwlock.c |  246 +
 7 files changed, 391 insertions(+), 0 deletions(-)
 create mode 100644 include/asm-generic/qrwlock.h
 create mode 100644 lib/qrwlock.c



[PATCH RFC 1/2] qrwlock: A queue read/write lock implementation

2013-07-12 Thread Waiman Long
This patch introduces a read/write lock implementation that puts waiting
readers and writers into a queue instead of actively contending the
lock like the regular read/write lock. This will improve performance in
highly contended situations by reducing the cache line bouncing effect.

In addition, the queue read/write lock is more deterministic even
though there is still a small chance for lock stealing if the reader
or writer comes at the right moment. Other than that, lock granting
is done in a FIFO manner. The only downside is the size increase in
the lock structure by 4 bytes for 32-bit systems and by 12 bytes for
64-bit systems.
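
To illustrate why queuing waiters reduces cache line bouncing, here is a minimal userspace sketch of an MCS-style queue spinlock using C11 atomics. It is not the qrwlock code from this patch (it does not distinguish readers from writers), and the `qlock`/`qnode` names are hypothetical. Each waiter spins on a flag in its own node rather than on the shared lock word, and the lock is handed over in FIFO order.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* One node per waiter; each waiter spins only on its own 'wait' flag. */
struct qnode {
	_Atomic(struct qnode *) next;
	atomic_bool wait;
};

/* The lock itself is just a pointer to the tail of the waiter queue. */
struct qlock {
	_Atomic(struct qnode *) tail;
};

static void qlock_init(struct qlock *l)
{
	atomic_init(&l->tail, NULL);
}

static void qlock_acquire(struct qlock *l, struct qnode *me)
{
	atomic_store(&me->next, NULL);
	atomic_store(&me->wait, true);

	/* Append ourselves to the queue; the old tail is our predecessor. */
	struct qnode *prev = atomic_exchange(&l->tail, me);
	if (prev == NULL)
		return;		/* queue was empty: lock acquired */

	atomic_store(&prev->next, me);
	while (atomic_load(&me->wait))
		;		/* spin on our own node only */
}

static void qlock_release(struct qlock *l, struct qnode *me)
{
	struct qnode *succ = atomic_load(&me->next);

	if (succ == NULL) {
		/* No visible successor: try to mark the queue empty. */
		struct qnode *expected = me;
		if (atomic_compare_exchange_strong(&l->tail, &expected, NULL))
			return;
		/* A successor is mid-enqueue; wait for it to link in. */
		while ((succ = atomic_load(&me->next)) == NULL)
			;
	}
	atomic_store(&succ->wait, false);	/* hand the lock over in FIFO order */
}
```

The trade-off is the one described above: the lock carries a queue pointer and each waiter needs a node, in exchange for contended waiters no longer hammering a single shared cache line.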

This patch allows the replacement of architecture specific
implementation of read/write lock by this generic version of queue
read/write lock. Two new config parameters are introduced:

1. QUEUE_RWLOCK
   A select-able option that enables the building and replacement of
   architecture specific read/write lock.
2. ARCH_QUEUE_RWLOCK
   Has to be defined in arch/$(arch)/Kconfig to enable QUEUE_RWLOCK

In terms of single-thread performance (no contention), a 256K
lock/unlock loop was run on a 2.4GHz Westmere x86-64 CPU. The following
table shows the average time for a single lock/unlock sequence:

Lock Type           Time (ns)
---------           ---------
Ticket spinlock       15.7
Read lock             17.0
Write lock            17.2
Queue read lock       31.1
Queue write lock      13.6

While the queue read lock is almost double the time of a read lock
or spinlock, the queue write lock is the fastest of them all. The
execution time can probably be reduced a bit by allowing inlining of
the lock fast paths like the other locks.

To see how the queue write lock can be used as a replacement for ticket
spinlock (just like rwsem can be used as replacement of mutex), the
mb_cache_spinlock in fs/mbcache.c, which is a bottleneck in the disk
workload (ext4 FS) of the AIM7 benchmark, was converted to both a queue
write lock and a regular write lock. When running on an 8-socket 80-core
DL980 system, the performance improvement is shown in the table below.

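+-----------------+------------+------------+-----------+---------+
|  Configuration  |  Mean JPM  |  Mean JPM  | Mean JPM  | qrwlock |
|                 |Vanilla 3.10|3.10-qrwlock|3.10-rwlock| %Change |
+-----------------+-----------------------------------------------+
|                 |              User Range 10 - 100              |
+-----------------+-----------------------------------------------+
| 8 nodes, HT off |   441374   |   532774   |  637205   | +20.7%  |
| 8 nodes, HT on  |   449373   |   584387   |  641965   | +30.1%  |
+-----------------+-----------------------------------------------+
|                 |             User Range 200 - 1000             |
+-----------------+-----------------------------------------------+
| 8 nodes, HT off |   226870   |   354490   |  371593   | +56.3%  |
| 8 nodes, HT on  |   205997   |   314891   |  306378   | +52.9%  |
+-----------------+-----------------------------------------------+
|                 |            User Range 1100 - 2000             |
+-----------------+-----------------------------------------------+
| 8 nodes, HT off |   208234   |   321420   |  343694   | +54.4%  |
| 8 nodes, HT on  |   199612   |   297853   |  252569   | +49.2%  |
+-----------------+------------+------------+-----------+---------+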

Apparently, the regular read/write lock performs even better than
the queue read/write lock in some cases.  This is probably due to the
fact that mb_cache_spinlock is in a separate cache line from the data
being manipulated.
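
For reference, the "%Change" column in the table above appears to be the relative change of the 3.10-qrwlock mean JPM against the vanilla 3.10 mean JPM. The helper below is only a sketch of that arithmetic; the function name is hypothetical and not part of the patch.

```c
/* Percentage change from a baseline throughput to a new throughput. */
static double pct_change(double baseline, double value)
{
	return (value - baseline) / baseline * 100.0;
}
```

For the first row, `pct_change(441374, 532774)` is about 20.7, matching the table. Applying the same helper to the 3.10-rwlock column of that row, `pct_change(441374, 637205)`, gives roughly 44.4, which is the "regular read/write lock performs even better" case mentioned above.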

Looking at the fserver and new_fserver workloads (ext4 FS) where the
journal->j_state_lock (a read/write lock) is a bottleneck especially
when HT is on, we see a slightly different story. The j_state_lock is
an embedded read/write lock which is in the same cacheline as some
of the data being manipulated. The replacement by a queue read/write
lock gives the following improvement.

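+--------------------+-------------+--------------+---------------+
|      Workload      |mean % change|mean % change |mean % change  |
|                    |10-100 users |200-1000 users|1100-2000 users|
+--------------------+-------------+--------------+---------------+
|fserver (HT off)    |    +0.3%    |    -0.1%     |     +1.9%     |
|fserver (HT on)     |    -0.1%    |   +32.2%     |    +34.7%     |
|new_fserver (HT on) |    +0.8%    |    +0.9%     |     +0.9%     |
|new_fserver (HT off)|    -1.2%    |   +29.8%     |    +40.5%     |
+--------------------+-------------+--------------+---------------+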

Signed-off-by: Waiman Long 
---
 include/asm-generic/qrwlock.h |  124 +
 lib/Kconfig   |   11 ++
 lib/Makefile  |1 +
 lib/qrwlock.c |  246 +
 4 files changed, 382 insertions(+), 0 deletions(-)
 create mode 100644 

Re: [ 00/15] 3.9.10-stable review

2013-07-12 Thread Greg Kroah-Hartman
On Fri, Jul 12, 2013 at 03:05:05PM -0700, Guenter Roeck wrote:
> On Thu, Jul 11, 2013 at 03:19:30PM -0700, Greg Kroah-Hartman wrote:
> > This is the start of the stable review cycle for the 3.9.10 release.
> > There are 15 patches in this series, all will be posted as a response
> > to this one.  If anyone has any issues with these being applied, please
> > let me know.
> > 
> > Responses should be made by Sat Jul 13 22:11:24 UTC 2013.
> > Anything received after that time might be too late.
> > 
> > The whole patch series can be found in one patch at:
> > kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.9.10-rc1.gz
> > and the diffstat can be found below.
> > 
> Cross build results are as follows. No change to previous release.

Thanks for the testing,

greg k-h


Re: [Ksummit-2013-discuss] When to push bug fixes to mainline

2013-07-12 Thread Greg Kroah-Hartman
On Sat, Jul 13, 2013 at 02:24:07AM +0200, Rafael J. Wysocki wrote:
> On Thursday, July 11, 2013 08:34:30 PM Greg Kroah-Hartman wrote:
> > On Thu, Jul 11, 2013 at 10:57:46PM -0400, John W. Linville wrote:
> > > On Thu, Jul 11, 2013 at 08:50:23PM -0400, Theodore Ts'o wrote:
> > > 
> > > > In any case, I've been very conservative in _not_ pushing bug fixes to
> > > > Linus after -rc3 (unless they are fixing a regression or the bug fix
> > > > is super-serious); I'd much rather have them cook in the ext4 tree
> > > > where they can get a lot more testing (a full regression test run for
> > > > ext4 takes over 24 hours), and for people trying out linux-next.
> > > > 
> > > > Maybe the pendulum has swung too far in the direction of holding back
> > > > changes and trying to avoid the risk of introducing regressions;
> > > > perhaps this would be a good topic to discuss at the Kernel Summit.
> > > 
> > > Yes, there does seem to be a certain ebb and flow as to how strict
> > > the rules are about what should go into stable, what fixes are "good
> > > enough" for a given -rc, how tight those rules are in -rc2 vs in -rc6,
> > > etc.  If nothing else, a good repetitive flogging and a restatement of
> > > the One True Way to handle these things might be worthwhile once again...
> > 
> > The rules are documented in stable_kernel_rules.txt for what I will
> > accept.
> > 
> > I have been beating on maintainers for 8 years now to actually mark
> > patches for stable, and only this past year have I finally seen people
> > do it (we FINALLY got SCSI patches marked for stable in this merge
> > window!!!)  So now that maintainers are finally realizing that they need
> > to mark patches, I'll be pushing back harder on the patches that they do
> > submit, because the distros are rightfully pushing back on me for
> > accepting things that are outside of the stable_kernel_rules.txt
> > guidelines.
> 
> I don't quite understand why they are pushing back on you rather than on
> the maintainers who have marked the commits they have problems with for
> -stable.  Why are you supposed to play the role of the gatekeeper here?
> Can't maintainers be held responsible for the commits they mark for -stable in
> the same way as they are responsible for the commits they push to Linus?

Because I'm an easy big target and people are lazy.

> Also, I don't really think that the distros have problems with fixes that are
> simple and provably correct, even though the problems they fix don't seem to be
> "serious enough" for -stable.  They rather have problems with subtle changes
> whose impact is difficult to estimate by inspection and you're not going to be
> pushing back on those anyway (exactly because their impact is difficult to
> estimate).

I know that, you know that, but managers who see tons of kernel patches
just get scared :)

> > If you look on the stable@vger list, I've already rejected 3 today and
> > asked about the huge 21 powerpc patches.  Sure, it's not a lot, when
> > staring down 174 more to go, but it's a start...
> 
> And 2 of those 3 rejected were mine and for 1 of them I actually had a very
> specific reason to mark it for -stable as I told you: It fixed a breakage
> introduced inadvertently in 3.10 and I thought it would be good to reduce
> the exposure of that breakage by fixing it in 3.10.1 as well as in 3.11-rc.

There was no real breakage, that is why I rejected it.

greg k-h


Re: [PATCH] misc: add driver for Renesas R-Car Gyro-ADC/speed-pulse interfaces

2013-07-12 Thread Greg KH
On Sat, Jul 13, 2013 at 05:15:16AM +0400, Sergei Shtylyov wrote:
> Hello.
> 
>Can't get to sleep, sigh...
> 
> On 07/13/2013 04:57 AM, Greg KH wrote:
> 
> >>Add the driver for Gyro-ADC/speed-pulse interfaces found in Renesas R-Car SoCs.
> >>Though  being two separate devices, they have to be driven together because of
> >>the shared start/stop register (located in Gyro-ADC still). At this time, only
> >>speed-pulse interface is fully supported, the Gyro-ADC is just initialized and
> >>started/stopped synchronously with the speed-pulse interface.  A user interface
> >>is implemented via several sysfs files which allow to read and reset the speed-
> >>pulse interface's registers.
> 
> >If you modify/create/remove sysfs files, you also have to document them
> >in Documentation/ABI/ which is missing from this patch.
> 
>I've looked there and didn't find the documentation for my closest
> model driver, drivers/misc/ti_dac7512.c (or for many other drivers),
> so I thought I too can do without it.

Nope, that driver should be fixed as well, care to do so?

> >Your sysfs files are also being created in a "racy" way, i.e. after
> >userspace is told about the device, please fix that as well.
> 
>Not sure I understand you. Could you elaborate?

Please read the driver model documentation, it goes into the details of
how to do this properly.  As does this post from me a week or so ago:

http://kroah.com/log/blog/2013/06/26/how-to-create-a-sysfs-file-correctly/

greg k-h


Re: [PATCH 3/3] pch_gbe: Add MinnowBoard support

2013-07-12 Thread Greg KH
On Fri, Jul 12, 2013 at 05:58:07PM -0700, Darren Hart wrote:
> The MinnowBoard uses an AR803x PHY with the PCH GBE which requires
> special handling. Use the MinnowBoard PCI Subsystem ID to detect this
> and add a pci_device_id.driver_data structure and functions to handle
> platform setup.
> 
> The AR803x does not implement the RGMII 2ns TX clock delay in the trace
> routing nor via strapping. Add a detection method for the board and the
> PHY and enable the TX clock delay via the registers.
> 
> This PHY will hibernate without link for 10 seconds. Ensure the PHY is
> awake for probe and then disable hibernation. A future improvement would
> be to convert pch_gbe to using PHYLIB and making sure we can wake the
> PHY at the necessary times rather than permanently disabling it.
> 
> Signed-off-by: Darren Hart 
> Cc: "David S. Miller" 
> Cc: "H. Peter Anvin" 
> Cc: Peter Waskiewicz 
> Cc: Andy Shevchenko 
> Cc: net...@vger.kernel.org
> Cc:  # 3.8.x: 5829e9b mfd: lpc_sch: Accomodate partial
> Cc:  # 3.8.x: 3cbf182 gpio-sch: Allow for more than 8
> Cc:  # 3.8.x: 91bbe92: PCI: Add CircuitCo vendor ID
> Cc:  # 3.8.x: bd79680: pch_gbe: remove inline keyword
> Cc:  # 3.8.x: 453ca93: pch_gbe: convert pr_* to
> Cc:  # 3.8.x: 29cc436: pch_gbe: use managed functions
> Cc:  # 3.8.x
> Cc:  # 3.10.x: 91bbe92: PCI: Add CircuitCo vendor ID
> Cc:  # 3.10.x: bd79680: pch_gbe: remove inline keyword
> Cc:  # 3.10.x: 453ca93: pch_gbe: convert pr_* to
> Cc:  # 3.10.x: 29cc436: pch_gbe: use managed functions
> Cc:  # 3.10.x
> Signed-off-by: Darren Hart 
> ---
>  drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h| 15 
>  .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 48 +++
>  .../net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c| 98 ++
>  .../net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.h|  2 +
>  4 files changed, 163 insertions(+)

This is _far_ more than just a simple "add a new device id" for a stable
kernel update.   Please go read Documentation/stable_kernel_rules.txt
again for why there's no way I can take this type of thing.

You know better than this.

greg k-h


Re: [PATCH] misc: add driver for Renesas R-Car Gyro-ADC/speed-pulse interfaces

2013-07-12 Thread Sergei Shtylyov

Hello.

   Can't get to sleep, sigh...

On 07/13/2013 04:57 AM, Greg KH wrote:


>> Add the driver for Gyro-ADC/speed-pulse interfaces found in Renesas R-Car SoCs.
>> Though  being two separate devices, they have to be driven together because of
>> the shared start/stop register (located in Gyro-ADC still). At this time, only
>> speed-pulse interface is fully supported, the Gyro-ADC is just initialized and
>> started/stopped synchronously with the speed-pulse interface.  A user interface
>> is implemented via several sysfs files which allow to read and reset the speed-
>> pulse interface's registers.

> If you modify/create/remove sysfs files, you also have to document them
> in Documentation/ABI/ which is missing from this patch.

   I've looked there and didn't find the documentation for my closest
model driver, drivers/misc/ti_dac7512.c (or for many other drivers), so
I thought I too can do without it.

> Your sysfs files are also being created in a "racy" way, i.e. after
> userspace is told about the device, please fix that as well.

   Not sure I understand you. Could you elaborate?

> And are you sure you want to control this through sysfs?  There's no
> other better user/kernel apis for it?

   I found none, besides ioctl(), as the device driven is rather
unique. But I thought that sysfs is "ioctl() today", so I went with it...

> thanks,
>
> greg k-h


WBR, Sergei



Re: [ 00/19] 3.10.1-stable review

2013-07-12 Thread Jochen Striepe
Hello,

On Fri, Jul 12, 2013 at 04:28:20PM -0400, Steven Rostedt wrote:
> I would suspect that machines that allow unprivileged users would be
> running distro kernels, and not the latest release from Linus, and thus
> even a bug that "can allow an unprivileged user to crash the kernel" may
> still be able to sit around for a month before being submitted.
> 
> This wouldn't be the case if the bug was in older kernels that are being
> used.

On the one hand, you seem to want users with any kind of production
systems to use distro kernels. On the other hand, developers want
a broad testing base, with vanilla kernels (or better, rc) as early
as possible. You cannot get both at the same time, some kinds of bugs
just appear on production systems.

Users expect vanilla .0 releases usable as production systems, to
be updated (meaning, no new features, just stabilizing) with the
corresponding -stable series.

Just my 2p,
Jochen.


Re: [PATCH 3/3] pch_gbe: Add MinnowBoard support

2013-07-12 Thread Joe Perches
On Fri, 2013-07-12 at 17:58 -0700, Darren Hart wrote:
> The MinnowBoard uses an AR803x PHY with the PCH GBE which requires
> special handling. Use the MinnowBoard PCI Subsystem ID to detect this
> and add a pci_device_id.driver_data structure and functions to handle
> platform setup.

trivial comments only:

Please use scripts/checkpatch.pl

[]

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c 
b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c[]
[]
> +static int pch_gbe_minnow_platform_init(struct pci_dev *pdev)
[]
> + if (ret){

Missing space before brace

> diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c 
> b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c
[]
> +int pch_gbe_phy_tx_clk_delay(struct pch_gbe_hw *hw)
[]
> + case PHY_AR803X_ID:
> + netdev_dbg(adapter->netdev,
> +"Configuring AR803X PHY for 2ns TX clock delay\n"); 
[]
> + netdev_err(adapter->netdev,
> +"Unknown PHY (%x), could not set TX clock delay.\n",
> +hw->phy.id);
[]
> + netdev_err(adapter->netdev,
> +"Could not configure tx clock delay for PHY.\n");
[]
> +int pch_gbe_phy_disable_hibernate(struct pch_gbe_hw *hw)
[]
> + case PHY_AR803X_ID:
> + netdev_dbg(adapter->netdev,
> +"Disabling hibernation for AR803X PHY\n");

It'd be nice if no period before newline were used
everywhere.

> + netdev_err(adapter->netdev,
> +"Unknown PHY (%x), could not disable hibernation\n",
> +hw->phy.id);
[]
> + netdev_err(adapter->netdev,
> +"Could not disable PHY hibernation.\n");




[PATCH 0/3] MinnowBoard Support V2 (Serial and Ethernet)

2013-07-12 Thread Darren Hart
Use the DMI interface to detect the board for UART_CLOCK selection
per Greg K-H.

Resend PCH_GBE_PHY_REGS_LEN define cleanup.

Rewrite of PCH_GBE MinnowBoard support to be completely independent from any
platform or board files. It requests the GPIO line in the pch_gbe driver and
minimizes the pch_gbe_privdata (PCI driver_data) structure.

Tested on 3.8, 3.10, and current master. Patches to lpc_sch and gpio_sch
required for 3.8, included in stable Cc lines.



[PATCH 2/3] pch_gbe: Use PCH_GBE_PHY_REGS_LEN instead of 32

2013-07-12 Thread Darren Hart
Avoid using magic numbers when we have perfectly good defines just lying
around.

Signed-off-by: Darren Hart 
Cc: "David S. Miller" 
Cc: "H. Peter Anvin" 
Cc: Peter Waskiewicz 
Cc: Andy Shevchenko 
Cc: net...@vger.kernel.org
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c 
b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index ab1039a..749ddd9 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -682,7 +682,7 @@ static int pch_gbe_init_phy(struct pch_gbe_adapter *adapter)
}
adapter->hw.phy.addr = adapter->mii.phy_id;
netdev_dbg(netdev, "phy_addr = %d\n", adapter->mii.phy_id);
-   if (addr == 32)
+   if (addr == PCH_GBE_PHY_REGS_LEN)
return -EAGAIN;
/* Selected the phy and isolate the rest */
for (addr = 0; addr < PCH_GBE_PHY_REGS_LEN; addr++) {
-- 
1.8.3.1



[PATCH 1/3] pch_uart: Use DMI interface for board detection

2013-07-12 Thread Darren Hart
Use the DMI interface rather than manually matching DMI strings.

Signed-off-by: Darren Hart 
Cc: Michael Brunner 
Cc: Greg Kroah-Hartman 
---
 drivers/tty/serial/pch_uart.c | 71 +--
 1 file changed, 49 insertions(+), 22 deletions(-)

diff --git a/drivers/tty/serial/pch_uart.c b/drivers/tty/serial/pch_uart.c
index 572d481..271cc73 100644
--- a/drivers/tty/serial/pch_uart.c
+++ b/drivers/tty/serial/pch_uart.c
@@ -373,35 +373,62 @@ static const struct file_operations port_regs_ops = {
 };
 #endif /* CONFIG_DEBUG_FS */
 
+static struct dmi_system_id __initdata pch_uart_dmi_table[] = {
+   {
+   .ident = "CM-iTC",
+   {
+   DMI_MATCH(DMI_BOARD_NAME, "CM-iTC"),
+   },
+   (void *)CMITC_UARTCLK,
+   },
+   {
+   .ident = "FRI2",
+   {
+   DMI_MATCH(DMI_BIOS_VERSION, "FRI2"),
+   },
+   (void *)FRI2_64_UARTCLK,
+   },
+   {
+   .ident = "Fish River Island II",
+   {
+   DMI_MATCH(DMI_PRODUCT_NAME, "Fish River Island II"),
+   },
+   (void *)FRI2_48_UARTCLK,
+   },
+   {
+   .ident = "COMe-mTT",
+   {
+   DMI_MATCH(DMI_BOARD_NAME, "COMe-mTT"),
+   },
+   (void *)NTC1_UARTCLK,
+   },
+   {
+   .ident = "nanoETXexpress-TT",
+   {
+   DMI_MATCH(DMI_BOARD_NAME, "nanoETXexpress-TT"),
+   },
+   (void *)NTC1_UARTCLK,
+   },
+   {
+   .ident = "MinnowBoard",
+   {
+   DMI_MATCH(DMI_BOARD_NAME, "MinnowBoard"),
+   },
+   (void *)MINNOW_UARTCLK,
+   },
+};
+
 /* Return UART clock, checking for board specific clocks. */
 static int pch_uart_get_uartclk(void)
 {
-   const char *cmp;
+   const struct dmi_system_id *d;
 
if (user_uartclk)
return user_uartclk;
 
-   cmp = dmi_get_system_info(DMI_BOARD_NAME);
-   if (cmp && strstr(cmp, "CM-iTC"))
-   return CMITC_UARTCLK;
-
-   cmp = dmi_get_system_info(DMI_BIOS_VERSION);
-   if (cmp && strnstr(cmp, "FRI2", 4))
-   return FRI2_64_UARTCLK;
-
-   cmp = dmi_get_system_info(DMI_PRODUCT_NAME);
-   if (cmp && strstr(cmp, "Fish River Island II"))
-   return FRI2_48_UARTCLK;
-
-   /* Kontron COMe-mTT10 (nanoETXexpress-TT) */
-   cmp = dmi_get_system_info(DMI_BOARD_NAME);
-   if (cmp && (strstr(cmp, "COMe-mTT") ||
-   strstr(cmp, "nanoETXexpress-TT")))
-   return NTC1_UARTCLK;
-
-   cmp = dmi_get_system_info(DMI_BOARD_NAME);
-   if (cmp && strstr(cmp, "MinnowBoard"))
-   return MINNOW_UARTCLK;
+   d = dmi_first_match(pch_uart_dmi_table);
+   if (d)
+   return (int)d->driver_data;
 
return DEFAULT_UARTCLK;
 }
-- 
1.8.3.1
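
For readers without a kernel tree at hand, the table-driven lookup that replaces the strstr() chain can be sketched in plain userspace C. This is only an illustration under stated assumptions: struct board_info, struct board_match, board_first_match() and board_uartclk() are hypothetical stand-ins for the DMI data, struct dmi_system_id, dmi_first_match() and pch_uart_get_uartclk(), and the clock values are placeholders rather than the driver's constants. The sketch ends its table with an empty sentinel entry, which is how the walk knows where to stop.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-ins for the DMI fields the driver inspects. */
enum board_field { BOARD_NAME, BIOS_VERSION, PRODUCT_NAME, FIELD_COUNT };

struct board_info {
	const char *field[FIELD_COUNT];	/* NULL when the firmware omits a field */
};

struct board_match {
	const char *ident;		/* label; NULL marks the end of the table */
	enum board_field field;		/* which field to inspect */
	const char *substr;		/* substring that must appear in it */
	unsigned int uartclk;		/* per-board payload (placeholder values) */
};

#define DEMO_DEFAULT_UARTCLK 1843200u

static const struct board_match board_table[] = {
	{ "CM-iTC", BOARD_NAME, "CM-iTC", 192000000u },
	{ "FRI2", BIOS_VERSION, "FRI2", 64000000u },
	{ "Fish River Island II", PRODUCT_NAME, "Fish River Island II", 48000000u },
	{ "MinnowBoard", BOARD_NAME, "MinnowBoard", 50000000u },
	{ NULL, BOARD_NAME, NULL, 0 },	/* sentinel entry terminates the walk */
};

/* Return the first table entry whose substring occurs in the chosen field. */
static const struct board_match *board_first_match(const struct board_info *info)
{
	const struct board_match *m;

	for (m = board_table; m->ident; m++) {
		const char *val = info->field[m->field];

		if (val && strstr(val, m->substr))
			return m;
	}
	return NULL;
}

/* A user-supplied clock wins, then a table match, then the default. */
unsigned int board_uartclk(const struct board_info *info, unsigned int user_clk)
{
	const struct board_match *m;

	if (user_clk)
		return user_clk;
	m = board_first_match(info);
	return m ? m->uartclk : DEMO_DEFAULT_UARTCLK;
}
```

Adding a board then means adding one table row instead of another if block, and table order decides which entry wins when more than one could match.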



[PATCH 3/3] pch_gbe: Add MinnowBoard support

2013-07-12 Thread Darren Hart
The MinnowBoard uses an AR803x PHY with the PCH GBE which requires
special handling. Use the MinnowBoard PCI Subsystem ID to detect this
and add a pci_device_id.driver_data structure and functions to handle
platform setup.

The AR803x does not implement the RGMII 2ns TX clock delay in the trace
routing nor via strapping. Add a detection method for the board and the
PHY and enable the TX clock delay via the registers.

This PHY will hibernate after 10 seconds without link. Ensure the PHY is
awake for probe and then disable hibernation. A future improvement would
be to convert pch_gbe to use PHYLIB and make sure we can wake the
PHY at the necessary times rather than permanently disabling hibernation.
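
The driver_data approach described above can be sketched outside the kernel as follows. This is a minimal illustration, not the driver code: struct demo_dev, struct demo_id, struct demo_privdata and demo_probe() are hypothetical names standing in for struct pci_dev, struct pci_device_id, struct pch_gbe_privdata and pch_gbe_probe(). One assumption to note: the sketch propagates a platform_init() failure to the caller, which is a design choice of the sketch; the probe hunk in this patch calls the hook without checking its return value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device, standing in for struct pci_dev. */
struct demo_dev {
	int id;
	int init_calls;		/* counts platform_init() invocations */
};

/* Per-board quirks, modelled on struct pch_gbe_privdata. */
struct demo_privdata {
	bool phy_tx_clk_delay;
	bool phy_disable_hibernate;
	int (*platform_init)(struct demo_dev *dev);
};

/* Hypothetical ID table entry carrying an optional quirk pointer. */
struct demo_id {
	int subsystem_id;
	const struct demo_privdata *driver_data;	/* NULL for generic boards */
};

static int demo_minnow_init(struct demo_dev *dev)
{
	dev->init_calls++;	/* a real hook would reset the PHY here */
	return 0;
}

static const struct demo_privdata demo_minnow_privdata = {
	.phy_tx_clk_delay = true,
	.phy_disable_hibernate = true,
	.platform_init = demo_minnow_init,
};

/*
 * Run the optional platform hook, then report which quirks apply.
 * Every access to pdata is guarded because generic boards pass NULL.
 */
int demo_probe(struct demo_dev *dev, const struct demo_id *id,
	       bool *hibernate_disabled)
{
	const struct demo_privdata *pdata = id->driver_data;
	int ret;

	*hibernate_disabled = false;

	if (pdata && pdata->platform_init) {
		ret = pdata->platform_init(dev);
		if (ret)
			return ret;	/* sketch-only choice: fail the probe */
	}

	if (pdata && pdata->phy_disable_hibernate)
		*hibernate_disabled = true;

	return 0;
}
```

Boards without an entry simply carry a NULL driver_data and take the generic path, so existing hardware is unaffected by the new quirks.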

Signed-off-by: Darren Hart 
Cc: "David S. Miller" 
Cc: "H. Peter Anvin" 
Cc: Peter Waskiewicz 
Cc: Andy Shevchenko 
Cc: net...@vger.kernel.org
Cc:  # 3.8.x: 5829e9b mfd: lpc_sch: Accomodate partial
Cc:  # 3.8.x: 3cbf182 gpio-sch: Allow for more than 8
Cc:  # 3.8.x: 91bbe92: PCI: Add CircuitCo vendor ID
Cc:  # 3.8.x: bd79680: pch_gbe: remove inline keyword
Cc:  # 3.8.x: 453ca93: pch_gbe: convert pr_* to
Cc:  # 3.8.x: 29cc436: pch_gbe: use managed functions
Cc:  # 3.8.x
Cc:  # 3.10.x: 91bbe92: PCI: Add CircuitCo vendor ID
Cc:  # 3.10.x: bd79680: pch_gbe: remove inline keyword
Cc:  # 3.10.x: 453ca93: pch_gbe: convert pr_* to
Cc:  # 3.10.x: 29cc436: pch_gbe: use managed functions
Cc:  # 3.10.x
Signed-off-by: Darren Hart 
---
 drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h| 15 
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c   | 48 +++
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.c| 98 ++
 .../net/ethernet/oki-semi/pch_gbe/pch_gbe_phy.h|  2 +
 4 files changed, 163 insertions(+)

diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h 
b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
index 7779036..d7d71cd 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe.h
@@ -582,6 +582,19 @@ struct pch_gbe_hw_stats {
 };
 
 /**
+ * struct pch_gbe_privdata - PCI Device ID driver data
+ * @phy_tx_clk_delay:  Bool, configure the PHY TX delay in software
+ * @phy_disable_hibernate: Bool, disable PHY hibernation
+ * @platform_init: Platform initialization callback, called from
+ * probe, prior to PHY initialization.
+ */
+struct pch_gbe_privdata {
+   bool phy_tx_clk_delay;
+   bool phy_disable_hibernate;
+   int(*platform_init)(struct pci_dev *pdev);
+};
+
+/**
  * struct pch_gbe_adapter - board specific private data structure
  * @stats_lock:Spinlock structure for status
  * @ethtool_lock:  Spinlock structure for ethtool
@@ -604,6 +617,7 @@ struct pch_gbe_hw_stats {
  * @rx_buffer_len: Receive buffer length
  * @tx_queue_len:  Transmit queue length
  * @have_msi:  PCI MSI mode flag
+ * @pdata: PCI Device ID driver_data
  */
 
 struct pch_gbe_adapter {
@@ -631,6 +645,7 @@ struct pch_gbe_adapter {
int hwts_tx_en;
int hwts_rx_en;
struct pci_dev *ptp_pdev;
+   struct pch_gbe_privdata *pdata;
 };
 
 #define pch_gbe_hw_to_adapter(hw)  container_of(hw, struct 
pch_gbe_adapter, hw)
diff --git a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c 
b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
index 749ddd9..da83657 100644
--- a/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
+++ b/drivers/net/ethernet/oki-semi/pch_gbe/pch_gbe_main.c
@@ -23,6 +23,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #define DRV_VERSION "1.01"
 const char pch_driver_version[] = DRV_VERSION;
@@ -111,6 +112,8 @@ const char pch_driver_version[] = DRV_VERSION;
 #define PTP_L4_MULTICAST_SA "01:00:5e:00:01:81"
 #define PTP_L2_MULTICAST_SA "01:1b:19:00:00:00"
 
+#define MINNOW_PHY_RESET_GPIO  13
+
 static unsigned int copybreak __read_mostly = PCH_GBE_COPYBREAK_DEFAULT;
 
 static int pch_gbe_mdio_read(struct net_device *netdev, int addr, int reg);
@@ -2635,6 +2638,9 @@ static int pch_gbe_probe(struct pci_dev *pdev,
adapter->pdev = pdev;
adapter->hw.back = adapter;
adapter->hw.reg = pcim_iomap_table(pdev)[PCH_GBE_PCI_BAR];
+   adapter->pdata = (struct pch_gbe_privdata *)pci_id->driver_data;
+   if (adapter->pdata && adapter->pdata->platform_init)
+   adapter->pdata->platform_init(pdev);
 
adapter->ptp_pdev = pci_get_bus_and_slot(adapter->pdev->bus->number,
   PCI_DEVFN(12, 4));
@@ -2710,6 +2716,10 @@ static int pch_gbe_probe(struct pci_dev *pdev,
 
dev_dbg(&pdev->dev, "PCH Network Connection\n");
 
+   /* Disable hibernation on certain platforms */
+   if (adapter->pdata && adapter->pdata->phy_disable_hibernate)
+   pch_gbe_phy_disable_hibernate(&adapter->hw);
+
device_set_wakeup_enable(&pdev->dev, 1);
return 0;
 
@@ -2720,9 +2730,47 @@ err_free_netdev:
return ret;
 }
 
+/* The 

Re: [PATCH] misc: add driver for Renesas R-Car Gyro-ADC/speed-pulse interfaces

2013-07-12 Thread Greg KH
On Sat, Jul 13, 2013 at 03:51:42AM +0400, Sergei Shtylyov wrote:
> Add the driver for Gyro-ADC/speed-pulse interfaces found in Renesas R-Car SoCs.
> Though being two separate devices, they have to be driven together because of
> the shared start/stop register (located in Gyro-ADC still). At this time, only
> speed-pulse interface is fully supported, the Gyro-ADC is just initialized and
> started/stopped synchronously with the speed-pulse interface. A user interface
> is implemented via several sysfs files which allow to read and reset the
> speed-pulse interface's registers.

If you modify/create/remove sysfs files, you also have to document them
in Documentation/ABI/ which is missing from this patch.

Your sysfs files are also being created in a "racy" way, i.e. after
userspace is told about the device, please fix that as well.

And are you sure you want to control this through sysfs?  There's no
other better user/kernel apis for it?

thanks,

greg k-h


Re: [PATCH] mm/hugetlb: per-vma instantiation mutexes

2013-07-12 Thread Hugh Dickins
Adding the essential David Gibson to the Cc list.

On Fri, 12 Jul 2013, Davidlohr Bueso wrote:

> The hugetlb_instantiation_mutex serializes hugepage allocation and
> instantiation in the page directory entry. It was found that this mutex can
> become quite contended during the early phases of large databases which make
> use of huge pages - for instance startup and initial runs. One clear example
> is a 1.5Gb Oracle database, where lockstat reports that this mutex can be one
> of the top 5 most contended locks in the kernel during the first few minutes:
> 
> hugetlb_instantiation_mutex:  10678 10678
>  ---
>  hugetlb_instantiation_mutex   10678  [] hugetlb_fault+0x9e/0x340
>  ---
>  hugetlb_instantiation_mutex   10678  [] hugetlb_fault+0x9e/0x340
> 
> contentions:  10678
> acquisitions: 99476
> waittime-total: 76888911.01 us
> 
> Instead of serializing each hugetlb fault, we can deal with concurrent faults
> for pages in different vmas. The per-vma mutex is initialized when creating a
> new vma. So, back to the example above, we now get much less contention:
> 
> &vma->hugetlb_instantiation_mutex:  1 1
>  ---
>  &vma->hugetlb_instantiation_mutex   1  [] hugetlb_fault+0xa6/0x350
>  ---
>  &vma->hugetlb_instantiation_mutex   1  [] hugetlb_fault+0xa6/0x350
> 
> contentions:  1
> acquisitions:108092
> waittime-total:  621.24 us
> 
> Signed-off-by: Davidlohr Bueso 

I agree this is a problem worth solving,
but I doubt this patch is the right solution.

> ---
>  include/linux/mm_types.h |  3 +++
>  mm/hugetlb.c | 12 +---
>  mm/mmap.c|  3 +++
>  3 files changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
> index fb425aa..b45fd87 100644
> --- a/include/linux/mm_types.h
> +++ b/include/linux/mm_types.h
> @@ -289,6 +289,9 @@ struct vm_area_struct {
>  #ifdef CONFIG_NUMA
>   struct mempolicy *vm_policy;/* NUMA policy for the VMA */
>  #endif
> +#ifdef CONFIG_HUGETLB_PAGE
> + struct mutex hugetlb_instantiation_mutex;
> +#endif
>  };

Bloating every vm_area_struct with a rarely useful mutex:
I'm sure you can construct cases where per-vma mutex would win over
per-mm mutex, but they will have to be very common to justify the bloat.

>  
>  struct core_thread {
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 83aff0a..12e665b 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -137,12 +137,12 @@ static inline struct hugepage_subpool 
> *subpool_vma(struct vm_area_struct *vma)
>   * The region data structures are protected by a combination of the mmap_sem
>   * and the hugetlb_instantion_mutex.  To access or modify a region the caller
>   * must either hold the mmap_sem for write, or the mmap_sem for read and
> - * the hugetlb_instantiation mutex:
> + * the vma's hugetlb_instantiation mutex:

Reading the existing comment, this change looks very suspicious to me.
A per-vma mutex is just not going to provide the necessary exclusion, is
it?  (But I recall next to nothing about these regions and reservations.)

>   *
>   *   down_write(&mm->mmap_sem);
>   * or
>   *   down_read(&mm->mmap_sem);
> - *   mutex_lock(&hugetlb_instantiation_mutex);
> + *   mutex_lock(&vma->hugetlb_instantiation_mutex);
>   */
>  struct file_region {
>   struct list_head link;
> @@ -2547,7 +2547,7 @@ static int unmap_ref_private(struct mm_struct *mm, 
> struct vm_area_struct *vma,
>  
>  /*
>   * Hugetlb_cow() should be called with page lock of the original hugepage 
> held.
> - * Called with hugetlb_instantiation_mutex held and pte_page locked so we
> + * Called with the vma's hugetlb_instantiation_mutex held and pte_page 
> locked so we
>   * cannot race with other handlers or page migration.
>   * Keep the pte_same checks anyway to make transition from the mutex easier.
>   */
> @@ -2847,7 +2847,6 @@ int hugetlb_fault(struct mm_struct *mm, struct 
> vm_area_struct *vma,
>   int ret;
>   struct page *page = NULL;
>   struct page *pagecache_page = NULL;
> - static DEFINE_MUTEX(hugetlb_instantiation_mutex);
>   struct hstate *h = hstate_vma(vma);
>  
>   address &= huge_page_mask(h);
> @@ -2872,7 +2871,7 @@ int hugetlb_fault(struct mm_struct *mm, struct 
> vm_area_struct *vma,
>* get spurious allocation failures if two CPUs race to instantiate
>* the same page in the page cache.
>*/
> - mutex_lock(&hugetlb_instantiation_mutex);
> + mutex_lock(&vma->hugetlb_instantiation_mutex);
>   entry = huge_ptep_get(ptep);
>   if (huge_pte_none(entry)) {
>   ret = hugetlb_no_page(mm, vma, address, ptep, flags);
> @@ -2943,8 +2942,7 @@ out_page_table_lock:
>   put_page(page);
>  
>  out_mutex:
> - mutex_unlock(&hugetlb_instantiation_mutex);
> -
> + 

[RFC][PATCH 4/4] rcu: Have the RCU tracepoints use the tracepoint_string infrastructure

2013-07-12 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

Currently, RCU tracepoints save only a pointer to strings in the
ring buffer. When displayed via the /sys/kernel/debug/tracing/trace file
they are referenced like the printf "%s" that looks at the address
in the ring buffer and prints out the string it points to. This requires
that the strings are constant and persistent in the kernel.

The problem with this is for tools like trace-cmd and perf that read the
binary data from the buffers but have no access to the kernel memory to
find out what string is represented by the address in the buffer.

By using the tracepoint_string infrastructure, the RCU tracepoint strings
can be exported such that userspace tools can map the addresses to
the strings.

 # cat /sys/kernel/debug/tracing/printk_formats
0x81a4a0e8 : "rcu_preempt"
0x81a4a0f4 : "rcu_bh"
0x81a4a100 : "rcu_sched"
0x818437a0 : "cpuqs"
0x818437a6 : "rcu_sched"
0x818437a0 : "cpuqs"
0x818437b0 : "rcu_bh"
0x818437b7 : "Start context switch"
0x818437cc : "End context switch"
0x818437a0 : "cpuqs"
[...]

Now userspace tools can display:

 rcu_utilization:  Start context switch
 rcu_dyntick:  Start 1 0
 rcu_utilization:  End context switch
 rcu_batch_start:  rcu_preempt CBs=0/5 bl=10
 rcu_dyntick:  End 0 140
 rcu_invoke_callback:  rcu_preempt rhp=0x880071c0d600 func=proc_i_callback
 rcu_invoke_callback:  rcu_preempt rhp=0x880077b5b230 func=__d_free
 rcu_dyntick:  Start 140 0
 rcu_invoke_callback:  rcu_preempt rhp=0x880077563980 func=file_free_rcu
 rcu_batch_end:rcu_preempt CBs-invoked=3 idle=>c<>c<>c<>c<
 rcu_utilization:  End RCU core
 rcu_grace_period: rcu_preempt 9741 start
 rcu_dyntick:  Start 1 0
 rcu_dyntick:  End 0 140
 rcu_dyntick:  Start 140 0

Instead of:

 rcu_utilization:  81843110
 rcu_future_grace_period: 81842f1d 9939 9939 9940 0 0 3 81842f32
 rcu_batch_start:  81842f1d CBs=0/4 bl=10
 rcu_future_grace_period: 81842f1d 9939 9939 9940 0 0 3 81842f3c
 rcu_grace_period: 81842f1d 9939 81842f80
 rcu_invoke_callback:  81842f1d rhp=0x88007888aac0 
func=file_free_rcu
 rcu_grace_period: 81842f1d 9939 81842f95
 rcu_invoke_callback:  81842f1d rhp=0x88006aeb4600 
func=proc_i_callback
 rcu_future_grace_period: 81842f1d 9939 9939 9940 0 0 3 81842f32
 rcu_future_grace_period: 81842f1d 9939 9939 9940 0 0 3 81842f3c
 rcu_invoke_callback:  81842f1d rhp=0x880071cb9fc0 func=__d_free
 rcu_grace_period: 81842f1d 9939 81842f80
 rcu_invoke_callback:  81842f1d rhp=0x88007888ae80 
func=file_free_rcu
 rcu_batch_end:81842f1d CBs-invoked=4 idle=>c<>c<>c<>c<
 rcu_utilization:  8184311f

Signed-off-by: Steven Rostedt 
---
 kernel/rcutree.c|   82 +++
 kernel/rcutree_plugin.h |   32 +-
 2 files changed, 64 insertions(+), 50 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index 89a8b10..32d403a 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -53,18 +53,31 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "rcutree.h"
 #include 
 
 #include "rcu.h"
 
+#define TPS(x) tracepoint_string(x)
+
 /* Data structures. */
 
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
 
+/*
+ * In order to export the rcu_state name to the tracing tools, it
+ * needs to be added in the __tracepoint_string section.
+ * This requires defining a separate variable tp__varname
+ * that points to the string being used, and this will allow
+ * the tracing userspace tools to be able to decipher the string
+ * address to the matching string.
+ */
 #define RCU_STATE_INITIALIZER(sname, sabbr, cr) \
+static char sname##_varname[] = #sname; \
+static const char *tp_##sname##_varname __used __tracepoint_string = 
sname##_varname; \
 struct rcu_state sname##_state = { \
.level = { &sname##_state.node[0] }, \
.call = cr, \
@@ -76,7 +89,7 @@ struct rcu_state sname##_state = { \
.orphan_donetail = &sname##_state.orphan_donelist, \
.barrier_mutex = __MUTEX_INITIALIZER(sname##_state.barrier_mutex), \
.onoff_mutex = __MUTEX_INITIALIZER(sname##_state.onoff_mutex), \
-   .name = #sname, \
+   .name = sname##_varname, \
.abbr = sabbr, \
 }; \
 DEFINE_PER_CPU(struct rcu_data, sname##_data)
@@ -176,7 +189,7 @@ void rcu_sched_qs(int cpu)
struct rcu_data *rdp = &per_cpu(rcu_sched_data, cpu);
 
if (rdp->passed_quiesce == 0)
-   trace_rcu_grace_period("rcu_sched", rdp->gpnum, "cpuqs");
+   trace_rcu_grace_period(TPS("rcu_sched"), rdp->gpnum, 
TPS("cpuqs"));

[RFC][PATCH 3/4] tracing: Add __tracepoint_string() to export string pointers

2013-07-12 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

There are several tracepoints (mostly in RCU) that reference a string
pointer and use the print format of "%s" to display the string that
exists in the kernel, instead of copying the actual string to the
ring buffer (saves time and ring buffer space).

But this has an issue with userspace tools that read the binary buffers:
they have the address of the string but no access to what the string
itself is. The end result is just output that looks like:

 rcu_dyntick:  818adeaa 1 0
 rcu_dyntick:  818adeb5 0 140
 rcu_dyntick:  818adeb5 0 140
 rcu_utilization:  8184333b
 rcu_utilization:  8184333b

The above is pretty useless when read by the userspace tools. Ideally
we would want something that looks like this:

 rcu_dyntick:  Start 1 0
 rcu_dyntick:  End 0 140
 rcu_dyntick:  Start 140 0
 rcu_callback: rcu_preempt rhp=0x880037aff710 func=put_cred_rcu 0/4
 rcu_callback: rcu_preempt rhp=0x880078961980 func=file_free_rcu 0/5
 rcu_dyntick:  End 0 1

The trace_printk() which also only stores the address of the string
format instead of recording the string into the buffer itself, exports
the mapping of kernel addresses to format strings via the printk_formats
file in the debugfs tracing directory.

The tracepoint strings can use this same method and output the format
to the same file and the userspace tools will be able to decipher
the address without any modification.

The tracepoint strings need their own section to save the strings because
the trace_printk section will cause the trace_printk() buffers to be
allocated if anything exists within the section. As trace_printk() is only
used for debugging and should never remain in the kernel, we cannot use
the trace_printk section.

Add a new tracepoint_str section that will also be examined by the output
of the printk_formats file.
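
The section trick can be tried in userspace with the sketch below. It assumes GCC or Clang (for statement expressions and the section attribute) and an ELF linker such as GNU ld, which provides __start_<name> and __stop_<name> symbols for any section whose name is a valid C identifier. All demo_* names are hypothetical and only mirror the idea of tracepoint_string() and the printk_formats output; none of this is the kernel implementation.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Put the pointer variable into a dedicated section. "used" keeps the
 * variable even when the optimizer could fold it into a constant.
 */
#define demo_tp_section __attribute__((used, section("demo_tp_str")))

/* Evaluates to the string's address and records that address in the section. */
#define demo_tracepoint_string(str)					\
	({								\
		static const char *demo_tp_ptr demo_tp_section = str;	\
		demo_tp_ptr;						\
	})

/* Section bounds, provided by the linker. */
extern const char *__start_demo_tp_str[];
extern const char *__stop_demo_tp_str[];

/* Two example users, standing in for tracepoint call sites. */
const char *demo_event_name(void)
{
	return demo_tracepoint_string("cpuqs");
}

const char *demo_flavor_name(void)
{
	return demo_tracepoint_string("rcu_sched");
}

/* Map a recorded address back to its string, or NULL if it is unknown. */
const char *demo_tp_lookup(const void *addr)
{
	const char **p;

	for (p = __start_demo_tp_str; p < __stop_demo_tp_str; p++)
		if ((const void *)*p == addr)
			return *p;
	return NULL;
}

/* Emit an address-to-string table, one "addr : string" line per entry. */
void demo_dump_formats(FILE *out)
{
	const char **p;

	for (p = __start_demo_tp_str; p < __stop_demo_tp_str; p++)
		fprintf(out, "%p : \"%s\"\n", (const void *)*p, *p);
}
```

A reader of the binary trace only sees the address; demo_tp_lookup() plays the role of the userspace tool that resolves it through the exported table.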

Cc: Paul E. McKenney 
Signed-off-by: Steven Rostedt 
---
 include/asm-generic/vmlinux.lds.h |7 ++-
 include/linux/ftrace_event.h  |7 +++
 kernel/trace/trace.h  |3 +++
 kernel/trace/trace_printk.c   |7 +++
 4 files changed, 23 insertions(+), 1 deletion(-)

diff --git a/include/asm-generic/vmlinux.lds.h 
b/include/asm-generic/vmlinux.lds.h
index 69732d2..83e2c31 100644
--- a/include/asm-generic/vmlinux.lds.h
+++ b/include/asm-generic/vmlinux.lds.h
@@ -122,8 +122,12 @@
 #define TRACE_PRINTKS() VMLINUX_SYMBOL(__start___trace_bprintk_fmt) = .;  \
 *(__trace_printk_fmt) /* Trace_printk fmt' pointer */ \
 VMLINUX_SYMBOL(__stop___trace_bprintk_fmt) = .;
+#define TRACEPOINT_STR() VMLINUX_SYMBOL(__start___tracepoint_str) = .; \
+*(__tracepoint_str) /* Trace_printk fmt' pointer */ \
+VMLINUX_SYMBOL(__stop___tracepoint_str) = .;
 #else
 #define TRACE_PRINTKS()
+#define TRACEPOINT_STR()
 #endif
 
 #ifdef CONFIG_FTRACE_SYSCALLS
@@ -190,7 +194,8 @@
VMLINUX_SYMBOL(__stop___verbose) = .;   \
LIKELY_PROFILE()\
BRANCH_PROFILE()\
-   TRACE_PRINTKS()
+   TRACE_PRINTKS() \
+   TRACEPOINT_STR()
 
 /*
  * Data section helpers
diff --git a/include/linux/ftrace_event.h b/include/linux/ftrace_event.h
index 4372658..cead2e9 100644
--- a/include/linux/ftrace_event.h
+++ b/include/linux/ftrace_event.h
@@ -357,6 +357,13 @@ do {   
\
__trace_printk(ip, fmt, ##args);\
 } while (0)
 
+#define __tracepoint_string	__attribute__((section("__tracepoint_str")))
+#define tracepoint_string(str) \
+   ({  \
+   static const char *___tp_str __tracepoint_string = str; \
+   ___tp_str;  \
+   })
+
 #ifdef CONFIG_PERF_EVENTS
 struct perf_event;
 
diff --git a/kernel/trace/trace.h b/kernel/trace/trace.h
index 4a4f6e1..ba321f1 100644
--- a/kernel/trace/trace.h
+++ b/kernel/trace/trace.h
@@ -1022,6 +1022,9 @@ extern struct list_head ftrace_events;
 extern const char *__start___trace_bprintk_fmt[];
 extern const char *__stop___trace_bprintk_fmt[];
 
+extern const char *__start___tracepoint_str[];
+extern const char *__stop___tracepoint_str[];
+
 void trace_printk_init_buffers(void);
 void trace_printk_start_comm(void);
 int trace_keep_overwrite(struct tracer *tracer, u32 mask, int set);
diff --git a/kernel/trace/trace_printk.c b/kernel/trace/trace_printk.c
index a9077c1..fa8d070 100644
--- a/kernel/trace/trace_printk.c
+++ b/kernel/trace/trace_printk.c
@@ 

[RFC][PATCH 0/4] rcu/tracing: Export the RCU tracepoint string pointers

2013-07-12 Thread Steven Rostedt
This has been on my todo list for a while. I've been promising Paul
to get a way to have his tracepoints work for user tools, and I finally
got around to it :-)

As his tracepoints just save a pointer to a string in the ring buffer
and have the output use that pointer with a "%s" printf field, which
works great in the kernel and is fast and efficient, it sucks for tools
that read the binary data from the kernel but have no access to the strings
that those pointers point to.

The trace_printk() had a similar problem when Frederic Weisbecker optimized
it to just save the pointer to the format string along with the parameters,
but that was solved by exporting a table of pointers with the strings
in the /sys/kernel/debug/tracing/printk_formats file.

I've done the same thing for tracepoints using a separate section but the
same file. I needed a separate section as the trace_printk() section is
used to know if a trace_printk() was added to the kernel (they should
never be added to Linus's tree, unless it's for a strict debugging option).
If a trace_printk() is used, several temporary buffers are allocated, which
we don't want in normal use.

Now tracepoints get a section that has the pointers to the strings and
this gets added to the printk_formats file. Doing this means that none
of the tools (perf or trace-cmd) require any changes as they already
handle reading the printk_formats via the event_parse library.

-- Steve

Steven Rostedt (Red Hat) (4):
  rcu: Add const annotation to char * for RCU tracepoints and functions
  rcu: Simplify RCU_STATE_INITIALIZER() macro
  tracing: Add __tracepoint_string() to export string pointers
  rcu: Have the RCU tracepoints use the tracepoint_string infrastructure


 include/asm-generic/vmlinux.lds.h |7 ++-
 include/linux/ftrace_event.h  |7 +++
 include/linux/rcupdate.h  |4 +-
 include/trace/events/rcu.h|   82 +++---
 kernel/rcu.h  |2 +-
 kernel/rcupdate.c |2 +-
 kernel/rcutiny_plugin.h   |2 +-
 kernel/rcutorture.c   |   14 +++---
 kernel/rcutree.c  |  100 +
 kernel/rcutree.h  |2 +-
 kernel/rcutree_plugin.h   |   44 
 kernel/trace/trace.h  |3 ++
 kernel/trace/trace_printk.c   |7 +++
 13 files changed, 154 insertions(+), 122 deletions(-)


[RFC][PATCH 1/4] rcu: Add const annotation to char * for RCU tracepoints and functions

2013-07-12 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

All the RCU tracepoints and functions that reference char pointers do
so with just 'char *' even though they do not modify the contents of
the string itself. This will cause warnings if a const char * is used
in one of these functions.

The RCU tracepoints store the pointer to the string to refer back to them
when the trace output is displayed. As this can be minutes, hours or
even days later, those strings had better be constant.

This change also opens the door to allow the RCU tracepoint strings and
their addresses to be exported so that userspace tracing tools can
translate the contents of the pointers of the RCU tracepoints.

Signed-off-by: Steven Rostedt 
---
 include/linux/rcupdate.h   |4 +--
 include/trace/events/rcu.h |   82 ++--
 kernel/rcu.h   |2 +-
 kernel/rcupdate.c  |2 +-
 kernel/rcutiny_plugin.h|2 +-
 kernel/rcutorture.c|   14 
 kernel/rcutree.c   |4 +--
 kernel/rcutree.h   |2 +-
 kernel/rcutree_plugin.h|8 ++---
 9 files changed, 60 insertions(+), 60 deletions(-)

diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 4b14bdc..0c38abb 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -52,7 +52,7 @@ extern int rcutorture_runnable; /* for sysctl */
 #if defined(CONFIG_TREE_RCU) || defined(CONFIG_TREE_PREEMPT_RCU)
 extern void rcutorture_record_test_transition(void);
 extern void rcutorture_record_progress(unsigned long vernum);
-extern void do_trace_rcu_torture_read(char *rcutorturename,
+extern void do_trace_rcu_torture_read(const char *rcutorturename,
  struct rcu_head *rhp,
  unsigned long secs,
  unsigned long c_old,
@@ -65,7 +65,7 @@ static inline void rcutorture_record_progress(unsigned long 
vernum)
 {
 }
 #ifdef CONFIG_RCU_TRACE
-extern void do_trace_rcu_torture_read(char *rcutorturename,
+extern void do_trace_rcu_torture_read(const char *rcutorturename,
  struct rcu_head *rhp,
  unsigned long secs,
  unsigned long c_old,
diff --git a/include/trace/events/rcu.h b/include/trace/events/rcu.h
index 59ebcc8..ee2376c 100644
--- a/include/trace/events/rcu.h
+++ b/include/trace/events/rcu.h
@@ -19,12 +19,12 @@
  */
 TRACE_EVENT(rcu_utilization,
 
-   TP_PROTO(char *s),
+   TP_PROTO(const char *s),
 
TP_ARGS(s),
 
TP_STRUCT__entry(
-   __field(char *, s)
+   __field(const char *, s)
),
 
TP_fast_assign(
@@ -51,14 +51,14 @@ TRACE_EVENT(rcu_utilization,
  */
 TRACE_EVENT(rcu_grace_period,
 
-   TP_PROTO(char *rcuname, unsigned long gpnum, char *gpevent),
+   TP_PROTO(const char *rcuname, unsigned long gpnum, const char *gpevent),
 
TP_ARGS(rcuname, gpnum, gpevent),
 
TP_STRUCT__entry(
-   __field(char *, rcuname)
+   __field(const char *, rcuname)
__field(unsigned long, gpnum)
-   __field(char *, gpevent)
+   __field(const char *, gpevent)
),
 
TP_fast_assign(
@@ -89,21 +89,21 @@ TRACE_EVENT(rcu_grace_period,
  */
 TRACE_EVENT(rcu_future_grace_period,
 
-   TP_PROTO(char *rcuname, unsigned long gpnum, unsigned long completed,
+   TP_PROTO(const char *rcuname, unsigned long gpnum, unsigned long 
completed,
 unsigned long c, u8 level, int grplo, int grphi,
-char *gpevent),
+const char *gpevent),
 
TP_ARGS(rcuname, gpnum, completed, c, level, grplo, grphi, gpevent),
 
TP_STRUCT__entry(
-   __field(char *, rcuname)
+   __field(const char *, rcuname)
__field(unsigned long, gpnum)
__field(unsigned long, completed)
__field(unsigned long, c)
__field(u8, level)
__field(int, grplo)
__field(int, grphi)
-   __field(char *, gpevent)
+   __field(const char *, gpevent)
),
 
TP_fast_assign(
@@ -132,13 +132,13 @@ TRACE_EVENT(rcu_future_grace_period,
  */
 TRACE_EVENT(rcu_grace_period_init,
 
-   TP_PROTO(char *rcuname, unsigned long gpnum, u8 level,
+   TP_PROTO(const char *rcuname, unsigned long gpnum, u8 level,
 int grplo, int grphi, unsigned long qsmask),
 
TP_ARGS(rcuname, gpnum, level, grplo, grphi, qsmask),
 
TP_STRUCT__entry(
-   __field(char *, rcuname)
+   __field(const char *, rcuname)
__field(unsigned long, gpnum)
__field(u8, level)
__field(int, grplo)
@@ -168,12 +168,12 @@ TRACE_EVENT(rcu_grace_period_init,
  */
 TRACE_EVENT(rcu_preempt_task,
 
-   TP_PROTO(char *rcuname, int pid, 

[RFC][PATCH 2/4] rcu: Simplify RCU_STATE_INITIALIZER() macro

2013-07-12 Thread Steven Rostedt
From: "Steven Rostedt (Red Hat)" 

The RCU_STATE_INITIALIZER() macro is used only in the rcutree.c file
as well as the rcutree_plugin.h file. It is used as the initializer of
a variable of a similar name. A per_cpu variable is also created
with a similar name as well.

The uses of RCU_STATE_INITIALIZER() can be simplified to remove some
of the duplicated code. Currently the three users of this
macro have this format:

struct rcu_state rcu_sched_state =
RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched);
DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);

Notice that "rcu_sched" appears three times. This is the same with
the other two users. This can be condensed to just:

RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched);

by moving the rest into the macro itself.

This also opens the door to allow the RCU tracepoint strings and
their addresses to be exported so that userspace tracing tools can
translate the contents of the pointers of the RCU tracepoints.
The change will allow for helper code to be placed in the
RCU_STATE_INITIALIZER() macro to export the name that is used.
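
The shape of the simplification can be shown with a small standalone sketch. The demo_* types and DEMO_STATE_INITIALIZER() are hypothetical and far smaller than struct rcu_state; a plain global stands in for DEFINE_PER_CPU(). The point is only that token pasting (##) and stringification (#) let a single macro invocation define both objects and derive the name from one argument.

```c
#include <stddef.h>

/* Cut-down, hypothetical stand-ins for struct rcu_state and struct rcu_data. */
struct demo_state {
	const char *name;
	char abbr;
	int (*call)(int);
};

struct demo_data {
	int count;
};

/*
 * One invocation defines both <sname>_state and <sname>_data.
 * #sname turns the prefix into the name string, so it is written once.
 * The final definition has no semicolon; the caller supplies it.
 */
#define DEMO_STATE_INITIALIZER(sname, sabbr, cr)	\
struct demo_state sname##_state = {			\
	.name = #sname,					\
	.abbr = sabbr,					\
	.call = cr,					\
};							\
struct demo_data sname##_data

/* Hypothetical callback, standing in for call_rcu_sched(). */
static int demo_call_sched(int x)
{
	return x + 1;
}

/* Expands to definitions of demo_sched_state and demo_sched_data. */
DEMO_STATE_INITIALIZER(demo_sched, 's', demo_call_sched);
```

Because the macro now owns the definitions, later helper declarations can be added inside it without touching any of its users.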

Signed-off-by: Steven Rostedt 
---
 kernel/rcutree.c|   14 ++
 kernel/rcutree_plugin.h |4 +---
 2 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/kernel/rcutree.c b/kernel/rcutree.c
index f560eb3..89a8b10 100644
--- a/kernel/rcutree.c
+++ b/kernel/rcutree.c
@@ -64,7 +64,8 @@
 static struct lock_class_key rcu_node_class[RCU_NUM_LVLS];
 static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
 
-#define RCU_STATE_INITIALIZER(sname, sabbr, cr) { \
+#define RCU_STATE_INITIALIZER(sname, sabbr, cr) \
+struct rcu_state sname##_state = { \
.level = { &sname##_state.node[0] }, \
.call = cr, \
.fqs_state = RCU_GP_IDLE, \
@@ -77,14 +78,11 @@ static struct lock_class_key rcu_fqs_class[RCU_NUM_LVLS];
.onoff_mutex = __MUTEX_INITIALIZER(sname##_state.onoff_mutex), \
.name = #sname, \
.abbr = sabbr, \
-}
-
-struct rcu_state rcu_sched_state =
-   RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched);
-DEFINE_PER_CPU(struct rcu_data, rcu_sched_data);
+}; \
+DEFINE_PER_CPU(struct rcu_data, sname##_data)
 
-struct rcu_state rcu_bh_state = RCU_STATE_INITIALIZER(rcu_bh, 'b', 
call_rcu_bh);
-DEFINE_PER_CPU(struct rcu_data, rcu_bh_data);
+RCU_STATE_INITIALIZER(rcu_sched, 's', call_rcu_sched);
+RCU_STATE_INITIALIZER(rcu_bh, 'b', call_rcu_bh);
 
 static struct rcu_state *rcu_state;
 LIST_HEAD(rcu_struct_flavors);
diff --git a/kernel/rcutree_plugin.h b/kernel/rcutree_plugin.h
index 49df554..5cdb559 100644
--- a/kernel/rcutree_plugin.h
+++ b/kernel/rcutree_plugin.h
@@ -110,9 +110,7 @@ static void __init rcu_bootup_announce_oddness(void)
 
 #ifdef CONFIG_TREE_PREEMPT_RCU
 
-struct rcu_state rcu_preempt_state =
-   RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu);
-DEFINE_PER_CPU(struct rcu_data, rcu_preempt_data);
+RCU_STATE_INITIALIZER(rcu_preempt, 'p', call_rcu);
 static struct rcu_state *rcu_state = &rcu_preempt_state;
 
 static int rcu_preempted_readers_exp(struct rcu_node *rnp);
-- 
1.7.10.4




Re: [ANN] backports-v3.10 released - first release!

2013-07-12 Thread Sven-Haegar Koch
On Fri, 12 Jul 2013, Luis R. Rodriguez wrote:

> Kicked out the first Linux kernel backports release under the new
> project name, "backports" that hopefully clarifies this a generic
> backport project now. Backported subsystems in this release:

> [0] 
> http://www.kernel.org/pub/linux/kernel/projects/backports/stable/v3.10/backports-3.10-1.tar.bz2

Did you perhaps re-upload the 3.10-rc1 backports archive?

haegar@blackhole:~/src/3.2.0$ tar tvf backports-3.10-1.tar.bz2 | head -n 2
drwxr-xr-x mcgrof/mcgrof 0 2013-05-19 03:06 backports-3.10-rc1-1/
-rw-r--r-- mcgrof/mcgrof  4164 2013-05-18 22:57 
backports-3.10-rc1-1/Kconfig.versions
...

$ cat versions 
BACKPORTS_VERSION="v3.10-rc1-1-0-g80112f6"
BACKPORTED_KERNEL_VERSION="v3.10-rc1-0-gf722406"
BACKPORTED_KERNEL_NAME="Linux"

c'ya
sven-haegar

-- 
Three may keep a secret, if two of them are dead.
- Ben F.


[Update][PATCH] ACPI / video / i915: Remove ACPI backlight if firmware expects Windows 8

2013-07-12 Thread Rafael J. Wysocki
From: Rafael J. Wysocki 

According to Matthew Garrett, "Windows 8 leaves backlight control up
to individual graphics drivers rather than making ACPI calls itself.
There's plenty of evidence to suggest that the Intel driver for
Windows [8] doesn't use the ACPI interface, including the fact that
it's broken on a bunch of machines when the OS claims to support
Windows 8.  The simplest thing to do appears to be to disable the
ACPI backlight interface on these systems".

There's a problem with that approach, however, because simply
avoiding to register the ACPI backlight interface if the firmware
calls _OSI for Windows 8 may not work in the following situations:
 (1) The ACPI backlight interface actually works on the given system
 and the i915 driver is not loaded (e.g. another graphics driver
 is used).
 (2) The ACPI backlight interface doesn't work on the given system,
 but there is a vendor platform driver that will register its
 own, equally broken, backlight interface if not prevented from
 doing so by the ACPI subsystem.
Therefore we need to allow the ACPI backlight interface to be
registered until the i915 driver is loaded which then will unregister
it if the firmware has called _OSI for Windows 8 (or will register
the ACPI video driver without backlight support if not already
present).

For this reason, introduce an alternative function for registering
ACPI video, acpi_video_register_with_quirks(), that will check
whether or not the ACPI video driver has already been registered
and whether or not the backlight Windows 8 quirk has to be applied.
If the quirk has to be applied, it will block the ACPI backlight
support and either unregister the backlight interface if the ACPI
video driver has already been registered, or register the ACPI
video driver without the backlight interface otherwise.  Make
the i915 driver use acpi_video_register_with_quirks() instead of
acpi_video_register() in i915_driver_load().

This change is based on earlier patches from Matthew Garrett,
Chun-Yi Lee and Seth Forshee and Aaron Lu's comments.

Signed-off-by: Rafael J. Wysocki 
---
 drivers/acpi/internal.h |   11 ++
 drivers/acpi/video.c|   65 
 drivers/acpi/video_detect.c |   21 
 drivers/gpu/drm/i915/i915_dma.c |2 -
 include/acpi/video.h|   11 ++
 include/linux/acpi.h|1 
 6 files changed, 103 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/acpi/video.c
===
--- linux-pm.orig/drivers/acpi/video.c
+++ linux-pm/drivers/acpi/video.c
@@ -44,6 +44,8 @@
 #include 
 #include 
 
+#include "internal.h"
+
 #define PREFIX "ACPI: "
 
 #define ACPI_VIDEO_BUS_NAME "Video Bus"
@@ -898,7 +900,7 @@ static void acpi_video_device_find_cap(s
device->cap._DDC = 1;
}
 
-   if (acpi_video_backlight_support()) {
+   if (acpi_video_verify_backlight_support()) {
struct backlight_properties props;
struct pci_dev *pdev;
acpi_handle acpi_parent;
@@ -1854,6 +1856,46 @@ static int acpi_video_bus_remove(struct
return 0;
 }
 
+static acpi_status video_unregister_backlight(acpi_handle handle, u32 lvl,
+ void *context, void **rv)
+{
+   struct acpi_device *acpi_dev;
+   struct acpi_video_bus *video;
+   struct acpi_video_device *dev, *next;
+
+   if (acpi_bus_get_device(handle, &acpi_dev))
+   return AE_OK;
+
+   if (acpi_match_device_ids(acpi_dev, video_device_ids))
+   return AE_OK;
+
+   video = acpi_driver_data(acpi_dev);
+   if (!video)
+   return AE_OK;
+
+   acpi_video_bus_stop_devices(video);
+	mutex_lock(&video->device_list_lock);
+	list_for_each_entry_safe(dev, next, &video->video_device_list, entry) {
+		if (dev->backlight) {
+			backlight_device_unregister(dev->backlight);
+			dev->backlight = NULL;
+			kfree(dev->brightness->levels);
+			kfree(dev->brightness);
+		}
+		if (dev->cooling_dev) {
+			sysfs_remove_link(&dev->dev->dev.kobj,
+					  "thermal_cooling");
+			sysfs_remove_link(&dev->cooling_dev->device.kobj,
+					  "device");
+			thermal_cooling_device_unregister(dev->cooling_dev);
+			dev->cooling_dev = NULL;
+		}
+	}
+	mutex_unlock(&video->device_list_lock);
+   acpi_video_bus_start_devices(video);
+   return AE_OK;
+}
+
 static int __init is_i740(struct pci_dev *dev)
 {
if (dev->device == 0x00D1)
@@ -1885,14 +1927,25 @@ static int __init intel_opregion_present
return opregion;
 }
 
-int acpi_video_register(void)
+int 

Re: [PATCH V3 1/3] dts: change Marvell prefix to 'marvell'

2013-07-12 Thread Haojian Zhuang
On Fri, Jul 12, 2013 at 11:10 PM, Daniel Drake  wrote:
> On Thu, Jul 11, 2013 at 5:54 PM, Haojian Zhuang
>  wrote:
>>> Well, Daniel Drake spoke up for OLPC.  Does that count?
>>
>> We don't know they used DT on Marvell MMP2/MMP3. So they don't have DTS file
>> in kernel, we could use both old name & new name in driver.
>
> You are listed as one of the MMP maintainers in the MAINTAINERS file
> and I have sent you several patches in the last 3 weeks which make
> OLPC's usage of MMP + DT pretty obvious. As a maintainer I believe you
> are supposed to review the patches too. hint hint ;)
>

These patches couldn't be applied. Since we're moving irq drivers from arch
directories to irq directories.

When the irq patches are applied, you can rebase your patches.

Regards
Haojian


Re: [RFC/RFT PATCH 4/5] ARM: mm: change max*pfn to include the physical offset of memory

2013-07-12 Thread Russell King - ARM Linux
On Fri, Jul 12, 2013 at 05:48:13PM -0400, Santosh Shilimkar wrote:
> Most of the kernel code assumes that max*pfn is the maximum pfn because
> the physical start of memory is expected to be PFN0. Since this
> assumption is not true on ARM architectures, the meaning of max*pfn
> is the number of memory pages. This is done to keep drivers happy which
> are making use of these variables to calculate the dma bounce limit
> using dma_mask.
> 
> Now since we have an architecture override possibility for DMAable
> maximum pfns, let's make the meaning of max*pfn the maximum pfn on ARM
> as well.
> 
> In the patch, the dma_to_pfn/pfn_to_dma() pair is hacked to take care of
> the physical memory offset. It is done this way just to enable testing
> since it's understood that it can get in the way of single zImage.

As Santosh says, this is a hack - but we need to have a discussion about
how to handle translations from PFN to bus addresses.

Currently, the way we do that on ARM is mostly assume that physical
addresses are the same as bus addresses, but that's not true everywhere,
and certainly isn't true when you have a 32-bit DMA controller which
can access physical memory, where the physical memory is above 4GB
in physical space.

We have certain platforms where the DMA address is already being
programmed into a controller with less than 32-bits in its address
register, and with a physical memory offset - and of course this
case just works out of the box because the high bits are ignored
by the device.

What I'm basically saying is we've had this problem for a while, and
we've lived with it by hoping and hacking, and adjusting max*pfn, but
this is not long-term sustainable.  We *need* to get away from the
idea that DMA addresses are physical addresses and device DMA masks
have some relationship to physical addresses.

Consider for a moment:

PCI address 0x00000000 ---> physical address 0xc0000000.

You plug a card in which can't do 32-bit addressing (remember, there
were such PCI cards in the past...).  The driver sets the DMA mask to
0x0fffffff (or whatever).  How does that relate to the PCI bus address?
It's 0x00000000 to 0x0fffffff.  How does that relate to the physical
address space?  0xc0000000 to 0xcfffffff.

This is why DMA masks can't be treated as some notional address limit.
It just doesn't work when you have bus offsets.
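
To make that arithmetic concrete, here is a minimal sketch in C. It assumes a
single fixed bus offset as in the example (bus address 0x00000000 maps to
physical address 0xc0000000); BUS_OFFSET, bus_to_phys() and dma_reachable()
are hypothetical names for illustration, not kernel interfaces.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical example value: bus address 0 maps to this physical address. */
#define BUS_OFFSET 0xc0000000ULL

/* Translate a bus (DMA) address into a physical address. */
static uint64_t bus_to_phys(uint64_t bus)
{
	return bus + BUS_OFFSET;
}

/*
 * A device can reach a physical address only if the corresponding bus
 * address fits within its DMA mask.  Comparing the mask against the
 * physical address directly gives the wrong answer once an offset exists.
 */
static bool dma_reachable(uint64_t phys, uint64_t dma_mask)
{
	if (phys < BUS_OFFSET)
		return false;	/* not visible through this bus window */
	return (phys - BUS_OFFSET) <= dma_mask;
}
```

With a mask of 0x0fffffff, physical 0xcfffffff is reachable even though it is
numerically far above the mask, while physical 0x00001000 is not reachable
even though it is numerically below it.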

And the extreme case of that is LPAE with all system memory above the
4GB physical mark, with 32-bit DMA capable peripherals - which we're
starting to see now.

Ideally, I think we need some kind of per-bus DT property to describe
the memory which can be accessed from the bus - to do it properly to
cover the cases we've already seen, that would be an offset and a size.

We then need some way for dma_to_pfn() and pfn_to_dma() to efficiently
get at that information - bear in mind that they're hot paths when doing
DMA mappings and the like.  I doubt we want to be looking up the same
property time and time again inside them.


Re: [Ksummit-2013-discuss] When to push bug fixes to mainline

2013-07-12 Thread Rafael J. Wysocki
On Thursday, July 11, 2013 08:34:30 PM Greg Kroah-Hartman wrote:
> On Thu, Jul 11, 2013 at 10:57:46PM -0400, John W. Linville wrote:
> > On Thu, Jul 11, 2013 at 08:50:23PM -0400, Theodore Ts'o wrote:
> > 
> > > In any case, I've been very conservative in _not_ pushing bug fixes to
> > > Linus after -rc3 (unless they are fixing a regression or the bug fix
> > > is super-serious); I'd much rather have them cook in the ext4 tree
> > > where they can get a lot more testing (a full regression test run for
> > > ext4 takes over 24 hours), and for people trying out linux-next.
> > > 
> > > Maybe the pendulum has swung too far in the direction of holding back
> > > changes and trying to avoid the risk of introducing regressions;
> > > perhaps this would be a good topic to discuss at the Kernel Summit.
> > 
> > Yes, there does seem to be a certain ebb and flow as to how strict
> > the rules are about what should go into stable, what fixes are "good
> > enough" for a given -rc, how tight those rule are in -rc2 vs in -rc6,
> > etc.  If nothing else, a good repetitive flogging and a restatement of
> > the One True Way to handle these things might be worthwhile once again...
> 
> The rules are documented in stable_kernel_rules.txt for what I will
> accept.
> 
> I have been beating on maintainers for 8 years now to actually mark
> patches for stable, and only this past year have I finally seen people
> do it (we FINALLY got SCSI patches marked for stable in this merge
> window!!!)  So now that maintainers are finally realizing that they need
> to mark patches, I'll be pushing back harder on the patches that they do
> submit, because the distros are rightfully pushing back on me for
> accepting things that are outside of the stable_kernel_rules.txt
> guidelines.

I don't quite understand why they are pushing back on you rather than on
the maintainers who have marked the commits they have problems with for
-stable.  Why are you supposed to play the role of the gatekeeper here?
Can't maintainers be held responsible for the commits they mark for -stable in
the same way as they are responsible for the commits they push to Linus?

Also, I don't really think that the distros have problems with fixes that are
simple and provably correct, even though the problems they fix don't seem to be
"serious enough" for -stable.  They rather have problems with subtle changes
whose impact is difficult to estimate by inspection and you're not going to be
pushing back on those anyway (exactly because their impact is difficult to
estimate).

> If you look on the stable@vger list, I've already rejected 3 today and
> asked about the huge 21 powerpc patches.  Sure, it's not a lot, when
> staring down 174 more to go, but it's a start...

And 2 of those 3 rejected were mine and for 1 of them I actually had a very
specific reason to mark it for -stable as I told you: It fixed a breakage
introduced inadvertently in 3.10 and I thought it would be good to reduce
the exposure of that breakage by fixing it in 3.10.1 as well as in 3.11-rc.

Of course, you are free to disagree with that, but it's not like there was no
reason.

Thanks,
Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Russell King - ARM Linux
On Sat, Jul 13, 2013 at 03:42:55AM +0400, Sergei Shtylyov wrote:
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 86d5220..e8275fa 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct
> Scsi_Host *shost)
>
>host_dev = scsi_get_device(shost);
>if (host_dev && host_dev->dma_mask)
> -bounce_limit = *host_dev->dma_mask;
> +bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;
>
 You definitely forgot -1 here.
>
>>> Please explain your point.
>
>> Previously, 'bounce_limit' would look like 0xffffffff (unless I'm
>> mistaken), now it would look like 0xfffff000 which is hardly what we're
>> looking for, no?
>
>Although, -1 won't give us the correct result in this case, it's more 
> like + PAGE_SIZE - 1.

And where it's used is blk_bounce_limit(), the first thing which that does
is convert it back to a PFN, losing the bottom bits again...

I'm tempted to suggest converting the whole thing to just deal with
PFNs rather than bytes since we only deal with "can we DMA to this"
on a per-page basis.
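
A minimal sketch of that round trip in C, assuming 4 KiB pages; the helper
names are hypothetical and only model the arithmetic, not the block layer
code itself.

```c
#include <stdint.h>

#define PAGE_SHIFT 12			/* assumption: 4 KiB pages */
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Byte limit as computed in the patch: first byte of the last page. */
static uint64_t limit_from_pfn(uint64_t max_pfn)
{
	return max_pfn << PAGE_SHIFT;
}

/* Byte limit covering the whole last page (+ PAGE_SIZE - 1). */
static uint64_t limit_from_pfn_inclusive(uint64_t max_pfn)
{
	return (max_pfn << PAGE_SHIFT) + PAGE_SIZE - 1;
}

/* What a consumer does when it turns the byte limit back into a PFN. */
static uint64_t pfn_from_limit(uint64_t limit)
{
	return limit >> PAGE_SHIFT;
}
```

Both byte limits collapse to the same PFN once they are shifted back down,
which is why carrying PFNs throughout would avoid the question entirely.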


[PATCH] misc: add driver for Renesas R-Car Gyro-ADC/speed-pulse interfaces

2013-07-12 Thread Sergei Shtylyov
Add the driver for Gyro-ADC/speed-pulse interfaces found in Renesas R-Car SoCs.
Though they are two separate devices, they have to be driven together because of
the shared start/stop register (located in Gyro-ADC still). At this time, only
the speed-pulse interface is fully supported; the Gyro-ADC is just initialized and
started/stopped synchronously with the speed-pulse interface.  A user interface
is implemented via several sysfs files which allow reading and resetting the
speed-pulse interface's registers.

Signed-off-by: Sergei Shtylyov 

---
This patch is against the 'char-misc-next' branch of Greg KH's 'char-misc.git'.

 drivers/misc/Kconfig |   10 +
 drivers/misc/Makefile|1 
 drivers/misc/rcar-gyro-adc-speed-pulse.c |  231 +++
 3 files changed, 242 insertions(+)

Index: char-misc/drivers/misc/Kconfig
===
--- char-misc.orig/drivers/misc/Kconfig
+++ char-misc/drivers/misc/Kconfig
@@ -528,6 +528,16 @@ config SRAM
  the genalloc API. It is supposed to be used for small on-chip SRAM
  areas found on many SoCs.
 
+config RCAR_GYRO_ADC_SPEED_PULSE
+   tristate "Renesas R-Car Gyro-ADC and speed-pulse interfaces driver"
+   depends on HAS_IOMEM
+   help
+ This driver allows you to read speed pulse signal characteristics via
+ sysfs. The Gyro-ADC interface is not currently supported.
+
+ This driver can also be built as a module. If so, the module
+ will be called rcar_gyro_adc_speed_pulse.
+
 source "drivers/misc/c2port/Kconfig"
 source "drivers/misc/eeprom/Kconfig"
 source "drivers/misc/cb710/Kconfig"
Index: char-misc/drivers/misc/Makefile
===
--- char-misc.orig/drivers/misc/Makefile
+++ char-misc/drivers/misc/Makefile
@@ -53,3 +53,4 @@ obj-$(CONFIG_INTEL_MEI)   += mei/
 obj-$(CONFIG_VMWARE_VMCI)  += vmw_vmci/
 obj-$(CONFIG_LATTICE_ECP3_CONFIG)  += lattice-ecp3-config.o
 obj-$(CONFIG_SRAM) += sram.o
+obj-$(CONFIG_RCAR_GYRO_ADC_SPEED_PULSE) += rcar-gyro-adc-speed-pulse.o
Index: char-misc/drivers/misc/rcar-gyro-adc-speed-pulse.c
===
--- /dev/null
+++ char-misc/drivers/misc/rcar-gyro-adc-speed-pulse.c
@@ -0,0 +1,231 @@
+/*
+ * Renesas R-Car Gyro-ADC and speed-pulse interfaces driver
+ *
+ * Copyright (C) 2013  Renesas Solutions Corp.
+ * Copyright (C) 2013  Cogent Embedded, Inc.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms of the GNU General Public License version 2 as published
+ * by the Free Software Foundation; of the License.
+ *
+ * NOTE: Gyro-ADC interface is not really supported yet, just initialized
+ * and started/stopped synchronously with the speed-pulse interface.
+ */
+
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include 
+
+/* Gyro-ADC interface registers */
+#define ADC_MODE_SELECT             0x00
+#define ADC_SPEED_START_STOP        0x04
+#define ADC_CLOCK_LENGTH_COUNT      0x08
+#define _1_25_MS_LENGTH_COUNT       0x0C
+#define AD_CH_REAL_DATA(n)          (0x10 + (n) * 4)
+#define AD_CH_ADD_DATA(n)           (0x30 + (n) * 4)
+#define AD_CH_10MS_DATA_FIFO(n)     (0x50 + (n) * 4)
+#define AD_FIFO_STATUS              0x70
+
+/* Speed-pulse interface registers */
+#define SPEED_PULSE_COUNT_DATA        0x000
+#define SPEED_PULSE_FILTER_SETTING    0x004
+#define SPEED_PULSE_COUNT_CLEARING    0x008
+#define SPEED_PULSE_100MS_LATCH_DATA  0x00C
+#define _100_MS_INT_COUNT             0x010
+#define INT_STATUS_AND_CLEAR          0x018
+#define _500_KHZ_FREQ_COUNT_SETTING   0x01C
+#define SPEED_PULSE_OFFSET_A          0x100
+#define SPEED_PULSE_OFFSET_B          0x104
+#define SPEED_PULSE_WIDTH             0x108
+#define SPEED_PULSE_OBSERVE_A         0x10C
+#define SPEED_PULSE_OBSERVE_B         0x110
+#define SPEED_PULSE_WIDTH_CLEARING    0x114
+#define SPEED_PULSE_WIDTH_TEST        0x118
+
+struct rcar_adc_sp_priv {
+   void __iomem *adc_base;
+   void __iomem *sp_base;
+   struct clk *adc_clk;
+   struct clk *sp_clk;
+};
+
+static u16 filter_time_const;
+module_param_named(filter, filter_time_const, ushort, S_IRUGO);
+MODULE_PARM_DESC(filter,
+"Low-pass filter time constant in microseconds (default=0)");
+
+static ssize_t rcar_adc_sp_show_pulse_count(struct device *dev,
+   struct device_attribute *attr,
+   char *buf)
+{
+   struct rcar_adc_sp_priv *priv = dev_get_drvdata(dev);
+   u32 value = 

Re: [Bisected] 3.7-rc1 can't resume (still present in 3.9)

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 04:36 PM, Christian Sünkenberg wrote:
> 
> Jonas tried your patch and it fixes suspend/resume on his T43, although
> IMHO the safest approach would be to just add an exception for
> Vendor==Intel && Family==6 && Model==13, or more generally Vendor==Intel
> && !supports_long_mode, as the same erratum also warns about wrmsr
> possibly not triggering a GP either.
> Anyways, at least on this specific MSR with the Pentium M Jonas tested,
> it behaved correctly on every try, so I'd say your patch does the trick,
> thank you very much!
> 

Using vendor matches is not really a great way to deal with things that
can better be handled analytically.

If WRMSR doesn't fault, it is not a problem...

> As a side note, I found a similar erratum #33 in "Pentium® Processor
> Specification Update" for Intel P54C (Family 5, Model 2), which would,
> supposing there are P54C systems with ACPI sleep/resume support, result
> in MSR 0 (P5_MC_ADDR) being saved and restored instead of the nonexistent EFER.

Doesn't really matter, as we'd only read that one after an #MC.

-hpa




Re: [PATCH] ACPI / memhotplug: Fix a stale pointer in error path

2013-07-12 Thread Rafael J. Wysocki
On Friday, July 12, 2013 04:28:36 PM Toshi Kani wrote:
> On Fri, 2013-07-12 at 23:40 +0200, Rafael J. Wysocki wrote:
> > On Friday, July 12, 2013 03:12:24 PM Toshi Kani wrote:
> > > On Fri, 2013-07-12 at 23:13 +0200, Rafael J. Wysocki wrote:
> > > > On Friday, July 12, 2013 03:01:15 PM Toshi Kani wrote:
> > > > > On Fri, 2013-07-12 at 22:42 +0200, Rafael J. Wysocki wrote:
> > > > > > On Friday, July 12, 2013 08:51:29 AM Toshi Kani wrote:
> > > > > > > On Fri, 2013-07-12 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> > > > > > > > (2013/07/11 1:47), Toshi Kani wrote:
> > > > > > > > > device->driver_data needs to be cleared when releasing its 
> > > > > > > > > data,
> > > > > > > > > mem_device, in an error path of acpi_memory_device_add().
> > > > > > > > > 
> > > > > > > > > Signed-off-by: Toshi Kani 
> > > > > > > > > ---
> > > > > > > > 
> > > > > > > > Reviewed-by: Yasuaki Ishimatsu 
> > > > > > > 
> > > > > > > Thanks Yasuaki!
> > > > > > 
> > > > > > Queued up as a fix for 3.11.
> > > > > 
> > > > > Thanks!
> > > > > 
> > > > > > Do we need that in -stable as well?
> > > > > 
> > > > > Good point.  Yes, we need that in -stable as well.
> > > > 
> > > > What's the oldest mainline major release that fix is applicable to?
> > > 
> > > The fix is applicable all the way back to 2.6.32.
> > 
> > For -stable I'll need to say some more about what practical consequences of
> > the bug are.  Is it difficult to trigger?
> 
> The function evaluates _CRS of memory device objects, and fails when it
> gets an unexpected resource or cannot allocate memory.

OK, so this is essentially about surviving unexpected external input, which
I suppose is serious enough.

> A kernel crash
> or data corruption may occur when the kernel accessed a stale pointer.
> That said, I am not sure how critical this issue is for old kernels
> since I do not think there are many platforms that support memory
> hotplug today.

Which doesn't matter.  People may want to run 3.10.y on future hardware too.
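
For reference, the pattern the fix follows can be sketched in plain C as
below. This is a userspace illustration only: struct device, struct
mem_device, get_resources() and device_add() are hypothetical stand-ins, not
the ACPI code, and the failure is injected through a flag.

```c
#include <stdlib.h>

struct mem_device { int state; };	/* hypothetical payload */
struct device { void *driver_data; };	/* hypothetical owner */

/* Stand-in for resource parsing; fails when asked to. */
static int get_resources(struct mem_device *m, int fail)
{
	(void)m;
	return fail ? -1 : 0;
}

static int device_add(struct device *dev, int fail)
{
	struct mem_device *m = calloc(1, sizeof(*m));
	int result;

	if (!m)
		return -1;
	dev->driver_data = m;

	result = get_resources(m, fail);
	if (result) {
		/* Clear the owner's pointer before freeing, so nothing stale remains. */
		dev->driver_data = NULL;
		free(m);
		return result;
	}
	return 0;
}
```

Without the assignment to NULL, the owner would keep pointing at freed memory
after the error return.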

> After reading the recent -stable discussion on LKML, now
> I am not sure if this fix should be applied for -stable.

Well, I don't necessarily agree with some things being said there.  I guess
I'll need to say something in that thread. :-)

> I instrumented the kernel to generate an error for testing this change.

OK

Thanks,
Rafael


> > > > > > > > >   drivers/acpi/acpi_memhotplug.c |1 +
> > > > > > > > >   1 file changed, 1 insertion(+)
> > > > > > > > > 
> > > > > > > > > diff --git a/drivers/acpi/acpi_memhotplug.c 
> > > > > > > > > b/drivers/acpi/acpi_memhotplug.c
> > > > > > > > > index c711d11..999adb5 100644
> > > > > > > > > --- a/drivers/acpi/acpi_memhotplug.c
> > > > > > > > > +++ b/drivers/acpi/acpi_memhotplug.c
> > > > > > > > > @@ -323,6 +323,7 @@ static int acpi_memory_device_add(struct 
> > > > > > > > > acpi_device *device,
> > > > > > > > >   /* Get the range from the _CRS */
> > > > > > > > >   result = acpi_memory_get_device_resources(mem_device);
> > > > > > > > >   if (result) {
> > > > > > > > > + device->driver_data = NULL;
> > > > > > > > >   kfree(mem_device);
> > > > > > > > >   return result;
> > > > > > > > >   }
> > > > > > > > > --
> > > > > > > > > To unsubscribe from this list: send the line "unsubscribe 
> > > > > > > > > linux-acpi" in
> > > > > > > > > the body of a message to majord...@vger.kernel.org
> > > > > > > > > More majordomo info at  
> > > > > > > > > http://vger.kernel.org/majordomo-info.html
> > > > > > > > > 
> > > > > > > > 
> > > > > > > > 
> > > > > > > 
> > > > > > > 
> > > > > 
> > > > > 
> > > 
> > > 
> 
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Sergei Shtylyov

On 07/13/2013 03:08 AM, Sergei Shtylyov wrote:


DMA bounce limit is the maximum direct DMA'able memory beyond which
bounce buffers have to be used to perform dma operations. SCSI driver
relies on dma_mask but its calculation is based on max_*pfn which
doesn't have uniform meaning across architectures. So make use of
dma_max_pfn() which is expected to return the DMAable maximum pfn
value across architectures.



Cc: Russell King 
Cc: linux-s...@vger.kernel.org



Signed-off-by: Santosh Shilimkar 
---
   drivers/scsi/scsi_lib.c |2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)



diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..e8275fa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct
Scsi_Host *shost)

   host_dev = scsi_get_device(shost);
   if (host_dev && host_dev->dma_mask)
-bounce_limit = *host_dev->dma_mask;
+bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;



You definitely forgot -1 here.



Please explain your point.



Previously, 'bounce_limit' would look like 0xffffffff (unless I'm
mistaken), now it would look like 0xfffff000 which is hardly what we're
looking for, no?


   Although, -1 won't give us the correct result in this case, it's 
more like + PAGE_SIZE - 1.


WBR, Sergei



[ANN] backports-v3.10 released - first release!

2013-07-12 Thread Luis R. Rodriguez
Kicked out the first Linux kernel backports release under the new
project name, "backports", which hopefully clarifies that this is a generic
backport project now. Backported subsystems in this release:

  * Ethernet
  * Wireless
  * Bluetooth
  * NFC
  * GPU
  * Media
  * Regulator

Go read the git tree for a full ChangeLog; moving forward I won't be
providing those anymore. The highlights are Johannes Berg's awesome
work on adding full Kconfig support, which means you can use 'make
menuconfig', his huge rework of the entire build system, and
kicking us into pythonifying everything non-user / build related.
Other obvious highlights are the huge ongoing contributions by Hauke,
without which none of this would have happened. Thierry Escande
threw in NFC, and we also now have Media and Regulator subsystems
backported. We should now have over 830 drivers backported.

Under the new build system users should query 'make help' and 'make
defconfig-help'.

Go download the new release [0] and if you want something more fancy
check out the new shiny but temporary release page [1].

[0] 
http://www.kernel.org/pub/linux/kernel/projects/backports/stable/v3.10/backports-3.10-1.tar.bz2
[1] http://drvbp1.linux-foundation.org/~mcgrof/rel-html/backports/

  Luis


Re: [Bisected] 3.7-rc1 can't resume (still present in 3.9)

2013-07-12 Thread Christian Sünkenberg
Hello,

On 07/11/2013 01:57 AM, H. Peter Anvin wrote:
> On 07/10/2013 01:52 PM, Christian Sünkenberg wrote:
>> Hello,
>>
>> On 05/01/2013 07:33 PM, H. Peter Anvin wrote:
>>> On 05/01/2013 10:01 AM, Jonas Heinrich wrote:
 Hello, I tried the newest kernel, 3.9 today but the bug is still
 present. Applying the attached patch solves the bug for me.

 Best regards, Jonas Heinrich
>>>
>>> Okay... WTF is going on here?  Does pmode_behavior just not get set up
>>> correctly?  Since it seems you can get it to wake up with your patch,
>>> perhaps we can get read out the value of pmode_behavior and print it...
>>
>> indeed, arch/x86/kernel/acpi/sleep.c tries an rdmsr_safe(MSR_EFER, ...)
>> and sets WAKEUP_BEHAVIOR_RESTORE_EFER bit on success, however,
>> on 90 nm Pentium M (Family 6, Model 13), reading an invalid MSR
>> is not guaranteed to trap, see Erratum X4 in "Intel® Pentium® M
>> Processor on 90 nm Process with 2-MB L2 Cache and Intel® Processor A100
>> and A110 on 90 nm process with 512-KB L2 Cache Specification Update".
>> On Jonas' T43, which has an affected Pentium M without EFER,
>> rdmsr_safe(MSR_EFER, ...) succeeds and WAKEUP_BEHAVIOR_RESTORE_EFER
>> gets set, while on resume the corresponding wrmsr traps and thus resume
>> fails.
>>
>> The pre-3.7 code snippet incidentally caught this by not restoring
>> EFER when it would be restored to all 0s.
>>
> 
> That does seem like a reasonable explanation.
> 
> Does this patch fix the problem?  (Comment blatantly ripped off from
> your email message.)

Jonas tried your patch and it fixes suspend/resume on his T43, although
IMHO the safest approach would be to just add an exception for
Vendor==Intel && Family==6 && Model==13, or more generally Vendor==Intel
&& !supports_long_mode, as the same erratum also warns about wrmsr
possibly not triggering a GP either.
Anyways, at least on this specific MSR with the Pentium M Jonas tested,
it behaved correctly on every try, so I'd say your patch does the trick,
thank you very much!

As a side note, I found a similar erratum #33 in "Pentium® Processor
Specification Update" for Intel P54C (Family 5, Model 2), which would,
supposing there are P54C systems with ACPI sleep/resume support, result
in MSR 0 (P5_MC_ADDR) being saved and restored instead of the nonexistent EFER.

Kind regards,
Christian





Re: [PATCH] Route kbd LEDs through the generic LEDs layer

2013-07-12 Thread Pavel Machek
Hi!

> > > This permits reassigning keyboard LEDs to something other than keyboard
> > > "leds" state, by adding keyboard led and modifier triggers connected to a
> > > series of VT input LEDs, themselves connected to VT input triggers, which
> > > per-input device LEDs use by default.  Userland can thus easily change the
> > > LED behavior of (a priori) all input devices, or of particular input devices.
> > 
> > Nice! Leds now have proper /sys interface.
> > 
> > But... I boot up, switch from X to console, press capslock, and no
> > reaction anywhere.
> 
> Is it working without the patch?  Console-setup for instance is known to
> have broken the capslock LED, which is precisely one of the reasons for
> this patch, which will provide console-setup with a way to bring back
> caps lock working properly.

You are right, it was broken before.

> At any rate, please provide way more information about your keyboard
> and LED configuration (output of dumpkeys, dmesg, content of
> /sys/class/leds/*/trigger, etc.), as things are just working fine for me
> (just like it has been for the past two years).

Well... I just verified, and it works "as well as before", which it
should.

> > Note that this is notebook with usb keyboard plugged in (and two
> > monitors), but I believe this worked before...
> 
> Things work fine with my USB keyboard too, is this perhaps using an odd
> driver which would not expose LEDs in a standard way?

No, everything works as well as it did. Feel free to add:

Tested-by: Pavel Machek 

Thanks,
Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) 
http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


[PATCH] mm/hugetlb: per-vma instantiation mutexes

2013-07-12 Thread Davidlohr Bueso
The hugetlb_instantiation_mutex serializes hugepage allocation and
instantiation in the page directory entry. It was found that this mutex can
become quite contended during the early phases of large databases which make
use of huge pages - for instance startup and initial runs. One clear example
is a 1.5Gb Oracle database, where lockstat reports that this mutex can be one
of the top 5 most contended locks in the kernel during the first few minutes:

hugetlb_instantiation_mutex:  10678 10678
 ---
 hugetlb_instantiation_mutex    10678  [] hugetlb_fault+0x9e/0x340
 ---
 hugetlb_instantiation_mutex    10678  [] hugetlb_fault+0x9e/0x340

contentions:  10678
acquisitions: 99476
waittime-total: 76888911.01 us

Instead of serializing each hugetlb fault, we can deal with concurrent faults
for pages in different vmas. The per-vma mutex is initialized when creating a
new vma. So, back to the example above, we now get much less contention:

 &vma->hugetlb_instantiation_mutex:  1 1
   -
   &vma->hugetlb_instantiation_mutex   1   [] hugetlb_fault+0xa6/0x350
   -
   &vma->hugetlb_instantiation_mutex   1   [] hugetlb_fault+0xa6/0x350

contentions:  1
acquisitions:108092
waittime-total:  621.24 us

Signed-off-by: Davidlohr Bueso 
---
 include/linux/mm_types.h |  3 +++
 mm/hugetlb.c | 12 +---
 mm/mmap.c|  3 +++
 3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/include/linux/mm_types.h b/include/linux/mm_types.h
index fb425aa..b45fd87 100644
--- a/include/linux/mm_types.h
+++ b/include/linux/mm_types.h
@@ -289,6 +289,9 @@ struct vm_area_struct {
 #ifdef CONFIG_NUMA
struct mempolicy *vm_policy;/* NUMA policy for the VMA */
 #endif
+#ifdef CONFIG_HUGETLB_PAGE
+   struct mutex hugetlb_instantiation_mutex;
+#endif
 };
 
 struct core_thread {
diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 83aff0a..12e665b 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -137,12 +137,12 @@ static inline struct hugepage_subpool *subpool_vma(struct vm_area_struct *vma)
  * The region data structures are protected by a combination of the mmap_sem
  * and the hugetlb_instantion_mutex.  To access or modify a region the caller
  * must either hold the mmap_sem for write, or the mmap_sem for read and
- * the hugetlb_instantiation mutex:
+ * the vma's hugetlb_instantiation mutex:
  *
  * down_write(&mm->mmap_sem);
  * or
  * down_read(&mm->mmap_sem);
- * mutex_lock(&hugetlb_instantiation_mutex);
+ * mutex_lock(&vma->hugetlb_instantiation_mutex);
  */
 struct file_region {
struct list_head link;
@@ -2547,7 +2547,7 @@ static int unmap_ref_private(struct mm_struct *mm, struct vm_area_struct *vma,
 
 /*
  * Hugetlb_cow() should be called with page lock of the original hugepage held.
- * Called with hugetlb_instantiation_mutex held and pte_page locked so we
+ * Called with the vma's hugetlb_instantiation_mutex held and pte_page locked so we
  * cannot race with other handlers or page migration.
  * Keep the pte_same checks anyway to make transition from the mutex easier.
  */
@@ -2847,7 +2847,6 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
int ret;
struct page *page = NULL;
struct page *pagecache_page = NULL;
-   static DEFINE_MUTEX(hugetlb_instantiation_mutex);
struct hstate *h = hstate_vma(vma);
 
address &= huge_page_mask(h);
@@ -2872,7 +2871,7 @@ int hugetlb_fault(struct mm_struct *mm, struct vm_area_struct *vma,
 * get spurious allocation failures if two CPUs race to instantiate
 * the same page in the page cache.
 */
-   mutex_lock(&hugetlb_instantiation_mutex);
+   mutex_lock(&vma->hugetlb_instantiation_mutex);
entry = huge_ptep_get(ptep);
if (huge_pte_none(entry)) {
ret = hugetlb_no_page(mm, vma, address, ptep, flags);
@@ -2943,8 +2942,7 @@ out_page_table_lock:
put_page(page);
 
 out_mutex:
-   mutex_unlock(&hugetlb_instantiation_mutex);
-
+   mutex_unlock(&vma->hugetlb_instantiation_mutex);
return ret;
 }
 
diff --git a/mm/mmap.c b/mm/mmap.c
index fbad7b0..8f0b034 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1543,6 +1543,9 @@ munmap_back:
vma->vm_page_prot = vm_get_page_prot(vm_flags);
vma->vm_pgoff = pgoff;
INIT_LIST_HEAD(&vma->anon_vma_chain);
+#ifdef CONFIG_HUGETLB_PAGE
+   mutex_init(&vma->hugetlb_instantiation_mutex);
+#endif
 
error = -EINVAL;/* when rejecting VM_GROWSDOWN|VM_GROWSUP */
 
-- 
1.7.11.7
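
For readers who want to see the effect of the locking change outside the
kernel, here is a minimal userspace sketch, assuming POSIX threads. It is not
taken from the patch: struct region, region_init(), fault_would_wait() and
probe_contention() are hypothetical names, with struct region standing in for
a vma that carries its own mutex.

```c
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for a vma: every region carries its own mutex. */
struct region {
    pthread_mutex_t instantiation_mutex;
};

/* Mirrors the mutex_init() call the patch adds where a vma is set up. */
static int region_init(struct region *r)
{
    return pthread_mutex_init(&r->instantiation_mutex, NULL);
}

/*
 * Report whether a fault on this region would have to wait right now.
 * Returns 1 if the region's mutex is already held, 0 otherwise.
 */
static int fault_would_wait(struct region *r)
{
    int err = pthread_mutex_trylock(&r->instantiation_mutex);

    if (err == EBUSY)
        return 1;
    if (err == 0)
        pthread_mutex_unlock(&r->instantiation_mutex);
    return 0;
}

/*
 * Hold region a's mutex, as a fault in progress would, and record whether
 * faults on a and on b would have to wait in the meantime.
 */
static void probe_contention(struct region *a, struct region *b,
                             int *a_waits, int *b_waits)
{
    pthread_mutex_lock(&a->instantiation_mutex);
    *a_waits = fault_would_wait(a);
    *b_waits = fault_would_wait(b);
    pthread_mutex_unlock(&a->instantiation_mutex);
}
```

With one mutex per region, a fault in progress on region a only delays other
faults on a. With a single static mutex shared by all regions, the probe on b
would have to wait as well.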





Re: [PATCH net 2/2] usb/net/r815x: fix cast to restricted __le32

2013-07-12 Thread David Miller
From: Hayes Wang 
Date: Fri, 12 Jul 2013 16:26:16 +0800

>>> drivers/net/usb/r815x.c:38:16: sparse: cast to restricted __le32
>>> drivers/net/usb/r815x.c:67:15: sparse: cast to restricted __le32
>>> drivers/net/usb/r815x.c:69:13: sparse: incorrect type in assignment (different base types)
>    drivers/net/usb/r815x.c:69:13:    expected unsigned int [unsigned] [addressable] [assigned] [usertype] tmp
>    drivers/net/usb/r815x.c:69:13:    got restricted __le32 [usertype]
> 
> 
> Signed-off-by: Hayes Wang 
> Spotted-by: kbuild test robot 

Applied.


Re: [PATCH net 1/2] usb/net/r8152: fix integer overflow in expression

2013-07-12 Thread David Miller
From: Hayes Wang 
Date: Fri, 12 Jul 2013 16:26:15 +0800

> config: make ARCH=avr32 allyesconfig
> drivers/net/usb/r8152.c: In function 'rtl8152_start_xmit':
> drivers/net/usb/r8152.c:956: warning: integer overflow in expression
> 
>    955        memset(tx_desc, 0, sizeof(*tx_desc));
>  > 956        tx_desc->opts1 = cpu_to_le32((len & TX_LEN_MASK) | TX_FS | TX_LS);
>    957        tp->tx_skb = skb;
> 
> Signed-off-by: Hayes Wang 
> Spotted-by: kbuild test robot 

Applied.


Re: [RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Sergei Shtylyov

Hello.

On 07/13/2013 02:25 AM, Russell King - ARM Linux wrote:

>>> DMA bounce limit is the maximum direct DMA'able memory beyond which
>>> bounce buffers has to be used to perform dma operations. SCSI driver
>>> relies on dma_mask but its calculation is based on max_*pfn which
>>> don't have uniform meaning across architectures. So make use of
>>> dma_max_pfn() which is expected to return the DMAable maximum pfn
>>> value across architectures.

>>> Cc: Russell King 
>>> Cc: linux-s...@vger.kernel.org

>>> Signed-off-by: Santosh Shilimkar 
>>> ---
>>>   drivers/scsi/scsi_lib.c |2 +-
>>>   1 file changed, 1 insertion(+), 1 deletion(-)

>>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>>> index 86d5220..e8275fa 100644
>>> --- a/drivers/scsi/scsi_lib.c
>>> +++ b/drivers/scsi/scsi_lib.c
>>> @@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
>>>
>>> host_dev = scsi_get_device(shost);
>>> if (host_dev && host_dev->dma_mask)
>>> -   bounce_limit = *host_dev->dma_mask;
>>> +   bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;

>> You definitely forgot -1 here.

> Please explain your point.

   Previously, 'bounce_limit' would look like 0x (unless I'm mistaken), now it
would look like 0xf000 which is hardly what we're looking for, no?


WBR, Sergei
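
The arithmetic behind this exchange can be checked in userspace. The sketch
below assumes 4 KiB pages (PAGE_SHIFT of 12), a 32-bit DMA mask of 0xffffffff,
and that dma_max_pfn() reduces to dma_mask >> PAGE_SHIFT. None of these are
stated in the thread, and all helper names are made up for illustration. The
last helper mirrors the b_pfn computation shown in patch 1/5 of this series.

```c
#include <stdint.h>

#define PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Assumed behaviour of dma_max_pfn(): highest page frame under the mask. */
static uint64_t mask_to_max_pfn(uint64_t dma_mask)
{
    return dma_mask >> PAGE_SHIFT;
}

/* What the patched line computes: the first byte of the last DMA-able page. */
static uint64_t pfn_to_page_start(uint64_t pfn)
{
    return pfn << PAGE_SHIFT;
}

/* An inclusive limit: the last byte of that page (one reading of "-1"). */
static uint64_t pfn_to_last_byte(uint64_t pfn)
{
    return ((pfn + 1) << PAGE_SHIFT) - 1;
}

/* Mirrors "b_pfn = max_addr >> PAGE_SHIFT" in blk_queue_bounce_limit(). */
static uint64_t bounce_pfn(uint64_t max_addr)
{
    return max_addr >> PAGE_SHIFT;
}
```

Under these assumptions pfn_to_page_start() yields 0xfffff000 where the old
code passed 0xffffffff, which is the difference pointed out above. Both values
map to the same bounce pfn, 0xfffff, once shifted right by PAGE_SHIFT again.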



Re: [PATCH] mtip32xx: dynamically allocate buffer in debugfs functions

2013-07-12 Thread Asai Thambi S P
On 5/23/2013 2:23 PM, David Milburn wrote:

> Dynamically allocate buf to prevent warnings:
> 
> drivers/block/mtip32xx/mtip32xx.c: In function ‘mtip_hw_read_device_status’:
> drivers/block/mtip32xx/mtip32xx.c:2823: warning: the frame size of 1056 bytes is larger than 1024 bytes
> drivers/block/mtip32xx/mtip32xx.c: In function ‘mtip_hw_read_registers’:
> drivers/block/mtip32xx/mtip32xx.c:2894: warning: the frame size of 1056 bytes is larger than 1024 bytes
> drivers/block/mtip32xx/mtip32xx.c: In function ‘mtip_hw_read_flags’:
> drivers/block/mtip32xx/mtip32xx.c:2917: warning: the frame size of 1056 bytes is larger than 1024 bytes
> 
> Signed-off-by: David Milburn 


Acked-by: Asai Thambi S P 

> ---
>  drivers/block/mtip32xx/mtip32xx.c |   47 
> +
>  1 files changed, 37 insertions(+), 10 deletions(-)
> 
> diff --git a/drivers/block/mtip32xx/mtip32xx.c b/drivers/block/mtip32xx/mtip32xx.c
> index 847107e..baa0996 100644
> --- a/drivers/block/mtip32xx/mtip32xx.c
> +++ b/drivers/block/mtip32xx/mtip32xx.c
> @@ -2806,34 +2806,51 @@ static ssize_t show_device_status(struct device_driver *drv, char *buf)
>  static ssize_t mtip_hw_read_device_status(struct file *f, char __user *ubuf,
>   size_t len, loff_t *offset)
>  {
> + struct driver_data *dd =  (struct driver_data *)f->private_data;
>   int size = *offset;
> - char buf[MTIP_DFS_MAX_BUF_SIZE];
> + char *buf;
> + int rv = 0;
>  
>   if (!len || *offset)
>   return 0;
>  
> + buf = kzalloc(MTIP_DFS_MAX_BUF_SIZE, GFP_KERNEL);
> + if (!buf) {
> + dev_err(&dd->pdev->dev,
> + "Memory allocation: status buffer\n");
> + return -ENOMEM;
> + }
> +
>   size += show_device_status(NULL, buf);
>  
>   *offset = size <= len ? size : len;
>   size = copy_to_user(ubuf, buf, *offset);
>   if (size)
> - return -EFAULT;
> + rv = -EFAULT;
>  
> - return *offset;
> + kfree(buf);
> + return rv ? rv : *offset;
>  }
>  
>  static ssize_t mtip_hw_read_registers(struct file *f, char __user *ubuf,
> size_t len, loff_t *offset)
>  {
>   struct driver_data *dd =  (struct driver_data *)f->private_data;
> - char buf[MTIP_DFS_MAX_BUF_SIZE];
> + char *buf;
>   u32 group_allocated;
>   int size = *offset;
> - int n;
> + int n, rv = 0;
>  
>   if (!len || size)
>   return 0;
>  
> + buf = kzalloc(MTIP_DFS_MAX_BUF_SIZE, GFP_KERNEL);
> + if (!buf) {
> + dev_err(&dd->pdev->dev,
> + "Memory allocation: register buffer\n");
> + return -ENOMEM;
> + }
> +
>   size += sprintf(&buf[size], "H/ S ACTive  : [ 0x");
>  
>   for (n = dd->slot_groups-1; n >= 0; n--)
> @@ -2888,21 +2905,30 @@ static ssize_t mtip_hw_read_registers(struct file *f, char __user *ubuf,
>   *offset = size <= len ? size : len;
>   size = copy_to_user(ubuf, buf, *offset);
>   if (size)
> - return -EFAULT;
> + rv = -EFAULT;
>  
> - return *offset;
> + kfree(buf);
> + return rv ? rv : *offset;
>  }
>  
>  static ssize_t mtip_hw_read_flags(struct file *f, char __user *ubuf,
> size_t len, loff_t *offset)
>  {
>   struct driver_data *dd =  (struct driver_data *)f->private_data;
> - char buf[MTIP_DFS_MAX_BUF_SIZE];
> + char *buf;
>   int size = *offset;
> + int rv = 0;
>  
>   if (!len || size)
>   return 0;
>  
> + buf = kzalloc(MTIP_DFS_MAX_BUF_SIZE, GFP_KERNEL);
> + if (!buf) {
> + dev_err(&dd->pdev->dev,
> + "Memory allocation: flag buffer\n");
> + return -ENOMEM;
> + }
> +
>   size += sprintf(&buf[size], "Flag-port : [ %08lX ]\n",
>   dd->port->flags);
>   size += sprintf(&buf[size], "Flag-dd   : [ %08lX ]\n",
> @@ -2911,9 +2937,10 @@ static ssize_t mtip_hw_read_flags(struct file *f, char __user *ubuf,
>   *offset = size <= len ? size : len;
>   size = copy_to_user(ubuf, buf, *offset);
>   if (size)
> - return -EFAULT;
> + rv = -EFAULT;
>  
> - return *offset;
> + kfree(buf);
> + return rv ? rv : *offset;
>  }
>  
>  static const struct file_operations mtip_device_status_fops = {
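
The pattern the patch applies (move a large buffer off the stack, fail with
-ENOMEM if the allocation fails, and free the buffer on every exit path) can
be sketched in userspace as follows. This is only an illustration under
assumed names: read_status() and MAX_BUF_SIZE are hypothetical, calloc() and
memcpy() stand in for kzalloc() and copy_to_user(), and the format string is
borrowed from mtip_hw_read_flags().

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_BUF_SIZE 1024 /* hypothetical stand-in for MTIP_DFS_MAX_BUF_SIZE */

/*
 * Format one status line into a heap buffer and copy at most len bytes of it
 * to out. Returns the number of bytes copied, or a negative errno value.
 */
static long read_status(char *out, size_t len, unsigned long flags)
{
    char *buf;
    size_t n;
    long rv;
    int size;

    if (!len)
        return 0;

    /* Heap allocation keeps the stack frame small. */
    buf = calloc(1, MAX_BUF_SIZE);
    if (!buf)
        return -ENOMEM;

    size = snprintf(buf, MAX_BUF_SIZE, "Flag-port : [ %08lX ]\n", flags);
    if (size < 0) {
        rv = -EIO;
        goto out;
    }

    n = (size_t)size;
    if (n >= MAX_BUF_SIZE)
        n = MAX_BUF_SIZE - 1; /* snprintf truncated the output */
    if (n > len)
        n = len;              /* the caller's buffer is smaller */

    memcpy(out, buf, n);
    rv = (long)n;
out:
    /* Single exit path: the buffer is released on success and on error. */
    free(buf);
    return rv;
}
```

Recording the error in rv instead of returning early is what lets one free()
cover both outcomes, which is the same reason the patch replaces the early
"return -EFAULT" with "rv = -EFAULT".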




Re: shmem info issue

2013-07-12 Thread Hugh Dickins
On Fri, 12 Jul 2013, naveen yadav wrote:
>  Hi All
> 
> I am working on tmpfs. During code analysis I found that the shmem driver
> registers itself as tmpfs.
> 
> The Shmem field of /proc/meminfo reads the NR_SHMEM enum field and shows the
> memory used in tmpfs.
> 
> 
> [root@localhost linux-3.8.2]# cat /proc/meminfo | grep -r Shmem
> Shmem:   704 kB
> [root@localhost linux-3.8.2]#
> 
> include/linux/mmzone.h
> enum zone_stat_item {
> -cut here---
>  NR_SHMEM,   /* shmem pages (included tmpfs/GEM pages
>  -cut here
> }
> 
> I have the following queries:
> 1-> Does cat /proc/meminfo | grep -r Shmem show only the used tmpfs memory,
> or does it show used tmpfs memory + metadata as well? (I found that Shmem
> is a bit larger than the tmpfs used memory.)

"Shmem" includes only data, not metadata, nor memory currently swapped
out.  But it includes tmpfs memory, SysV shared memory (from ipc/shm.c),
POSIX shared memory (under /dev/shm), and shared anonymous mappings
(from mmap of /dev/zero with MAP_SHARED: see call to shmem_zero_setup()
from drivers/char/mem.c): whatever allocates pages through mm/shmem.c.

> 2-> As per the developer comments, NR_SHMEM includes tmpfs and GEM
> pages. What are GEM pages?

Ah yes, and the Graphics Execution Manager uses shmem for objects shared
with the GPU: see use of shmem_read_mapping_page*() in drivers/gpu/drm/.

> 3-> Is my above code analysis correct?

I think so.

Hugh
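
For anyone who wants the same number programmatically rather than through
grep, here is a small sketch that extracts the Shmem value from text in
/proc/meminfo format. The function name meminfo_shmem_kb() is made up for
this example, and it works on a string so that it can be tried without
reading /proc.

```c
#include <stdio.h>
#include <string.h>

/*
 * Return the value, in kB, of the "Shmem:" line in meminfo-formatted text,
 * or -1 if no such line is found. Matching at the start of a line with the
 * colon included avoids picking up other fields that merely contain "Shmem".
 */
static long meminfo_shmem_kb(const char *text)
{
    const char *line = text;

    while (line && *line) {
        long kb;

        if (strncmp(line, "Shmem:", 6) == 0 &&
            sscanf(line + 6, "%ld kB", &kb) == 1)
            return kb;

        /* Advance to the character after the next newline, if any. */
        line = strchr(line, '\n');
        if (line)
            line++;
    }
    return -1;
}
```

As explained above, the value covers every user of mm/shmem.c, so it can
legitimately exceed what tmpfs mounts alone report.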


[PATCH v3] x86: make sure IDT is page aligned

2013-07-12 Thread Kees Cook
Since the IDT is referenced from a fixmap, make sure it is page aligned.
Merge with 32-bit one, since it was already aligned to deal with F00F bug.
This avoids the risk of it ever being moved in the bss and having the
mapping be offset, resulting in calling incorrect handlers.

Signed-off-by: Kees Cook 
Reported-by: PaX Team 
Cc: sta...@vger.kernel.org
---
v3:
 - merge 32-bit and 64-bit idt_table definition
v2:
 - 32-bit was already aligned
---
 arch/x86/kernel/head_64.S |4 
 arch/x86/kernel/traps.c   |7 ++-
 2 files changed, 2 insertions(+), 9 deletions(-)

diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
index 5e4d8a8..317b8cc 100644
--- a/arch/x86/kernel/head_64.S
+++ b/arch/x86/kernel/head_64.S
@@ -514,10 +514,6 @@ ENTRY(phys_base)

.section .bss, "aw", @nobits
.align L1_CACHE_BYTES
-ENTRY(idt_table)
-   .skip IDT_ENTRIES * 16
-
-   .align L1_CACHE_BYTES
 ENTRY(debug_idt_table)
.skip IDT_ENTRIES * 16
 
diff --git a/arch/x86/kernel/traps.c b/arch/x86/kernel/traps.c
index b0865e8..0952614 100644
--- a/arch/x86/kernel/traps.c
+++ b/arch/x86/kernel/traps.c
@@ -68,13 +68,10 @@
 #include 
 
 asmlinkage int system_call(void);
+#endif
 
-/*
- * The IDT has to be page-aligned to simplify the Pentium
- * F0 0F bug workaround.
- */
+/* The IDT has to be page-aligned to keep it aligned with its fixmap. */
 gate_desc idt_table[NR_VECTORS] __page_aligned_data = { { { { 0, 0 } } }, };
-#endif
 
 DECLARE_BITMAP(used_vectors, NR_VECTORS);
 EXPORT_SYMBOL_GPL(used_vectors);
-- 
1.7.9.5


-- 
Kees Cook
Chrome OS Security


Re: [PATCH] x86: make sure IDT is page aligned

2013-07-12 Thread Kees Cook
That was the busted patch. See the v2 I sent. Only 64-bit needs
alignment. And after looking more at it, the idt in head_64.S could be
entirely dropped in favor of using the one in arch/x86/kernel/traps.c
(after moving it out of the #ifdef).

-Kees

On Fri, Jul 12, 2013 at 3:28 PM, H. Peter Anvin  wrote:
> On 07/12/2013 11:30 AM, Kees Cook wrote:
>>
>> - .word 0 # 32-bit align idt_desc.address
>> + .word PAGE_SIZE # page align idt_desc.address
>>
>
> ... and this is totally confused.  This didn't change alignment one
> iota, it only put the value 4096 into the padding.
>
> -hpa
>



-- 
Kees Cook
Chrome OS Security


Re: [PATCH] ACPI / memhotplug: Fix a stale pointer in error path

2013-07-12 Thread Toshi Kani
On Fri, 2013-07-12 at 23:40 +0200, Rafael J. Wysocki wrote:
> On Friday, July 12, 2013 03:12:24 PM Toshi Kani wrote:
> > On Fri, 2013-07-12 at 23:13 +0200, Rafael J. Wysocki wrote:
> > > On Friday, July 12, 2013 03:01:15 PM Toshi Kani wrote:
> > > > On Fri, 2013-07-12 at 22:42 +0200, Rafael J. Wysocki wrote:
> > > > > On Friday, July 12, 2013 08:51:29 AM Toshi Kani wrote:
> > > > > > On Fri, 2013-07-12 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> > > > > > > (2013/07/11 1:47), Toshi Kani wrote:
> > > > > > > > device->driver_data needs to be cleared when releasing its data,
> > > > > > > > mem_device, in an error path of acpi_memory_device_add().
> > > > > > > > 
> > > > > > > > Signed-off-by: Toshi Kani 
> > > > > > > > ---
> > > > > > > 
> > > > > > > Reviewed-by: Yasuaki Ishimatsu 
> > > > > > 
> > > > > > Thanks Yasuaki!
> > > > > 
> > > > > Queued up as a fix for 3.11.
> > > > 
> > > > Thanks!
> > > > 
> > > > > Do we need that in -stable as well?
> > > > 
> > > > Good point.  Yes, we need that in -stable as well.
> > > 
> > > What's the oldest mainline major release that fix is applicable to?
> > 
> > The fix is applicable all the way back to 2.6.32.
> 
> For -stable I'll need to say some more about what practical consequences of
> the bug are.  Is it difficult to trigger?

The function evaluates _CRS of memory device objects, and fails when it
gets an unexpected resource or cannot allocate memory.  A kernel crash
or data corruption may occur if the kernel later accesses the stale pointer.
That said, I am not sure how critical this issue is for old kernels
since I do not think there are many platforms that support memory
hotplug today.  After reading the recent -stable discussion on LKML, I am
no longer sure whether this fix should be applied to -stable.  I instrumented
the kernel to generate an error for testing this change.
 
Thanks,
-Toshi
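
A reduced userspace sketch of the error path in question, for illustration
only: the struct layouts, device_add(), device_remove() and get_resources()
are hypothetical stand-ins for the ACPI code, and the fail argument plays the
role of the instrumentation mentioned above.

```c
#include <stdlib.h>

struct device {
    void *driver_data;
};

struct mem_device {
    int placeholder;
};

/* Hypothetical resource lookup that can be told to fail, for testing. */
static int get_resources(struct mem_device *m, int fail)
{
    (void)m;
    return fail ? -1 : 0;
}

/* Attach freshly allocated driver data; undo everything on failure. */
static int device_add(struct device *dev, int fail)
{
    struct mem_device *m = calloc(1, sizeof(*m));
    int result;

    if (!m)
        return -1;
    dev->driver_data = m;

    result = get_resources(m, fail);
    if (result) {
        /* Clear the back-pointer so nobody can reach the freed memory. */
        dev->driver_data = NULL;
        free(m);
        return result;
    }
    return 0;
}

/* Detach and release the driver data of a successfully added device. */
static void device_remove(struct device *dev)
{
    free(dev->driver_data);
    dev->driver_data = NULL;
}
```

Without the assignment to NULL, dev->driver_data would still point at the
freed structure after a failed add, and any later dereference would be a
use-after-free.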


> 
> Rafael
> 
> 
> > > > > > > >   drivers/acpi/acpi_memhotplug.c |1 +
> > > > > > > >   1 file changed, 1 insertion(+)
> > > > > > > > 
> > > > > > > > diff --git a/drivers/acpi/acpi_memhotplug.c b/drivers/acpi/acpi_memhotplug.c
> > > > > > > > index c711d11..999adb5 100644
> > > > > > > > --- a/drivers/acpi/acpi_memhotplug.c
> > > > > > > > +++ b/drivers/acpi/acpi_memhotplug.c
> > > > > > > > @@ -323,6 +323,7 @@ static int acpi_memory_device_add(struct acpi_device *device,
> > > > > > > > /* Get the range from the _CRS */
> > > > > > > > result = acpi_memory_get_device_resources(mem_device);
> > > > > > > > if (result) {
> > > > > > > > +   device->driver_data = NULL;
> > > > > > > > kfree(mem_device);
> > > > > > > > return result;
> > > > > > > > }




Re: [PATCH] x86: make sure IDT is page aligned

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 11:30 AM, Kees Cook wrote:
>  
> - .word 0 # 32-bit align idt_desc.address
> + .word PAGE_SIZE # page align idt_desc.address
> 

... and this is totally confused.  This didn't change alignment one
iota, it only put the value 4096 into the padding.

-hpa
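
To make the distinction concrete, here is a small C sketch, assuming a 4096
byte page and the GCC/Clang aligned attribute. All names are hypothetical.
The alignment request has to be attached to the table itself; a value stored
in a descriptor's padding field has no influence on where the table lives.

```c
#include <stdint.h>

#define PAGE_SIZE 4096 /* assumption: 4 KiB pages */
#define NR_ENTRIES 256

/* Hypothetical 16-byte table entry. */
struct gate {
    uint64_t lo;
    uint64_t hi;
};

/* The alignment is a property of the table object. */
static struct gate table[NR_ENTRIES] __attribute__((aligned(PAGE_SIZE)));

/* A descriptor only records a size and an address, plus padding. */
struct table_desc {
    uint16_t pad;
    uint16_t size;
    const void *address;
};

static int is_page_aligned(const void *p)
{
    return ((uintptr_t)p & (PAGE_SIZE - 1)) == 0;
}

/* Build a descriptor with an arbitrary value in the padding field. */
static struct table_desc make_desc(uint16_t pad)
{
    struct table_desc d = { pad, (uint16_t)(sizeof(table) - 1), table };

    return d;
}
```

Calling make_desc(0) and make_desc(PAGE_SIZE) yields descriptors that point at
the same address, which is the objection raised above: the padding value does
not align anything.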



[PATCH]: 3.10 fails to boot from mmc root

2013-07-12 Thread Kelly Anderson

3.10 fails to boot from mmc root without this patch.

An early version of this patch was to be included in 3.10
but apparently didn't make it.

--- ./drivers/mmc/core/core.c.orig2013-06-30 16:13:29.0 -0600
+++ ./drivers/mmc/core/core.c2013-07-12 15:17:15.377466795 -0600
@@ -2421,6 +2421,7 @@ void mmc_start_host(struct mmc_host *hos
 else
 mmc_power_up(host);
 mmc_detect_change(host, 0);
+mmc_flush_scheduled_work();
 }

 void mmc_stop_host(struct mmc_host *host)



Re: [PATCH] x86: make sure IDT is page aligned

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 11:30 AM, Kees Cook wrote:
> Since the IDT is referenced from a fixmap, make sure it is page aligned.
> This avoids the risk of it ever being moved in the bss and having the
> fixmap fail.
> 
> Signed-off-by: Kees Cook 
> Reported-by: PaX Team 
> Cc: sta...@vger.kernel.org
> ---
>  arch/x86/kernel/head_32.S |2 +-
>  arch/x86/kernel/head_64.S |2 +-
>  2 files changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/arch/x86/kernel/head_32.S b/arch/x86/kernel/head_32.S
> index e65ddc6..3526dd1 100644
> --- a/arch/x86/kernel/head_32.S
> +++ b/arch/x86/kernel/head_32.S
> @@ -734,7 +734,7 @@ boot_gdt_descr:
>   .word __BOOT_DS+7
>   .long boot_gdt - __PAGE_OFFSET
>  
> - .word 0 # 32-bit align idt_desc.address
> + .word PAGE_SIZE # page align idt_desc.address
>  idt_descr:
>   .word IDT_ENTRIES*8-1   # idt contains 256 entries
>   .long idt_table
> diff --git a/arch/x86/kernel/head_64.S b/arch/x86/kernel/head_64.S
> index 5e4d8a8..77e6d3e 100644
> --- a/arch/x86/kernel/head_64.S
> +++ b/arch/x86/kernel/head_64.S
> @@ -513,7 +513,7 @@ ENTRY(phys_base)
>  #include "../../x86/xen/xen-head.S"
>   
>   .section .bss, "aw", @nobits
> - .align L1_CACHE_BYTES
> + .align PAGE_SIZE
>  ENTRY(idt_table)
>   .skip IDT_ENTRIES * 16
>  
> 

You are aligning the IDT *descriptor*, not the IDT itself?

-hpa



Re: [RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Russell King - ARM Linux
On Sat, Jul 13, 2013 at 01:55:58AM +0400, Sergei Shtylyov wrote:
> Hello.
>
> On 07/13/2013 01:48 AM, Santosh Shilimkar wrote:
>
>> DMA bounce limit is the maximum direct DMA'able memory beyond which
>> bounce buffers has to be used to perform dma operations. SCSI driver
>> relies on dma_mask but its calculation is based on max_*pfn which
>> don't have uniform meaning across architectures. So make use of
>> dma_max_pfn() which is expected to return the DMAable maximum pfn
>> value across architectures.
>
>> Cc: Russell King 
>> Cc: linux-s...@vger.kernel.org
>
>> Signed-off-by: Santosh Shilimkar 
>> ---
>>   drivers/scsi/scsi_lib.c |2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>
>> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
>> index 86d5220..e8275fa 100644
>> --- a/drivers/scsi/scsi_lib.c
>> +++ b/drivers/scsi/scsi_lib.c
>> @@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
>>
>>  host_dev = scsi_get_device(shost);
>>  if (host_dev && host_dev->dma_mask)
>> -bounce_limit = *host_dev->dma_mask;
>> +bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;
>
>You definitely forgot -1 here.

Please explain your point.


Re: [Ksummit-2013-discuss] When to push bug fixes to mainline

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 01:33 PM, Greg Kroah-Hartman wrote:
> 
> Is it _really_ all that hard to remember what to mark for stable
> inclusion?  If you figure it out after you have committed the patch,
> then just put a copy of it somewhere to remind yourself.  That seems to
> be what both David and I do with no problems, and I think we both deal
> with more individual patches and developers than probably most everyone
> else combined.
> 

For the record, my main reason for wanting something like git notes is
that it is now X years after a patch, the maintainer is gone, and
tracking down someone who knows about the patch is really valuable.
Someone acking a patch after the fact is someone who looked at it "back
then", and can be tracked.

Yes, you can find this in mailing list archives and so on, but we have
had problems with such extrinsic information not being as sticky as we'd
like.

-hpa




Re: [Ksummit-2013-discuss] When to push bug fixes to mainline

2013-07-12 Thread H. Peter Anvin
On 07/12/2013 12:53 PM, Steven Rostedt wrote:
> On Fri, 2013-07-12 at 12:44 -0700, Linus Torvalds wrote:
> 
>> They can be useful for "local" notes (they can be very powerful for
>> certain workflows), but they won't be pulled and pushed by me.
> 
> Perhaps notes can be used as that reminder to send to stable. Tag a
> commit with a note, and have some automated process that monitors
> Linus's tree and when a commit makes it in, automate an email to stable
> with said commit.
> 

Didn't Linus just say he won't do that?

Either way it would seem to fail to accomplish the record-keeping aspect.

-hpa




Re: [PATCH] gps6507x.txt: Remove executable bits

2013-07-12 Thread Rob Herring
On 07/12/2013 12:31 PM, Joe Perches wrote:
> Documentation shouldn't be executable.
> 
> Signed-off-by: Joe Perches 

Acked-by: Rob Herring 

> ---
> diff --git a/Documentation/devicetree/bindings/mfd/tps6507x.txt b/Documentation/devicetree/bindings/mfd/tps6507x.txt
> old mode 100755
> new mode 100644
> 
> 



Re: [ 00/15] 3.9.10-stable review

2013-07-12 Thread Guenter Roeck
On Thu, Jul 11, 2013 at 03:19:30PM -0700, Greg Kroah-Hartman wrote:
> This is the start of the stable review cycle for the 3.9.10 release.
> There are 15 patches in this series, all will be posted as a response
> to this one.  If anyone has any issues with these being applied, please
> let me know.
> 
> Responses should be made by Sat Jul 13 22:11:24 UTC 2013.
> Anything received after that time might be too late.
> 
> The whole patch series can be found in one patch at:
>   kernel.org/pub/linux/kernel/v3.0/stable-review/patch-3.9.10-rc1.gz
> and the diffstat can be found below.
> 
Cross build results are as follows. No change from the previous release.

Guenter

---
Build reference: v3.9.9-15-gc79284a

Build x86_64:defconfig passed
Build x86_64:allyesconfig passed
Build x86_64:allmodconfig passed
Build x86_64:allnoconfig passed
Build x86_64:alldefconfig passed
Build i386:defconfig passed
Build i386:allyesconfig passed
Build i386:allmodconfig passed
Build i386:allnoconfig passed
Build i386:alldefconfig passed
Build mips:defconfig passed
Build mips:bcm47xx_defconfig passed
Build mips:bcm63xx_defconfig passed
Build mips:nlm_xlp_defconfig passed
Build mips:ath79_defconfig passed
Build mips:ar7_defconfig passed
Build mips:fuloong2e_defconfig passed
Build mips:e55_defconfig passed
Build mips:cavium_octeon_defconfig passed
Build mips:powertv_defconfig passed
Build mips:malta_defconfig passed
Build powerpc:defconfig passed
Build powerpc:allyesconfig failed
Build powerpc:allmodconfig passed
Build powerpc:chroma_defconfig passed
Build powerpc:maple_defconfig passed
Build powerpc:ppc6xx_defconfig passed
Build powerpc:mpc83xx_defconfig passed
Build powerpc:mpc85xx_defconfig passed
Build powerpc:mpc85xx_smp_defconfig passed
Build powerpc:tqm8xx_defconfig passed
Build powerpc:85xx/sbc8548_defconfig passed
Build powerpc:83xx/mpc834x_mds_defconfig passed
Build powerpc:86xx/sbc8641d_defconfig passed
Build arm:defconfig passed
Build arm:allyesconfig failed
Build arm:allmodconfig failed
Build arm:exynos4_defconfig passed
Build arm:multi_v7_defconfig passed
Build arm:kirkwood_defconfig passed
Build arm:omap2plus_defconfig passed
Build arm:tegra_defconfig passed
Build arm:u8500_defconfig passed
Build arm:at91sam9rl_defconfig passed
Build arm:ap4evb_defconfig passed
Build arm:bcm_defconfig passed
Build arm:bonito_defconfig passed
Build arm:pxa910_defconfig passed
Build arm:mvebu_defconfig passed
Build m68k:defconfig passed
Build m68k:m5272c3_defconfig failed
Build m68k:m5307c3_defconfig passed
Build m68k:m5249evb_defconfig passed
Build m68k:m5407c3_defconfig passed
Build m68k:sun3_defconfig passed
Build m68k:m5475evb_defconfig passed
Build sparc:defconfig passed
Build sparc:sparc64_defconfig passed
Build xtensa:defconfig passed
Build xtensa:iss_defconfig passed
Build microblaze:mmu_defconfig passed
Build microblaze:nommu_defconfig passed
Build blackfin:defconfig passed
Build parisc:defconfig passed

---
Total builds: 64 Total build errors: 4


Re: [RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Sergei Shtylyov

Hello.

On 07/13/2013 01:48 AM, Santosh Shilimkar wrote:

> DMA bounce limit is the maximum direct DMA'able memory beyond which
> bounce buffers has to be used to perform dma operations. SCSI driver
> relies on dma_mask but its calculation is based on max_*pfn which
> don't have uniform meaning across architectures. So make use of
> dma_max_pfn() which is expected to return the DMAable maximum pfn
> value across architectures.

> Cc: Russell King 
> Cc: linux-s...@vger.kernel.org

> Signed-off-by: Santosh Shilimkar 
> ---
>   drivers/scsi/scsi_lib.c |2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)

> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 86d5220..e8275fa 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
>
> host_dev = scsi_get_device(shost);
> if (host_dev && host_dev->dma_mask)
> -   bounce_limit = *host_dev->dma_mask;
> +   bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;

   You definitely forgot -1 here.

WBR, Sergei



[RFC/RFT PATCH 1/5] block: Rename parameter dma_mask to max_addr for blk_queue_bounce_limit()

2013-07-12 Thread Santosh Shilimkar
The blk_queue_bounce_limit() API parameter 'dma_mask' is actually the
maximum address the device can handle rather than a dma_mask. Rename
it accordingly to avoid it being interpreted as dma_mask.

No functional change.

The idea is to fix the bad assumptions about dma_mask wherever it could
be misinterpreted.

Cc: Russell King 
Cc: Jens Axboe 

Signed-off-by: Santosh Shilimkar 
---
 block/blk-settings.c |8 
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-settings.c b/block/blk-settings.c
index c50ecf0..026c151 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -195,17 +195,17 @@ EXPORT_SYMBOL(blk_queue_make_request);
 /**
  * blk_queue_bounce_limit - set bounce buffer limit for queue
  * @q: the request queue for the device
- * @dma_mask: the maximum address the device can handle
+ * @max_addr: the maximum address the device can handle
  *
  * Description:
  *Different hardware can have different requirements as to what pages
  *it can do I/O directly to. A low level driver can call
  *blk_queue_bounce_limit to have lower memory pages allocated as bounce
- *buffers for doing I/O to pages residing above @dma_mask.
+ *buffers for doing I/O to pages residing above @max_addr.
  **/
-void blk_queue_bounce_limit(struct request_queue *q, u64 dma_mask)
+void blk_queue_bounce_limit(struct request_queue *q, u64 max_addr)
 {
-   unsigned long b_pfn = dma_mask >> PAGE_SHIFT;
+   unsigned long b_pfn = max_addr >> PAGE_SHIFT;
int dma = 0;
 
q->bounce_gfp = GFP_NOIO;
-- 
1.7.9.5



[RFC/RFT PATCH 4/5] ARM: mm: change max*pfn to include the physical offset of memory

2013-07-12 Thread Santosh Shilimkar
Most of the kernel code assumes that max*pfn is the maximum PFN because
the physical start of memory is expected to be PFN0. Since this
assumption is not true on ARM architectures, the meaning of max*pfn
there is the number of memory pages. This is done to keep drivers happy
that use these variables to calculate the DMA bounce limit
using dma_mask.

Now that we have an architecture override for the maximum DMAable
PFN, let's make max*pfn mean the maximum PFN on ARM
as well.

In the patch, the dma_to_pfn/pfn_to_dma() pair is hacked to take care of
the physical memory offset. It is done this way just to enable testing,
since it's understood that it can get in the way of single zImage.

Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Nicolas Pitre 

Signed-off-by: Santosh Shilimkar 
---
 arch/arm/include/asm/dma-mapping.h |   16 
 arch/arm/mm/init.c |   10 --
 2 files changed, 16 insertions(+), 10 deletions(-)

diff --git a/arch/arm/include/asm/dma-mapping.h 
b/arch/arm/include/asm/dma-mapping.h
index 5b579b9..b2d5937 100644
--- a/arch/arm/include/asm/dma-mapping.h
+++ b/arch/arm/include/asm/dma-mapping.h
@@ -47,12 +47,12 @@ static inline int dma_set_mask(struct device *dev, u64 mask)
 #ifndef __arch_pfn_to_dma
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   return (dma_addr_t)__pfn_to_bus(pfn);
+   return (dma_addr_t)__pfn_to_bus(pfn + PHYS_PFN_OFFSET);
 }
 
 static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
-   return __bus_to_pfn(addr);
+   return __bus_to_pfn(addr) - PHYS_PFN_OFFSET;
 }
 
 static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
@@ -64,15 +64,16 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
void *addr)
 {
return (dma_addr_t)__virt_to_bus((unsigned long)(addr));
 }
+
 #else
 static inline dma_addr_t pfn_to_dma(struct device *dev, unsigned long pfn)
 {
-   return __arch_pfn_to_dma(dev, pfn);
+   return __arch_pfn_to_dma(dev, pfn + PHYS_PFN_OFFSET);
 }
 
 static inline unsigned long dma_to_pfn(struct device *dev, dma_addr_t addr)
 {
-   return __arch_dma_to_pfn(dev, addr);
+   return __arch_dma_to_pfn(dev, addr) - PHYS_PFN_OFFSET;
 }
 
 static inline void *dma_to_virt(struct device *dev, dma_addr_t addr)
@@ -86,6 +87,13 @@ static inline dma_addr_t virt_to_dma(struct device *dev, 
void *addr)
 }
 #endif
 
+/* The ARM override for dma_max_pfn() */
+static inline unsigned long dma_max_pfn(struct device *dev)
+{
+   return dma_to_pfn(dev, *dev->dma_mask);
+}
+#define dma_max_pfn(dev) dma_max_pfn(dev)
+
 /*
  * DMA errors are defined by all-bits-set in the DMA address.
  */
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 6833cbe..588a2c1 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -420,12 +420,10 @@ void __init bootmem_init(void)
 * This doesn't seem to be used by the Linux memory manager any
 * more, but is used by ll_rw_block.  If we can get rid of it, we
 * also get rid of some of the stuff above as well.
-*
-* Note: max_low_pfn and max_pfn reflect the number of _pages_ in
-* the system, not the maximum PFN.
 */
-   max_low_pfn = max_low - PHYS_PFN_OFFSET;
-   max_pfn = max_high - PHYS_PFN_OFFSET;
+   min_low_pfn = min;
+   max_low_pfn = max_low;
+   max_pfn = max_high;
 }
 
 /*
@@ -531,7 +529,7 @@ static inline void free_area_high(unsigned long pfn, 
unsigned long end)
 static void __init free_highpages(void)
 {
 #ifdef CONFIG_HIGHMEM
-   unsigned long max_low = max_low_pfn + PHYS_PFN_OFFSET;
+   unsigned long max_low = max_low_pfn;
struct memblock_region *mem, *res;
 
/* set highmem page free */
-- 
1.7.9.5
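
To make the two meanings of max*pfn concrete, here is a minimal userspace
sketch. The demo_ram structure, the demo_* helpers, the 4 KiB page size and
the example addresses are all assumptions for illustration and do not
appear in the patch.

```c
#include <assert.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical description of one contiguous bank of RAM. */
struct demo_ram {
	unsigned long long phys_start; /* physical address of the first byte */
	unsigned long long size;       /* size in bytes */
};

/* PFN of the first page of RAM (the role PHYS_PFN_OFFSET plays on ARM). */
static unsigned long demo_phys_pfn_offset(const struct demo_ram *ram)
{
	return (unsigned long)(ram->phys_start >> DEMO_PAGE_SHIFT);
}

/* Old ARM meaning of max_pfn: the number of pages in the system. */
static unsigned long demo_max_pfn_as_count(const struct demo_ram *ram)
{
	return (unsigned long)(ram->size >> DEMO_PAGE_SHIFT);
}

/* Meaning after the patch: the end PFN, physical offset included. */
static unsigned long demo_max_pfn_absolute(const struct demo_ram *ram)
{
	return demo_phys_pfn_offset(ram) + demo_max_pfn_as_count(ram);
}

/* 1 GiB of RAM starting at physical address 0x80000000. */
static void demo_max_pfn_example(void)
{
	struct demo_ram ram = { 0x80000000ULL, 0x40000000ULL };

	assert(demo_max_pfn_as_count(&ram) == 0x40000UL);
	assert(demo_max_pfn_absolute(&ram) == 0xC0000UL);
}
```

On a machine whose RAM starts at PFN0 the two values coincide, which is why
code written with that assumption never notices the difference.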



[RFC/RFT PATCH 5/5] ARM: mm: Remove bootmem code and switch to NO_BOOTMEM

2013-07-12 Thread Santosh Shilimkar
As part of the effort to use memblock instead of the bootmem allocator,
the ARM arch needs to be converted to use NO_BOOTMEM. With the NO_BOOTMEM
change, we now use the memblock allocator to reserve space for the crash
kernel, to have one less dependency on the nobootmem allocator wrapper.

Hopefully the NO_BOOTMEM memblock wrapper (nobootmem.c) will vanish in
the near future and archs can use the memblock APIs directly. The ongoing
thread on this topic is here:
https://lkml.org/lkml/2013/6/29/77

Boot tested with both flat memory and sparse (faked) memory models
with highmem enabled. LPAE systems with memory starting > 4GB still
won't work, but this is one of the steps toward solving that problem for ARM.

Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Nicolas Pitre 
Cc: Tejun Heo 

Signed-off-by: Santosh Shilimkar 
---
 arch/arm/Kconfig|1 +
 arch/arm/kernel/setup.c |2 +-
 arch/arm/mm/init.c  |   58 ++-
 3 files changed, 4 insertions(+), 57 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 0ac9be6..cff9a59 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -63,6 +63,7 @@ config ARM
select OLD_SIGSUSPEND3
select OLD_SIGACTION
select HAVE_CONTEXT_TRACKING
+   select NO_BOOTMEM
help
  The ARM series is a line of low-power-consumption RISC chip designs
  licensed by ARM Ltd and targeted at embedded applications and
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index 63af9a7..2ca4b90 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -805,7 +805,7 @@ static void __init reserve_crashkernel(void)
if (ret)
return;
 
-   ret = reserve_bootmem(crash_base, crash_size, BOOTMEM_EXCLUSIVE);
+   ret = memblock_reserve(crash_base, crash_size);
if (ret < 0) {
printk(KERN_WARNING "crashkernel reservation failed - "
   "memory is in use (0x%lx)\n", (unsigned long)crash_base);
diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 588a2c1..84dd56c 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -153,58 +153,6 @@ static void __init find_limits(unsigned long *min, 
unsigned long *max_low,
*max_high = bank_pfn_end(&mi->bank[mi->nr_banks - 1]);
 }
 
-static void __init arm_bootmem_init(unsigned long start_pfn,
-   unsigned long end_pfn)
-{
-   struct memblock_region *reg;
-   unsigned int boot_pages;
-   phys_addr_t bitmap;
-   pg_data_t *pgdat;
-
-   /*
-* Allocate the bootmem bitmap page.  This must be in a region
-* of memory which has already been mapped.
-*/
-   boot_pages = bootmem_bootmap_pages(end_pfn - start_pfn);
-   bitmap = memblock_alloc_base(boot_pages << PAGE_SHIFT, L1_CACHE_BYTES,
-   __pfn_to_phys(end_pfn));
-
-   /*
-* Initialise the bootmem allocator, handing the
-* memory banks over to bootmem.
-*/
-   node_set_online(0);
-   pgdat = NODE_DATA(0);
-   init_bootmem_node(pgdat, __phys_to_pfn(bitmap), start_pfn, end_pfn);
-
-   /* Free the lowmem regions from memblock into bootmem. */
-   for_each_memblock(memory, reg) {
-   unsigned long start = memblock_region_memory_base_pfn(reg);
-   unsigned long end = memblock_region_memory_end_pfn(reg);
-
-   if (end >= end_pfn)
-   end = end_pfn;
-   if (start >= end)
-   break;
-
-   free_bootmem(__pfn_to_phys(start), (end - start) << PAGE_SHIFT);
-   }
-
-   /* Reserve the lowmem memblock reserved regions in bootmem. */
-   for_each_memblock(reserved, reg) {
-   unsigned long start = memblock_region_reserved_base_pfn(reg);
-   unsigned long end = memblock_region_reserved_end_pfn(reg);
-
-   if (end >= end_pfn)
-   end = end_pfn;
-   if (start >= end)
-   break;
-
-   reserve_bootmem(__pfn_to_phys(start),
-   (end - start) << PAGE_SHIFT, BOOTMEM_DEFAULT);
-   }
-}
-
 #ifdef CONFIG_ZONE_DMA
 
 unsigned long arm_dma_zone_size __read_mostly;
@@ -242,7 +190,7 @@ void __init setup_dma_zone(struct machine_desc *mdesc)
 #endif
 }
 
-static void __init arm_bootmem_free(unsigned long min, unsigned long max_low,
+static void __init zone_sizes_init(unsigned long min, unsigned long max_low,
unsigned long max_high)
 {
unsigned long zone_size[MAX_NR_ZONES], zhole_size[MAX_NR_ZONES];
@@ -396,8 +344,6 @@ void __init bootmem_init(void)
 
find_limits(&min, &max_low, &max_high);
 
-   arm_bootmem_init(min, max_low);
-
/*
 * Sparsemem tries to allocate bootmem in memory_present(),
 * so must be done after the fixed reservations
@@ -414,7 +360,7 @@ void __init bootmem_init(void)
 * the sparse mem_map arrays initialized by sparse_init()

[RFC/RFT PATCH 2/5] mm: dma-mapping: Add dma_max_pfn(dev) helper function

2013-07-12 Thread Santosh Shilimkar
Most of the kernel assumes that PFN0 is the start of the physical
memory (RAM). This assumption is not true on most ARM SoCs,
and if one tries to update the ARM port to follow the assumption,
we end up breaking the DMA bounce limit for a few block layer drivers.
One such example is unifying the meaning of max*_pfn on ARM
as the bootmem layer expects, which breaks the DMA bounce limit of a
few block layer drivers.

To fix this problem, we introduce a generic dma_max_pfn(dev) helper that
the architecture code can override. The helper converts
a DMA bitmask to a block PFN number. In all the generic cases,
it is just "*dev->dma_mask >> PAGE_SHIFT", and hence the default behavior
is maintained as is.

Subsequent patches will make use of the helper. No functional change.

Cc: Russell King 
Cc: Jens Axboe 

Signed-off-by: Santosh Shilimkar 
---
 include/linux/dma-mapping.h |7 +++
 1 file changed, 7 insertions(+)

diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
index 94af418..68a7863 100644
--- a/include/linux/dma-mapping.h
+++ b/include/linux/dma-mapping.h
@@ -129,6 +129,13 @@ static inline int dma_set_seg_boundary(struct device *dev, 
unsigned long mask)
return -EIO;
 }
 
+#ifndef dma_max_pfn
+static inline unsigned long dma_max_pfn(struct device *dev)
+{
+   return *dev->dma_mask >> PAGE_SHIFT;
+}
+#endif
+
 static inline void *dma_zalloc_coherent(struct device *dev, size_t size,
dma_addr_t *dma_handle, gfp_t flag)
 {
-- 
1.7.9.5
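
The override mechanism in this patch is the usual "define a macro with the
function's own name" idiom. The following userspace sketch shows the idea
under stated assumptions: demo_device, demo_dma_max_pfn, DEMO_ARCH_OVERRIDE
and the offset value are hypothetical names and numbers, not kernel
identifiers.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_PAGE_SHIFT 12 /* assumption: 4 KiB pages */

/* Hypothetical stand-in for struct device; only the mask matters here. */
struct demo_device {
	uint64_t *dma_mask;
};

/*
 * An "architecture header" would be included first. Building with
 * -DDEMO_ARCH_OVERRIDE simulates an architecture that supplies its own
 * version and announces it by defining a macro of the same name.
 */
#ifdef DEMO_ARCH_OVERRIDE
#define DEMO_PHYS_PFN_OFFSET 0x80000UL /* illustrative value only */
static inline unsigned long demo_dma_max_pfn(struct demo_device *dev)
{
	return (unsigned long)(*dev->dma_mask >> DEMO_PAGE_SHIFT) -
	       DEMO_PHYS_PFN_OFFSET;
}
#define demo_dma_max_pfn(dev) demo_dma_max_pfn(dev)
#endif

/* The "generic header" only provides a fallback if nobody overrode it. */
#ifndef demo_dma_max_pfn
static inline unsigned long demo_dma_max_pfn(struct demo_device *dev)
{
	return (unsigned long)(*dev->dma_mask >> DEMO_PAGE_SHIFT);
}
#endif

/* Without the override, a 32-bit mask maps to PFN 0xFFFFF. */
static unsigned long demo_generic_example(void)
{
	uint64_t mask = 0xFFFFFFFFULL;
	struct demo_device dev = { &mask };

	return demo_dma_max_pfn(&dev);
}
```

Because the macro expands to a call of the same name, callers are written
identically in both configurations, and the preprocessor test in the
generic header is all that is needed to pick the right definition.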



[RFC/RFT PATCH 0/5] mm: ARM nobootmem and few dma_mask fixes

2013-07-12 Thread Santosh Shilimkar
The series is an attempt to move the ARM port to NO_BOOTMEM. As discussed
on the list, the NO_BOOTMEM move needs max*pfn to mean the maximum
PFN, but that breaks the dma_mask for a few block layer drivers, since on
ARM the start of physical memory is not PFN0, unlike most architectures.
More reading on it is here:
http://lwn.net/Articles/543408/
http://lwn.net/Articles/543424/

To address this issue, we introduce a generic dma_max_pfn() helper which
can be overridden by the architectures.

Another intention behind the move to nobootmem is to convert ARM to
memblock and get rid of the bootmem allocator dependency, which
doesn't work for LPAE machines whose physical memory starts beyond the
4 GB boundary. That needs changes to the core kernel and also a new
memblock API. More on this can be found here:
https://lkml.org/lkml/2013/6/29/77

I have been trying to cook up these patches with kind help from Russell,
and we know the series doesn't solve all the bad dma_mask assumptions. But
at least I am hoping that it can get the ball rolling.

Comments/testing help is welcome!

Cc: Russell King 
Cc: Catalin Marinas 
Cc: Will Deacon 
Cc: Nicolas Pitre 
Cc: Tejun Heo 
Cc: Jens Axboe 

Santosh Shilimkar (5):
  block: Rename parameter dma_mask to max_addr for
blk_queue_bounce_limit()
  mm: dma-mapping: Add dma_max_pfn(dev) helper function
  scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations
  ARM: mm: change max*pfn to include the physical offset of memory
  ARM: mm: Remove bootmem code and switch to NO_BOOTMEM

 arch/arm/Kconfig   |1 +
 arch/arm/include/asm/dma-mapping.h |   16 ++---
 arch/arm/kernel/setup.c|2 +-
 arch/arm/mm/init.c |   68 
 block/blk-settings.c   |8 ++---
 drivers/scsi/scsi_lib.c|2 +-
 include/linux/dma-mapping.h|7 
 7 files changed, 32 insertions(+), 72 deletions(-)

-- 
1.7.9.5



[RFC/RFT PATCH 3/5] scsi: Use dma_max_pfn(dev) helper for bounce_limit calculations

2013-07-12 Thread Santosh Shilimkar
DMA bounce limit is the maximum direct DMA'able memory beyond which
bounce buffers have to be used to perform DMA operations. The SCSI driver
relies on dma_mask, but its calculation is based on max_*pfn, which
doesn't have a uniform meaning across architectures. So make use of
dma_max_pfn(), which is expected to return the maximum DMAable pfn
value across architectures.

Cc: Russell King 
Cc: linux-s...@vger.kernel.org

Signed-off-by: Santosh Shilimkar 
---
 drivers/scsi/scsi_lib.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 86d5220..e8275fa 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1668,7 +1668,7 @@ u64 scsi_calculate_bounce_limit(struct Scsi_Host *shost)
 
host_dev = scsi_get_device(shost);
if (host_dev && host_dev->dma_mask)
-   bounce_limit = *host_dev->dma_mask;
+   bounce_limit = dma_max_pfn(host_dev) << PAGE_SHIFT;
 
return bounce_limit;
 }
-- 
1.7.9.5



RE: [PATCH 0/4] iio: hid-sensor: add module alias for autoload

2013-07-12 Thread Pandruvada, Srinivas
Tested and ready to go.

Thanks,
Srinivas


-Original Message-
From: Jonathan Cameron [mailto:ji...@kernel.org] 
Sent: Friday, July 12, 2013 11:45 AM
To: Alexander Holler
Cc: Srinivas Pandruvada; linux-kernel@vger.kernel.org; 
linux-...@vger.kernel.org; Jonathan Cameron; Pandruvada, Srinivas
Subject: Re: [PATCH 0/4] iio: hid-sensor: add module alias for autoload

On 07/12/2013 08:21 AM, Alexander Holler wrote:
> Am 11.07.2013 19:27, schrieb Srinivas Pandruvada:
>>
>>
>> On 07/10/2013 08:58 AM, Alexander Holler wrote:
>>> Am 10.07.2013 17:27, schrieb Srinivas Pandruvada:
>>>> Hi,
>>>>
>>>> There was no intention to prevent auto loading. Did you get a chance 
>>>> to test these changes?
>>>
>>> Sure, I always test patches before I send them out.
>>>
>>> Ok, I haven't tested the changes with the iio HID drivers (I don't 
>>> have any commercial HID sensor hub, so I've just compile tested 
>>> these patches here, double reading them), but I've tested the 
>>> similar changes with a patch for rtc-hid-sensor-time I sent out 
>>> yesterday.
>>> (sorry, no link, lkml.org seems dead, just search for
>>> "rtc-hid-sensor-time: add module alias")
>>>
>>> It works just fine. An example output is now
>>>
>>> Jul  9 19:27:21 dockstar3 kernel: [5.12] rtc_hid_sensor_time
>>> HID-SENSOR-2000a0.0: milliseconds supported
>>> Jul  9 19:27:21 dockstar3 kernel: [5.132864] rtc_hid_sensor_time
>>> HID-SENSOR-2000a0.0: rtc core: setting system clock to 2013-07-09
>>> 17:26:51:328000 UTC (1373390811)
>>> Jul  9 19:27:21 dockstar3 kernel: [5.146105] rtc_hid_sensor_time
>>> HID-SENSOR-2000a0.0: rtc core: registered hid-sensor-time as rtc0
>>>
>>> Before the output was e.g.
>>>
>>> HID-SENSOR-2000a0 HID-SENSOR-2000a0.0: rtc core: registered 
>>> hid-sensor-time as rtc0
>>>
>>> instead of the above with the descriptive rtc_hid_sensor_time.
>> 
>>> Automatic loading of modules works too and it works on ARM, Intel 
>>> and AMD as module or static linked. ;)
> 
> Have you tested the patches with a real device? I assume you have 
> one. ;)
> 
> Regards,
> 
> Alexander Holler

Just so you two know. Given this discussion, I'll be lazy about these and wait 
for an Ack from Srinivas before taking these.  Look fine to me, but nice to 
have confirmation as you say with the actual hardware!

Jonathan


Re: [PATCH] usb: gadget: fotg210-udc: Remove bogus __init/__exit annotations

2013-07-12 Thread Sergei Shtylyov

Hello.

On 07/11/2013 11:25 AM, Geert Uytterhoeven wrote:


When builtin (CONFIG_USB_FOTG210_UDC=y):



LD  drivers/usb/gadget/built-in.o
WARNING: drivers/usb/gadget/built-in.o(.data+0xbf8): Section mismatch in
reference from the variable fotg210_driver to the function
.init.text:fotg210_udc_probe()
The variable fotg210_driver references
the function __init fotg210_udc_probe()
If the reference is valid then annotate the
variable with __init* or __refdata (see linux/init.h) or name the
variable:
*_template, *_timer, *_sht, *_ops, *_probe, *_probe_one, *_console



diff --git a/drivers/usb/gadget/fotg210-udc.c
b/drivers/usb/gadget/fotg210-udc.c
index cce5535..10cd18d 100644
--- a/drivers/usb/gadget/fotg210-udc.c
+++ b/drivers/usb/gadget/fotg210-udc.c
@@ -1074,7 +1074,7 @@ static struct usb_gadget_ops fotg210_gadget_ops = {
 .udc_stop   = fotg210_udc_stop,
   };

-static int __exit fotg210_udc_remove(struct platform_device *pdev)
+static int fotg210_udc_remove(struct platform_device *pdev)



I think you can leave __exit annotation here, if you enclose the
reference in the driver structure in __exit_p()...



The driver is using module_platform_driver(), not
module_platform_driver_probe(),
so it expects the platform device to show up or disappear anytime.


   Well, I don't think it actually does. Perhaps the reason was that 
the latter function wasn't available yet at the time of conversion to 
the former (IIRC it appeared later).


WBR, Sergei



Re: [PATCH 4/4] iio: hid-sensor-magn-3d: add module alias for autoload

2013-07-12 Thread Srinivas Pandruvada

Tested. You can add my ack.

Acked-by: Srinivas Pandruvada


On 07/10/2013 01:32 AM, Alexander Holler wrote:

Add a MODULE_DEVICE_TABLE in order to let hotplug mechanisms automatically
load the driver.

This makes it also possible to use the usual driver name instead of
HID-SENSOR-2000xx which isn't very descriptive in kernel messages.

Signed-off-by: Alexander Holler 
---
  drivers/iio/magnetometer/hid-sensor-magn-3d.c | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iio/magnetometer/hid-sensor-magn-3d.c 
b/drivers/iio/magnetometer/hid-sensor-magn-3d.c
index 99f4e49..e71127a 100644
--- a/drivers/iio/magnetometer/hid-sensor-magn-3d.c
+++ b/drivers/iio/magnetometer/hid-sensor-magn-3d.c
@@ -30,10 +30,6 @@
  #include 
  #include "../common/hid-sensors/hid-sensor-trigger.h"
  
-/*Format: HID-SENSOR-usage_id_in_hex*/

-/*Usage ID from spec for Magnetometer-3D: 0x200083*/
-#define DRIVER_NAME "HID-SENSOR-200083"
-
  enum magn_3d_channel {
CHANNEL_SCAN_INDEX_X,
CHANNEL_SCAN_INDEX_Y,
@@ -390,9 +386,19 @@ static int hid_magn_3d_remove(struct platform_device *pdev)
return 0;
  }
  
+static struct platform_device_id hid_magn_3d_ids[] = {

+   {
+   /* Format: HID-SENSOR-usage_id_in_hex_lowercase */
+   .name = "HID-SENSOR-200083",
+   },
+   { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(platform, hid_magn_3d_ids);
+
  static struct platform_driver hid_magn_3d_platform_driver = {
+   .id_table = hid_magn_3d_ids,
.driver = {
-   .name   = DRIVER_NAME,
+   .name   = KBUILD_MODNAME,
.owner  = THIS_MODULE,
},
.probe  = hid_magn_3d_probe,
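
For readers unfamiliar with id tables: the table added above is a
sentinel-terminated array, and matching walks it until the empty entry. The
sketch below imitates that walk in userspace. The demo_* types and function
are hypothetical; the real lookup is done by the platform bus code, not by
anything shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for struct platform_device_id (name only). */
struct demo_platform_device_id {
	char name[20];
};

/* Same shape as the table in the patch: one entry plus an empty sentinel. */
static const struct demo_platform_device_id demo_magn_3d_ids[] = {
	{ .name = "HID-SENSOR-200083" },
	{ /* sentinel */ }
};

/* Walk the table until the sentinel; return the matching entry or NULL. */
static const struct demo_platform_device_id *
demo_match_id(const struct demo_platform_device_id *ids, const char *dev_name)
{
	for (; ids->name[0] != '\0'; ids++) {
		if (strcmp(ids->name, dev_name) == 0)
			return ids;
	}
	return NULL;
}

/* The magnetometer device name matches; an unrelated name does not. */
static void demo_match_example(void)
{
	assert(demo_match_id(demo_magn_3d_ids, "HID-SENSOR-200083") != NULL);
	assert(demo_match_id(demo_magn_3d_ids, "HID-SENSOR-200041") == NULL);
}
```

With the device name carried by the id table, the driver's own .name no
longer has to equal the device name, which is what lets the patch switch it
to KBUILD_MODNAME.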




Re: [PATCH 3/4] iio: hid-sensor-als: add module alias for autoload

2013-07-12 Thread Srinivas Pandruvada

Tested. You can add my ack.

Acked-by: Srinivas Pandruvada


On 07/10/2013 01:31 AM, Alexander Holler wrote:

Add a MODULE_DEVICE_TABLE in order to let hotplug mechanisms automatically
load the driver.

This makes it also possible to use the usual driver name instead of
HID-SENSOR-2000xx which isn't very descriptive in kernel messages.

Signed-off-by: Alexander Holler 
---
  drivers/iio/light/hid-sensor-als.c | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iio/light/hid-sensor-als.c 
b/drivers/iio/light/hid-sensor-als.c
index cdc2cad..9adfef0 100644
--- a/drivers/iio/light/hid-sensor-als.c
+++ b/drivers/iio/light/hid-sensor-als.c
@@ -30,10 +30,6 @@
  #include 
  #include "../common/hid-sensors/hid-sensor-trigger.h"
  
-/*Format: HID-SENSOR-usage_id_in_hex*/

-/*Usage ID from spec for Ambiant-Light: 0x200041*/
-#define DRIVER_NAME "HID-SENSOR-200041"
-
  #define CHANNEL_SCAN_INDEX_ILLUM 0
  
  struct als_state {

@@ -355,9 +351,19 @@ static int hid_als_remove(struct platform_device *pdev)
return 0;
  }
  
+static struct platform_device_id hid_als_ids[] = {

+   {
+   /* Format: HID-SENSOR-usage_id_in_hex_lowercase */
+   .name = "HID-SENSOR-200041",
+   },
+   { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(platform, hid_als_ids);
+
  static struct platform_driver hid_als_platform_driver = {
+   .id_table = hid_als_ids,
.driver = {
-   .name   = DRIVER_NAME,
+   .name   = KBUILD_MODNAME,
.owner  = THIS_MODULE,
},
.probe  = hid_als_probe,




Re: [PATCH 2/4] iio: hid-sensor-gyro-3d: add module alias for autoload

2013-07-12 Thread Srinivas Pandruvada

Tested. You can add my ack.

Acked-by: Srinivas Pandruvada


On 07/10/2013 01:31 AM, Alexander Holler wrote:

Add a MODULE_DEVICE_TABLE in order to let hotplug mechanisms automatically
load the driver.

This makes it also possible to use the usual driver name instead of
HID-SENSOR-2000xx which isn't very descriptive in kernel messages.

Signed-off-by: Alexander Holler 
---
  drivers/iio/gyro/hid-sensor-gyro-3d.c | 16 +++-
  1 file changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/iio/gyro/hid-sensor-gyro-3d.c 
b/drivers/iio/gyro/hid-sensor-gyro-3d.c
index bc943dd..9cc8aa1 100644
--- a/drivers/iio/gyro/hid-sensor-gyro-3d.c
+++ b/drivers/iio/gyro/hid-sensor-gyro-3d.c
@@ -30,10 +30,6 @@
  #include 
  #include "../common/hid-sensors/hid-sensor-trigger.h"
  
-/*Format: HID-SENSOR-usage_id_in_hex*/

-/*Usage ID from spec for Gyro-3D: 0x200076*/
-#define DRIVER_NAME "HID-SENSOR-200076"
-
  enum gyro_3d_channel {
CHANNEL_SCAN_INDEX_X,
CHANNEL_SCAN_INDEX_Y,
@@ -389,9 +385,19 @@ static int hid_gyro_3d_remove(struct platform_device *pdev)
return 0;
  }
  
+static struct platform_device_id hid_gyro_3d_ids[] = {

+   {
+   /* Format: HID-SENSOR-usage_id_in_hex_lowercase */
+   .name = "HID-SENSOR-200076",
+   },
+   { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(platform, hid_gyro_3d_ids);
+
  static struct platform_driver hid_gyro_3d_platform_driver = {
+   .id_table = hid_gyro_3d_ids,
.driver = {
-   .name   = DRIVER_NAME,
+   .name   = KBUILD_MODNAME,
.owner  = THIS_MODULE,
},
.probe  = hid_gyro_3d_probe,




Re: [PATCH] ACPI / memhotplug: Fix a stale pointer in error path

2013-07-12 Thread Rafael J. Wysocki
On Friday, July 12, 2013 03:12:24 PM Toshi Kani wrote:
> On Fri, 2013-07-12 at 23:13 +0200, Rafael J. Wysocki wrote:
> > On Friday, July 12, 2013 03:01:15 PM Toshi Kani wrote:
> > > On Fri, 2013-07-12 at 22:42 +0200, Rafael J. Wysocki wrote:
> > > > On Friday, July 12, 2013 08:51:29 AM Toshi Kani wrote:
> > > > > On Fri, 2013-07-12 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> > > > > > (2013/07/11 1:47), Toshi Kani wrote:
> > > > > > > device->driver_data needs to be cleared when releasing its data,
> > > > > > > mem_device, in an error path of acpi_memory_device_add().
> > > > > > > 
> > > > > > > Signed-off-by: Toshi Kani 
> > > > > > > ---
> > > > > > 
> > > > > > Reviewed-by: Yasuaki Ishimatsu 
> > > > > 
> > > > > Thanks Yasuaki!
> > > > 
> > > > Queued up as a fix for 3.11.
> > > 
> > > Thanks!
> > > 
> > > > Do we need that in -stable as well?
> > > 
> > > Good point.  Yes, we need that in -stable as well.
> > 
> > What's the oldest mainline major release that fix is applicable to?
> 
> The fix is applicable all the way back to 2.6.32.

For -stable I'll need to say some more about what the practical consequences
of the bug are.  Is it difficult to trigger?

Rafael


> > > > > > >   drivers/acpi/acpi_memhotplug.c |1 +
> > > > > > >   1 file changed, 1 insertion(+)
> > > > > > > 
> > > > > > > diff --git a/drivers/acpi/acpi_memhotplug.c 
> > > > > > > b/drivers/acpi/acpi_memhotplug.c
> > > > > > > index c711d11..999adb5 100644
> > > > > > > --- a/drivers/acpi/acpi_memhotplug.c
> > > > > > > +++ b/drivers/acpi/acpi_memhotplug.c
> > > > > > > @@ -323,6 +323,7 @@ static int acpi_memory_device_add(struct 
> > > > > > > acpi_device *device,
> > > > > > >   /* Get the range from the _CRS */
> > > > > > >   result = acpi_memory_get_device_resources(mem_device);
> > > > > > >   if (result) {
> > > > > > > + device->driver_data = NULL;
> > > > > > >   kfree(mem_device);
> > > > > > >   return result;
> > > > > > >   }
> > > > > > > --
> > > > > > > To unsubscribe from this list: send the line "unsubscribe 
> > > > > > > linux-acpi" in
> > > > > > > the body of a message to majord...@vger.kernel.org
> > > > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > > > > 
> > > > > > 
> > > > > > 
> > > > > 
> > > > > 
> > > 
> > > 
> 
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.
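
The pattern behind the one-line fix quoted below can be sketched in
userspace as follows. All demo_* names are hypothetical, and malloc/free
stand in for the kernel allocator; the point is only that a pointer
published before a fallible step has to be cleared again when that step
fails and the object is freed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the ACPI device and its private data. */
struct demo_device {
	void *driver_data;
};

struct demo_mem_device {
	int placeholder;
};

/* Stand-in for resource discovery; fails when 'fail' is non-zero. */
static int demo_get_resources(struct demo_mem_device *mem, int fail)
{
	(void)mem;
	return fail ? -1 : 0;
}

/* Attach private data to the device, undoing everything on failure. */
static int demo_device_add(struct demo_device *dev, int fail)
{
	struct demo_mem_device *mem = calloc(1, sizeof(*mem));
	int result;

	if (!mem)
		return -1;

	dev->driver_data = mem; /* pointer is published before the check */

	result = demo_get_resources(mem, fail);
	if (result) {
		dev->driver_data = NULL; /* do not leave a stale pointer */
		free(mem);
		return result;
	}
	return 0;
}

/* After a failed add, the device must not point at freed memory. */
static void demo_error_path_example(void)
{
	struct demo_device dev = { NULL };

	assert(demo_device_add(&dev, 1) != 0);
	assert(dev.driver_data == NULL);
}
```

Without the assignment to NULL, any later code that tests driver_data for
non-NULL would go on to use freed memory.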


Re: [ 00/19] 3.10.1-stable review

2013-07-12 Thread Justin M. Forbes
On Fri, Jul 12, 2013 at 04:28:20PM -0400, Steven Rostedt wrote:
> 
> I would suspect that machines that allow unprivileged users would be
> running distro kernels, and not the latest release from Linus, and thus
> even a bug that "can allow an unprivileged user to crash the kernel" may
> still be able to sit around for a month before being submitted.
> 
But distros *do* ship the latest release from Linus. Fedora is often
shipping .1 releases, and sometimes .0.  This seems to be getting more
difficult though as more and more fixes have been left for stable to fix
and the Linus release contains a number of known regressions.
We know about those regressions not just from following lists, but because
we have users running rawhide kernels which are snapshots of Linus' tree
almost daily.  They see the regressions and complain.  So yeah, there are
machines out there running Linus' latest tree.

Justin


Re: [git pull] vfs.git part 2

2013-07-12 Thread Al Viro
On Fri, Jul 12, 2013 at 01:21:07PM -0700, Linus Torvalds wrote:
> So you could have something like
> 
>   #define O_TMPFILE (__O_TMPFILE | O_DIRECTORY | O_RDWR)
>   #define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT | O_ACCMODE)
> 
> and then use
> 
>if ((flags & O_TMPFILE_MASK) != O_TMPFILE)
>return -ENOTSUPP;
> 
> or whatever.

Point...  I'd rather use EINVAL, though.  FWIW, we might want to add
openat2() at some later point, with proper validation of arguments; I really
don't want to think of the kludges we'll need the next time we add an
open flag...

Safer ABI for O_TMPFILE

[suggested by Rasmus Villemoes] make O_DIRECTORY | O_RDWR part of O_TMPFILE;
that will fail on old kernels in a lot more cases than what I came up with.
And make sure O_CREAT doesn't get there...

Signed-off-by: Al Viro 
---
diff --git a/arch/alpha/include/uapi/asm/fcntl.h 
b/arch/alpha/include/uapi/asm/fcntl.h
index dfdadb0..09f49a6 100644
--- a/arch/alpha/include/uapi/asm/fcntl.h
+++ b/arch/alpha/include/uapi/asm/fcntl.h
@@ -32,7 +32,7 @@
 #define O_SYNC (__O_SYNC|O_DSYNC)
 
 #define O_PATH 04000
-#define O_TMPFILE  01
+#define __O_TMPFILE01
 
 #define F_GETLK7
 #define F_SETLK8
diff --git a/arch/parisc/include/uapi/asm/fcntl.h 
b/arch/parisc/include/uapi/asm/fcntl.h
index cc61c47..34a46cb 100644
--- a/arch/parisc/include/uapi/asm/fcntl.h
+++ b/arch/parisc/include/uapi/asm/fcntl.h
@@ -20,7 +20,7 @@
 #define O_INVISIBLE00400 /* invisible I/O, for DMAPI/XDSM */
 
 #define O_PATH 02000
-#define O_TMPFILE  04000
+#define __O_TMPFILE04000
 
 #define F_GETLK64  8
 #define F_SETLK64  9
diff --git a/arch/sparc/include/uapi/asm/fcntl.h 
b/arch/sparc/include/uapi/asm/fcntl.h
index d73e5e0..7e8ace5 100644
--- a/arch/sparc/include/uapi/asm/fcntl.h
+++ b/arch/sparc/include/uapi/asm/fcntl.h
@@ -35,7 +35,7 @@
 #define O_SYNC (__O_SYNC|O_DSYNC)
 
 #define O_PATH 0x100
-#define O_TMPFILE  0x200
+#define __O_TMPFILE0x200
 
 #define F_GETOWN   5   /*  for sockets. */
 #define F_SETOWN   6   /*  for sockets. */
diff --git a/fs/namei.c b/fs/namei.c
index b2beee7..8b61d10 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2977,7 +2977,7 @@ static struct file *path_openat(int dfd, struct filename *pathname,
 
 	file->f_flags = op->open_flag;
 
-	if (unlikely(file->f_flags & O_TMPFILE)) {
+	if (unlikely(file->f_flags & __O_TMPFILE)) {
 		error = do_tmpfile(dfd, pathname, nd, flags, op, file, &opened);
 		goto out;
 	}
diff --git a/fs/open.c b/fs/open.c
index fca72c4..9156cb0 100644
--- a/fs/open.c
+++ b/fs/open.c
@@ -840,8 +840,8 @@ static inline int build_open_flags(int flags, umode_t mode, struct open_flags *o
 	if (flags & __O_SYNC)
 		flags |= O_DSYNC;
 
-	if (flags & O_TMPFILE) {
-		if (!(flags & O_CREAT))
+	if (flags & __O_TMPFILE) {
+		if ((flags & O_TMPFILE_MASK) != O_TMPFILE)
 			return -EINVAL;
 		acc_mode = MAY_OPEN | ACC_MODE(flags);
 	} else if (flags & O_PATH) {
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index 06632be..05ac354 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -84,10 +84,14 @@
 #define O_PATH 01000
 #endif
 
-#ifndef O_TMPFILE
-#define O_TMPFILE  02000
+#ifndef __O_TMPFILE
+#define __O_TMPFILE 02000
 #endif
 
+/* a horrid kludge trying to make sure that this will fail on old kernels */
+#define O_TMPFILE (__O_TMPFILE | O_DIRECTORY | O_RDWR)
+#define O_TMPFILE_MASK (__O_TMPFILE | O_DIRECTORY | O_CREAT | O_ACCMODE)  
+
 #ifndef O_NDELAY
 #define O_NDELAY   O_NONBLOCK
 #endif
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: rcu: qs related lockup on boot

2013-07-12 Thread Paul E. McKenney
On Fri, Jul 12, 2013 at 04:34:09PM -0400, Sasha Levin wrote:
> Hi all,
> 
> I've stumbled on the following RCU-related spew at boot. It happened right when the
> kernel finished its boot sequence and handed control to userspace.
> 
> [  116.549044] BUG: soft lockup - CPU#0 stuck for 30s! [modprobe:12510]
> [  116.549884] Modules linked in:
> [  116.551796] hardirqs last  enabled at (3258): [] 
> restore_args+0x0/0x30
> [  116.553684] hardirqs last disabled at (3259): [] 
> apic_timer_interrupt+0x6d/0x80
> [  116.554753] softirqs last  enabled at (3212): [] 
> __do_softirq+0x447/0x4d0
> [  116.555857] softirqs last disabled at (3223): [] 
> irq_exit+0x86/0x120
> [  116.556760] CPU: 0 PID: 12510 Comm: modprobe Tainted: GW
> 3.10.0-next-20130712-sasha #3956
> [  116.557812] task: 8807c8173000 ti: 8807c72a6000 task.ti: 
> 8807c72a6000
> [  116.558599] RIP: 0010:[]  []
> _raw_spin_unlock_irqrestore+0x8d/0xc0
> [  116.558958] RSP: 0018:8807eba03ce8  EFLAGS: 0282
> [  116.558958] RAX: 8807c8173000 RBX: 84199eb7 RCX: 
> 
> [  116.558958] RDX: 8807c8173000 RSI:  RDI: 
> 0282
> [  116.558958] RBP: 8807eba03cf8 R08:  R09: 
> 
> [  116.558958] R10: 0001 R11:  R12: 
> 8807eba03c58
> [  116.558958] R13: 841a3cf2 R14: 8807eba03cf8 R15: 
> 86098020
> [  116.558958] FS:  () GS:8807eba0() 
> knlGS:
> [  116.558958] CS:  0010 DS:  ES:  CR0: 8005003b
> [  116.558958] CR2: 7f99fc17f850 CR3: 0007c8438000 CR4: 
> 06f0
> [  116.558958] Stack:
> [  116.558958]  86098020 0282 8807eba03d38 
> 81169173
> [  116.558958]  8807eba03d18 85e8d000 0282 
> 
> [  116.558958]    8807eba03d58 
> 811e7743
> [  116.558958] Call Trace:
> [  116.558958]  
> [  116.558958]  [] __wake_up+0x53/0x70

Might be we figured out a new way to deadlock RCU and the scheduler,
but just in case you rediscovered an old way, could you please try this
patch from Peter Zijlstra?

http://www.mail-archive.com/linux-kernel@vger.kernel.org/msg464708.html

If that doesn't help and you don't have lockdep enabled, could you
please enable it?

Thanx, Paul

> [  116.558958]  [] rcu_report_qs_rsp+0x73/0x80
> [  116.558958]  [] rcu_report_qs_rnp+0x26d/0x2c0
> [  116.558958]  [] ? rcu_check_quiescent_state+0x4a/0xf0
> [  116.558958]  [] rcu_check_quiescent_state+0xcc/0xf0
> [  116.558958]  [] __rcu_process_callbacks+0x5b/0x180
> [  116.558958]  [] rcu_process_callbacks+0x88/0xc0
> [  116.558958]  [] __do_softirq+0x261/0x4d0
> [  116.558958]  [] irq_exit+0x86/0x120
> [  116.558958]  [] smp_apic_timer_interrupt+0x4a/0x60
> [  116.558958]  [] apic_timer_interrupt+0x72/0x80
> [  116.558958]  
> [  116.558958]  [] ? retint_restore_args+0x13/0x13
> [  116.558958]  [] ? _raw_spin_unlock_irq+0x4c/0x80
> [  116.558958]  [] ? _raw_spin_unlock_irq+0x30/0x80
> [  116.558958]  [] finish_task_switch+0x96/0x120
> [  116.558958]  [] ? finish_task_switch+0x58/0x120
> [  116.558958]  [] __schedule+0x81b/0x8e0
> [  116.558958]  [] ? rcu_irq_exit+0x1b7/0x200
> [  116.558958]  [] ? retint_restore_args+0x13/0x13
> [  116.558958]  [] preempt_schedule_irq+0xa4/0xf0
> [  116.558958]  [] retint_kernel+0x26/0x30
> [  116.558958]  [] ? user_enter+0x135/0x150
> [  116.558958]  [] syscall_trace_leave+0x12d/0x160
> [  116.558958]  [] int_check_syscall_exit_work+0x34/0x3d
> [  116.558958] Code: 1f 80 00 00 00 00 e8 73 e2 00 fd 48 83 3d e3 d6
> 8a 01 00 75 11 0f 0b 0f 1f 80 00 00 00 00 eb fe 66 0f 1f 44 00 00 4c
> 89 e7 57 9d <66> 66 90 66 90 bf 01 00 00 00 e8 94 4f 00 00 65 48 8b
> 04 25 88
> 
> The other cpus with anything in their callstack are:
> 
> [  118.366894] CPU: 140 PID: 0 Comm: swapper/140 Tainted: GW
> 3.10.0-next-20130712-sasha #3956
> [  118.366894] task: 8807de5bb000 ti: 8807de5c task.ti: 
> 8807de5c
> [  118.366894] RIP: 0010:[]  [] 
> delay_tsc+0xb2/0x140
> [  118.366894] RSP: 0018:8807fd203eb8  EFLAGS: 0046
> [  118.366894] RAX: 0008 RBX: 5af9b24b RCX: 
> 5af9c2c7
> [  118.366894] RDX: 0043 RSI:  RDI: 
> 0001
> [  118.366894] RBP: 8807fd203ee8 R08: e26ec8c5 R09: 
> 0001
> [  118.366894] R10: 0001 R11:  R12: 
> 8807de5c0010
> [  118.366894] R13: 5854 R14: 008c R15: 
> 5af9

Re: [PATCH 1/2] Drivers: hv: balloon: Fix a bug in the hot-add code

2013-07-12 Thread Ben Hutchings
On Fri, Jul 12, 2013 at 09:07:19PM +, KY Srinivasan wrote:
[...]
> > Well now it might look like a bug that you don't test the result
> > of wait_for_completion_timeout().  Maybe update the comment to
> > explain why it's OK to continue anyway?
> 
> I put in the comment in the patch explaining why it is ok to continue.
[...]

But that is not nearly as easy to see as the comment that is
already *in the code* which your patch isn't updating.

Ben.

-- 
Ben Hutchings
We get into the habit of living before acquiring the habit of thinking.
  - Albert Camus


Re: [PATCH] ACPI / memhotplug: Fix a stale pointer in error path

2013-07-12 Thread Toshi Kani
On Fri, 2013-07-12 at 23:13 +0200, Rafael J. Wysocki wrote:
> On Friday, July 12, 2013 03:01:15 PM Toshi Kani wrote:
> > On Fri, 2013-07-12 at 22:42 +0200, Rafael J. Wysocki wrote:
> > > On Friday, July 12, 2013 08:51:29 AM Toshi Kani wrote:
> > > > On Fri, 2013-07-12 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> > > > > (2013/07/11 1:47), Toshi Kani wrote:
> > > > > > device->driver_data needs to be cleared when releasing its data,
> > > > > > mem_device, in an error path of acpi_memory_device_add().
> > > > > > 
> > > > > > Signed-off-by: Toshi Kani 
> > > > > > ---
> > > > > 
> > > > > Reviewed-by: Yasuaki Ishimatsu 
> > > > 
> > > > Thanks Yasuaki!
> > > 
> > > Queued up as a fix for 3.11.
> > 
> > Thanks!
> > 
> > > Do we need that in -stable as well?
> > 
> > Good point.  Yes, we need that in -stable as well.
> 
> What's the oldest mainline major release that fix is applicable to?

The fix is applicable all the way back to 2.6.32.
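
For readers following the fix under discussion, here is a minimal userspace
sketch of the pattern, assuming a simplified device structure. The ex_*
names are hypothetical stand-ins and the failure is simulated; this is not
the ACPI code. The point is only that the back-pointer is cleared before
the object it refers to is freed in the error path.

```c
#include <stdlib.h>

/* Hypothetical stand-ins for the structures named in the patch. */
struct ex_device {
	void *driver_data;
};

struct ex_mem_device {
	int placeholder;
};

/*
 * Returns 0 on success, -1 on failure. On failure, driver_data is
 * reset to NULL so no stale pointer to freed memory is left behind.
 */
int ex_device_add(struct ex_device *device, int simulate_failure)
{
	struct ex_mem_device *mem_device = calloc(1, sizeof(*mem_device));

	if (!mem_device)
		return -1;

	device->driver_data = mem_device;

	if (simulate_failure) {
		/* Clear the pointer first, then release the object. */
		device->driver_data = NULL;
		free(mem_device);
		return -1;
	}
	return 0;
}
```

Without the assignment to NULL, a later reader of driver_data would see a
non-NULL pointer to memory that has already been freed.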

Thanks,
-Toshi


> 
> Rafael
> 
> 
> > > > > >   drivers/acpi/acpi_memhotplug.c |1 +
> > > > > >   1 file changed, 1 insertion(+)
> > > > > > 
> > > > > > diff --git a/drivers/acpi/acpi_memhotplug.c 
> > > > > > b/drivers/acpi/acpi_memhotplug.c
> > > > > > index c711d11..999adb5 100644
> > > > > > --- a/drivers/acpi/acpi_memhotplug.c
> > > > > > +++ b/drivers/acpi/acpi_memhotplug.c
> > > > > > @@ -323,6 +323,7 @@ static int acpi_memory_device_add(struct 
> > > > > > acpi_device *device,
> > > > > > /* Get the range from the _CRS */
> > > > > > result = acpi_memory_get_device_resources(mem_device);
> > > > > > if (result) {
> > > > > > +   device->driver_data = NULL;
> > > > > > kfree(mem_device);
> > > > > > return result;
> > > > > > }
> > > > > > --
> > > > > > To unsubscribe from this list: send the line "unsubscribe 
> > > > > > linux-acpi" in
> > > > > > the body of a message to majord...@vger.kernel.org
> > > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > > > 
> > > > > 
> > > > > 
> > > > 
> > > > 
> > 
> > 




RE: [PATCH 1/2] Drivers: hv: balloon: Fix a bug in the hot-add code

2013-07-12 Thread KY Srinivasan


> -----Original Message-----
> From: Ben Hutchings [mailto:b...@decadent.org.uk]
> Sent: Friday, July 12, 2013 4:17 PM
> To: KY Srinivasan
> Cc: gre...@linuxfoundation.org; linux-kernel@vger.kernel.org;
> de...@linuxdriverproject.org; o...@aepfle.de; a...@canonical.com;
> jasow...@redhat.com; Stable
> Subject: Re: [PATCH 1/2] Drivers: hv: balloon: Fix a bug in the hot-add code
> 
> On Fri, Jul 12, 2013 at 06:56:14AM -0700, K. Y. Srinivasan wrote:
> > As we hot-add 128 MB chunks of memory, we wait to ensure that the memory
> > is onlined before attempting to hot-add the next chunk. If the udev rule for
> > memory hot-add is not executed within the allowed time, we would roll back
> > the state and abort further hot-add. Since the hot-add has succeeded and the
> > only failure is that the memory is not onlined within the allowed time, we
> > should not be rolling back the state. Fix this bug.
> [...]
> > /*
> >  * Wait for the memory block to be onlined.
> >  */
> > -   t = wait_for_completion_timeout(&dm_device.ol_waitevent, 5*HZ);
> > -   if (t == 0) {
> > -   pr_info("hot_add memory timedout\n");
> > -   has->ha_end_pfn -= HA_CHUNK;
> > -   has->covered_end_pfn -=  processed_pfn;
> > -   break;
> > -   }
> > +   wait_for_completion_timeout(&dm_device.ol_waitevent, 5*HZ);
> >
> > }
> >
> 
> Well now it might look like a bug that you don't test the result
> of wait_for_completion_timeout().  Maybe update the comment to
> explain why it's OK to continue anyway?

I put a comment in the patch explaining why it is OK to continue. To reiterate,
it is OK to continue because the hot-add has succeeded. More importantly, what
I was doing earlier (rolling back the state when in fact the hot-add had
succeeded) was incorrect.

Regards,

K. Y
> 
> Ben.
> 
> --
> Ben Hutchings
> We get into the habit of living before acquiring the habit of thinking.
>   - Albert Camus
> 




Re: [ 00/19] 3.10.1-stable review

2013-07-12 Thread Guenter Roeck
On Fri, Jul 12, 2013 at 04:47:44PM -0400, Theodore Ts'o wrote:
> On Fri, Jul 12, 2013 at 09:50:51PM +0200, Willy Tarreau wrote:
> > So probably we should incite patch contributors to add a specific
> > tag such as "Fixes: 3.5 and later", so that non-important patches
> > do not need the Cc:stable anymore, but users who experience an issue
> > can easily spot them and ask for their inclusion.
> 
> This is a really good idea.   /me likes
> 
I agree, that would be very helpful.

Guenter


Re: [PATCH] ACPI / memhotplug: Fix a stale pointer in error path

2013-07-12 Thread Rafael J. Wysocki
On Friday, July 12, 2013 03:01:15 PM Toshi Kani wrote:
> On Fri, 2013-07-12 at 22:42 +0200, Rafael J. Wysocki wrote:
> > On Friday, July 12, 2013 08:51:29 AM Toshi Kani wrote:
> > > On Fri, 2013-07-12 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> > > > (2013/07/11 1:47), Toshi Kani wrote:
> > > > > device->driver_data needs to be cleared when releasing its data,
> > > > > mem_device, in an error path of acpi_memory_device_add().
> > > > > 
> > > > > Signed-off-by: Toshi Kani 
> > > > > ---
> > > > 
> > > > Reviewed-by: Yasuaki Ishimatsu 
> > > 
> > > Thanks Yasuaki!
> > 
> > Queued up as a fix for 3.11.
> 
> Thanks!
> 
> > Do we need that in -stable as well?
> 
> Good point.  Yes, we need that in -stable as well.

What's the oldest mainline major release that fix is applicable to?

Rafael


> > > > >   drivers/acpi/acpi_memhotplug.c |1 +
> > > > >   1 file changed, 1 insertion(+)
> > > > > 
> > > > > diff --git a/drivers/acpi/acpi_memhotplug.c 
> > > > > b/drivers/acpi/acpi_memhotplug.c
> > > > > index c711d11..999adb5 100644
> > > > > --- a/drivers/acpi/acpi_memhotplug.c
> > > > > +++ b/drivers/acpi/acpi_memhotplug.c
> > > > > @@ -323,6 +323,7 @@ static int acpi_memory_device_add(struct 
> > > > > acpi_device *device,
> > > > >   /* Get the range from the _CRS */
> > > > >   result = acpi_memory_get_device_resources(mem_device);
> > > > >   if (result) {
> > > > > + device->driver_data = NULL;
> > > > >   kfree(mem_device);
> > > > >   return result;
> > > > >   }
> > > > > --
> > > > > To unsubscribe from this list: send the line "unsubscribe linux-acpi" 
> > > > > in
> > > > > the body of a message to majord...@vger.kernel.org
> > > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > > 
> > > > 
> > > > 
> > > 
> > > 
> 
> 
-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [PATCH] ACPI / memhotplug: Fix a stale pointer in error path

2013-07-12 Thread Toshi Kani
On Fri, 2013-07-12 at 22:42 +0200, Rafael J. Wysocki wrote:
> On Friday, July 12, 2013 08:51:29 AM Toshi Kani wrote:
> > On Fri, 2013-07-12 at 09:24 +0900, Yasuaki Ishimatsu wrote:
> > > (2013/07/11 1:47), Toshi Kani wrote:
> > > > device->driver_data needs to be cleared when releasing its data,
> > > > mem_device, in an error path of acpi_memory_device_add().
> > > > 
> > > > Signed-off-by: Toshi Kani 
> > > > ---
> > > 
> > > Reviewed-by: Yasuaki Ishimatsu 
> > 
> > Thanks Yasuaki!
> 
> Queued up as a fix for 3.11.

Thanks!

> Do we need that in -stable as well?

Good point.  Yes, we need that in -stable as well.

-Toshi


> Rafael
> 
> 
> > > 
> > > >   drivers/acpi/acpi_memhotplug.c |1 +
> > > >   1 file changed, 1 insertion(+)
> > > > 
> > > > diff --git a/drivers/acpi/acpi_memhotplug.c 
> > > > b/drivers/acpi/acpi_memhotplug.c
> > > > index c711d11..999adb5 100644
> > > > --- a/drivers/acpi/acpi_memhotplug.c
> > > > +++ b/drivers/acpi/acpi_memhotplug.c
> > > > @@ -323,6 +323,7 @@ static int acpi_memory_device_add(struct 
> > > > acpi_device *device,
> > > > /* Get the range from the _CRS */
> > > > result = acpi_memory_get_device_resources(mem_device);
> > > > if (result) {
> > > > +   device->driver_data = NULL;
> > > > kfree(mem_device);
> > > > return result;
> > > > }
> > > > --
> > > > To unsubscribe from this list: send the line "unsubscribe linux-acpi" in
> > > > the body of a message to majord...@vger.kernel.org
> > > > More majordomo info at  http://vger.kernel.org/majordomo-info.html
> > > > 
> > > 
> > > 
> > 
> > 




Re: [RFC][PATCH 23/30] ACPI / hotplug / PCI: Do not execute _PS0 and _PS3 directly

2013-07-12 Thread Rafael J. Wysocki
On Friday, July 12, 2013 04:05:08 PM Mika Westerberg wrote:
> On Fri, Jul 12, 2013 at 02:01:30AM +0200, Rafael J. Wysocki wrote:
> > Index: linux-pm/drivers/pci/hotplug/acpiphp.h
> > ===
> > --- linux-pm.orig/drivers/pci/hotplug/acpiphp.h
> > +++ linux-pm/drivers/pci/hotplug/acpiphp.h
> > @@ -160,7 +160,6 @@ struct acpiphp_attention_info
> >  
> >  /* slot flags */
> >  
> > -#define SLOT_POWEREDON (0x0001)
> >  #define SLOT_ENABLED   (0x0002)
> >  #define SLOT_MULTIFUNCTION (0x0004)
> >  
> > @@ -168,11 +167,7 @@ struct acpiphp_attention_info
> >  
> >  #define FUNC_HAS_STA   (0x0001)
> >  #define FUNC_HAS_EJ0   (0x0002)
> > -#define FUNC_HAS_PS0   (0x0010)
> > -#define FUNC_HAS_PS1   (0x0020)
> > -#define FUNC_HAS_PS2   (0x0040)
> > -#define FUNC_HAS_PS3   (0x0080)
> > -#define FUNC_HAS_DCK (0x0100)
> > +#define FUNC_HAS_DCK (0x0003)
> 
> These are flags not enum so the above wants to be
> 
>   #define FUNC_HAS_DCK (0x0004)

Yeah, obviously.

I guess it goes against the natural tendency to assign numbers to things
sequentially, so I generally prefer the (1U << n) notation. :-)

> otherwise we accidentally match checks like:
> 
>   /* install notify handler */
>   if (!(newfunc->flags & FUNC_HAS_DCK)) {
>   ...

Yup.  Thanks for spotting that!
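
To make the overlap concrete, here is a small sketch, assuming only the two
remaining flags quoted in the patch. The EX_* names and ex_has_flag() are
hypothetical helpers for illustration, not part of acpiphp. It uses the
(1U << n) notation mentioned above.

```c
/* Each flag occupies its own bit when written as a shift. */
#define EX_FUNC_HAS_STA      (1U << 0)	/* 0x0001 */
#define EX_FUNC_HAS_EJ0      (1U << 1)	/* 0x0002 */
#define EX_FUNC_HAS_DCK      (1U << 2)	/* 0x0004 */

/* The value from the posted patch: 0x0003 is STA | EJ0, not a new bit. */
#define EX_FUNC_HAS_DCK_BAD  0x0003U

/* Returns 1 if any bit of 'flag' is set in 'flags', 0 otherwise. */
int ex_has_flag(unsigned int flags, unsigned int flag)
{
	return (flags & flag) != 0;
}
```

A function that only has _STA would test as having a dock method with the
0x0003 value, because the two masks share bit 0. With the shifted value the
same test is false.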

Rafael


-- 
I speak only for myself.
Rafael J. Wysocki, Intel Open Source Technology Center.


Re: [media] cx25821 regression from 3.9: BUG: bad unlock balance detected!

2013-07-12 Thread Sander Eikelenboom

Friday, May 17, 2013, 11:52:17 AM, you wrote:

> On Fri May 17 2013 11:04:50 Sander Eikelenboom wrote:
>> 
>> Friday, May 17, 2013, 10:25:24 AM, you wrote:
>> 
>> > On Thu May 16 2013 19:41:42 Sander Eikelenboom wrote:
>> >> Hi Hans / Mauro,
>> >> 
>> >> With 3.10.0-rc1 (including the cx25821 changes from Hans), I get the bug 
>> >> below which wasn't present with 3.9.
>> 
>> > How do I reproduce this? I've tried to, but I can't make this happen.
>> 
>> > Looking at the code I can't see how it could hit this bug anyway.
>> 
>> I'm using "motion" to grab and process 6 of the video streams from the card
>> I have (a card with 8 inputs).
>> It seems the cx25821 underwent quite some changes between 3.9 and 3.10.

> It did.

>> And in the past there have been some more locking issues around mmap and 
>> media devices, although they seem to appear as circular locking dependencies 
>> and with different devices.
>>- http://www.mail-archive.com/linux-media@vger.kernel.org/msg46217.html
>>- Under kvm: http://www.spinics.net/lists/linux-media/msg63322.html

> Neither of those are related to this issue.

>> 
>> - Perhaps running in a VM could have something to do with it?
>>    - The driver on 3.9 occasionally gives this, probably latency related (but
>>      it continues to work):
>>      cx25821: cx25821_video_wakeup: 2 buffers handled (should be 1)
>> 
>>      Could it be something double unlocking in that path?
>> 
>> - Is there any extra debugging I could enable that could pinpoint the issue?

> Try this patch:

> diff --git a/drivers/media/pci/cx25821/cx25821-core.c b/drivers/media/pci/cx25821/cx25821-core.c
> index b762c5b..8f8d0e0 100644
> --- a/drivers/media/pci/cx25821/cx25821-core.c
> +++ b/drivers/media/pci/cx25821/cx25821-core.c
> @@ -1208,7 +1208,6 @@ void cx25821_free_buffer(struct videobuf_queue *q, struct cx25821_buffer *buf)
>  	struct videobuf_dmabuf *dma = videobuf_to_dma(&buf->vb);
>  
>  	BUG_ON(in_interrupt());
> -	videobuf_waiton(q, &buf->vb, 0, 0);
>  	videobuf_dma_unmap(q->dev, dma);
>  	videobuf_dma_free(dma);
>  	btcx_riscmem_free(to_pci_dev(q->dev), &buf->risc);

> I don't think the waiton is really needed for this driver.

> What really should happen is that videobuf is replaced by videobuf2 in this
> driver, but that's a fair amount of work.

Hi Hans,

After being busy for quite some time, I do have some spare time now.

Since I'm still having trouble with this driver, is there a patch series for a
similar driver that was converted to videobuf2?
I don't know if it is entirely in my league, but I could give it a try if I
have an example.

--
Sander


> Regards,

> Hans

>> 
>> 
>> --
>> 
>> Sander
>> 
>> 
>> 
>> > Regards,
>> 
>> > Hans
>> 
>> >> 
>> >> --
>> >> Sander
>> >> 
>> >> 
>> >> [   53.004968] =
>> >> [   53.004968] [ BUG: bad unlock balance detected! ]
>> >> [   53.004968] 3.10.0-rc1-20130516-jens+ #1 Not tainted
>> >> [   53.004968] -
>> >> [   53.004968] motion/3328 is trying to release lock (>lock) at:
>> >> [   53.004968] [] mutex_unlock+0x9/0x10
>> >> [   53.004968] but there are no more locks to release!
>> >> [   53.004968]
>> >> [   53.004968] other info that might help us debug this:
>> >> [   53.004968] 1 lock held by motion/3328:
>> >> [   53.004968]  #0:  (>mmap_sem){++}, at: [] 
>> >> vm_munmap+0x3e/0x70
>> >> [   53.004968]
>> >> [   53.004968] stack backtrace:
>> >> [   53.004968] CPU: 1 PID: 3328 Comm: motion Not tainted 
>> >> 3.10.0-rc1-20130516-jens+ #1
>> >> [   53.004968] Hardware name: Xen HVM domU, BIOS 4.3-unstable 05/16/2013
>> >> [   53.004968]  819be5f9 88002ac35c58 819b9029 
>> >> 88002ac35c88
>> >> [   53.004968]  810e615e 88002ac35cb8 88002b7c18a8 
>> >> 819be5f9
>> >> [   53.004968]   88002ac35d28 810eb17e 
>> >> 810e7ba5
>> >> [   53.004968] Call Trace:
>> >> [   53.004968]  [] ? mutex_unlock+0x9/0x10
>> >> [   53.004968]  [] dump_stack+0x19/0x1b
>> >> [   53.004968]  [] print_unlock_imbalance_bug+0xfe/0x110
>> >> [   53.004968]  [] ? mutex_unlock+0x9/0x10
>> >> [   53.004968]  [] lock_release_non_nested+0x1ce/0x320
>> >> [   53.004968]  [] ? 
>> >> debug_check_no_locks_freed+0x105/0x1b0
>> >> [   53.353529]  [] ? mutex_unlock+0x9/0x10
>> >> [   53.353529]  [] lock_release+0xfc/0x250
>> >> [   53.353529]  [] __mutex_unlock_slowpath+0xb2/0x1f0
>> >> [   53.353529]  [] mutex_unlock+0x9/0x10
>> >> [   53.353529]  [] videobuf_waiton+0x55/0x230
>> >> [   53.353529]  [] ? tlb_finish_mmu+0x32/0x50
>> >> [   53.353529]  [] ? unmap_region+0xc6/0x100
>> >> [   53.353529]  [] ? kmem_cache_free+0x195/0x230
>> >> [   53.353529]  [] cx25821_free_buffer+0x49/0xa0
>> >> [   53.353529]  [] cx25821_buffer_release+0x9/0x10
>> >> [   53.353529]  [] videobuf_vm_close+0xc5/0x160
>> >> [   53.353529]  [] remove_vma+0x25/0x60
>> >> [   53.353529]  [] 

Re: [RFC][PATCH 0/30] ACPI / hotplug / PCI: Major rework + Thunderbolt workarounds

2013-07-12 Thread Rafael J. Wysocki
On Friday, July 12, 2013 04:18:50 PM Mika Westerberg wrote:
> On Fri, Jul 12, 2013 at 01:34:20AM +0200, Rafael J. Wysocki wrote:
> > Hi,
> > 
> > I've made some progress with my ACPIPHP rework since I posted the series last
> > time and here goes an update.
> > 
> > First off, the previous series was somewhat racy, which should be fixed now.
> > Apart from this there's quite some new material on top of the patches I posted
> > last time (or rather on top of their new versions) and I integrated the
> > Thunderbolt series from Mika with that.  As a result,
> > 
> > https://patchwork.kernel.org/patch/2817341/
> > 
> > is required to be applied.
> 
> With the above mentioned patch applied + fix for patch [23/30], I tested
> this series on Acer Aspire S5 and Intel DZ77RE-75K desktop and Thunderbolt
> works just fine :-)
> 
> You can add
> 
> Tested-by: Mika Westerberg 
> 
> to the series.

Thanks a lot, that's really helpful!

Now I can rebase it on the previous cleanups and we'll see how it all
works together on top of Linus' current tree. :-)

> Nice cleanup!

Thanks!


