Re: [PATCH 00/34] Memory management performance backports for -stable V2

2012-07-23 Thread Mike Galbraith
On Mon, 2012-07-23 at 14:38 +0100, Mel Gorman wrote: 
> Changelog since V1
>   o Expand some of the notes  (jrnieder)
>   o Correct upstream commit SHA1  (hugh)
> 
> This series is related to the new addition to stable_kernel_rules.txt
> 
>  - Serious issues as reported by a user of a distribution kernel may also
>    be considered if they fix a notable performance or interactivity issue.
>    As these fixes are not as obvious and have a higher risk of a subtle
>    regression they should only be submitted by a distribution kernel
>    maintainer and include an addendum linking to a bugzilla entry if it
>    exists and additional information on the user-visible impact.
> 
> All of these patches have been backported to a distribution kernel and
> address some sort of performance issue in the VM. As they are not all
> obvious, I've added a "Stable note" to the top of each patch giving
> additional information on why the patch was backported. Let's see where
> the boundaries lie on how this new rule is interpreted in practice :).

FWIW, I'm all for performance backports.  They do have a downside though
(other than the risk of bugs slipping in, or triggering latent bugs).

When the next enterprise kernel is built, marketeers ask for numbers to
make potential customers drool over, and you _can't produce any_ because
you wedged all the spiffy performance stuff into the crusty old kernel.

-Mike

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[GIT PULL v2] Blackfin changes for 3.6-rc1

2012-07-23 Thread Bob Liu
Hi Linus,

This is the new pull request for the Blackfin changes for 3.6-rc1.
I've rebased my tree onto 3.5.

The big changes add PM and HDMI support for bf60x; the other patches are
various bug fixes and code cleanups.

Thanks,
-Bob

The following changes since commit 28a33cbc24e4256c143dce96c7d93bf423229f92:

  Linux 3.5 (2012-07-21 13:58:29 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin.git for-linus

for you to fetch changes up to 719154c6d1c1a3a404f4ff570c4b36bb2ef868ca:

  bf60x: fix build warning (2012-07-24 13:39:53 +0800)


Bob Liu (7):
  bfin: reorg clock init steps for bf609
  blackfin: Kconfig: fix ROM range for bf60x
  blackfin: mach-common: ints-priority add irq_set_wake
  blackfin: bf609-ezkit: add probe_type for norflash
  blackfin: fix musb macro name
  blackfin: cplb-nompu: fix ROM cplb size for bf609-ezkit
  bf60x: fix build warning

Scott Jiang (10):
  bf609: change ad7877 cs and irq pin
  bfin: add 32M, 16M and 8M uncached DMA region options
  v4l2: add adv7842 video decoder driver
  bf609: add ssm2602 support on bf609 platform
  bf609: add CVBS and S-Video support for adv7842
  bf609: add HDMI support for adv7842
  bf609: convert vs6624 blank_clocks to black_pixels
  bf561: add capabilities in adv7183_inputs
  bf609: reuse bf5xx-i2s-pcm.c as i2s pcm driver
  bf609: add adv7511 display support

Sonic Zhang (16):
  blackfin: Call sg_for_each to pass through the whole sg list.
  bf609: crypto: Add blackfin crypto crc driver platform data.
  bf60x: Enable Blackfin CRC crypto driver by default.
  bf60x: bfin_crc: move structure bfin_crc out of head file.
  bf609: bfin_crc: Remove unused CRC TX DMA platform resources.
  bfin: pm: add deepsleep for bf60x
  bf60x: Add wake up latency bench for deep sleep mode.
  i2c: i2c-bfin-twi: Always access 16 bit MMR by bfin 16 bit access Macro.
  bf60x: sec: Clean up interrupt initialization code for SEC.
  bf60x: sec: Enable sec interrupt source priority configuration.
  bf60x: update bf60x anomaly list.
  bf60x: add default anomaly setting.
  bf60x: update anomaly id in serial and twi driver headers.
  bf60x: Add double fault, hardware error and NMI SEC handler
  bf60x: cpufreq: fix anomaly 05000273
  blackfin: twi: read twi mmr via bfin_read macro

Steven Miao (14):
  pm: dpmc macro typo fix
  bfin-dma: only use MDMA3 on bf609
  irq: set cgu event handle to fasteoi handle
  cpufreq: change debug message level to show clock change error
  cache: enable L2 sram icache in menuconfig
  bfin: simple_timer: add READ_COUNTER ioctl and add NOIRQ timer mode
  bf60x: pm: add smc nor flash syscore ops
  bf60x: pm: pass wakeup param
  gpiokeys: add gpio keyboard platform device
  bf60x: pm: add pint suspend and resume support
  bfin: pint: add pint suspend and resume
  cleanup: sec and linkport only built on bf60x
  dpm: deepsleep: reserve stack
  PM: add BF60x flash suspend and resume support

Vivi Li (1):
  bf60x: vs6624 pin update

 arch/blackfin/Kconfig                           |   16 +-
 arch/blackfin/configs/BF609-EZKIT_defconfig     |    2 +
 arch/blackfin/include/asm/bfin-global.h         |    8 +-
 arch/blackfin/include/asm/bfin_crc.h            |   14 -
 arch/blackfin/include/asm/bfin_serial.h         |    2 +-
 arch/blackfin/include/asm/bfin_simple_timer.h   |    6 +
 arch/blackfin/include/asm/bfin_twi.h            |   10 +-
 arch/blackfin/include/asm/context.S             |    9 +
 arch/blackfin/include/asm/dpmc.h                |    2 +-
 arch/blackfin/include/asm/gpio.h                |    2 +
 arch/blackfin/include/asm/irq.h                 |   10 +
 arch/blackfin/include/asm/mem_init.h            |  212 +
 arch/blackfin/include/asm/traps.h               |    2 +
 arch/blackfin/kernel/bfin_dma.c                 |    4 +-
 arch/blackfin/kernel/cplb-nompu/cplbinit.c      |    8 +
 arch/blackfin/kernel/dma-mapping.c              |   10 +-
 arch/blackfin/mach-bf527/boards/ezkit.c         |    4 +-
 arch/blackfin/mach-bf548/boards/ezkit.c         |    4 +-
 arch/blackfin/mach-bf548/include/mach/gpio.h    |    2 +
 arch/blackfin/mach-bf561/boards/ezkit.c         |    3 +
 arch/blackfin/mach-bf609/Kconfig                |    8 +
 arch/blackfin/mach-bf609/Makefile               |    4 +-
 arch/blackfin/mach-bf609/boards/ezkit.c         |  304 +-
 arch/blackfin/mach-bf609/clock.c                |    3 +-
 arch/blackfin/mach-bf609/dpm.S                  |  157 ++
 arch/blackfin/mach-bf609/hibernate.S            |   65
 arch/blackfin/mach-bf609/include/mach/anomaly.h |  141 -
 

Re: [GIT PULL] Blackfin changes for 3.6-rc1

2012-07-23 Thread Bob Liu
Hi Linus,

On Tue, Jul 24, 2012 at 12:54 PM, Linus Torvalds wrote:
> On Mon, Jul 23, 2012 at 8:54 PM, Bob Liu  wrote:
>>
>> Please pull blackfin changes for 3.6-rc1.
>
> No.
>
> These were clearly rebased today. And on top of random state in the
> merge window.
>

Sorry for that, I've rebased my tree onto v3.5 and will send a new pull request.

-- 
Regards,
--Bob


Re: [PATCH] USB: plusb: Add support for PL-2501

2012-07-23 Thread kyak
This patch was created against linux-3.5, but it applies perfectly
against the net-next tree; I just checked.


I'm sorry for not being able to submit a correct patch on the first
attempt (or even the third). Could you be more specific
about "doesn't apply cleanly at all"? By the way, I'm perfectly fine if
you just make this trivial change yourself and take the credit, because
our exchange of e-mails has become 20 times bigger than the patch
itself. Sending another version of this patch from my side would probably
be just another waste of (your) time.


On Mon, 23 Jul 2012, David Miller wrote:

> From: kyak
> Date: Mon, 23 Jul 2012 15:44:11 +0400 (MSK)
>
>> From: Mikhail Peselnik
>>
>> This patch adds support for PL-2501 by adding the appropriate USB
>> IDs. This chip is used in several USB 'Easy Transfer' Cables.
>>
>> Signed-off-by: Mikhail Peselnik
>> Tested-by: Mikhail Peselnik
>
> This does not apply cleanly to my net-next tree at all.




linux-next: Tree for July 24

2012-07-23 Thread Stephen Rothwell
Hi all,

Please do not add anything to linux-next included branches/series that is
destined for v3.7 until after v3.6-rc1 is released.

Reminder: do not rebase your branches before asking Linus to pull them ...

Changes since 20120723:

The nfs tree lost its conflict.

The device-mapper tree lost its conflicts.

The tty tree still has its build failures for which I have disabled 2
staging drivers and applied a patch.

I have still reverted 3 commits from the signal tree at the request of the
arm maintainer.



I have created today's linux-next tree at
git://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
(patches at http://www.kernel.org/pub/linux/kernel/next/ ).  If you
are tracking the linux-next tree using git, you should not use "git pull"
to do so as that will try to merge the new linux-next release with the
old one.  You should use "git fetch" as mentioned in the FAQ on the wiki
(see below).

You can see which trees have been included by looking in the Next/Trees
file in the source.  There are also quilt-import.log and merge.log files
in the Next directory.  Between each merge, the tree was built with
a ppc64_defconfig for powerpc and an allmodconfig for x86_64. After the
final fixups (if any), it is also built with powerpc allnoconfig (32 and
64 bit), ppc44x_defconfig and allyesconfig (minus
CONFIG_PROFILE_ALL_BRANCHES - this fails its final link) and i386, sparc,
sparc64 and arm defconfig. These builds also have
CONFIG_ENABLE_WARN_DEPRECATED, CONFIG_ENABLE_MUST_CHECK and
CONFIG_DEBUG_INFO disabled when necessary.

Below is a summary of the state of the merge.

We are up to 197 trees (counting Linus' and 26 trees of patches pending
for Linus' tree), more are welcome (even if they are currently empty).
Thanks to those who have contributed, and to those who haven't, please do.

Status of my local build tests will be at
http://kisskb.ellerman.id.au/linux-next .  If maintainers want to give
advice about cross compilers/configs that work, we are always open to add
more builds.

Thanks to Randy Dunlap for doing many randconfig builds.  And to Paul
Gortmaker for triage and bug fixes.

There is a wiki covering stuff to do with linux-next at
http://linux.f-seidel.de/linux-next/pmwiki/ .  Thanks to Frank Seidel.

-- 
Cheers,
Stephen Rothwell                    s...@canb.auug.org.au

$ git checkout master
$ git reset --hard stable
Merging origin/master (0ec4f43 locks: fix checking of fcntl_setlease argument)
Merging fixes/master (9023a40 Merge tag 'mmc-fixes-for-3.5-rc4' of 
git://git.kernel.org/pub/scm/linux/kernel/git/cjb/mmc)
Merging kbuild-current/rc-fixes (f8f5701 Linux 3.5-rc1)
Merging arm-current/fixes (ff081e0 ARM: 7457/1: smp: Fix suspicious RCU 
originating from cpu_die())
Merging m68k-current/for-linus (1525e06 m68k/apollo: Rename "timer" to 
"apollo_timer")
Merging powerpc-merge/merge (50fb31c tty/hvc_opal: Fix debug function name)
Merging sparc/master (d55de60 sparc64: remove unused function 
straddles_64bit_va_hole())
Merging net/master (3e4b945 Merge tag 'md-3.5-fixes' of 
git://neil.brown.name/md)
Merging sound-current/for-linus (c1b623d Merge tag 'asoc-3.6' of 
git://git.kernel.org/pub/scm/linux/kernel/git/broonie/sound into for-next)
Merging pci-current/for-linus (314489b Merge tag 'fixes-for-linus' of 
git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc)
Merging wireless/master (8a70e7f NFC: NCI module license 'unspecified' taints 
kernel)
Merging driver-core.current/driver-core-linus (84a1caf Linux 3.5-rc7)
Merging tty.current/tty-linus (84a1caf Linux 3.5-rc7)
Merging usb.current/usb-linus (84a1caf Linux 3.5-rc7)
Merging staging.current/staging-linus (6887a41 Linux 3.5-rc5)
Merging char-misc.current/char-misc-linus (84a1caf Linux 3.5-rc7)
Merging input-current/for-linus (e76b8ee Input: xpad - add Andamiro Pump It Up 
pad)
Merging md-current/for-linus (58e94ae md/raid1: close some possible races on 
write errors during resync)
Merging audit-current/for-linus (c158a35 audit: no leading space in 
audit_log_d_path prefix)
Merging crypto-current/master (c475c06 hwrng: atmel-rng - fix data valid check)
Merging ide/master (39a50b4 Merge branch 'hfsplus')
Merging dwmw2/master (244dc4e Merge 
git://git.infradead.org/users/dwmw2/random-2.6)
Merging sh-current/sh-fixes-for-linus (4403310 SH: Convert out[bwl] macros to 
inline functions)
Merging irqdomain-current/irqdomain/merge (15e06bf irqdomain: Fix debugfs 
formatting)
Merging devicetree-current/devicetree/merge (4e8383b of: release node fix for 
of_parse_phandle_with_args)
Merging spi-current/spi/merge (d1c185b of/spi: Fix SPI module loading by using 
proper "spi:" modalias prefixes.)
Merging gpio-current/gpio/merge (96b7064 gpio/tca6424: merge I2C transactions, 
remove cast)
Merging arm/for-next (dea2ea3 Merge branches 'audit', 'delay', 'dmaengine', 
'fixes', 'misc' and 'sta2x11' into for-next)
Merging

[GIT PULL] First round of SCSI updates for the 3.5+ merge window

2012-07-23 Thread James Bottomley
The most important feature of this patch set is the new async
infrastructure that makes sure async_synchronize_full() synchronizes all
domains and allows us to remove all the hacks (like having
scsi_complete_async_scans() in the device base code) and means that the
async infrastructure will "just work" in future. The rest is assorted
driver updates (aacraid, bnx2fc, virtio-scsi, megaraid, bfa, lpfc,
qla2xxx, qla4xxx) plus a lot of infrastructure work in sas and FC.

The patch is available here:

git://git.kernel.org/pub/scm/linux/kernel/git/jejb/scsi.git scsi-misc

The short changelog is

Alan Cox (1):
  aha152x: Allow use on 64bit systems

Bart Van Assche (5):
  Stop accepting SCSI requests before removing a device
  Change return type of scsi_queue_insert() into void
  Avoid dangling pointer in scsi_requeue_command()
  Fix device removal NULL pointer dereference
  scsi_dh_alua: Re-enable STPG for unavailable ports

Ben Collins (4):
  aacraid: Fix endian issues in core and SRC portions of driver
  aacraid: Relax the tight timeout loop on fib commands
  aacraid: Better handling of in-flight events on thread stop
  aacraid: Use resource_size_t for IO mem pointers and offsets

Bhanu Prakash Gollapudi (4):
  bnx2fc: Bumped version to 1.0.12
  bnx2fc: use list_entry instead of explicit cast
  bnx2fc: Improve error recovery by handling parity errors
  bnx2fc: Support interface creation on non-VLAN interface also.

Cong Meng (1):
  virtio-scsi: hotplug support for virtio-scsi

Dan Carpenter (6):
  mvsas: remove unused variable in mvs_task_exec()
  megaraid: remove a spurious IRQ enable
  megaraid: cleanup type issue in mega_build_cmd()
  bfa: dereferencing freed memory in bfad_im_probe()
  bfa: off by one in bfa_ioc_mbox_isr()
  arcmsr: fix misuse of | instead of &

Dan Williams (17):
  Revert "[SCSI] fix async probe regression"
  cleanup usages of scsi_complete_async_scans
  queue async scan work to an async_schedule domain
  async: make async_synchronize_full() flush all work regardless of
domain
  async: introduce 'async_domain' type
  libsas: trim sas_task of slow path infrastructure
  libsas: drop sata port multiplier infrastructure
  libsas: fix sas_discover_devices return code handling
  libsas: continue revalidation
  isci: use sas eh strategy handlers
  libsas: use ->lldd_I_T_nexus_reset for ->eh_bus_reset_handler
  libsas: add sas_eh_abort_handler
  libsas: enforce eh strategy handlers only in eh context
  cleanup setting task state in scsi_error_handler()
  fix eh wakeup (scsi_schedule_eh vs scsi_restart_operations)
  libata, libsas: introduce sched_eh and end_eh port ops
  fix hot unplug vs async scan race

Eric Dumazet (1):
  bnx2fc: use kthread_create_on_node

HighPoint Linux Team (1):
  hptiop: fix RR312x in hosts with >12GB

James Bottomley (2):
  lpfc: fix problems with -Werror
  Remove scsi_wait_scan module

James Smart (10):
  lpfc 8.3.32: Update lpfc to version 8.3.32
  lpfc 8.3.32: Fix error reporting of misconfigured ports
  lpfc 8.3.32: Fix system panic due to node state change
  lpfc 8.3.32: Fix ability to change FCP EQ delay multiplier
  lpfc 8.3.32: Correct successful aborts returning error status
  lpfc 8.3.32: Correct provisioning change failure on local function
  lpfc 8.3.32: Correct host DIF configuration that hung system
  lpfc 8.3.32: Fix CQ and EQ dump failure for debugfs
  lpfc 8.3.32: Correct null pointer Error in lpfc_sli.c
  lpfc 8.3.32: lpfc_sli.c: add missing jumps to mempool_free

Jeff Skirvin (1):
  libsas: sas_rediscover_dev did not look at the SMP exec status.

Joe Perches (1):
  bnx2fc: Reduce object size by consolidating formats

Jon Mason (2):
  qla2xxx: remove unnecessary reads of PCI_CAP_ID_EXP
  qla4xxx: remove unnecessary read of PCI_CAP_ID_EXP

Josh Hunt (1):
  properly initialize atomic_t

Karen Xie (1):
  cxgb4i: tcp push bit fix

Krishna Gudipati (1):
  bfa: Fix to set correct return error codes and misc cleanup.

Kyle McMartin (1):
  bfa: squelch lockdep complaint with a spin_lock_init

Lin Ming (1):
  scsi_pm: set device runtime state before parent suspended

Maciej Trela (1):
  libsas: cleanup spurious calls to scsi_schedule_eh

Mahesh Rajashekhara (1):
  aacraid: Series 7 Async. (performance) mode support

Mark Rustad (1):
  libfcoe: Fix section mismatch

Mike Christie (3):
  remove old comment from block/unblock functions
  core, classes, mpt2sas: have scsi_internal_device_unblock take new
state
  add new SDEV_TRANSPORT_OFFLINE state

Mike Snitzer (1):
  scsi_dh: add scsi_dh_attached_handler_name

Muthukumar Ratty (1):
  block: Fix blk_execute_rq_nowait() dead queue handling

Namjae Jeon (4):
  ufs: fix incorrect return value about SUCCESS and FAILED
  usb-storage: update 

[GIT PULL] slave-dmaengine updates for v3.6

2012-07-23 Thread Vinod Koul

Hi Linus,

Here is the slave-dmaengine update for 3.6

This time we have a new dmaengine driver from the Tegra folks. We also have
Guennadi's cleanup of the sh drivers, which includes a library for sh drivers,
plus the usual odd fixes in a bunch of drivers and some nice cleanup of
dw_dmac from Andy.


The following changes since commit f8f5701bdaf9134b1f90e5044a82c66324d2073f:

  Linux 3.5-rc1

are available in the git repository at:

  git://git.infradead.org/users/vkoul/slave-dma.git next

Andy Shevchenko (12):
  dw_dmac: fix constant in the comment
  dw_dmac: use proper casting to print dma_addr_t values
  dw_dmac: introduce dwc_dump_chan_regs to dump registers
  dw_dmac: print correct number of scanned descriptors
  dw_dmac: use __func__ constant in the debug prints
  dw_dmac: disable dma in optimal way in probe
  dw_dmac: disable BLOCK interrupts
  dw_dmac: introduce dwc_fast_fls()
  dw_dmac: move from __init to __devinit
  dw_dmac: introduce dwc_chan_disable
  dw_dmac: mark dwc_dump_lli inline
  dw_dmac: use 'u32' for LLI structure members, not dma_addr_t

Attila Kinali (1):
  dma: mxs-dma: Export missing symbols from mxs-dma.c

Guennadi Liakhovetski (17):
  dma: move shdma driver to an own directory
  dmaengine: add an shdma-base library
  dma: shdma: prepare for conversion to the shdma base library
  mmc: sh_mmcif: remove unneeded struct sh_mmcif_dma, prepare to shdma 
conversion
  mmc: sh_mobile_sdhi: prepare for conversion to the shdma base library
  serial: sh-sci: prepare for conversion to the shdma base library
  ASoC: siu: prepare for conversion to the shdma base library
  usb: renesas_usbhs: prepare for conversion to the shdma base library
  ASoC: fsi: prepare for conversion to the shdma base library
  dma: shdma: convert to the shdma base library
  dmaengine: shdma: (cosmetic) simplify a static function
  ASoC: siu: don't use DMA device for channel filtering
  sh: remove unused DMA device pointer from SIU platform data
  dmaengine: shdma: prepare to stop using struct dma_chan::private
  dma: sh: use an integer slave ID to improve API compatibility
  dma: sh: provide a migration path for slave drivers to stop using .private
  mmc: sh_mmcif: switch to the new DMA channel allocation and configuration

Huang Shijie (1):
  dma: enable mxs-dma for imx6q

Javi Merino (1):
  DMA: PL330: Fix racy mutex unlock

Joe Perches (1):
  dmaengine: Cleanup logging messages

Lars-Peter Clausen (1):
  dmaengine: Add wrapper for device_tx_status callback

Laxman Dewangan (7):
  dma: dmaengine: add slave req id in slave_config
  dma: tegra: add dmaengine based dma driver
  dma: tegra: use sg_dma_address() for getting dma buffer address
  dma: tegra: do not set transfer desc flag to DMA_CTRL_ACK in cyclic mode
  dma: tegra: set DMA_CYCLIC capability
  dma: tegra: fix residual calculation for cyclic case
  dma: tegra: rename driver and compatible to match with dts

Linus Walleij (1):
  dma: coh901318: use devm allocation

Nicolas Ferre (1):
  dmaengine: at_hdmac: trivial: fix comment in header

Prashant Gaikwad (1):
  dma: tegra: add clk_prepare/clk_unprepare

Richard Zhao (1):
  dma: imx-sdma: buf_tail should be initialize in prepare function

Sachin Kamat (1):
  DMA: PL330: Add missing static storage class specifier

Tushar Behera (1):
  dmaengine: pl330: dont complete descriptor for cyclic dma

Uwe Kleine-König (1):
  dmaengine: at_hdmac: add a few const qualifiers

Vinod Koul (2):
  Merge branch 'fixes' into next
  dmaengine: mmp_tdma: fix the arch dependency

Zhangfei Gao (1):
  dmaengine: mmp_tdma: add mmp tdma support

 arch/sh/include/asm/siu.h              |    1 -
 arch/sh/kernel/cpu/sh4a/setup-sh7722.c |    1 -
 drivers/dma/Kconfig                    |   26 +-
 drivers/dma/Makefile                   |    4 +-
 drivers/dma/at_hdmac.c                 |   11 +-
 drivers/dma/coh901318.c                |   72 +-
 drivers/dma/dmaengine.c                |   20 +-
 drivers/dma/dw_dmac.c                  |  182 ++---
 drivers/dma/dw_dmac_regs.h             |    8 +-
 drivers/dma/imx-sdma.c                 |    6 +-
 drivers/dma/mmp_tdma.c                 |  610 +
 drivers/dma/mxs-dma.c                  |    3 +-
 drivers/dma/pl330.c                    |   30 +-
 drivers/dma/sh/Makefile                |    2 +
 drivers/dma/sh/shdma-base.c            |  934 +++
 drivers/dma/sh/shdma.c                 |  943
 drivers/dma/{ => sh}/shdma.h           |   46 +-
 drivers/dma/shdma.c                    | 1524
 drivers/dma/tegra20-apb-dma.c          | 1415 +
 drivers/mmc/host/sh_mmcif.c            |   94 ++-
 drivers/mmc/host/sh_mobile_sdhi.c      |    8 +-
 drivers/tty/serial/sh-sci.c            |    8 +-
 drivers/usb/renesas_usbhs/fifo.c 

Re: [RFC 0/2] virtio: provide a way for host to monitor critical events in the device

2012-07-23 Thread Rusty Russell
On Mon, 23 Jul 2012 22:32:39 +0200, Sasha Levin  wrote:
> As it was discussed recently, there's currently no way for the guest to notify
> the host about panics. Furthermore, there's no reasonable way to notify the
> host of other critical events such as an OOM kill.

I clearly missed the discussion.  Is this actually useful?  In practice,
won't you want the log from the guest?  What makes a virtual guest
different from a physical guest?

Guest watchdog functionality might be useful, but that's simpler to
implement via a virtio watchdog device, and more effective to implement
via a host facility that actually pings guest functionality (rather than
the kernel).

Cheers,
Rusty.


Re: [PATCH 02/20] mm: Add optional TLB flush to generic RCU page-table freeing

2012-07-23 Thread Nikunj A Dadhania
On Thu, 28 Jun 2012 01:01:46 +0200, Peter Zijlstra wrote:
  
> +#ifdef CONFIG_STRICT_TLB_FILL
> +/*
> + * Some architectures (sparc64, ppc) cannot refill TLBs after they've removed
> + * the PTE entries from their hash-table. Their hardware never looks at the
> + * linux page-table structures, so they don't need a hardware TLB invalidate
> + * when tearing down the page-table structure itself.
> + */
> +static inline void tlb_table_flush_mmu(struct mmu_gather *tlb) { }
> +#else
> +static inline void tlb_table_flush_mmu(struct mmu_gather *tlb)
> +{
> + tlb_flush_mmu(tlb);
> +}
> +#endif
> +
>  void tlb_table_flush(struct mmu_gather *tlb)
>  {
>   struct mmu_table_batch **batch = &tlb->batch;
>  
>   if (*batch) {
> + tlb_table_flush_mmu(tlb);
>   call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
>   *batch = NULL;
>   }

Hi Peter,

When running the munmap test (https://lkml.org/lkml/2012/5/17/59) with the KVM
and pvflush patches, I got a crash. I have verified that the crash
also happens on the base (non-virt) kernel when
CONFIG_HAVE_RCU_TABLE_FREE is defined. The crash details and my
analysis are below:

---

BUG: unable to handle kernel NULL pointer dereference at 0008
IP: [] __call_rcu+0x29/0x1c0
PGD 0 
Oops: 0002 [#1] SMP 
CPU 24 
Modules linked in: kvm_intel kvm [last unloaded: scsi_wait_scan]


Pid: 32643, comm: munmap Not tainted 3.5.0-rc7+ #46 IBM System x3850 X5 
-[7042CR6]-[root@mx3850x5 ~/Node 1, Processor Card]# 
RIP: 0010:[]  [] __call_rcu+0x29/0x1c0
RSP: 0018:88203164fc28  EFLAGS: 00010246
RAX: 88203164fba8 RBX:  RCX: 
RDX: 81e34280 RSI: 81130330 RDI: 
RBP: 88203164fc58 R08: ea00d2680340 R09: 
R10: 883c7fbd4ef8 R11: 0078 R12: 81130330
R13: 7f09ee803000 R14: 883c2fa5bab0 R15: 88203164fe08
FS:  7f09ee7ee700() GS:883c7fc0() knlGS:
CS:  0010 DS:  ES:  CR0: 8005003b
CR2: 0008 CR3: 01e0b000 CR4: 07e0
DR0:  DR1:  DR2: 
DR3:  DR6: 0ff0 DR7: 0400
Process munmap (pid: 32643, threadinfo 88203164e000, task 882030458a70)
Stack:
 883c2fa5bab0 88203164fe08 88203164fc68 88203164fe08
 88203164fe08 7f09ee803000 88203164fc68 810d33c7
 88203164fc88 81130e0d 88203164fc88 ea00d28e54f8
Call Trace:
 [] call_rcu_sched+0x17/0x20
 [] tlb_table_flush+0x2d/0x40
 [] tlb_remove_table+0x60/0xc0
 [] ___pte_free_tlb+0x63/0x70
 [] free_pgd_range+0x298/0x4b0
 [] free_pgtables+0xce/0x120
 [] exit_mmap+0xa7/0x160
 [] mmput+0x6f/0xf0
 [] exit_mm+0x105/0x130
 [] ? taskstats_exit+0x17d/0x240
 [] do_exit+0x176/0x480
 [] do_group_exit+0x55/0xd0
 [] sys_exit_group+0x17/0x20
 [] system_call_fastpath+0x16/0x1b
Code: ff ff 55 48 89 e5 48 83 ec 30 48 89 5d e8 4c 89 65 f0 4c 89 6d f8 66 66 
66 66 90 40 f6 c7 03 48 89 fb 49 89 f4 0f 85 19 01 00 00 <4c> 89 63 08 48 c7 03 
00 00 00 00 0f ae f0 9c 58 66 66 90 66 90 
RIP  [] __call_rcu+0x29/0x1c0
 RSP 
CR2: 0008
---[ end trace 3ed30a91ea7cb375 ]---



I think this is what is happening:

___pte_free_tlb
   tlb_remove_table
      tlb_table_flush
         tlb_table_flush_mmu
            tlb_flush_mmu
               Sets need_flush = 0
               tlb_table_flush (if CONFIG_HAVE_RCU_TABLE_FREE)
                  [Gets called twice with same *tlb!]
                  tlb_table_flush_mmu
                     tlb_flush_mmu (nop as need_flush is 0)
                  call_rcu_sched(&(*batch)->rcu,...);
                  *batch = NULL;
         call_rcu_sched(&(*batch)->rcu,...); <-- *batch would be NULL

I verified this by applying the following fix, after which I no longer
see the crash:

diff --git a/mm/memory.c b/mm/memory.c
index 1797bc1..329fcb9 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -367,7 +367,8 @@ void tlb_table_flush(struct mmu_gather *tlb)
 
 	if (*batch) {
 		tlb_table_flush_mmu(tlb);
-		call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
+		if(*batch)
+			call_rcu_sched(&(*batch)->rcu, tlb_remove_table_rcu);
 		*batch = NULL;
 	}
 }
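
The re-entrancy described above can be reproduced in a small user-space model. This is only a sketch: the names (struct gather, table_flush, flush_mmu, queue_rcu, run_model) are hypothetical stand-ins for the kernel code, and the model assumes that the flush path re-enters the table flush with the same gather, as in the call chain above. Instead of dereferencing NULL, it counts how often a NULL batch would have been handed to RCU.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel structures; not the real definitions. */
struct table_batch { int nr; };

struct gather {
    struct table_batch *batch; /* pending page-table pages */
    bool need_flush;           /* a TLB flush is pending */
    int rcu_queued;            /* times a valid batch was handed to RCU */
    int null_derefs;           /* times a NULL batch would have been used */
};

static void table_flush(struct gather *tlb, bool recheck);

/* Models tlb_flush_mmu(): clears need_flush, then flushes the table batch. */
static void flush_mmu(struct gather *tlb, bool recheck)
{
    if (!tlb->need_flush)
        return;
    tlb->need_flush = false;
    table_flush(tlb, recheck); /* re-entry with the same gather */
}

/* Models call_rcu_sched(&(*batch)->rcu, ...): records instead of crashing. */
static void queue_rcu(struct gather *tlb, struct table_batch *b)
{
    if (b == NULL)
        tlb->null_derefs++;
    else
        tlb->rcu_queued++;
}

/* Models tlb_table_flush(); 'recheck' selects the variant with the fix. */
static void table_flush(struct gather *tlb, bool recheck)
{
    if (tlb->batch) {
        flush_mmu(tlb, recheck); /* may set tlb->batch to NULL */
        if (!recheck || tlb->batch)
            queue_rcu(tlb, tlb->batch);
        tlb->batch = NULL;
    }
}

/* Runs one flush and reports the counters through the out parameters. */
void run_model(bool recheck, int *queued, int *null_derefs)
{
    struct table_batch b = { .nr = 1 };
    struct gather tlb = { .batch = &b, .need_flush = true };

    assert(queued != NULL && null_derefs != NULL);
    table_flush(&tlb, recheck);
    *queued = tlb.rcu_queued;
    *null_derefs = tlb.null_derefs;
}
```

In this model, run_model(false, ...) hands the batch to RCU once from the nested call and then records one use of a NULL batch in the outer call, while run_model(true, ...) queues the batch once and records none. It says nothing about whether rechecking *batch is the right fix for the real code, only that it avoids the second call.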

Thanks
Nikunj



Re: [PATCH V2] dma: tegra: enable/disable dma clock

2012-07-23 Thread Laxman Dewangan

On Tuesday 24 July 2012 10:38 AM, Vinod Koul wrote:
> On Fri, 2012-07-20 at 13:31 +0530, Laxman Dewangan wrote:
>> Enable the DMA clock when allocating channel and
>> disable clock when freeing channels.
>>
>> Signed-off-by: Laxman Dewangan
>> ---
>> +   clk_disable_unprepare(tdma->dma_clk);
>
> What if another channel is active, disabling clock can cause bad
> behavior. You should check here if all channels are idle and then
> disable, or is this handled by clock API?

Yes, the clock driver keeps a reference count, so the client driver does
not need to take care of this.
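
As an illustration of the reference counting mentioned above, here is a small user-space sketch. It is not the kernel clock framework: struct model_clk and the model_clk_* functions are hypothetical names, and the sketch only assumes the behaviour stated in this reply, namely that the clock stays on until the last user disables it.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a shared clock; not the kernel's struct clk. */
struct model_clk {
    int enable_count; /* number of outstanding enables */
    bool running;     /* whether the clock is currently on */
};

/* Each channel allocation takes a reference; the first one turns the clock on. */
void model_clk_enable(struct model_clk *clk)
{
    if (clk->enable_count++ == 0)
        clk->running = true;
}

/* Each channel free drops a reference; only the last one turns the clock off. */
void model_clk_disable(struct model_clk *clk)
{
    assert(clk->enable_count > 0); /* an unbalanced disable is a caller bug */
    if (--clk->enable_count == 0)
        clk->running = false;
}
```

With two channels allocated, freeing the first leaves the model clock running; it stops only when the second channel is freed as well.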


Thanks,
Laxman




[ANNOUNCE] Git v1.7.12-rc0

2012-07-23 Thread Junio C Hamano
A release candidate Git v1.7.12-rc0 is now available for testing
at the usual places.

The release tarballs are found at:

http://code.google.com/p/git-core/downloads/list

and their SHA-1 checksums are:

09016e819a69b49090756e9bc5c97a4df25c2f78  git-1.7.12.rc0.tar.gz
e85ad0780ff81eacdb05a10762060812bc9367dd  git-htmldocs-1.7.12.rc0.tar.gz
b641a9664c333518ede3b1d8b67d84d18f5b0e14  git-manpages-1.7.12.rc0.tar.gz

Also the following public repositories all have a copy of the v1.7.12-rc0
tag and the master branch that the tag points at:

  url = git://repo.or.cz/alt-git.git
  url = https://code.google.com/p/git-core/
  url = git://git.sourceforge.jp/gitroot/git-core/git.git
  url = git://git-core.git.sourceforge.net/gitroot/git-core/git-core
  url = https://github.com/gitster/git

Git v1.7.12 Release Notes (draft)
=================================

Updates since v1.7.11
---------------------

UI, Workflows & Features

 * Git can be told to normalize pathnames it read from readdir(3) and
   all arguments it got from the command line into precomposed UTF-8
   (assuming that they come as decomposed UTF-8), in order to work
   around issues on Mac OS.

   I think there still are other places that need conversion
   (e.g. paths that are read from stdin for some commands), but this
   should be a good first step in the right direction.

 * Per-user $HOME/.gitconfig file can optionally be stored in
   $HOME/.config/git/config instead, which is in line with XDG.

 * The values of core.attributesfile and core.excludesfile default to
   $HOME/.config/git/attributes and $HOME/.config/git/ignore respectively
   when these files exist.

 * Logic to disambiguate abbreviated object names has been taught to
   take advantage of object types that are expected in the context,
   e.g. XX in the "git describe" output v1.2.3-gXX must be a
   commit object, not a blob nor a tree.  This will help us prolong
   the lifetime of abbreviated object names.

 * "git apply" learned to wiggle the base version and perform three-way
   merge when a patch does not exactly apply to the version you have.

 * Scripted Porcelain writers now have access to the credential API via
   the "git credential" plumbing command.

 * "git help" used to always default to "man" format even on platforms
   where "man" viewer is not widely available.

 * "git clone --local $path" started its life as an experiment to
   optionally use link/copy when cloning a repository on the disk, but
   we didn't deprecate it after we made the option a no-op to always
   use the optimization.  The command learned "--no-local" option to
   turn this off, as a more explicit alternative over use of file://
   URL.

 * "git fetch" and friends used to say "remote side hung up
   unexpectedly" when they failed to get response they expect from the
   other side, but one common reason why they don't get expected
   response is that the remote repository does not exist or cannot be
   read. The error message in this case was updated to give better
   hints to the user.

 * git native protocol agents learned to show software version over
   the wire, so that the server log can be examined to see the vintage
   distribution of clients.

 * "git help -w $cmd" can show HTML version of documentation for
   "git-$cmd" by setting help.htmlpath to somewhere other than the
   default location where the build procedure installs them locally;
   the variable can even point at a http:// URL.

 * "git rebase [-i] --root $tip" can now be used to rewrite all the
   history leading to "$tip" down to the root commit.

 * "git rebase -i" learned "-x <cmd>" to insert "exec <cmd>" after
   each commit in the resulting history.

 * "git status" gives finer classification to various states of paths
   in conflicted state and offers advice messages in its output.

 * "git submodule" learned to deal with nested submodule structure
   where a module is contained within a module whose origin is
   specified as a relative URL to its superproject's origin.

 * A rather heavy-ish "git completion" script has been split to create
   a separate "git prompting" script, to help lazy-autoloading of the
   completion part while making prompting part always available.

 * "gitweb" pays attention to various forms of credits that are
   similar to "Signed-off-by:" lines in the commit objects and
   highlights them accordingly.


Foreign Interface

 * "mediawiki" remote helper (in contrib/) learned to handle file
   attachments.

 * "git p4" now uses "Jobs:" and "p4 move" when appropriate.

 * vcs-svn has been updated to clean-up compilation, lift 32-bit
   limitations, etc.


Performance, Internal Implementation, etc. (please report possible regressions)

 * Some tests showed false failures caused by a bug in ecryptfs.

 * We no longer use AsciiDoc7 syntax in our documentation and favor a
   more modern style.

 * "git am --rebasing" codepath was taught to grab authorship, log
   message and the patch text directly out of existing 

Re: [PATCH V2] dma: tegra: enable/disable dma clock

2012-07-23 Thread Vinod Koul
On Fri, 2012-07-20 at 13:31 +0530, Laxman Dewangan wrote:
> Enable the DMA clock when allocating channel and
> disable clock when freeing channels.
> 
> Signed-off-by: Laxman Dewangan 
> ---
> Changes from V1 to V2:
> - Enable/disable clock when allocating/freeing channels.
> - rewrite the description to reflect change.
> 
>  drivers/dma/tegra20-apb-dma.c |   18 +-
>  1 files changed, 17 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/dma/tegra20-apb-dma.c b/drivers/dma/tegra20-apb-dma.c
> index d52dbc6..24acd71 100644
> --- a/drivers/dma/tegra20-apb-dma.c
> +++ b/drivers/dma/tegra20-apb-dma.c
> @@ -1119,15 +1119,21 @@ struct dma_async_tx_descriptor 
> *tegra_dma_prep_dma_cyclic(
>  static int tegra_dma_alloc_chan_resources(struct dma_chan *dc)
>  {
>   struct tegra_dma_channel *tdc = to_tegra_dma_chan(dc);
> + struct tegra_dma *tdma = tdc->tdma;
> + int ret;
>  
>   dma_cookie_init(&tdc->dma_chan);
>   tdc->config_init = false;
> - return 0;
> + ret = clk_prepare_enable(tdma->dma_clk);
> + if (ret < 0)
> + dev_err(tdc2dev(tdc), "clk_prepare_enable failed: %d\n", ret);
> + return ret;
>  }
>  
>  static void tegra_dma_free_chan_resources(struct dma_chan *dc)
>  {
>   struct tegra_dma_channel *tdc = to_tegra_dma_chan(dc);
> + struct tegra_dma *tdma = tdc->tdma;
>  
>   struct tegra_dma_desc *dma_desc;
>   struct tegra_dma_sg_req *sg_req;
> @@ -1163,6 +1169,7 @@ static void tegra_dma_free_chan_resources(struct 
> dma_chan *dc)
>   list_del(&sg_req->node);
>   kfree(sg_req);
>   }
> + clk_disable_unprepare(tdma->dma_clk);
What if another channel is active? Disabling the clock here could cause bad
behavior. You should check that all channels are idle before disabling it,
or is this handled by the clock API?
>  }
>  
>  /* Tegra20 specific DMA controller information */
> @@ -1255,6 +1262,13 @@ static int __devinit tegra_dma_probe(struct 
> platform_device *pdev)
>   }
>   }
>  
> + /* Enable clock before accessing registers */
> + ret = clk_prepare_enable(tdma->dma_clk);
> + if (ret < 0) {
> + dev_err(&pdev->dev, "clk_prepare_enable failed: %d\n", ret);
> + goto err_pm_disable;
> + }
> +
>   /* Reset DMA controller */
>   tegra_periph_reset_assert(tdma->dma_clk);
>   udelay(2);
> @@ -1265,6 +1279,8 @@ static int __devinit tegra_dma_probe(struct 
> platform_device *pdev)
>   tdma_write(tdma, TEGRA_APBDMA_CONTROL, 0);
>   tdma_write(tdma, TEGRA_APBDMA_IRQ_MASK_SET, 0xFFFFFFFFul);
>  
> + clk_disable_unprepare(tdma->dma_clk);
> +
>   INIT_LIST_HEAD(&tdma->dma_dev.channels);
>   for (i = 0; i < cdata->nr_channels; i++) {
>   struct tegra_dma_channel *tdc = &tdma->channels[i];
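
On the question above of whether freeing one channel could gate the clock under another active channel: here is a minimal userspace sketch, assuming the clock API reference-counts its enable/disable calls. `toy_clk`, `toy_clk_enable()` and `toy_clk_disable()` are hypothetical names used only for illustration; they are not kernel symbols and this is not the Tegra driver's code.

```c
#include <assert.h>

/* Hypothetical stand-in for a reference-counted clock; not the kernel's struct clk. */
struct toy_clk {
    unsigned int enable_count; /* number of outstanding enable calls */
    int running;               /* 1 while the hardware clock would be on */
};

/* Take a reference; the clock is switched on only on the 0 -> 1 transition. */
int toy_clk_enable(struct toy_clk *clk)
{
    if (clk->enable_count++ == 0)
        clk->running = 1;
    return 0;
}

/* Drop a reference; the clock is switched off only on the 1 -> 0 transition. */
void toy_clk_disable(struct toy_clk *clk)
{
    assert(clk->enable_count > 0); /* an unbalanced disable would be a bug */
    if (--clk->enable_count == 0)
        clk->running = 0;
}
```

Under that assumption, each allocated channel holds one reference, so the clock stays on until the last channel is freed. If the clock API in use does not count references, the driver would need its own check that all channels are idle.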


-- 
~Vinod

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] firmware_map : unify argument of firmware_map_add_early/hotplug

2012-07-23 Thread Yasuaki Ishimatsu
The patch is a bugfix, so I would like you to review it and merge it into linux-3.6.

2012/07/17 11:17, Yasuaki Ishimatsu wrote:
> There are two ways to create /sys/firmware/memmap/X sysfs:
> 
>- firmware_map_add_early
>  When the system starts, it is called from e820_reserve_resources()
>- firmware_map_add_hotplug
>  When the memory is hot plugged, it is called from add_memory()
> 
> But these functions are called with inconsistent values for the end
> argument, as shown below:
> 
>- end argument of firmware_map_add_early()   : start + size - 1
>- end argument of firmware_map_add_hotplug() : start + size
> 
> The patch unifies them to "start + size". Even with the patch applied,
> the content of the /sys/firmware/memmap/X/end file does not change.
> 
> CC: Thomas Gleixner 
> CC: Ingo Molnar 
> CC: H. Peter Anvin 
> CC: Tejun Heo 
> CC: Andrew Morton 
> Reviewed-by: Dave Hansen 
> Signed-off-by: Yasuaki Ishimatsu 
> 
> ---
>   arch/x86/kernel/e820.c|2 +-
>   drivers/firmware/memmap.c |8 
>   2 files changed, 5 insertions(+), 5 deletions(-)
> 
> Index: linux-next/arch/x86/kernel/e820.c
> ===
> --- linux-next.orig/arch/x86/kernel/e820.c2012-07-02 09:50:23.0 
> +0900
> +++ linux-next/arch/x86/kernel/e820.c 2012-07-12 13:30:45.942318179 +0900
> @@ -944,7 +944,7 @@
>   for (i = 0; i < e820_saved.nr_map; i++) {
>   struct e820entry *entry = &e820_saved.map[i];
>   firmware_map_add_early(entry->addr,
> - entry->addr + entry->size - 1,
> + entry->addr + entry->size,
>   e820_type_to_string(entry->type));
>   }
>   }
> Index: linux-next/drivers/firmware/memmap.c
> ===
> --- linux-next.orig/drivers/firmware/memmap.c 2012-07-02 09:50:26.0 
> +0900
> +++ linux-next/drivers/firmware/memmap.c  2012-07-12 13:40:53.823318481 
> +0900
> @@ -98,7 +98,7 @@
>   /**
>* firmware_map_add_entry() - Does the real work to add a firmware memmap 
> entry.
>* @start: Start of the memory range.
> - * @end:   End of the memory range (inclusive).
> + * @end:   End of the memory range.
>* @type:  Type of the memory range.
>* @entry: Pre-allocated (either kmalloc() or bootmem allocator), 
> uninitialised
>* entry.
> @@ -113,7 +113,7 @@
>   BUG_ON(start > end);
>   
>   entry->start = start;
> - entry->end = end;
> + entry->end = end - 1;
>   entry->type = type;
>   INIT_LIST_HEAD(&entry->list);
>   kobject_init(&entry->kobj, &memmap_ktype);
> @@ -148,7 +148,7 @@
>* firmware_map_add_hotplug() - Adds a firmware mapping entry when we do
>* memory hotplug.
>* @start: Start of the memory range.
> - * @end:   End of the memory range (inclusive).
> + * @end:   End of the memory range.
>* @type:  Type of the memory range.
>*
>* Adds a firmware mapping entry. This function is for memory hotplug, it is
> @@ -175,7 +175,7 @@
>   /**
>* firmware_map_add_early() - Adds a firmware mapping entry.
>* @start: Start of the memory range.
> - * @end:   End of the memory range (inclusive).
> + * @end:   End of the memory range.
>* @type:  Type of the memory range.
>*
>* Adds a firmware mapping entry. This function uses the bootmem allocator
> 
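
The convention in the quoted patch (callers pass an exclusive end of `start + size`, the stored entry keeps the inclusive end) can be sketched in userspace as follows. This is a minimal illustration under stated assumptions: `toy_map_entry` and `toy_map_add()` are hypothetical names, and the `start < end` check is a choice made for this sketch, not the kernel's `BUG_ON(start > end)`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical miniature of a firmware memmap entry; not the kernel struct. */
struct toy_map_entry {
    uint64_t start; /* first byte of the range */
    uint64_t end;   /* last byte of the range (inclusive), as exposed to users */
};

/*
 * Callers pass an exclusive end (start + size); the entry stores the
 * inclusive end, mirroring "entry->end = end - 1" in the quoted patch.
 */
void toy_map_add(struct toy_map_entry *entry, uint64_t start, uint64_t end)
{
    assert(start < end); /* sketch-only check: an empty range has no inclusive end */
    entry->start = start;
    entry->end = end - 1;
}
```

With this shape, both the boot-time and the hotplug caller can pass `start + size`, and the value a reader sees for the end of the range stays the inclusive one.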




Re: [GIT PULL] Blackfin changes for 3.6-rc1

2012-07-23 Thread Linus Torvalds
On Mon, Jul 23, 2012 at 8:54 PM, Bob Liu  wrote:
>
> Please pull blackfin changes for 3.6-rc1.

No.

These were clearly rebased today. And on top of random state in the
merge window.

Why would you do that? It's so horribly wrong that there's no way in
hell I'm pulling this mess.

  Linus


Re: [PATCH 2/2] regulator: tps6586x: add support for input supply

2012-07-23 Thread Laxman Dewangan

On Tuesday 24 July 2012 02:21 AM, Stephen Warren wrote:
> On 07/13/2012 07:50 AM, Laxman Dewangan wrote:
>> sm[0-2], ldo[0-9] and ldo_rtc
>> +- sm0-supply: The input supply for the SM0.
>> +- sm1-supply: The input supply for the SM1.
>> +- sm2-supply: The input supply for the SM2.
>> +- vinldo01-supply: The input supply for the LDO1 and LDO2
>> +- vinldo23-supply: The input supply for the LDO2 and LDO3
>> +- vinldo4-supply: The input supply for the LDO4
>> +- vinldo678-supply: The input supply for the LDO6, LDO7 and LDO8
>> +- vinldo9-supply: The input supply for the LDO9
>
> Hmm. The signal names in my data sheet are VIN_SMn and VINLDOn, so
> having "vin" in just some of the property names seems a little inconsistent.

My bad, not sure why I missed it.  I will send a patch correcting
this before the Tegra ventana/harmony boards fill in the dt entry.




Re: next/mmotm unbootable on G5: irqdomain

2012-07-23 Thread Grant Likely
On Mon, Jul 23, 2012 at 9:21 PM, Benjamin Herrenschmidt
 wrote:
> On Mon, 2012-07-23 at 16:32 -0600, Grant Likely wrote:
>> > As-is I'm backing off from the linear/legacy/tree merge patch as just
>> > too risky. I've already pulled that stuff out of linux-next.
>>
>> Can I pull your pseries fix into my tree (my preference), or do I need
>> to rebase on top of yours?
>
> The mpic fix for the g5 is in Linus tree already, I added it on top of
> powerpc -next before I asked Linus to pull.
>
> For pseries (ie the fix for irq_find_mapping vs. radix), I don't have a
> formal patch, just the one I hand typed in my previous email, so do
> whatever you want with it.

Okay, I'll merge in Linus' tree at the appropriate point to protect
against bisection, and I'll fix up the appropriate patch that touches
irq_find_mapping.

g.


Re: linux-next: build failure after merge of the gpio-lw tree

2012-07-23 Thread Kuninori Morimoto

Hi Thomas

Could you please tell me the current status of these patches?

Kuninori Morimoto (2):
  genirq: export irq_set_chip_and_handler_name()
  genirq: export dummy_irq_chip

At Mon, 9 Jul 2012 22:34:23 +0200,
Linus Walleij wrote:
> 
> On Mon, Jul 9, 2012 at 4:04 AM, Kuninori Morimoto
>  wrote:
> 
> > Hi Linus Walleij, Stephen, and Thomas
> >
> >> >> > After merging the gpio-lw tree, today's linux-next build (x86_64
> >> >> > allmodconfig) failed like this:
> >> >> >
> >> >> > ERROR: "irq_set_chip_and_handler_name" [drivers/gpio/gpio-pcf857x.ko] 
> >> >> > undefined!
> >> >> > ERROR: "dummy_irq_chip" [drivers/gpio/gpio-pcf857x.ko] undefined!
> >> >>
> >> >> Thanks, I've dropped the offending patch, Kuninori can you look into 
> >> >> this and
> >> >> provide a new patch? It's the second patch from your patch set.
> >> >
> >> > OK. I will, but it will be next week.
> >> > And could you please show me where is your repository/branch ?
> >>
> >> http://git.kernel.org/?p=linux/kernel/git/linusw/linux-gpio.git;a=summary
> >> branch devel/for-next
> >
> > In my check, these are export symbol issues.
> > I think the above 2 function/struct were not exported for modules.
> >
> > Is it possible to solve this issue by these patches?
> 
> Hm Thomas has to answer to that (and merge the patches, if he
> likes them).
> 
> Yours,
> Linus Walleij


Best regards
---
Kuninori Morimoto


[GIT PULL] Blackfin changes for 3.6-rc1

2012-07-23 Thread Bob Liu
Hi Linus,

Please pull blackfin changes for 3.6-rc1.
The big changes add PM and HDMI support for bf60x; the other patches are
various bug fixes and code cleanups.

Thanks,
-Bob

The following changes since commit 97e7292ab5ccd30a13c3612835535fc3f3e59715:

  Merge tag 'clk' of git://git.kernel.org/pub/scm/linux/kernel/git/arm/arm-soc 
(2012-07-23 17:51:03 -0700)

are available in the git repository at:


  git://git.kernel.org/pub/scm/linux/kernel/git/lliubbo/blackfin.git for-linus

for you to fetch changes up to ff765054a1d6461a0724443084f806455320c2ef:

  bf60x: fix build warning (2012-07-24 11:44:00 +0800)


Bob Liu (7):
  bfin: reorg clock init steps for bf609
  blackfin: Kconfig: fix ROM range for bf60x
  blackfin: mach-common: ints-priority add irq_set_wake
  blackfin: bf609-ezkit: add probe_type for norflash
  blackfin: fix musb macro name
  blackfin: cplb-nompu: fix ROM cplb size for bf609-ezkit
  bf60x: fix build warning

Scott Jiang (10):
  bf609: change ad7877 cs and irq pin
  bfin: add 32M, 16M and 8M uncached DMA region options
  v4l2: add adv7842 video decoder driver
  bf609: add ssm2602 support on bf609 platform
  bf609: add CVBS and S-Video support for adv7842
  bf609: add HDMI support for adv7842
  bf609: convert vs6624 blank_clocks to black_pixels
  bf561: add capabilities in adv7183_inputs
  bf609: reuse bf5xx-i2s-pcm.c as i2s pcm driver
  bf609: add adv7511 display support

Sonic Zhang (16):
  blackfin: Call sg_for_each to pass through the whole sg list.
  bf609: crypto: Add blackfin crypto crc driver platform data.
  bf60x: Enable Blackfin CRC crypto driver by default.
  bf60x: bfin_crc: move structure bfin_crc out of head file.
  bf609: bfin_crc: Remove unused CRC TX DMA platform resources.
  bfin: pm: add deepsleep for bf60x
  bf60x: Add wake up latency bench for deep sleep mode.
  i2c: i2c-bfin-twi: Always access 16 bit MMR by bfin 16 bit access Macro.
  bf60x: sec: Clean up interrupt initialization code for SEC.
  bf60x: sec: Enable sec interrupt source priority configuration.
  bf60x: update bf60x anomaly list.
  bf60x: add default anomaly setting.
  bf60x: update anomaly id in serial and twi driver headers.
  bf60x: Add double fault, hardware error and NMI SEC handler
  bf60x: cpufreq: fix anomaly 05000273
  blackfin: twi: read twi mmr via bfin_read macro

Steven Miao (14):
  pm: dpmc macro typo fix
  bfin-dma: only use MDMA3 on bf609
  irq: set cgu event handle to fasteoi handle
  cpufreq: change debug message level to show clock change error
  cache: enable L2 sram icache in menuconfig
  bfin: simple_timer: add READ_COUNTER ioctl and add NOIRQ timer mode
  bf60x: pm: add smc nor flash syscore ops
  bf60x: pm: pass wakeup param
  gpiokeys: add gpio keyboard platform device
  bf60x: pm: add pint suspend and resume support
  bfin: pint: add pint suspend and resume
  cleanup: sec and linkport only built on bf60x
  dpm: deepsleep: reserve stack
  PM: add BF60x flash suspend and resume support

Vivi Li (1):
  bf60x: vs6624 pin update

 arch/blackfin/Kconfig  |   16 +-
 arch/blackfin/configs/BF609-EZKIT_defconfig|2 +
 arch/blackfin/include/asm/bfin-global.h|8 +-
 arch/blackfin/include/asm/bfin_crc.h   |   14 -
 arch/blackfin/include/asm/bfin_serial.h|2 +-
 arch/blackfin/include/asm/bfin_simple_timer.h  |6 +
 arch/blackfin/include/asm/bfin_twi.h   |   10 +-
 arch/blackfin/include/asm/context.S|9 +
 arch/blackfin/include/asm/dpmc.h   |2 +-
 arch/blackfin/include/asm/gpio.h   |2 +
 arch/blackfin/include/asm/irq.h|   10 +
 arch/blackfin/include/asm/mem_init.h   |  212 +
 arch/blackfin/include/asm/traps.h  |2 +
 arch/blackfin/kernel/bfin_dma.c|4 +-
 arch/blackfin/kernel/cplb-nompu/cplbinit.c |8 +
 arch/blackfin/kernel/dma-mapping.c |   10 +-
 arch/blackfin/mach-bf527/boards/ezkit.c|4 +-
 arch/blackfin/mach-bf548/boards/ezkit.c|4 +-
 arch/blackfin/mach-bf548/include/mach/gpio.h   |2 +
 arch/blackfin/mach-bf561/boards/ezkit.c|3 +
 arch/blackfin/mach-bf609/Kconfig   |8 +
 arch/blackfin/mach-bf609/Makefile  |4 +-
 arch/blackfin/mach-bf609/boards/ezkit.c|  304 +-
 arch/blackfin/mach-bf609/clock.c   |3 +-
 arch/blackfin/mach-bf609/dpm.S |  157 ++
 arch/blackfin/mach-bf609/hibernate.S   |   65 
 arch/blackfin/mach-bf609/include/mach/anomaly.h|  141 -
 

[PATCH NEXT v2]staging: tidspbridge: Fix typos.

2012-07-23 Thread Justin P. Mattock
From: "Justin P. Mattock" 

Signed-off-by: Justin P. Mattock 

---

The below patch fixes typos found while reading through staging: tidspbridge:

 .../staging/tidspbridge/Documentation/error-codes  |2 +-
 drivers/staging/tidspbridge/core/_tiomap.h |2 +-
 drivers/staging/tidspbridge/core/chnl_sm.c |6 +++---
 drivers/staging/tidspbridge/core/io_sm.c   |   10 +-
 drivers/staging/tidspbridge/core/sync.c|2 +-
 drivers/staging/tidspbridge/core/tiomap3430.c  |4 ++--
 drivers/staging/tidspbridge/core/tiomap3430_pwr.c  |2 +-
 drivers/staging/tidspbridge/dynload/tramp.c|8 
 drivers/staging/tidspbridge/hw/hw_mmu.c|6 +++---
 .../tidspbridge/include/dspbridge/dspioctl.h   |2 +-
 .../staging/tidspbridge/include/dspbridge/mbx_sh.h |2 +-
 .../staging/tidspbridge/include/dspbridge/node.h   |2 +-
 .../staging/tidspbridge/include/dspbridge/ntfy.h   |2 +-
 .../staging/tidspbridge/include/dspbridge/proc.h   |2 +-
 .../staging/tidspbridge/include/dspbridge/strm.h   |2 +-
 .../staging/tidspbridge/include/dspbridge/sync.h   |4 ++--
 drivers/staging/tidspbridge/rmgr/dbdcd.c   |2 +-
 drivers/staging/tidspbridge/rmgr/dspdrv.c  |4 ++--
 drivers/staging/tidspbridge/rmgr/mgr.c |4 ++--
 drivers/staging/tidspbridge/rmgr/nldr.c|2 +-
 drivers/staging/tidspbridge/rmgr/node.c|2 +-
 drivers/staging/tidspbridge/rmgr/proc.c|2 +-
 22 files changed, 37 insertions(+), 37 deletions(-)

diff --git a/drivers/staging/tidspbridge/Documentation/error-codes 
b/drivers/staging/tidspbridge/Documentation/error-codes
index 12826e2..ad73cba 100644
--- a/drivers/staging/tidspbridge/Documentation/error-codes
+++ b/drivers/staging/tidspbridge/Documentation/error-codes
@@ -69,7 +69,7 @@ The error codes used by this driver are:
 Invalid pointer or handler.
 
 [EEXIST]
-Attempted to create a channel manager  when one already exists.
+Attempted to create a channel manager when one already exists.
 
 [EINVAL]
 Invalid argument.
diff --git a/drivers/staging/tidspbridge/core/_tiomap.h 
b/drivers/staging/tidspbridge/core/_tiomap.h
index 7cb5871..543a127 100644
--- a/drivers/staging/tidspbridge/core/_tiomap.h
+++ b/drivers/staging/tidspbridge/core/_tiomap.h
@@ -219,7 +219,7 @@ static const struct map_l4_peripheral l4_peripheral_table[] 
= {
 /* MBX_PM_MAX_RESOURCES: CORE 2 Clock Resources. */
 #define MBX_CORE2_RESOURCES 1
 
-/* MBX_PM_MAX_RESOURCES: TOTAL Clock Reosurces. */
+/* MBX_PM_MAX_RESOURCES: TOTAL Clock Resources. */
 #define MBX_PM_MAX_RESOURCES 11
 
 /*  Power Management Commands */
diff --git a/drivers/staging/tidspbridge/core/chnl_sm.c 
b/drivers/staging/tidspbridge/core/chnl_sm.c
index e0c7e4c..f38950e 100644
--- a/drivers/staging/tidspbridge/core/chnl_sm.c
+++ b/drivers/staging/tidspbridge/core/chnl_sm.c
@@ -20,7 +20,7 @@
  *  The lower edge functions must be implemented by the Bridge driver
  *  writer, and are declared in chnl_sm.h.
  *
- *  Care is taken in this code to prevent simulataneous access to channel
+ *  Care is taken in this code to prevent simultaneous access to channel
  *  queues from
  *  1. Threads.
  *  2. io_dpc(), scheduled from the io_isr() as an event.
@@ -34,7 +34,7 @@
  *  Channel Invariant:
  *  There is an important invariant condition which must be maintained per
  *  channel outside of bridge_chnl_get_ioc() and IO_Dispatch(), violation 
of
- *  which may cause timeouts and/or failure offunction sync_wait_on_event.
+ *  which may cause timeouts and/or failure of function sync_wait_on_event.
  *  This invariant condition is:
  *
  *  list_empty(&pchnl->io_completions) ==> pchnl->sync_event is reset
@@ -602,7 +602,7 @@ int bridge_chnl_get_ioc(struct chnl_object *chnl_obj, u32 
timeout,
/*  Since DSPStream_Reclaim() does not take a timeout
 *  parameter, we pass the stream's timeout value to
 *  bridge_chnl_get_ioc. We cannot determine whether or not
-*  we have waited in User mode. Since the stream's timeout
+*  we have waited in user mode. Since the stream's timeout
 *  value may be non-zero, we still have to set the event.
 *  Therefore, this optimization is taken out.
 *
diff --git a/drivers/staging/tidspbridge/core/io_sm.c 
b/drivers/staging/tidspbridge/core/io_sm.c
index 480a384..e322fb7 100644
--- a/drivers/staging/tidspbridge/core/io_sm.c
+++ b/drivers/staging/tidspbridge/core/io_sm.c
@@ -837,8 +837,8 @@ static void io_dispatch_pm(struct io_mgr *pio_mgr)
 /*
  *   io_dpc 
  *  Deferred procedure call for shared memory channel driver ISR.  Carries
- *  out the dispatch of I/O as a non-preemptible event.It can only be
- *  pre-empted  by an ISR.
+ *  out the dispatch of I/O 

Re: [PATCH NEXT]staging "tidspbridge" Fix typos.

2012-07-23 Thread Justin P. Mattock

On 07/23/2012 01:44 PM, Ramirez Luna, Omar wrote:
> Hi Justin,
>
> On Mon, Jul 23, 2012 at 8:49 AM, Justin P. Mattock wrote:
>> diff --git a/drivers/staging/tidspbridge/core/tiomap3430.c
>> b/drivers/staging/tidspbridge/core/tiomap3430.c
>> index f9609ce..2c82d5a 100644
>> --- a/drivers/staging/tidspbridge/core/tiomap3430.c
>> +++ b/drivers/staging/tidspbridge/core/tiomap3430.c
>> @@ -328,7 +328,7 @@ static int bridge_brd_read(struct bridge_dev_context
>> *dev_ctxt,
>> ul_num_bytes, mem_type);
>>  return status;
>>  }
>> -   /* copy the data from  DSP memory, */
>> +   /* copy the data from DSP memory, */
>
> I guess we can get rid of the comma (,) at the end of this sentence.

good catch!

> ...
>
>> diff --git a/drivers/staging/tidspbridge/include/dspbridge/proc.h
>> b/drivers/staging/tidspbridge/include/dspbridge/proc.h
>> index a82380e..9cd5022 100644
>> --- a/drivers/staging/tidspbridge/include/dspbridge/proc.h
>> +++ b/drivers/staging/tidspbridge/include/dspbridge/proc.h
>> @@ -263,7 +263,7 @@ extern int proc_get_processor_id(void *proc, u32 * proc_id);
>>    *  Returns:
>>    *  0 :   Success.
>>    *  -EFAULT :   Invalid processor handle.
>> - *  -EPERM   :   General failure while retireving processor trace
>> + *  -EPERM   :   General failure while retireing processor trace
>
> This was meant to be 'retrieving'.
>
> And given that this is focused on tidspbridge, could you change the
> patch subject? To: staging: tidspbridge: fix typos.
>
> Cheers,
>
> Omar

ah.. I see now.. comments can be tough to decipher sometimes with what
the writer is saying, helps to know what the code is doing.

cheers, resending the revised version, and Thank you for taking the time.

Justin P. Mattock


Re: [PATCH V6 05/13] perf: Generic intel uncore support

2012-07-23 Thread Stephane Eranian
On Fri, Jun 15, 2012 at 8:31 AM, Yan, Zheng  wrote:
> From: "Yan, Zheng" 
>
> This patch adds the generic intel uncore pmu support, including helper
> functions that add/delete uncore events, a hrtimer that periodically
> polls the counters to avoid overflow and code that places all events
> for a particular socket onto a single cpu. The code design is based on
> the structure of Sandy Bridge-EP's uncore subsystem, which consists of
> a variety of components, each of which contains one or more boxes.
>
> Signed-off-by: Zheng Yan 
> ---
>  arch/x86/kernel/cpu/Makefile  |4 +-
>  arch/x86/kernel/cpu/perf_event_intel_uncore.c |  878 
> +
>  arch/x86/kernel/cpu/perf_event_intel_uncore.h |  204 ++
>  3 files changed, 1085 insertions(+), 1 deletions(-)
>  create mode 100644 arch/x86/kernel/cpu/perf_event_intel_uncore.c
>  create mode 100644 arch/x86/kernel/cpu/perf_event_intel_uncore.h
>
> diff --git a/arch/x86/kernel/cpu/Makefile b/arch/x86/kernel/cpu/Makefile
> index 6ab6aa2..bac4c38 100644
> --- a/arch/x86/kernel/cpu/Makefile
> +++ b/arch/x86/kernel/cpu/Makefile
> @@ -32,7 +32,9 @@ obj-$(CONFIG_PERF_EVENTS) += perf_event.o
>
>  ifdef CONFIG_PERF_EVENTS
>  obj-$(CONFIG_CPU_SUP_AMD)  += perf_event_amd.o
> -obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_p6.o perf_event_p4.o 
> perf_event_intel_lbr.o perf_event_intel_ds.o perf_event_intel.o
> +obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_p6.o perf_event_p4.o
> +obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_intel_lbr.o 
> perf_event_intel_ds.o perf_event_intel.o
> +obj-$(CONFIG_CPU_SUP_INTEL)+= perf_event_intel_uncore.o
>  endif
>
>  obj-$(CONFIG_X86_MCE)  += mcheck/
> diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c 
> b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> new file mode 100644
> index 0000000..ef79e49
> --- /dev/null
> +++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
> @@ -0,0 +1,878 @@
> +#include "perf_event_intel_uncore.h"
> +
> +static struct intel_uncore_type *empty_uncore[] = { NULL, };
> +static struct intel_uncore_type **msr_uncores = empty_uncore;
> +
> +/* mask of cpus that collect uncore events */
> +static cpumask_t uncore_cpu_mask;
> +
> +/* constraint for the fixed counter */
> +static struct event_constraint constraint_fixed =
> +   EVENT_CONSTRAINT((u64)-1, 1 << UNCORE_PMC_IDX_FIXED, (u64)-1);
> +
> +static void uncore_assign_hw_event(struct intel_uncore_box *box,
> +   struct perf_event *event, int idx)
> +{
> +   struct hw_perf_event *hwc = &event->hw;
> +
> +   hwc->idx = idx;
> +   hwc->last_tag = ++box->tags[idx];
> +
> +   if (hwc->idx == UNCORE_PMC_IDX_FIXED) {
> +   hwc->event_base = uncore_msr_fixed_ctr(box);
> +   hwc->config_base = uncore_msr_fixed_ctl(box);
> +   return;
> +   }
> +
> +   hwc->config_base = uncore_msr_event_ctl(box, hwc->idx);
> +   hwc->event_base =  uncore_msr_perf_ctr(box, hwc->idx);
> +}
> +
> +static void uncore_perf_event_update(struct intel_uncore_box *box,
> +   struct perf_event *event)
> +{
> +   u64 prev_count, new_count, delta;
> +   int shift;
> +
> +   if (event->hw.idx >= UNCORE_PMC_IDX_FIXED)
> +   shift = 64 - uncore_fixed_ctr_bits(box);
> +   else
> +   shift = 64 - uncore_perf_ctr_bits(box);
> +
> +   /* the hrtimer might modify the previous event value */
> +again:
> +   prev_count = local64_read(&event->hw.prev_count);
> +   new_count = uncore_read_counter(box, event);
> +   if (local64_xchg(&event->hw.prev_count, new_count) != prev_count)
> +   goto again;
> +
> +   delta = (new_count << shift) - (prev_count << shift);
> +   delta >>= shift;
> +
> +   local64_add(delta, &event->count);
> +}
> +
> +/*
> + * The overflow interrupt is unavailable for SandyBridge-EP, is broken
> + * for SandyBridge. So we use hrtimer to periodically poll the counter
> + * to avoid overflow.
> + */
> +static enum hrtimer_restart uncore_pmu_hrtimer(struct hrtimer *hrtimer)
> +{
> +   struct intel_uncore_box *box;
> +   unsigned long flags;
> +   int bit;
> +
> +   box = container_of(hrtimer, struct intel_uncore_box, hrtimer);
> +   if (!box->n_active || box->cpu != smp_processor_id())
> +   return HRTIMER_NORESTART;
> +   /*
> +* disable local interrupt to prevent uncore_pmu_event_start/stop
> +* to interrupt the update process
> +*/
> +   local_irq_save(flags);
> +
> +   for_each_set_bit(bit, box->active_mask, UNCORE_PMC_IDX_MAX)
> +   uncore_perf_event_update(box, box->events[bit]);
> +
> +   local_irq_restore(flags);
> +
> +   hrtimer_forward_now(hrtimer, 
> ns_to_ktime(UNCORE_PMU_HRTIMER_INTERVAL));
> +   return HRTIMER_RESTART;
> +}
> +
> +static void uncore_pmu_start_hrtimer(struct intel_uncore_box 
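
The shift-based delta in uncore_perf_event_update() quoted above is what lets a counter narrower than 64 bits wrap between two polls without corrupting the accumulated count. A minimal userspace sketch of the same arithmetic follows, assuming the counter wraps at most once between reads; `toy_counter_delta()` is a hypothetical name, not part of the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Difference between two raw readings of a counter that is only "bits"
 * wide. Shifting both readings up to the top of a 64-bit word makes the
 * unsigned subtraction wrap at the counter's own width; shifting back
 * down yields the number of events since the previous reading.
 */
uint64_t toy_counter_delta(uint64_t prev, uint64_t now, unsigned int bits)
{
    unsigned int shift;
    uint64_t delta;

    assert(bits >= 1 && bits <= 64); /* a shift by 64 would be undefined */
    shift = 64 - bits;
    delta = (now << shift) - (prev << shift);
    return delta >> shift;
}
```

For a 48-bit counter that reads 0xFFFFFFFFFFF0 and then 0x10 after wrapping, the function returns 0x20, which is the 0x10 events up to the wrap plus the 0x10 after it.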

Re: [RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-23 Thread Masami Hiramatsu
(2012/07/24 11:36), Yoshihiro YUNOMAE wrote:
> Therefore, we propose a new system "virtio-trace", which uses enhanced
> virtio-serial and existing ring-buffer of ftrace, for collecting guest kernel
> tracing data. In this system, there are 5 main components:
>  (1) Ring-buffer of ftrace in a guest
>  - When trace agent reads ring-buffer, a page is removed from ring-buffer.
>  (2) Trace agent in the guest
>  - Splice the page of ring-buffer to read_pipe using splice() without
>memory copying. Then, the page is spliced from write_pipe to virtio
>without memory copying.
>  (3) Virtio-console driver in the guest
>  - Pass the page to virtio-ring
>  (4) Virtio-serial bus in QEMU
>  - Copy the page to kernel pipe
>  (5) Reader in the host
>  - Read guest tracing data via FIFO(named pipe)

So, this is our answer to the points argued in the previous thread.
These virtio-serial and ftrace enhancements don't introduce a new
"ringbuffer" in the kernel; they just use virtio's ring buffer.
Also, using splice gives us a great performance advantage
because the trace data is transferred without copying.

Actually, one copy should occur in the host (to write it into the pipe),
because removing physical pages of the guest is hard to track and may
involve a TLB flush per page, even if it is done in background.
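
The guest-side path in step (2) of the quoted description moves data through a pipe with splice(). A minimal userspace sketch of that pattern follows, assuming Linux with glibc. `splice_all()` is a hypothetical helper written for illustration and is not taken from the virtio-trace agent. At least one side of each splice() call must be a pipe, which is why an intermediate pipe sits between the two descriptors.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Move up to len bytes from in_fd to out_fd through an intermediate pipe
 * using splice(), so the payload is not copied through a user buffer.
 * Returns the number of bytes moved, or -1 on error.
 */
ssize_t splice_all(int in_fd, int out_fd, size_t len)
{
    int pfd[2];
    ssize_t total = 0;

    if (pipe(pfd) < 0)
        return -1;

    while (len > 0) {
        /* Stage 1: source descriptor -> write end of the pipe. */
        ssize_t n = splice(in_fd, NULL, pfd[1], NULL, len, SPLICE_F_MOVE);
        ssize_t left = n;

        if (n < 0)
            total = -1;
        if (n <= 0)
            break; /* error, or end of input */

        /* Stage 2: read end of the pipe -> destination descriptor. */
        while (left > 0) {
            ssize_t m = splice(pfd[0], NULL, out_fd, NULL,
                               (size_t)left, SPLICE_F_MOVE);
            if (m <= 0) {
                total = -1;
                goto out;
            }
            left -= m;
        }
        total += n;
        len -= (size_t)n;
    }
out:
    close(pfd[0]);
    close(pfd[1]);
    return total;
}
```

In the design described above, the source would be the ftrace ring-buffer file and the destination the virtio-serial port; both are assumptions about how such a helper might be wired up rather than details confirmed by this thread.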

Thank you,

-- 
Masami HIRAMATSU
Software Platform Research Dept. Linux Technology Center
Hitachi, Ltd., Yokohama Research Laboratory
E-mail: masami.hiramatsu...@hitachi.com




Re: next/mmotm unbootable on G5: irqdomain

2012-07-23 Thread Benjamin Herrenschmidt
On Mon, 2012-07-23 at 16:32 -0600, Grant Likely wrote:
> > As-is I'm backing off from the linear/legacy/tree merge patch as just
> > too risky. I've already pulled that stuff out of linux-next.
> 
> Can I pull your pseries fix into my tree (my preference), or do I need
> to rebase on top of yours? 

The mpic fix for the g5 is in Linus tree already, I added it on top of
powerpc -next before I asked Linus to pull.

For pseries (ie the fix for irq_find_mapping vs. radix), I don't have a
formal patch, just the one I hand typed in my previous email, so do
whatever you want with it.

Cheers,
Ben.




Re: [PATCH 1/1 v2] classmate-laptop: Add support for Classmate V4 accelerometer.

2012-07-23 Thread Thadeu Lima de Souza Cascardo
On Mon, Jul 23, 2012 at 03:44:41PM +0200, Miguel Gómez wrote:
> El 23/07/12 15:36, Matthew Garrett escribió:
> >On Mon, Jul 23, 2012 at 03:33:27PM +0200, Miguel Gómez wrote:
> >
> >>Names are upper-cased in acpica, so the device is reported as
> >>FNBT. But in the driver it's named FnBT, and that's why it
> >>doesn't match.
> >
> >So just change the existing entry in the driver to FNBT?
> 
> I'd go for it. I can send a patch if you want. But not sure about
> Thadeu's opinion. In the old thread it seems that he wanted to
> explore other options besides that.
> 
> Do you agree with that change Thadeu?
> 

I'd say the other options were met with silence. So, go forward with the
patch and I'll ack it.

Thanks.
Cascardo.

> 
> -- 
> Miguel Gómez
> Igalia - http://www.igalia.com


signature.asc
Description: Digital signature


[PATCH] acpi : create sun sysfs file in container device

2012-07-23 Thread Yasuaki Ishimatsu
There has been no comment on this patch for about a month, but I would like
it to be merged into linux-3.6, so I am resending it.

---
Even if a container device has a _SUN method, the method is ignored, so we
cannot know the slot-unique ID number of the container device. The patch
creates a "sun" file in sysfs so that we can recognize it.

Signed-off-by: Yasuaki Ishimatsu 

---
 drivers/acpi/container.c |   36 +---
 1 file changed, 33 insertions(+), 3 deletions(-)

Index: linux-3.5-rc1/drivers/acpi/container.c
===
--- linux-3.5-rc1.orig/drivers/acpi/container.c 2012-06-14 15:35:31.045500166 +0900
+++ linux-3.5-rc1/drivers/acpi/container.c  2012-06-14 16:40:13.010405144 +0900
@@ -32,6 +32,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 #include 
@@ -93,10 +94,30 @@ static int is_device_present(acpi_handle
 }

 /***/
+
+static ssize_t acpi_device_sun_show(struct device *dev,
+   struct device_attribute *attr, char *buf) {
+   struct acpi_device *device = to_acpi_device(dev);
+   acpi_status status;
+   unsigned long long sun;
+
+   status = acpi_evaluate_integer(device->handle, "_SUN", NULL, &sun);
+   if (ACPI_FAILURE(status))
+   return 0;
+
+   return sprintf(buf, "%llu\n", sun);
+}
+
+static DEVICE_ATTR(sun, 0444, acpi_device_sun_show, NULL);
+
+/***/
+
 static int acpi_container_add(struct acpi_device *device)
 {
struct acpi_container *container;
-
+   acpi_status status;
+   acpi_handle temp;
+   int result = 0;

if (!device) {
printk(KERN_ERR PREFIX "device is NULL\n");
@@ -115,13 +136,22 @@ static int acpi_container_add(struct acp
ACPI_DEBUG_PRINT((ACPI_DB_INFO, "Device <%s> bid <%s>\n",
  acpi_device_name(device), acpi_device_bid(device)));

-   return 0;
+   status = acpi_get_handle(device->handle, "_SUN", &temp);
+   if (ACPI_SUCCESS(status))
+   result = device_create_file(&device->dev, &dev_attr_sun);
+
+   return result;
 }

 static int acpi_container_remove(struct acpi_device *device, int type)
 {
-   acpi_status status = AE_OK;
+   acpi_status status;
struct acpi_container *pc = NULL;
+   acpi_handle temp;
+
+   status = acpi_get_handle(device->handle, "_SUN", &temp);
+   if (ACPI_SUCCESS(status))
+   device_remove_file(&device->dev, &dev_attr_sun);

pc = acpi_driver_data(device);
kfree(pc);
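
For readers who want to consume the new attribute from user space, here is a minimal sketch in C. It assumes the "sun" file holds a single decimal number followed by a newline, as produced by the "%llu\n" format in acpi_device_sun_show() above, and that a failed _SUN evaluation leaves the file empty. The helper name read_sysfs_ull() and the example path mentioned below are illustrative assumptions, not part of the patch.

```c
#include <stdio.h>

/*
 * Read one unsigned decimal value from the file at `path`.
 * Returns 0 and stores the value in *out on success.
 * Returns -1 if the file cannot be opened or holds no number
 * (for example when the show routine wrote nothing).
 */
static int read_sysfs_ull(const char *path, unsigned long long *out)
{
    FILE *f = fopen(path, "r");
    int ok;

    if (!f)
        return -1;
    ok = (fscanf(f, "%llu", out) == 1);
    fclose(f);
    return ok ? 0 : -1;
}
```

A caller might pass something like /sys/bus/acpi/devices/ACPI0004:00/sun, but the actual device directory depends on the platform and is only a guess here.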



Re: [PATCH] sctp: Make "Invalid Stream Identifier" ERROR follows SACK when bundling

2012-07-23 Thread xufeng zhang

On 07/24/2012 10:27 AM, Vlad Yasevich wrote:

xufeng zhang  wrote:

   

On 07/19/2012 01:57 PM, xufengzhang.m...@gmail.com wrote:
 

When an "Invalid Stream Identifier" ERROR is generated while processing
received DATA chunks, the ERROR chunk is enqueued into the outqueue
before the SACK chunk. When the two are bundled, the ERROR chunk is
therefore always placed first in the packet, because of its position
in the outqueue.
This violates the SCTP specification:
  RFC 4960 6.5. Stream Identifier and Stream Sequence Number
  ...The endpoint may bundle the ERROR chunk in the same
  packet as the SACK as long as the ERROR follows the SACK.
So we must place the SACK first when bundling an "Invalid Stream
Identifier" ERROR and a SACK in one packet.
Although we could do that by enqueuing the SACK chunk into the outqueue
before the ERROR chunk, that would violate the side-effect interpreter
processing. It is easier to do this when dequeuing chunks from the
outqueue, so we introduce a flag 'has_isi_err' which indicates
whether or not an "Invalid Stream Identifier" ERROR has occurred.

Signed-off-by: Xufeng Zhang
---
   include/net/sctp/structs.h |2 ++
   net/sctp/output.c  |   26 ++
   2 files changed, 28 insertions(+), 0 deletions(-)

diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
index 88949a9..5adf4de 100644
--- a/include/net/sctp/structs.h
+++ b/include/net/sctp/structs.h
@@ -842,6 +842,8 @@ struct sctp_packet {
has_sack:1, /* This packet contains a SACK chunk. */
has_auth:1, /* This packet contains an AUTH chunk */
has_data:1, /* This packet contains at least 1 DATA chunk */
+   has_isi_err:1,  /* This packet contains a "Invalid Stream
+* Identifier" ERROR chunk */
ipfragok:1, /* So let ip fragment this packet */
malloced:1; /* Is it malloced? */
   };
diff --git a/net/sctp/output.c b/net/sctp/output.c
index 817174e..77fb1ae 100644
--- a/net/sctp/output.c
+++ b/net/sctp/output.c
@@ -79,6 +79,7 @@ static void sctp_packet_reset(struct sctp_packet *packet)
packet->has_sack = 0;
packet->has_data = 0;
packet->has_auth = 0;
+   packet->has_isi_err = 0;
packet->ipfragok = 0;
packet->auth = NULL;
   }
@@ -267,6 +268,7 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
   sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
 struct sctp_chunk *chunk)
   {
+   struct sctp_chunk *lchunk;
sctp_xmit_t retval = SCTP_XMIT_OK;
__u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));

@@ -316,7 +318,31 @@ sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
packet->has_cookie_echo = 1;
break;

+   case SCTP_CID_ERROR:
+   if (chunk->subh.err_hdr->cause & SCTP_ERROR_INV_STRM)
+   packet->has_isi_err = 1;
+   break;
+
case SCTP_CID_SACK:
+   /* RFC 4960
+* 6.5 Stream Identifier and Stream Sequence Number
+* The endpoint may bundle the ERROR chunk in the same
+* packet as the SACK as long as the ERROR follows the SACK.
+*/
+   if (packet->has_isi_err) {
+   if (list_is_singular(&packet->chunk_list))
+   list_add(&chunk->list, &packet->chunk_list);
+   else {
+   lchunk = list_first_entry(&packet->chunk_list,
+   struct sctp_chunk, list);
+   list_add(&chunk->list, &lchunk->list);
+   }

   

And I should clarify the above judgment code.
AFAIK, there should be two cases for the bundling when invalid stream
identifier error happens:
1). COOKIE_ACK ERROR SACK
2). ERROR SACK
So I need to deal with the two cases differently.

 

Sorry but I just don't buy that the above are the only 2 cases.  What if there 
are addip chunks as well?  What if there are some other extensions also.  This 
code has to be generic enough to handle any condition.
   

Aha, you are right, this may happen.
So I think the general solution is to fix this problem on the enqueue side.
What do you think? Any better suggestions?


Thanks,
Xufeng Zhang

- vlad

   

Thanks,
Xufeng Zhang
 

+   packet->size += chunk_len;
+   chunk->transport = packet->transport;
+   packet->has_sack = 1;
+   goto finish;
+   }
+
packet->has_sack = 1;
break;
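
To illustrate Vlad's point that the ordering logic must not depend on which other chunks happen to be bundled, here is a small stand-alone sketch in C. It is not the kernel code and not necessarily the fix this thread will settle on; it only shows a position-independent rule: place the SACK immediately before the first ERROR chunk, wherever that chunk sits, and append it if there is no ERROR. The enum values and the function bundle_sack() are hypothetical names used for illustration only.

```c
#include <stddef.h>

/* Hypothetical chunk kinds; real SCTP chunk IDs are not used here. */
enum chunk_type {
    CHUNK_COOKIE_ACK,
    CHUNK_ASCONF,
    CHUNK_ERROR,
    CHUNK_SACK,
    CHUNK_DATA
};

/*
 * Insert a SACK into chunks[0..n-1] so that it precedes the first ERROR
 * chunk. If there is no ERROR chunk, the SACK is appended.
 * The caller must provide room for n + 1 entries.
 * Returns the new number of chunks.
 */
static size_t bundle_sack(enum chunk_type *chunks, size_t n)
{
    size_t pos = n;
    size_t i;

    /* Find the first ERROR chunk, regardless of what comes before it. */
    for (i = 0; i < n; i++) {
        if (chunks[i] == CHUNK_ERROR) {
            pos = i;
            break;
        }
    }

    /* Shift the tail up by one slot and drop the SACK into the gap. */
    for (i = n; i > pos; i--)
        chunks[i] = chunks[i - 1];
    chunks[pos] = CHUNK_SACK;

    return n + 1;
}
```

With this rule, both "COOKIE_ACK ERROR" and "COOKIE_ACK ASCONF ERROR" end up with the SACK directly ahead of the ERROR, so no special case per bundle layout is needed.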


   


   



[PATCH V2] perf/x86: Fix format definition of SNB-EP uncore QPI box

2012-07-23 Thread Yan, Zheng
From: "Yan, Zheng" 

The event control register of SNB-EP uncore QPI box has a one bit
extension at bit position 21.

Reported-by: Stephane Eranian 
Signed-off-by: Yan, Zheng 
---
Changes since V1:
 - Define the extension bit together with event select field

 arch/x86/kernel/cpu/perf_event_intel_uncore.c |   22 +-
 arch/x86/kernel/cpu/perf_event_intel_uncore.h |4 
 2 files changed, 25 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.c b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
index a5de59f..2139fb0 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.c
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.c
@@ -18,6 +18,7 @@ static struct event_constraint constraint_empty =
EVENT_CONSTRAINT(0, 0, 0);
 
 DEFINE_UNCORE_FORMAT_ATTR(event, event, "config:0-7");
+DEFINE_UNCORE_FORMAT_ATTR(event_ext, event, "config:0-7,21");
 DEFINE_UNCORE_FORMAT_ATTR(umask, umask, "config:8-15");
 DEFINE_UNCORE_FORMAT_ATTR(edge, edge, "config:18");
 DEFINE_UNCORE_FORMAT_ATTR(tid_en, tid_en, "config:19");
@@ -279,6 +280,15 @@ static struct attribute *snbep_uncore_pcu_formats_attr[] = {
NULL,
 };
 
+static struct attribute *snbep_uncore_qpi_formats_attr[] = {
+   &format_attr_event_ext.attr,
+   &format_attr_umask.attr,
+   &format_attr_edge.attr,
+   &format_attr_inv.attr,
+   &format_attr_thresh8.attr,
+   NULL,
+};
+
 static struct uncore_event_desc snbep_uncore_imc_events[] = {
INTEL_UNCORE_EVENT_DESC(clockticks,  "event=0xff,umask=0x00"),
INTEL_UNCORE_EVENT_DESC(cas_count_read,  "event=0x04,umask=0x03"),
@@ -314,6 +324,11 @@ static struct attribute_group snbep_uncore_pcu_format_group = {
.attrs = snbep_uncore_pcu_formats_attr,
 };
 
+static struct attribute_group snbep_uncore_qpi_format_group = {
+   .name = "format",
+   .attrs = snbep_uncore_qpi_formats_attr,
+};
+
 static struct intel_uncore_ops snbep_uncore_msr_ops = {
.init_box   = snbep_uncore_msr_init_box,
.disable_box= snbep_uncore_msr_disable_box,
@@ -485,8 +500,13 @@ static struct intel_uncore_type snbep_uncore_qpi = {
.num_counters   = 4,
.num_boxes  = 2,
.perf_ctr_bits  = 48,
+   .perf_ctr   = SNBEP_PCI_PMON_CTR0,
+   .event_ctl  = SNBEP_PCI_PMON_CTL0,
+   .event_mask = SNBEP_QPI_PCI_PMON_RAW_EVENT_MASK,
+   .box_ctl= SNBEP_PCI_PMON_BOX_CTL,
+   .ops= &snbep_uncore_pci_ops,
.event_descs= snbep_uncore_qpi_events,
-   SNBEP_UNCORE_PCI_COMMON_INIT(),
+   .format_group   = &snbep_uncore_qpi_format_group,
 };
 
 
diff --git a/arch/x86/kernel/cpu/perf_event_intel_uncore.h b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
index b13e9ea..0f8a8ca 100644
--- a/arch/x86/kernel/cpu/perf_event_intel_uncore.h
+++ b/arch/x86/kernel/cpu/perf_event_intel_uncore.h
@@ -115,6 +115,10 @@
 SNBEP_PCU_MSR_PMON_CTL_OCC_INVERT | \
 SNBEP_PCU_MSR_PMON_CTL_OCC_EDGE_DET)
 
+#define SNBEP_QPI_PCI_PMON_RAW_EVENT_MASK  \
+   (SNBEP_PMON_RAW_EVENT_MASK | \
+SNBEP_PMON_CTL_EV_SEL_EXT)
+
 /* SNB-EP pci control register */
 #define SNBEP_PCI_PMON_BOX_CTL 0xf4
 #define SNBEP_PCI_PMON_CTL0 0xd8
-- 
1.7.10.4
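
As a rough illustration of what the "config:0-7,21" format string means for a raw config value, here is a sketch in C. It assumes, as I understand the perf format convention, that the bits of the user-supplied event value are laid into the listed config bits in ascending order, so the low 8 bits land in config bits 0-7 and the ninth bit lands in config bit 21. The function name qpi_event_to_config() is made up for this example.

```c
#include <stdint.h>

/*
 * Pack a 9-bit event code into a raw config value laid out as
 * "config:0-7,21": event bits 0-7 go to config bits 0-7, and
 * event bit 8 (the extension bit) goes to config bit 21.
 */
static uint64_t qpi_event_to_config(unsigned int event)
{
    uint64_t config = event & 0xffu;

    config |= (uint64_t)((event >> 8) & 0x1u) << 21;
    return config;
}
```

Under that assumption an event value of 0x101 would become a config of 0x200001, while values below 0x100 are unchanged.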



[RFC PATCH 6/6] tools: Add guest trace agent as a user tool

2012-07-23 Thread Yoshihiro YUNOMAE
This patch adds a user tool, the "trace agent", for sending guest trace data to
a host with low overhead. The agent has the following functions:
 - splice a page of the ring-buffer to read_pipe without memory copying
 - splice the page from write_pipe to virtio-console without memory copying
 - write trace data to stdout when the -o option is given
 - start and stop on orders from the host

Signed-off-by: Yoshihiro YUNOMAE 
---

 tools/virtio/virtio-trace/Makefile  |   14 +
 tools/virtio/virtio-trace/README|  118 
 tools/virtio/virtio-trace/trace-agent-ctl.c |  137 ++
 tools/virtio/virtio-trace/trace-agent-rw.c  |  192 +++
 tools/virtio/virtio-trace/trace-agent.c |  270 +++
 tools/virtio/virtio-trace/trace-agent.h |   75 
 6 files changed, 806 insertions(+), 0 deletions(-)
 create mode 100644 tools/virtio/virtio-trace/Makefile
 create mode 100644 tools/virtio/virtio-trace/README
 create mode 100644 tools/virtio/virtio-trace/trace-agent-ctl.c
 create mode 100644 tools/virtio/virtio-trace/trace-agent-rw.c
 create mode 100644 tools/virtio/virtio-trace/trace-agent.c
 create mode 100644 tools/virtio/virtio-trace/trace-agent.h

diff --git a/tools/virtio/virtio-trace/Makefile b/tools/virtio/virtio-trace/Makefile
new file mode 100644
index 000..ef3adfc
--- /dev/null
+++ b/tools/virtio/virtio-trace/Makefile
@@ -0,0 +1,14 @@
+CC = gcc
+CFLAGS = -O2 -Wall
+LFLAG = -lpthread
+
+all: trace-agent
+
+.c.o:
+   $(CC) $(CFLAGS) $(LFLAG) -c $^ -o $@
+
+trace-agent: trace-agent.o trace-agent-ctl.o trace-agent-rw.o
+   $(CC) $(CFLAGS) $(LFLAG) -o $@ $^
+
+clean:
+   rm -f *.o trace-agent
diff --git a/tools/virtio/virtio-trace/README b/tools/virtio/virtio-trace/README
new file mode 100644
index 000..b64845b
--- /dev/null
+++ b/tools/virtio/virtio-trace/README
@@ -0,0 +1,118 @@
+Trace Agent for virtio-trace
+
+
+Trace agent is a user tool for sending trace data of a guest to a Host in low
+overhead. Trace agent has the following functions:
+ - splice a page of ring-buffer to read_pipe without memory copying
+ - splice the page from write_pipe to virtio-console without memory copying
+ - write trace data to stdout by using -o option
+ - controlled by start/stop orders from a Host
+
+The trace agent operates as follows:
+ 1) Initialize all structures.
+ 2) Create a read/write thread per CPU. Each thread is bound to a CPU.
+The read/write threads hold it.
+ 3) A controller thread does poll() for a start order from a host.
+ 4) After the controller of the trace agent receives a start order from a host,
+the controller wakes the read/write threads.
+ 5) The read/write threads start to read trace data from the ring-buffers and
+write the data to virtio-serial.
+ 6) If the controller receives a stop order from a host, the read/write threads
+stop reading trace data.
+
+
+Files
+=
+
+README: this file
+Makefile: Makefile of trace agent for virtio-trace
+trace-agent.c: includes main function, sets up for operating trace agent
+trace-agent.h: includes all structures and some macros
+trace-agent-ctl.c: includes controller function for read/write threads
+trace-agent-rw.c: includes read/write threads function
+
+
+Setup
+=
+
+To use this trace agent for virtio-trace, we need to prepare some virtio-serial
+I/Fs.
+
+1) Make FIFO in a host
+ virtio-trace uses one virtio-serial pipe per CPU as trace data paths, plus one
+pipe as a control path, so FIFOs (named pipes) should be created as follows:
+   # mkdir /tmp/virtio-trace/
+   # mkfifo /tmp/virtio-trace/trace-path-cpu{0,1,2,...,X}.{in,out}
+   # mkfifo /tmp/virtio-trace/agent-ctl-path.{in,out}
+
+For example, if a guest uses three CPUs, the names are
+   trace-path-cpu{0,1,2}.{in,out}
+and
+   agent-ctl-path.{in,out}.
+
+2) Set up of virtio-serial pipe in a host
+ Add qemu option to use virtio-serial pipe.
+
+ ##virtio-serial device##
+ -device virtio-serial-pci,id=virtio-serial0\
+ ##control path##
+ -chardev pipe,id=charchannel0,path=/tmp/virtio-trace/agent-ctl-path\
+ -device virtserialport,bus=virtio-serial0.0,nr=1,chardev=charchannel0,\
+  id=channel0,name=agent-ctl-path\
+ ##data path##
+ -chardev pipe,id=charchannel1,path=/tmp/virtio-trace/trace-path-cpu0\
+ -device virtserialport,bus=virtio-serial0.0,nr=2,chardev=charchannel1,\
+  id=channel1,name=trace-path-cpu0\
+  ...
+
+If you manage guests with libvirt, add the following tags to domain XML files.
+Then, libvirt passes the same command option to qemu.
+
+   
+  
+  
+  
+   
+   
+  
+  
+  
+   
+   ...
+Here, chardev names are restricted to trace-path-cpuX and agent-ctl-path. For
+example, if a guest uses three CPUs, chardev names should be trace-path-cpu0,
+trace-path-cpu1, trace-path-cpu2, and agent-ctl-path.
+
+3) Boot the guest
+ You can find some chardev in 

[RFC PATCH 4/6] ftrace: Allow stealing pages from pipe buffer

2012-07-23 Thread Yoshihiro YUNOMAE
From: Masami Hiramatsu 

Use the generic steal operation on the pipe buffer to allow stealing
the ring buffer's read page from the pipe buffer.

Note that this could reduce the performance of the splice_write
side operation when no affinity is set. The ring buffer's read
pages are allocated on the tracing node, but the splice user does
not always execute the splice write side operation on the same
node. In that case, the page will be accessed from another node.
Thus, it is strongly recommended to bind the splicing thread to
the corresponding node.

Signed-off-by: Masami Hiramatsu 
Cc: Steven Rostedt 
Cc: Frederic Weisbecker 
Cc: Ingo Molnar 
---

 kernel/trace/trace.c |8 +---
 1 files changed, 1 insertions(+), 7 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index a120f98..ae01930 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -4194,12 +4194,6 @@ static void buffer_pipe_buf_release(struct pipe_inode_info *pipe,
buf->private = 0;
 }
 
-static int buffer_pipe_buf_steal(struct pipe_inode_info *pipe,
-struct pipe_buffer *buf)
-{
-   return 1;
-}
-
 static void buffer_pipe_buf_get(struct pipe_inode_info *pipe,
struct pipe_buffer *buf)
 {
@@ -4215,7 +4209,7 @@ static const struct pipe_buf_operations buffer_pipe_buf_ops = {
.unmap  = generic_pipe_buf_unmap,
.confirm= generic_pipe_buf_confirm,
.release= buffer_pipe_buf_release,
-   .steal  = buffer_pipe_buf_steal,
+   .steal  = generic_pipe_buf_steal,
.get= buffer_pipe_buf_get,
 };
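
The affinity recommendation in the changelog can be followed from user space. Below is a minimal sketch in C, assuming that pinning the splicing thread to a single CPU on the right node is an acceptable way to get node locality. The helper names pin_to_cpu() and first_allowed_cpu() are hypothetical, and choosing which CPU belongs to the tracing node is left to the caller.

```c
#define _GNU_SOURCE
#include <sched.h>

/* Pin the calling thread to one CPU. Returns 0 on success, -1 on error. */
static int pin_to_cpu(int cpu)
{
    cpu_set_t set;

    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    /* A pid of 0 means "the calling thread". */
    return sched_setaffinity(0, sizeof(set), &set);
}

/* Return the first CPU the calling thread may run on, or -1 on error. */
static int first_allowed_cpu(void)
{
    cpu_set_t set;
    int cpu;

    if (sched_getaffinity(0, sizeof(set), &set) < 0)
        return -1;
    for (cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &set))
            return cpu;
    }
    return -1;
}
```

A per-CPU splicing thread would call pin_to_cpu() with the CPU whose ring buffer it reads before it starts calling splice().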
 




[RFC PATCH 3/6] virtio/console: Wait until the port is ready on splice

2012-07-23 Thread Yoshihiro YUNOMAE
From: Masami Hiramatsu 

Wait on splice if the port is not connected or is full,
as write already does.

Signed-off-by: Masami Hiramatsu 
Cc: Amit Shah 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
---

 drivers/char/virtio_console.c |   39 +++
 1 files changed, 27 insertions(+), 12 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index 911cb3e..e49d435 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -724,6 +724,26 @@ static ssize_t port_fops_read(struct file *filp, char __user *ubuf,
return fill_readbuf(port, ubuf, count, true);
 }
 
+static int wait_port_writable(struct port *port, bool nonblock)
+{
+   int ret;
+
+   if (will_write_block(port)) {
+   if (nonblock)
+   return -EAGAIN;
+
+   ret = wait_event_freezable(port->waitqueue,
+  !will_write_block(port));
+   if (ret < 0)
+   return ret;
+   }
+   /* Port got hot-unplugged. */
+   if (!port->guest_connected)
+   return -ENODEV;
+
+   return 0;
+}
+
 static ssize_t port_fops_write(struct file *filp, const char __user *ubuf,
   size_t count, loff_t *offp)
 {
@@ -740,18 +760,9 @@ static ssize_t port_fops_write(struct file *filp, const char __user *ubuf,
 
nonblock = filp->f_flags & O_NONBLOCK;
 
-   if (will_write_block(port)) {
-   if (nonblock)
-   return -EAGAIN;
-
-   ret = wait_event_freezable(port->waitqueue,
-  !will_write_block(port));
-   if (ret < 0)
-   return ret;
-   }
-   /* Port got hot-unplugged. */
-   if (!port->guest_connected)
-   return -ENODEV;
+   ret = wait_port_writable(port, nonblock);
+   if (ret < 0)
+   return ret;
 
count = min((size_t)(32 * 1024), count);
 
@@ -851,6 +862,10 @@ static ssize_t port_fops_splice_write(struct pipe_inode_info *pipe,
.u.data = &sgl,
};
 
+   ret = wait_port_writable(port, filp->f_flags & O_NONBLOCK);
+   if (ret < 0)
+   return ret;
+
sgl.n = 0;
sgl.len = 0;
sgl.sg = kmalloc(sizeof(struct scatterlist) * MAX_SPLICE_PAGES,




[RFC PATCH 5/6] virtio/console: Allocate scatterlist according to the current pipe size

2012-07-23 Thread Yoshihiro YUNOMAE
From: Masami Hiramatsu 

Allocate the scatterlist according to the current pipe size.
This allows splicing a bigger buffer if the pipe size has
been changed by fcntl.

Signed-off-by: Masami Hiramatsu 
Cc: Amit Shah 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
---

 drivers/char/virtio_console.c |   23 ---
 1 files changed, 12 insertions(+), 11 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index e49d435..f5063d5 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -229,7 +229,6 @@ struct port {
bool guest_connected;
 };
 
-#define MAX_SPLICE_PAGES   32
 /* This is the very early arch-specified put chars function. */
 static int (*early_put_chars)(u32, const char *, int);
 
@@ -482,15 +481,16 @@ struct buffer_token {
void *buf;
struct scatterlist *sg;
} u;
-   bool sgpages;
+   /* If sgpages == 0 then buf is used, else sg is used */
+   unsigned int sgpages;
 };
 
-static void reclaim_sg_pages(struct scatterlist *sg)
+static void reclaim_sg_pages(struct scatterlist *sg, unsigned int nrpages)
 {
int i;
struct page *page;
 
-   for (i = 0; i < MAX_SPLICE_PAGES; i++) {
+   for (i = 0; i < nrpages; i++) {
page = sg_page(&sg[i]);
if (!page)
break;
@@ -511,7 +511,7 @@ static void reclaim_consumed_buffers(struct port *port)
}
while ((tok = virtqueue_get_buf(port->out_vq, &len))) {
if (tok->sgpages)
-   reclaim_sg_pages(tok->u.sg);
+   reclaim_sg_pages(tok->u.sg, tok->sgpages);
else
kfree(tok->u.buf);
kfree(tok);
@@ -581,7 +581,7 @@ static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
tok = kmalloc(sizeof(*tok), GFP_ATOMIC);
if (!tok)
return -ENOMEM;
-   tok->sgpages = false;
+   tok->sgpages = 0;
tok->u.buf = in_buf;
 
sg_init_one(sg, in_buf, in_count);
@@ -597,7 +597,7 @@ static ssize_t send_pages(struct port *port, struct scatterlist *sg, int nents,
tok = kmalloc(sizeof(*tok), GFP_ATOMIC);
if (!tok)
return -ENOMEM;
-   tok->sgpages = true;
+   tok->sgpages = nents;
tok->u.sg = sg;
 
return __send_to_port(port, sg, nents, in_count, tok, nonblock);
@@ -797,6 +797,7 @@ out:
 
 struct sg_list {
unsigned int n;
+   unsigned int size;
size_t len;
struct scatterlist *sg;
 };
@@ -807,7 +808,7 @@ static int pipe_to_sg(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
struct sg_list *sgl = sd->u.data;
unsigned int offset, len;
 
-   if (sgl->n == MAX_SPLICE_PAGES)
+   if (sgl->n == sgl->size)
return 0;
 
/* Try lock this page */
@@ -868,12 +869,12 @@ static ssize_t port_fops_splice_write(struct pipe_inode_info *pipe,
 
sgl.n = 0;
sgl.len = 0;
-   sgl.sg = kmalloc(sizeof(struct scatterlist) * MAX_SPLICE_PAGES,
-GFP_ATOMIC);
+   sgl.size = pipe->nrbufs;
+   sgl.sg = kmalloc(sizeof(struct scatterlist) * sgl.size, GFP_ATOMIC);
if (unlikely(!sgl.sg))
return -ENOMEM;
 
-   sg_init_table(sgl.sg, MAX_SPLICE_PAGES);
+   sg_init_table(sgl.sg, sgl.size);
ret = __splice_from_pipe(pipe, &sd, pipe_to_sg);
if (likely(ret > 0))
ret = send_pages(port, sgl.sg, sgl.n, sgl.len, true);
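
For context on "changed by fcntl" in the changelog: on Linux a process can resize a pipe with F_SETPIPE_SZ and read the capacity back with F_GETPIPE_SZ. The sketch below, in C, is a user-space illustration only. The helper name resize_pipe() is hypothetical, and the fallback constant definitions are an assumption for builds where the headers do not expose them.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Fallbacks (assumed Linux values) in case the headers hide them. */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/*
 * Ask the kernel to make the pipe at least `bytes` large.
 * Returns the resulting capacity in bytes, or -1 on error.
 * The kernel may round the size up.
 */
static int resize_pipe(int pipe_fd, int bytes)
{
    if (fcntl(pipe_fd, F_SETPIPE_SZ, bytes) < 0)
        return -1;
    return fcntl(pipe_fd, F_GETPIPE_SZ);
}
```

A trace agent that enlarges its pipe this way before splicing is the case this patch is meant to serve, since the scatterlist is no longer capped at a fixed MAX_SPLICE_PAGES.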




[RFC PATCH 2/6] virtio/console: Add a failback for unstealable pipe buffer

2012-07-23 Thread Yoshihiro YUNOMAE
From: Masami Hiramatsu 

Add a fallback memcpy path for pipe buffers that cannot be stolen.
If buf->ops->steal() fails, virtio-serial tries to
copy the page contents to an allocated page, instead
of just failing splice().

Signed-off-by: Masami Hiramatsu 
Cc: Amit Shah 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
---

 drivers/char/virtio_console.c |   28 +---
 1 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index fe31b2f..911cb3e 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -794,7 +794,7 @@ static int pipe_to_sg(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
struct splice_desc *sd)
 {
struct sg_list *sgl = sd->u.data;
-   unsigned int len = 0;
+   unsigned int offset, len;
 
if (sgl->n == MAX_SPLICE_PAGES)
return 0;
@@ -807,9 +807,31 @@ static int pipe_to_sg(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
 
len = min(buf->len, sd->len);
sg_set_page(&(sgl->sg[sgl->n]), buf->page, len, buf->offset);
-   sgl->n++;
-   sgl->len += len;
+   } else {
+   /* Failback to copying a page */
+   struct page *page = alloc_page(GFP_KERNEL);
+   char *src, *dst;
+
+   if (!page)
+   return -ENOMEM;
+   src = buf->ops->map(pipe, buf, 1);
+   dst = kmap(page);
+
+   offset = sd->pos & ~PAGE_MASK;
+
+   len = sd->len;
+   if (len + offset > PAGE_SIZE)
+   len = PAGE_SIZE - offset;
+
+   memcpy(dst + offset, src + buf->offset, len);
+
+   kunmap(page);
+   buf->ops->unmap(pipe, buf, src);
+
+   sg_set_page(&(sgl->sg[sgl->n]), page, len, offset);
}
+   sgl->n++;
+   sgl->len += len;
 
return len;
 }
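
The offset and length handling in the copy path above can be checked in isolation. Here is a small sketch in C that mirrors the arithmetic, assuming a 4096-byte page. SKETCH_PAGE_SIZE and clamp_to_page() are names invented for this example; in the kernel, `~PAGE_MASK` is the same mask as `PAGE_SIZE - 1`.

```c
#include <stddef.h>

#define SKETCH_PAGE_SIZE 4096u  /* assumed page size for this sketch */

/*
 * Given a stream position and a requested length, compute the offset
 * inside the destination page and clamp the length so that the copy
 * does not run past the end of that page.
 * Stores the in-page offset in *offset and returns the clamped length.
 */
static size_t clamp_to_page(size_t pos, size_t len, size_t *offset)
{
    *offset = pos & (SKETCH_PAGE_SIZE - 1);   /* same as pos & ~PAGE_MASK */
    if (len + *offset > SKETCH_PAGE_SIZE)
        len = SKETCH_PAGE_SIZE - *offset;
    return len;
}
```

For a position of 4196 and a request of 8000 bytes, the offset is 100 and the copy is limited to 3996 bytes, which is what fits in the rest of the page.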


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC PATCH 1/6] virtio/console: Add splice_write support

2012-07-23 Thread Yoshihiro YUNOMAE
From: Masami Hiramatsu 

Enable splice_write from a pipe to a virtio-console port.
This steals pages from the pipe and sends them directly to the host.

Note that this may accelerate only the guest to host path.

Signed-off-by: Masami Hiramatsu 
Cc: Amit Shah 
Cc: Arnd Bergmann 
Cc: Greg Kroah-Hartman 
---

 drivers/char/virtio_console.c |  136 +++--
 1 files changed, 128 insertions(+), 8 deletions(-)

diff --git a/drivers/char/virtio_console.c b/drivers/char/virtio_console.c
index cdf2f54..fe31b2f 100644
--- a/drivers/char/virtio_console.c
+++ b/drivers/char/virtio_console.c
@@ -24,6 +24,8 @@
 #include 
 #include 
 #include 
+#include 
+#include 
 #include 
 #include 
 #include 
@@ -227,6 +229,7 @@ struct port {
bool guest_connected;
 };
 
+#define MAX_SPLICE_PAGES   32
 /* This is the very early arch-specified put chars function. */
 static int (*early_put_chars)(u32, const char *, int);
 
@@ -474,26 +477,52 @@ static ssize_t send_control_msg(struct port *port, unsigned int event,
return 0;
 }
 
+struct buffer_token {
+   union {
+   void *buf;
+   struct scatterlist *sg;
+   } u;
+   bool sgpages;
+};
+
+static void reclaim_sg_pages(struct scatterlist *sg)
+{
+   int i;
+   struct page *page;
+
+   for (i = 0; i < MAX_SPLICE_PAGES; i++) {
+   page = sg_page(&sg[i]);
+   if (!page)
+   break;
+   put_page(page);
+   }
+   kfree(sg);
+}
+
 /* Callers must take the port->outvq_lock */
 static void reclaim_consumed_buffers(struct port *port)
 {
-   void *buf;
+   struct buffer_token *tok;
unsigned int len;
 
if (!port->portdev) {
/* Device has been unplugged.  vqs are already gone. */
return;
}
-   while ((buf = virtqueue_get_buf(port->out_vq, &len))) {
-   kfree(buf);
+   while ((tok = virtqueue_get_buf(port->out_vq, &len))) {
+   if (tok->sgpages)
+   reclaim_sg_pages(tok->u.sg);
+   else
+   kfree(tok->u.buf);
+   kfree(tok);
port->outvq_full = false;
}
 }
 
-static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
-   bool nonblock)
+static ssize_t __send_to_port(struct port *port, struct scatterlist *sg,
+ int nents, size_t in_count,
+ struct buffer_token *tok, bool nonblock)
 {
-   struct scatterlist sg[1];
struct virtqueue *out_vq;
ssize_t ret;
unsigned long flags;
@@ -505,8 +534,7 @@ static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
 
reclaim_consumed_buffers(port);
 
-   sg_init_one(sg, in_buf, in_count);
-   ret = virtqueue_add_buf(out_vq, sg, 1, 0, in_buf, GFP_ATOMIC);
+   ret = virtqueue_add_buf(out_vq, sg, nents, 0, tok, GFP_ATOMIC);
 
/* Tell Host to go! */
virtqueue_kick(out_vq);
@@ -544,6 +572,37 @@ done:
return in_count;
 }
 
+static ssize_t send_buf(struct port *port, void *in_buf, size_t in_count,
+   bool nonblock)
+{
+   struct scatterlist sg[1];
+   struct buffer_token *tok;
+
+   tok = kmalloc(sizeof(*tok), GFP_ATOMIC);
+   if (!tok)
+   return -ENOMEM;
+   tok->sgpages = false;
+   tok->u.buf = in_buf;
+
+   sg_init_one(sg, in_buf, in_count);
+
+   return __send_to_port(port, sg, 1, in_count, tok, nonblock);
+}
+
+static ssize_t send_pages(struct port *port, struct scatterlist *sg, int nents,
+ size_t in_count, bool nonblock)
+{
+   struct buffer_token *tok;
+
+   tok = kmalloc(sizeof(*tok), GFP_ATOMIC);
+   if (!tok)
+   return -ENOMEM;
+   tok->sgpages = true;
+   tok->u.sg = sg;
+
+   return __send_to_port(port, sg, nents, in_count, tok, nonblock);
+}
+
 /*
  * Give out the data that's requested from the buffer that we have
  * queued up.
@@ -725,6 +784,66 @@ out:
return ret;
 }
 
+struct sg_list {
+   unsigned int n;
+   size_t len;
+   struct scatterlist *sg;
+};
+
+static int pipe_to_sg(struct pipe_inode_info *pipe, struct pipe_buffer *buf,
+   struct splice_desc *sd)
+{
+   struct sg_list *sgl = sd->u.data;
+   unsigned int len = 0;
+
+   if (sgl->n == MAX_SPLICE_PAGES)
+   return 0;
+
+   /* Try lock this page */
+   if (buf->ops->steal(pipe, buf) == 0) {
+   /* Get reference and unlock page for moving */
+   get_page(buf->page);
+   unlock_page(buf->page);
+
+   len = min(buf->len, sd->len);
+   sg_set_page(&(sgl->sg[sgl->n]), buf->page, len, buf->offset);
+   sgl->n++;
+   sgl->len += len;
+   }
+
+   return len;
+}
+
+/* Faster zero-copy write by splicing */
+static ssize_t 

[RFC PATCH 0/6] virtio-trace: Support virtio-trace

2012-07-23 Thread Yoshihiro YUNOMAE
Hi All,

The following patch set provides a low-overhead system for a host to collect
kernel tracing data from guests in a virtualization environment.

A guest OS generally shares some devices with other guests or the host, so
the cause of a problem occurring in a guest may lie in another guest or in the host.
Collecting tracing data from a number of guests and the host is therefore needed
when problems occur in a virtualization environment. One way to do that is to
collect the guests' tracing data in the host, and the network is generally used
for this. However, going through the many layers of the network stack puts a high
load on guest applications that use network I/O. Therefore,
a communication method for collecting the data without using the network is needed.

We submitted a patch set for "IVRing", a ring-buffer driver constructed on
Inter-VM shared memory (IVShmem), to LKML http://lwn.net/Articles/500304/ this
past June. IVRing and the IVRing reader communicate through POSIX shared memory
without using the network, so a low-overhead system for collecting guest tracing
data is realized. However, that patch set has the following problems:
 - it uses IVShmem instead of virtio
 - it creates a new ring-buffer instead of using the existing ring-buffer in the kernel
 - scalability
   -- no support for SMP environments
   -- buffer size limitation
   -- no support for live migration (this may be difficult to realize)

Therefore, we propose a new system, "virtio-trace", which uses an enhanced
virtio-serial and the existing ring-buffer of ftrace to collect guest kernel
tracing data. This system has 5 main components:
 (1) Ring-buffer of ftrace in a guest
 - When the trace agent reads the ring-buffer, a page is removed from it.
 (2) Trace agent in the guest
 - Splices the page of the ring-buffer to read_pipe using splice(), without
   memory copying. The page is then spliced from write_pipe to virtio,
   again without memory copying.
 (3) Virtio-console driver in the guest
 - Passes the page to virtio-ring
 (4) Virtio-serial bus in QEMU
 - Copies the page to a kernel pipe
 (5) Reader in the host
 - Reads guest tracing data via a FIFO (named pipe)
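
The two splice steps performed by the trace agent in component (2) can be sketched in user space as follows. This is an illustrative sketch in C under stated assumptions, not the agent's actual source: the function name splice_through_pipe() is hypothetical, in_fd stands in for a per-CPU ring-buffer file and out_fd for a virtio-console port, and error handling is reduced to returning -1.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>
#include <sys/types.h>

/*
 * Move up to `len` bytes from in_fd to out_fd through an intermediate
 * pipe using splice(2), so the data is not copied through user space.
 * Returns the number of bytes moved, or -1 on error.
 */
static ssize_t splice_through_pipe(int in_fd, int out_fd, size_t len)
{
    int pfd[2];
    ssize_t total = 0;

    if (pipe(pfd) < 0)
        return -1;

    while ((size_t)total < len) {
        /* Step 1: source -> write end of the pipe. */
        ssize_t in = splice(in_fd, NULL, pfd[1], NULL,
                            len - (size_t)total, SPLICE_F_MOVE);
        if (in < 0) {
            total = -1;
            break;
        }
        if (in == 0)
            break;              /* end of input */

        /* Step 2: read end of the pipe -> destination. */
        while (in > 0) {
            ssize_t out = splice(pfd[0], NULL, out_fd, NULL,
                                 (size_t)in, SPLICE_F_MOVE);
            if (out <= 0) {
                total = -1;
                goto done;
            }
            in -= out;
            total += out;
        }
    }
done:
    close(pfd[0]);
    close(pfd[1]);
    return total;
}
```

Whether pages are really moved rather than copied depends on the file types at both ends, which is what patches 1/6 and 4/6 of this series address on the kernel side.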

***Evaluation***
The performance of virtio-trace when a host collects tracing data from a guest
is compared with that of native (just running ftrace),
IVRing, and virtio-serial (normal read/write method).


The overview of this evaluation is as follows:
 (a) A guest on KVM is prepared.
 - One physical CPU is dedicated to the guest as its virtual CPU (VCPU).

 (b) The guest starts to write tracing data to the ring-buffer of ftrace.
 - The probe points are all trace points of sched, timer, and kmem.

 (c) While trace data is being written, dhrystone 2 from UNIX bench is executed
 as a benchmark tool in the guest.
 - Dhrystone 2 reports system performance as a score obtained by repeating
   integer arithmetic.
 - Since a higher score means better system performance, a score lower than
   that of the bare environment indicates that some operation is
   disturbing the integer arithmetic. We therefore define the overhead of
   transporting trace data as follows:
OVERHEAD = (1 - SCORE_OF_A_METHOD/NATIVE_SCORE) * 100.

The performance of each method is compared as follows:
 [0] Native
 - only recording trace data to the ring-buffer on a guest
 [1] Virtio-trace
 - running a trace agent on a guest
 - a reader on a host opens the FIFO using the cat command
 [2] IVRing
 - A SystemTap script in a guest records trace data to IVRing.
   -- probe points are the same as for ftrace.
 [3] Virtio-serial(normal)
 - A reader (using cat) on a guest outputs trace data to a host using
   standard output via virtio-serial.

Other information is as follows:
 - host
   kernel: 3.3.7-1 (Fedora16)
   CPU: Intel Xeon x5660@2.80GHz(12core)
   Memory: 48GB

 - guest(only booting one guest)
   kernel: 3.5.0-rc4+ (Fedora16)
   CPU: 1VCPU(dedicated)
   Memory: 1GB


The results of the three methods, relative to the bare (native) environment,
are as follows:
                      Scores        overhead against [0] Native
[0] Native:           28807569.5    -
[1] Virtio-trace:     28685049.5    0.43%
[2] IVRing:           28418595.5    1.35%
[3] Virtio-serial:    13262258.7    53.96%
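
As a cross-check, the overhead column can be recomputed from the scores with
the OVERHEAD formula given earlier. The sketch below is illustrative only; the
function names are made up for this example, and the Virtio-serial score is
taken as 13262258.7, which is the reading that reproduces the reported 53.96%.

```c
#include <assert.h>
#include <stdio.h>

/* OVERHEAD = (1 - SCORE_OF_A_METHOD / NATIVE_SCORE) * 100, in percent. */
static double overhead_percent(double method_score, double native_score)
{
	assert(native_score > 0.0);
	return (1.0 - method_score / native_score) * 100.0;
}

/* Print the overhead of each method against the native score. */
static void print_overheads(void)
{
	static const struct {
		const char *name;
		double score;
	} rows[] = {
		{ "Virtio-trace",  28685049.5 },
		{ "IVRing",        28418595.5 },
		{ "Virtio-serial", 13262258.7 },	/* assumed reading */
	};
	const double native = 28807569.5;
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++)
		printf("%-14s %6.2f%%\n", rows[i].name,
		       overhead_percent(rows[i].score, native));
}
```

For example, Virtio-trace gives (1 - 28685049.5/28807569.5) * 100, which is
about 0.43.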


***Just enhancement ideas***
 - Support for trace-cmd
 - Support for 9pfs protocol
 - Support for non-blocking mode in QEMU
 - Make "vhost-serial"

Thank you,

---

Masami Hiramatsu (5):
  virtio/console: Allocate scatterlist according to the current pipe size
  ftrace: Allow stealing pages from pipe buffer
  virtio/console: Wait until the port is ready on splice
  virtio/console: Add a failback for unstealable pipe buffer
  virtio/console: Add splice_write support

Yoshihiro YUNOMAE (1):
  tools: Add guest trace agent as a user tool


 drivers/char/virtio_console.c   |  198 ++--
 kernel/trace/trace.c  

Re: [PATCH 00/11] 3.2-stable: Fix for leapsecond caused hrtimer/futex issue

2012-07-23 Thread Ben Hutchings
On Mon, 2012-07-23 at 12:51 -0700, John Stultz wrote:
> On 07/19/2012 01:48 PM, Christoph Biedl wrote:
> > John Stultz wrote...
> >
> >> Attached is the test case I used to reproduce and test the solution
> >> to the hard-hang deadlock.
> > I was wondering whether anybody managed to crash a virtualbox guest
> > using your program. To no avail: using version 4.1.18 on the host and the
> > guest kernel running several 3.0.x (x < 38) kernels on both x32 and
> > x64, the guest utilities were stopped. Rather a fun fact I guess but I
> > wanted to let you know.
> 
> I've been able to crash a kvm guest with an unpatched kernel with my
> test.  The issue requires that the adding of the hrtimer causes the
> clockevent to be reprogrammed. This usually happens if there are no timers
> that expire sooner than the leapsecond timer. So if there are drivers
> that set frequent timers, or set timers right before the leapsecond, it
> may be difficult to trigger this issue.
> 
> Lowering HZ or adding more vcpus might help if you really want to be 
> able to trigger the issue.
[...]

Your test program also made Linux 3.2.23 (or it may have been .21)
lock-up for me in a KVM guest, while 3.2.24-rc1 seemed immune.

Ben.

-- 
Ben Hutchings
If more than one person is responsible for a bug, no one is at fault.




Re: [MMTests] Sysbench read-only on ext3

2012-07-23 Thread Mike Galbraith
On Mon, 2012-07-23 at 22:13 +0100, Mel Gorman wrote:

> The backing database was postgres.

FWIW, that wouldn't have been my choice.  I don't know if it still does,
but it used to use userland spinlocks to achieve scalability.  Turning
your CPUs into space heaters to combat concurrency issues makes a pretty
flat graph, but probably doesn't test kernels as well as something that
did not do that.

-Mike


--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [PATCH] mpc85xx_defconfig: add IDE support for MPC85xxCDS

2012-07-23 Thread Zhao Chenhui
On Fri, Jul 20, 2012 at 03:09:00PM +0100, Alan Cox wrote:
> On Fri, 20 Jul 2012 20:45:25 +0800
> Zhao Chenhui  wrote:
> 
> > Add IDE support for MPC85xxCDS.
> > 
> > Signed-off-by: Zhao Chenhui 
> > ---
> >  arch/powerpc/configs/mpc85xx_defconfig |2 ++
> >  1 files changed, 2 insertions(+), 0 deletions(-)
> > 
> > diff --git a/arch/powerpc/configs/mpc85xx_defconfig b/arch/powerpc/configs/mpc85xx_defconfig
> > index 03ee911..45eda33 100644
> > --- a/arch/powerpc/configs/mpc85xx_defconfig
> > +++ b/arch/powerpc/configs/mpc85xx_defconfig
> > @@ -105,6 +105,8 @@ CONFIG_BLK_DEV_RAM=y
> >  CONFIG_BLK_DEV_RAM_SIZE=131072
> >  CONFIG_MISC_DEVICES=y
> >  CONFIG_EEPROM_LEGACY=y
> > +CONFIG_IDE=y
> > +CONFIG_BLK_DEV_VIA82CXXX=y
> 
> CONFIG_IDE is obsolete; we shouldn't be adding it to anything, as it will
> eventually go away. Please use the ATA drivers.
> 

Thanks. I will replace it with the ATA driver.

-Chenhui



Re: [PATCH] sctp: Make "Invalid Stream Identifier" ERROR follows SACK when bundling

2012-07-23 Thread Vlad Yasevich
xufeng zhang  wrote:

>On 07/19/2012 01:57 PM, xufengzhang.m...@gmail.com wrote:
>> When an "Invalid Stream Identifier" ERROR occurs after processing the
>> received DATA chunks, the ERROR chunk is enqueued into the outqueue
>> before the SACK chunk, so when the ERROR chunk is bundled with the SACK
>> chunk, the ERROR chunk is always placed first in the packet because of
>> its position in the outqueue.
>> This violates the SCTP specification:
>>  RFC 4960 6.5. Stream Identifier and Stream Sequence Number
>>  ...The endpoint may bundle the ERROR chunk in the same
>>  packet as the SACK as long as the ERROR follows the SACK.
>> So we must place the SACK first when bundling an "Invalid Stream
>> Identifier" ERROR and a SACK in one packet.
>> Although we could do that by enqueuing the SACK chunk into the outqueue
>> before the ERROR chunk, that would violate the side-effect interpreter
>> processing. It is easier to do this when dequeuing chunks from the
>> outqueue, so we introduce a flag 'has_isi_err' which indicates
>> whether or not an "Invalid Stream Identifier" ERROR has happened.
>>
>> Signed-off-by: Xufeng Zhang
>> ---
>>   include/net/sctp/structs.h |2 ++
>>   net/sctp/output.c  |   26 ++
>>   2 files changed, 28 insertions(+), 0 deletions(-)
>>
>> diff --git a/include/net/sctp/structs.h b/include/net/sctp/structs.h
>> index 88949a9..5adf4de 100644
>> --- a/include/net/sctp/structs.h
>> +++ b/include/net/sctp/structs.h
>> @@ -842,6 +842,8 @@ struct sctp_packet {
>>  has_sack:1, /* This packet contains a SACK chunk. */
>>  has_auth:1, /* This packet contains an AUTH chunk */
>>  has_data:1, /* This packet contains at least 1 DATA chunk */
>> +has_isi_err:1,  /* This packet contains a "Invalid Stream
>> + * Identifier" ERROR chunk */
>>  ipfragok:1, /* So let ip fragment this packet */
>>  malloced:1; /* Is it malloced? */
>>   };
>> diff --git a/net/sctp/output.c b/net/sctp/output.c
>> index 817174e..77fb1ae 100644
>> --- a/net/sctp/output.c
>> +++ b/net/sctp/output.c
>> @@ -79,6 +79,7 @@ static void sctp_packet_reset(struct sctp_packet *packet)
>>  packet->has_sack = 0;
>>  packet->has_data = 0;
>>  packet->has_auth = 0;
>> +packet->has_isi_err = 0;
>>  packet->ipfragok = 0;
>>  packet->auth = NULL;
>>   }
>> @@ -267,6 +268,7 @@ static sctp_xmit_t sctp_packet_bundle_sack(struct sctp_packet *pkt,
>>   sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
>>   struct sctp_chunk *chunk)
>>   {
>> +struct sctp_chunk *lchunk;
>>  sctp_xmit_t retval = SCTP_XMIT_OK;
>>  __u16 chunk_len = WORD_ROUND(ntohs(chunk->chunk_hdr->length));
>>
>> @@ -316,7 +318,31 @@ sctp_xmit_t sctp_packet_append_chunk(struct sctp_packet *packet,
>>  packet->has_cookie_echo = 1;
>>  break;
>>
>> +case SCTP_CID_ERROR:
>> +if (chunk->subh.err_hdr->cause & SCTP_ERROR_INV_STRM)
>> +packet->has_isi_err = 1;
>> +break;
>> +
>>  case SCTP_CID_SACK:
>> +/* RFC 4960
>> + * 6.5 Stream Identifier and Stream Sequence Number
>> + * The endpoint may bundle the ERROR chunk in the same
>> + * packet as the SACK as long as the ERROR follows the SACK.
>> + */
>> +if (packet->has_isi_err) {
>> +if (list_is_singular(&packet->chunk_list))
>> +list_add(&chunk->list, &packet->chunk_list);
>> +else {
>> +lchunk = list_first_entry(&packet->chunk_list,
>> +struct sctp_chunk, list);
>> +list_add(&chunk->list, &lchunk->list);
>> +}
>>
>And I should clarify the above judgment code.
>AFAIK, there should be two cases for the bundling when invalid stream 
>identifier error happens:
>1). COOKIE_ACK ERROR SACK
>2). ERROR SACK
>So I need to deal with the two cases differently.
>

Sorry, but I just don't buy that the above are the only 2 cases.  What if there
are addip chunks as well?  What if there are other extensions as well?  This
code has to be generic enough to handle any condition.

- vlad

>
>Thanks,
>Xufeng Zhang
>> +packet->size += chunk_len;
>> +chunk->transport = packet->transport;
>> +packet->has_sack = 1;
>> +goto finish;
>> +}
>> +
>>  packet->has_sack = 1;
>>  break;
>>
>>


-- 
Sent from my Android phone with SkitMail. Please excuse my brevity.


Re: [PATCH 1/3] zsmalloc: s/firstpage/page in new copy map funcs

2012-07-23 Thread Minchan Kim
Hi Greg,

On Mon, Jul 23, 2012 at 03:27:49PM -0700, Greg Kroah-Hartman wrote:
> On Mon, Jul 23, 2012 at 05:10:39PM -0500, Seth Jennings wrote:
> > Greg,
> > 
> > I know it's the first Monday after a kernel release and
> > things are crazy for you.  I was hoping to get this zsmalloc
> > stuff in before the merge window hit so I wouldn't have to
> > bother you :-/  But, alas, it didn't happen that way.
> 
> Nope, sorry, it missed them.  It needed to be at least a week previous
> to when the final kernel comes out to get into the next one.
> 
> > Minchan acked these yesterday.  When you get a chance, could
> > you pull these 3 patches?  I'm wanting to send out a
> > promotion patch for zsmalloc and zcache based on these.
> 
> Sorry, it will have to wait until after 3.6-rc1 is out before I will add
> them to my tree for 3.7, that's the merge rules, that you well know :)

I think now is a good time to move zram/zsmalloc out of staging, since the
arch dependency has been removed and there have been many cleanups and some
bug fixes. I hope it can leave staging at this opportunity.
If you have any concerns about that, please let me know.

Thanks!

-- 
Kind regards,
Minchan Kim


Re: [PATCH 0/3] KVM: remove dummy pages

2012-07-23 Thread Xiao Guangrong
On 07/23/2012 10:03 PM, Xiao Guangrong wrote:
> Currently, kvm allocates some pages and uses them as error indicators;
> this wastes memory and is not good for scalability.
> 
> In this patchset, we introduce error codes instead of the pages to
> indicate the error conditions.
> 

Sorry for the noise, there are some typos in the title and in the change-logs,
I will correct them after you guys review the patches.



What went in Linux 3.5 from Xen standpoint.

2012-07-23 Thread Konrad Rzeszutek Wilk
Hey,

Linus released v3.5 this Saturday, and quite a few interesting
components/fixes/features went in!

Going to go down the list of what shortlog provided:
 - Less MSR traps when using AMD machines - so better performance. [andre]
 - The APIC IPI interface works - which means that 'perf' in the initial
   domain works. So no more lockups. Also with some tweaks, that means kdb
   can work as well [ben]
 - Static analyzer fixes in hvc [dan]
 - The march to make HVM domains driver domains continues - the XenBus
   in the initial domain can now search for it in another domain instead
   of assuming it's in the initial domain [daniel]. Marek also expanded
   netfront to now work in the initial domain. [marek]
 - Memory reported by 'xm list' for the initial domain now matches what
   the kernel has been booted with. In the past it would be around 1GB
   less due to the kernel releasing memory that falls within
   E820 gaps/PCI space [david and me]
 - The bootup on two-socket or more AMD CPUs works correctly (I put
   in a temporary way in v3.4) but Lin implemented the correct way
   [lin, jeremy and me]
 - Make xen-blkfront capable of unloading. This requires some extra
   infrastructure work to handle delayed grants [jan]
 - Fixes in the hvc drivers: PVonHVM guests started with 'xm' (not xl!)
   would fail during migration. [me]
 - Fix in blkback: the discard operation would fail in a 32/64 mix
   (so 32-bit guest, 64-bit dom0).
 - Support for 'perf' to work in the initial domain [lin]
 - PVonHVM guests would not properly release event channels on reboot,
   so they would consume additional memory [stefano]
 - Make it possible for upstream QEMU to run in the initial domain along
   with blkfront (and blkback) to allow QCOW images to be mapped [stefano]
 - Fix for multiple PCI domains to not crash during bootup [zhang]
 - Fix to make FLR actually do a proper FLR in xen-pciback. [me]

And the full list:

Andre Przywara (1):
  xen/setup: filter APERFMPERF cpuid feature out

Ben Guthro (1):
  xen: implement apic ipi interface

Dan Carpenter (1):
  hvc_xen: NULL dereference on allocation failure

Daniel De Graaf (1):
  xenbus: Add support for xenbus backend in stub domain

David Vrabel (1):
  xen/setup: update VA mapping when releasing memory during setup

H. Peter Anvin (1):
  xen-acpi-processor: Add missing #include 

Ingo Molnar (2):
  x86/apic: Fix UP boot crash
  x86/xen/apic: Add missing #include 

Jan Beulich (3):
  xen/gnttab: add deferred freeing logic
  xen-blkfront: properly name all devices
  xen-blkfront: module exit handling adjustments

Jana Saout (1):
  xen: Add selfballoning memory reservation tunable.

Konrad Rzeszutek Wilk (20):
  xen/p2m: Move code around to allow for better re-usage.
  xen/p2m: Allow alloc_p2m_middle to call reserve_brk depending on argument
  xen/p2m: Collapse early_alloc_p2m_middle redundant checks.
  xen/p2m: An early bootup variant of set_phys_to_machine
  PCI: move mutex locking out of pci_dev_reset function
  x86/apic: Replace io_apic_ops with x86_io_apic_ops.
  xen/x86: Implement x86_apic_ops
  Revert "xen/x86: Workaround 'x86/ioapic: Add register level checks to detect bogus io-apic entries'"
  xen/setup: Only print "Freeing XXX-YYY pfn range: Z pages freed" if Z > 0
  xen/setup: Populate freed MFNs from non-RAM E820 entries and gaps to E820 RAM
  xen/setup: Combine the two hypercall functions - since they are quite similar.
  xen/acpi/sleep: Enable ACPI sleep via the __acpi_os_prepare_sleep
  xen/smp: unbind irqworkX when unplugging vCPUs.
  xen/hvc: Collapse error logic.
  xen/hvc: Fix error cases around HVM_PARAM_CONSOLE_PFN
  xen/hvc: Check HVM_PARAM_CONSOLE_[EVTCHN|PFN] for correctness.
  xen/events: Add WARN_ON when quick lookup found invalid type.
  xen/balloon: Subtract from xen_released_pages the count that is populated.
  xen/blkback: Copy id field when doing BLKIF_DISCARD.
  xen/blkfront: Add WARN to deal with misbehaving backends.

Lin Ming (2):
  xen/apic: implement io apic read with hypercall
  xen: implement IRQ_WORK_VECTOR handler

Marek Marczykowski (1):
  xen: do not disable netfront in dom0

Srivatsa Vaddagiri (1):
  debugfs: Add support to print u32 array in debugfs

Stefano Stabellini (3):
  xen: enter/exit lazy_mmu_mode around m2p_override calls
  xen: do not map the same GSI twice in PVHVM guests.
  xen: mark local pages as FOREIGN in the m2p_override

Zhang, Yang Z (1):
  xen/pci: Check for PCI bridge before using it.



RE: Do need keep mail in thread when sending a new version patch?

2012-07-23 Thread Du, ChangbinX
Thank you all. Maybe both approaches are acceptable.
After reading your discussion, I would prefer to keep patches in the same
thread if the thread is not too long; otherwise I would start a new thread.

Thanks again.
Du, Changbin

> -----Original Message-----
> From: Randy Dunlap [mailto:rdun...@xenotime.net]
> Sent: Tuesday, July 24, 2012 4:43 AM
> To: Henrique de Moraes Holschuh
> Cc: Du, ChangbinX; linux-kernel@vger.kernel.org
> Subject: Re: Do need keep mail in thread when sending a new version patch?
> 
> On 07/23/2012 11:43 AM, Henrique de Moraes Holschuh wrote:
> 
> > On Mon, 23 Jul 2012, Randy Dunlap wrote:
> >> On 07/23/2012 04:34 AM, Henrique de Moraes Holschuh wrote:
> >>> On Mon, 23 Jul 2012, Du, ChangbinX wrote:
> >>>> Hi, developers!
> >>>>  I have a question about sending patch: If my patch have a new version,
> >>>> do I need set "--in-reply-to" option so keep them in the same thread when
> >>>> sending it? And in which cases should "--in-reply-to" option be set?
> >>>> Thanks!
> >>>
> >>> Yes, please do it when possible.  It groups patches nicely in threaded email
> >>> readers, and it makes it easier for people who are interested in the subject
> >>> to find the new version of your patchset (and all individual patches that
> >>> are part of the same patchset), AND for people who are not interested to
> >>> skip the entire thing.
> >>
> >>
> >> Please compare to a proposed patch:
> >>
> >>   http://marc.info/?l=linux-doc&m=133539116205348&w=2
> >
> > This one (referenced by the commit changelog of the above) seems to be a
> > better explanation:
> >
> > http://lists.infradead.org/pipermail/linux-arm-kernel/2012-April/096236.html
> >
> > Anyway, I stand corrected.
> >
> 
> 
> Well, that patch isn't merged into mainline yet, right?
> and it is only one opinion.
> 
> I don't mind seeing the new versions of a patch sent as
> replies to older versions, so I don't see a need to codify it.
> 
> 
> --
> ~Randy


Re: [PATCH 6/6] rbtree: remove prior augmented rbtree implementation

2012-07-23 Thread Michel Lespinasse
convert arch/x86/mm/pat_rbtree.c to the proposed augmented rbtree api
and remove the old augmented rbtree implementation.

Signed-off-by: Michel Lespinasse 
---
 arch/x86/mm/pat_rbtree.c |   63 
 include/linux/rbtree.h   |8 -
 lib/rbtree.c |   71 --
 lib/rbtree_test.c|   11 ++-
 4 files changed, 47 insertions(+), 106 deletions(-)

diff --git a/arch/x86/mm/pat_rbtree.c b/arch/x86/mm/pat_rbtree.c
index 8acaddd..5b8c8b2 100644
--- a/arch/x86/mm/pat_rbtree.c
+++ b/arch/x86/mm/pat_rbtree.c
@@ -13,6 +13,7 @@
 #include 
 #include 
 #include 
+#include 
 #include 
 #include 
 
@@ -54,27 +55,51 @@ static u64 get_subtree_max_end(struct rb_node *node)
return ret;
 }
 
-/* Update 'subtree_max_end' for a node, based on node and its children */
-static void memtype_rb_augment_cb(struct rb_node *node, void *__unused)
+static u64 compute_subtree_max_end(struct memtype *data)
 {
-   struct memtype *data;
-   u64 max_end, child_max_end;
-
-   if (!node)
-   return;
-
-   data = container_of(node, struct memtype, rb);
-   max_end = data->end;
+   u64 max_end = data->end, child_max_end;
 
-   child_max_end = get_subtree_max_end(node->rb_right);
+   child_max_end = get_subtree_max_end(data->rb.rb_right);
if (child_max_end > max_end)
max_end = child_max_end;
 
-   child_max_end = get_subtree_max_end(node->rb_left);
+   child_max_end = get_subtree_max_end(data->rb.rb_left);
if (child_max_end > max_end)
max_end = child_max_end;
 
-   data->subtree_max_end = max_end;
+   return max_end;
+}
+
+/* Update 'subtree_max_end' after tree rotation. old and new are the
+ * former and current subtree roots */
+static void memtype_rb_rotate_cb(struct rb_node *old, struct rb_node *new)
+{
+   struct memtype *old_data = container_of(old, struct memtype, rb);
+   struct memtype *new_data = container_of(new, struct memtype, rb);
+
+   new_data->subtree_max_end = old_data->subtree_max_end;
+   old_data->subtree_max_end = compute_subtree_max_end(old_data);
+}
+
+static void memtype_rb_copy_cb(struct rb_node *old, struct rb_node *new)
+{
+   struct memtype *old_data = container_of(old, struct memtype, rb);
+   struct memtype *new_data = container_of(new, struct memtype, rb);
+
+   new_data->subtree_max_end = old_data->subtree_max_end;
+}
+
+/* Update 'subtree_max_end' for node and its parents */
+static void memtype_rb_propagate_cb(struct rb_node *node, struct rb_node *stop)
+{
+   while (node != stop) {
+   struct memtype *data = container_of(node, struct memtype, rb);
+   u64 subtree_max_end = compute_subtree_max_end(data);
+   if (data->subtree_max_end == subtree_max_end)
+   break;
+   data->subtree_max_end = subtree_max_end;
+   node = rb_parent(&data->rb);
+   }
 }
 
 /* Find the first (lowest start addr) overlapping range from rb tree */
@@ -179,15 +204,17 @@ static void memtype_rb_insert(struct rb_root *root, struct memtype *newdata)
struct memtype *data = container_of(*node, struct memtype, rb);
 
parent = *node;
+   if (data->subtree_max_end < newdata->end)
+   data->subtree_max_end = newdata->end;
if (newdata->start <= data->start)
node = &((*node)->rb_left);
else if (newdata->start > data->start)
node = &((*node)->rb_right);
}
 
+   newdata->subtree_max_end = newdata->end;
rb_link_node(&newdata->rb, parent, node);
-   rb_insert_color(&newdata->rb, root);
-   rb_augment_insert(&newdata->rb, memtype_rb_augment_cb, NULL);
+   rb_insert_augmented(&newdata->rb, root, memtype_rb_rotate_cb);
 }
 
 int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)
@@ -209,16 +236,14 @@ int rbt_memtype_check_insert(struct memtype *new, unsigned long *ret_type)
 
 struct memtype *rbt_memtype_erase(u64 start, u64 end)
 {
-   struct rb_node *deepest;
struct memtype *data;
 
data = memtype_rb_exact_match(&memtype_rbroot, start, end);
if (!data)
goto out;
 
-   deepest = rb_augment_erase_begin(&data->rb);
-   rb_erase(&data->rb, &memtype_rbroot);
-   rb_augment_erase_end(deepest, memtype_rb_augment_cb, NULL);
+   rb_erase_augmented(&data->rb, &memtype_rbroot, memtype_rb_copy_cb,
+  memtype_rb_propagate_cb, memtype_rb_rotate_cb);
 out:
return data;
 }
diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index bf836a2..487f00b 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -61,14 +61,6 @@ struct rb_root {
 extern void rb_insert_color(struct rb_node *, struct rb_root *);
 extern void rb_erase(struct rb_node *, struct rb_root *);
 
-typedef void (*rb_augment_f)(struct rb_node *node, void 

Re: [PATCH 5/6] rbtree: faster augmented erase

2012-07-23 Thread Michel Lespinasse
Add an augmented tree rotation callback to __rb_erase_color(), so that
augmented tree information can be maintained while rebalancing.

Also introduce rb_erase_augmented(), which is a version of rb_erase()
with augmented tree callbacks. We need three callbacks here: one to
copy the subtree's augmented value after stitching in a new node as
the subtree root (rb_erase_augmented cases 2 and 3), one to propagate
the augmented values up after removing a node, and one to pass up to
__rb_erase_color() to handle rebalancing.

Things are set up so that rb_erase() uses dummy do-nothing callbacks,
which get inlined and eliminated by the compiler, and also inlines the
__rb_erase_color() call so as to generate code similar to before
(once again, the compiler somehow generates smaller code than before
with all that inlining, but the speed seems to be on par). For the
augmented version rb_erase_augmented(), however, we use partial
inlining: we want rb_erase_augmented() and its augmented copy and
propagation callbacks to get inlined together, but we still call into
a generic __rb_erase_color() (passing a non-inlined callback function)
for the rebalancing work. This is intended to strike a reasonable
compromise between speed and compiled code size.

Signed-off-by: Michel Lespinasse 
---
 include/linux/rbtree.h  |5 --
 include/linux/rbtree_internal.h |  137 +
 lib/rbtree.c|  141 ++-
 lib/rbtree_test.c   |   31 ++---
 4 files changed, 180 insertions(+), 134 deletions(-)
 create mode 100644 include/linux/rbtree_internal.h

diff --git a/include/linux/rbtree.h b/include/linux/rbtree.h
index 1364b81..bf836a2 100644
--- a/include/linux/rbtree.h
+++ b/include/linux/rbtree.h
@@ -61,11 +61,6 @@ struct rb_root {
 extern void rb_insert_color(struct rb_node *, struct rb_root *);
 extern void rb_erase(struct rb_node *, struct rb_root *);
 
-typedef void rb_augment_rotate(struct rb_node *old, struct rb_node *new);
-
-extern void rb_insert_augmented(struct rb_node *node, struct rb_root *root,
-   rb_augment_rotate *augment);
-
 typedef void (*rb_augment_f)(struct rb_node *node, void *data);
 
 extern void rb_augment_insert(struct rb_node *node,
diff --git a/include/linux/rbtree_internal.h b/include/linux/rbtree_internal.h
new file mode 100644
index 000..82d2864
--- /dev/null
+++ b/include/linux/rbtree_internal.h
@@ -0,0 +1,137 @@
+#ifndef _LINUX_RBTREE_INTERNAL_H
+#define _LINUX_RBTREE_INTERNAL_H
+
+#define RB_RED    0
+#define RB_BLACK  1
+
+#define rb_color(r)   ((r)->__rb_parent_color & 1)
+#define rb_is_red(r)   (!rb_color(r))
+#define rb_is_black(r) rb_color(r)
+
+static inline void rb_set_parent(struct rb_node *rb, struct rb_node *p)
+{
+   rb->__rb_parent_color = rb_color(rb) | (unsigned long)p;
+}
+
+static inline void rb_set_parent_color(struct rb_node *rb,
+  struct rb_node *p, int color)
+{
+   rb->__rb_parent_color = (unsigned long)p | color;
+}
+
+static inline struct rb_node *rb_red_parent(struct rb_node *red)
+{
+   return (struct rb_node *)red->__rb_parent_color;
+}
+
+typedef void rb_augment_rotate(struct rb_node *old, struct rb_node *new);
+typedef void rb_augment_copy(struct rb_node *old, struct rb_node *new);
+typedef void rb_augment_propagate(struct rb_node *node, struct rb_node *stop);
+
+extern void rb_insert_augmented(struct rb_node *node, struct rb_root *root,
+   rb_augment_rotate *augment);
+extern void __rb_erase_color(struct rb_node *node, struct rb_node *parent,
+struct rb_root *root, rb_augment_rotate *augment);
+
+static inline void
+rb_erase_augmented(struct rb_node *node, struct rb_root *root,
+  rb_augment_copy *augment_copy,
+  rb_augment_propagate *augment_propagate,
+  rb_augment_rotate *augment_rotate)
+{
+   struct rb_node *parent = rb_parent(node);
+   struct rb_node *child = node->rb_right;
+   struct rb_node *tmp = node->rb_left;
+   bool black;
+
+   if (!tmp) {
+   /* Case 1: node to erase has no more than 1 child (easy!) */
+   if (child)
+one_child:
+   rb_set_parent(child, parent);
+   if (parent) {
+   if (parent->rb_left == node)
+   parent->rb_left = child;
+   else
+   parent->rb_right = child;
+   } else
+   root->rb_node = child;
+
+   tmp = parent;
+   black = rb_is_black(node);
+   } else if (!child) {
+   /* Still case 1, but this time the child is node->rb_left */
+   child = tmp;
+   goto one_child;
+   } else {
+   struct rb_node *old = node;
+
+   /*
+   

Re: [PATCH] sctp: Make "Invalid Stream Identifier" ERROR follows SACK when bundling

2012-07-23 Thread Xufeng Zhang
On 7/23/12, Neil Horman  wrote:
> On Mon, Jul 23, 2012 at 10:30:34AM +0800, xufeng zhang wrote:
>> On 07/23/2012 08:49 AM, Neil Horman wrote:
>> >
>> >Not sure I understand how you came into this error.  If we get an invalid
>> >stream, we issue an SCTP_REPORT_TSN side effect, followed by an SCTP_CMD_REPLY
>> >which sends the error chunk.  The reply goes through
>> >sctp_outq_tail->sctp_outq_chunk->sctp_outq_transmit_chunk->sctp_outq_append_chunk.
>> >That last function checks to see if a sack is already part of the packet, and if
>> >there isn't one, appends one, using the updated tsn map.
>> Yes, you are right, but consider the invalid stream identifier's
>> DATA chunk is the first
>> DATA chunk in the association which will need SACK immediately.
>> Here is what I thought of the scenario:
>> sctp_sf_eat_data_6_2()
>> -->sctp_eat_data()
>> -->sctp_make_op_error()
>> -->sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(err))
>> -->sctp_outq_tail()  /* First enqueue ERROR chunk */
>> -->sctp_add_cmd_sf(commands, SCTP_CMD_GEN_SACK, SCTP_FORCE())
>> -->sctp_gen_sack()
>> -->sctp_make_sack()
>> -->sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(sack))
>> -->sctp_outq_tail()  /* Then enqueue SACK chunk */
>>
>> So SACK chunk is enqueued after ERROR chunk.
> Ah, I see.  Since the ERROR and SACK chunks are both control chunks, and since
> we explicitly add the SACK to the control queue instead of going through the
> bundle path in sctp_packet_append_chunk the ordering gets wrong.
>
> Ok, so the problem makes sense.  I think the solution could be a lot easier
> though.  IIRC SACK chunks always live at the head of a packet, so why not just
> special case it in sctp_outq_tail?  I.e. instead of doing a list_add_tail, in
> the else clause of sctp_outq_tail check the chunk_hdr->type to see if it's
> SCTP_CID_SACK.  If it is, use list_add_head rather than list_add_tail.  I think
> that will fix up both the COOKIE_ECHO and ESTABLISHED cases, won't it?  And then
> you won't have to keep track of extra state in the packet configuration.

(Please ignore any duplicate messages you may have received; sorry about that!)

Yes, it's a good idea, but I think the premise is not correct:
RFC 4960 page 57:
"D) Upon reception of the COOKIE ECHO chunk, endpoint "Z" will reply
   with a COOKIE ACK chunk after building a TCB and moving to the
   ESTABLISHED state. A COOKIE ACK chunk may be bundled with any
   pending DATA chunks (and/or SACK chunks), but the COOKIE ACK chunk
   MUST be the first chunk in the packet."

So we can't always put the SACK chunk at the head of the packet.
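
To make the two RFC 4960 constraints discussed here concrete (a COOKIE ACK
must stay first; an ERROR must follow the SACK), here is a small user-space
model of one possible ordering rule. It is only a sketch, not the sctp code:
the types, names and the rule itself ("insert a SACK in front of the first
ERROR already in the packet") are assumptions made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical chunk kinds; not the kernel's SCTP_CID_* values. */
enum chunk_type { CID_DATA, CID_SACK, CID_ERROR, CID_COOKIE_ACK, CID_ASCONF };

#define MAX_CHUNKS 16

struct packet_model {
	enum chunk_type chunks[MAX_CHUNKS];
	size_t n;
};

/*
 * Append a chunk to the packet, except that a SACK is inserted in front of
 * the first ERROR chunk already present. Chunks ahead of that ERROR (such
 * as a leading COOKIE ACK or an ASCONF) keep their positions.
 * Returns 0 on success, -1 if the packet is full.
 */
static int packet_add_chunk(struct packet_model *p, enum chunk_type t)
{
	size_t pos, i;

	assert(p != NULL);
	if (p->n >= MAX_CHUNKS)
		return -1;

	pos = p->n;
	if (t == CID_SACK) {
		for (i = 0; i < p->n; i++) {
			if (p->chunks[i] == CID_ERROR) {
				pos = i;
				break;
			}
		}
	}

	/* Shift the tail up by one slot and place the new chunk. */
	memmove(&p->chunks[pos + 1], &p->chunks[pos],
		(p->n - pos) * sizeof(p->chunks[0]));
	p->chunks[pos] = t;
	p->n++;
	return 0;
}
```

With this rule, enqueuing COOKIE_ACK, ERROR, SACK yields COOKIE_ACK, SACK,
ERROR, and the same holds when some other control chunk sits in front of the
ERROR, without special-casing the list length.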


Thanks,
Xufeng Zhang

>
> Regards
> Neil
>
>


Re: [RFC PATCH 0/6] augmented rbtree changes

2012-07-23 Thread Michel Lespinasse
On Fri, Jul 20, 2012 at 05:31:01AM -0700, Michel Lespinasse wrote:
> Patch 5 speeds up the augmented rbtree erase. Here again we use a tree
> rotation callback during rebalancing; however we also have to propagate
> the augmented node information above nodes being erased and/or stitched,
> and I haven't found a nice enough way to do that. So for now I am proposing
> the simple-stupid way of propagating all the way to the root. More on
> this later.

So, I looked at it again and finally figured out a decent way to avoid
unnecessary propagation here. Going to resend patches 5/6 as replies to
their original postings.
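
The idea of stopping propagation early can be illustrated outside the kernel
with a simplified interval node. This is a sketch under stated assumptions:
struct ival_node, its explicit parent pointer and the function names are
invented for the example and are not the rbtree API; the returned count exists
only to make the early stop visible.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for an augmented tree node (hypothetical layout). */
struct ival_node {
	unsigned long long end;			/* end of this node's interval */
	unsigned long long subtree_max_end;	/* max 'end' in this subtree */
	struct ival_node *parent, *left, *right;
};

/* Recompute the augmented value from the node and its direct children. */
static unsigned long long compute_max_end(const struct ival_node *n)
{
	unsigned long long max_end;

	assert(n != NULL);
	max_end = n->end;
	if (n->left && n->left->subtree_max_end > max_end)
		max_end = n->left->subtree_max_end;
	if (n->right && n->right->subtree_max_end > max_end)
		max_end = n->right->subtree_max_end;
	return max_end;
}

/*
 * Walk from node towards the root, fixing subtree_max_end, and stop as soon
 * as a node's value is already correct: its ancestors cannot change either.
 * Returns how many nodes were rewritten.
 */
static int propagate_max_end(struct ival_node *node, struct ival_node *stop)
{
	int updated = 0;

	while (node != stop) {
		unsigned long long max_end = compute_max_end(node);

		if (node->subtree_max_end == max_end)
			break;
		node->subtree_max_end = max_end;
		node = node->parent;
		updated++;
	}
	return updated;
}
```

Removing a leaf whose interval end was not the subtree maximum touches no
ancestors at all in this model, instead of walking all the way to the root.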

> - The prio tree of all VMAs mapping a given file (struct address_space)
> could be switched to an augmented rbtree based interval tree (thus removing
> the prio tree library in favor of augmented rbtrees)

I actually have a prototype for that already. The augmented rbtree based
implementation is slightly faster than prio tree on insert/erase, and
considerably faster on lookups. However, this is with a synthetic test
exercising prio and rbtrees directly, not with a realistic workload going
through the MM layers. Do we know of situations where prio tree performance
is currently a concern?

> As they stand, patches 3-6 don't seem to make a difference for basic rbtree
> support, and they improve my augmented rbtree insertion/erase benchmark
> by a factor of ~2.1 to ~2.3 depending on test machines.

After rewriting patches 5-6 as discussed above, augmented rbtrees are now
~2.5 - ~2.7 times faster than before this patch series.

-- 
Michel "Walken" Lespinasse
A program is never fully debugged until the last user dies.


[PATCH 1/4] fbcon: use kstrtouint instead of deprecated function simple_strtoul.

2012-07-23 Thread Paul Cercueil

Signed-off-by: Paul Cercueil 
---
 drivers/video/console/fbcon.c |  102 -
 1 file changed, 69 insertions(+), 33 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 2e471c2..a0b1818 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -435,7 +435,7 @@ static void fbcon_del_cursor_timer(struct fb_info *info)
 static int __init fb_console_setup(char *this_opt)
 {
char *options;
-   int i, j;
+   int i, j, ret;
 
if (!this_opt || !*this_opt)
return 1;
@@ -445,18 +445,29 @@ static int __init fb_console_setup(char *this_opt)
strcpy(fontname, options + 5);

if (!strncmp(options, "scrollback:", 11)) {
+   char *k;
options += 11;
-   if (*options) {
-   fbcon_softback_size = simple_strtoul(options, &options, 0);
-   if (*options == 'k' || *options == 'K') {
-   fbcon_softback_size *= 1024;
-   options++;
-   }
-   if (*options != ',')
-   return 1;
-   options++;
-   } else
-   return 1;
+   k = options;
+
+   while (*k != '\0' && *k != 'k' && *k != 'K')
+   k++;
+
+   /* Clear the 'k' or 'K' suffix to
+* prevent errors with kstrtouint */
+   if (*k != '\0')
+   *k++ = '\0';
+   else
+   k = NULL;
+
+   ret = kstrtouint(options, 0, (unsigned int *)
+   &fbcon_softback_size);
+
+   if (!ret && k)
+   fbcon_softback_size *= 1024;
+
+   /* (k && *k): Check for garbage after the suffix */
+   if (ret || (k && *k))
+   pr_warn("fbcon: scrollback: incorrect value.\n");
}

if (!strncmp(options, "map:", 4)) {
@@ -476,22 +487,44 @@ static int __init fb_console_setup(char *this_opt)
}
 
if (!strncmp(options, "vc:", 3)) {
+   char *dash;
options += 3;
-   if (*options)
-   first_fb_vc = simple_strtoul(options, &options, 10) - 1;
-   if (first_fb_vc < 0)
-   first_fb_vc = 0;
-   if (*options++ == '-')
-   last_fb_vc = simple_strtoul(options, &options, 10) - 1;
-   fbcon_is_default = 0; 
-   }   
+
+   dash = strchr(options, '-');
+   if (dash)
+   *dash++ = '\0';
+
+   ret = kstrtouint(options, 10,
+   (unsigned int *) &first_fb_vc);
+   if (!ret) {
+   if (--first_fb_vc < 0)
+   first_fb_vc = 0;
+
+   if (dash) {
+   ret = kstrtouint(dash, 10,
+   (unsigned int *)
+   &last_fb_vc);
+   if (!ret)
+   last_fb_vc--;
+   }
+   }
+
+   if (!ret)
+   fbcon_is_default = 0;
+   else
+   pr_warn("fbcon: vc: incorrect value.\n");
+   }
 
if (!strncmp(options, "rotate:", 7)) {
options += 7;
-   if (*options)
-   initial_rotation = simple_strtoul(options, &options, 0);
-   if (initial_rotation > 3)
-   initial_rotation = 0;
+   ret = kstrtouint(options, 0, (unsigned int *)
+   &initial_rotation);
+   if (!ret) {
+   if (initial_rotation > 3)
+   initial_rotation = 0;
+   } else {
+   pr_warn("fbcon: rotate: incorrect value.\n");
+   }
}
}
return 1;
@@ -3312,8 +3345,8 @@ static ssize_t store_rotate(struct device *device,
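
The strict-parsing idea behind the scrollback hunk can be tried out in userspace. The sketch below is an assumption-laden stand-in, not the patch itself: `parse_size()` is a hypothetical helper, `strtoul()` replaces the kernel-only `kstrtouint()`, and the suffix is checked after the number instead of being cut out of the string beforehand.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>

/*
 * Parse "<number>[k|K]" strictly: the whole string must be consumed.
 * Returns 0 on success and stores the size in *out, or -1 on error.
 * strtoul() stands in for the kernel's kstrtouint() here.
 */
static int parse_size(const char *s, unsigned int *out)
{
    char *end;
    unsigned long val;

    assert(s != NULL && out != NULL);
    if (!isdigit((unsigned char)*s))   /* reject empty input, signs, spaces */
        return -1;

    errno = 0;
    val = strtoul(s, &end, 0);
    if (errno || val > UINT_MAX)
        return -1;

    if (*end == 'k' || *end == 'K') {
        if (val > UINT_MAX / 1024)     /* multiplication would overflow */
            return -1;
        val *= 1024;
        end++;
    }
    if (*end != '\0')                  /* garbage after number or suffix */
        return -1;

    *out = (unsigned int)val;
    return 0;
}
```

Unlike `simple_strtoul()`, which silently stops at the first non-digit, this style reports trailing garbage such as "64kb" as an error, which is the behaviour the warning messages in the patch rely on.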

[PATCH 2/4] fbcon: prevent possible buffer overflow.

2012-07-23 Thread Paul Cercueil

Signed-off-by: Paul Cercueil 
---
 drivers/video/console/fbcon.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index a0b1818..3ffab97 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -442,7 +442,7 @@ static int __init fb_console_setup(char *this_opt)
 
while ((options = strsep(&this_opt, ",")) != NULL) {
if (!strncmp(options, "font:", 5))
-   strcpy(fontname, options + 5);
+   strlcpy(fontname, options + 5, sizeof(fontname));

if (!strncmp(options, "scrollback:", 11)) {
char *k;
-- 
1.7.10.4
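
As a rough userspace illustration of why the bounded copy matters, here is a minimal sketch of a `strlcpy()`-style helper. `bounded_copy()` is a hypothetical name used only for this example; it is written out by hand because `strlcpy()` is not available in every C library.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Copy 'src' into 'dst' (capacity 'size' bytes), truncating if needed
 * and always NUL-terminating when size > 0. Returns strlen(src), so a
 * return value >= size tells the caller that truncation happened.
 */
static size_t bounded_copy(char *dst, const char *src, size_t size)
{
    size_t len;

    assert(dst != NULL && src != NULL);
    len = strlen(src);
    if (size > 0) {
        size_t n = (len < size - 1) ? len : size - 1;

        memcpy(dst, src, n);
        dst[n] = '\0';
    }
    return len;
}
```

With plain `strcpy()`, an over-long "font:" value would write past the end of the destination; here the worst case is a truncated name.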



[PATCH 3/4] fbcon: continue parsing parameters after an error.

2012-07-23 Thread Paul Cercueil

Signed-off-by: Paul Cercueil 
---
 drivers/video/console/fbcon.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 3ffab97..9b83b75 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -481,9 +481,9 @@ static int __init fb_console_setup(char *this_opt)
}
 
fbcon_map_override();
+   } else {
+   pr_warn("fbcon: map: incorrect value.\n");
}
-
-   return 1;
}
 
if (!strncmp(options, "vc:", 3)) {
-- 
1.7.10.4



[PATCH 4/4] fbcon: optimize parameters parsing loop.

2012-07-23 Thread Paul Cercueil

Signed-off-by: Paul Cercueil 
---
 drivers/video/console/fbcon.c |8 +++-
 1 file changed, 7 insertions(+), 1 deletion(-)

diff --git a/drivers/video/console/fbcon.c b/drivers/video/console/fbcon.c
index 9b83b75..1ecaf68 100644
--- a/drivers/video/console/fbcon.c
+++ b/drivers/video/console/fbcon.c
@@ -441,8 +441,10 @@ static int __init fb_console_setup(char *this_opt)
return 1;
 
while ((options = strsep(&this_opt, ",")) != NULL) {
-   if (!strncmp(options, "font:", 5))
+   if (!strncmp(options, "font:", 5)) {
strlcpy(fontname, options + 5, sizeof(fontname));
+   continue;
+   }

if (!strncmp(options, "scrollback:", 11)) {
char *k;
@@ -468,6 +470,7 @@ static int __init fb_console_setup(char *this_opt)
/* (k && *k): Check for garbage after the suffix */
if (ret || (k && *k))
printk(KERN_WARNING "fbcon: scrollback: incorrect value.\n");
+   continue;
}

if (!strncmp(options, "map:", 4)) {
@@ -484,6 +487,7 @@ static int __init fb_console_setup(char *this_opt)
} else {
printk(KERN_WARNING "fbcon: map: incorrect value.\n");
}
+   continue;
}
 
if (!strncmp(options, "vc:", 3)) {
@@ -513,6 +517,7 @@ static int __init fb_console_setup(char *this_opt)
fbcon_is_default = 0;
else
printk(KERN_WARNING "fbcon: vc: incorrect value.\n");
+   continue;
}
 
if (!strncmp(options, "rotate:", 7)) {
@@ -525,6 +530,7 @@ static int __init fb_console_setup(char *this_opt)
} else {
printk(KERN_WARNING "fbcon: rotate: incorrect value.\n");
}
+   continue;
}
}
return 1;
-- 
1.7.10.4
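
The effect of the added `continue` statements can be seen in a small userspace sketch. Everything here is hypothetical scaffolding for illustration: `struct opts`, `next_token()` (a minimal stand-in for `strsep()`) and `parse_options()` are not taken from fbcon.c.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical result structure for this sketch only. */
struct opts {
    int fonts, rotates, unknown;
};

/* Minimal strsep()-like helper: cut the next comma-separated token in place. */
static char *next_token(char **cursor)
{
    char *tok = *cursor, *comma;

    if (!tok)
        return NULL;
    comma = strchr(tok, ',');
    if (comma) {
        *comma = '\0';
        *cursor = comma + 1;
    } else {
        *cursor = NULL;
    }
    return tok;
}

/* Classify each token once; 'continue' skips the remaining prefix tests. */
static void parse_options(char *line, struct opts *o)
{
    char *tok;

    assert(line != NULL && o != NULL);
    while ((tok = next_token(&line)) != NULL) {
        if (!strncmp(tok, "font:", 5)) {
            o->fonts++;
            continue;
        }
        if (!strncmp(tok, "rotate:", 7)) {
            o->rotates++;
            continue;
        }
        o->unknown++;   /* reached only when no prefix matched */
    }
}
```

Since a token can match at most one prefix, jumping to the next token after a match saves the remaining `strncmp()` calls and also gives a natural place to handle unrecognised options.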



Re: [PATCH v2] mmc: dw_mmc: Disable low power mode if SDIO interrupts are used

2012-07-23 Thread Jaehoon Chung
On 07/24/2012 02:02 AM, Doug Anderson wrote:
> The documentation for the dw_mmc part says that the low power
> mode should normally only be set for MMC and SD memory and should
> be turned off for SDIO cards that need interrupts detected.
> 
> The best place I could find to do this is when the SDIO interrupt
> was first enabled.  I rely on the fact that dw_mci_setup_bus()
> will be called when it's time to reenable.
> 
> Signed-off-by: Doug Anderson 
> ---
> Changes in v2:
> - Commenting fixes requested by Grant Grundler.
> - Be extra certain that we don't re-turn on the low power mode in
>   CLKENA in dw_mci_setup_bus() if SDIO interrupts are enabled.
>   There are no known instances of this happening but it's good to be safe.
> 
>  drivers/mmc/host/dw_mmc.c |   43 ---
>  1 files changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index 72dc3cd..0ab1771 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -627,6 +627,7 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot)
>  {
>   struct dw_mci *host = slot->host;
>   u32 div;
> + u32 clk_en_a;
>  
>   if (slot->clock != host->current_speed) {
>   div = host->bus_hz / slot->clock;
> @@ -659,9 +660,11 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot)
>   mci_send_cmd(slot,
>SDMMC_CMD_UPD_CLK | SDMMC_CMD_PRV_DAT_WAIT, 0);
>  
> - /* enable clock */
> - mci_writel(host, CLKENA, ((SDMMC_CLKEN_ENABLE |
> -SDMMC_CLKEN_LOW_PWR) << slot->id));
> + /* enable clock; only low power if no SDIO */
> + clk_en_a = SDMMC_CLKEN_ENABLE << slot->id;
> + if (!(mci_readl(host, INTMASK) & SDMMC_INT_SDIO(slot->id)))
> + clk_en_a |= SDMMC_CLKEN_LOW_PWR << slot->id;
> + mci_writel(host, CLKENA, clk_en_a);
>  
>   /* inform CIU */
>   mci_send_cmd(slot,
> @@ -862,6 +865,32 @@ static int dw_mci_get_cd(struct mmc_host *mmc)
>   return present;
>  }
>  
> +/*
> + * Disable lower power mode.
> + *
> + * Low power mode will stop the card clock when idle.  According to
> + * documentation (Exynos 5250 User's Manual 0.04, description of
> + * CLKENA register) we should disable low power mode for SDIO cards
> + * if we need interrupts to work.
> + *
As Seungwon mentioned, this is not Exynos5250-specific.
> + * This function is fast if the power mode is already disabled.
> + */
> +static void dw_mci_disable_low_power(struct mmc_host *mmc)
> +{
> + struct dw_mci_slot *slot = mmc_priv(mmc);
> + struct dw_mci *host = slot->host;
> + u32 clk_en_a;
> + const u32 clken_low_pwr = SDMMC_CLKEN_LOW_PWR << slot->id;
> +
> + clk_en_a = mci_readl(host, CLKENA);
> +
> + if (clk_en_a & clken_low_pwr) {
> + mci_writel(host, CLKENA, clk_en_a & ~clken_low_pwr);
> + mci_send_cmd(slot, SDMMC_CMD_UPD_CLK |
> +  SDMMC_CMD_PRV_DAT_WAIT, 0);
> + }
> +}
> +
>  static void dw_mci_enable_sdio_irq(struct mmc_host *mmc, int enb)
>  {
>   struct dw_mci_slot *slot = mmc_priv(mmc);
> @@ -871,6 +900,14 @@ static void dw_mci_enable_sdio_irq(struct mmc_host *mmc, int enb)
>   /* Enable/disable Slot Specific SDIO interrupt */
>   int_mask = mci_readl(host, INTMASK);
>   if (enb) {
> + /*
> +  * Turn off low power mode if it was enabled.  This is a bit of
> +  * a heavy operation and we disable / enable IRQs a lot, so
> +  * we'll leave low power mode disabled and it will get
> +  * re-enabled again in dw_mci_setup_bus().
> +  */
> + dw_mci_disable_low_power(mmc);
How about using "slot" instead of "mmc"?
> +
>   mci_writel(host, INTMASK,
>  (int_mask | SDMMC_INT_SDIO(slot->id)));
>   } else {
> 
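
To make the CLKENA logic in the quoted dw_mci_setup_bus() hunk easier to reason about, here is a small userspace sketch of the same bit arithmetic. The bit positions below are assumptions chosen for illustration and are not quoted from dw_mmc.h; `clkena_value()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Bit positions are assumptions for this sketch, not quoted from dw_mmc.h. */
#define CLKEN_ENABLE    (UINT32_C(1) << 0)
#define CLKEN_LOW_PWR   (UINT32_C(1) << 16)
#define INT_SDIO(slot)  (UINT32_C(1) << (16 + (slot)))

/*
 * Compute the value to write to CLKENA for one slot: the clock is always
 * enabled, and low power mode is requested only when the slot's SDIO
 * interrupt is not unmasked in 'intmask'.
 */
static uint32_t clkena_value(uint32_t intmask, unsigned int slot_id)
{
    uint32_t val;

    assert(slot_id < 16);
    val = CLKEN_ENABLE << slot_id;
    if (!(intmask & INT_SDIO(slot_id)))
        val |= CLKEN_LOW_PWR << slot_id;
    return val;
}
```

The point of deriving the value from INTMASK is that a later call to the setup path cannot accidentally turn low power mode back on while SDIO interrupts are still in use.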




[PATCH] extcon: extcon_gpio: Replace gpio_request_one by devm_gpio_request_one

2012-07-23 Thread Axel Lin
Commit 01eaf24 "extcon: Convert extcon_gpio to devm_gpio_request_one"
missed replacing gpio_request_one with devm_gpio_request_one. Fix it.

Signed-off-by: Axel Lin 
---
 drivers/extcon/extcon_gpio.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/extcon/extcon_gpio.c b/drivers/extcon/extcon_gpio.c
index fe3db45..3cc152e 100644
--- a/drivers/extcon/extcon_gpio.c
+++ b/drivers/extcon/extcon_gpio.c
@@ -107,7 +107,8 @@ static int __devinit gpio_extcon_probe(struct platform_device *pdev)
if (ret < 0)
return ret;
 
-   ret = gpio_request_one(extcon_data->gpio, GPIOF_DIR_IN, pdev->name);
+   ret = devm_gpio_request_one(&pdev->dev, extcon_data->gpio, GPIOF_DIR_IN,
+   pdev->name);
if (ret < 0)
goto err;
 
-- 
1.7.9.5





Re: [PATCH 1/3] drivers/misc: Add realtek card reader core driver

2012-07-23 Thread wwang

Hi Borislav:

Realtek card reader supports not only SDMMC card, but also Memory stick. 
This part is the common code, so it is located in drivers/misc. There is 
also SDMMC-relevant code under CONFIG_MMC. And in the future, 
Memstick-relevant code will be added under CONFIG_MEMSTICK.


BR,
wwang

On 2012-07-24 00:33, Borislav Petkov wrote:

On Mon, Jul 23, 2012 at 05:42:38PM +0800, wei_w...@realsil.com.cn wrote:

From: Wei WANG 

Realtek card reader core driver is the bus driver for Realtek
driver-based card reader, which supplies adapter layer to
be used by lower-level pci/usb card reader and upper-level
sdmmc/memstick host driver.

Signed-off-by: Wei WANG 
---
  Documentation/misc-devices/realtek_cr.txt |   27 ++
  drivers/misc/Kconfig  |1 +
  drivers/misc/Makefile |1 +
  drivers/misc/realtek_cr/Kconfig   |   26 ++
  drivers/misc/realtek_cr/Makefile  |7 +
  drivers/misc/realtek_cr/core/Kconfig  |6 +
  drivers/misc/realtek_cr/core/Makefile |1 +
  drivers/misc/realtek_cr/core/rtsx_core.c  |  492 +
  include/linux/rtsx_core.h |  183 +++
  9 files changed, 744 insertions(+)
  create mode 100644 Documentation/misc-devices/realtek_cr.txt
  create mode 100644 drivers/misc/realtek_cr/Kconfig
  create mode 100644 drivers/misc/realtek_cr/Makefile
  create mode 100644 drivers/misc/realtek_cr/core/Kconfig
  create mode 100644 drivers/misc/realtek_cr/core/Makefile
  create mode 100644 drivers/misc/realtek_cr/core/rtsx_core.c
  create mode 100644 include/linux/rtsx_core.h

diff --git a/Documentation/misc-devices/realtek_cr.txt b/Documentation/misc-devices/realtek_cr.txt
new file mode 100644
index 000..b4e6fbe
--- /dev/null
+++ b/Documentation/misc-devices/realtek_cr.txt
@@ -0,0 +1,27 @@
+Realtek Driver-based Card Reader
+
+
+Supported chips:
+RTS5209
+RTS5229
+
+Contact Email:
+pc_sw_li...@realsil.com.cn
+
+
+Description
+---
+
+Realtek driver-based card reader supports access to many types of memory cards,
+such as Memory Stick, Memory Stick Pro, Secure Digital and MultiMediaCard.
+
+
+udev rules
+--
+
+In order to modprobe Realtek SD/MMC interface driver automatically, the following rule
+should be added to the udev rules file:
+
+SUBSYSTEM=="rtsx_cr", ENV{RTSX_CARD_TYPE}=="SD", RUN+="/sbin/modprobe -bv rtsx_sdmmc"
+
+Typically, we may edit /lib/udev/rules.d/80-drivers.rules and copy the rule into it in Ubuntu.
diff --git a/drivers/misc/Kconfig b/drivers/misc/Kconfig
index 2661f6e..09ce905 100644
--- a/drivers/misc/Kconfig
+++ b/drivers/misc/Kconfig
@@ -517,4 +517,5 @@ source "drivers/misc/lis3lv02d/Kconfig"
  source "drivers/misc/carma/Kconfig"
  source "drivers/misc/altera-stapl/Kconfig"
  source "drivers/misc/mei/Kconfig"
+source "drivers/misc/realtek_cr/Kconfig"
  endmenu
diff --git a/drivers/misc/Makefile b/drivers/misc/Makefile
index 456972f..c09f147 100644
--- a/drivers/misc/Makefile
+++ b/drivers/misc/Makefile
@@ -51,3 +51,4 @@ obj-y += carma/
  obj-$(CONFIG_USB_SWITCH_FSA9480) += fsa9480.o
  obj-$(CONFIG_ALTERA_STAPL)+=altera-stapl/
  obj-$(CONFIG_INTEL_MEI)   += mei/
+obj-$(CONFIG_REALTEK_CR_SUPPORT) += realtek_cr/
diff --git a/drivers/misc/realtek_cr/Kconfig b/drivers/misc/realtek_cr/Kconfig
new file mode 100644
index 000..303d98a
--- /dev/null
+++ b/drivers/misc/realtek_cr/Kconfig
@@ -0,0 +1,26 @@
+#
+# Realtek driver-based card reader
+#
+
+menuconfig REALTEK_CR_SUPPORT
+   tristate "Realtek driver-based card reader"
+   help
+ Realtek driver-based card reader supports access to many types of
+ memory cards, such as Memory Stick, Memory Stick Pro, Secure Digital
+ and MultiMediaCard.
+
+ If you want to use Realtek driver-based card reader, enable this
+ option and other options below.
+
+config REALTEK_CR_DEBUG
+   bool "Realtek driver-based card reader debugging"
+   depends on REALTEK_CR_SUPPORT != n
+   help
+ This is an option for use by developers; most people should
+ say N here.  This enables Realtek card reader driver debugging.
+
+if REALTEK_CR_SUPPORT
+
+source "drivers/misc/realtek_cr/core/Kconfig"
+
+endif

Ok, maybe I'm a newbie here but this is a card reader driver and AFAICT
it should be placed under CONFIG_MMC. Why is it under drivers/misc?





Re: [QUESTION ON BUG] the rcu stall issue could not be reproduced

2012-07-23 Thread Michael Wang
On 07/24/2012 02:46 AM, Martin Mokrejs wrote:
> Hi,
>   I see a few more RCU bugs reported in bugzilla:
> 
> https://bugzilla.kernel.org/show_bug.cgi?id=43028
> https://bugzilla.kernel.org/show_bug.cgi?id=40092
> https://bugzilla.kernel.org/show_bug.cgi?id=42997
> 
> And, I placed my previous long email with logs at
> https://bugzilla.kernel.org/show_bug.cgi?id=45091
> 
> Hope this helps eventually once.

That's very helpful, I need some time to read and think about it, thanks
for the info ;-)

Regards,
Michael Wang
> Martin
> 
> Mike Galbraith wrote:
>> On Fri, 2012-07-20 at 11:09 +0800, Michael Wang wrote: 
>>> Hi, Mike, Martin, Dan
>>>
>>> I'm currently keeping an eye on the rcu stall issue which was reported by
>>> you in the mail:
>>>
>>> rcu: endless stalls
>>> From: Mike Galbraith
>>> linux-3.4-rc7: rcu_sched self-detected stall on CPU
>>> From: Martin Mokrejs
>>> RCU stalls in linux-next
>>> From: Dan Carpenter
>>>
>>> I try to reproduce the issue on my X86 server with 12 cpu
>>
>> The 'endless stalls' box was 341.3 times larger.  Dunno if you can
>> even set a serial port slow enough to approximate all cores trying to
>> gripe through a single pinhole simultaneously.
>>
>> -Mike
>>
>>
> 
> 




Re: Attaching a process to cgroups

2012-07-23 Thread Mike Galbraith
On Mon, 2012-07-23 at 22:41 +0200, Andrea Righi wrote: 
> On Thu, Jun 21, 2012 at 10:23:02AM +0200, Mike Galbraith wrote:
> > On Thu, 2012-06-21 at 11:54 +0400, Alexey Vlasov wrote: 
> > > On Wed, Jun 20, 2012 at 02:28:18PM +0200, Mike Galbraith wrote:
> > > > 
> > > > kernel/cgroup.c::cgroup_attach_task()
> > > > {
> > > > ...
> > > > synchronize_rcu();
> > > > ...
> > > > }
> > > 
> > > So nothing can be done here? (I mean if only I knew how to fix it I
> > > wouldn't ask about it ;)
> > 
> > Sure, kill the obnoxious thing, it's sitting right in the middle of the
> > userspace interface.
> > 
> > I banged on it a while back (wrt explosive android patches), extracted
> > RCU from the userspace interface.  It seemed to work great, much faster,
> > couldn't make it explode.  I wouldn't bet anything I wasn't willing to
> > immediately part with that the result was really really safe though ;-)
> > 
> > -Mike
> 
> JFYI,
> 
> I'm testing the following patch in a bunch of hosts and I wasn't able to
> make any of them to explode, even running a multi-threaded
> cgroup-intensive workload, but probably I was just lucky (or unlucky,
> depending on the point of view).
> 
> It is basically the same Not-signed-off-by work posted by Mike a while
> ago: https://lkml.org/lkml/2011/4/12/599.
> 
> In addition, I totally removed the synchronize_rcu() call from
> cgroup_attach_task() and added the call_rcu -> schedule_work removal
> also for css_set. The latter looks unnecessary to me from a logical
> point of view, or maybe I'm missing something, because I can't explain
> why with it I can't trigger any BUG / oops.
> 
> Mike, did you make any progress from your old patch?

No, it worked, but I couldn't prove it was really safe, so let it drop.

-Mike



RE: [PATCH v2] mmc: dw_mmc: Disable low power mode if SDIO interrupts are used

2012-07-23 Thread Seungwon Jeon
On July 24, 2012, Doug Anderson wrote:
> The documentation for the dw_mmc part says that the low power
> mode should normally only be set for MMC and SD memory and should
> be turned off for SDIO cards that need interrupts detected.
> 
> The best place I could find to do this is when the SDIO interrupt
> was first enabled.  I rely on the fact that dw_mci_setup_bus()
> will be called when it's time to reenable.
> 
> Signed-off-by: Doug Anderson 
> ---
> Changes in v2:
> - Commenting fixes requested by Grant Grundler.
> - Be extra certain that we don't re-turn on the low power mode in
>   CLKENA in dw_mci_setup_bus() if SDIO interrupts are enabled.
>   There are no known instances of this happening but it's good to be safe.
> 
>  drivers/mmc/host/dw_mmc.c |   43 ---
>  1 files changed, 40 insertions(+), 3 deletions(-)
> 
> diff --git a/drivers/mmc/host/dw_mmc.c b/drivers/mmc/host/dw_mmc.c
> index 72dc3cd..0ab1771 100644
> --- a/drivers/mmc/host/dw_mmc.c
> +++ b/drivers/mmc/host/dw_mmc.c
> @@ -627,6 +627,7 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot)
>  {
>   struct dw_mci *host = slot->host;
>   u32 div;
> + u32 clk_en_a;
> 
>   if (slot->clock != host->current_speed) {
>   div = host->bus_hz / slot->clock;
> @@ -659,9 +660,11 @@ static void dw_mci_setup_bus(struct dw_mci_slot *slot)
>   mci_send_cmd(slot,
>SDMMC_CMD_UPD_CLK | SDMMC_CMD_PRV_DAT_WAIT, 0);
> 
> - /* enable clock */
> - mci_writel(host, CLKENA, ((SDMMC_CLKEN_ENABLE |
> -SDMMC_CLKEN_LOW_PWR) << slot->id));
> + /* enable clock; only low power if no SDIO */
> + clk_en_a = SDMMC_CLKEN_ENABLE << slot->id;
> + if (!(mci_readl(host, INTMASK) & SDMMC_INT_SDIO(slot->id)))
> + clk_en_a |= SDMMC_CLKEN_LOW_PWR << slot->id;
> + mci_writel(host, CLKENA, clk_en_a);
I have followed this patch from v1.
I think it's a good point.
Looks good to me.

> 
>   /* inform CIU */
>   mci_send_cmd(slot,
> @@ -862,6 +865,32 @@ static int dw_mci_get_cd(struct mmc_host *mmc)
>   return present;
>  }
> 
> +/*
> + * Disable lower power mode.
> + *
> + * Low power mode will stop the card clock when idle.  According to
> + * documentation (Exynos 5250 User's Manual 0.04, description of
> + * CLKENA register) we should disable low power mode for SDIO cards
> + * if we need interrupts to work.
The comment above is correct, but this is not specific to Exynos5250.
Exynos5250 is just one of the host controllers based on Synopsys's.
It'd be better to remove the part related to Exynos that you mentioned.


> + *
> + * This function is fast if the power mode is already disabled.
Surely this should say "low power mode", not "power mode", right?

Best regards,
Seungwon Jeon

> + */
> +static void dw_mci_disable_low_power(struct mmc_host *mmc)
> +{
> + struct dw_mci_slot *slot = mmc_priv(mmc);
> + struct dw_mci *host = slot->host;
> + u32 clk_en_a;
> + const u32 clken_low_pwr = SDMMC_CLKEN_LOW_PWR << slot->id;
> +
> + clk_en_a = mci_readl(host, CLKENA);
> +
> + if (clk_en_a & clken_low_pwr) {
> + mci_writel(host, CLKENA, clk_en_a & ~clken_low_pwr);
> + mci_send_cmd(slot, SDMMC_CMD_UPD_CLK |
> +  SDMMC_CMD_PRV_DAT_WAIT, 0);
> + }
> +}
> +
>  static void dw_mci_enable_sdio_irq(struct mmc_host *mmc, int enb)
>  {
>   struct dw_mci_slot *slot = mmc_priv(mmc);
> @@ -871,6 +900,14 @@ static void dw_mci_enable_sdio_irq(struct mmc_host *mmc, int enb)
>   /* Enable/disable Slot Specific SDIO interrupt */
>   int_mask = mci_readl(host, INTMASK);
>   if (enb) {
> + /*
> +  * Turn off low power mode if it was enabled.  This is a bit of
> +  * a heavy operation and we disable / enable IRQs a lot, so
> +  * we'll leave low power mode disabled and it will get
> +  * re-enabled again in dw_mci_setup_bus().
> +  */
> + dw_mci_disable_low_power(mmc);
> +
>   mci_writel(host, INTMASK,
>  (int_mask | SDMMC_INT_SDIO(slot->id)));
>   } else {
> --
> 1.7.7.3
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-mmc" in
> the body of a message to majord...@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html



[RFC 21/23] ARM: keystone: introducing TI Keystone platform

2012-07-23 Thread Cyril Chemparathy
Texas Instruments Keystone family of multicore devices now includes an
upcoming slew of Cortex A15 based devices.  This patch adds basic definitions
for a new Keystone sub-architecture in ARM.

Subsequent patches in this series will extend support to include SMP and take
advantage of the large physical memory addressing capabilities via LPAE.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/Kconfig  |   17 +
 arch/arm/Makefile |1 +
 arch/arm/boot/dts/keystone-sim.dts|   77 +++
 arch/arm/configs/keystone_defconfig   |   20 +
 arch/arm/mach-keystone/Makefile   |1 +
 arch/arm/mach-keystone/Makefile.boot  |1 +
 arch/arm/mach-keystone/include/mach/debug-macro.S |   44 +++
 arch/arm/mach-keystone/include/mach/entry-macro.S |   20 +
 arch/arm/mach-keystone/include/mach/io.h  |   22 ++
 arch/arm/mach-keystone/include/mach/memory.h  |   22 ++
 arch/arm/mach-keystone/include/mach/system.h  |   30 
 arch/arm/mach-keystone/include/mach/timex.h   |   21 ++
 arch/arm/mach-keystone/include/mach/uncompress.h  |   24 ++
 arch/arm/mach-keystone/include/mach/vmalloc.h |   21 ++
 arch/arm/mach-keystone/keystone.c |   83 +
 15 files changed, 404 insertions(+)
 create mode 100644 arch/arm/boot/dts/keystone-sim.dts
 create mode 100644 arch/arm/configs/keystone_defconfig
 create mode 100644 arch/arm/mach-keystone/Makefile
 create mode 100644 arch/arm/mach-keystone/Makefile.boot
 create mode 100644 arch/arm/mach-keystone/include/mach/debug-macro.S
 create mode 100644 arch/arm/mach-keystone/include/mach/entry-macro.S
 create mode 100644 arch/arm/mach-keystone/include/mach/io.h
 create mode 100644 arch/arm/mach-keystone/include/mach/memory.h
 create mode 100644 arch/arm/mach-keystone/include/mach/system.h
 create mode 100644 arch/arm/mach-keystone/include/mach/timex.h
 create mode 100644 arch/arm/mach-keystone/include/mach/uncompress.h
 create mode 100644 arch/arm/mach-keystone/include/mach/vmalloc.h
 create mode 100644 arch/arm/mach-keystone/keystone.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 55da671..04c846b 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -359,6 +359,23 @@ config ARCH_HIGHBANK
help
  Support for the Calxeda Highbank SoC based boards.
 
+config ARCH_KEYSTONE
+   bool "Texas Instruments Keystone Devices"
+   select ARCH_WANT_OPTIONAL_GPIOLIB
+   select ARM_GIC
+   select CLKDEV_LOOKUP
+   select COMMON_CLK
+   select CLKSRC_MMIO
+   select CPU_V7
+   select GENERIC_CLOCKEVENTS
+   select USE_OF
+   select SPARSE_IRQ
+   select NEED_MACH_MEMORY_H
+   select HAVE_SCHED_CLOCK
+   help
+ Support for boards based on the Texas Instruments Keystone family of
+ SoCs.
+
 config ARCH_CLPS711X
bool "Cirrus Logic CLPS711x/EP721x/EP731x-based"
select CPU_ARM720T
diff --git a/arch/arm/Makefile b/arch/arm/Makefile
index 0298b00..13d6ef5 100644
--- a/arch/arm/Makefile
+++ b/arch/arm/Makefile
@@ -143,6 +143,7 @@ machine-$(CONFIG_ARCH_EP93XX)   := ep93xx
 machine-$(CONFIG_ARCH_GEMINI)  := gemini
 machine-$(CONFIG_ARCH_H720X)   := h720x
 machine-$(CONFIG_ARCH_HIGHBANK):= highbank
+machine-$(CONFIG_ARCH_KEYSTONE):= keystone
 machine-$(CONFIG_ARCH_INTEGRATOR)  := integrator
 machine-$(CONFIG_ARCH_IOP13XX) := iop13xx
 machine-$(CONFIG_ARCH_IOP32X)  := iop32x
diff --git a/arch/arm/boot/dts/keystone-sim.dts b/arch/arm/boot/dts/keystone-sim.dts
new file mode 100644
index 000..118d631
--- /dev/null
+++ b/arch/arm/boot/dts/keystone-sim.dts
@@ -0,0 +1,77 @@
+/dts-v1/;
+/include/ "skeleton.dtsi"
+
+/ {
+   model = "Texas Instruments Keystone 2 SoC";
+   compatible = "ti,keystone-evm";
+   #address-cells = <1>;
+   #size-cells = <1>;
+   interrupt-parent = <>;
+
+   aliases {
+   serial0 = 
+   };
+
+   chosen {
+   bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=5 rdinit=/bin/ash rw root=/dev/ram0 initrd=0x8500,9M";
+   };
+
+   memory {
+   reg = <0x8000 0x800>;
+   };
+
+   cpus {
+   interrupt-parent = <>;
+
+   cpu@0 {
+   compatible = "arm,cortex-a15";
+   };
+
+   cpu@1 {
+   compatible = "arm,cortex-a15";
+   };
+
+   cpu@2 {
+   compatible = "arm,cortex-a15";
+   };
+
+   cpu@3 {
+   compatible = "arm,cortex-a15";
+   };
+
+   };
+
+   soc {
+   #address-cells = <1>;
+   #size-cells = <1>;
+   ranges;
+   

[RFC 06/23] ARM: LPAE: use phys_addr_t for initrd location and size

2012-07-23 Thread Cyril Chemparathy
From: Vitaly Andrianov 

This patch fixes the initrd setup code to use phys_addr_t instead of assuming
32-bit addressing.  Without this we cannot boot on systems where initrd is
located above the 4G physical address limit.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/mm/init.c |   14 +++---
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index 8252c31..51f3e92 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -36,12 +36,12 @@
 
 #include "mm.h"
 
-static unsigned long phys_initrd_start __initdata = 0;
-static unsigned long phys_initrd_size __initdata = 0;
+static phys_addr_t phys_initrd_start __initdata = 0;
+static phys_addr_t phys_initrd_size __initdata = 0;
 
 static int __init early_initrd(char *p)
 {
-   unsigned long start, size;
+   phys_addr_t start, size;
char *endp;
 
start = memparse(p, &endp);
@@ -347,14 +347,14 @@ void __init arm_memblock_init(struct meminfo *mi, struct machine_desc *mdesc)
 #ifdef CONFIG_BLK_DEV_INITRD
if (phys_initrd_size &&
!memblock_is_region_memory(phys_initrd_start, phys_initrd_size)) {
-   pr_err("INITRD: 0x%08lx+0x%08lx is not a memory region - disabling initrd\n",
-  phys_initrd_start, phys_initrd_size);
+   pr_err("INITRD: 0x%08llx+0x%08llx is not a memory region - disabling initrd\n",
+  (u64)phys_initrd_start, (u64)phys_initrd_size);
phys_initrd_start = phys_initrd_size = 0;
}
if (phys_initrd_size &&
memblock_is_region_reserved(phys_initrd_start, phys_initrd_size)) {
-   pr_err("INITRD: 0x%08lx+0x%08lx overlaps in-use memory region - disabling initrd\n",
-  phys_initrd_start, phys_initrd_size);
+   pr_err("INITRD: 0x%08llx+0x%08llx overlaps in-use memory region - disabling initrd\n",
+  (u64)phys_initrd_start, (u64)phys_initrd_size);
phys_initrd_start = phys_initrd_size = 0;
}
if (phys_initrd_size) {
-- 
1.7.9.5
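
The truncation this patch guards against is easy to reproduce in userspace. The sketch below makes some assumptions for illustration: `arm_ulong` models a 32-bit `unsigned long`, `phys_addr` models a 64-bit `phys_addr_t` as used with LPAE, and the example address is made up rather than taken from any board.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-ins: a 32-bit 'unsigned long' and a 64-bit physical address type. */
typedef uint32_t arm_ulong;
typedef uint64_t phys_addr;

/* Storing a physical address in a 32-bit variable silently drops the upper bits. */
static arm_ulong store_as_ulong(phys_addr addr)
{
    return (arm_ulong)addr;
}

/* Format "start+size" with explicit 64-bit casts, in the style of the patched pr_err() calls. */
static int format_region(char *buf, size_t len, phys_addr start, phys_addr size)
{
    assert(buf != NULL && len > 0);
    return snprintf(buf, len, "0x%08" PRIx64 "+0x%08" PRIx64,
                    (uint64_t)start, (uint64_t)size);
}
```

The explicit cast to a 64-bit type before printing matters because the width of `phys_addr_t` depends on the configuration, so a single format string needs one fixed argument type.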



[RFC 23/23] ARM: keystone: add switch over to high physical address range

2012-07-23 Thread Cyril Chemparathy
Keystone platforms have their physical memory mapped at an address outside the
32-bit physical range.  A Keystone machine with 16G of RAM would find its
memory at 0x0800000000 - 0x0bffffffff.

For boot purposes, the interconnect supports a limited alias of some of this
memory within the 32-bit addressable space (0x80000000 - 0xffffffff).  This
aliasing is implemented in hardware, and is not intended to be used much
beyond boot.  For instance, DMA coherence does not work when running out of
this aliased address space.

Therefore, we've taken the approach of booting out of the low physical address
range, and subsequently we switch over to the high range once we're safely
inside machine specific territory.  This patch implements this switch over
mechanism, which involves rewiring the TTBRs and page tables to point to the
new physical address space.
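
As a rough illustration of the address arithmetic involved, the sketch below translates an address inside the low boot alias to its high-range equivalent. The window constants are assumptions inferred from the description above (a 2G alias at the top of the 32-bit space, RAM really living at the 36-bit address), and a simple fixed-offset mapping is assumed as well; `in_low_alias()` and `low_to_high()` are hypothetical helpers, not functions from this patch.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed window values for this sketch (inferred, not quoted from the patch). */
#define LOW_PHYS_START   UINT64_C(0x080000000)  /* 2G alias window in 32-bit space */
#define LOW_PHYS_SIZE    UINT64_C(0x080000000)
#define HIGH_PHYS_START  UINT64_C(0x800000000)  /* where the RAM really lives */

/* Return nonzero if 'addr' lies inside the low alias window. */
static int in_low_alias(uint64_t addr)
{
    return addr >= LOW_PHYS_START && addr - LOW_PHYS_START < LOW_PHYS_SIZE;
}

/* Translate an aliased address to its high-range equivalent by a fixed offset. */
static uint64_t low_to_high(uint64_t addr)
{
    assert(in_low_alias(addr));
    return addr - LOW_PHYS_START + HIGH_PHYS_START;
}
```

Under these assumptions only the first 2G of RAM is reachable through the alias, which is one reason the alias is suitable for boot but not for normal operation.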

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/Kconfig |1 +
 arch/arm/boot/dts/keystone-sim.dts   |6 +-
 arch/arm/configs/keystone_defconfig  |1 +
 arch/arm/mach-keystone/include/mach/memory.h |   29 
 arch/arm/mach-keystone/keystone.c|   92 ++
 arch/arm/mach-keystone/platsmp.c |   21 ++
 6 files changed, 147 insertions(+), 3 deletions(-)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 5b82879..f970ee1 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -373,6 +373,7 @@ config ARCH_KEYSTONE
select NEED_MACH_MEMORY_H
select HAVE_SCHED_CLOCK
select HAVE_SMP
+   select ZONE_DMA if ARM_LPAE
help
  Support for boards based on the Texas Instruments Keystone family of
  SoCs.
diff --git a/arch/arm/boot/dts/keystone-sim.dts 
b/arch/arm/boot/dts/keystone-sim.dts
index 118d631..afdef89 100644
--- a/arch/arm/boot/dts/keystone-sim.dts
+++ b/arch/arm/boot/dts/keystone-sim.dts
@@ -4,7 +4,7 @@
 / {
model = "Texas Instruments Keystone 2 SoC";
compatible = "ti,keystone-evm";
-   #address-cells = <1>;
+   #address-cells = <2>;
#size-cells = <1>;
interrupt-parent = <>;
 
@@ -13,11 +13,11 @@
};
 
chosen {
-   bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=5 
rdinit=/bin/ash rw root=/dev/ram0 initrd=0x8500,9M";
+   bootargs = "console=ttyS0,115200n8 debug earlyprintk lpj=5 
rdinit=/bin/ash rw root=/dev/ram0 initrd=0x80500,9M";
};
 
memory {
-   reg = <0x8000 0x800>;
+   reg = <0x0008 0x 0x800>;
};
 
cpus {
diff --git a/arch/arm/configs/keystone_defconfig 
b/arch/arm/configs/keystone_defconfig
index 5f71e66..8ea3b96 100644
--- a/arch/arm/configs/keystone_defconfig
+++ b/arch/arm/configs/keystone_defconfig
@@ -1,6 +1,7 @@
 CONFIG_EXPERIMENTAL=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_ARCH_KEYSTONE=y
+CONFIG_ARM_LPAE=y
 CONFIG_SMP=y
 CONFIG_ARM_ARCH_TIMER=y
 CONFIG_NR_CPUS=4
diff --git a/arch/arm/mach-keystone/include/mach/memory.h 
b/arch/arm/mach-keystone/include/mach/memory.h
index 7c78b1e..6404633 100644
--- a/arch/arm/mach-keystone/include/mach/memory.h
+++ b/arch/arm/mach-keystone/include/mach/memory.h
@@ -19,4 +19,33 @@
 #define MAX_PHYSMEM_BITS   36
 #define SECTION_SIZE_BITS  34
 
+#define KEYSTONE_LOW_PHYS_START0x8000ULL
+#define KEYSTONE_LOW_PHYS_SIZE 0x8000ULL /* 2G */
+#define KEYSTONE_LOW_PHYS_END  (KEYSTONE_LOW_PHYS_START + \
+KEYSTONE_LOW_PHYS_SIZE - 1)
+
+#define KEYSTONE_HIGH_PHYS_START   0x8ULL
+#define KEYSTONE_HIGH_PHYS_SIZE0x4ULL  /* 16G */
+#define KEYSTONE_HIGH_PHYS_END (KEYSTONE_HIGH_PHYS_START + \
+KEYSTONE_HIGH_PHYS_SIZE - 1)
+#ifdef CONFIG_ARM_LPAE
+
+#ifndef __ASSEMBLY__
+
+extern phys_addr_t  keystone_phys_offset;
+
+#define PLAT_PHYS_OFFSET keystone_phys_offset
+
+static inline phys_addr_t __virt_to_idmap(unsigned long x)
+{
+   return (phys_addr_t)(x) - CONFIG_PAGE_OFFSET +
+   KEYSTONE_LOW_PHYS_START;
+}
+
+#define virt_to_idmap(x)   __virt_to_idmap((unsigned long)(x))
+
+#endif /* __ASSEMBLY__ */
+
+#endif /* CONFIG_ARM_LPAE */
+
 #endif /* __ASM_MACH_MEMORY_H */
diff --git a/arch/arm/mach-keystone/keystone.c 
b/arch/arm/mach-keystone/keystone.c
index 650e202..f0f4a08 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -19,12 +19,20 @@
 #include 
 #include 
 #include 
+#include 
 
 #include 
 #include 
 #include 
 #include 
 #include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
 
 extern struct smp_ops keystone_smp_ops;
 
@@ -74,6 +82,86 @@ static const char *keystone_match[] __initconst = {
NULL,
 };
 
+#ifdef CONFIG_ARM_LPAE
+
+phys_addr_t keystone_phys_offset  = KEYSTONE_LOW_PHYS_START;
+
+extern 

[RFC 11/23] ARM: mm: cleanup checks for membank overlap with vmalloc area

2012-07-23 Thread Cyril Chemparathy
On Keystone platforms, physical memory is entirely outside the 32-bit
addressable range.  Therefore, the (bank->start > ULONG_MAX) check below marks
the entire system memory as highmem, and this causes unpleasantness all over.

This patch eliminates the extra bank start check (against ULONG_MAX) by
checking bank->start against the physical address corresponding to vmalloc_min
instead.

In the process, this patch also cleans up parts of the highmem sanity check
code by removing what has now become a redundant check for banks that entirely
overlap with the vmalloc range.
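
A small sketch of the before/after classification may help here.  It models
only the two comparisons discussed above; the function names and the 32-bit
ULONG_MAX constant are assumptions made so the example behaves the same on a
64-bit host.

```c
#include <stdbool.h>
#include <stdint.h>

/* 32-bit ULONG_MAX, spelled out so the sketch is host independent. */
#define SKETCH_ULONG_MAX 0xffffffffULL

/* Old rule: a bank starting above 4G is highmem regardless of the limit. */
static bool sketch_highmem_old(uint64_t start, uint64_t vmalloc_limit)
{
	return start > SKETCH_ULONG_MAX || start >= vmalloc_limit;
}

/* New rule: only the physical address of the lowmem limit decides. */
static bool sketch_highmem_new(uint64_t start, uint64_t vmalloc_limit)
{
	return start >= vmalloc_limit;
}
```

With RAM that starts above 4G, the old rule flags even the first bank as
highmem, while the new rule keeps everything below vmalloc_limit as lowmem.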

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/mm/mmu.c |   19 +--
 1 file changed, 1 insertion(+), 18 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index adaf8c3..4840efa 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -907,15 +907,12 @@ void __init sanity_check_meminfo(void)
struct membank *bank = &meminfo.bank[j];
*bank = meminfo.bank[i];
 
-   if (bank->start > ULONG_MAX)
-   highmem = 1;
-
-#ifdef CONFIG_HIGHMEM
if (bank->start >= vmalloc_limit)
highmem = 1;
 
bank->highmem = highmem;
 
+#ifdef CONFIG_HIGHMEM
/*
 * Split those memory banks which are partially overlapping
 * the vmalloc area greatly simplifying things later.
@@ -938,8 +935,6 @@ void __init sanity_check_meminfo(void)
bank->size = vmalloc_limit - bank->start;
}
 #else
-   bank->highmem = highmem;
-
/*
 * Highmem banks not allowed with !CONFIG_HIGHMEM.
 */
@@ -952,18 +947,6 @@ void __init sanity_check_meminfo(void)
}
 
/*
-* Check whether this memory bank would entirely overlap
-* the vmalloc area.
-*/
-   if (bank->start >= vmalloc_limit) {
-   printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
-  "(vmalloc region overlap).\n",
-  (unsigned long long)bank->start,
-  (unsigned long long)bank->start + bank->size - 
1);
-   continue;
-   }
-
-   /*
 * Check whether this memory bank would partially overlap
 * the vmalloc area.
 */
-- 
1.7.9.5



[RFC 08/23] ARM: LPAE: use 64-bit pgd physical address in switch_mm()

2012-07-23 Thread Cyril Chemparathy
This patch modifies the switch_mm() processor functions to use 64-bit
addresses.  We use u64 instead of phys_addr_t, in order to avoid having config
dependent register usage when calling into switch_mm assembly code.

The changes in this patch are primarily adjustments for registers used for
arguments to switch_mm.  The few processor definitions that did use the second
argument have been modified accordingly.

Arguments and calling conventions aside, this patch should be a no-op on v6
and non-LPAE v7 processors.  On LPAE systems, we now honor the upper 32 bits
of the physical address that is being passed in.
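
As a rough illustration of what the 3-level variant ends up writing, the
sketch below composes a 64-bit TTBR0 value from a 64-bit pgd address and a
context ID in plain C.  It assumes the little-endian EABI convention in which
a leading u64 argument occupies r0 (low word) and r1 (high word), leaving the
mm pointer in r2; the function name is hypothetical.

```c
#include <stdint.h>

/*
 * Build the 64-bit TTBR0 value the way the 3-level switch_mm hunk does:
 * low word = pgd_phys[31:0], high word = (ASID << 16) | pgd_phys[63:32].
 */
static uint64_t sketch_ttbr0(uint64_t pgd_phys, uint32_t context_id)
{
	uint32_t lo = (uint32_t)pgd_phys;			/* r0 */
	uint32_t hi = (uint32_t)(pgd_phys >> 32);		/* r1 */
	uint32_t asid = (context_id & 0xff) << (48 - 32);	/* r3 */

	return ((uint64_t)(asid | hi) << 32) | lo;
}
```

Passing a fixed-width u64 keeps this register layout identical whether or not
phys_addr_t is 64 bits wide in a given configuration.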

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/proc-fns.h |4 ++--
 arch/arm/mm/proc-v6.S   |2 +-
 arch/arm/mm/proc-v7-2level.S|2 +-
 arch/arm/mm/proc-v7-3level.S|5 +++--
 4 files changed, 7 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index f3628fb..fa6554e 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -60,7 +60,7 @@ extern struct processor {
/*
 * Set the page table
 */
-   void (*switch_mm)(unsigned long pgd_phys, struct mm_struct *mm);
+   void (*switch_mm)(u64 pgd_phys, struct mm_struct *mm);
/*
 * Set a possibly extended PTE.  Non-extended PTEs should
 * ignore 'ext'.
@@ -82,7 +82,7 @@ extern void cpu_proc_init(void);
 extern void cpu_proc_fin(void);
 extern int cpu_do_idle(void);
 extern void cpu_dcache_clean_area(void *, int);
-extern void cpu_do_switch_mm(unsigned long pgd_phys, struct mm_struct *mm);
+extern void cpu_do_switch_mm(u64 pgd_phys, struct mm_struct *mm);
 #ifdef CONFIG_ARM_LPAE
 extern void cpu_set_pte_ext(pte_t *ptep, pte_t pte);
 #else
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 5900cd5..566c658 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -100,8 +100,8 @@ ENTRY(cpu_v6_dcache_clean_area)
  */
 ENTRY(cpu_v6_switch_mm)
 #ifdef CONFIG_MMU
+   ldr r1, [r2, #MM_CONTEXT_ID]@ get mm->context.id
mov r2, #0
-   ldr r1, [r1, #MM_CONTEXT_ID]@ get mm->context.id
ALT_SMP(orr r0, r0, #TTB_FLAGS_SMP)
ALT_UP(orr  r0, r0, #TTB_FLAGS_UP)
mcr p15, 0, r2, c7, c5, 6   @ flush BTAC/BTB
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index 42ac069..3397803 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -39,8 +39,8 @@
  */
 ENTRY(cpu_v7_switch_mm)
 #ifdef CONFIG_MMU
+   ldr r1, [r2, #MM_CONTEXT_ID]@ get mm->context.id
mov r2, #0
-   ldr r1, [r1, #MM_CONTEXT_ID]@ get mm->context.id
ALT_SMP(orr r0, r0, #TTB_FLAGS_SMP)
ALT_UP(orr  r0, r0, #TTB_FLAGS_UP)
 #ifdef CONFIG_ARM_ERRATA_430973
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 8de0f1d..0001581 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -47,9 +47,10 @@
  */
 ENTRY(cpu_v7_switch_mm)
 #ifdef CONFIG_MMU
-   ldr r1, [r1, #MM_CONTEXT_ID]@ get mm->context.id
-   and r3, r1, #0xff
+   ldr r2, [r2, #MM_CONTEXT_ID]@ get mm->context.id
+   and r3, r2, #0xff
mov r3, r3, lsl #(48 - 32)  @ ASID
+   orr r3, r3, r1  @ upper 32-bits of pgd phys
mcrr p15, 0, r0, r3, c2  @ set TTB 0
isb
 #endif
-- 
1.7.9.5



[RFC 16/23] ARM: LPAE: accomodate >32-bit addresses for page table base

2012-07-23 Thread Cyril Chemparathy
This patch redefines the early boot time use of the R4 register to steal a few
low order bits (ARCH_PGD_SHIFT bits), allowing for up to 38-bit physical
addresses.

This is probably not the best means to the end, and a better alternative may
be to modify the head.S register allocations to fit in full register pairs for
pgdir and swapper_pg_dir.  However, squeezing out these extra registers seemed
to be a far greater pain than squeezing out a few low order bits from the page
table addresses.
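
The trick can be sketched in a few lines of C.  The shift value of 6 is an
assumption (64-byte L1 cache lines), and the helper names are made up for this
example; the point is only that an aligned address loses nothing when its
zero low bits are traded for extra high bits.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PGD_SHIFT 6	/* assumption: 64-byte L1 cache lines */

/* Squeeze an aligned page table address into one 32-bit register. */
static uint32_t sketch_pgd_pack(uint64_t pgd_phys)
{
	/* The scheme relies on the low bits being zero (pgd alignment). */
	assert((pgd_phys & ((1ULL << SKETCH_PGD_SHIFT) - 1)) == 0);
	/* ...and on the address fitting in 32 + shift bits. */
	assert((pgd_phys >> (32 + SKETCH_PGD_SHIFT)) == 0);
	return (uint32_t)(pgd_phys >> SKETCH_PGD_SHIFT);
}

/* Recover the full physical address from the packed register value. */
static uint64_t sketch_pgd_unpack(uint32_t reg)
{
	return (uint64_t)reg << SKETCH_PGD_SHIFT;
}
```

With a shift of 6, a 32-bit register covers 38 bits of physical address,
which matches the figure quoted above.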

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/cache.h |9 +
 arch/arm/kernel/head.S   |7 +--
 arch/arm/kernel/smp.c|   11 +--
 arch/arm/mm/proc-arm1026.S   |2 ++
 arch/arm/mm/proc-mohawk.S|2 ++
 arch/arm/mm/proc-v6.S|2 ++
 arch/arm/mm/proc-v7-2level.S |2 ++
 arch/arm/mm/proc-v7-3level.S |7 +++
 arch/arm/mm/proc-v7.S|1 +
 arch/arm/mm/proc-xsc3.S  |2 ++
 10 files changed, 41 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/cache.h b/arch/arm/include/asm/cache.h
index 75fe66b..986480c 100644
--- a/arch/arm/include/asm/cache.h
+++ b/arch/arm/include/asm/cache.h
@@ -17,6 +17,15 @@
 #define ARCH_DMA_MINALIGN  L1_CACHE_BYTES
 
 /*
+ * Minimum guaranteed alignment in pgd_alloc().  The page table pointers passed
+ * around in head.S and proc-*.S are shifted by this amount, in order to
+ * leave spare high bits for systems with physical address extension.  This
+ * does not fully accommodate the 40-bit addressing capability of ARM LPAE, but
+ * gives us about 38-bits or so.
+ */
+#define ARCH_PGD_SHIFT L1_CACHE_SHIFT
+
+/*
  * With EABI on ARMv5 and above we must have 64-bit aligned slab pointers.
  */
 #if defined(CONFIG_AEABI) && (__LINUX_ARM_ARCH__ >= 5)
diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 692e57f..6fe1c40 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #ifdef CONFIG_DEBUG_LL
 #include 
@@ -160,7 +161,7 @@ ENDPROC(stext)
  *
  * Returns:
  *  r0, r3, r5-r7 corrupted
- *  r4 = physical page table address
+ *  r4 = page table (see ARCH_PGD_SHIFT in asm/cache.h)
  */
 __create_page_tables:
pgtbl   r4, r8  @ page table address
@@ -320,6 +321,7 @@ __create_page_tables:
 #ifdef CONFIG_ARM_LPAE
sub r4, r4, #0x1000 @ point to the PGD table
 #endif
+   mov r4, r4, lsr #ARCH_PGD_SHIFT
mov pc, lr
 ENDPROC(__create_page_tables)
.ltorg
@@ -392,7 +394,7 @@ __secondary_data:
  *  r0  = cp#15 control register
  *  r1  = machine ID
  *  r2  = atags or dtb pointer
- *  r4  = page table pointer
+ *  r4  = page table (see ARCH_PGD_SHIFT in asm/cache.h)
  *  r9  = processor ID
  *  r13 = *virtual* address to jump to upon completion
  */
@@ -422,6 +424,7 @@ __enable_mmu:
@ has the processor setup already programmed the page table pointer?
adds r5, r4, #1
beq __turn_mmu_on   @ yes!
+   mov r4, r4, lsl #ARCH_PGD_SHIFT
mcr p15, 0, r4, c2, c0, 0   @ load page table pointer
b   __turn_mmu_on
 ENDPROC(__enable_mmu)
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index 2c7217d..e41e1be 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -42,6 +42,7 @@
 #include 
 #include 
 #include 
+#include 
 
 /*
  * as from 2.5, kernels no longer have an init_tasks structure
@@ -62,6 +63,7 @@ static DECLARE_COMPLETION(cpu_running);
 
 int __cpuinit __cpu_up(unsigned int cpu, struct task_struct *idle)
 {
+   phys_addr_t pgdir;
int ret;
 
/*
@@ -69,8 +71,13 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct 
*idle)
 * its stack and the page tables.
 */
secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
-   secondary_data.pgdir = virt_to_phys(idmap_pgd);
-   secondary_data.swapper_pg_dir = virt_to_phys(swapper_pg_dir);
+
+   pgdir = virt_to_phys(idmap_pgd);
+   secondary_data.pgdir = pgdir >> ARCH_PGD_SHIFT;
+
+   pgdir = virt_to_phys(swapper_pg_dir);
+   secondary_data.swapper_pg_dir = pgdir >> ARCH_PGD_SHIFT;
+
__cpuc_flush_dcache_area(&secondary_data, sizeof(secondary_data));
outer_clean_range(__pa(&secondary_data), __pa(&secondary_data + 1));
 
diff --git a/arch/arm/mm/proc-arm1026.S b/arch/arm/mm/proc-arm1026.S
index c28070e..4556f77 100644
--- a/arch/arm/mm/proc-arm1026.S
+++ b/arch/arm/mm/proc-arm1026.S
@@ -22,6 +22,7 @@
 #include 
 #include 
 #include 
+#include 
 
 #include "proc-macros.S"
 
@@ -403,6 +404,7 @@ __arm1026_setup:
mcr p15, 0, r0, c7, c10, 4  @ drain write buffer on v4
 #ifdef CONFIG_MMU
mcr p15, 0, r0, c8, c7  @ invalidate I,D TLBs on v4
+   mov r4, r4, lsl #ARCH_PGD_SHIFT
mcr p15, 0, r4, c2, c0  @ load page table pointer

[RFC 01/23] ARM: LPAE: disable phys-to-virt patching on PAE systems

2012-07-23 Thread Cyril Chemparathy
From: Vitaly Andrianov 

The current phys-to-virt patching mechanism is broken on PAE machines with
64-bit physical addressing.  This patch disables the patching mechanism in
such configurations.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/Kconfig |1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index a91009c..55da671 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -191,6 +191,7 @@ config ARM_PATCH_PHYS_VIRT
default y
depends on !XIP_KERNEL && MMU
depends on !ARCH_REALVIEW || !SPARSEMEM
+   depends on !ARCH_PHYS_ADDR_T_64BIT
help
  Patch phys-to-virt and virt-to-phys translation functions at
  boot and module load time according to the position of the
-- 
1.7.9.5



[RFC 07/23] ARM: LPAE: use phys_addr_t for membank size

2012-07-23 Thread Cyril Chemparathy
This patch changes the membank structure's size field to phys_addr_t to allow
banks larger than 4G.
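
To make the failure mode concrete, the sketch below shows what a 32-bit
unsigned long size field would retain of a large bank size.  The helper name
is hypothetical, and uint32_t stands in for a 32-bit unsigned long.

```c
#include <stdint.h>

/* What a 32-bit 'unsigned long' size field would retain of a bank size. */
static uint32_t sketch_size_as_ulong32(uint64_t size)
{
	return (uint32_t)size;	/* silently drops bits 32 and up */
}
```

A 16G bank, for example, would be recorded with a size of zero.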

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/setup.h |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/arch/arm/include/asm/setup.h b/arch/arm/include/asm/setup.h
index 23ebc0c..a2e7581 100644
--- a/arch/arm/include/asm/setup.h
+++ b/arch/arm/include/asm/setup.h
@@ -195,8 +195,8 @@ static const struct tagtable __tagtable_##fn __tag = { tag, 
fn }
 #define NR_BANKS   CONFIG_ARM_NR_BANKS
 
 struct membank {
-   phys_addr_t start;
-   unsigned long size;
+   phys_addr_t  start;
+   phys_addr_t  size;
unsigned int highmem;
 };
 
-- 
1.7.9.5



[RFC 12/23] ARM: mm: clean up membank size limit checks

2012-07-23 Thread Cyril Chemparathy
This patch cleans up the highmem sanity check code by simplifying the range
checks with a pre-calculated size_limit.  This patch should otherwise have no
functional impact on behavior.

This patch also removes a redundant (bank->start < vmalloc_limit) check, since
this is already covered by the !highmem condition.
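
The pre-calculated limit can be modelled outside the kernel roughly as
follows.  The struct and function names are stand-ins invented for this
sketch; only the two comparisons mirror the patch.

```c
#include <stdbool.h>
#include <stdint.h>

struct sketch_bank {
	uint64_t start;
	uint64_t size;
};

/* Largest size the bank may keep as lowmem; highmem banks are not limited. */
static uint64_t sketch_size_limit(const struct sketch_bank *bank,
				  uint64_t vmalloc_limit)
{
	if (bank->start >= vmalloc_limit)
		return bank->size;		/* highmem: nothing to trim */
	return vmalloc_limit - bank->start;	/* room left below the limit */
}

/* True when the bank straddles vmalloc_limit and must be split or cut. */
static bool sketch_needs_split(const struct sketch_bank *bank,
			       uint64_t vmalloc_limit)
{
	return bank->size > sketch_size_limit(bank, vmalloc_limit);
}
```

Because a highmem bank gets size_limit == size, the single "size > size_limit"
test can never fire for it, which is why the extra start check is redundant.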

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/mm/mmu.c |   19 +++
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 4840efa..6b0baf3 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -905,10 +905,15 @@ void __init sanity_check_meminfo(void)
 
for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
struct membank *bank = &meminfo.bank[j];
+   phys_addr_t size_limit;
+
*bank = meminfo.bank[i];
+   size_limit = bank->size;
 
if (bank->start >= vmalloc_limit)
highmem = 1;
+   else
+   size_limit = vmalloc_limit - bank->start;
 
bank->highmem = highmem;
 
@@ -917,8 +922,7 @@ void __init sanity_check_meminfo(void)
 * Split those memory banks which are partially overlapping
 * the vmalloc area greatly simplifying things later.
 */
-   if (!highmem && bank->start < vmalloc_limit &&
-   bank->size > vmalloc_limit - bank->start) {
+   if (!highmem && bank->size > size_limit) {
if (meminfo.nr_banks >= NR_BANKS) {
printk(KERN_CRIT "NR_BANKS too low, "
 "ignoring high memory\n");
@@ -927,12 +931,12 @@ void __init sanity_check_meminfo(void)
(meminfo.nr_banks - i) * sizeof(*bank));
meminfo.nr_banks++;
i++;
-   bank[1].size -= vmalloc_limit - bank->start;
+   bank[1].size -= size_limit;
bank[1].start = vmalloc_limit;
bank[1].highmem = highmem = 1;
j++;
}
-   bank->size = vmalloc_limit - bank->start;
+   bank->size = size_limit;
}
 #else
/*
@@ -950,14 +954,13 @@ void __init sanity_check_meminfo(void)
 * Check whether this memory bank would partially overlap
 * the vmalloc area.
 */
-   if (bank->start + bank->size > vmalloc_limit)
-   unsigned long newsize = vmalloc_limit - bank->start;
+   if (bank->size > size_limit) {
printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
   "to -%.8llx (vmalloc region overlap).\n",
   (unsigned long long)bank->start,
   (unsigned long long)bank->start + bank->size - 1,
-  (unsigned long long)bank->start + newsize - 1);
-   bank->size = newsize;
+  (unsigned long long)bank->start + size_limit - 
1);
+   bank->size = size_limit;
}
 #endif
if (!bank->highmem && bank->start + bank->size > 
arm_lowmem_limit)
-- 
1.7.9.5



[RFC 17/23] ARM: add machine desc hook for early memory/paging initialization

2012-07-23 Thread Cyril Chemparathy
This patch adds a machine descriptor hook that gives control to machine
specific code prior to memory and paging initialization.

On Keystone platforms, this hook is used to switch the PHYS_OFFSET over
to the "real" non-32-bit-addressable address range.
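
The shape of such an optional hook, reduced to a user-space sketch, looks
roughly like this.  The struct is a hypothetical, trimmed-down stand-in for
struct machine_desc, and both address values are assumptions used only to
show a visible effect.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for struct machine_desc. */
struct sketch_machine_desc {
	void (*init_meminfo)(void);	/* optional; may be NULL */
};

static uint64_t sketch_phys_offset = 0x80000000ULL;	/* assumed boot alias */

/* What a Keystone-like hook might do: move to the real RAM base. */
static void sketch_keystone_init_meminfo(void)
{
	sketch_phys_offset = 0x800000000ULL;		/* assumed high base */
}

/* Mirrors the setup_arch() hunk: call the hook only if the machine set one. */
static void sketch_setup_arch(const struct sketch_machine_desc *mdesc)
{
	if (mdesc->init_meminfo)
		mdesc->init_meminfo();
	/* sorting and sanity checking of meminfo would follow here */
}
```

Machines that leave the pointer NULL see no change in behaviour.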

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/mach/arch.h |1 +
 arch/arm/kernel/setup.c  |3 +++
 2 files changed, 4 insertions(+)

diff --git a/arch/arm/include/asm/mach/arch.h b/arch/arm/include/asm/mach/arch.h
index 0b1c94b..49e9c2a 100644
--- a/arch/arm/include/asm/mach/arch.h
+++ b/arch/arm/include/asm/mach/arch.h
@@ -39,6 +39,7 @@ struct machine_desc {
 struct meminfo *);
void (*reserve)(void); /* reserve mem blocks */
void (*map_io)(void); /* IO mapping function */
+   void (*init_meminfo)(void);
void (*init_early)(void);
void (*init_irq)(void);
struct sys_timer *timer; /* system tick timer */
diff --git a/arch/arm/kernel/setup.c b/arch/arm/kernel/setup.c
index e15d83b..7cbe292 100644
--- a/arch/arm/kernel/setup.c
+++ b/arch/arm/kernel/setup.c
@@ -964,6 +964,9 @@ void __init setup_arch(char **cmdline_p)
 
parse_early_param();
 
+   if (mdesc->init_meminfo)
+   mdesc->init_meminfo();
+
sort(&meminfo.bank, meminfo.nr_banks, sizeof(meminfo.bank[0]),
meminfo_cmp, NULL);
sanity_check_meminfo();
arm_memblock_init(&meminfo, mdesc);
-- 
1.7.9.5



[RFC 02/23] ARM: LPAE: use signed arithmetic for mask definitions

2012-07-23 Thread Cyril Chemparathy
This patch applies to PAGE_MASK, PMD_MASK, and PGDIR_MASK, where forcing
unsigned long math truncates the mask at 32 bits.  This clearly does bad
things on PAE systems.

This patch fixes this problem by defining these masks as signed quantities.
We then rely on sign extension to do the right thing.
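
The difference is easy to reproduce in user space.  In the sketch below,
uint32_t and int32_t stand in for the 32-bit unsigned long and int of an ARM
build, and the function names are invented for the example.

```c
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12

/* Old style: mask built in 32-bit unsigned arithmetic, then zero-extended. */
static uint64_t sketch_page_align_unsigned(uint64_t phys)
{
	uint32_t mask = ~((UINT32_C(1) << SKETCH_PAGE_SHIFT) - 1);

	return phys & mask;	/* mask widens to 0x00000000fffff000 */
}

/* New style: mask built as a signed int, then sign-extended. */
static uint64_t sketch_page_align_signed(uint64_t phys)
{
	int32_t mask = ~((1 << SKETCH_PAGE_SHIFT) - 1);	/* -4096 */

	return phys & (uint64_t)(int64_t)mask;	/* 0xfffffffffffff000 */
}
```

For addresses below 4G both variants agree; above 4G only the signed mask
preserves the upper address bits.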

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/page.h   |7 ++-
 arch/arm/include/asm/pgtable-3level.h |6 +++---
 2 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/page.h b/arch/arm/include/asm/page.h
index ecf9019..1c810d2 100644
--- a/arch/arm/include/asm/page.h
+++ b/arch/arm/include/asm/page.h
@@ -13,7 +13,12 @@
 /* PAGE_SHIFT determines the page size */
 #define PAGE_SHIFT 12
 #define PAGE_SIZE  (_AC(1,UL) << PAGE_SHIFT)
-#define PAGE_MASK  (~(PAGE_SIZE-1))
+
+/*
+ * We do not use PAGE_SIZE in the following because we rely on sign
+ * extension to appropriately extend upper bits for PAE systems
+ */
+#define PAGE_MASK  (~((1 << PAGE_SHIFT) - 1))
 
 #ifndef __ASSEMBLY__
 
diff --git a/arch/arm/include/asm/pgtable-3level.h 
b/arch/arm/include/asm/pgtable-3level.h
index b249035..ae39d11 100644
--- a/arch/arm/include/asm/pgtable-3level.h
+++ b/arch/arm/include/asm/pgtable-3level.h
@@ -48,16 +48,16 @@
 #define PMD_SHIFT  21
 
 #define PMD_SIZE   (1UL << PMD_SHIFT)
-#define PMD_MASK   (~(PMD_SIZE-1))
+#define PMD_MASK   (~((1 << PMD_SHIFT) - 1))
 #define PGDIR_SIZE (1UL << PGDIR_SHIFT)
-#define PGDIR_MASK (~(PGDIR_SIZE-1))
+#define PGDIR_MASK (~((1 << PGDIR_SHIFT) - 1))
 
 /*
  * section address mask and size definitions.
  */
 #define SECTION_SHIFT  21
 #define SECTION_SIZE   (1UL << SECTION_SHIFT)
-#define SECTION_MASK   (~(SECTION_SIZE-1))
+#define SECTION_MASK   (~((1 << SECTION_SHIFT) - 1))
 
 #define USER_PTRS_PER_PGD  (PAGE_OFFSET / PGDIR_SIZE)
 
-- 
1.7.9.5



[RFC 04/23] ARM: LPAE: use phys_addr_t in alloc_init_pud()

2012-07-23 Thread Cyril Chemparathy
From: Vitaly Andrianov 

This patch fixes the alloc_init_pud() function to use phys_addr_t instead of
unsigned long when passing in the phys argument.

This is an extension to commit 97092e0c56830457af0639f6bd904537a150ea4a, which
applied similar changes elsewhere in the ARM memory management code.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/mm/mmu.c |3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index cf4528d..226985c 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -628,7 +628,8 @@ static void __init alloc_init_section(pud_t *pud, unsigned 
long addr,
 }
 
 static void __init alloc_init_pud(pgd_t *pgd, unsigned long addr,
-   unsigned long end, unsigned long phys, const struct mem_type *type)
+ unsigned long end, phys_addr_t phys,
+ const struct mem_type *type)
 {
pud_t *pud = pud_offset(pgd, addr);
unsigned long next;
-- 
1.7.9.5



[RFC 14/23] ARM: LPAE: factor out T1SZ and TTBR1 computations

2012-07-23 Thread Cyril Chemparathy
This patch moves the TTBR1 offset calculation and the T1SZ calculation out
of the TTB setup assembly code.  This should not affect functionality in
any way, but improves code readability as well as readability of subsequent
patches in this series.
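
For reference, the values the two new macros evaluate to can be checked with
a small C sketch.  The functions are hypothetical wrappers around the same
arithmetic; the PAGE_OFFSET values used for the 1G, 2G and 3G splits are the
usual 0x40000000, 0x80000000 and 0xc0000000.

```c
#include <stdint.h>

/* TTBCR.T1SZ field (bits 18:16) as computed by the TTBR1_SIZE macro. */
static uint32_t sketch_ttbr1_size(uint32_t page_offset)
{
	return ((page_offset >> 30) - 1) << 16;
}

/* Byte offset added to TTBR1 for each VMSPLIT choice, as in TTBR1_OFFSET. */
static uint32_t sketch_ttbr1_offset(uint32_t page_offset)
{
	switch (page_offset) {
	case 0x80000000u:	/* VMSPLIT_2G */
		return 1u << 4;			/* skip two L1 entries */
	case 0xc0000000u:	/* VMSPLIT_3G */
		return 4096u * (1 + 3);		/* skip pgd + 3 pmd pages */
	default:		/* VMSPLIT_1G */
		return 0;
	}
}
```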

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/pgtable-3level-hwdef.h |   10 ++
 arch/arm/mm/proc-v7-3level.S|   16 
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/arm/include/asm/pgtable-3level-hwdef.h 
b/arch/arm/include/asm/pgtable-3level-hwdef.h
index d795282..b501650 100644
--- a/arch/arm/include/asm/pgtable-3level-hwdef.h
+++ b/arch/arm/include/asm/pgtable-3level-hwdef.h
@@ -74,4 +74,14 @@
 #define PHYS_MASK_SHIFT(40)
 #define PHYS_MASK  ((1ULL << PHYS_MASK_SHIFT) - 1)
 
+#if defined CONFIG_VMSPLIT_2G
+#define TTBR1_OFFSET   (1 << 4)/* skip two L1 entries */
+#elif defined CONFIG_VMSPLIT_3G
+#define TTBR1_OFFSET   (4096 * (1 + 3))/* only L2, skip pgd + 3*pmd */
+#else
+#define TTBR1_OFFSET   0
+#endif
+
+#define TTBR1_SIZE (((PAGE_OFFSET >> 30) - 1) << 16)
+
 #endif
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 0001581..3b1a745 100644
--- a/arch/arm/mm/proc-v7-3level.S
+++ b/arch/arm/mm/proc-v7-3level.S
@@ -120,18 +120,10 @@ ENDPROC(cpu_v7_set_pte_ext)
 * booting secondary CPUs would end up using TTBR1 for the identity
 * mapping set up in TTBR0.
 */
-   bhi 9001f   @ PHYS_OFFSET > PAGE_OFFSET?
-   orr \tmp, \tmp, #(((PAGE_OFFSET >> 30) - 1) << 16) @ TTBCR.T1SZ
-#if defined CONFIG_VMSPLIT_2G
-   /* PAGE_OFFSET == 0x8000, T1SZ == 1 */
-   add \ttbr1, \ttbr1, #1 << 4 @ skip two L1 entries
-#elif defined CONFIG_VMSPLIT_3G
-   /* PAGE_OFFSET == 0xc000, T1SZ == 2 */
-   add \ttbr1, \ttbr1, #4096 * (1 + 3) @ only L2 used, skip pgd+3*pmd
-#endif
-   /* CONFIG_VMSPLIT_1G does not need TTBR1 adjustment */
-9001:  mcr p15, 0, \tmp, c2, c0, 2 @ TTB control register
-   mcrr p15, 1, \ttbr1, \zero, c2   @ load TTBR1
+   orrls   \tmp, \tmp, #TTBR1_SIZE @ TTBCR.T1SZ
+   mcr p15, 0, \tmp, c2, c0, 2 @ TTBCR
+   addls   \ttbr1, \ttbr1, #TTBR1_OFFSET
+   mcrr p15, 1, \ttbr1, \zero, c2   @ load TTBR1
.endm
 
__CPUINIT
-- 
1.7.9.5



[PATCH 2/3] mm: Warn if pg_data_t isn't initialized with zero

2012-07-23 Thread Minchan Kim
This patch warns if memory-hotplug or boot code doesn't zero-initialize
pg_data_t when it's allocated.  From my reading of the arch code and
memory hotplug, they already seem to initialize pg_data_t, so this
warning should never trigger.  Still, it deserves a double check, so
let's add a WARN that looks for garbage.  I picked fields more or less
at random near the beginning, middle, and end of pg_data_t to check.
If we are very unlucky the garbage might happen to be zero, but I hope
that's very unlikely.
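
The idea of spot-checking a few fields can be sketched in user space as
follows.  The struct is a hypothetical stand-in for pg_data_t with padding in
place of the real members, and the helper names are made up.

```c
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-in for pg_data_t with a few sampled fields. */
struct sketch_pgdat {
	int nr_zones;			/* near the beginning */
	char padding1[64];
	unsigned long node_start_pfn;	/* somewhere in the middle */
	char padding2[64];
	int classzone_idx;		/* near the end */
};

/* What the allocating code is expected to have done already. */
static void sketch_pgdat_zero(struct sketch_pgdat *p)
{
	memset(p, 0, sizeof(*p));
}

/* Spot-check: true if any sampled field holds leftover garbage. */
static bool sketch_pgdat_looks_dirty(const struct sketch_pgdat *p)
{
	return p->nr_zones || p->node_start_pfn || p->classzone_idx;
}
```

This is a heuristic only: a structure whose sampled fields happen to be zero
passes even if other members were never cleared.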

This patch isn't about performance.  It is about removing initialization
code that we otherwise have to extend whenever we add a new field to
pg_data_t or zone.  That is rather bothersome and error-prone, at least
at compile time, in my experience.

Andrew first suggested clearing pg_data_t in the MM core, but Tejun doesn't
like that: in the future, some archs may initialize some fields in arch code
and pass them into the generic MM code, so blindly clearing the structure in
the MM core would be very annoying.

Cc: Tejun Heo 
Cc: Andrew Morton 
Cc: linux-arch 
Signed-off-by: Minchan Kim 
---
 mm/page_alloc.c |3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b65c362..2037eeb 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4517,6 +4517,9 @@ void __paginginit free_area_init_node(int nid, unsigned 
long *zones_size,
 {
pg_data_t *pgdat = NODE_DATA(nid);
 
+   /* pg_data_t should be reset to zero when it's allocated */
+   WARN_ON(pgdat->nr_zones || pgdat->node_start_pfn || 
pgdat->classzone_idx);
+
pgdat->node_id = nid;
pgdat->node_start_pfn = node_start_pfn;
calculate_node_totalpages(pgdat, zones_size, zholes_size);
-- 
1.7.9.5



[PATCH 3/3] mm: remove redundant initialization

2012-07-23 Thread Minchan Kim
pg_data_t should be zeroed out before reaching free_area_init_core(),
so remove the unnecessary initialization.

Signed-off-by: Minchan Kim 
---
 include/linux/vmstat.h |5 -
 mm/page_alloc.c|9 ++---
 2 files changed, 2 insertions(+), 12 deletions(-)

diff --git a/include/linux/vmstat.h b/include/linux/vmstat.h
index 65efb92..ad2cfd5 100644
--- a/include/linux/vmstat.h
+++ b/include/linux/vmstat.h
@@ -179,11 +179,6 @@ extern void zone_statistics(struct zone *, struct zone *, 
gfp_t gfp);
 #define add_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, __d)
 #define sub_zone_page_state(__z, __i, __d) mod_zone_page_state(__z, __i, 
-(__d))
 
-static inline void zap_zone_vm_stats(struct zone *zone)
-{
-   memset(zone->vm_stat, 0, sizeof(zone->vm_stat));
-}
-
 extern void inc_zone_state(struct zone *, enum zone_stat_item);
 
 #ifdef CONFIG_SMP
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2037eeb..7b09ecc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -4376,6 +4376,8 @@ void __init set_pageblock_order(void)
  *   - mark all pages reserved
  *   - mark all memory queues empty
  *   - clear the memory bitmaps
+ *
+ * NOTE: pgdat should get zeroed by caller.
  */
 static void __paginginit free_area_init_core(struct pglist_data *pgdat,
unsigned long *zones_size, unsigned long *zholes_size)
@@ -4386,10 +4388,8 @@ static void __paginginit free_area_init_core(struct 
pglist_data *pgdat,
int ret;
 
pgdat_resize_init(pgdat);
-   pgdat->nr_zones = 0;
init_waitqueue_head(&pgdat->kswapd_wait);
init_waitqueue_head(&pgdat->pfmemalloc_wait);
-   pgdat->kswapd_max_order = 0;
pgdat_page_cgroup_init(pgdat);
 
for (j = 0; j < MAX_NR_ZONES; j++) {
@@ -4450,11 +4450,6 @@ static void __paginginit free_area_init_core(struct 
pglist_data *pgdat,
 
zone_pcp_init(zone);
lruvec_init(&zone->lruvec, zone);
-   zap_zone_vm_stats(zone);
-   zone->flags = 0;
-#ifdef CONFIG_MEMORY_ISOLATION
-   zone->nr_pageblock_isolate = 0;
-#endif
if (!size)
continue;
 
-- 
1.7.9.5



[RFC 13/23] ARM: LPAE: define ARCH_LOW_ADDRESS_LIMIT for bootmem

2012-07-23 Thread Cyril Chemparathy
This patch adds an architecture defined override for ARCH_LOW_ADDRESS_LIMIT.
On PAE systems, the absence of this override causes bootmem to incorrectly
limit itself to 32-bit addressable physical memory.
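
A quick sketch of the effect, under the assumption that the generic fallback
for this limit is 0xffffffff (as I recall it from the bootmem header) and
that PHYS_MASK covers 40 bits on LPAE.  The names below are invented for the
example.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed generic fallback when an architecture does not override it. */
#define SKETCH_DEFAULT_LOW_LIMIT  0xffffffffULL
/* LPAE override: PHYS_MASK with PHYS_MASK_SHIFT == 40. */
#define SKETCH_LPAE_LOW_LIMIT     ((1ULL << 40) - 1)

/* Would a "low" boot-time allocation be allowed to use this address? */
static bool sketch_below_low_limit(uint64_t phys, uint64_t limit)
{
	return phys <= limit;
}
```

On a machine whose RAM sits entirely above 4G, nothing qualifies under the
fallback limit, whereas the 40-bit limit admits all of it.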

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/memory.h |2 ++
 1 file changed, 2 insertions(+)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index 7629dfe..c330a23 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -285,6 +285,8 @@ static inline __deprecated void *bus_to_virt(unsigned long 
x)
 #define arch_is_coherent() 0
 #endif
 
+#define ARCH_LOW_ADDRESS_LIMIT PHYS_MASK
+
 #endif
 
 #include 
-- 
1.7.9.5



[RFC 18/23] ARM: add virt_to_idmap for interconnect aliasing

2012-07-23 Thread Cyril Chemparathy
From: Vitaly Andrianov 

On some PAE systems (e.g. TI Keystone), memory is above the 32-bit addressable
limit, and the interconnect provides an aliased view of parts of physical
memory in the 32-bit addressable space.  This alias is strictly for boot time
usage, and is not otherwise usable because of coherency limitations.

On such systems, the idmap mechanism needs to take this aliased mapping into
account.  This patch introduces a virt_to_idmap() macro, which can be used on
such sub-architectures to represent the interconnect supported boot time
alias.  Most other systems would leave this macro untouched, i.e., do a simple
virt_to_phys() and nothing more.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/include/asm/memory.h |9 +
 arch/arm/kernel/smp.c |4 ++--
 arch/arm/mm/idmap.c   |4 ++--
 3 files changed, 13 insertions(+), 4 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index c330a23..b6b203c 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -235,6 +235,15 @@ static inline void *phys_to_virt(phys_addr_t x)
 #define pfn_to_kaddr(pfn)  __va((pfn) << PAGE_SHIFT)
 
 /*
+ * These are for systems that have a hardware interconnect supported alias of
+ * physical memory for idmap purposes.  Most cases should leave these
+ * untouched.
+ */
+#ifndef virt_to_idmap
+#define virt_to_idmap(x) virt_to_phys(x)
+#endif
+
+/*
  * Virtual <-> DMA view memory address translations
  * Again, these are *only* valid on the kernel direct mapped RAM
  * memory.  Use of these is *deprecated* (and that doesn't mean
diff --git a/arch/arm/kernel/smp.c b/arch/arm/kernel/smp.c
index e41e1be..cce630c 100644
--- a/arch/arm/kernel/smp.c
+++ b/arch/arm/kernel/smp.c
@@ -72,10 +72,10 @@ int __cpuinit __cpu_up(unsigned int cpu, struct task_struct 
*idle)
 */
secondary_data.stack = task_stack_page(idle) + THREAD_START_SP;
 
-   pgdir = virt_to_phys(idmap_pgd);
+   pgdir = virt_to_idmap(idmap_pgd);
secondary_data.pgdir = pgdir >> ARCH_PGD_SHIFT;
 
-   pgdir = virt_to_phys(swapper_pg_dir);
+   pgdir = virt_to_idmap(swapper_pg_dir);
secondary_data.swapper_pg_dir = pgdir >> ARCH_PGD_SHIFT;
 
__cpuc_flush_dcache_area(&secondary_data, sizeof(secondary_data));
diff --git a/arch/arm/mm/idmap.c b/arch/arm/mm/idmap.c
index ab88ed4..919cb6e 100644
--- a/arch/arm/mm/idmap.c
+++ b/arch/arm/mm/idmap.c
@@ -85,8 +85,8 @@ static int __init init_static_idmap(void)
return -ENOMEM;
 
/* Add an identity mapping for the physical address of the section. */
-   idmap_start = virt_to_phys((void *)__idmap_text_start);
-   idmap_end = virt_to_phys((void *)__idmap_text_end);
+   idmap_start = virt_to_idmap((void *)__idmap_text_start);
+   idmap_end = virt_to_idmap((void *)__idmap_text_end);
 
pr_info("Setting up static identity map for 0x%llx - 0x%llx\n",
(long long)idmap_start, (long long)idmap_end);
-- 
1.7.9.5
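
A rough user-space sketch of the override pattern, assuming a toy linear
mapping.  All three address constants and the 32-bit virtual address type
are illustrative assumptions, not values taken from this series:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Toy linear mapping; every constant here is an illustrative assumption. */
#define PAGE_OFFSET      0xc0000000ULL   /* start of the kernel direct map */
#define HIGH_PHYS_START  0x800000000ULL  /* where RAM really lives (above 4G) */
#define ALIAS_PHYS_START 0x80000000ULL   /* 32-bit boot-time alias of that RAM */

static phys_addr_t virt_to_phys(uint32_t va)
{
    return (phys_addr_t)va - PAGE_OFFSET + HIGH_PHYS_START;
}

/* A sub-architecture with an alias defines this before the generic header. */
#define virt_to_idmap(va) \
    (virt_to_phys(va) - HIGH_PHYS_START + ALIAS_PHYS_START)

/* Generic fallback, mirroring the #ifndef added by the patch. */
#ifndef virt_to_idmap
#define virt_to_idmap(va) virt_to_phys(va)
#endif
```

Under these assumptions, assert(virt_to_idmap(0xc0008000u) == 0x80008000ULL)
holds, while virt_to_phys() of the same address gives 0x800008000.  Dropping
the platform definition makes the two identical again.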



[RFC 10/23] ARM: mm: use physical addresses in highmem sanity checks

2012-07-23 Thread Cyril Chemparathy
This patch modifies the highmem sanity checking code to use physical addresses
instead.  This change eliminates the wrap-around problems associated with the
original virtual address based checks, and this simplifies the code a bit.

The one constraint imposed here is that low physical memory must be mapped in
a monotonically increasing fashion if there are multiple banks of memory,
i.e., x < y must imply pa(x) < pa(y).

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/mm/mmu.c |   22 ++
 1 file changed, 10 insertions(+), 12 deletions(-)

diff --git a/arch/arm/mm/mmu.c b/arch/arm/mm/mmu.c
index 226985c..adaf8c3 100644
--- a/arch/arm/mm/mmu.c
+++ b/arch/arm/mm/mmu.c
@@ -901,6 +901,7 @@ phys_addr_t arm_lowmem_limit __initdata = 0;
 void __init sanity_check_meminfo(void)
 {
int i, j, highmem = 0;
+   phys_addr_t vmalloc_limit = __pa(vmalloc_min - 1) + 1;
 
for (i = 0, j = 0; i < meminfo.nr_banks; i++) {
struct membank *bank = &meminfo.bank[j];
@@ -910,8 +911,7 @@ void __init sanity_check_meminfo(void)
highmem = 1;
 
 #ifdef CONFIG_HIGHMEM
-   if (__va(bank->start) >= vmalloc_min ||
-   __va(bank->start) < (void *)PAGE_OFFSET)
+   if (bank->start >= vmalloc_limit)
highmem = 1;
 
bank->highmem = highmem;
@@ -920,8 +920,8 @@ void __init sanity_check_meminfo(void)
 * Split those memory banks which are partially overlapping
 * the vmalloc area greatly simplifying things later.
 */
-   if (!highmem && __va(bank->start) < vmalloc_min &&
-   bank->size > vmalloc_min - __va(bank->start)) {
+   if (!highmem && bank->start < vmalloc_limit &&
+   bank->size > vmalloc_limit - bank->start) {
if (meminfo.nr_banks >= NR_BANKS) {
printk(KERN_CRIT "NR_BANKS too low, "
 "ignoring high memory\n");
@@ -930,12 +930,12 @@ void __init sanity_check_meminfo(void)
(meminfo.nr_banks - i) * sizeof(*bank));
meminfo.nr_banks++;
i++;
-   bank[1].size -= vmalloc_min - __va(bank->start);
-   bank[1].start = __pa(vmalloc_min - 1) + 1;
+   bank[1].size -= vmalloc_limit - bank->start;
+   bank[1].start = vmalloc_limit;
bank[1].highmem = highmem = 1;
j++;
}
-   bank->size = vmalloc_min - __va(bank->start);
+   bank->size = vmalloc_limit - bank->start;
}
 #else
bank->highmem = highmem;
@@ -955,8 +955,7 @@ void __init sanity_check_meminfo(void)
 * Check whether this memory bank would entirely overlap
 * the vmalloc area.
 */
-   if (__va(bank->start) >= vmalloc_min ||
-   __va(bank->start) < (void *)PAGE_OFFSET) {
+   if (bank->start >= vmalloc_limit) {
printk(KERN_NOTICE "Ignoring RAM at %.8llx-%.8llx "
   "(vmalloc region overlap).\n",
   (unsigned long long)bank->start,
@@ -968,9 +967,8 @@ void __init sanity_check_meminfo(void)
 * Check whether this memory bank would partially overlap
 * the vmalloc area.
 */
-   if (__va(bank->start + bank->size) > vmalloc_min ||
-   __va(bank->start + bank->size) < __va(bank->start)) {
-   unsigned long newsize = vmalloc_min - __va(bank->start);
+   if (bank->start + bank->size > vmalloc_limit) {
+   unsigned long newsize = vmalloc_limit - bank->start;
printk(KERN_NOTICE "Truncating RAM at %.8llx-%.8llx "
   "to -%.8llx (vmalloc region overlap).\n",
   (unsigned long long)bank->start,
-- 
1.7.9.5
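
A simplified sketch of the bank split done purely in physical addresses,
assuming a 64-bit phys_addr_t.  split_bank() and this cut-down struct
membank are hypothetical stand-ins and not the code from mmu.c:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

struct membank {
    phys_addr_t start;
    phys_addr_t size;
    int highmem;
};

/*
 * Hypothetical helper: split *bank at 'limit', the physical address at
 * which lowmem ends.  Returns 1 and fills *high when a split happened.
 */
static int split_bank(struct membank *bank, struct membank *high,
                      phys_addr_t limit)
{
    if (bank->start >= limit) {               /* entirely above the limit */
        bank->highmem = 1;
        return 0;
    }
    if (bank->size <= limit - bank->start) {  /* entirely below the limit */
        bank->highmem = 0;
        return 0;
    }
    high->start = limit;
    high->size = bank->size - (limit - bank->start);
    high->highmem = 1;
    bank->size = limit - bank->start;
    bank->highmem = 0;
    return 1;
}
```

Because no value is ever pushed through __va(), nothing can wrap past the
top of the 32-bit virtual address space.  For a 1G bank at 0x800000000 and
a limit of 0x830000000, the low part keeps 0x30000000 bytes and the high
part starts at 0x830000000 with 0x10000000 bytes.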



[RFC 09/23] ARM: LPAE: use 64-bit accessors for TTBR registers

2012-07-23 Thread Cyril Chemparathy
This patch adds TTBR accessor macros, and modifies cpu_get_pgd() and
the LPAE version of cpu_set_reserved_ttbr0() to use these instead.

In the process, we also fix these functions to correctly handle cases
where the physical address lies beyond the 4G limit of 32-bit addressing.

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/include/asm/proc-fns.h |   24 +++-
 arch/arm/mm/context.c   |   13 ++---
 2 files changed, 21 insertions(+), 16 deletions(-)

diff --git a/arch/arm/include/asm/proc-fns.h b/arch/arm/include/asm/proc-fns.h
index fa6554e..918b4f9 100644
--- a/arch/arm/include/asm/proc-fns.h
+++ b/arch/arm/include/asm/proc-fns.h
@@ -116,13 +116,27 @@ extern void cpu_resume(void);
 #define cpu_switch_mm(pgd,mm) cpu_do_switch_mm(virt_to_phys(pgd),mm)
 
 #ifdef CONFIG_ARM_LPAE
+
+#define cpu_get_ttbr(nr)   \
+   ({  \
+   u64 ttbr;   \
+   __asm__("mrrc   p15, " #nr ", %Q0, %R0, c2" \
+   : "=r" (ttbr)   \
+   : : "cc");  \
+   ttbr;   \
+   })
+
+#define cpu_set_ttbr(nr, val)  \
+   do {\
+   u64 ttbr = val; \
+   __asm__("mcrr   p15, " #nr ", %Q0, %R0, c2" \
+   : : "r" (ttbr)  \
+   : "cc");\
+   } while (0)
+
 #define cpu_get_pgd()  \
({  \
-   unsigned long pg, pg2;  \
-   __asm__("mrrc   p15, 0, %0, %1, c2" \
-   : "=r" (pg), "=r" (pg2) \
-   :   \
-   : "cc");\
+   u64 pg = cpu_get_ttbr(0);   \
pg &= ~(PTRS_PER_PGD*sizeof(pgd_t)-1);  \
(pgd_t *)phys_to_virt(pg);  \
})
diff --git a/arch/arm/mm/context.c b/arch/arm/mm/context.c
index 806cc4f..ad70bd8 100644
--- a/arch/arm/mm/context.c
+++ b/arch/arm/mm/context.c
@@ -15,6 +15,7 @@
 
 #include 
 #include 
+#include 
 
 static DEFINE_RAW_SPINLOCK(cpu_asid_lock);
 unsigned int cpu_last_asid = ASID_FIRST_VERSION;
@@ -22,17 +23,7 @@ unsigned int cpu_last_asid = ASID_FIRST_VERSION;
 #ifdef CONFIG_ARM_LPAE
 void cpu_set_reserved_ttbr0(void)
 {
-   unsigned long ttbl = __pa(swapper_pg_dir);
-   unsigned long ttbh = 0;
-
-   /*
-* Set TTBR0 to swapper_pg_dir which contains only global entries. The
-* ASID is set to 0.
-*/
-   asm volatile(
-   "   mcrrp15, 0, %0, %1, c2  @ set TTBR0\n"
-   :
-   : "r" (ttbl), "r" (ttbh));
+   cpu_set_ttbr(0, __pa(swapper_pg_dir));
isb();
 }
 #else
-- 
1.7.9.5
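
A portable model of what the 64-bit accessors change, assuming %Q0 names the
low word and %R0 the high word of a 64-bit operand.  struct reg_pair and the
three helpers are hypothetical, for illustration only:

```c
#include <assert.h>
#include <stdint.h>

/* The two 32-bit words moved by a 64-bit coprocessor register transfer. */
struct reg_pair {
    uint32_t lo;   /* what %Q0 would name: bits 31..0  */
    uint32_t hi;   /* what %R0 would name: bits 63..32 */
};

/* New behaviour: both halves of the 64-bit value are transferred. */
static struct reg_pair split64(uint64_t val)
{
    struct reg_pair p = { (uint32_t)val, (uint32_t)(val >> 32) };
    return p;
}

/* Old behaviour: low word from a 32-bit variable, high word forced to 0. */
static struct reg_pair split_old(uint64_t val)
{
    struct reg_pair p = { (uint32_t)val, 0 };
    return p;
}

static uint64_t join64(struct reg_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}
```

For a table base of 0x800004000, join64(split64(...)) returns the value
unchanged, whereas join64(split_old(...)) yields 0x4000, which is the
truncation this patch removes.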



[PATCH 1/3] mips: zero out pg_data_t when it's allocated

2012-07-23 Thread Minchan Kim
This patch prepares for the next patch, which tries to remove the zeroing
of pg_data_t from the core MM.  At a glance, every arch except this one
already zeroes it, so this patch makes MIPS consistent with the others.

Cc: Ralf Baechle 
Cc: linux-m...@linux-mips.org
Signed-off-by: Minchan Kim 
---
 arch/mips/sgi-ip27/ip27-memory.c |1 +
 1 file changed, 1 insertion(+)

diff --git a/arch/mips/sgi-ip27/ip27-memory.c b/arch/mips/sgi-ip27/ip27-memory.c
index b105eca..cd8fcab 100644
--- a/arch/mips/sgi-ip27/ip27-memory.c
+++ b/arch/mips/sgi-ip27/ip27-memory.c
@@ -401,6 +401,7 @@ static void __init node_mem_init(cnodeid_t node)
 * Allocate the node data structures on the node first.
 */
__node_data[node] = __va(slot_freepfn << PAGE_SHIFT);
+   memset(__node_data[node], 0, PAGE_SIZE);
 
NODE_DATA(node)->bdata = &bootmem_node_data[node];
NODE_DATA(node)->node_start_pfn = start_pfn;
-- 
1.7.9.5



[RFC 20/23] mm: bootmem: use phys_addr_t for physical addresses

2012-07-23 Thread Cyril Chemparathy
From: Vitaly Andrianov 

On physical address extension (PAE) systems, physical memory may be located
outside the first 4GB address range.  In particular, on TI Keystone devices,
all memory (including lowmem) is located outside the 4G address space. Many
functions in bootmem.c use unsigned long as the type for physical addresses,
and this breaks badly on such PAE systems.

This patch intensively mangles the bootmem allocator to use phys_addr_t where
necessary.  We are aware that this is most certainly not the way to go
considering that the ARM architecture appears to be moving towards memblock.
Memblock may be a better solution, and fortunately it looks a lot more PAE
savvy than bootmem is.

However, we do not fully understand the motivations and restrictions behind
the mixed bootmem + memblock model in current ARM code. We hope for a
meaningful discussion and useful guidance towards a better solution to this
problem.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 include/linux/bootmem.h |   30 
 mm/bootmem.c|   59 ---
 2 files changed, 45 insertions(+), 44 deletions(-)

diff --git a/include/linux/bootmem.h b/include/linux/bootmem.h
index 6d6795d..e43c463 100644
--- a/include/linux/bootmem.h
+++ b/include/linux/bootmem.h
@@ -49,10 +49,10 @@ extern unsigned long free_all_bootmem_node(pg_data_t 
*pgdat);
 extern unsigned long free_all_bootmem(void);
 
 extern void free_bootmem_node(pg_data_t *pgdat,
- unsigned long addr,
+ phys_addr_t addr,
  unsigned long size);
-extern void free_bootmem(unsigned long addr, unsigned long size);
-extern void free_bootmem_late(unsigned long addr, unsigned long size);
+extern void free_bootmem(phys_addr_t addr, unsigned long size);
+extern void free_bootmem_late(phys_addr_t addr, unsigned long size);
 
 /*
  * Flags for reserve_bootmem (also if CONFIG_HAVE_ARCH_BOOTMEM_NODE,
@@ -65,44 +65,44 @@ extern void free_bootmem_late(unsigned long addr, unsigned 
long size);
 #define BOOTMEM_DEFAULT0
 #define BOOTMEM_EXCLUSIVE  (1<<0)
 
-extern int reserve_bootmem(unsigned long addr,
+extern int reserve_bootmem(phys_addr_t addr,
   unsigned long size,
   int flags);
 extern int reserve_bootmem_node(pg_data_t *pgdat,
-   unsigned long physaddr,
+   phys_addr_t physaddr,
unsigned long size,
int flags);
 
 extern void *__alloc_bootmem(unsigned long size,
 unsigned long align,
-unsigned long goal);
+phys_addr_t goal);
 extern void *__alloc_bootmem_nopanic(unsigned long size,
 unsigned long align,
-unsigned long goal);
+phys_addr_t goal);
 extern void *__alloc_bootmem_node(pg_data_t *pgdat,
  unsigned long size,
  unsigned long align,
- unsigned long goal);
+ phys_addr_t goal);
 void *__alloc_bootmem_node_high(pg_data_t *pgdat,
  unsigned long size,
  unsigned long align,
- unsigned long goal);
+ phys_addr_t goal);
 extern void *__alloc_bootmem_node_nopanic(pg_data_t *pgdat,
  unsigned long size,
  unsigned long align,
- unsigned long goal);
+ phys_addr_t goal);
 void *___alloc_bootmem_node_nopanic(pg_data_t *pgdat,
  unsigned long size,
  unsigned long align,
- unsigned long goal,
- unsigned long limit);
+ phys_addr_t goal,
+ phys_addr_t limit);
 extern void *__alloc_bootmem_low(unsigned long size,
 unsigned long align,
-unsigned long goal);
+phys_addr_t goal);
 extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
  unsigned long size,
  unsigned long align,
- unsigned long goal);
+ phys_addr_t goal);
 
 #ifdef CONFIG_NO_BOOTMEM
 /* We are using top down, so it is safe to use 0 here */
@@ -137,7 +137,7 @@ extern void *__alloc_bootmem_low_node(pg_data_t *pgdat,
 #define alloc_bootmem_low_pages_node(pgdat, x) \

[RFC 22/23] ARM: keystone: enable SMP on Keystone machines

2012-07-23 Thread Cyril Chemparathy
This patch adds basic SMP support for Keystone machines.  Nothing very fancy
here, just enough to get 4 CPUs booted up.  This does not include support for
hotplug, etc.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/Kconfig|1 +
 arch/arm/configs/keystone_defconfig |2 +
 arch/arm/mach-keystone/Makefile |1 +
 arch/arm/mach-keystone/keystone.c   |3 ++
 arch/arm/mach-keystone/platsmp.c|   73 +++
 5 files changed, 80 insertions(+)
 create mode 100644 arch/arm/mach-keystone/platsmp.c

diff --git a/arch/arm/Kconfig b/arch/arm/Kconfig
index 04c846b..5b82879 100644
--- a/arch/arm/Kconfig
+++ b/arch/arm/Kconfig
@@ -372,6 +372,7 @@ config ARCH_KEYSTONE
select SPARSE_IRQ
select NEED_MACH_MEMORY_H
select HAVE_SCHED_CLOCK
+   select HAVE_SMP
help
  Support for boards based on the Texas Instruments Keystone family of
  SoCs.
diff --git a/arch/arm/configs/keystone_defconfig 
b/arch/arm/configs/keystone_defconfig
index 7f2a04b..5f71e66 100644
--- a/arch/arm/configs/keystone_defconfig
+++ b/arch/arm/configs/keystone_defconfig
@@ -1,7 +1,9 @@
 CONFIG_EXPERIMENTAL=y
 CONFIG_BLK_DEV_INITRD=y
 CONFIG_ARCH_KEYSTONE=y
+CONFIG_SMP=y
 CONFIG_ARM_ARCH_TIMER=y
+CONFIG_NR_CPUS=4
 CONFIG_AEABI=y
 CONFIG_HIGHMEM=y
 CONFIG_VFP=y
diff --git a/arch/arm/mach-keystone/Makefile b/arch/arm/mach-keystone/Makefile
index d4671d5..3f6b8ab 100644
--- a/arch/arm/mach-keystone/Makefile
+++ b/arch/arm/mach-keystone/Makefile
@@ -1 +1,2 @@
 obj-y  := keystone.o
+obj-$(CONFIG_SMP)  += platsmp.o
diff --git a/arch/arm/mach-keystone/keystone.c 
b/arch/arm/mach-keystone/keystone.c
index 9583dc4..650e202 100644
--- a/arch/arm/mach-keystone/keystone.c
+++ b/arch/arm/mach-keystone/keystone.c
@@ -26,6 +26,8 @@
 #include 
 #include 
 
+extern struct smp_ops keystone_smp_ops;
+
 static struct map_desc io_desc[] = {
{
.virtual= 0xfe80UL,
@@ -73,6 +75,7 @@ static const char *keystone_match[] __initconst = {
 };
 
 DT_MACHINE_START(KEYSTONE, "Keystone")
+   smp_ops(keystone_smp_ops)
.map_io = keystone_map_io,
.init_irq   = keystone_init_irq,
.timer  = _timer,
diff --git a/arch/arm/mach-keystone/platsmp.c b/arch/arm/mach-keystone/platsmp.c
new file mode 100644
index 000..437659a
--- /dev/null
+++ b/arch/arm/mach-keystone/platsmp.c
@@ -0,0 +1,73 @@
+/*
+ * Copyright 2012 Texas Instruments, Inc.
+ *
+ * Based on platsmp.c, Copyright 2010-2011 Calxeda, Inc.
+ * Based on platsmp.c, Copyright (C) 2002 ARM Ltd.
+ *
+ * This program is free software; you can redistribute it and/or modify it
+ * under the terms and conditions of the GNU General Public License,
+ * version 2, as published by the Free Software Foundation.
+ *
+ * This program is distributed in the hope it will be useful, but WITHOUT
+ * ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or
+ * FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General Public License for
+ * more details.
+ *
+ * You should have received a copy of the GNU General Public License along with
+ * this program.  If not, see .
+ */
+#include 
+#include 
+#include 
+
+#include 
+#include 
+#include 
+#include 
+#include 
+
+extern void secondary_startup(void);
+
+static void __init keystone_smp_init_cpus(void)
+{
+   unsigned int i, ncores;
+
+   ncores = 4;
+
+   /* sanity check */
+   if (ncores > NR_CPUS) {
+   pr_warn("restricted to %d cpus\n", NR_CPUS);
+   ncores = NR_CPUS;
+   }
+
+   for (i = 0; i < ncores; i++)
+   set_cpu_possible(i, true);
+
+   set_smp_cross_call(gic_raise_softirq);
+}
+
+static void __init keystone_smp_prepare_cpus(unsigned int max_cpus)
+{
+   /* nothing for now */
+}
+
+static void __cpuinit keystone_secondary_init(unsigned int cpu)
+{
+   gic_secondary_init(0);
+}
+
+static int __cpuinit
+keystone_boot_secondary(unsigned int cpu, struct task_struct *idle)
+{
+   unsigned long *jump_ptr = phys_to_virt(0x81f0);
+
+   jump_ptr[cpu] = virt_to_idmap(&secondary_startup);
+   __cpuc_flush_dcache_area(jump_ptr, sizeof(jump_ptr) * 4);
+
+   return 0;
+}
+
+struct smp_ops keystone_smp_ops __initdata = {
+   smp_init_ops(keystone)
+   smp_secondary_ops(keystone)
+};
-- 
1.7.9.5



[RFC 15/23] ARM: LPAE: allow proc override of TTB setup

2012-07-23 Thread Cyril Chemparathy
This patch allows ARM processor setup functions (*_setup in proc-*.S) to
indicate that the page table has already been programmed.  This is
done by setting r4 (page table pointer) to -1 before returning from the
processor setup handler.

This capability is particularly needed on LPAE systems, where the translation
table base needs to be programmed differently with 64-bit control
register operations.

Further, a few of the processors (arm1026, mohawk, xsc3) were programming the
TTB twice.  This patch prevents the main head.S code from programming TTB the
second time on these machines.

Signed-off-by: Cyril Chemparathy 
Signed-off-by: Vitaly Andrianov 
---
 arch/arm/kernel/head.S   |   11 ++-
 arch/arm/mm/proc-arm1026.S   |1 +
 arch/arm/mm/proc-mohawk.S|1 +
 arch/arm/mm/proc-v6.S|2 ++
 arch/arm/mm/proc-v7-2level.S |3 ++-
 arch/arm/mm/proc-v7-3level.S |1 +
 arch/arm/mm/proc-v7.S|1 +
 arch/arm/mm/proc-xsc3.S  |1 +
 8 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/arch/arm/kernel/head.S b/arch/arm/kernel/head.S
index 835898e..692e57f 100644
--- a/arch/arm/kernel/head.S
+++ b/arch/arm/kernel/head.S
@@ -411,17 +411,18 @@ __enable_mmu:
 #ifdef CONFIG_CPU_ICACHE_DISABLE
bic r0, r0, #CR_I
 #endif
-#ifdef CONFIG_ARM_LPAE
-   mov r5, #0
-   mcrr p15, 0, r4, r5, c2  @ load TTBR0
-#else
+#ifndef CONFIG_ARM_LPAE
mov r5, #(domain_val(DOMAIN_USER, DOMAIN_MANAGER) | \
  domain_val(DOMAIN_KERNEL, DOMAIN_MANAGER) | \
  domain_val(DOMAIN_TABLE, DOMAIN_MANAGER) | \
  domain_val(DOMAIN_IO, DOMAIN_CLIENT))
mcr p15, 0, r5, c3, c0, 0   @ load domain access register
-   mcr p15, 0, r4, c2, c0, 0   @ load page table pointer
 #endif
+
+   @ has the processor setup already programmed the page table pointer?
+   adds r5, r4, #1
+   beq __turn_mmu_on   @ yes!
+   mcr p15, 0, r4, c2, c0, 0   @ load page table pointer
b   __turn_mmu_on
 ENDPROC(__enable_mmu)
 
diff --git a/arch/arm/mm/proc-arm1026.S b/arch/arm/mm/proc-arm1026.S
index fbc1d5f..c28070e 100644
--- a/arch/arm/mm/proc-arm1026.S
+++ b/arch/arm/mm/proc-arm1026.S
@@ -404,6 +404,7 @@ __arm1026_setup:
 #ifdef CONFIG_MMU
mcr p15, 0, r0, c8, c7  @ invalidate I,D TLBs on v4
mcr p15, 0, r4, c2, c0  @ load page table pointer
+   mvn r4, #0  @ do not set page table pointer
 #endif
 #ifdef CONFIG_CPU_DCACHE_WRITETHROUGH
mov r0, #4  @ explicitly disable writeback
diff --git a/arch/arm/mm/proc-mohawk.S b/arch/arm/mm/proc-mohawk.S
index fbb2124..a26303c 100644
--- a/arch/arm/mm/proc-mohawk.S
+++ b/arch/arm/mm/proc-mohawk.S
@@ -390,6 +390,7 @@ __mohawk_setup:
mcr p15, 0, r0, c8, c7  @ invalidate I,D TLBs
orr r4, r4, #0x18   @ cache the page table in L2
mcr p15, 0, r4, c2, c0, 0   @ load page table pointer
+   mvn r4, #0  @ do not set page table pointer
 
mov r0, #0  @ don't allow CP access
mcr p15, 0, r0, c15, c1, 0  @ write CP access register
diff --git a/arch/arm/mm/proc-v6.S b/arch/arm/mm/proc-v6.S
index 566c658..872156e 100644
--- a/arch/arm/mm/proc-v6.S
+++ b/arch/arm/mm/proc-v6.S
@@ -210,7 +210,9 @@ __v6_setup:
ALT_UP(orr  r4, r4, #TTB_FLAGS_UP)
ALT_SMP(orr r8, r8, #TTB_FLAGS_SMP)
ALT_UP(orr  r8, r8, #TTB_FLAGS_UP)
+   mcr p15, 0, r4, c2, c0, 0   @ load TTB0
mcr p15, 0, r8, c2, c0, 1   @ load TTB1
+   mvn r4, #0  @ do not set page table pointer
 #endif /* CONFIG_MMU */
adr r5, v6_crval
ldmia   r5, {r5, r6}
diff --git a/arch/arm/mm/proc-v7-2level.S b/arch/arm/mm/proc-v7-2level.S
index 3397803..cc78c0c 100644
--- a/arch/arm/mm/proc-v7-2level.S
+++ b/arch/arm/mm/proc-v7-2level.S
@@ -139,7 +139,7 @@ ENDPROC(cpu_v7_set_pte_ext)
 
/*
 * Macro for setting up the TTBRx and TTBCR registers.
-* - \ttb0 and \ttb1 updated with the corresponding flags.
+* - \ttbr0 and \ttbr1 updated with the corresponding flags.
 */
.macro  v7_ttb_setup, zero, ttbr0, ttbr1, tmp
mcr p15, 0, \zero, c2, c0, 2@ TTB control register
@@ -147,6 +147,7 @@ ENDPROC(cpu_v7_set_pte_ext)
ALT_UP(orr  \ttbr0, \ttbr0, #TTB_FLAGS_UP)
ALT_SMP(orr \ttbr1, \ttbr1, #TTB_FLAGS_SMP)
ALT_UP(orr  \ttbr1, \ttbr1, #TTB_FLAGS_UP)
+   mcr p15, 0, \ttbr0, c2, c0, 0   @ load TTB0
mcr p15, 0, \ttbr1, c2, c0, 1   @ load TTB1
.endm
 
diff --git a/arch/arm/mm/proc-v7-3level.S b/arch/arm/mm/proc-v7-3level.S
index 3b1a745..5e3bed1 100644
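
A small C model of the r4 == -1 convention described in the commit message,
assuming a 32-bit register.  The register stand-ins and function names are
hypothetical, and the real logic lives in assembly in head.S and proc-*.S:

```c
#include <assert.h>
#include <stdint.h>

#define TTB_ALREADY_SET 0xffffffffu   /* the "-1" a setup routine leaves in r4 */

static uint32_t ttbr0;                /* stand-in for the hardware register */
static unsigned int ttbr0_writes;     /* counts how often it was programmed */

static void write_ttbr0(uint32_t v)
{
    ttbr0 = v;
    ttbr0_writes++;
}

/* Model of a *_setup routine that programs the table base itself. */
static uint32_t setup_programs_ttb(uint32_t pgd)
{
    write_ttbr0(pgd);
    return TTB_ALREADY_SET;           /* corresponds to: mvn r4, #0 */
}

/* Model of the tail of __enable_mmu. */
static void enable_mmu(uint32_t r4)
{
    if ((uint32_t)(r4 + 1) == 0)      /* corresponds to: adds r5, r4, #1 ; beq */
        return;                       /* already programmed, skip the write */
    write_ttbr0(r4);
}
```

Calling enable_mmu(setup_programs_ttb(pgd)) leaves ttbr0_writes at 1, which
is the double-programming fix mentioned for arm1026, mohawk and xsc3.  A
setup routine that leaves r4 untouched still gets its write from enable_mmu().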

[RFC 00/23] Introducing the TI Keystone platform

2012-07-23 Thread Cyril Chemparathy
TI's scalable KeyStone II architecture includes support for both TMS320C66x
floating point DSPs and ARM Cortex-A15 clusters, for a mixture of up to 32
cores per SoC.  The solution is optimized around a high performance chip
interconnect and a rich set of on chip peripherals.  Please refer to [1] for
initial technical documentation on these devices.

This patch series provides a basic Linux port for these devices, including
support for SMP, and LPAE boot.  A majority of the patches in this series are
related to LPAE functionality, imposed by the device architecture which has
system memory mapped at an address above the 4G 32-bit addressable limit.

This patch series is based on the v3.5 kernel with the smp_ops patch set
applied on top.  This series is being posted to elicit early feedback, and so
that some of these fixes may get incorporated early on into the kernel code.

  [1] - http://www.ti.com/product/tms320tci6636


Cyril Chemparathy (17):
  ARM: LPAE: use signed arithmetic for mask definitions
  ARM: LPAE: use phys_addr_t on virt <--> phys conversion
  ARM: LPAE: use phys_addr_t for membank size
  ARM: LPAE: use 64-bit pgd physical address in switch_mm()
  ARM: LPAE: use 64-bit accessors for TTBR registers
  ARM: mm: use physical addresses in highmem sanity checks
  ARM: mm: cleanup checks for membank overlap with vmalloc area
  ARM: mm: clean up membank size limit checks
  ARM: LPAE: define ARCH_LOW_ADDRESS_LIMIT for bootmem
  ARM: LPAE: factor out T1SZ and TTBR1 computations
  ARM: LPAE: allow proc override of TTB setup
  ARM: LPAE: accomodate >32-bit addresses for page table base
  ARM: add machine desc hook for early memory/paging initialization
  drivers: cma: fix addressing on PAE machines
  ARM: keystone: introducing TI Keystone platform
  ARM: keystone: enable SMP on Keystone machines
  ARM: keystone: add switch over to high physical address range

Vitaly Andrianov (6):
  ARM: LPAE: disable phys-to-virt patching on PAE systems
  ARM: LPAE: use phys_addr_t in alloc_init_pud()
  ARM: LPAE: use phys_addr_t in free_memmap()
  ARM: LPAE: use phys_addr_t for initrd location and size
  ARM: add virt_to_idmap for interconnect aliasing
  mm: bootmem: use phys_addr_t for physical addresses

 arch/arm/Kconfig  |   20 +++
 arch/arm/Makefile |1 +
 arch/arm/boot/dts/keystone-sim.dts|   77 +
 arch/arm/configs/keystone_defconfig   |   23 +++
 arch/arm/include/asm/cache.h  |9 ++
 arch/arm/include/asm/mach/arch.h  |1 +
 arch/arm/include/asm/memory.h |   28 +++-
 arch/arm/include/asm/page.h   |7 +-
 arch/arm/include/asm/pgtable-3level-hwdef.h   |   10 ++
 arch/arm/include/asm/pgtable-3level.h |6 +-
 arch/arm/include/asm/proc-fns.h   |   28 +++-
 arch/arm/include/asm/setup.h  |4 +-
 arch/arm/kernel/head.S|   18 ++-
 arch/arm/kernel/setup.c   |3 +
 arch/arm/kernel/smp.c |   11 +-
 arch/arm/mach-keystone/Makefile   |2 +
 arch/arm/mach-keystone/Makefile.boot  |1 +
 arch/arm/mach-keystone/include/mach/debug-macro.S |   44 +
 arch/arm/mach-keystone/include/mach/entry-macro.S |   20 +++
 arch/arm/mach-keystone/include/mach/io.h  |   22 +++
 arch/arm/mach-keystone/include/mach/memory.h  |   51 ++
 arch/arm/mach-keystone/include/mach/system.h  |   30 
 arch/arm/mach-keystone/include/mach/timex.h   |   21 +++
 arch/arm/mach-keystone/include/mach/uncompress.h  |   24 +++
 arch/arm/mach-keystone/include/mach/vmalloc.h |   21 +++
 arch/arm/mach-keystone/keystone.c |  178 +
 arch/arm/mach-keystone/platsmp.c  |   94 +++
 arch/arm/mm/context.c |   13 +-
 arch/arm/mm/idmap.c   |4 +-
 arch/arm/mm/init.c|   20 +--
 arch/arm/mm/mmu.c |   49 ++
 arch/arm/mm/proc-arm1026.S|3 +
 arch/arm/mm/proc-mohawk.S |3 +
 arch/arm/mm/proc-v6.S |6 +-
 arch/arm/mm/proc-v7-2level.S  |7 +-
 arch/arm/mm/proc-v7-3level.S  |   29 ++--
 arch/arm/mm/proc-v7.S |2 +
 arch/arm/mm/proc-xsc3.S   |3 +
 drivers/base/dma-contiguous.c |4 +-
 include/linux/bootmem.h   |   30 ++--
 mm/bootmem.c  |   59 +++
 41 files changed, 840 insertions(+), 146 deletions(-)
 create mode 100644 arch/arm/boot/dts/keystone-sim.dts
 create mode 100644 arch/arm/configs/keystone_defconfig
 create mode 100644 

[RFC 03/23] ARM: LPAE: use phys_addr_t on virt <--> phys conversion

2012-07-23 Thread Cyril Chemparathy
This patch fixes up the types used when converting back and forth between
physical and virtual addresses.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/include/asm/memory.h |   17 +++--
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/arch/arm/include/asm/memory.h b/arch/arm/include/asm/memory.h
index fcb5757..7629dfe 100644
--- a/arch/arm/include/asm/memory.h
+++ b/arch/arm/include/asm/memory.h
@@ -169,22 +169,27 @@ extern unsigned long __pv_phys_offset;
: "=r" (to) \
: "r" (from), "I" (type))
 
-static inline unsigned long __virt_to_phys(unsigned long x)
+static inline phys_addr_t __virt_to_phys(unsigned long x)
 {
unsigned long t;
__pv_stub(x, t, "add", __PV_BITS_31_24);
return t;
 }
 
-static inline unsigned long __phys_to_virt(unsigned long x)
+static inline unsigned long __phys_to_virt(phys_addr_t x)
 {
unsigned long t;
__pv_stub(x, t, "sub", __PV_BITS_31_24);
return t;
 }
 #else
-#define __virt_to_phys(x)  ((x) - PAGE_OFFSET + PHYS_OFFSET)
-#define __phys_to_virt(x)  ((x) - PHYS_OFFSET + PAGE_OFFSET)
+
+#define __virt_to_phys(x)  \
+   ((phys_addr_t)(x) - PAGE_OFFSET + PHYS_OFFSET)
+
+#define __phys_to_virt(x)  \
+   ((unsigned long)((phys_addr_t)(x) - PHYS_OFFSET + PAGE_OFFSET))
+
 #endif
 #endif
 
@@ -219,14 +224,14 @@ static inline phys_addr_t virt_to_phys(const volatile 
void *x)
 
 static inline void *phys_to_virt(phys_addr_t x)
 {
-   return (void *)(__phys_to_virt((unsigned long)(x)));
+   return (void *)__phys_to_virt(x);
 }
 
 /*
  * Drivers should NOT use these either.
  */
 #define __pa(x)__virt_to_phys((unsigned long)(x))
-#define __va(x)((void *)__phys_to_virt((unsigned 
long)(x)))
+#define __va(x)((void 
*)__phys_to_virt((phys_addr_t)(x)))
 #define pfn_to_kaddr(pfn)  __va((pfn) << PAGE_SHIFT)
 
 /*
-- 
1.7.9.5
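
A user-space sketch of why the casts matter, assuming a 32-bit virtual
address type and a 64-bit phys_addr_t.  PAGE_OFFSET and PHYS_OFFSET are
example values, and the helper names are made up for this illustration:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;   /* assumption: LPAE, 64-bit physical addresses */
typedef uint32_t virt_addr_t;   /* stand-in for 32-bit 'unsigned long' on ARM */

#define PAGE_OFFSET 0xc0000000u       /* illustrative */
#define PHYS_OFFSET 0x800000000ULL    /* illustrative: RAM starts above 4G */

/* Widen first, so the offsets are applied in 64-bit arithmetic. */
static phys_addr_t virt_to_phys64(virt_addr_t x)
{
    return (phys_addr_t)x - PAGE_OFFSET + PHYS_OFFSET;
}

/* Do the arithmetic in 64 bits, then narrow back to a virtual address. */
static virt_addr_t phys_to_virt64(phys_addr_t x)
{
    return (virt_addr_t)(x - PHYS_OFFSET + PAGE_OFFSET);
}

/* Pre-patch shape: everything squeezed through a 32-bit type. */
static virt_addr_t virt_to_phys32(virt_addr_t x)
{
    return (virt_addr_t)(x - PAGE_OFFSET + (virt_addr_t)PHYS_OFFSET);
}
```

With these values, assert(virt_to_phys64(0xc0100000u) == 0x800100000ULL)
holds and phys_to_virt64() maps it back, while virt_to_phys32() returns
0x00100000 because the upper bits of the physical offset are lost.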

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC 05/23] ARM: LPAE: use phys_addr_t in free_memmap()

2012-07-23 Thread Cyril Chemparathy
From: Vitaly Andrianov 

free_memmap() was mistakenly using the unsigned long type to represent
physical addresses.  This breaks on PAE systems where memory could be placed
above the 32-bit addressable limit.

This patch fixes this function to properly use phys_addr_t instead.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 arch/arm/mm/init.c |6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/arch/arm/mm/init.c b/arch/arm/mm/init.c
index f54d592..8252c31 100644
--- a/arch/arm/mm/init.c
+++ b/arch/arm/mm/init.c
@@ -457,7 +457,7 @@ static inline void
 free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 {
struct page *start_pg, *end_pg;
-   unsigned long pg, pgend;
+   phys_addr_t pg, pgend;
 
/*
 * Convert start_pfn/end_pfn to a struct page pointer.
@@ -469,8 +469,8 @@ free_memmap(unsigned long start_pfn, unsigned long end_pfn)
 * Convert to physical addresses, and
 * round start upwards and end downwards.
 */
-   pg = (unsigned long)PAGE_ALIGN(__pa(start_pg));
-   pgend = (unsigned long)__pa(end_pg) & PAGE_MASK;
+   pg = PAGE_ALIGN(__pa(start_pg));
+   pgend = __pa(end_pg) & PAGE_MASK;
 
/*
 * If there are free pages between these,
-- 
1.7.9.5
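
A sketch of the rounding step with 64-bit physical addresses.  It assumes
4K pages and, unlike the kernel headers, defines the page mask directly in
the 64-bit type so that masking cannot clear the upper address bits.
whole_pages() is a hypothetical helper:

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

#define PAGE_SHIFT 12
#define PAGE_SIZE  ((phys_addr_t)1 << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))    /* 64-bit mask by construction */

static phys_addr_t page_align_up(phys_addr_t a)
{
    return (a + PAGE_SIZE - 1) & PAGE_MASK;
}

static phys_addr_t page_align_down(phys_addr_t a)
{
    return a & PAGE_MASK;
}

/* Hypothetical helper: number of whole pages inside [start, end). */
static phys_addr_t whole_pages(phys_addr_t start, phys_addr_t end)
{
    phys_addr_t pg = page_align_up(start);      /* round start upwards */
    phys_addr_t pgend = page_align_down(end);   /* round end downwards */

    return pg < pgend ? (pgend - pg) >> PAGE_SHIFT : 0;
}
```

For the range 0x800000800..0x800003800 this gives two whole pages, starting
at 0x800001000.  Storing either bound in a 32-bit variable would drop the
leading 0x8 and describe memory that does not exist on such a system.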

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


[RFC 19/23] drivers: cma: fix addressing on PAE machines

2012-07-23 Thread Cyril Chemparathy
This patch fixes a couple of bugs that otherwise impair CMA functionality on
PAE machines:

  - alignment must be a 64-bit type when running on systems with 64-bit
physical addresses.  If this is not the case, the limit calculation thunks
allocations down to an address range < 4G.

  - The allocated range is now being checked using dma_supported() instead of
hardcoding a 32-bit addressable limit.

Signed-off-by: Vitaly Andrianov 
Signed-off-by: Cyril Chemparathy 
---
 drivers/base/dma-contiguous.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/base/dma-contiguous.c b/drivers/base/dma-contiguous.c
index 78efb03..e10bd9a 100644
--- a/drivers/base/dma-contiguous.c
+++ b/drivers/base/dma-contiguous.c
@@ -234,7 +234,7 @@ int __init dma_declare_contiguous(struct device *dev, 
unsigned long size,
  phys_addr_t base, phys_addr_t limit)
 {
struct cma_reserved *r = &cma_reserved[cma_reserved_count];
-   unsigned long alignment;
+   phys_addr_t alignment;
 
pr_debug("%s(size %lx, base %08lx, limit %08lx)\n", __func__,
 (unsigned long)size, (unsigned long)base,
@@ -271,7 +271,7 @@ int __init dma_declare_contiguous(struct device *dev, 
unsigned long size,
if (!addr) {
base = -ENOMEM;
goto err;
-   } else if (addr + size > ~(unsigned long)0) {
+   } else if (!dma_supported(dev, addr + size)) {
memblock_free(addr, size);
base = -EINVAL;
goto err;
-- 
1.7.9.5
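
A minimal sketch of the first point in the changelog, assuming the limit is rounded down with a mask of the form limit & ~(alignment - 1). The names demo_phys_addr_t, demo_round_down_narrow() and demo_round_down_wide() are made up for illustration and are not taken from dma-contiguous.c. Fixed-width types stand in for a 32-bit unsigned long and a 64-bit phys_addr_t so the effect shows up on any host.

```c
#include <assert.h>
#include <stdint.h>

/* Stand-in for a 64-bit phys_addr_t on a PAE system (illustrative). */
typedef uint64_t demo_phys_addr_t;

/* Alignment held in a 32-bit type: the inverted mask is only 32 bits
 * wide, so zero-extension wipes the upper half of the limit. */
static demo_phys_addr_t demo_round_down_narrow(demo_phys_addr_t limit,
					       uint32_t alignment)
{
	assert(alignment && !(alignment & (alignment - 1)));
	return limit & ~(alignment - 1);
}

/* Alignment held in the 64-bit type: the mask covers all 64 bits. */
static demo_phys_addr_t demo_round_down_wide(demo_phys_addr_t limit,
					     demo_phys_addr_t alignment)
{
	assert(alignment && !(alignment & (alignment - 1)));
	return limit & ~(alignment - 1);
}
```

With a limit of 0x880000000 and a 4 MiB alignment, the narrow variant yields 0x80000000 (below 4G), while the wide variant leaves the limit unchanged.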



Re: [PATCH -alternative] mm: hugetlbfs: Close race during teardown of hugetlbfs shared page tables V2 (resend)

2012-07-23 Thread Hugh Dickins
On Mon, 23 Jul 2012, Mel Gorman wrote:
> On Sun, Jul 22, 2012 at 09:04:33PM -0700, Hugh Dickins wrote:
> > On Fri, 20 Jul 2012, Mel Gorman wrote:
> > > On Fri, Jul 20, 2012 at 04:36:35PM +0200, Michal Hocko wrote:
> 
> I like it in that it's simple and I can confirm it works for the test case
> of interest.

Phew, I'm glad to hear that, thanks.

> 
> However, is your patch not vulnerable to truncate issues?
> madvise()/truncate() issues were the main reason why I was wary of VMA tricks
> as a solution. As it turns out, madvise(DONTNEED) is not a problem as it is
> ignored for hugetlbfs but I think truncate is still problematic. Let's say
> we mmap(MAP_SHARED) a hugetlbfs file and then truncate for whatever reason.
> 
> invalidate_inode_pages2
>   invalidate_inode_pages2_range
> unmap_mapping_range_vma
>   zap_page_range_single
> unmap_single_vma
> __unmap_hugepage_range (removes VM_MAYSHARE)
> 
> The VMA still exists so the consequences for this would be varied but
> minimally fault is going to be "interesting".

You had me worried there, I hadn't considered truncation or invalidation2
at all.

But actually, I don't think they do pose any problem for my patch.  They
would indeed if I were removing VM_MAYSHARE in __unmap_hugepage_range()
as you show above; but no, I'm removing it in unmap_hugepage_range().

That's only called by unmap_single_vma(): which is called via
unmap_vmas() by unmap_region() or exit_mmap() just before free_pgtables()
(the problem cases); or by madvise_dontneed() via zap_page_range(), which
as you note is disallowed on VM_HUGETLB; or by zap_page_range_single().

zap_page_range_single() is called by zap_vma_ptes(), which is only
allowed on VM_PFNMAP; or by unmap_mapping_range_vma(), which looked
like it was going to deadlock on i_mmap_mutex (with or without my
patch) until I realized that hugetlbfs has its own hugetlbfs_setattr()
and hugetlb_vmtruncate() which don't use unmap_mapping_range() at all.

invalidate_inode_pages2() (and _range()) do use unmap_mapping_range(),
but hugetlbfs doesn't support direct_IO, and otherwise I think they're
called by a filesystem directly on its own inodes, which hugetlbfs
does not.  Anyway, if there's a deadlock on i_mmap_mutex somewhere
in there, it's not introduced by the proposed patch.

So, unmap_hugepage_range() is only being called in the problem cases,
just before free_pgtables(), when unmapping a vma (with mmap_sem held),
or when exiting (when we have the last reference to mm): in each case,
the vma is on its way out, and VM_MAYSHARE no longer of interest to others.

I spent a while concerned that I'd overlooked the truncation case, before
realizing that it's not a problem: the issue comes when we free_pgtables(),
which truncation makes no attempt to do.

So, after a bout of anxiety, I think my &= ~VM_MAYSHARE remains good.

Hugh


Re: [GIT PULL 00/15] arm-soc: changes for v3.6

2012-07-23 Thread Linus Torvalds
On Mon, Jul 23, 2012 at 1:32 PM, Arnd Bergmann  wrote:
>
> There are about 600 changesets in here, and a bunch of simple
> conflicts between the 15 branches, but so far no conflicts with
> stuff that you have merged already. For reference, you can check
> the "for-linus" branch in the same tree to see how we resolved
> the conflicts.

Ok, so I've merged some things a bit differently, but afaik it's all
semantically identical *except* for the merge of the file

  arch/arm/mach-omap2/clockdomains3xxx_data.c

in my merge of your "general arm-soc cleanups" pull request.

In your "for-linus" branch, "_3xxx_clkdm" remains in the
clockdomains_common[] array. In my merge, it is gone. But I think I
did the merge correctly, and you did it wrong. HOWEVER, I don't know
the code, maybe there is some subtle reason why you did it like you
did.

Your "for-linus" branch also had that

   arch/arm/arm-soc-for-next-contents.txt

file that shouldn't have been there, but whatever.

Anyway, apart from that "please check" comment, I also have small
complaint: your pull requests didn't actually point to the tags, they
pointed to the next/xyz commits. So every time I did a pull, I had to
change "next/xyz" to "tags/xyz". That's just annoying make-work. I
think it's because you just said "xyz" to the git request-pull script,
and then git had to pick one of the things and picked next. Please
disambiguate by just saying "tags/xyz" explicitly.

  Linus


Re: [PATCH 2/2] perf: use XSI-compliant version of strerror_r() instead of GNU-specific

2012-07-23 Thread Namhyung Kim
On Tue, 24 Jul 2012 00:06:54 +0300, Kirill A. Shutemov wrote:
> From 11d62205ee3c534aa9b0e9a24a312438ac726ffb Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" 
> Date: Mon, 23 Jul 2012 17:41:05 +0300
> Subject: [PATCH 2/2] perf: fix strerror_r() usage
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Perf uses GNU-specific version of strerror_r(). The GNU-specific
> strerror_r() returns a pointer to a string containing the error message.
> This may be either a pointer to a string that the function stores in
> buf, or a pointer to some (immutable) static string (in which case buf
> is unused).
>
> In glibc-2.16 GNU version was marked with attribute warn_unused_result.
> It triggers few warnings in perf:
>
> util/target.c: In function ‘perf_target__strerror’:
> util/target.c:114:13: error: ignoring return value of ‘strerror_r’, declared 
> with attribute warn_unused_result [-Werror=unused-result]
> ui/browsers/hists.c: In function ‘hist_browser__dump’:
> ui/browsers/hists.c:981:13: error: ignoring return value of ‘strerror_r’, 
> declared with attribute warn_unused_result [-Werror=unused-result]
>
> They are bugs.
>
> Let's fix strerror_r() usage.
>

Thanks for fixing this. Just a minor nitpick below..


> Signed-off-by: Kirill A. Shutemov 
> ---
>  tools/perf/ui/browsers/hists.c |  4 ++--
>  tools/perf/util/target.c   | 12 +++-
>  2 files changed, 13 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/ui/browsers/hists.c b/tools/perf/ui/browsers/hists.c
> index 482f051..413bd62 100644
> --- a/tools/perf/ui/browsers/hists.c
> +++ b/tools/perf/ui/browsers/hists.c
> @@ -978,8 +978,8 @@ static int hist_browser__dump(struct hist_browser 
> *browser)
>   fp = fopen(filename, "w");
>   if (fp == NULL) {
>   char bf[64];
> - strerror_r(errno, bf, sizeof(bf));
> - ui_helpline__fpush("Couldn't write to %s: %s", filename, bf);
> + const char *err = strerror_r(errno, bf, sizeof(bf));
> + ui_helpline__fpush("Couldn't write to %s: %s", filename, err);
>   return -1;
>   }
>  
> diff --git a/tools/perf/util/target.c b/tools/perf/util/target.c
> index 1064d5b..5c4b3b1 100644
> --- a/tools/perf/util/target.c
> +++ b/tools/perf/util/target.c
> @@ -9,6 +9,7 @@
>  #include "target.h"
>  #include "debug.h"
>  
> +#include 
>  #include 
>  #include 
>  
> @@ -110,8 +111,17 @@ int perf_target__strerror(struct perf_target *target, 
> int errnum,
>   int idx;
>   const char *msg;
>  
> + assert(buflen > 0);
> +

It seems perf (and I, too) prefers BUG_ON to assert:

  namhyung@sejong:perf$ git grep BUG_ON\( | wc -l
  55
  namhyung@sejong:perf$ git grep assert\( | wc -l
  16
  
It's not a big deal, though. I'm ok if others are happy with it.

Thanks,
Namhyung


>   if (errnum >= 0) {
> - strerror_r(errnum, buf, buflen);
> + const char *err = strerror_r(errnum, buf, buflen);
> +
> + if (err != buf) {
> + size_t len = strlen(err);
> + char *c = mempcpy(buf, err, min(buflen - 1, len));
> + *c = '\0';
> + }
> +
>   return 0;
>   }
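
For reference, a rough sketch of how the two strerror_r() flavours described in the quoted changelog differ from the caller's side. It is not what the patch does: it uses C11 _Generic to pick a handler from the return type, and the helper names (demo_msg_from_xsi, demo_msg_from_gnu, demo_errno_to_buf) are invented for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/* XSI flavour: returns 0 on success and writes the message into buf. */
static const char *demo_msg_from_xsi(int rc, int errnum, char *buf, size_t len)
{
	if (rc != 0)
		snprintf(buf, len, "Unknown error %d", errnum);
	return buf;
}

/* GNU flavour: returns a pointer that may be a static string, in which
 * case buf was left untouched and the message has to be copied in. */
static const char *demo_msg_from_gnu(char *msg, int errnum, char *buf, size_t len)
{
	(void)errnum;
	if (msg != buf)
		snprintf(buf, len, "%s", msg);
	return buf;
}

/* Always leaves a NUL-terminated message in buf, whichever flavour the
 * C library exposes. The _Generic controlling expression is not
 * evaluated, so strerror_r() is called only once. */
static const char *demo_errno_to_buf(int errnum, char *buf, size_t len)
{
	assert(len > 0);
	return _Generic(strerror_r(errnum, buf, len),
			int: demo_msg_from_xsi,
			char *: demo_msg_from_gnu)
		(strerror_r(errnum, buf, len), errnum, buf, len);
}
```

The copy in the GNU branch corresponds to the "err != buf" case handled in the quoted hunk; snprintf() is used here only because it truncates and terminates in one call.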


Re: [PATCH 1/2] perf: fix build error

2012-07-23 Thread Namhyung Kim
Hi, Kirill

On Tue, 24 Jul 2012 00:04:07 +0300, Kirill A. Shutemov wrote:
> From 14f476dddcb36bca93a50ef1a3341e2c60836e0a Mon Sep 17 00:00:00 2001
> From: "Kirill A. Shutemov" 
> Date: Mon, 23 Jul 2012 17:39:11 +0300
> Subject: [PATCH 1/2] perf: fix build error
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Bison 2.6 started to generate parse_events_parse() declaration in
> header. In this case we have redundant redeclaration:
>
> util/parse-events.c:29:5: error: redundant redeclaration of 
> ‘parse_events_parse’ [-Werror=redundant-decls]
> In file included from util/parse-events.c:14:0:
> util/parse-events-bison.h:99:5: note: previous declaration of 
> ‘parse_events_parse’ was here
> cc1: all warnings being treated as errors
>
> Let's disable -Wredundant-decls for util/parse-events.c since it
> includes header we can't control.
>

It'd be better if the subject line is more descriptive. Like:

  "perf tools: fix a build error with bison 2.6"

Other than that, looks good to me.

Thanks,
Namhyung


> Signed-off-by: Kirill A. Shutemov 
> ---
>  tools/perf/Makefile | 3 +++
>  1 file changed, 3 insertions(+)
>
> diff --git a/tools/perf/Makefile b/tools/perf/Makefile
> index 75d74e5..1091192 100644
> --- a/tools/perf/Makefile
> +++ b/tools/perf/Makefile
> @@ -803,6 +803,9 @@ $(OUTPUT)ui/browsers/map.o: ui/browsers/map.c 
> $(OUTPUT)PERF-CFLAGS
>  $(OUTPUT)util/rbtree.o: ../../lib/rbtree.c $(OUTPUT)PERF-CFLAGS
>   $(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) 
> -DETC_PERFCONFIG='"$(ETC_PERFCONFIG_SQ)"' $<
>  
> +$(OUTPUT)util/parse-events.o: util/parse-events.c $(OUTPUT)PERF-CFLAGS
> + $(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) -Wno-redundant-decls $<
> +
>  $(OUTPUT)util/scripting-engines/trace-event-perl.o: 
> util/scripting-engines/trace-event-perl.c $(OUTPUT)PERF-CFLAGS
>   $(QUIET_CC)$(CC) -o $@ -c $(ALL_CFLAGS) $(PERL_EMBED_CCOPTS) 
> -Wno-redundant-decls -Wno-strict-prototypes -Wno-unused-parameter -Wno-shadow 
> $<
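
As a side note, the same warning can in principle be silenced at the source level rather than per object file in the Makefile. This is only a sketch of that alternative, not a suggestion to change the patch: demo_parse() is a made-up stand-in for the generated declaration, and the diagnostic push/pop pragmas need a reasonably recent GCC (4.6 or later, as far as I know).

```c
/* Stand-in for the declaration a generated header would provide. */
int demo_parse(void);

#pragma GCC diagnostic push
#pragma GCC diagnostic ignored "-Wredundant-decls"
/* Redundant but legal redeclaration; the warning is silenced only
 * between the push and pop lines. */
int demo_parse(void);
#pragma GCC diagnostic pop

int demo_parse(void)
{
	return 0;
}
```

The Makefile approach in the patch has the advantage of working with older compilers and of not touching the C source.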


Re: Release Announcements

2012-07-23 Thread Shea Levy

On 07/23/2012 01:52 PM, J.H. wrote:

On 07/23/2012 02:22 AM, Borislav Petkov wrote:

On Sun, Jul 22, 2012 at 12:08:34PM -0400, Shea Levy wrote:

The linux-kernel-announce doesn't seem to have had any traffic
since 3.1-rc4 (maybe due to the kernel.org break-in?). Is there a
recommended way to get email news of kernel releases without being
subscribed to the main kernel list?

Let's CC some more people about this.


Follow the respective gitweb RSS feeds?


Fair enough, I'll do this until another solution is forthcoming.


I'll have to do some digging to
figure out where those e-mails got generated from.  I do want to say
that that *SHOULD* be working, but if the e-mails aren't showing up in
the archives that may have gotten broken somewhere.


They're not just not showing up in the archives, I subscribed a week or 
so before the 3.5 release and I've not gotten a single mail yet.


Cheers,
Shea


Re: [RFC] ARM: sched_clock: update epoch_cyc on resume

2012-07-23 Thread Colin Cross
On Mon, Jul 23, 2012 at 5:14 PM, Linus Walleij  wrote:
> On Mon, Jul 23, 2012 at 9:27 PM, Colin Cross  wrote:
>> On Mon, Jul 23, 2012 at 11:55 AM, Linus Walleij
>
>> Does the clock you use for sched_clock continue to run in all suspend
>> modes? All the SoC's I've used only have a 32kHz clock in the deepest
>> suspend mode,
>
> Yes, and yes it is 32kHz.
>
>> which is not ideal for sched_clock.
>
> Not that I know why scheduling with 32kHz is so bad compared to the
> default system scheduling granularity which is HZ if you don't have
> sched_clock() implemented.
>
> Since this seems to be such an important point, what makes you
> want MHz:es for scheduling granularity? To me the biggest impact
> is actually the granularity of the timestamps in the printk:s.
>
> (It's not that I doubt your needs, more curiosity.)

There's a comment somewhere about higher resolution sched_clock
providing fairer scheduling.  With 32 kHz sched_clock, every runtime
measured by the scheduler will be wrong by up to 31.25 us.  Most
systems have a faster clock, and if it's available it seems silly not
to use it.

It's also used for tracing, where 31.25 us resolution is a little low
for function tracing or function graph tracing.
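
To put numbers on that, a small sketch of the arithmetic; demo_tick_period_ns() is just an illustrative helper. The 31.25 us figure corresponds to a clock of exactly 32000 Hz; a 32768 Hz crystal would give roughly 30.5 us.

```c
#include <assert.h>
#include <stdint.h>

/* Length of one clock tick in nanoseconds, rounded down. */
static uint64_t demo_tick_period_ns(uint64_t freq_hz)
{
	assert(freq_hz > 0);
	return 1000000000ULL / freq_hz;
}
```

For comparison, a 1 MHz clock gives a 1000 ns tick, about 31 times finer than the 32 kHz one.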

>>  For example, on
>> Tegra2 the faster 1MHz clock used for sched_clock resets in the
>> deepest suspend state (LP0) but not the shallowest suspend state
>> (LP2), and which suspend state the chip hits depends on which hardware
>> is active.  Opting out of this patch would cause Tegra's clock to
>> sometimes run in suspend, and sometimes not, which seems worse for
>> debugging than consistently not running in suspend.  I'd be surprised
>> if a similar situation didn't apply to your platform.
>
> Well being able to switch between different sched_clock() providers
> may be the ideal...
>
>>> - If it absolutely needs to be in the core code, also have a bool
>>>   field indicating whether the clock is going to die during suspend
>>>   and add new registration functions for setting that sched_clock
>>>   type, e.g. setup_sched_clock_nonsuspendable()
>>
>> Sounds reasonable if some platforms need the extra complexity.
>
> OK agreed.
>
> A connecting theme is that of being able to flag clock sources as
> sched_clock providers. If all clocksources were tagged with
> rating, and only clocksources were used for sched_clock(), the
> kernel could select the highest-rated clock under all circumstances.
>
> But that's quite intrusive, more of an idea. :-P

sched_clock is supposed to be very low overhead compared to ktime_get,
and has some strict  requirements if CONFIG_HAVE_UNSTABLE_SCHED_CLOCK
is not set (see kernel/sched/clock.c), but it might be possible.


Re: [SEGFAULT: perf]

2012-07-23 Thread Namhyung Kim
Hi, Andrew

On Mon, 23 Jul 2012 08:52:17 -0500, Andrew Watts wrote:
> perf report on 3.4.6 segfaults when certain pointers are null. Attached is
> a patch that applies cleanly @3.4.6 and addresses (A). I am not comfortable
> suggesting fixes for (B) and (C).
>

(A) has a fix already in the tip tree [1], and maybe in mainline too.

And yes, I think this sort of thing needs some love. It's on my TODO list,
but I haven't had time to do it yet.

[1] http://www.spinics.net/lists/linux-tip-commits/msg15695.html

Thanks,
Namhyung


> ~ Andy
>
> ===
>
> (A) perf report with sort on comm/pid/parent/dso:
>
> (gdb) run report --sort=comm
> Program received signal SIGSEGV, Segmentation fault.
> 0x0805c00b in perf_evsel__add_hist_entry (evsel=0x81c56a8, al=0xbfffe9b0,
> sample=0xbfffea84, machine=0x81c3fa8) at builtin-report.c:171
> 171 if (notes->src == NULL && symbol__alloc_hist(he->ms.sym) < 0)
> #0  0x0805c00b in perf_evsel__add_hist_entry (evsel=0x81c56a8, al=0xbfffe9b0,
> sample=0xbfffea84, machine=0x81c3fa8) at builtin-report.c:171
> #1  0x0805c234 in process_sample_event (tool=0xb27c, event=0xb6db1f38,
> sample=0xbfffea84, evsel=0x81c56a8, machine=0x81c3fa8)
> at builtin-report.c:216
> #2  0x080a3097 in perf_session_deliver_event (session=0x81c3f50, event=
> 0xb6db1f38, sample=0xbfffea84, tool=0xb27c, file_offset=339768)
> at util/session.c:885
> #3  0x080a24d1 in flush_sample_queue (s=0x81c3f50, tool=0xb27c)
> at util/session.c:587
> #4  0x080a40e3 in __perf_session__process_events (session=0x81c3f50,
> data_offset=280, data_size=419640, file_size=419920, tool=0xb27c)
> at util/session.c:1257
> #5  0x080a41d3 in perf_session__process_events (self=0x81c3f50, tool=
> 0xb27c) at util/session.c:1273
> (gdb) print he->ms.sym
> $1 = (struct symbol *) 0x0
>
> --
>
> (B) perf report segfaults on sorts of symbol_from/symbol_to:
>
> (gdb) run report --sort=symbol_from
> Program received signal SIGSEGV, Segmentation fault.
> 0x080b8777 in sort__sym_from_cmp (left=0xbfffe878, right=0x84dfde0)
> at util/sort.c:334
> 334 if (!from_l->sym && !from_r->sym
> (gdb) bt
> #0  0x080b8777 in sort__sym_from_cmp (left=0xbfffe878, right=0x84dfde0)
> at util/sort.c:334
> #1  0x080ba0b8 in hist_entry__cmp (left=0xbfffe878, right=0x84dfde0)
> at util/hist.c:345
> #2  0x080b9c31 in add_hist_entry (hists=0x81c571c, entry=0xbfffe878, al=
> 0xbfffe9b0, period=333940) at util/hist.c:254
> #3  0x080ba04c in __hists__add_entry (self=0x81c571c, al=0xbfffe9b0,
> sym_parent=0x0, period=333940) at util/hist.c:335
> #4  0x0805bf50 in perf_evsel__add_hist_entry (evsel=0x81c56a8, al=0xbfffe9b0,
> sample=0xbfffea84, machine=0x81c3fa8) at builtin-report.c:149
> #5  0x0805c234 in process_sample_event (tool=0xb27c, event=0xb6db1840,
> sample=0xbfffea84, evsel=0x81c56a8, machine=0x81c3fa8)
> at builtin-report.c:216
> (gdb) print left->branch_info
> $2 = (struct branch_info *) 0x0
> (gdb) print right->branch_info
> $3 = (struct branch_info *) 0x0
>
> --
>
> (C) perf report segfaults with dso_from/dso_to:
>
> (gdb) run report --sort=dso_to
> Program received signal SIGSEGV, Segmentation fault.
> sort__dso_to_cmp (left=0xbfffe878, right=0x82346f0) at util/sort.c:317
> 317 return _sort__dso_cmp(left->branch_info->to.map,
> (gdb) bt
> #0  sort__dso_to_cmp (left=0xbfffe878, right=0x82346f0) at util/sort.c:317
> #1  0x080ba0c8 in hist_entry__cmp (left=0xbfffe878, right=0x82346f0)
> at util/hist.c:345
> #2  0x080b9c41 in add_hist_entry (hists=0x81c571c, entry=0xbfffe878, al=
> 0xbfffe9b0, period=31) at util/hist.c:254
> #3  0x080ba05c in __hists__add_entry (self=0x81c571c, al=0xbfffe9b0,
> sym_parent=0x0, period=31) at util/hist.c:335
> #4  0x0805bf50 in perf_evsel__add_hist_entry (evsel=0x81c56a8, al=0xbfffe9b0,
> sample=0xbfffea84, machine=0x81c3fa8) at builtin-report.c:149
> #5  0x0805c242 in process_sample_event (tool=0xb27c, event=0xb6dbf800,
> sample=0xbfffea84, evsel=0x81c56a8, machine=0x81c3fa8)
> at builtin-report.c:216
> (gdb) print left->branch_info
> $4 = (struct branch_info *) 0x0
> (gdb) print right->branch_info
> $5 = (struct branch_info *) 0x0
>
> ===
>
>
> --- builtin-report.c.orig 2012-07-22
> +++ builtin-report.c  2012-07-22
> @@ -162,7 +162,7 @@ static int perf_evsel__add_hist_entry(st
>* so we don't allocated the extra space needed because the stdio
>* code will not use it.
>*/
> - if (al->sym != NULL && use_browser > 0) {
> + if (al->sym != NULL && he->ms.sym != NULL && use_browser > 0) {
>   struct annotation *notes = symbol__annotation(he->ms.sym);
>  
>   assert(evsel != NULL);
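
For (B) and (C), the backtraces show both branch_info pointers being NULL when the comparators dereference them. Purely as an illustration of the kind of guard that would avoid the crash, and not a proposed fix for util/sort.c, here is a sketch with made-up types (demo_branch_info, demo_hist_entry) and a made-up ordering rule that sorts entries without branch info first.

```c
#include <stddef.h>
#include <string.h>

/* Simplified, hypothetical stand-ins for the perf structures. */
struct demo_branch_info {
	const char *from_sym;
};

struct demo_hist_entry {
	struct demo_branch_info *branch_info;
};

/* Symbol name of the branch source, or NULL if there is no branch info. */
static const char *demo_from_sym(const struct demo_hist_entry *he)
{
	return he->branch_info ? he->branch_info->from_sym : NULL;
}

/* Comparator that tolerates missing branch info on either side. */
static int demo_sym_from_cmp(const struct demo_hist_entry *left,
			     const struct demo_hist_entry *right)
{
	const char *l = demo_from_sym(left);
	const char *r = demo_from_sym(right);

	if (!l || !r)
		return (l != NULL) - (r != NULL);
	return strcmp(l, r);
}
```

Whether a comparator should accept such entries at all, or whether the sort keys should be rejected earlier when the data has no branch stacks, is a separate question.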

Re: [PATCH v2] leds: add new lp8788 led driver

2012-07-23 Thread Bryan Wu
On Mon, Jul 23, 2012 at 2:19 AM, Mark Brown
 wrote:
> On Sat, Jul 21, 2012 at 02:48:49AM +0800, Bryan Wu wrote:
>
>> Actually cancel_work_sync() is quite similar to flush_work_sync()
>> here. For the timer thing, in fact it is NULL when cancel_work_sync()
>> call __cancel_work_timer().
>
>> And Mark, do you have any advice about the flush_work_sync() and
>> cancel_work_sync(). I saw you use flush in the
>> drivers/leds/leds-wm8350.c.
>
> If the work is flushed then the state that userspace thought was set
> when the driver is removed will actually be set before the driver is
> removed.  This is fairly minor but might be useful.

So what kind of state do you mean here that user space cares about?
I find these two functions quite confusing to use right now.
Literally, canceling will normally remove a pending work item and wait
for a running work item to finish. Flushing will wait for both pending
and running work items to finish.

Thanks,
-- 
Bryan Wu 
Kernel Developer+86.186-168-78255 Mobile
Canonical Ltd.  www.canonical.com
Ubuntu - Linux for human beings | www.ubuntu.com


[Resend PATCH v2] usb: gadget: s3c-hsotg: fix core reset timeout failure

2012-07-23 Thread Du, Changbin
From: "Du, Changbin" 

The timeout values were 1000, and timeouts occurred many times on my
s3c6410 SoC-based board (mostly when booting with the USB cable not
connected). This patch increases the values to 10000 to guarantee that
the reset succeeds.

Having set the timeout to 10000, I printed the remaining timeout values
in the cases that would have timed out before this change (tested several
times).
Remaining value of the first timeout:
timeout = 8079
timeout = 8079
timeout = 8078
timeout = 8081
Remaining value of the second timeout:
timeout = 7940
timeout = 7945
timeout = 7940
timeout = 7938
Judging from the values above, I think 10000 is big enough.

Signed-off-by: Du, Changbin 
---
Changes for v2:
Fixed wrapped line done by my mail client

---
 drivers/usb/gadget/s3c-hsotg.c |4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/usb/gadget/s3c-hsotg.c b/drivers/usb/gadget/s3c-hsotg.c
index f4abb0e..f3e2234 100644
--- a/drivers/usb/gadget/s3c-hsotg.c
+++ b/drivers/usb/gadget/s3c-hsotg.c
@@ -2215,7 +2215,7 @@ static int s3c_hsotg_corereset(struct s3c_hsotg *hsotg)
/* issue soft reset */
writel(GRSTCTL_CSftRst, hsotg->regs + GRSTCTL);
 
-   timeout = 1000;
+   timeout = 10000;
do {
grstctl = readl(hsotg->regs + GRSTCTL);
} while ((grstctl & GRSTCTL_CSftRst) && timeout-- > 0);
@@ -2225,7 +2225,7 @@ static int s3c_hsotg_corereset(struct s3c_hsotg *hsotg)
return -EINVAL;
}
 
-   timeout = 1000;
+   timeout = 10000;
 
while (1) {
u32 grstctl = readl(hsotg->regs + GRSTCTL);
-- 
1.7.9.5
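
A small user-space sketch of the bounded polling pattern this patch tunes, to show how the "remaining" figures relate to the budget. Everything here is hypothetical: demo_fake_reg simulates a status register that stays busy for a fixed number of reads, and demo_poll_until_clear() is not the driver's loop, only the same idea.

```c
#include <stdint.h>

/* Simulated register: reports "busy" for a fixed number of reads. */
struct demo_fake_reg {
	int busy_reads;
};

static uint32_t demo_fake_read(void *ctx)
{
	struct demo_fake_reg *reg = ctx;

	if (reg->busy_reads > 0) {
		reg->busy_reads--;
		return 1u;	/* busy bit still set */
	}
	return 0;		/* busy bit cleared */
}

/* Poll until (value & mask) is clear or the budget is used up.
 * Returns the remaining budget on success, -1 on timeout. */
static int demo_poll_until_clear(uint32_t (*read_reg)(void *), void *ctx,
				 uint32_t mask, int budget)
{
	while (budget > 0) {
		if (!(read_reg(ctx) & mask))
			return budget;
		budget--;
	}
	return -1;
}
```

With a fake register that stays busy for 1921 reads, a budget of 1000 runs out, while a budget of 10000 returns with 8079 iterations to spare. Since the loop counts register reads rather than elapsed time, the right budget depends on how fast the bus is.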



Re: [RFC] ARM: sched_clock: update epoch_cyc on resume

2012-07-23 Thread Linus Walleij
On Mon, Jul 23, 2012 at 9:27 PM, Colin Cross  wrote:
> On Mon, Jul 23, 2012 at 11:55 AM, Linus Walleij

> Does the clock you use for sched_clock continue to run in all suspend
> modes? All the SoC's I've used only have a 32kHz clock in the deepest
> suspend mode,

Yes, and yes it is 32kHz.

> which is not ideal for sched_clock.

Not that I know why scheduling with 32kHz is so bad compared to the
default system scheduling granularity which is HZ if you don't have
sched_clock() implemented.

Since this seems to be such an important point, what makes you
want MHz:es for scheduling granularity? To me the biggest impact
is actually the granularity of the timestamps in the printk:s.

(It's not that I doubt your needs, more curiosity.)

>  For example, on
> Tegra2 the faster 1MHz clock used for sched_clock resets in the
> deepest suspend state (LP0) but not the shallowest suspend state
> (LP2), and which suspend state the chip hits depends on which hardware
> is active.  Opting out of this patch would cause Tegra's clock to
> sometimes run in suspend, and sometimes not, which seems worse for
> debugging than consistently not running in suspend.  I'd be surprised
> if a similar situation didn't apply to your platform.

Well being able to switch between different sched_clock() providers
may be the ideal...

>> - If it absolutely needs to be in the core code, also have a bool
>>   field indicating whether the clock is going to die during suspend
>>   and add new registration functions for setting that sched_clock
>>   type, e.g. setup_sched_clock_nonsuspendable()
>
> Sounds reasonable if some platforms need the extra complexity.

OK agreed.

A connecting theme is that of being able to flag clock sources as
sched_clock providers. If all clocksources were tagged with
rating, and only clocksources were used for sched_clock(), the
kernel could select the highest-rated clock under all circumstances.

But that's quite intrusive, more of an idea. :-P

Yours,
Linus Walleij


[PATCH -next] pstore: fix printk format warning

2012-07-23 Thread Randy Dunlap
From: Randy Dunlap 

Fix printk format warning (on i386) in pstore:

fs/pstore/ram.c:409:3: warning: format '%lu' expects type 'long unsigned int', 
but argument 2 has type 'size_t'

Signed-off-by: Randy Dunlap 
Acked-by: Kees Cook 
---
This patch from June 15 is still needed in linux-next.

 fs/pstore/ram.c |2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- linux-next-20120723.orig/fs/pstore/ram.c
+++ linux-next-20120723/fs/pstore/ram.c
@@ -406,7 +406,7 @@ static int __devinit ramoops_probe(struc
goto fail_init_fprz;
 
if (!cxt->przs && !cxt->cprz && !cxt->fprz) {
-   pr_err("memory size too small, minimum is %lu\n",
+   pr_err("memory size too small, minimum is %zu\n",
cxt->console_size + cxt->record_size +
cxt->ftrace_size);
goto fail_cnt;


[Bcache v15 02/16] Fix ratelimit macro to compile in c99 mode

2012-07-23 Thread Kent Overstreet

Signed-off-by: Kent Overstreet 
---
 include/linux/ratelimit.h |2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/include/linux/ratelimit.h b/include/linux/ratelimit.h
index e11ccb4..9ad57d3 100644
--- a/include/linux/ratelimit.h
+++ b/include/linux/ratelimit.h
@@ -20,7 +20,7 @@ struct ratelimit_state {
 #define DEFINE_RATELIMIT_STATE(name, interval_init, burst_init) \
\
struct ratelimit_state name = { \
-   .lock   = __RAW_SPIN_LOCK_UNLOCKED(name.lock),  \
+   .lock   = __RAW_SPIN_LOCK_INITIALIZER(name.lock),\
.interval   = interval_init,\
.burst  = burst_init,   \
}
-- 
1.7.7.3



[Bcache v15 03/16] Export get_random_int()

2012-07-23 Thread Kent Overstreet

Signed-off-by: Kent Overstreet 
---
 drivers/char/random.c |1 +
 1 files changed, 1 insertions(+), 0 deletions(-)

diff --git a/drivers/char/random.c b/drivers/char/random.c
index 4ec04a7..78ff2f6 100644
--- a/drivers/char/random.c
+++ b/drivers/char/random.c
@@ -1362,6 +1362,7 @@ unsigned int get_random_int(void)
 
return ret;
 }
+EXPORT_SYMBOL(get_random_int);
 
 /*
  * randomize_range() returns a start address such that
-- 
1.7.7.3



Re: [Bcache v15 07/16] Closures

2012-07-23 Thread Kent Overstreet
I screwed up this patch - here's the correct one:


commit f2f374ca1fedf912dab186f2e2ccd1fd82486acc
Author: Kent Overstreet 
Date:   Fri Jan 13 16:14:05 2012 -0800

Closures

Asynchronous refcounty thingies; they embed a refcount and a work
struct. Extensive documentation follows in include/linux/closure.h

Signed-off-by: Kent Overstreet 

diff --git a/include/linux/closure.h b/include/linux/closure.h
new file mode 100644
index 000..9537e18
--- /dev/null
+++ b/include/linux/closure.h
@@ -0,0 +1,668 @@
+#ifndef _LINUX_CLOSURE_H
+#define _LINUX_CLOSURE_H
+
+#include 
+#include 
+#include 
+
+/*
+ * Closure is perhaps the most overused and abused term in computer science, but
+ * since I've been unable to come up with anything better you're stuck with it
+ * again.
+ *
+ * What are closures?
+ *
+ * They embed a refcount. The basic idea is they count "things that are in
+ * progress" - in flight bios, some other thread that's doing something else -
+ * anything you might want to wait on.
+ *
+ * The refcount may be manipulated with closure_get() and closure_put().
+ * closure_put() is where many of the interesting things happen, when it causes
+ * the refcount to go to 0.
+ *
+ * Closures can be used to wait on things both synchronously and asynchronously,
+ * and synchronous and asynchronous use can be mixed without restriction. To
+ * wait synchronously, use closure_sync() - you will sleep until your closure's
+ * refcount hits 1.
+ *
+ * To wait asynchronously, use
+ *   continue_at(cl, next_function, workqueue);
+ *
+ * passing it, as you might expect, the function to run when nothing is pending
+ * and the workqueue to run that function out of.
+ *
+ * continue_at() also, critically, is a macro that returns the calling function.
+ * There's good reason for this.
+ *
+ * To safely use closures asynchronously, they must always have a refcount while
+ * they are running owned by the thread that is running them. Otherwise, suppose
+ * you submit some bios and wish to have a function run when they all complete:
+ *
+ * foo_endio(struct bio *bio, int error)
+ * {
+ * closure_put(cl);
+ * }
+ *
+ * closure_init(cl);
+ *
+ * do_stuff();
+ * closure_get(cl);
+ * bio1->bi_endio = foo_endio;
+ * bio_submit(bio1);
+ *
+ * do_more_stuff();
+ * closure_get(cl);
+ * bio2->bi_endio = foo_endio;
+ * bio_submit(bio2);
+ *
+ * continue_at(cl, complete_some_read, system_wq);
+ *
+ * If closure's refcount started at 0, complete_some_read() could run before the
+ * second bio was submitted - which is almost always not what you want! More
+ * importantly, it wouldn't be possible to say whether the original thread or
+ * complete_some_read()'s thread owned the closure - and whatever state it was
+ * associated with!
+ *
+ * So, closure_init() initializes a closure's refcount to 1 - and when a
+ * closure_fn is run, the refcount will be reset to 1 first.
+ *
+ * Then, the rule is - if you got the refcount with closure_get(), release it
+ * with closure_put() (i.e, in a bio->bi_endio function). If you have a refcount
+ * on a closure because you called closure_init() or you were run out of a
+ * closure - _always_ use continue_at(). Doing so consistently will help
+ * eliminate an entire class of particularly pernicious races.
+ *
+ * For a closure to wait on an arbitrary event, we need to introduce waitlists:
+ *
+ * struct closure_waitlist list;
+ * closure_wait_event(list, cl, condition);
+ * closure_wake_up(wait_list);
+ *
+ * These work analogously to wait_event() and wake_up() - except that instead of
+ * operating on the current thread (for wait_event()) and lists of threads, they
+ * operate on an explicit closure and lists of closures.
+ *
+ * Because it's a closure we can now wait either synchronously or
+ * asynchronously. closure_wait_event() returns the current value of the
+ * condition, and if it returned false continue_at() or closure_sync() can be
+ * used to wait for it to become true.
+ *
+ * It's useful for waiting on things when you can't sleep in the context in
+ * which you must check the condition (perhaps a spinlock held, or you might be
+ * beneath generic_make_request() - in which case you can't sleep on IO).
+ *
+ * closure_wait_event() will wait either synchronously or asynchronously,
+ * depending on whether the closure is in blocking mode or not. You can pick a
+ * mode explicitly with closure_wait_event_sync() and
+ * closure_wait_event_async(), which do just what you might expect.
+ *
+ * Lastly, you might have a wait list dedicated to a specific event, and have no
+ * need for specifying the condition - you just want to wait until someone runs
+ * closure_wake_up() on the appropriate wait list. In that case, just use
+ * closure_wait(). It will return either true or false, depending on whether the
+ * closure was already on a wait list or not - a closure can only be on one wait
+ * list at a time.
+ *
+ * Parents:
+ *
+ * closure_init() takes two 
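
The refcount rule described in the closure comment above can be sketched in userspace C. This is only a single-threaded toy under stated assumptions: struct toy_closure and the toy_*() helpers are hypothetical names, not the bcache API, and the continuation runs inline rather than out of a workqueue. It shows why starting the count at 1 keeps the continuation from running until the submitting thread has given up its own reference.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for struct closure; NOT the bcache definition. */
struct toy_closure {
	atomic_int remaining;              /* outstanding references */
	void (*fn)(struct toy_closure *);  /* runs when the count hits zero */
	int fn_runs;                       /* how many times fn has run */
};

/* Like closure_init() in the text: the running thread owns one ref. */
static void toy_init(struct toy_closure *cl)
{
	atomic_init(&cl->remaining, 1);
	cl->fn = NULL;
	cl->fn_runs = 0;
}

/* Take a ref on behalf of something asynchronous (e.g. a submitted bio). */
static void toy_get(struct toy_closure *cl)
{
	atomic_fetch_add(&cl->remaining, 1);
}

/*
 * Drop one ref. Whoever drops the last ref runs the continuation, and
 * the count is reset to 1 first so the continuation owns the closure.
 */
static void toy_put(struct toy_closure *cl)
{
	if (atomic_fetch_sub(&cl->remaining, 1) == 1 && cl->fn) {
		void (*fn)(struct toy_closure *) = cl->fn;

		cl->fn = NULL;
		atomic_store(&cl->remaining, 1);
		fn(cl);
	}
}

/*
 * Toy version of continue_at(): record the continuation, then give up
 * the caller's own ref. Unlike the real macro it takes no workqueue
 * and does not return from the calling function.
 */
static void toy_continue_at(struct toy_closure *cl,
			    void (*fn)(struct toy_closure *))
{
	cl->fn = fn;
	toy_put(cl);
}

/* Example continuation: just counts invocations. */
static void toy_complete(struct toy_closure *cl)
{
	cl->fn_runs++;
}
```

With this toy, a sequence of toy_init(), toy_get() for each submitted bio, and a final toy_continue_at() runs toy_complete() exactly once, and only after every toy_get() has been matched by a toy_put(), whichever order the completions arrive in.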

[Bcache v15 06/16] Add human-readable units modifier to vsnprintf()

2012-07-23 Thread Kent Overstreet

Signed-off-by: Kent Overstreet 
---
 lib/vsprintf.c |   24 +++-
 1 files changed, 23 insertions(+), 1 deletions(-)

diff --git a/lib/vsprintf.c b/lib/vsprintf.c
index c3f36d41..16149dd 100644
--- a/lib/vsprintf.c
+++ b/lib/vsprintf.c
@@ -338,6 +338,7 @@ int num_to_str(char *buf, int size, unsigned long long num)
 #define LEFT   16  /* left justified */
 #define SMALL  32  /* use lowercase in hex (must be 32 == 0x20) */
 #define SPECIAL 64  /* prefix hex with "0x", octal with "0" */
+#define HUNITS 128 /* Human readable units, i.e. k/M/G/T */
 
 enum format_type {
FORMAT_TYPE_NONE, /* Just a string part */
@@ -377,6 +378,7 @@ char *number(char *buf, char *end, unsigned long long num,
 {
/* we are called with base 8, 10 or 16, only, thus don't need "G..."  */
static const char digits[16] = "0123456789ABCDEF"; /* "GHIJKLMNOPQRSTUVWXYZ"; */
+   static const char units[] = "?kMGTPEZY";
 
char tmp[66];
char sign;
@@ -431,7 +433,26 @@ char *number(char *buf, char *end, unsigned long long num,
num >>= shift;
} while (num);
} else { /* base 10 */
-   i = put_dec(tmp, num) - tmp;
+   if (spec.flags & HUNITS) {
+   int u, rem = 0;
+
+   for (u = 0; num >= 1024; u++) {
+   rem = num & ~(~0 << 10);
+   num >>= 10;
+   }
+
+   if (u) {
+   tmp[i++] = units[u];
+
+   if (num < 100) {
+   rem /= 100;
+   i = put_dec(tmp + i, rem) - tmp;
+   tmp[i++] = '.';
+   }
+   }
+   }
+
+   i = put_dec(tmp + i, num) - tmp;
}
 
/* printing 100 using %2d gives "100", not "00" */
@@ -1127,6 +1148,7 @@ int format_decode(const char *fmt, struct printf_spec *spec)
case ' ': spec->flags |= SPACE;   break;
case '#': spec->flags |= SPECIAL; break;
case '0': spec->flags |= ZEROPAD; break;
+   case 'h': spec->flags |= HUNITS;  break;
default:  found = false;
}
 
-- 
1.7.7.3
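
For readers who want to experiment with the idea outside the kernel, here is a minimal userspace sketch of power-of-1024 unit scaling. It is not the patch's code: format_hunits() is a hypothetical helper, it writes forward with snprintf() instead of filling a reversed digit buffer as number() does, and printing one decimal digit (tenths of 1024) for scaled values below 100 is an assumption made for this example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper, not part of lib/vsprintf.c: format num with a
 * k/M/G/... suffix, dividing by 1024 until the value is below 1024.
 * Returns whatever snprintf() returns.
 */
static int format_hunits(char *buf, size_t size, unsigned long long num)
{
	static const char units[] = "?kMGTPEZY"; /* index 0 is never printed */
	unsigned int rem = 0; /* the low 10 bits dropped by the last shift */
	int u = 0;

	while (num >= 1024) {
		rem = (unsigned int)(num & 1023);
		num >>= 10;
		u++;
	}

	if (!u) /* below 1024: plain decimal, no suffix */
		return snprintf(buf, size, "%llu", num);

	if (num < 100) /* small scaled value: keep one decimal digit */
		return snprintf(buf, size, "%llu.%u%c",
				num, rem * 10 / 1024, units[u]);

	return snprintf(buf, size, "%llu%c", num, units[u]);
}
```

The fraction is truncated, not rounded, so 1536 prints as 1.5k and 1023 stays 1023.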



[Bcache v15 08/16] bcache: Generic utility code

2012-07-23 Thread Kent Overstreet
Much of this code should be moved out of drivers/md/bcache, but it
was originally written for bcache.

Signed-off-by: Kent Overstreet 
---
 drivers/md/bcache/util.c |  392 ++
 drivers/md/bcache/util.h |  606 ++
 2 files changed, 998 insertions(+), 0 deletions(-)
 create mode 100644 drivers/md/bcache/util.c
 create mode 100644 drivers/md/bcache/util.h

diff --git a/drivers/md/bcache/util.c b/drivers/md/bcache/util.c
new file mode 100644
index 000..e58c27f
--- /dev/null
+++ b/drivers/md/bcache/util.c
@@ -0,0 +1,392 @@
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+#include 
+
+#include "util.h"
+
+#define simple_strtoint(c, end, base)  simple_strtol(c, end, base)
+#define simple_strtouint(c, end, base) simple_strtoul(c, end, base)
+
+#define STRTO_H(name, type)\
+int name ## _h(const char *cp, type *res)  \
+{  \
+   int u = 0;  \
+   char *e;\
+   type i = simple_ ## name(cp, &e, 10); \
+   \
+   switch (tolower(*e)) {  \
+   default:\
+   return -EINVAL; \
+   case 'y':   \
+   case 'z':   \
+   u++;\
+   case 'e':   \
+   u++;\
+   case 'p':   \
+   u++;\
+   case 't':   \
+   u++;\
+   case 'g':   \
+   u++;\
+   case 'm':   \
+   u++;\
+   case 'k':   \
+   u++;\
+   if (e++ == cp)  \
+   return -EINVAL; \
+   case '\n':  \
+   case '\0':  \
+   if (*e == '\n') \
+   e++;\
+   }   \
+   \
+   if (*e) \
+   return -EINVAL; \
+   \
+   while (u--) {   \
+   if ((type) ~0 > 0 &&\
+   (type) ~0 / 1024 <= i)  \
+   return -EINVAL; \
+   if ((i > 0 && ANYSINT_MAX(type) / 1024 < i) ||  \
+   (i < 0 && -ANYSINT_MAX(type) / 1024 > i))   \
+   return -EINVAL; \
+   i *= 1024;  \
+   }   \
+   \
+   *res = i;   \
+   return 0;   \
+}  \
+EXPORT_SYMBOL_GPL(name ## _h);
+
+STRTO_H(strtoint, int)
+STRTO_H(strtouint, unsigned int)
+STRTO_H(strtoll, long long)
+STRTO_H(strtoull, unsigned long long)
+
+ssize_t snprint_string_list(char *buf, size_t size, const char * const list[],
+   size_t selected)
+{
+   char *out = buf;
+   size_t i;
+
+   for (i = 0; list[i]; i++)
+   out += snprintf(out, buf + size - out,
+   i == selected ? "[%s] " : "%s ", list[i]);
+
+   out[-1] = '\n';
+   return out - buf;
+}
+EXPORT_SYMBOL_GPL(snprint_string_list);
+
+ssize_t read_string_list(const char *buf, const char * const list[])
+{
+   size_t i;
+   char *s, *d = kstrndup(buf, PAGE_SIZE - 1, GFP_KERNEL);
+   if (!d)
+   return -ENOMEM;
+
+   s = strim(d);
+
+   for (i = 0; list[i]; i++)
+   if (!strcmp(list[i], s))
+   
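
The suffix handling in STRTO_H() above can be tried out in userspace with a sketch like the following. It is an approximation under stated assumptions, not the macro itself: parse_size() is a hypothetical name, it covers only the unsigned long long case, it uses strtoull() in place of simple_strtoull(), it looks the suffix up in a table instead of using switch fallthrough, and it rejects input that does not start with a digit. The 'z' and 'y' suffixes are left out because they can never fit in 64 bits.

```c
#include <assert.h>
#include <ctype.h>
#include <errno.h>
#include <limits.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical userspace analogue of strtoull_h(): parse a decimal
 * number with an optional k/m/g/t/p/e suffix (powers of 1024) and an
 * optional trailing newline. Returns 0 on success, -EINVAL otherwise.
 */
static int parse_size(const char *cp, unsigned long long *res)
{
	static const char suffixes[] = "kmgtpe";
	unsigned long long i;
	char *e;
	int u = 0;

	/* Stricter than the macro: no leading whitespace or sign. */
	if (!isdigit((unsigned char)*cp))
		return -EINVAL;

	errno = 0;
	i = strtoull(cp, &e, 10);
	if (errno == ERANGE)
		return -EINVAL;

	if (*e && *e != '\n') {
		const char *s = strchr(suffixes, tolower((unsigned char)*e));

		if (!s)
			return -EINVAL;
		u = (int)(s - suffixes) + 1; /* 'k' -> 1, 'm' -> 2, ... */
		e++;
	}

	if (*e == '\n')
		e++;
	if (*e) /* trailing garbage */
		return -EINVAL;

	while (u--) {
		if (i > ULLONG_MAX / 1024) /* next multiply would overflow */
			return -EINVAL;
		i *= 1024;
	}

	*res = i;
	return 0;
}
```

This is the parsing direction that lets a sysfs write such as "10M" be stored as a byte count.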

[Bcache v15 09/16] bcache: Documentation, and changes to generic code

2012-07-23 Thread Kent Overstreet

Signed-off-by: Kent Overstreet 
---
 Documentation/ABI/testing/sysfs-block-bcache |  156 
 Documentation/bcache.txt |  255 ++
 drivers/md/Kconfig   |2 +
 drivers/md/Makefile  |1 +
 drivers/md/bcache/Kconfig|   41 
 drivers/md/bcache/Makefile   |   14 ++
 include/linux/cgroup_subsys.h|6 +
 include/linux/sched.h|4 +
 kernel/fork.c|4 +
 9 files changed, 483 insertions(+), 0 deletions(-)
 create mode 100644 Documentation/ABI/testing/sysfs-block-bcache
 create mode 100644 Documentation/bcache.txt
 create mode 100644 drivers/md/bcache/Kconfig
 create mode 100644 drivers/md/bcache/Makefile

diff --git a/Documentation/ABI/testing/sysfs-block-bcache b/Documentation/ABI/testing/sysfs-block-bcache
new file mode 100644
index 000..9e4bbc5
--- /dev/null
+++ b/Documentation/ABI/testing/sysfs-block-bcache
@@ -0,0 +1,156 @@
+What:  /sys/block//bcache/unregister
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   A write to this file causes the backing device or cache to be
+   unregistered. If a backing device had dirty data in the cache,
+   writeback mode is automatically disabled and all dirty data is
+   flushed before the device is unregistered. Caches unregister
+   all associated backing devices before unregistering themselves.
+
+What:  /sys/block//bcache/clear_stats
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   Writing to this file resets all the statistics for the device.
+
+What:  /sys/block//bcache/cache
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For a backing device that has cache, a symlink to
+   the bcache/ dir of that cache.
+
+What:  /sys/block//bcache/cache_hits
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: integer number of full cache hits,
+   counted per bio. A partial cache hit counts as a miss.
+
+What:  /sys/block//bcache/cache_misses
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: integer number of cache misses.
+
+What:  /sys/block//bcache/cache_hit_ratio
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: cache hits as a percentage.
+
+What:  /sys/block//bcache/sequential_cutoff
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: Threshold past which sequential IO will
+   skip the cache. Read and written as bytes in human readable
+   units (e.g. echo 10M > sequential_cutoff).
+
+What:  /sys/block//bcache/bypassed
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   Sum of all reads and writes that have bypassed the cache (due
+   to the sequential cutoff).  Expressed as bytes in human
+   readable units.
+
+What:  /sys/block//bcache/writeback
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: When on, writeback caching is enabled and
+   writes will be buffered in the cache. When off, caching is in
+   writethrough mode; reads and writes will be added to the
+   cache but no write buffering will take place.
+
+What:  /sys/block//bcache/writeback_running
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: when off, dirty data will not be written
+   from the cache to the backing device. The cache will still be
+   used to buffer writes until it is mostly full, at which point
+   writes transparently revert to writethrough mode. Intended only
+   for benchmarking/testing.
+
+What:  /sys/block//bcache/writeback_delay
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: In writeback mode, when dirty data is
+   written to the cache and the cache held no dirty data for that
+   backing device, writeback from cache to backing device starts
+   after this delay, expressed as an integer number of seconds.
+
+What:  /sys/block//bcache/writeback_percent
+Date:  November 2010
+Contact:   Kent Overstreet 
+Description:
+   For backing devices: If nonzero, writeback from cache to
+   backing device only takes place when more than this percentage
+   of the cache is used, 
