Re: [Bug 205065] New: workqueue: PF_MEMALLOC task 173(kswapd0) is flushing !WQ_MEM_RECLAIM events:gen6_pm_rps_work [i915]

2019-10-02 Thread Andrew Morton

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Tue, 01 Oct 2019 17:06:35 + bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=205065
> 
> Bug ID: 205065
>Summary: workqueue: PF_MEMALLOC task 173(kswapd0) is flushing
> !WQ_MEM_RECLAIM events:gen6_pm_rps_work [i915]
>Product: Memory Management
>Version: 2.5
> Kernel Version: Ubuntu 5.3.0-13.14-generic 5.3.0
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Other
>   Assignee: a...@linux-foundation.org
>   Reporter: romanescu.2...@gmail.com
> Regression: No

Maybe Tejun can help interpret this ;)

> Open bug in launchpad.net
> https://bugs.launchpad.net/bugs/1846241
> 
> dmesg:
> [ 9998.518472] [ cut here ]
> [ 9998.518505] workqueue: PF_MEMALLOC task 173(kswapd0) is flushing
> !WQ_MEM_RECLAIM events:gen6_pm_rps_work [i915]
> [ 9998.518516] WARNING: CPU: 5 PID: 173 at kernel/workqueue.c:2598
> check_flush_dependency+0xa7/0x140
> [ 9998.518517] Modules linked in: usbhid rfcomm ccm cmac bnep
> snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio
> nls_iso8859_1 snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_skl_ipc
> snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi
> snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine x86_pkg_temp_thermal
> intel_powerclamp coretemp snd_hda_intel snd_hda_codec kvm_intel snd_hda_core
> kvm irqbypass snd_hwdep snd_pcm snd_seq_midi snd_seq_midi_event intel_rapl_msr
> snd_rawmidi ath9k_htc mei_hdcp ath9k_common btusb uvcvideo crct10dif_pclmul
> ath9k_hw btrtl videobuf2_vmalloc crc32_pclmul videobuf2_memops btbcm ath
> snd_seq btintel videobuf2_v4l2 i915 ghash_clmulni_intel mac80211
> videobuf2_common hid_sensor_incl_3d bluetooth hid_sensor_magn_3d
> hid_sensor_gyro_3d hid_sensor_accel_3d hid_sensor_rotation videodev
> snd_seq_device hid_sensor_trigger snd_timer cfg80211 drm_kms_helper mc
> industrialio_triggered_buffer ecdh_generic kfifo_buf ecc
> [ 9998.518560] processor_thermal_device hid_sensor_iio_common libarc4
> aesni_intel spi_pxa2xx_platform drm snd intel_rapl_common industrialio
> aes_x86_64 crypto_simd i2c_algo_bit fb_sys_fops input_leds cryptd glue_helper
> joydev dw_dmac intel_cstate syscopyarea sysfillrect intel_xhci_usb_role_switch
> mei_me intel_rapl_perf serio_raw wmi_bmof intel_wmi_thunderbolt dw_dmac_core
> soundcore idma64 8250_dw cros_ec_ishtp mei hid_multitouch roles virt_dma
> cros_ec_core sysimgblt intel_pch_thermal intel_soc_dts_iosf hp_wmi
> soc_button_array int3403_thermal intel_vbtn int340x_thermal_zone sparse_keymap
> hp_accel acpi_pad lis3lv02d hp_wireless input_polldev int3400_thermal
> acpi_thermal_rel mac_hid sch_fq_codel parport_pc ppdev sunrpc lp parport
> ip_tables x_tables autofs4 btrfs xor zstd_compress raid6_pq libcrc32c
> hid_sensor_custom hid_sensor_hub intel_ishtp_loader intel_ishtp_hid 
> hid_generic
> psmouse sdhci_pci cqhci i2c_i801 ahci intel_lpss_pci intel_ish_ipc sdhci
> i2c_hid libahci intel_lpss intel_ishtp hid wmi
> [ 9998.518603] pinctrl_sunrisepoint video pinctrl_intel
> [ 9998.518609] CPU: 5 PID: 173 Comm: kswapd0 Not tainted 5.3.0-13-generic
> #14-Ubuntu
> [ 9998.518610] Hardware name: HP HP Pavilion x360 Convertible 14-cd0xxx/8486,
> BIOS F.34 04/29/2019
> [ 9998.518614] RIP: 0010:check_flush_dependency+0xa7/0x140
> [ 9998.518616] Code: 8d 8a 70 0a 00 00 4d 89 e0 48 8d 8b b0 00 00 00 4c 89 ca
> 48 c7 c7 e8 93 d3 a1 48 89 45 e0 c6 05 c4 29 75 01 01 e8 f4 13 fe ff <0f> 0b 
> 48
> 8b 45 e0 eb 0f 4c 89 ef e8 39 8e 00 00 41 f6 45 25 08 75
> [ 9998.518618] RSP: 0018:ba44c034f7f0 EFLAGS: 00010086
> [ 9998.518620] RAX:  RBX: 9d76a400ce00 RCX:
> 
> [ 9998.518621] RDX: 0063 RSI: a2581fe3 RDI:
> 0046
> [ 9998.518623] RBP: ba44c034f810 R08: a2581f80 R09:
> 0063
> [ 9998.518624] R10: a2582360 R11: a2581fcb R12:
> c1092710
> [ 9998.518625] R13: 9d76a30b5e00 R14: 0001 R15:
> 9d76a5df0700
> [ 9998.518627] FS: () GS:9d76a5d4()
> knlGS:
> [ 9998.518628] CS: 0010 DS:  ES:  CR0: 80050033
> [ 9998.518630] CR2: 7f877e503000 CR3: 00019be0a002 CR4:
> 003606e0
> [ 9998.518631] Call Trace:
> [ 9998.518636] __flush_work+0x97/0x1d0
> [ 9998.518641] ? enqueue_hrtimer+0x3d/0x90
> [ 9998.518644] __cancel_work_timer+0x10e/0x190
> [ 9998.518648] ? _cond_resched+0x19/0x30
> [ 9998.518651] ? synchronize_irq+0x3e/0xb0
> [ 9998.518654] cancel_work_sync+0x10/0x20
> [ 9998.518692] gen6_disable_rps_interrupts+0x95/0xc0 [i915]
> [ 9998.518732] gen6_rps_idle+0x1f/0xf0 [i915]
> [ 9998.518772] 

Re: [PATCH 3/4] kernel.h: Add non_block_start/end()

2019-08-22 Thread Andrew Morton
On Tue, 20 Aug 2019 22:24:40 +0200 Daniel Vetter  wrote:

> Hi Peter,
> 
> Iirc you've been involved at least somewhat in discussing this. -mm folks
> are a bit undecided whether these new non_block semantics are a good idea.
> Michal Hocko still is in support, but Andrew Morton and Jason Gunthorpe
> are less enthusiastic. Jason said he's ok with merging the hmm side of
> this if scheduler folks ack. If not, then I'll respin with the
> preempt_disable/enable instead like in v1.

I became mollified once Michel explained the rationale.  I think it's
OK.  It's very specific to the oom reaper and hopefully won't be used
more widely(?).



Re: [PATCH 2/5] kernel.h: Add non_block_start/end()

2019-08-15 Thread Andrew Morton
On Thu, 15 Aug 2019 10:44:29 +0200 Michal Hocko  wrote:

> > I continue to struggle with this.  It introduces a new kernel state
> > "running preemptibly but must not call schedule()".  How does this make
> > any sense?
> > 
> > Perhaps a much, much more detailed description of the oom_reaper
> > situation would help out.
>  
> The primary point here is that there is a demand of non blockable mmu
> notifiers to be called when the oom reaper tears down the address space.
> As the oom reaper is the primary guarantee of the oom handling forward
> progress it cannot be blocked on anything that might depend on blockable
> memory allocations. These are not really easy to track because they
> might be indirect - e.g. notifier blocks on a lock which other context
> holds while allocating memory or waiting for a flusher that needs memory
> to perform its work. If such a blocking state happens that we can end up
> in a silent hang with an unusable machine.
> Now we hope for reasonable implementations of mmu notifiers (strong
> words I know ;) and this should be relatively simple and effective catch
> all tool to detect something suspicious is going on.
> 
> Does that make the situation more clear?

Yes, thanks, much.  Maybe a code comment along the lines of

  This is on behalf of the oom reaper, specifically when it is
  calling the mmu notifiers.  The problem is that if the notifier were
  to block on, for example, mutex_lock() and if the process which holds
  that mutex were to perform a sleeping memory allocation, the oom
  reaper is now blocked on completion of that memory allocation.

btw, do we need task_struct.non_block_count?  Perhaps the oom reaper
thread could set a new PF_NONBLOCK (which would be more general than
PF_OOM_REAPER).  If we run out of PF_ flags, use (current == oom_reaper_th).



Re: [PATCH 1/5] mm: Check if mmu notifier callbacks are allowed to fail

2019-08-14 Thread Andrew Morton
On Wed, 14 Aug 2019 22:20:23 +0200 Daniel Vetter  wrote:

> Just a bit of paranoia, since if we start pushing this deep into
> callchains it's hard to spot all places where an mmu notifier
> implementation might fail when it's not allowed to.
> 
> Inspired by some confusion we had discussing i915 mmu notifiers and
> whether we could use the newly-introduced return value to handle some
> corner cases. Until we realized that these are only for when a task
> has been killed by the oom reaper.
> 
> An alternative approach would be to split the callback into two
> versions, one with the int return value, and the other with void
> return value like in older kernels. But that's a lot more churn for
> fairly little gain I think.
> 
> Summary from the m-l discussion on why we want something at warning
> level: This allows automated tooling in CI to catch bugs without
> humans having to look at everything. If we just upgrade the existing
> pr_info to a pr_warn, then we'll have false positives. And as-is, no
> one will ever spot the problem since it's lost in the massive amounts
> of overall dmesg noise.
> 
> ...
>
> --- a/mm/mmu_notifier.c
> +++ b/mm/mmu_notifier.c
> @@ -179,6 +179,8 @@ int __mmu_notifier_invalidate_range_start(struct 
> mmu_notifier_range *range)
>   pr_info("%pS callback failed with %d in 
> %sblockable context.\n",
>   mn->ops->invalidate_range_start, _ret,
>   !mmu_notifier_range_blockable(range) ? 
> "non-" : "");
> + WARN_ON(mmu_notifier_range_blockable(range) ||
> + ret != -EAGAIN);
>   ret = _ret;
>   }
>   }

A problem with WARN_ON(a || b) is that if it triggers, we don't know
whether it was because of a or because of b.  Or both.  So I'd suggest

WARN_ON(a);
WARN_ON(b);



Re: [PATCH 2/5] kernel.h: Add non_block_start/end()

2019-08-14 Thread Andrew Morton
On Wed, 14 Aug 2019 22:20:24 +0200 Daniel Vetter  wrote:

> In some special cases we must not block, but there's not a
> spinlock, preempt-off, irqs-off or similar critical section already
> that arms the might_sleep() debug checks. Add a non_block_start/end()
> pair to annotate these.
> 
> This will be used in the oom paths of mmu-notifiers, where blocking is
> not allowed to make sure there's forward progress. Quoting Michal:
> 
> "The notifier is called from quite a restricted context - oom_reaper -
> which shouldn't depend on any locks or sleepable conditionals. The code
> should be swift as well but we mostly do care about it to make a forward
> progress. Checking for sleepable context is the best thing we could come
> up with that would describe these demands at least partially."
> 
> Peter also asked whether we want to catch spinlocks on top, but Michal
> said those are less of a problem because spinlocks can't have an
> indirect dependency upon the page allocator and hence close the loop
> with the oom reaper.

I continue to struggle with this.  It introduces a new kernel state
"running preemptibly but must not call schedule()".  How does this make
any sense?

Perhaps a much, much more detailed description of the oom_reaper
situation would help out.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v19 00/15] arm64: untag user pointers passed to the kernel

2019-08-08 Thread Andrew Morton
On Thu, 8 Aug 2019 14:12:19 -0700 Kees Cook  wrote:

> > The ones that are left are the mm ones: 4, 5, 6, 7 and 8.
> > 
> > Andrew, could you take a look and give your Acked-by or pick them up 
> > directly?
> 
> Given the subsystem Acks, it seems like 3-10 and 12 could all just go
> via Andrew? I hope he agrees. :)

I'll grab everything that has not yet appeared in linux-next.  If more
of these patches appear in linux-next I'll drop those as well.

The review discussion against " [PATCH v19 02/15] arm64: Introduce
prctl() options to control the tagged user addresses ABI" has petered
out inconclusively.  prctl() vs arch_prctl().

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [Bug 204407] New: Bad page state in process Xorg

2019-08-02 Thread Andrew Morton
(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 01 Aug 2019 22:34:16 + bugzilla-dae...@bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=204407
> 
> Bug ID: 204407
>Summary: Bad page state in process Xorg
>Product: Memory Management
>Version: 2.5
> Kernel Version: 5.3.0-rc2-00013
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Page Allocator
>   Assignee: a...@linux-foundation.org
>   Reporter: p...@vandrovec.name
> Regression: No
> 
> Created attachment 284081
>   --> https://bugzilla.kernel.org/attachment.cgi?id=284081=edit
> dmesg
> 
> I've upgraded from 5.3-rc1 to 5.3-rc2, and when I started X server, system
> became unhappy:
> 
> [259701.387365] BUG: Bad page state in process Xorg  pfn:2a300
> [259701.393593] page:eaa8c000 refcount:0 mapcount:-128
> mapping: index:0x0
> [259701.402832] flags: 0x2000()
> [259701.407426] raw: 2000 822ab778 eaa8f208
> 
> [259701.415900] raw:  0003 ff7f
> 
> [259701.424373] page dumped because: nonzero mapcount
> [259701.429847] Modules linked in: af_packet xt_REDIRECT nft_compat x_tables
> nft_counter nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv4 nf_tables
> nfnetlink ppdev parport fuse autofs4 binfmt_misc uinput twofish_generic
> twofish_avx_x86_64 twofish_x86_64_3way twofish_x86_64 twofish_common
> camellia_generic camellia_aesni_avx_x86_64 camellia_x86_64 serpent_avx_x86_64
> serpent_sse2_x86_64 serpent_generic blowfish_generic blowfish_x86_64
> blowfish_common cast5_avx_x86_64 cast5_generic cast_common des_generic cmac
> xcbc rmd160 af_key xfrm_algo rpcsec_gss_krb5 nfsv4 nls_iso8859_2 cifs libarc4
> nfsv3 nfsd auth_rpcgss nfs_acl nfs lockd grace fscache sunrpc ipv6 crc_ccitt
> nf_defrag_ipv6 snd_hda_codec_hdmi pktcdvd coretemp hwmon intel_rapl_common
> iosf_mbi x86_pkg_temp_thermal snd_hda_codec_realtek snd_hda_codec_generic
> ledtrig_audio snd_hda_intel snd_hda_codec crct10dif_pclmul crc32_pclmul
> snd_hwdep crc32c_intel snd_hda_core ghash_clmulni_intel snd_pcm_oss uas
> iTCO_wdt aesni_intel e1000e
> [259701.429873]  snd_mixer_oss iTCO_vendor_support aes_x86_64 snd_pcm
> crypto_simd ptp mei_me dcdbas lpc_ich sr_mod cryptd usb_storage snd_timer
> glue_helper mfd_core input_leds pps_core tpm_tis cdrom i2c_i801 snd mei
> tpm_tis_core sg tpm [last unloaded: parport_pc]
> [259701.539387] CPU: 10 PID: 4860 Comm: Xorg Tainted: GT
> 5.3.0-rc2-64-00013-g03f05a670a3d #69
> [259701.549382] Hardware name: Dell Inc. Precision T3610/09M8Y8, BIOS A16
> 02/05/2018
> [259701.549382] Call Trace:
> [259701.549382]  dump_stack+0x46/0x60
> [259701.549382]  bad_page.cold.28+0x81/0xb4
> [259701.549382]  __free_pages_ok+0x236/0x240
> [259701.549382]  __ttm_dma_free_page+0x2f/0x40
> [259701.549382]  ttm_dma_unpopulate+0x29b/0x370
> [259701.549382]  ttm_tt_destroy.part.6+0x44/0x50
> [259701.549382]  ttm_bo_cleanup_memtype_use+0x29/0x70
> [259701.549382]  ttm_bo_put+0x225/0x280
> [259701.549382]  ttm_bo_vm_close+0x10/0x20
> [259701.549382]  remove_vma+0x20/0x40
> [259701.549382]  __do_munmap+0x2da/0x420
> [259701.549382]  __vm_munmap+0x66/0xc0
> [259701.549382]  __x64_sys_munmap+0x22/0x30
> [259701.549382]  do_syscall_64+0x5e/0x1a0
> [259701.549382]  ? prepare_exit_to_usermode+0x75/0xa0
> [259701.549382]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
> [259701.549382] RIP: 0033:0x7f504d0ec1d7
> [259701.549382] Code: 10 e9 67 ff ff ff 0f 1f 44 00 00 48 8b 15 b1 6c 0c 00 f7
> d8 64 89 02 48 c7 c0 ff ff ff ff e9 6b ff ff ff b8 0b 00 00 00 0f 05 <48> 3d 
> 01
> f0 ff ff 73 01 c3 48 8b 0d 89 6c 0c 00 f7 d8 64 89 01 48
> [259701.549382] RSP: 002b:7ffe529db138 EFLAGS: 0206 ORIG_RAX:
> 000b
> [259701.549382] RAX: ffda RBX: 564a5eabce70 RCX:
> 7f504d0ec1d7
> [259701.549382] RDX: 7ffe529db140 RSI: 0040 RDI:
> 7f5044b65000
> [259701.549382] RBP: 564a5eafe460 R08: 000b R09:
> 00010283e000
> [259701.549382] R10: 0001 R11: 0206 R12:
> 564a5e475b08
> [259701.549382] R13: 564a5e475c80 R14: 7ffe529db190 R15:
> 0c80
> [259701.707238] Disabling lock debugging due to kernel taint

I assume the above is misbehaviour in the DRM code?

> 
> Also - maybe related, maybe not - I've got three userspace crashes earlier on
> this kernel (but never before):
> 
> [77154.886836] iscons.py[12441]: segfault at 2c ip 080cf0b5 sp
> f773fb60 error 4 in python[8048000+11a000]
> [77154.898376] Code: 02 0f 84 4a 2e 00 00 8b 4d 08 8b bd 04 ff ff ff 8b 59 38
> 8b 57 20 8b 7b 10 85 ff 0f 84 ee 22 00 00 8b 8d 04 ff ff ff 8b 59 08 <8b> 43 
> 2c
> 85 

Re: mmotm 2019-07-04-15-01 uploaded (gpu/drm/i915/oa/)

2019-07-04 Thread Andrew Morton
On Thu, 04 Jul 2019 22:22:41 -0700 Joe Perches  wrote:

> > So when comparing a zero-length file with a non-existent file, diff
> > produces no output.
> 
> Why use the -N option ?
> 
> $ diff --help
> [...]
>   -N, --new-file  treat absent files as empty
> 
> otherwise
> 
> $ cd $(mktemp -d -p .)
> $ touch x
> $ diff -u x y
> diff: y: No such file or directory

Without -N diff fails and exits with an error.  -N does what's desired
as long as the non-missing file isn't empty.


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: mmotm 2019-07-04-15-01 uploaded (gpu/drm/i915/oa/)

2019-07-04 Thread Andrew Morton
On Fri, 5 Jul 2019 13:14:35 +1000 Stephen Rothwell  
wrote:

> > I checked next-20190704 tag.
> > 
> > I see the empty file
> > drivers/gpu/drm/i915/oa/Makefile
> > 
> > Did someone delete it?
> 
> Commit
> 
>   5ed7a0cf3394 ("drm/i915: Move OA files to separate folder")
> 
> from the drm-intel tree seems to have created it as an empty file.

hrm.

diff(1) doesn't seem to know how to handle a zero-length file.

y:/home/akpm> mkdir foo
y:/home/akpm> cd foo
y:/home/akpm/foo> touch x
y:/home/akpm/foo> diff -uN x y
y:/home/akpm/foo> date > x
y:/home/akpm/foo> diff -uN x y
--- x   2019-07-04 21:58:37.815028211 -0700
+++ y   1969-12-31 16:00:00.0 -0800
@@ -1 +0,0 @@
-Thu Jul  4 21:58:37 PDT 2019

So when comparing a zero-length file with a non-existent file, diff
produces no output.


This'll make things happy.  And I guess it should be done to protect
all the valuable intellectual property in that file.

--- /dev/null
+++ a/drivers/gpu/drm/i915/oa/Makefile
@@ -0,0 +1 @@
+# SPDX-License-Identifier: GPL-2.0
_

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH 16/25] device-dax: use the dev_pagemap internal refcount

2019-07-02 Thread Andrew Morton
On Fri, 28 Jun 2019 12:14:44 -0700 Dan Williams  
wrote:

> I believe -mm auto drops patches when they appear in the -next
> baseline. So it should "just work" to pull it into the series and send
> it along for -next inclusion.

Yup.  Although it isn't very "auto" - I manually check that the patch
which turned up in -next was identical to the version which I had.  If
not, I go find out why...

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: dev_pagemap related cleanups

2019-06-13 Thread Andrew Morton
On Thu, 13 Jun 2019 14:24:20 -0600 Logan Gunthorpe  wrote:

> 
> 
> On 2019-06-13 2:21 p.m., Dan Williams wrote:
> > On Thu, Jun 13, 2019 at 1:18 PM Logan Gunthorpe  wrote:
> >>
> >>
> >>
> >> On 2019-06-13 12:27 p.m., Dan Williams wrote:
> >>> On Thu, Jun 13, 2019 at 2:43 AM Christoph Hellwig  wrote:
> 
>  Hi Dan, Jérôme and Jason,
> 
>  below is a series that cleans up the dev_pagemap interface so that
>  it is more easily usable, which removes the need to wrap it in hmm
>  and thus allowing to kill a lot of code
> 
>  Diffstat:
> 
>   22 files changed, 245 insertions(+), 802 deletions(-)
> >>>
> >>> Hooray!
> >>>
>  Git tree:
> 
>  git://git.infradead.org/users/hch/misc.git hmm-devmem-cleanup
> >>>
> >>> I just realized this collides with the dev_pagemap release rework in
> >>> Andrew's tree (commit ids below are from next.git and are not stable)
> >>>
> >>> 4422ee8476f0 mm/devm_memremap_pages: fix final page put race
> >>> 771f0714d0dc PCI/P2PDMA: track pgmap references per resource, not globally
> >>> af37085de906 lib/genalloc: introduce chunk owners
> >>> e0047ff8aa77 PCI/P2PDMA: fix the gen_pool_add_virt() failure path
> >>> 0315d47d6ae9 mm/devm_memremap_pages: introduce devm_memunmap_pages
> >>> 216475c7eaa8 drivers/base/devres: introduce devm_release_action()
> >>>
> >>> CONFLICT (content): Merge conflict in tools/testing/nvdimm/test/iomap.c
> >>> CONFLICT (content): Merge conflict in mm/hmm.c
> >>> CONFLICT (content): Merge conflict in kernel/memremap.c
> >>> CONFLICT (content): Merge conflict in include/linux/memremap.h
> >>> CONFLICT (content): Merge conflict in drivers/pci/p2pdma.c
> >>> CONFLICT (content): Merge conflict in drivers/nvdimm/pmem.c
> >>> CONFLICT (content): Merge conflict in drivers/dax/device.c
> >>> CONFLICT (content): Merge conflict in drivers/dax/dax-private.h
> >>>
> >>> Perhaps we should pull those out and resend them through hmm.git?
> >>
> >> Hmm, I've been waiting for those patches to get in for a little while now 
> >> ;(
> > 
> > Unless Andrew was going to submit as v5.2-rc fixes I think I should
> > rebase / submit them on current hmm.git and then throw these cleanups
> > from Christoph on top?
> 
> Whatever you feel is best. I'm just hoping they get in sooner rather
> than later. They do fix a bug after all. Let me know if you want me to
> retest the P2PDMA stuff after the rebase.

I had them down for 5.3-rc1.  I'll send them along for 5.2-rc5 instead.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH V2] include: linux: Regularise the use of FIELD_SIZEOF macro

2019-06-11 Thread Andrew Morton
On Tue, 11 Jun 2019 15:00:10 -0600 Andreas Dilger  wrote:

> >> to FIELD_SIZEOF
> > 
> > As Alexey has pointed out, C structs and unions don't have fields -
> > they have members.  So this is an opportunity to switch everything to
> > a new member_sizeof().
> > 
> > What do people think of that and how does this impact the patch footprint?
> 
> I did a check, and FIELD_SIZEOF() is used about 350x, while sizeof_field()
> is about 30x, and SIZEOF_FIELD() is only about 5x.

Erk.  Sorry, I should have grepped.

> That said, I'm much more in favour of "sizeof_field()" or "sizeof_member()"
> than FIELD_SIZEOF().  Not only does that better match "offsetof()", with
> which it is closely related, but is also closer to the original "sizeof()".
> 
> Since this is a rather trivial change, it can be split into a number of
> patches to get approval/landing via subsystem maintainers, and there is no
> huge urgency to remove the original macros until the users are gone.  It
> would make sense to remove SIZEOF_FIELD() and sizeof_field() quickly so
> they don't gain more users, and the remaining FIELD_SIZEOF() users can be
> whittled away as the patches come through the maintainer trees.

In that case I'd say let's live with FIELD_SIZEOF() and remove
sizeof_field() and SIZEOF_FIELD().

I'm a bit surprised that the FIELD_SIZEOF() definition ends up in
stddef.h rather than in kernel.h where such things are normally
defined.  Why is that?



Re: [PATCH V2] include: linux: Regularise the use of FIELD_SIZEOF macro

2019-06-11 Thread Andrew Morton
On Wed, 12 Jun 2019 01:08:36 +0530 Shyam Saini 
 wrote:

> Currently, there are 3 different macros, namely sizeof_field, SIZEOF_FIELD
> and FIELD_SIZEOF which are used to calculate the size of a member of
> structure, so to bring uniformity in entire kernel source tree lets use
> FIELD_SIZEOF and replace all occurrences of other two macros with this.
> 
> For this purpose, redefine FIELD_SIZEOF in include/linux/stddef.h and
> tools/testing/selftests/bpf/bpf_util.h and remove its defination from
> include/linux/kernel.h
> 
> In favour of FIELD_SIZEOF, this patch also deprecates other two similar
> macros sizeof_field and SIZEOF_FIELD.
> 
> For code compatibility reason, retain sizeof_field macro as a wrapper macro
> to FIELD_SIZEOF

As Alexey has pointed out, C structs and unions don't have fields -
they have members.  So this is an opportunity to switch everything to
a new member_sizeof().

What do people think of that and how does this impact the patch footprint?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: RFC: Run a dedicated hmm.git for 5.3

2019-05-25 Thread Andrew Morton
On Fri, 24 May 2019 09:44:55 -0300 Jason Gunthorpe  wrote:

> Now that -mm merged the basic hmm API skeleton I think running like
> this would get us quickly to the place we all want: comprehensive in tree
> users of hmm.
> 
> Andrew, would this be acceptable to you?

Sure.  Please take care not to permit this to reduce the amount of
exposure and review which the core HMM pieces get.



Re: [PATCH 1/2] mm/hmm: support automatic NUMA balancing

2019-05-13 Thread Andrew Morton
On Fri, 10 May 2019 19:53:23 + "Kuehling, Felix"  
wrote:

> From: Philip Yang 
> 
> While the page is migrating by NUMA balancing, HMM failed to detect this
> condition and still return the old page. Application will use the new
> page migrated, but driver pass the old page physical address to GPU,
> this crash the application later.
> 
> Use pte_protnone(pte) to return this condition and then hmm_vma_do_fault
> will allocate new page.
> 
> Signed-off-by: Philip Yang 

This should have included your signed-off-by:, since you were on the
patch delivery path.  I'll make that change to my copy of the patch,
OK?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v6 0/8] mmu notifier provide context informations

2019-04-09 Thread Andrew Morton
On Tue, 26 Mar 2019 12:47:39 -0400 jgli...@redhat.com wrote:

> From: Jérôme Glisse 
> 
> (Andrew this apply on top of my HMM patchset as otherwise you will have
>  conflict with changes to mm/hmm.c)
> 
> Changes since v5:
> - drop KVM bits waiting for KVM people to express interest if they
>   do not then i will post patchset to remove change_pte_notify as
>   without the changes in v5 change_pte_notify is just useless (it
>   it is useless today upstream it is just wasting cpu cycles)
> - rebase on top of lastest Linus tree
> 
> Previous cover letter with minor update:
> 
> 
> Here i am not posting users of this, they already have been posted to
> appropriate mailing list [6] and will be merge through the appropriate
> tree once this patchset is upstream.
> 
> Note that this serie does not change any behavior for any existing
> code. It just pass down more information to mmu notifier listener.
> 
> The rational for this patchset:
> 
> CPU page table update can happens for many reasons, not only as a
> result of a syscall (munmap(), mprotect(), mremap(), madvise(), ...)
> but also as a result of kernel activities (memory compression, reclaim,
> migration, ...).
> 
> This patch introduce a set of enums that can be associated with each
> of the events triggering a mmu notifier:
> 
> - UNMAP: munmap() or mremap()
> - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
> - PROTECTION_VMA: change in access protections for the range
> - PROTECTION_PAGE: change in access protections for page in the range
> - SOFT_DIRTY: soft dirtyness tracking
> 
> Being able to identify munmap() and mremap() from other reasons why the
> page table is cleared is important to allow user of mmu notifier to
> update their own internal tracking structure accordingly (on munmap or
> mremap it is not longer needed to track range of virtual address as it
> becomes invalid). Without this serie, driver are force to assume that
> every notification is an munmap which triggers useless trashing within
> drivers that associate structure with range of virtual address. Each
> driver is force to free up its tracking structure and then restore it
> on next device page fault. With this serie we can also optimize device
> page table update [6].
> 
> More over this can also be use to optimize out some page table updates
> like for KVM where we can update the secondary MMU directly from the
> callback instead of clearing it.

We seem to be rather short of review input on this patchset.  ie: there
is none.

> ACKS AMD/RADEON https://lkml.org/lkml/2019/2/1/395

OK, kind of ackish, but not a review.

> ACKS RDMA https://lkml.org/lkml/2018/12/6/1473

This actually acks the infiniband part of a patch which isn't in this
series.


So we have some work to do, please.  Who would be suitable reviewers?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel

Re: [PATCH v4 0/9] mmu notifier provide context informations

2019-01-31 Thread Andrew Morton
On Thu, 31 Jan 2019 11:10:06 -0500 Jerome Glisse  wrote:

> Andrew what is your plan for this ? I had a discussion with Peter Xu
> and Andrea about change_pte() and kvm. Today the change_pte() kvm
> optimization is effectively disabled because of invalidate_range
> calls. With a minimal couple lines patch on top of this patchset
> we can bring back the kvm change_pte optimization and we can also
> optimize some other cases like for instance when write protecting
> after fork (but i am not sure this is something qemu does often so
> it might not help for real kvm workload).
> 
> I will be posting a the extra patch as an RFC, but in the meantime
> i wanted to know what was the status for this.

The various drm patches appear to be headed for collisions with drm
tree development so we'll need to figure out how to handle that and in
what order things happen.

It's quite unclear from the v4 patchset's changelogs that this has
anything to do with KVM and "the change_pte() kvm optimization" hasn't
been described anywhere(?).

So..  I expect the thing to do here is to get everything finished, get
the changelogs completed with this new information and do a resend.

Can we omit the drm and rdma patches for now?  Feed them in via the
subsystem maintainers when the dust has settled?

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 3/3] mm/mmu_notifier: contextual information for event triggering invalidation

2018-12-04 Thread Andrew Morton
On Mon,  3 Dec 2018 15:18:17 -0500 jgli...@redhat.com wrote:

> CPU page table update can happens for many reasons, not only as a result
> of a syscall (munmap(), mprotect(), mremap(), madvise(), ...) but also
> as a result of kernel activities (memory compression, reclaim, migration,
> ...).
> 
> Users of mmu notifier API track changes to the CPU page table and take
> specific action for them. While current API only provide range of virtual
> address affected by the change, not why the changes is happening.
> 
> This patchset adds event information so that users of mmu notifier can
> differentiate among broad category:
> - UNMAP: munmap() or mremap()
> - CLEAR: page table is cleared (migration, compaction, reclaim, ...)
> - PROTECTION_VMA: change in access protections for the range
> - PROTECTION_PAGE: change in access protections for page in the range
> - SOFT_DIRTY: soft dirtyness tracking
> 
> Being able to identify munmap() and mremap() from other reasons why the
> page table is cleared is important to allow user of mmu notifier to
> update their own internal tracking structure accordingly (on munmap or
> mremap it is not longer needed to track range of virtual address as it
> becomes invalid).
> 
> ...
>
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -519,6 +519,7 @@ bool __oom_reap_task_mm(struct mm_struct *mm)
>   struct mmu_notifier_range range;
>   struct mmu_gather tlb;
>  
> + range.event = MMU_NOTIFY_CLEAR;
>   range.start = vma->vm_start;
>   range.end = vma->vm_end;
>   range.mm = mm;

mmu_notifier_range and MMU_NOTIFY_CLEAR aren't defined if
CONFIG_MMU_NOTIFIER=n.

I'll try a temporary bodge:

+++ a/include/linux/mmu_notifier.h
@@ -10,8 +10,6 @@
 struct mmu_notifier;
 struct mmu_notifier_ops;
 
-#ifdef CONFIG_MMU_NOTIFIER
-
 /*
  * The mmu notifier_mm structure is allocated and installed in
  * mm->mmu_notifier_mm inside the mm_take_all_locks() protected
@@ -32,6 +30,8 @@ struct mmu_notifier_range {
bool blockable;
 };
 
+#ifdef CONFIG_MMU_NOTIFIER
+
 struct mmu_notifier_ops {
/*
 * Called either by mmu_notifier_unregister or when the mm is


But this new code should vanish altogether if CONFIG_MMU_NOTIFIER=n,
please.  Or at least, we shouldn't be unnecessarily initializing .mm
and .event.  Please take a look at debloating this code.


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] mm: Replace all open encodings for NUMA_NO_NODE

2018-11-23 Thread Andrew Morton
On Fri, 23 Nov 2018 15:24:16 +0530 Anshuman Khandual 
 wrote:

> At present there are multiple places where invalid node number is encoded
> as -1. Even though implicitly understood it is always better to have macros
> in there. Replace these open encodings for an invalid node number with the
> global macro NUMA_NO_NODE. This helps remove NUMA related assumptions like
> 'invalid node' from various places redirecting them to a common definition.
> 
> ...
> 
> Build tested this with multiple cross compiler options like alpha, sparc,
> arm64, x86, powerpc, powerpc64le etc with their default config which might
> not have compiled tested all driver related changes. I will appreciate
> folks giving this a test in their respective build environment.
> 
> All these places for replacement were found by running the following grep
> patterns on the entire kernel code. Please let me know if this might have
> missed some instances. This might also have replaced some false positives.
> I will appreciate suggestions, inputs and review.
> 
> 1. git grep "nid == -1"
> 2. git grep "node == -1"
> 3. git grep "nid = -1"
> 4. git grep "node = -1"

The build testing is good, but I worry that some of the affected files
don't clearly have numa.h in their include paths, for the NUMA_NO_NODE
definition.

The first thing I looked it is arch/powerpc/include/asm/pci-bridge.h. 
Maybe it somehow manages to include numa.h via some nested include, but
if so, is that reliable across all config combinations and as code
evolves?

So I think that the patch should have added an explicit include of
numa.h, especially in cases where the affected file previously had no
references to any of the things which numa.h defines.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v8 0/7] mm: Merge hmm into devm_memremap_pages, mark GPL-only

2018-11-21 Thread Andrew Morton
On Tue, 20 Nov 2018 15:12:49 -0800 Dan Williams  
wrote:

> Changes since v7 [1]:
> At Maintainer Summit, Greg brought up a topic I proposed around
> EXPORT_SYMBOL_GPL usage. The motivation was considerations for when
> EXPORT_SYMBOL_GPL is warranted and the criteria for taking the
> exceptional step of reclassifying an existing export. Specifically, I
> wanted to make the case that although the line is fuzzy and hard to
> specify in abstract terms, it is nonetheless clear that
> devm_memremap_pages() and HMM (Heterogeneous Memory Management) have
> crossed it. The devm_memremap_pages() facility should have been
> EXPORT_SYMBOL_GPL from the beginning, and HMM as a derivative of that
> functionality should have naturally picked up that designation as well.
> 
> Contrary to typical rules, the HMM infrastructure was merged upstream
> with zero in-tree consumers. There was a promise at the time that those
> users would be merged "soon", but it has been over a year with no drivers
> arriving. While the Nouveau driver is about to belatedly make good on
> that promise it is clear that HMM was targeted first and foremost at an
> out-of-tree consumer.
> 
> HMM is derived from devm_memremap_pages(), a facility Christoph and I
> spearheaded to support persistent memory. It combines a device lifetime
> model with a dynamically created 'struct page' / memmap array for any
> physical address range. It enables coordination and control of the many
> code paths in the kernel built to interact with memory via 'struct page'
> objects. With HMM the integration goes even deeper by allowing device
> drivers to hook and manipulate page fault and page free events.
> 
> One interpretation of when EXPORT_SYMBOL is suitable is when it is
> exporting stable and generic leaf functionality.  The
> devm_memremap_pages() facility continues to see expanding use cases,
> peer-to-peer DMA being the most recent, with no clear end date when it
> will stop attracting reworks and semantic changes. It is not suitable to
> export devm_memremap_pages() as a stable 3rd party driver API due to the
> fact that it is still changing and manipulates core behavior. Moreover,
> it is not in the best interest of the long term development of the core
> memory management subsystem to permit any external driver to effectively
> define its own system-wide memory management policies with no
> encouragement to engage with upstream.
> 
> I am also concerned that HMM was designed in a way to minimize further
> engagement with the core-MM. That, with these hooks in place,
> device-drivers are free to implement their own policies without much
> consideration for whether and how the core-MM could grow to meet that
> need. Going forward not only should HMM be EXPORT_SYMBOL_GPL, but the
> core-MM should be allowed the opportunity and stimulus to change and
> address these new use cases as first class functionality.
> 

The arguments are compelling.  I apologize for not thinking of and/or
not being made aware of them at the time.

I'll take [7/7] (with all the above added to the changelog) with a view
to a 4.21-rc1 merge.  That gives us a couple of months for further
discussion.  Public discussion, please.

It should be noted that [7/7] has a cc:stable.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-24 Thread Andrew Morton
On Tue, 24 Jul 2018 16:17:47 +0200 Michal Hocko  wrote:

> On Fri 20-07-18 17:09:02, Andrew Morton wrote:
> [...]
> > - Undocumented return value.
> > 
> > - comment "failed to reap part..." is misleading - sounds like it's
> >   referring to something which happened in the past, is in fact
> >   referring to something which might happen in the future.
> > 
> > - fails to call trace_finish_task_reaping() in one case
> > 
> > - code duplication.
> > 
> > - Increases mmap_sem hold time a little by moving
> >   trace_finish_task_reaping() inside the locked region.  So sue me ;)
> > 
> > - Sharing the finish: path means that the trace event won't
> >   distinguish between the two sources of finishing.
> > 
> > Please take a look?
> 
> oom_reap_task_mm should return false when __oom_reap_task_mm return
> false. This is what my patch did but it seems this changed by
> http://www.ozlabs.org/~akpm/mmotm/broken-out/mm-oom-remove-oom_lock-from-oom_reaper.patch
> so that one should be fixed.
> 
> diff --git a/mm/oom_kill.c b/mm/oom_kill.c
> index 104ef4a01a55..88657e018714 100644
> --- a/mm/oom_kill.c
> +++ b/mm/oom_kill.c
> @@ -565,7 +565,7 @@ static bool oom_reap_task_mm(struct task_struct *tsk, 
> struct mm_struct *mm)
>   /* failed to reap part of the address space. Try again later */
>   if (!__oom_reap_task_mm(mm)) {
>   up_read(>mmap_sem);
> - return true;
> + return false;
>   }
>  
>   pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
> file-rss:%lukB, shmem-rss:%lukB\n",

OK, thanks, I added that.

> 
> On top of that the proposed cleanup looks as follows:
> 

Looks good to me.  Seems a bit strange that we omit the pr_info()
output if the mm was partially reaped - people would still want to know
this?   Not very important though.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-20 Thread Andrew Morton
On Mon, 16 Jul 2018 13:50:58 +0200 Michal Hocko  wrote:

> From: Michal Hocko 
> 
> There are several blockable mmu notifiers which might sleep in
> mmu_notifier_invalidate_range_start and that is a problem for the
> oom_reaper because it needs to guarantee a forward progress so it cannot
> depend on any sleepable locks.
> 
> ...
>
> @@ -571,7 +565,12 @@ static bool oom_reap_task_mm(struct task_struct *tsk, 
> struct mm_struct *mm)
>  
>   trace_start_task_reaping(tsk->pid);
>  
> - __oom_reap_task_mm(mm);
> + /* failed to reap part of the address space. Try again later */
> + if (!__oom_reap_task_mm(mm)) {
> + up_read(>mmap_sem);
> + ret = false;
> + goto unlock_oom;
> + }

This function is starting to look a bit screwy.

: static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
: {
:   if (!down_read_trylock(>mmap_sem)) {
:   trace_skip_task_reaping(tsk->pid);
:   return false;
:   }
: 
:   /*
:* MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
:* work on the mm anymore. The check for MMF_OOM_SKIP must run
:* under mmap_sem for reading because it serializes against the
:* down_write();up_write() cycle in exit_mmap().
:*/
:   if (test_bit(MMF_OOM_SKIP, >flags)) {
:   up_read(>mmap_sem);
:   trace_skip_task_reaping(tsk->pid);
:   return true;
:   }
: 
:   trace_start_task_reaping(tsk->pid);
: 
:   /* failed to reap part of the address space. Try again later */
:   if (!__oom_reap_task_mm(mm)) {
:   up_read(>mmap_sem);
:   return true;
:   }
: 
:   pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
file-rss:%lukB, shmem-rss:%lukB\n",
:   task_pid_nr(tsk), tsk->comm,
:   K(get_mm_counter(mm, MM_ANONPAGES)),
:   K(get_mm_counter(mm, MM_FILEPAGES)),
:   K(get_mm_counter(mm, MM_SHMEMPAGES)));
:   up_read(>mmap_sem);
: 
:   trace_finish_task_reaping(tsk->pid);
:   return true;
: }

- Undocumented return value.

- comment "failed to reap part..." is misleading - sounds like it's
  referring to something which happened in the past, is in fact
  referring to something which might happen in the future.

- fails to call trace_finish_task_reaping() in one case

- code duplication.


I'm thinking it wants to be something like this?

: /*
:  * Return true if we successfully acquired (then released) mmap_sem
:  */
: static bool oom_reap_task_mm(struct task_struct *tsk, struct mm_struct *mm)
: {
:   if (!down_read_trylock(>mmap_sem)) {
:   trace_skip_task_reaping(tsk->pid);
:   return false;
:   }
: 
:   /*
:* MMF_OOM_SKIP is set by exit_mmap when the OOM reaper can't
:* work on the mm anymore. The check for MMF_OOM_SKIP must run
:* under mmap_sem for reading because it serializes against the
:* down_write();up_write() cycle in exit_mmap().
:*/
:   if (test_bit(MMF_OOM_SKIP, >flags)) {
:   trace_skip_task_reaping(tsk->pid);
:   goto out;
:   }
: 
:   trace_start_task_reaping(tsk->pid);
: 
:   if (!__oom_reap_task_mm(mm)) {
:   /* Failed to reap part of the address space. Try again later */
:   goto finish;
:   }
: 
:   pr_info("oom_reaper: reaped process %d (%s), now anon-rss:%lukB, 
file-rss:%lukB, shmem-rss:%lukB\n",
:   task_pid_nr(tsk), tsk->comm,
:   K(get_mm_counter(mm, MM_ANONPAGES)),
:   K(get_mm_counter(mm, MM_FILEPAGES)),
:   K(get_mm_counter(mm, MM_SHMEMPAGES)));
: finish:
:   trace_finish_task_reaping(tsk->pid);
: out:
:   up_read(>mmap_sem);
:   return true;
: }

- Increases mmap_sem hold time a little by moving
  trace_finish_task_reaping() inside the locked region.  So sue me ;)

- Sharing the finish: path means that the trace event won't
  distinguish between the two sources of finishing.

Please take a look?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-20 Thread Andrew Morton
On Tue, 17 Jul 2018 10:12:01 +0200 Michal Hocko  wrote:

> > Any suggestions regarding how the driver developers can test this code
> > path?  I don't think we presently have a way to fake an oom-killing
> > event?  Perhaps we should add such a thing, given the problems we're
> > having with that feature.
> 
> The simplest way is to wrap an userspace code which uses these notifiers
> into a memcg and set the hard limit to hit the oom. This can be done
> e.g. after the test faults in all the mmu notifier managed memory and
> set the hard limit to something really small. Then we are looking for a
> proper process tear down.

Chances are, some of the intended audience don't know how to do this
and will either have to hunt down a lot of documentation or will just
not test it.  But we want them to test it, so a little worked step-by-step
example would help things along please.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] mm, oom: distinguish blockable mode for mmu notifiers

2018-07-16 Thread Andrew Morton
On Mon, 16 Jul 2018 13:50:58 +0200 Michal Hocko  wrote:

> From: Michal Hocko 
> 
> There are several blockable mmu notifiers which might sleep in
> mmu_notifier_invalidate_range_start and that is a problem for the
> oom_reaper because it needs to guarantee a forward progress so it cannot
> depend on any sleepable locks.
> 
> Currently we simply back off and mark an oom victim with blockable mmu
> notifiers as done after a short sleep. That can result in selecting a
> new oom victim prematurely because the previous one still hasn't torn
> its memory down yet.
> 
> We can do much better though. Even if mmu notifiers use sleepable locks
> there is no reason to automatically assume those locks are held.
> Moreover majority of notifiers only care about a portion of the address
> space and there is absolutely zero reason to fail when we are unmapping an
> unrelated range. Many notifiers do really block and wait for HW which is
> harder to handle and we have to bail out though.
> 
> This patch handles the low hanging fruid. 
> __mmu_notifier_invalidate_range_start
> gets a blockable flag and callbacks are not allowed to sleep if the
> flag is set to false. This is achieved by using trylock instead of the
> sleepable lock for most callbacks and continue as long as we do not
> block down the call chain.

I assume device driver developers are wondering "what does this mean
for me".  As I understand it, the only time they will see
blockable==false is when their driver is being called in response to an
out-of-memory condition, yes?  So it is a very rare thing.

Any suggestions regarding how the driver developers can test this code
path?  I don't think we presently have a way to fake an oom-killing
event?  Perhaps we should add such a thing, given the problems we're
having with that feature.

> I think we can improve that even further because there is a common
> pattern to do a range lookup first and then do something about that.
> The first part can be done without a sleeping lock in most cases AFAICS.
> 
> The oom_reaper end then simply retries if there is at least one notifier
> which couldn't make any progress in !blockable mode. A retry loop is
> already implemented to wait for the mmap_sem and this is basically the
> same thing.
> 
> ...
>
> +static inline int mmu_notifier_invalidate_range_start_nonblock(struct 
> mm_struct *mm,
> +   unsigned long start, unsigned long end)
> +{
> + int ret = 0;
> + if (mm_has_notifiers(mm))
> + ret = __mmu_notifier_invalidate_range_start(mm, start, end, 
> false);
> +
> + return ret;
>  }

nit,

{
if (mm_has_notifiers(mm))
return __mmu_notifier_invalidate_range_start(mm, start, end, 
false);
return 0;
}

would suffice.


> 
> ...
>
> --- a/mm/mmap.c
> +++ b/mm/mmap.c
> @@ -3074,7 +3074,7 @@ void exit_mmap(struct mm_struct *mm)
>* reliably test it.
>*/
>   mutex_lock(_lock);
> - __oom_reap_task_mm(mm);
> + (void)__oom_reap_task_mm(mm);
>   mutex_unlock(_lock);

What does this do?

>   set_bit(MMF_OOM_SKIP, >flags);
> 
> ...
>

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] kernel.h: Add for_each_if()

2018-07-11 Thread Andrew Morton
On Wed, 11 Jul 2018 13:51:08 +0200 Daniel Vetter  wrote:

> But I still have the situation that a bunch of maintainers acked this
> and Andrew Morton defacto nacked it, which I guess means I'll keep the
> macro in drm? The common way to go about this seems to be to just push
> the patch series with the ack in some pull request to Linus and ignore
> the people who raised questions, but not really my thing.

Heh.

But, am I wrong?  Code which uses regular kernel style doesn't have
these issues.  We shouldn't be enabling irregular style - we should be
making such sites more regular.  The fact that the compiler generates a
nice warning in some cases simply helps us with that.



___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] kernel.h: Add for_each_if()

2018-07-09 Thread Andrew Morton
On Mon,  9 Jul 2018 18:25:09 +0200 Daniel Vetter  wrote:

> To avoid compilers complainig about ambigious else blocks when putting
> an if condition into a for_each macro one needs to invert the
> condition and add a dummy else. We have a nice little convenience
> macro for that in drm headers, let's move it out. Subsequent patches
> will roll it out to other places.
> 
> The issue the compilers complain about are nested if with an else
> block and no {} to disambiguate which if the else belongs to. The C
> standard is clear, but in practice people forget:
> 
> if (foo)
>   if (bar)
>   /* something */
>   else
>   /* something else

um, yeah, don't do that.  Kernel coding style is very much to do

if (foo) {
if (bar)
/* something */
else
/* something else
}

And if not doing that generates a warning then, well, do that.

> The same can happen in a for_each macro when it also contains an if
> condition at the end, except the compiler message is now really
> confusing since there's only 1 if:
> 
> for_each_something()
>   if (bar)
>   /* something */
>   else
>   /* something else
> 
> The for_each_if() macro, by inverting the condition and adding an
> else, avoids the compiler warning.

Ditto.

> Motivated by a discussion with Andy and Yisheng, who want to add
> another for_each_macro which would benefit from for_each_if() instead
> of hand-rolling it.

Ditto.

> v2: Explain a bit better what this is good for, after the discussion
> with Peter Z.

Presumably the above was discussed in whatever-thread-that-was.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [patch 1/1] drivers/gpu/drm/i915/intel_guc_log.c: work around gcc-4.4.4 union initializer issue

2018-03-08 Thread Andrew Morton
On Thu, 08 Mar 2018 12:06:46 +0200 Jani Nikula <jani.nik...@linux.intel.com> 
wrote:

> On Wed, 07 Mar 2018, a...@linux-foundation.org wrote:
> > From: Andrew Morton <a...@linux-foundation.org>
> > Subject: drivers/gpu/drm/i915/intel_guc_log.c: work around gcc-4.4.4 union 
> > initializer issue
> >
> > gcc-4.4.4 has problems with initalizers of anon unions.
> >
> > drivers/gpu/drm/i915/intel_guc_log.c: In function 'guc_log_control':
> > drivers/gpu/drm/i915/intel_guc_log.c:64: error: unknown field 
> > 'logging_enabled' specified in initializer
> >
> > Work around this.
> 
> Thanks for the patch, pushed to drm-intel-next-queued.
> 
> That said, how long do we have to care about old compilers? I thought we
> were converging on at the very least GCC 4.5 being required.

Yes, I've seen some talk about that and it is about time for us to do
it.  I'm not sure what stage things are at though.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] fix double ;;s in code

2018-02-20 Thread Andrew Morton
On Tue, 20 Feb 2018 10:03:56 +0200 Jani Nikula  
wrote:

> On Mon, 19 Feb 2018, Pavel Machek  wrote:
> > On Mon 2018-02-19 16:41:35, Daniel Vetter wrote:
> >> Yeah, pls split this into one patch per area, with a suitable patch
> >> subject prefix. Look at git log of each file to get a feeling for what's
> >> the standard in each area.
> >
> > Yeah I can spend hour spliting it, and then people will ignore it
> > anyway.
> >
> > If you care about one of the files being modified, please fix the
> > bug, ";;" is a clear bug.
> >
> > If you don't care ... well I don't care either.
> 
> Look, if this causes just one conflict down the line because it touches
> the kernel all over the place, then IMO it already wasn't worth
> it. Merge conflicts are inevitable, but there's no reason to make life
> harder just to cater for a cleanup patch. It's not that important.
> 
> Had it been split up, the drm parts would've been merged already.

I just stage patches like this behind linux-next.  So if your stuff in
in -next, you'll never notice a thing.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [RFC] Per file OOM badness

2018-01-22 Thread Andrew Morton
On Thu, 18 Jan 2018 11:47:48 -0500 Andrey Grodzovsky 
 wrote:

> Hi, this series is a revised version of an RFC sent by Christian König
> a few years ago. The original RFC can be found at 
> https://lists.freedesktop.org/archives/dri-devel/2015-September/089778.html
> 
> This is the same idea and I've just adressed his concern from the original 
> RFC 
> and switched to a callback into file_ops instead of a new member in struct 
> file.

Should be in address_space_operations, I suspect.  If an application
opens a file twice, we only want to count it once?

But we're putting the cart ahead of the horse here.  Please provide us
with a detailed description of the problem which you are addressing so
that the MM developers can better consider how to address your
requirements.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] fbdev: remove current maintainer

2016-12-19 Thread Andrew Morton
On Mon, 19 Dec 2016 14:34:12 +0100 Bartlomiej Zolnierkiewicz  wrote:

> 
> Hi,
> 
> On Thursday, December 15, 2016 12:12:52 PM Andrew Morton wrote:
> > On Thu, 8 Dec 2016 10:34:12 +0200 Tomi Valkeinen  
> > wrote:
> > 
> > > Remove Tomi Valkeinen from fbdev maintainer and mark fbdev as orphan.
> > > 
> > > ...
> > >
> > > I just don't have time to even pretend I maintain fbdev, so mark it as 
> > > Orphan.
> > > 
> > > Anyone want to pick fbdev maintainership?
> 
> http://marc.info/?l=linux-fbdev=148181317415549=2
> 
> > I'll keep an eye on the mailing list, try to ensure that things keep
> > ticking along.
> 
> Andrew, do you want to manage patches yourself or should I step
> in and try to do it?

No, please go ahead.



[PATCH] fbdev: remove current maintainer

2016-12-15 Thread Andrew Morton
On Thu, 8 Dec 2016 10:34:12 +0200 Tomi Valkeinen  
wrote:

> Remove Tomi Valkeinen from fbdev maintainer and mark fbdev as orphan.
> 
> ...
>
> I just don't have time to even pretend I maintain fbdev, so mark it as Orphan.
> 
> Anyone want to pick fbdev maintainership?

I'll keep an eye on the mailing list, try to ensure that things keep
ticking along.



[RFC 00/10] implement alternative and much simpler id allocator

2016-12-09 Thread Andrew Morton
On Thu,  8 Dec 2016 02:22:55 +0100 Rasmus Villemoes  wrote:

> TL;DR: these patches save 250 KB of memory, with more low-hanging
> fruit ready to pick.
> 
> While browsing through the lib/idr.c code, I noticed that the code at
> the end of ida_get_new_above() probably doesn't work as intended: Most
> users of ida use it via ida_simple_get(), and that starts by
> unconditionally calling ida_pre_get(), ensuring that ida->idr has
> 8==MAX_IDR_FREE idr_layers in its free list id_free. In the common
> case, none (or at most one) of these get used during
> ida_get_new_above(), and we only free one, leaving at least 6 (usually
> 7) idr_layers in the free list.

Please be aware of

http://ozlabs.org/~akpm/mmots/broken-out/reimplement-idr-and-ida-using-the-radix-tree.patch
http://lkml.kernel.org/r/1480369871-5271-68-git-send-email-mawilcox at 
linuxonhyperv.com

I expect we'll be merging patches 1-32 of that series into 4.10-rc1 and
the above patch (#33) into 4.11-rc1.


[PATCH v2 1/2] mm: fix cache mode of dax pmd mappings

2016-09-08 Thread Andrew Morton
On Wed, 07 Sep 2016 15:26:14 -0700 Dan Williams  
wrote:

> track_pfn_insert() in vmf_insert_pfn_pmd() is marking dax mappings as
> uncacheable rendering them impractical for application usage.  DAX-pte
> mappings are cached and the goal of establishing DAX-pmd mappings is to
> attain more performance, not dramatically less (3 orders of magnitude).
> 
> track_pfn_insert() relies on a previous call to reserve_memtype() to
> establish the expected page_cache_mode for the range.  While memremap()
> arranges for reserve_memtype() to be called, devm_memremap_pages() does
> not.  So, teach track_pfn_insert() and untrack_pfn() how to handle
> tracking without a vma, and arrange for devm_memremap_pages() to
> establish the write-back-cache reservation in the memtype tree.

Acked-by: Andrew Morton 

I'll grab [2/2].


[PATCH] tree-wide: replace config_enabled() with IS_ENABLED()

2016-06-07 Thread Andrew Morton
On Mon,  6 Jun 2016 21:20:56 +0900 Masahiro Yamada  wrote:

> The use of config_enabled() against config options is ambiguous.
> In practical terms, config_enabled() is equivalent to IS_BUILTIN(),
> but the author might have used it for the meaning of IS_ENABLED().
> Using IS_ENABLED(), IS_BUILTIN(), IS_MODULE() etc. makes the
> intention clearer.
> 
> This commit replaces config_enabled() with IS_ENABLED() where
> possible.  This commit is only touching bool config options.
> 
> I noticed two cases where config_enabled() is used against a tristate
> option:
> 
>  - config_enabled(CONFIG_HWMON)
>   [ drivers/net/wireless/ath/ath10k/thermal.c ]
> 
>  - config_enabled(CONFIG_BACKLIGHT_CLASS_DEVICE)
>   [ drivers/gpu/drm/gma500/opregion.c ]
> 
> I did not touch them because they should be converted to IS_BUILTIN()
> in order to keep the logic, but I was not sure it was the authors'
> intention.

Well, we do want to be able to remove config_enabled() altogether if
we're going to do this.  So please later send along a patch which makes
a best-effort fix of the unclear usages and let's zap the thing.

If those fixes weren't quite correct then there will be a build error
(drivers/net/wireless/ath/ath10k/thermal.c) or no change in behaviour
(drivers/gpu/drm/gma500/opregion.c).



[PATCH v7 00/12] Support non-lru page migration

2016-06-01 Thread Andrew Morton
On Wed,  1 Jun 2016 08:21:09 +0900 Minchan Kim  wrote:

> Recently, I got many reports about perfermance degradation in embedded
> system(Android mobile phone, webOS TV and so on) and easy fork fail.
> 
> The problem was fragmentation caused by zram and GPU driver mainly.
> With memory pressure, their pages were spread out all of pageblock and
> it cannot be migrated with current compaction algorithm which supports
> only LRU pages. In the end, compaction cannot work well so reclaimer
> shrinks all of working set pages. It made system very slow and even to
> fail to fork easily which requires order-[2 or 3] allocations.
> 
> Other pain point is that they cannot use CMA memory space so when OOM
> kill happens, I can see many free pages in CMA area, which is not
> memory efficient. In our product which has big CMA memory, it reclaims
> zones too exccessively to allocate GPU and zram page although there are
> lots of free space in CMA so system becomes very slow easily.

But this isn't presently implemented for GPU drivers or for CMA, yes?

What's the story there?


[PATCH v4 00/13] Support non-lru page migration

2016-04-27 Thread Andrew Morton
On Wed, 27 Apr 2016 16:48:13 +0900 Minchan Kim  wrote:

> Recently, I got many reports about perfermance degradation in embedded
> system(Android mobile phone, webOS TV and so on) and easy fork fail.
> 
> The problem was fragmentation caused by zram and GPU driver mainly.
> With memory pressure, their pages were spread out all of pageblock and
> it cannot be migrated with current compaction algorithm which supports
> only LRU pages. In the end, compaction cannot work well so reclaimer
> shrinks all of working set pages. It made system very slow and even to
> fail to fork easily which requires order-[2 or 3] allocations.
> 
> Other pain point is that they cannot use CMA memory space so when OOM
> kill happens, I can see many free pages in CMA area, which is not
> memory efficient. In our product which has big CMA memory, it reclaims
> zones too exccessively to allocate GPU and zram page although there are
> lots of free space in CMA so system becomes very slow easily.
> 
> To solve these problem, this patch tries to add facility to migrate
> non-lru pages via introducing new functions and page flags to help
> migration.

I'm seeing some rejects here against Mel's changes and our patch
bandwidth is getting waaay way ahead of our review bandwidth.  So I
think I'll loadshed this patchset at this time, sorry.


[PATCH 2/9] mm: Provide new get_vaddr_frames() helper

2015-06-02 Thread Andrew Morton
On Tue, 2 Jun 2015 17:23:00 +0200 Jan Kara  wrote:

> > That's a lump of new code which many kernels won't be needing.  Can we
> > put all this in a new .c file and select it within drivers/media
> > Kconfig?
>   So the attached patch should do what you had in mind. OK?

lgtm.

>  drivers/gpu/drm/exynos/Kconfig  |   1 +
>  drivers/media/platform/omap/Kconfig |   1 +
>  drivers/media/v4l2-core/Kconfig |   1 +
>  mm/Kconfig  |   3 +
>  mm/Makefile |   1 +
>  mm/frame-vec.c  | 233 
> 

But frame_vector.c would be a more pleasing name.  For `struct frame_vector'.


[PATCH 2/9] mm: Provide new get_vaddr_frames() helper

2015-05-28 Thread Andrew Morton
On Wed, 13 May 2015 15:08:08 +0200 Jan Kara  wrote:

> Provide new function get_vaddr_frames().  This function maps virtual
> addresses from given start and fills given array with page frame numbers of
> the corresponding pages. If given start belongs to a normal vma, the function
> grabs reference to each of the pages to pin them in memory. If start
> belongs to VM_IO | VM_PFNMAP vma, we don't touch page structures. Caller
> must make sure pfns aren't reused for anything else while he is using
> them.
> 
> This function is created for various drivers to simplify handling of
> their buffers.
> 
> Acked-by: Mel Gorman 
> Acked-by: Vlastimil Babka 
> Signed-off-by: Jan Kara 
> ---
>  include/linux/mm.h |  44 +++
>  mm/gup.c   | 226 
> +

That's a lump of new code which many kernels won't be needing.  Can we
put all this in a new .c file and select it within drivers/media
Kconfig?



[Intel-gfx] drivers/gpu/drm/i915/i915_gem_gtt.c

2015-05-23 Thread Andrew Morton
On Sat, 23 May 2015 14:30:09 +0100 Damien Lespiau  
wrote:

> On Fri, May 22, 2015 at 02:17:32PM -0700, Andrew Morton wrote:
> > I'm not sure what's happened to the drm code in linux-next - it's
> > exploding all over the place.  Did someone turn on -Werror without
> > doing anywhere near enough testing?
> > 
> > Anyway, I don't know how to fix this i386 build error:
> 
> Seems like you have CONFIG_DRM_I915_WERROR set?

Yes.

> We explicitely made sure to not enable -Werror by default,

`make allmodconfig' enables CONFIG_DRM_I915_WERROR.

I'm not sure what is the approved way of fixing this.  Perhaps
disabling CONFIG_DRM_I915_WERROR when CONFIG_COMPILE_TEST=y.


drivers/gpu/drm/i915/i915_gem_gtt.c

2015-05-22 Thread Andrew Morton

I'm not sure what's happened to the drm code in linux-next - it's
exploding all over the place.  Did someone turn on -Werror without
doing anywhere near enough testing?

Anyway, I don't know how to fix this i386 build error:

drivers/gpu/drm/i915/i915_gem_gtt.c: In function 'gen8_ppgtt_init':
drivers/gpu/drm/i915/i915_gem_gtt.c:954:2: error: large integer implicitly 
truncated to unsigned type [-Werror=overflow]

ppgtt->base.total = 1ULL << 32;

i915_address_space.total is a ulong: 32-bit.



[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton
On Wed, 12 Nov 2014 13:08:55 +0900 Tetsuo Handa  wrote:

> Andrew Morton wrote:
> > Poor ttm guys - this is a bit of a trap we set for them.
> 
> Commit a91576d7916f6cce (\"drm/ttm: Pass GFP flags in order to avoid 
> deadlock.\")
> changed to use sc->gfp_mask rather than GFP_KERNEL.
> 
> -   pages_to_free = kmalloc(npages_to_free * sizeof(struct page *),
> -   GFP_KERNEL);
> +   pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> 
> But this bug is caused by sc->gfp_mask containing some flags which are not
> in GFP_KERNEL, right? Then, I think
> 
> -   pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp);
> +   pages_to_free = kmalloc(npages_to_free * sizeof(struct page *), gfp & 
> GFP_KERNEL);
> 
> would hide this bug.
> 
> But I think we should use GFP_ATOMIC (or drop __GFP_WAIT flag)

Well no - ttm_page_pool_free() should stop calling kmalloc altogether. 
Just do

struct page *pages_to_free[16];

and rework the code to free 16 pages at a time.  Easy.

Apart from all the other things we're discussing here, it should do
this because kmalloc() isn't very reliable within a shrinker.


> for
> two reasons when __alloc_pages_nodemask() is called from shrinker functions.
> 
> (1) Stack usage by __alloc_pages_nodemask() is large. If we unlimitedly allow
> recursive __alloc_pages_nodemask() calls, kernel stack could overflow
> under extreme memory pressure.
> 
> (2) Some shrinker functions are using sleepable locks which could make kswapd
> sleep for unpredictable duration. If kswapd is unexpectedly blocked inside
> shrinker functions and somebody is expecting that kswapd is running for
> reclaiming memory, it is a memory allocation deadlock.
> 
> Speak of ttm module, commit 22e71691fd54c637 (\"drm/ttm: Use mutex_trylock() 
> to
> avoid deadlock inside shrinker functions.\") prevents unlimited recursive
> __alloc_pages_nodemask() calls.

Yes, there are such problems.

Shrinkers do all sorts of surprising things - some of the filesystem
ones do disk writes!  And these involve all sorts of locking and memory
allocations.  But they won't be directly using scan_control.gfp_mask. 
They may be using open-coded __GFP_NOFS for the allocations.  The
complicated ones pass the IO over to kernel threads and wait for them
to complete, which addresses the stack consumption concerns (at least).




[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton
On Wed, 12 Nov 2014 03:47:03 +0200 "Kirill A. Shutemov"  wrote:

> On Wed, Nov 12, 2014 at 03:22:41AM +0200, Kirill A. Shutemov wrote:
> > On Tue, Nov 11, 2014 at 04:49:13PM -0800, Andrew Morton wrote:
> > > On Tue, 11 Nov 2014 18:36:28 -0600 (CST) Christoph Lameter  > > linux.com> wrote:
> > > 
> > > > On Tue, 11 Nov 2014, Andrew Morton wrote:
> > > > 
> > > > > There's no point in doing
> > > > >
> > > > >   #define GFP_SLAB_BUG_MASK 
> > > > > (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
> > > > >
> > > > > because __GFP_DMA32|__GFP_HIGHMEM are already part of 
> > > > > ~__GFP_BITS_MASK.
> > > > 
> > > > ?? ~__GFP_BITS_MASK means bits 25 to 31 are set.
> > > > 
> > > > __GFP_DMA32 is bit 2 and __GFP_HIGHMEM is bit 1.
> > > 
> > > Ah, yes, OK.
> > > 
> > > I suppose it's possible that __GFP_HIGHMEM was set.
> > > 
> > > do_huge_pmd_anonymous_page
> > > ->pte_alloc_one
> > >   ->alloc_pages(__userpte_alloc_gfp==__GFP_HIGHMEM)
> > 
> > do_huge_pmd_anonymous_page
> >  alloc_hugepage_vma
> >   alloc_pages_vma(GFP_TRANSHUGE)
> > 
> > GFP_TRANSHUGE contains GFP_HIGHUSER_MOVABLE, which has __GFP_HIGHMEM.
> 
> Looks like it's reasonable to sanitize flags in shrink_slab() by dropping
> flags incompatible with slab expectation. Like this:
> 
> diff --git a/mm/vmscan.c b/mm/vmscan.c
> index dcb47074ae03..eb165d29c5e5 100644
> --- a/mm/vmscan.c
> +++ b/mm/vmscan.c
> @@ -369,6 +369,8 @@ unsigned long shrink_slab(struct shrink_control 
> *shrinkctl,
> if (nr_pages_scanned == 0)
> nr_pages_scanned = SWAP_CLUSTER_MAX;
>  
> +   shrinkctl->gfp_mask &= ~(__GFP_DMA32 | __GFP_HIGHMEM);
> +
> if (!down_read_trylock(_rwsem)) {
> /*
>  * If we would return 0, our callers would understand that we

Well no, because nobody is supposed to be passing this gfp_mask back
into a new allocation attempt anyway.  It would be better to do

shrinkctl->gfp_mask |= __GFP_IMMEDIATELY_GO_BUG;

?



[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton
On Wed, 12 Nov 2014 10:22:45 +0900 Joonsoo Kim  
wrote:

> On Tue, Nov 11, 2014 at 05:02:43PM -0800, Andrew Morton wrote:
> > On Wed, 12 Nov 2014 00:54:01 + Luke Dashjr  wrote:
> > 
> > > On Wednesday, November 12, 2014 12:49:13 AM Andrew Morton wrote:
> > > > But anyway - Luke, please attach your .config to
> > > > https://bugzilla.kernel.org/show_bug.cgi?id=87891?
> > > 
> > > Done: https://bugzilla.kernel.org/attachment.cgi?id=157381
> > > 
> > 
> > OK, thanks.  No CONFIG_HIGHMEM of course.  I'm stumped.
> 
> Hello, Andrew.
> 
> I think that the cause is GFP_HIGHMEM.
> GFP_HIGHMEM is always defined regardless CONFIG_HIGHMEM.
> Please look at the do_huge_pmd_anonymous_page().
> It calls alloc_hugepage_vma() and then alloc_pages_vma() is called
> with alloc_hugepage_gfpmask(). This gfpmask includes GFP_TRANSHUGE
> and then GFP_HIGHUSER_MOVABLE.

OK.

So where's the bug?  I'm inclined to say that it's in ttm.  It's taking
a gfp_mask which means "this is the allocation attempt which we are
attempting to satisfy" and uses that for its own allocation.

But ttm has no business using that gfp_mask for its own allocation
attempt.  If anything it should use something like, err,

GFP_KERNEL & ~__GFP_IO & ~__GFP_FS | __GFP_HIGH

although as I mentioned earlier, it would be better to avoid allocation
altogether.

Poor ttm guys - this is a bit of a trap we set for them.


[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton
On Wed, 12 Nov 2014 00:54:01 + Luke Dashjr  wrote:

> On Wednesday, November 12, 2014 12:49:13 AM Andrew Morton wrote:
> > But anyway - Luke, please attach your .config to
> > https://bugzilla.kernel.org/show_bug.cgi?id=87891?
> 
> Done: https://bugzilla.kernel.org/attachment.cgi?id=157381
> 

OK, thanks.  No CONFIG_HIGHMEM of course.  I'm stumped.

It might just have been a random memory bitflip or other corruption of
course.  Is it repeatable at all?

If it is, please add the below and retest?

--- a/mm/slab.c~slab-improve-checking-for-invalid-gfp_flags
+++ a/mm/slab.c
@@ -2590,7 +2590,10 @@ static int cache_grow(struct kmem_cache
 * Be lazy and only check for valid flags here,  keeping it out of the
 * critical path in kmem_cache_alloc().
 */
-   BUG_ON(flags & GFP_SLAB_BUG_MASK);
+   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
+   pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
+   BUG();
+   }
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);

/* Take the node list lock to change the colour_next on this node */
diff -puN mm/slub.c~slab-improve-checking-for-invalid-gfp_flags mm/slub.c
--- a/mm/slub.c~slab-improve-checking-for-invalid-gfp_flags
+++ a/mm/slub.c
@@ -1377,7 +1377,10 @@ static struct page *new_slab(struct kmem
int order;
int idx;

-   BUG_ON(flags & GFP_SLAB_BUG_MASK);
+   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
+   pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
+   BUG();
+   }

page = allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
_



[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton
On Wed, 12 Nov 2014 09:44:19 +0900 Joonsoo Kim  
wrote:

> > 
> > And it's quite infuriating to go BUG when the code could easily warn
> > and fix it up.
> 
> If user wants memory on HIGHMEM, it can be easily fixed by following
> change because all memory is compatible for HIGHMEM. But, if user wants
> memory on DMA32, it's not easy to fix because memory on NORMAL isn't
> compatible with DMA32. slab could return object from another slab page
> even if cache_grow() is successfully called. So BUG_ON() here
> looks right thing to me. We cannot know in advance whether ignoring this
> flag cause more serious result or not.

Well, attempting to fix it up and continue is nice, but we can live
with the BUG.

Not knowing which bit was set is bad.

diff -puN mm/slab.c~slab-improve-checking-for-invalid-gfp_flags mm/slab.c
--- a/mm/slab.c~slab-improve-checking-for-invalid-gfp_flags
+++ a/mm/slab.c
@@ -2590,7 +2590,10 @@ static int cache_grow(struct kmem_cache
 * Be lazy and only check for valid flags here,  keeping it out of the
 * critical path in kmem_cache_alloc().
 */
-   BUG_ON(flags & GFP_SLAB_BUG_MASK);
+   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
+   pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
+   BUG();
+   }
local_flags = flags & (GFP_CONSTRAINT_MASK|GFP_RECLAIM_MASK);

/* Take the node list lock to change the colour_next on this node */
--- a/mm/slub.c~slab-improve-checking-for-invalid-gfp_flags
+++ a/mm/slub.c
@@ -1377,7 +1377,10 @@ static struct page *new_slab(struct kmem
int order;
int idx;

-   BUG_ON(flags & GFP_SLAB_BUG_MASK);
+   if (unlikely(flags & GFP_SLAB_BUG_MASK)) {
+   pr_emerg("gfp: %u\n", flags & GFP_SLAB_BUG_MASK);
+   BUG();
+   }

page = allocate_slab(s,
flags & (GFP_RECLAIM_MASK | GFP_CONSTRAINT_MASK), node);
_



[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton
On Tue, 11 Nov 2014 18:36:28 -0600 (CST) Christoph Lameter  
wrote:

> On Tue, 11 Nov 2014, Andrew Morton wrote:
> 
> > There's no point in doing
> >
> > #define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)
> >
> > because __GFP_DMA32|__GFP_HIGHMEM are already part of ~__GFP_BITS_MASK.
> 
> ?? ~__GFP_BITS_MASK means bits 25 to 31 are set.
> 
> __GFP_DMA32 is bit 2 and __GFP_HIGHMEM is bit 1.

Ah, yes, OK.

I suppose it's possible that __GFP_HIGHMEM was set.

do_huge_pmd_anonymous_page
->pte_alloc_one
  ->alloc_pages(__userpte_alloc_gfp==__GFP_HIGHMEM)

but I haven't traced that through and that's 32-bit.

But anyway - Luke, please attach your .config to
https://bugzilla.kernel.org/show_bug.cgi?id=87891?



[Bug 87891] New: kernel BUG at mm/slab.c:2625!

2014-11-11 Thread Andrew Morton

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).

On Thu, 06 Nov 2014 17:28:41 + bugzilla-daemon at bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=87891
> 
> Bug ID: 87891
>Summary: kernel BUG at mm/slab.c:2625!
>Product: Memory Management
>Version: 2.5
> Kernel Version: 3.17.2
>   Hardware: i386
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: blocking
>   Priority: P1
>  Component: Slab Allocator
>   Assignee: akpm at linux-foundation.org
>   Reporter: luke-jr+linuxbugs at utopios.org
> Regression: No

Well this is interesting.


> [359782.842112] kernel BUG at mm/slab.c:2625!
> ...
> [359782.843008] Call Trace:
> [359782.843017]  [] __kmalloc+0xdf/0x200
> [359782.843037]  [] ? ttm_page_pool_free+0x35/0x180 [ttm]
> [359782.843060]  [] ttm_page_pool_free+0x35/0x180 [ttm]
> [359782.843084]  [] ttm_pool_shrink_scan+0xae/0xd0 [ttm]
> [359782.843108]  [] shrink_slab_node+0x12b/0x2e0
> [359782.843129]  [] ? fragmentation_index+0x14/0x70
> [359782.843150]  [] ? zone_watermark_ok+0x1a/0x20
> [359782.843171]  [] shrink_slab+0xc8/0x110
> [359782.843189]  [] do_try_to_free_pages+0x300/0x410
> [359782.843210]  [] try_to_free_pages+0xbb/0x190
> [359782.843230]  [] __alloc_pages_nodemask+0x696/0xa90
> [359782.843253]  [] do_huge_pmd_anonymous_page+0xfa/0x3f0
> [359782.843278]  [] ? debug_smp_processor_id+0x17/0x20
> [359782.843300]  [] ? __lru_cache_add+0x57/0xa0
> [359782.843321]  [] handle_mm_fault+0x37e/0xdd0

It went pagefault
->__alloc_pages_nodemask
  ->shrink_slab
->ttm_pool_shrink_scan
  ->ttm_page_pool_free
->kmalloc
  ->cache_grow
->BUG_ON(flags & GFP_SLAB_BUG_MASK);

And I don't really know why - I'm not seeing anything in there which
can set a GFP flag which is outside GFP_SLAB_BUG_MASK.  However I see
lots of nits.

Core MM:

__alloc_pages_nodemask() does

if (unlikely(!page)) {
/*
 * Runtime PM, block IO and its error handling path
 * can deadlock because I/O on the device might not
 * complete.
 */
gfp_mask = memalloc_noio_flags(gfp_mask);
page = __alloc_pages_slowpath(gfp_mask, order,
zonelist, high_zoneidx, nodemask,
preferred_zone, classzone_idx, migratetype);
}

so it permanently alters the value of incoming arg gfp_mask.  This
means that the following trace_mm_page_alloc() will print the wrong
value of gfp_mask, and if we later do the `goto retry_cpuset', we retry
with a possibly different gfp_mask.  Isn't this a bug?


Also, why are we even passing a gfp_t down to the shrinkers?  So they
can work out the allocation context - things like __GFP_IO, __GFP_FS,
etc?  Is it even appropriate to use that mask for a new allocation
attempt within a particular shrinker?


ttm:

I think it's a bad idea to be calling kmalloc() in the slab shrinker
function.  We *know* that the system is low on memory and is trying to
free things up.  Trying to allocate *more* memory at this time is
asking for trouble.  ttm_page_pool_free() could easily be tweaked to
use a fixed-size local array of page*'s t avoid that allocation.  Could
someone implement this please?


slab:

There's no point in doing

#define GFP_SLAB_BUG_MASK (__GFP_DMA32|__GFP_HIGHMEM|~__GFP_BITS_MASK)

because __GFP_DMA32|__GFP_HIGHMEM are already part of ~__GFP_BITS_MASK.
What's it trying to do here?

And it's quite infuriating to go BUG when the code could easily warn
and fix it up.

And it's quite infuriating to go BUG because one of the bits was set,
but not tell us which bit it was!


Could the slab guys please review this?

From: Andrew Morton <a...@linux-foundation.org>
Subject: slab: improve checking for invalid gfp_flags

- The code goes BUG, but doesn't tell us which bits were unexpectedly
  set.  Print that out.

- The code goes BUG when it could jsut fix things up and proceed.  Do that.

- ~__GFP_BITS_MASK already includes __GFP_DMA32 and __GFP_HIGHMEM, so
  remove those from the GFP_SLAB_BUG_MASK definition.

Cc: Christoph Lameter 
Cc: Pekka Enberg 
Cc: David Rientjes 
Cc: Joonsoo Kim 
Signed-off-by: Andrew Morton 
---

 include/linux/gfp.h |2 +-
 mm/slab.c   |5 -
 mm/slub.c   |5 -
 3 files changed, 9 insertions(+), 3 deletions(-)

diff -puN include/linux/gfp.h~slab-improve-checking-for-invalid-gfp_flags 
include/linux/gfp.h
--- a/include/linux/gfp.h~slab-improve-checking-for-invalid-gfp_flags
+++ a/include/linux/gfp.h
@@ -145,7 +145,7 @@ str

[PATCH 28/83] mm: Change timing of notification to IOMMUs about a page to be invalidated

2014-07-10 Thread Andrew Morton
On Fri, 11 Jul 2014 00:53:26 +0300 Oded Gabbay  wrote:

> From: Andrew Lewycky 
> 
> This patch changes the location of the mmu_notifier_invalidate_page function
> call inside try_to_unmap_one. The mmu_notifier_invalidate_page function
> call tells the IOMMU that a pgae should be invalidated.
> 
> The location is changed from after releasing the physical page to
> before releasing the physical page.
> 
> This change should prevent the bug that would occur in the
> (rare) case where the GPU attempts to access a page while the CPU
> attempts to swap out that page (or discard it if it is not dirty).

um OK, but what is the effect on all the other
mmu_notifier_ops.invalidate_page() implementations?

Please spell this out in full detail within the changelog and be sure
to cc the affected maintainers.




[PATCH] list: Fix order of arguments for hlist_add_after(_rcu)

2014-06-07 Thread Andrew Morton
On Fri, 6 Jun 2014 10:22:51 -0700 "Paul E. McKenney"  wrote:

> On Fri, Jun 06, 2014 at 03:56:52PM +, David Laight wrote:
> > From: Behalf Of Ken Helias
> > > All other add functions for lists have the new item as first argument and 
> > > the
> > > position where it is added as second argument. This was changed for no 
> > > good
> > > reason in this function and makes using it unnecessary confusing.
> > > 
> > > Also the naming of the arguments in hlist_add_after was confusing. It was
> > > changed to use the same names as hlist_add_after_rcu.
> > ...
> > > -static inline void hlist_add_after_rcu(struct hlist_node *prev,
> > > -struct hlist_node *n)
> > > +static inline void hlist_add_after_rcu(struct hlist_node *n,
> > > +struct hlist_node *prev)
> > 
> > It is rather a shame that the change doesn't generate a compilation
> > error for old source files.
> 
> I am also a bit concerned by this.
> 

yup.  hlist_add_behind()?


[Bug 64521] New: BUG kmalloc-4096 (Tainted: G W ): Poison overwritten

2013-11-06 Thread Andrew Morton

(switched to email.  Please respond via emailed reply-to-all, not via the
bugzilla web interface).


On Wed, 06 Nov 2013 19:12:29 + bugzilla-daemon at bugzilla.kernel.org wrote:

> https://bugzilla.kernel.org/show_bug.cgi?id=64521
> 
> Bug ID: 64521
>Summary: BUG kmalloc-4096 (Tainted: GW   ): Poison
> overwritten
>Product: Memory Management
>Version: 2.5
> Kernel Version: 3.11.7
>   Hardware: All
> OS: Linux
>   Tree: Mainline
> Status: NEW
>   Severity: normal
>   Priority: P1
>  Component: Slab Allocator
>   Assignee: akpm at linux-foundation.org
>   Reporter: mikhail.v.gavrilov at gmail.com
> Regression: No
> 
> Created attachment 113671
>   --> https://bugzilla.kernel.org/attachment.cgi?id=113671=edit
> dmesg output
> 
> $ uname -a
> Linux localhost.localdomain 3.11.7-300.fc20.x86_64+debug #1 SMP Mon Nov 4
> 14:54:17 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
> 
> [14065.872585] perf samples too long (2552 > 2500), lowering
> kernel.perf_event_max_sample_rate to 5
> [23825.627427]
> =
> [23825.627438] BUG kmalloc-4096 (Tainted: GW   ): Poison overwritten
> [23825.627442]
> -
> 
> [23825.627449] INFO: 0x880311f7a5ee-0x880311f7a5ee. First byte 0x7b
> instead of 0x6b
> [23825.627495] INFO: Allocated in i915_gem_execbuffer2+0x51/0x290 [i915] 
> age=89
> cpu=6 pid=2995
> [23825.627503] __slab_alloc+0x45f/0x526
> [23825.627509] __kmalloc+0x322/0x3b0
> [23825.627534] i915_gem_execbuffer2+0x51/0x290 [i915]
> [23825.627554] drm_ioctl+0x542/0x680 [drm]
> [23825.627561] do_vfs_ioctl+0x305/0x530
> [23825.627566] SyS_ioctl+0x81/0xa0
> [23825.627576] tracesys+0xdd/0xe2
> [23825.627601] INFO: Freed in i915_gem_execbuffer2+0xed/0x290 [i915] age=89
> cpu=6 pid=2995
> [23825.627607] __slab_free+0x3a/0x382
> [23825.627611] kfree+0x2c0/0x2f0
> [23825.627632] i915_gem_execbuffer2+0xed/0x290 [i915]
> [23825.627649] drm_ioctl+0x542/0x680 [drm]
> [23825.627653] do_vfs_ioctl+0x305/0x530
> [23825.627657] SyS_ioctl+0x81/0xa0
> [23825.627663] tracesys+0xdd/0xe2
> [23825.627668] INFO: Slab 0xea000c47de00 objects=7 used=7 fp=0x 
> (null) flags=0x5ff0004080
> [23825.627672] INFO: Object 0x880311f7a290 @offset=8848
> fp=0x880311f79148
> 
> [23825.627679] Bytes b4 880311f7a280: d6 ea 65 01 01 00 00 00 5a 5a 5a 5a
> 5a 5a 5a 5a  ..e.
> [23825.627684] Object 880311f7a290: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627688] Object 880311f7a2a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627692] Object 880311f7a2b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627696] Object 880311f7a2c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627700] Object 880311f7a2d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627704] Object 880311f7a2e0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627708] Object 880311f7a2f0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627711] Object 880311f7a300: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627715] Object 880311f7a310: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627719] Object 880311f7a320: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627723] Object 880311f7a330: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627727] Object 880311f7a340: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627731] Object 880311f7a350: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627735] Object 880311f7a360: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627739] Object 880311f7a370: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627743] Object 880311f7a380: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627747] Object 880311f7a390: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627751] Object 880311f7a3a0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627755] Object 880311f7a3b0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627759] Object 880311f7a3c0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  
> [23825.627762] Object 880311f7a3d0: 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b 6b
> 6b 6b 6b  

[git pull] fbcon locking fixes.

2013-01-24 Thread Andrew Morton
On Fri, 25 Jan 2013 00:42:37 + (GMT)
Dave Airlie  wrote:

> These patches have been sailing around long enough, waiting for a maintainer
> to reappear, so I've decided enough is enough, lockdep is kinda useful to 
> have.
> 
> Thanks to Daniel for annoying me enough :-)

Me too, but the patches broke Maarten's EFI driver setup:
https://lkml.org/lkml/2013/1/15/203.


Re: [git pull] fbcon locking fixes.

2013-01-24 Thread Andrew Morton
On Fri, 25 Jan 2013 00:42:37 + (GMT)
Dave Airlie airl...@linux.ie wrote:

 These patches have been sailing around long enough, waiting for a maintainer
 to reappear, so I've decided enough is enough, lockdep is kinda useful to 
 have.
 
 Thanks to Daniel for annoying me enough :-)

Me too, but the patches broke Maarten's EFI driver setup:
https://lkml.org/lkml/2013/1/15/203.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


linux-next: Tree for Nov 14 (gpu/drm/i915)

2012-11-14 Thread Andrew Morton
On Wed, 14 Nov 2012 11:41:49 -0800
Randy Dunlap  wrote:

> On 11/13/2012 09:30 PM, Stephen Rothwell wrote:
> 
> > Hi all,
> > 
> > News: next-20121115 (i.e. tomorrow) will be the last release until
> > next-20121126 (which should be just be after -rc7, I guess - assuming
> > that Linus does not release v3.7 before then), so if you want something
> > in linux-next for a reasonable amount of testing, it should probably be
> > committed tomorrow.
> > 
> > Changes since 20121113:
> > 
> 
> 
> 
> on i386:
> 
> ERROR: "__build_bug_on_failed" [drivers/gpu/drm/i915/i915.ko] undefined!
> 
> Reference to that symbol is found in
> i915_gem_execbuffer.o.  Reference to BUILD_BUG_ON() is found in
> i915_gem_execbuffer.c:
> 
> static struct eb_objects *
> eb_create(int size)
> {
>   struct eb_objects *eb;
>   int count = PAGE_SIZE / sizeof(struct hlist_head) / 2;
>   BUILD_BUG_ON(!is_power_of_2(PAGE_SIZE / sizeof(struct hlist_head)));
> 

Where to start.

- eb_create() has no business assuming that the hlist_head has any
  particular size.  We could easily add some conditionally-compiled
  debug fields in there, and drm blows up.

- The assertion is obviously true at present, so I assume that gcc
  screwed up and failed to reduce all that to a compile-time constant.

- We have a BUILD_BUG_ON_NOT_POWER_OF_2().  Use it?

- This code is using PAGE_SIZE to scale a kernel data structure. 
  PAGE_SIZE can vary by a factor of 16, depending on Kconfig.  This
  can result in 64k PAGE_SIZE machines exhibiting different beahviour
  from that which the testers saw.

  Don't do it.  It's better to hard-wire 4096.



Re: linux-next: Tree for Nov 14 (gpu/drm/i915)

2012-11-14 Thread Andrew Morton
On Wed, 14 Nov 2012 11:41:49 -0800
Randy Dunlap rdun...@infradead.org wrote:

 On 11/13/2012 09:30 PM, Stephen Rothwell wrote:
 
  Hi all,
  
  News: next-20121115 (i.e. tomorrow) will be the last release until
  next-20121126 (which should be just be after -rc7, I guess - assuming
  that Linus does not release v3.7 before then), so if you want something
  in linux-next for a reasonable amount of testing, it should probably be
  committed tomorrow.
  
  Changes since 20121113:
  
 
 
 
 on i386:
 
 ERROR: __build_bug_on_failed [drivers/gpu/drm/i915/i915.ko] undefined!
 
 Reference to that symbol is found in
 i915_gem_execbuffer.o.  Reference to BUILD_BUG_ON() is found in
 i915_gem_execbuffer.c:
 
 static struct eb_objects *
 eb_create(int size)
 {
   struct eb_objects *eb;
   int count = PAGE_SIZE / sizeof(struct hlist_head) / 2;
   BUILD_BUG_ON(!is_power_of_2(PAGE_SIZE / sizeof(struct hlist_head)));
 

Where to start.

- eb_create() has no business assuming that the hlist_head has any
  particular size.  We could easily add some conditionally-compiled
  debug fields in there, and drm blows up.

- The assertion is obviously true at present, so I assume that gcc
  screwed up and failed to reduce all that to a compile-time constant.

- We have a BUILD_BUG_ON_NOT_POWER_OF_2().  Use it?

- This code is using PAGE_SIZE to scale a kernel data structure. 
  PAGE_SIZE can vary by a factor of 16, depending on Kconfig.  This
  can result in 64k PAGE_SIZE machines exhibiting different beahviour
  from that which the testers saw.

  Don't do it.  It's better to hard-wire 4096.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] backlight: initialize struct backlight_properties properly

2012-05-22 Thread Andrew Morton
On Mon, 21 May 2012 09:24:35 +0200
Corentin Chary  wrote:

> In all these files, the .power field was never correctly initialized.
> 
> ...
>
> --- a/drivers/gpu/drm/i915/intel_panel.c
> +++ b/drivers/gpu/drm/i915/intel_panel.c
> @@ -342,6 +342,7 @@ int intel_panel_setup_backlight(struct drm_device *dev)
>   else
>   return -ENODEV;
>  
> + memset(, 0, sizeof(props));

This code is all still quite fragile.  What happens if we add a new
field to backlight_properties?  We need to go edit a zillion drivers?

There are two ways of fixing this:

a) write and use

void backlight_properties_init(struct backlight_properties *props,
int type, int max_brightness, etc);

   or

b) edit all sites to do

struct backlight_properties bl_props = {
.type = BACKLIGHT_RAW,
.max_brighness = intel_panel_get_max_backlight(dev),
.power = FB_BLANK_UNBLANK,
};

they're largely equivalent - the struct will be zeroed out and
then certain fields will be initialised.  a) is a bit better because
some future field may not want the all-zeroes pattern (eg, a spinlock).

But they're both better than what we have now!


[PATCH v3] scatterlist: add sg_alloc_table_from_pages function

2012-05-22 Thread Andrew Morton
On Mon, 21 May 2012 16:01:50 +0200
Tomasz Stanislawski  wrote:

> >> +int sg_alloc_table_from_pages(struct sg_table *sgt,
> >> +  struct page **pages, unsigned int n_pages,
> >> +  unsigned long offset, unsigned long size,
> >> +  gfp_t gfp_mask)
> > 
> > I guess a 32-bit n_pages is OK.  A 16TB IO seems enough ;)
> > 
> 
> Do you think that 'unsigned long' for offset is too big?
> 
> Ad n_pages. Assuming that Moore's law holds it will take
> circa 25 years before the limit of 16 TB is reached :) for
> high-end scatterlist operations.
> Or I can change the type of n_pages to 'unsigned long' now at
> no cost :).

By then it will be Someone Else's Problem ;)

> >> +{
> >> +  unsigned int chunks;
> >> +  unsigned int i;
> > 
> > erk, please choose a different name for this.  When a C programmer sees
> > "i", he very much assumes it has type "int".  Making it unsigned causes
> > surprise.
> > 
> > And don't rename it to "u"!  Let's give it a nice meaningful name.  pageno?
> > 
> 
> The problem is that 'i' is  a natural name for a loop counter.

It's also the natural name for an integer.  If a C programmer sees "i",
he thinks "int".  It's a Fortran thing ;)

> AFAIK, in the kernel code developers try to avoid Hungarian notation.
> A name of a variable should reflect its purpose, not its type.
> I can change the name of 'i' to 'pageno' and 'j' to 'pageno2' (?)
> but I think it will make the code less reliable.

Well, one could do something radical such as using "p".




Re: [PATCH v3] scatterlist: add sg_alloc_table_from_pages function

2012-05-22 Thread Andrew Morton
On Mon, 21 May 2012 16:01:50 +0200
Tomasz Stanislawski t.stanisl...@samsung.com wrote:

  +int sg_alloc_table_from_pages(struct sg_table *sgt,
  +  struct page **pages, unsigned int n_pages,
  +  unsigned long offset, unsigned long size,
  +  gfp_t gfp_mask)
  
  I guess a 32-bit n_pages is OK.  A 16TB IO seems enough ;)
  
 
 Do you think that 'unsigned long' for offset is too big?
 
 Ad n_pages. Assuming that Moore's law holds it will take
 circa 25 years before the limit of 16 TB is reached :) for
 high-end scatterlist operations.
 Or I can change the type of n_pages to 'unsigned long' now at
 no cost :).

By then it will be Someone Else's Problem ;)

  +{
  +  unsigned int chunks;
  +  unsigned int i;
  
  erk, please choose a different name for this.  When a C programmer sees
  i, he very much assumes it has type int.  Making it unsigned causes
  surprise.
  
  And don't rename it to u!  Let's give it a nice meaningful name.  pageno?
  
 
 The problem is that 'i' is  a natural name for a loop counter.

It's also the natural name for an integer.  If a C programmer sees i,
he thinks int.  It's a Fortran thing ;)

 AFAIK, in the kernel code developers try to avoid Hungarian notation.
 A name of a variable should reflect its purpose, not its type.
 I can change the name of 'i' to 'pageno' and 'j' to 'pageno2' (?)
 but I think it will make the code less reliable.

Well, one could do something radical such as using p.


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] backlight: initialize struct backlight_properties properly

2012-05-22 Thread Andrew Morton
On Mon, 21 May 2012 09:24:35 +0200
Corentin Chary corentin.ch...@gmail.com wrote:

 In all these files, the .power field was never correctly initialized.
 
 ...

 --- a/drivers/gpu/drm/i915/intel_panel.c
 +++ b/drivers/gpu/drm/i915/intel_panel.c
 @@ -342,6 +342,7 @@ int intel_panel_setup_backlight(struct drm_device *dev)
   else
   return -ENODEV;
  
 + memset(props, 0, sizeof(props));

This code is all still quite fragile.  What happens if we add a new
field to backlight_properties?  We need to go edit a zillion drivers?

There are two ways of fixing this:

a) write and use

void backlight_properties_init(struct backlight_properties *props,
int type, int max_brightness, etc);

   or

b) edit all sites to do

struct backlight_properties bl_props = {
.type = BACKLIGHT_RAW,
.max_brighness = intel_panel_get_max_backlight(dev),
.power = FB_BLANK_UNBLANK,
};

they're largely equivalent - the struct will be zeroed out and
then certain fields will be initialised.  a) is a bit better because
some future field may not want the all-zeroes pattern (eg, a spinlock).

But they're both better than what we have now!
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH v3] scatterlist: add sg_alloc_table_from_pages function

2012-05-17 Thread Andrew Morton
On Tue, 08 May 2012 11:50:33 +0200
Tomasz Stanislawski t.stanisl...@samsung.com wrote:

 This patch adds a new constructor for an sg table. The table is constructed
 from an array of struct pages. All contiguous chunks of the pages are merged
 into a single sg nodes. A user may provide an offset and a size of a buffer if
 the buffer is not page-aligned.
 
 The function is dedicated for DMABUF exporters which often perform conversion
 from an page array to a scatterlist. Moreover the scatterlist should be
 squashed in order to save memory and to speed-up the process of DMA mapping
 using dma_map_sg.
 
 The code is based on the patch 'v4l: vb2-dma-contig: add support for
 scatterlist in userptr mode' and hints from Laurent Pinchart.
 
 ...

  /**
 + * sg_alloc_table_from_pages - Allocate and initialize an sg table from
 + *  an array of pages
 + * @sgt: The sg table header to use
 + * @pages:   Pointer to an array of page pointers
 + * @n_pages: Number of pages in the pages array
 + * @offset: Offset from start of the first page to the start of a buffer
 + * @size:   Number of valid bytes in the buffer (after offset)
 + * @gfp_mask:GFP allocation mask
 + *
 + *  Description:
 + *Allocate and initialize an sg table from a list of pages. Continuous

s/Continuous/Contiguous/

 + *ranges of the pages are squashed into a single scatterlist node. A user
 + *may provide an offset at a start and a size of valid data in a buffer
 + *specified by the page array. The returned sg table is released by
 + *sg_free_table.
 + *
 + * Returns:
 + *   0 on success, negative error on failure
 + **/

nit: Use */, not **/ here.

 +int sg_alloc_table_from_pages(struct sg_table *sgt,
 + struct page **pages, unsigned int n_pages,
 + unsigned long offset, unsigned long size,
 + gfp_t gfp_mask)

I guess a 32-bit n_pages is OK.  A 16TB IO seems enough ;)

 +{
 + unsigned int chunks;
 + unsigned int i;

erk, please choose a different name for this.  When a C programmer sees
i, he very much assumes it has type int.  Making it unsigned causes
surprise.

And don't rename it to u!  Let's give it a nice meaningful name.  pageno?

 + unsigned int cur_page;
 + int ret;
 + struct scatterlist *s;
 +
 + /* compute number of contiguous chunks */
 + chunks = 1;
 + for (i = 1; i  n_pages; ++i)
 + if (page_to_pfn(pages[i]) != page_to_pfn(pages[i - 1]) + 1)

This assumes that if two pages have contiguous pfn's then they are
physically contiguous.  Is that true for all architectures and memory
models, including sparsemem?  See sparse_encode_mem_map().

 + ++chunks;
 +
 + ret = sg_alloc_table(sgt, chunks, gfp_mask);
 + if (unlikely(ret))
 + return ret;
 +
 + /* merging chunks and putting them into the scatterlist */
 + cur_page = 0;
 + for_each_sg(sgt-sgl, s, sgt-orig_nents, i) {
 + unsigned long chunk_size;
 + unsigned int j;

j is an int, too.

 +
 + /* looking for the end of the current chunk */

s/looking/look/

 + for (j = cur_page + 1; j  n_pages; ++j)
 + if (page_to_pfn(pages[j]) !=
 + page_to_pfn(pages[j - 1]) + 1)
 + break;
 +
 + chunk_size = ((j - cur_page)  PAGE_SHIFT) - offset;
 + sg_set_page(s, pages[cur_page], min(size, chunk_size), offset);
 + size -= chunk_size;
 + offset = 0;
 + cur_page = j;
 + }
 +
 + return 0;
 +}
 +EXPORT_SYMBOL(sg_alloc_table_from_pages);

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-03-01 Thread Andrew Morton
On Thu,  1 Mar 2012 20:22:59 +0100
Daniel Vetter  wrote:

> drm/i915 wants to read/write more than one page in its fastpath
> and hence needs to prefault more than PAGE_SIZE bytes.
> 
> Add new functions in filemap.h to make that possible.
> 
> Also kill a copy spurious space in both functions while at it.
> 
>
> ...
>
> +/* Multipage variants of the above prefault helpers, useful if more than
> + * PAGE_SIZE of date needs to be prefaulted. These are separate from the 
> above
> + * functions (which only handle up to PAGE_SIZE) to avoid clobbering the
> + * filemap.c hotpaths. */

Like this please:

/*
 * Multipage variants of the above prefault helpers, useful if more than
 * PAGE_SIZE of date needs to be prefaulted. These are separate from the above
 * functions (which only handle up to PAGE_SIZE) to avoid clobbering the
 * filemap.c hotpaths.
 */

and s/date/data/

> +static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
> +{
> + int ret;
> + const char __user *end = uaddr + size - 1;
> +
> + if (unlikely(size == 0))
> + return 0;
> +
> + /*
> +  * Writing zeroes into userspace here is OK, because we know that if
> +  * the zero gets there, we'll be overwriting it.
> +  */

Yeah, like that.

> + while (uaddr <= end) {
> + ret = __put_user(0, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }
> +
> + /* Check whether the range spilled into the next page. */
> + if (((unsigned long)uaddr & PAGE_MASK) ==
> + ((unsigned long)end & PAGE_MASK))
> + ret = __put_user(0, end);
> +
> + return ret;
> +}
> +
> +static inline int fault_in_multipages_readable(const char __user *uaddr,
> +int size)
> +{
> + volatile char c;
> + int ret;
> + const char __user *end = uaddr + size - 1;
> +
> + if (unlikely(size == 0))
> + return 0;
> +
> + while (uaddr <= end) {
> + ret = __get_user(c, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }
> +
> + /* Check whether the range spilled into the next page. */
> + if (((unsigned long)uaddr & PAGE_MASK) ==
> + ((unsigned long)end & PAGE_MASK)) {
> + ret = __get_user(c, end);
> + (void)c;
> + }
> +
> + return ret;
> +}

Please merge it via the DRI tree.


Re: [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-03-01 Thread Andrew Morton
On Thu,  1 Mar 2012 20:22:59 +0100
Daniel Vetter daniel.vet...@ffwll.ch wrote:

 drm/i915 wants to read/write more than one page in its fastpath
 and hence needs to prefault more than PAGE_SIZE bytes.
 
 Add new functions in filemap.h to make that possible.
 
 Also kill a copypasted spurious space in both functions while at it.
 

 ...

 +/* Multipage variants of the above prefault helpers, useful if more than
 + * PAGE_SIZE of date needs to be prefaulted. These are separate from the 
 above
 + * functions (which only handle up to PAGE_SIZE) to avoid clobbering the
 + * filemap.c hotpaths. */

Like this please:

/*
 * Multipage variants of the above prefault helpers, useful if more than
 * PAGE_SIZE of date needs to be prefaulted. These are separate from the above
 * functions (which only handle up to PAGE_SIZE) to avoid clobbering the
 * filemap.c hotpaths.
 */

and s/date/data/

 +static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
 +{
 + int ret;
 + const char __user *end = uaddr + size - 1;
 +
 + if (unlikely(size == 0))
 + return 0;
 +
 + /*
 +  * Writing zeroes into userspace here is OK, because we know that if
 +  * the zero gets there, we'll be overwriting it.
 +  */

Yeah, like that.

 + while (uaddr = end) {
 + ret = __put_user(0, uaddr);
 + if (ret != 0)
 + return ret;
 + uaddr += PAGE_SIZE;
 + }
 +
 + /* Check whether the range spilled into the next page. */
 + if (((unsigned long)uaddr  PAGE_MASK) ==
 + ((unsigned long)end  PAGE_MASK))
 + ret = __put_user(0, end);
 +
 + return ret;
 +}
 +
 +static inline int fault_in_multipages_readable(const char __user *uaddr,
 +int size)
 +{
 + volatile char c;
 + int ret;
 + const char __user *end = uaddr + size - 1;
 +
 + if (unlikely(size == 0))
 + return 0;
 +
 + while (uaddr = end) {
 + ret = __get_user(c, uaddr);
 + if (ret != 0)
 + return ret;
 + uaddr += PAGE_SIZE;
 + }
 +
 + /* Check whether the range spilled into the next page. */
 + if (((unsigned long)uaddr  PAGE_MASK) ==
 + ((unsigned long)end  PAGE_MASK)) {
 + ret = __get_user(c, end);
 + (void)c;
 + }
 +
 + return ret;
 +}

Please merge it via the DRI tree.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-29 Thread Andrew Morton
On Thu, 1 Mar 2012 00:14:53 +0100
Daniel Vetter  wrote:

> I'll redo this patch by adding _multipage versions of these 2 functions
> for i915.

OK, but I hope "for i915" doesn't mean "private to"!  Put 'em in
pagemap.h, for maintenance reasons and because they are generic.

Making them inline is a bit sad, but that's OK for now - they have few
callsites.


[PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-29 Thread Andrew Morton
On Wed, 29 Feb 2012 15:03:31 +0100
Daniel Vetter  wrote:

> drm/i915 wants to read/write more than one page in its fastpath
> and hence needs to prefault more than PAGE_SIZE bytes.
> 
> I've checked the callsites and they all already clamp size when
> calling fault_in_pages_* to the same as for the subsequent
> __copy_to|from_user and hence don't rely on the implicit clamping
> to PAGE_SIZE.
> 
> Also kill a copy spurious space in both functions while at it.
> 
> v2: As suggested by Andrew Morton, add a multipage parameter to both
> functions to avoid the additional branch for the pagemap.c hotpath.
> My gcc 4.6 here seems to dtrt and indeed reap these branches where not
> needed.
> 

I don't think this produced a very good result :(

>
> ...
>
> -static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> +static inline int fault_in_pages_writeable(char __user *uaddr, int size,
> +bool multipage)
>  {
>   int ret;
> + char __user *end = uaddr + size - 1;
>  
>   if (unlikely(size == 0))
>   return 0;
> @@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user 
> *uaddr, int size)
>* Writing zeroes into userspace here is OK, because we know that if
>* the zero gets there, we'll be overwriting it.
>*/
> - ret = __put_user(0, uaddr);
> - if (ret == 0) {
> - char __user *end = uaddr + size - 1;
> + do {
> + ret = __put_user(0, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + } while (multipage && uaddr <= end);
>  
> + if (ret == 0) {
>   /*
>* If the page was already mapped, this will get a cache miss
>* for sure, so try to avoid doing it.
>*/
> - if (((unsigned long)uaddr & PAGE_MASK) !=
> + if (((unsigned long)uaddr & PAGE_MASK) ==
>   ((unsigned long)end & PAGE_MASK))
> - ret = __put_user(0, end);
> + ret = __put_user(0, end);
>   }
>   return ret;
>  }

One effect of this change for the filemap.c callsite is that `uaddr'
now gets incremented by PAGE_SIZE.  That happens to not break anything
because we then mask `uaddr' with PAGE_MASK, and if gcc were really
smart, it could remove that addition.  But it's a bit ugly.

Ideally the patch would have no effect upon filemap.o size, but with an
allmodconfig config I'm seeing

   textdata bss dec hex filename
  22876 1187344   303387682 mm/filemap.o(before)
  22925 1187392   3043576e3 mm/filemap.o(after)

so we are adding read()/write() overhead, and bss mysteriously got larger.

Can we improve on this?  Even if it's some dumb

static inline int fault_in_pages_writeable(char __user *uaddr, int size,
   bool multipage)
{
if (multipage) {
do-this
} else {
do-that
}
}

the code duplication between do-this and do-that is regrettable, but at
least it's all in the same place in the same file, so we won't
accidentally introduce skew later on.

Alternatively, add a separate fault_in_multi_pages_writeable() to
pagemap.h.  I have a bad feeling this is what your original patch did!

(But we *should* be able to make this work!  Why did this version of
the patch go so wrong?)



Re: [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-29 Thread Andrew Morton
On Wed, 29 Feb 2012 15:03:31 +0100
Daniel Vetter daniel.vet...@ffwll.ch wrote:

 drm/i915 wants to read/write more than one page in its fastpath
 and hence needs to prefault more than PAGE_SIZE bytes.
 
 I've checked the callsites and they all already clamp size when
 calling fault_in_pages_* to the same as for the subsequent
 __copy_to|from_user and hence don't rely on the implicit clamping
 to PAGE_SIZE.
 
 Also kill a copypasted spurious space in both functions while at it.
 
 v2: As suggested by Andrew Morton, add a multipage parameter to both
 functions to avoid the additional branch for the pagemap.c hotpath.
 My gcc 4.6 here seems to dtrt and indeed reap these branches where not
 needed.
 

I don't think this produced a very good result :(


 ...

 -static inline int fault_in_pages_writeable(char __user *uaddr, int size)
 +static inline int fault_in_pages_writeable(char __user *uaddr, int size,
 +bool multipage)
  {
   int ret;
 + char __user *end = uaddr + size - 1;
  
   if (unlikely(size == 0))
   return 0;
 @@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user 
 *uaddr, int size)
* Writing zeroes into userspace here is OK, because we know that if
* the zero gets there, we'll be overwriting it.
*/
 - ret = __put_user(0, uaddr);
 - if (ret == 0) {
 - char __user *end = uaddr + size - 1;
 + do {
 + ret = __put_user(0, uaddr);
 + if (ret != 0)
 + return ret;
 + uaddr += PAGE_SIZE;
 + } while (multipage  uaddr = end);
  
 + if (ret == 0) {
   /*
* If the page was already mapped, this will get a cache miss
* for sure, so try to avoid doing it.
*/
 - if (((unsigned long)uaddr  PAGE_MASK) !=
 + if (((unsigned long)uaddr  PAGE_MASK) ==
   ((unsigned long)end  PAGE_MASK))
 - ret = __put_user(0, end);
 + ret = __put_user(0, end);
   }
   return ret;
  }

One effect of this change for the filemap.c callsite is that `uaddr'
now gets incremented by PAGE_SIZE.  That happens to not break anything
because we then mask `uaddr' with PAGE_MASK, and if gcc were really
smart, it could remove that addition.  But it's a bit ugly.

Ideally the patch would have no effect upon filemap.o size, but with an
allmodconfig config I'm seeing

   textdata bss dec hex filename
  22876 1187344   303387682 mm/filemap.o(before)
  22925 1187392   3043576e3 mm/filemap.o(after)

so we are adding read()/write() overhead, and bss mysteriously got larger.

Can we improve on this?  Even if it's some dumb

static inline int fault_in_pages_writeable(char __user *uaddr, int size,
   bool multipage)
{
if (multipage) {
do-this
} else {
do-that
}
}

the code duplication between do-this and do-that is regrettable, but at
least it's all in the same place in the same file, so we won't
accidentally introduce skew later on.

Alternatively, add a separate fault_in_multi_pages_writeable() to
pagemap.h.  I have a bad feeling this is what your original patch did!

(But we *should* be able to make this work!  Why did this version of
the patch go so wrong?)

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-29 Thread Andrew Morton
On Thu, 1 Mar 2012 00:14:53 +0100
Daniel Vetter dan...@ffwll.ch wrote:

 I'll redo this patch by adding _multipage versions of these 2 functions
 for i915.

OK, but I hope for i915 doesn't mean private to!  Put 'em in
pagemap.h, for maintenance reasons and because they are generic.

Making them inline is a bit sad, but that's OK for now - they have few
callsites.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-24 Thread Andrew Morton
On Fri, 24 Feb 2012 14:34:31 +0100
Daniel Vetter  wrote:

> > > --- a/include/linux/pagemap.h
> > > +++ b/include/linux/pagemap.h
> > > @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, 
> > > wait_queue_t *waiter);
> > >  static inline int fault_in_pages_writeable(char __user *uaddr, int size)
> > >  {
> > >   int ret;
> > > + char __user *end = uaddr + size - 1;
> > >  
> > >   if (unlikely(size == 0))
> > >   return 0;
> > > @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char 
> > > __user *uaddr, int size)
> > >* Writing zeroes into userspace here is OK, because we know that if
> > >* the zero gets there, we'll be overwriting it.
> > >*/
> > > - ret = __put_user(0, uaddr);
> > > + while (uaddr <= end) {
> > > + ret = __put_user(0, uaddr);
> > > + if (ret != 0)
> > > + return ret;
> > > + uaddr += PAGE_SIZE;
> > > + }
> > 
> > The callsites in filemap.c are pretty hot paths, which is why this
> > thing remains explicitly inlined.  I think it would be worth adding a
> > bit of code here to avoid adding a pointless test-n-branch and larger
> > cache footprint to read() and write().
> > 
> > A way of doing that is to add another argument to these functions, say
> > "bool multipage".  Change the code to do
> > 
> > if (multipage) {
> > while (uaddr <= end) {
> > ...
> > }
> > }
> > 
> > and change the callsites to pass in constant "true" or "false".  Then
> > compile it up and manually check that the compiler completely removed
> > the offending code from the filemap.c callsites.
> > 
> > Wanna have a think about that?  If it all looks OK then please be sure
> > to add code comments explaining why we did this.
> 
> I wasn't really happy with the added branch either, but failed to come up
> with a trick to avoid it. Imho adding new _multipage variants of these
> functions instead of adding a constant argument is simpler because the
> functions don't really share much thanks to the block below. I'll see what
> it looks like (and obviously add a comment explaining what's going on).

well... that's just syntactic sugar:

static inline int __fault_in_pages_writeable(char __user *uaddr, int size, bool 
multipage)
{
...
}

static inline int fault_in_pages_writeable(char __user *uaddr, int size)
{
return __fault_in_pages_writeable(uaddr, size, false);
}

static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
{
return __fault_in_pages_writeable(uaddr, size, true);
}

which I don't think is worth bothering with given the very small number
of callsites.



Re: [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-24 Thread Andrew Morton
On Fri, 24 Feb 2012 14:34:31 +0100
Daniel Vetter dan...@ffwll.ch wrote:

   --- a/include/linux/pagemap.h
   +++ b/include/linux/pagemap.h
   @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, 
   wait_queue_t *waiter);
static inline int fault_in_pages_writeable(char __user *uaddr, int size)
{
 int ret;
   + char __user *end = uaddr + size - 1;

 if (unlikely(size == 0))
 return 0;
   @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char 
   __user *uaddr, int size)
  * Writing zeroes into userspace here is OK, because we know that if
  * the zero gets there, we'll be overwriting it.
  */
   - ret = __put_user(0, uaddr);
   + while (uaddr = end) {
   + ret = __put_user(0, uaddr);
   + if (ret != 0)
   + return ret;
   + uaddr += PAGE_SIZE;
   + }
  
  The callsites in filemap.c are pretty hot paths, which is why this
  thing remains explicitly inlined.  I think it would be worth adding a
  bit of code here to avoid adding a pointless test-n-branch and larger
  cache footprint to read() and write().
  
  A way of doing that is to add another argument to these functions, say
  bool multipage.  Change the code to do
  
  if (multipage) {
  while (uaddr = end) {
  ...
  }
  }
  
  and change the callsites to pass in constant true or false.  Then
  compile it up and manually check that the compiler completely removed
  the offending code from the filemap.c callsites.
  
  Wanna have a think about that?  If it all looks OK then please be sure
  to add code comments explaining why we did this.
 
 I wasn't really happy with the added branch either, but failed to come up
 with a trick to avoid it. Imho adding new _multipage variants of these
 functions instead of adding a constant argument is simpler because the
 functions don't really share much thanks to the block below. I'll see what
 it looks like (and obviously add a comment explaining what's going on).

well... that's just syntactic sugar:

static inline int __fault_in_pages_writeable(char __user *uaddr, int size, bool 
multipage)
{
...
}

static inline int fault_in_pages_writeable(char __user *uaddr, int size)
{
return __fault_in_pages_writeable(uaddr, size, false);
}

static inline int fault_in_multipages_writeable(char __user *uaddr, int size)
{
return __fault_in_pages_writeable(uaddr, size, true);
}

which I don't think is worth bothering with given the very small number
of callsites.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-23 Thread Andrew Morton
On Thu, 16 Feb 2012 13:01:36 +0100
Daniel Vetter  wrote:

> drm/i915 wants to read/write more than one page in its fastpath
> and hence needs to prefault more than PAGE_SIZE bytes.
> 
> I've checked the callsites and they all already clamp size when
> calling fault_in_pages_* to the same as for the subsequent
> __copy_to|from_user and hence don't rely on the implicit clamping
> to PAGE_SIZE.
> 
> Also kill a copy spurious space in both functions while at it.
>
> ...
>
> --- a/include/linux/pagemap.h
> +++ b/include/linux/pagemap.h
> @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, 
> wait_queue_t *waiter);
>  static inline int fault_in_pages_writeable(char __user *uaddr, int size)
>  {
>   int ret;
> + char __user *end = uaddr + size - 1;
>  
>   if (unlikely(size == 0))
>   return 0;
> @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user 
> *uaddr, int size)
>* Writing zeroes into userspace here is OK, because we know that if
>* the zero gets there, we'll be overwriting it.
>*/
> - ret = __put_user(0, uaddr);
> + while (uaddr <= end) {
> + ret = __put_user(0, uaddr);
> + if (ret != 0)
> + return ret;
> + uaddr += PAGE_SIZE;
> + }

The callsites in filemap.c are pretty hot paths, which is why this
thing remains explicitly inlined.  I think it would be worth adding a
bit of code here to avoid adding a pointless test-n-branch and larger
cache footprint to read() and write().

A way of doing that is to add another argument to these functions, say
"bool multipage".  Change the code to do

if (multipage) {
while (uaddr <= end) {
...
}
}

and change the callsites to pass in constant "true" or "false".  Then
compile it up and manually check that the compiler completely removed
the offending code from the filemap.c callsites.

Wanna have a think about that?  If it all looks OK then please be sure
to add code comments explaining why we did this.

>   if (ret == 0) {
> - char __user *end = uaddr + size - 1;
> -
>   /*
>* If the page was already mapped, this will get a cache miss
>* for sure, so try to avoid doing it.
>*/
> - if (((unsigned long)uaddr & PAGE_MASK) !=
> + if (((unsigned long)uaddr & PAGE_MASK) ==
>   ((unsigned long)end & PAGE_MASK))

Maybe I'm having a dim day, but I don't immediately see why != got
turned into ==.


Once we have this settled I'd suggest that the patch be carried in
whatever-git-tree-needs-it.


Re: [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE

2012-02-23 Thread Andrew Morton
On Thu, 16 Feb 2012 13:01:36 +0100
Daniel Vetter daniel.vet...@ffwll.ch wrote:

 drm/i915 wants to read/write more than one page in its fastpath
 and hence needs to prefault more than PAGE_SIZE bytes.
 
 I've checked the callsites and they all already clamp size when
 calling fault_in_pages_* to the same as for the subsequent
 __copy_to|from_user and hence don't rely on the implicit clamping
 to PAGE_SIZE.
 
 Also kill a copypasted spurious space in both functions while at it.

 ...

 --- a/include/linux/pagemap.h
 +++ b/include/linux/pagemap.h
 @@ -408,6 +408,7 @@ extern void add_page_wait_queue(struct page *page, 
 wait_queue_t *waiter);
  static inline int fault_in_pages_writeable(char __user *uaddr, int size)
  {
   int ret;
 + char __user *end = uaddr + size - 1;
  
   if (unlikely(size == 0))
   return 0;
 @@ -416,17 +417,20 @@ static inline int fault_in_pages_writeable(char __user 
 *uaddr, int size)
* Writing zeroes into userspace here is OK, because we know that if
* the zero gets there, we'll be overwriting it.
*/
 - ret = __put_user(0, uaddr);
 + while (uaddr = end) {
 + ret = __put_user(0, uaddr);
 + if (ret != 0)
 + return ret;
 + uaddr += PAGE_SIZE;
 + }

The callsites in filemap.c are pretty hot paths, which is why this
thing remains explicitly inlined.  I think it would be worth adding a
bit of code here to avoid adding a pointless test-n-branch and larger
cache footprint to read() and write().

A way of doing that is to add another argument to these functions, say
bool multipage.  Change the code to do

if (multipage) {
while (uaddr = end) {
...
}
}

and change the callsites to pass in constant true or false.  Then
compile it up and manually check that the compiler completely removed
the offending code from the filemap.c callsites.

Wanna have a think about that?  If it all looks OK then please be sure
to add code comments explaining why we did this.

   if (ret == 0) {
 - char __user *end = uaddr + size - 1;
 -
   /*
* If the page was already mapped, this will get a cache miss
* for sure, so try to avoid doing it.
*/
 - if (((unsigned long)uaddr  PAGE_MASK) !=
 + if (((unsigned long)uaddr  PAGE_MASK) ==
   ((unsigned long)end  PAGE_MASK))

Maybe I'm having a dim day, but I don't immediately see why != got
turned into ==.


Once we have this settled I'd suggest that the patch be carried in
whatever-git-tree-needs-it.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


3.2-rc2+: Reported regressions from 3.0 and 3.1

2011-11-21 Thread Andrew Morton
On Tue, 22 Nov 2011 11:19:24 +0530 "Srivatsa S. Bhat"  wrote:

> > Subject: khugepaged blocks suspend2ram (3.2.0-rc1-00252-g8f042aa)
> > Submitter  : Sergei Trofimovich 
> > Date   : 2011-11-12 10:48
> > Message-ID : 2012104859.7744b282 at sf.home
> > References : https://lkml.org/lkml/2011/11/12/11
> > 
> 
> Andrea's patch already fixes this issue, which was reported first by
> Jiri Slaby. https://lkml.org/lkml/2011/11/11/93
> I remember Andrew Morton taking this patch in his -mm tree. But it is
> not in mainline yet. So can we consider this closed or not?

grr, nothing in that patch's changelog indicates that it fixed a
regression nor that it fixed an infinite blockage of suspend.

I moved it to my 3.2 queue, thanks.


Re: 3.2-rc2+: Reported regressions from 3.0 and 3.1

2011-11-21 Thread Andrew Morton
On Tue, 22 Nov 2011 11:19:24 +0530 Srivatsa S. Bhat 
srivatsa.b...@linux.vnet.ibm.com wrote:

  Subject: khugepaged blocks suspend2ram (3.2.0-rc1-00252-g8f042aa)
  Submitter  : Sergei Trofimovich sly...@gmail.com
  Date   : 2011-11-12 10:48
  Message-ID : 2012104859.7744b...@sf.home
  References : https://lkml.org/lkml/2011/11/12/11
  
 
 Andrea's patch already fixes this issue, which was reported first by
 Jiri Slaby. https://lkml.org/lkml/2011/11/11/93
 I remember Andrew Morton taking this patch in his -mm tree. But it is
 not in mainline yet. So can we consider this closed or not?

grr, nothing in that patch's changelog indicates that it fixed a
regression nor that it fixed an infinite blockage of suspend.

I moved it to my 3.2 queue, thanks.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm: avoid switching to text console if there is no panic timeout

2011-11-10 Thread Andrew Morton
On Thu, 10 Nov 2011 13:15:04 -0800
Mandeep Singh Baines  wrote:

> David Rientjes (rientjes at google.com) wrote:
> > On Mon, 17 Oct 2011, David Rientjes wrote:
> > 
> > > On Mon, 17 Oct 2011, Mandeep Singh Baines wrote:
> > > 
> > > > From: Hugh Dickins 
> > > > 
> > > > Add a check for panic_timeout in the drm_fb_helper_panic() notifier: if
> > > > we're going to reboot immediately, the user will not be able to see the
> > > > messages anyway, and messing with the video mode may display artifacts,
> > > > and certainly get into several layers of complexity (including mutexes 
> > > > and
> > > > memory allocations) which we shall be much safer to avoid.
> > > > 
> > > > Signed-off-by: Hugh Dickins 
> > > > [ Edited commit message and modified to short-circuit panic_timeout < 0
> > > >   instead of testing panic_timeout >= 0.  -Mandeep ]
> > > > Signed-off-by: Mandeep Singh Baines 
> > > > Cc: Dave Airlie 
> > > > Cc: Andrew Morton 
> > > > Cc: dri-devel at lists.freedesktop.org
> > > 
> > > Acked-by: David Rientjes 
> > > 
> > 
> > Dave, where do we stand on this?  I haven't seen it hit Linus' tree and I 
> > don't see it in git://people.freedesktop.org/~airlied/linux.
> 
> The last status I have is Andrew pulling it into mmotm on 10/18/11.
> 
> Subject: + 
> drm-avoid-switching-to-text-console-if-there-is-no-panic-timeout.patch added 
> to -mm tree
> From: akpm at linux-foundation.org
> Date: Tue, 18 Oct 2011 15:42:46 -0700
> 
> 
> The patch titled
>  Subject: drm: avoid switching to text console if there is no panic 
> timeout
> has been added to the -mm tree.  Its filename is
>  drm-avoid-switching-to-text-console-if-there-is-no-panic-timeout.patch

I need to do another round of sending patches to maintainers.

It's a depressing exercise because the great majority of patches are
simply ignored.  Last time I even added "please don't ignore" to the
email Subject: on the more important ones.  Sigh.

> Where is mmotm hosted these days?

On my disk, until kernel.org ftp access returns.

But I regularly email tarballs to Stephen, so it's all in linux-next. 
The mmotm tree is largely unneeded now - use linux-next to get at -mm
patches.



Re: [PATCH] drm: avoid switching to text console if there is no panic timeout

2011-11-10 Thread Andrew Morton
On Thu, 10 Nov 2011 13:15:04 -0800
Mandeep Singh Baines m...@chromium.org wrote:

 David Rientjes (rient...@google.com) wrote:
  On Mon, 17 Oct 2011, David Rientjes wrote:
  
   On Mon, 17 Oct 2011, Mandeep Singh Baines wrote:
   
From: Hugh Dickins hu...@chromium.org

Add a check for panic_timeout in the drm_fb_helper_panic() notifier: if
we're going to reboot immediately, the user will not be able to see the
messages anyway, and messing with the video mode may display artifacts,
and certainly get into several layers of complexity (including mutexes 
and
memory allocations) which we shall be much safer to avoid.

Signed-off-by: Hugh Dickins hu...@google.com
[ Edited commit message and modified to short-circuit panic_timeout  0
  instead of testing panic_timeout = 0.  -Mandeep ]
Signed-off-by: Mandeep Singh Baines m...@chromium.org
Cc: Dave Airlie airl...@redhat.com
Cc: Andrew Morton a...@linux-foundation.org
Cc: dri-devel@lists.freedesktop.org
   
   Acked-by: David Rientjes rient...@google.com
   
  
  Dave, where do we stand on this?  I haven't seen it hit Linus' tree and I 
  don't see it in git://people.freedesktop.org/~airlied/linux.
 
 The last status I have is Andrew pulling it into mmotm on 10/18/11.
 
 Subject: + 
 drm-avoid-switching-to-text-console-if-there-is-no-panic-timeout.patch added 
 to -mm tree
 From: a...@linux-foundation.org
 Date: Tue, 18 Oct 2011 15:42:46 -0700
 
 
 The patch titled
  Subject: drm: avoid switching to text console if there is no panic 
 timeout
 has been added to the -mm tree.  Its filename is
  drm-avoid-switching-to-text-console-if-there-is-no-panic-timeout.patch

I need to do another round of sending patches to maintainers.

It's a depressing exercise because the great majority of patches are
simply ignored.  Last time I even added please don't ignore to the
email Subject: on the more important ones.  Sigh.

 Where is mmotm hosted these days?

On my disk, until kernel.org ftp access returns.

But I regularly email tarballs to Stephen, so it's all in linux-next. 
The mmotm tree is largely unneeded now - use linux-next to get at -mm
patches.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


+ drivers-base-platformc-dont-mark-platform_device_register_resndata-as-__init_or_module.patch added to -mm tree

2011-05-19 Thread Andrew Morton
On Thu, 19 May 2011 08:21:36 +0200 Uwe Kleine-K?nig  wrote:

> I'm not sure that the things that radeon_cp_init does are sane.

Some more details here might help someone fix it.

> Maybe
> add a comment that it is the only known stopper to make
> platform_device_register_resndata __init_or_module and a similar comment
> to platform_device_register_resndata itself?

Send patch :)


Re: + drivers-base-platformc-dont-mark-platform_device_register_resndata-as-__init_or_module.patch added to -mm tree

2011-05-19 Thread Andrew Morton
On Thu, 19 May 2011 08:21:36 +0200 Uwe Kleine-König 
u.kleine-koe...@pengutronix.de wrote:

 I'm not sure that the things that radeon_cp_init does are sane.

Some more details here might help someone fix it.

 Maybe
 add a comment that it is the only known stopper to make
 platform_device_register_resndata __init_or_module and a similar comment
 to platform_device_register_resndata itself?

Send patch :)
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


2.6.37.2 : Font corruption and [drm:i915_gem_object_unbind] *ERROR* Attempting to unbind pinned buffer

2011-03-02 Thread Andrew Morton
(cc dri-devel)


On Mon, 28 Feb 2011 12:26:08 +0100 Paul Rolland  wrote:

> Hello,
> 
> I'm using 2.6.37.2 and I'm getting loads of this error in my messages,
> while at the same time I can observe font display corruption in a GTK
> application (Claws-Mail) while running X, compiz and this app.
> 
> I was previously using 2.6.36 and never experienced such a problem.
> 
> Video is :
> 00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset 
> Integrated Graphics Controller (rev 07)
> 00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset 
> Integrated Graphics Controller (rev 07)
> 
> I can provide additional information as needed (such a .config or details
> on userland env. if asked for precise details).
> 
> If this message warns about a badly acting userland, then why was it ok
> with 2.6.36 and the same userland env. ?
> 
> Best,
> Paul
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


[2.6.38-rc6] G965: i915 Hangcheck timer elapsed... GPU hung (not reproducible)

2011-03-02 Thread Andrew Morton

(cc dri-devel)

A post-2.6.37 regression.

On Sun, 27 Feb 2011 10:10:41 +0100
Paolo Ornati  wrote:

> Today I got this while starting a video in SMplayer (MPlayer) with
> 2.6.38-rc6-00113-g4662db4:
> 
> [  830.880014] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer 
> elapsed... GPU hung
> [  830.880736] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request 
> returns -11 (awaiting 174895 at 174857, next 174896)
> [  830.881093] [drm:init_ring_common] *ERROR* render ring initialization 
> failed ctl  head  tail  start 
> [  831.379079] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
> disabled interrupts, re-enabling
> [  831.399099] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
> disabled interrupts, re-enabling
>   ...
> [  837.392012] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer 
> elapsed... GPU hung
> [  837.392038] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request 
> returns -11 (awaiting 175016 at 174857, next 175022)
> [  837.392491] [drm:init_ring_common] *ERROR* render ring initialization 
> failed ctl  head  tail  start 
> [  837.537479] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
> disabled interrupts, re-enabling
> [  837.543285] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
> disabled interrupts, re-enabling
>   ...
> [  839.040011] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer 
> elapsed... GPU hung
> [  839.040034] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request 
> returns -11 (awaiting 175022 at 174857, next 175122)
> [  839.040364] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring 
> wedged!
> [  839.040367] [drm:i915_reset] *ERROR* Failed to reset chip.
> 
> Screen was almost freezed, cursor was stuck but the machine was alive
> and I was able to use SysRq to kill X and try to restart it (but that
> didn't help).
> 
> I don't remember anything similar in recent kernels (<= 2.6.37) and got
> this only once with 2.6.38-rcX.
> 
> Environment at the time of the GPU crash:
>   KDE4 (without "Desktop Effects")
>   Chromium
>   Claws Mail
>   Dolphin
>   ccached make -j3 on a just pulled linux-tree (so I/O bound)
>   SMPlayer/Mplayer (just launched)
> 
> Assorted logs attached.
> 



Re: [2.6.38-rc6] G965: i915 Hangcheck timer elapsed... GPU hung (not reproducible)

2011-03-02 Thread Andrew Morton

(cc dri-devel)

A post-2.6.37 regression.

On Sun, 27 Feb 2011 10:10:41 +0100
Paolo Ornati orn...@gmail.com wrote:

 Today I got this while starting a video in SMplayer (MPlayer) with
 2.6.38-rc6-00113-g4662db4:
 
 [  830.880014] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer 
 elapsed... GPU hung
 [  830.880736] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request 
 returns -11 (awaiting 174895 at 174857, next 174896)
 [  830.881093] [drm:init_ring_common] *ERROR* render ring initialization 
 failed ctl  head  tail  start 
 [  831.379079] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
 disabled interrupts, re-enabling
 [  831.399099] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
 disabled interrupts, re-enabling
   ...
 [  837.392012] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer 
 elapsed... GPU hung
 [  837.392038] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request 
 returns -11 (awaiting 175016 at 174857, next 175022)
 [  837.392491] [drm:init_ring_common] *ERROR* render ring initialization 
 failed ctl  head  tail  start 
 [  837.537479] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
 disabled interrupts, re-enabling
 [  837.543285] [drm:i915_do_wait_request] *ERROR* something (likely vbetool) 
 disabled interrupts, re-enabling
   ...
 [  839.040011] [drm:i915_hangcheck_elapsed] *ERROR* Hangcheck timer 
 elapsed... GPU hung
 [  839.040034] [drm:i915_do_wait_request] *ERROR* i915_do_wait_request 
 returns -11 (awaiting 175022 at 174857, next 175122)
 [  839.040364] [drm:i915_reset] *ERROR* GPU hanging too fast, declaring 
 wedged!
 [  839.040367] [drm:i915_reset] *ERROR* Failed to reset chip.
 
 Screen was almost freezed, cursor was stuck but the machine was alive
 and I was able to use SysRq to kill X and try to restart it (but that
 didn't help).
 
 I don't remember anything similar in recent kernels (= 2.6.37) and got
 this only once with 2.6.38-rcX.
 
 Environment at the time of the GPU crash:
   KDE4 (without Desktop Effects)
   Chromium
   Claws Mail
   Dolphin
   ccached make -j3 on a just pulled linux-tree (so I/O bound)
   SMPlayer/Mplayer (just launched)
 
 Assorted logs attached.
 

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: 2.6.37.2 : Font corruption and [drm:i915_gem_object_unbind] *ERROR* Attempting to unbind pinned buffer

2011-03-02 Thread Andrew Morton
(cc dri-devel)


On Mon, 28 Feb 2011 12:26:08 +0100 Paul Rolland r...@as2917.net wrote:

 Hello,
 
 I'm using 2.6.37.2 and I'm getting loads of this error in my messages,
 while at the same time I can observe font display corruption in a GTK
 application (Claws-Mail) while running X, compiz and this app.
 
 I was previously using 2.6.36 and never experienced such a problem.
 
 Video is :
 00:02.0 VGA compatible controller: Intel Corporation Mobile 4 Series Chipset 
 Integrated Graphics Controller (rev 07)
 00:02.1 Display controller: Intel Corporation Mobile 4 Series Chipset 
 Integrated Graphics Controller (rev 07)
 
 I can provide additional information as needed (such a .config or details
 on userland env. if asked for precise details).
 
 If this message warns about a badly acting userland, then why was it ok
 with 2.6.36 and the same userland env. ?
 
 Best,
 Paul
 --
 To unsubscribe from this list: send the line unsubscribe linux-kernel in
 the body of a message to majord...@vger.kernel.org
 More majordomo info at  http://vger.kernel.org/majordomo-info.html
 Please read the FAQ at  http://www.tux.org/lkml/
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] change acquire/release_console_sem() to console_lock/unlock()

2011-01-21 Thread Andrew Morton
On Fri, 21 Jan 2011 09:10:06 +0100 Geert Uytterhoeven ge...@linux-m68k.org 
wrote:

 include/linux/mutex.h:
 
 /*
  * NOTE: mutex_trylock() follows the spin_trylock() convention,
  *   not the down_trylock() convention!
  *
  * Returns 1 if the mutex has been acquired successfully, and 0 on contention.
  */
 extern int mutex_trylock(struct mutex *lock);
 
 So that's why the return value was inverted (when treating it as a boolean).
 I can understand that.
 
 However:
 
 +/**
 + * console_trylock - try to lock the console system for exclusive use.
 + *
 + * Tried to acquire a lock which guarantees that the caller has
 + * exclusive access to the console system and the console_drivers list.
 + *
 + * returns -1 on success, and 0 on failure to acquire the lock.
 + */
 +int console_trylock(void)
 
 So this one returns -1 on success, not 1? Why?

Yup.  All callers just test for non-zero, so...

--- 
a/kernel/printk.c~change-acquire-release_console_sem-to-console_lock-unlock-fix-2
+++ a/kernel/printk.c
@@ -1058,7 +1058,7 @@ EXPORT_SYMBOL(console_lock);
  * Tried to acquire a lock which guarantees that the caller has
  * exclusive access to the console system and the console_drivers list.
  *
- * returns -1 on success, and 0 on failure to acquire the lock.
+ * returns 1 on success, and 0 on failure to acquire the lock.
  */
 int console_trylock(void)
 {
@@ -1070,7 +1070,7 @@ int console_trylock(void)
}
console_locked = 1;
console_may_schedule = 0;
-   return -1;
+   return 1;
 }
 EXPORT_SYMBOL(console_trylock);
 
_

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] change acquire/release_console_sem() to console_lock/unlock()

2011-01-20 Thread Andrew Morton
On Thu, 20 Jan 2011 17:55:02 +0100
torbenh torb...@gmx.de wrote:

 On Thu, Jan 20, 2011 at 08:34:48AM -0800, Greg KH wrote:
  On Thu, Jan 20, 2011 at 04:58:13PM +0100, Torben Hohn wrote:
   the -rt patches change the console_semaphore to console_mutex.
   so a quite large chunk of the patches changes all
   acquire/release_console_sem() to acquire/release_console_mutex()
  
  Why not just change the functionality of the existing function to be a
  mutex in the rt patches, instead of having to rename it everywhere?
 
 i hope that Thomas already did this in his upcoming -rt series.
 
  
   this commit makes things use more neutral function names
   which dont make implications about the underlying lock.
   
   the only real change is the return value of console_trylock
   which is inverted from try_acquire_console_sem()
   
   Signed-off-by: Torben Hohn torb...@gmx.de
   CC: Thomas Gleixner t...@tglx.de
  
  I don't mind this rename, but is it really going to help anything out?
  What's the odds of the -rt portion of this patch ever making it to
  mainline?
 
 the -rt portion only changes the semaphore to a mutex.
 since the console_sem is used with mutex semantics, i dont see any
 reason, not to merge that portion too. 
 
 i am just trying to shrink the -rt patch to make it more maintanable :)
 

Yeah, I think it's a better name and if we can indeed switch that
semaphore to a mutex then that's a good thing to do.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 2/5] i915: Add native backlight control

2011-01-20 Thread Andrew Morton
On Fri, 21 Jan 2011 00:43:59 +
Matthew Garrett mj...@srcf.ucam.org wrote:

 On Thu, Jan 20, 2011 at 03:13:49PM -0800, Andrew Morton wrote:
  On Fri, 21 Jan 2011 00:43:46 +0330
  Ali Gholami Rudi aliqr...@gmail.com wrote:
  
   Ali Gholami Rudi aliqr...@gmail.com wrote:
I tried to apply this patch but I get:

drivers/gpu/drm/i915/intel_panel.c: In function 
'intel_panel_setup_backlight':
drivers/gpu/drm/i915/intel_panel.c:319: error: 'struct 
backlight_properties' has no member named 'type'
drivers/gpu/drm/i915/intel_panel.c:319: error: 'BACKLIGHT_RAW' 
undeclared (first use in this function)
drivers/gpu/drm/i915/intel_panel.c:319: error: (Each undeclared 
identifier is reported only once
drivers/gpu/drm/i915/intel_panel.c:319: error: for each 
function it appears in.)
   
   After applying patch 1/5, the patch applied cleanly.
   Now I can change the brightness using
   /sys/class/backlight/intel_backlight/brightness.
   So it does fix my issue.
   
  
  So we have a patch ordering issue and the
  radeon-expose-backlight-class-device-for-legacy-lvds-encoder.patch
  build error.
 
 He applied 2/5, it didn't build, he applied 1/5 and it built? I don't 
 think that's a patch ordering issue :)

What, you expect reading skills?

 I think Michel's patch should fix the Radeon one. If not, can you drop 
 that and keep the rest of the set? I'm travelling at the moment and 
 won't have proper build access until the weekend.

OK, I resurrected everything.

I updated the new drivers/video/backlight/adp5520_bl.c, but there's a
decent chance that unconverted drivers will sneak in.  I trust they
will still work OK?

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [Intel-gfx] [PATCH 2/5] i915: Add native backlight control

2011-01-20 Thread Andrew Morton
On Fri, 21 Jan 2011 06:36:54 +0100 Sedat Dilek sedat.di...@googlemail.com 
wrote:

 ( Original posting from [1] )
 
 I have the backlight-type patchset for months in my patch-series (and
 maintained them if necessary against daily linux-next).
 Also the last series from 14-Jan-2011 (see 1-5 patch from [2] and the
 following ones at dri-devel ML).
 
 If you couldn't find the updated v2 radeon-backlight-type patch, here
 the extract from my patch-series:
 ...
 # Patch from https://patchwork.kernel.org/patch/491351/
 + 
 backlight-type/v2-drm-radeon-kms-Expose-backlight-class-device-for-legacy-LVDS-encoder.patch
 ...
 
 I can only speak for the radeon(-KMS) part with
 CONFIG_BACKLIGHT_CLASS_DEVICE=y set and against linux-next: It worked,
 it works.
 
 I am a bit angry that someone of the big 5 gets immediately an
 answer where mine is ignored [3].

Well, who were they sent to?

If it was dri-devel then perhaps the people there felt it was
inappropriate to their tree, or they were all busy fixing
100 regressions.

If it was Richard then he's been distracted by other things for some
time, which is why I recently started handling backlight and leds
patches.

If it was me then kick me, but I don't think it was.

In general, if you think that patches aren't getting attention then let
me know and send them to me - harassing people for you is part of my
job description.

 So dear Mr. AKPM, if you can standstill for a few moments to answer
 only one of my questions, through which kernel-tree will this patchset
 walk trough?

leds and backlight patches are presently getting merged into my tree
and I'm sending them into Linus.  I make sure that Richard sees them
all and when he finds time he helps review them for us.

They should turn up in linux-next too - we're working on that.


This particular patch series is theoretically a bit late for 2.6.38,
but if anyone thinks that's a wrong decision, feel free to explain why.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 3/5] radeon: Expose backlight class device for legacy LVDS encoder

2011-01-19 Thread Andrew Morton
On Fri, 14 Jan 2011 14:24:23 -0500
Matthew Garrett  wrote:

> From: Michel D__nzer 
> 
> Allows e.g. power management daemons to control the backlight level. Inspired
> by the corresponding code in radeonfb.
> 
> (Updated to add backlight type and make the connector the parent device - mjg)
> 

x86_64 allmodconfig:

drivers/gpu/drm/radeon/radeon_legacy_encoders.c: In function 
'radeon_legacy_lvds_update':
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:64: error: 'struct 
radeon_encoder_atom_dig' has no member named 'bl_dev'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:65: error: 'struct 
radeon_encoder_atom_dig' has no member named 'backlight_level'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:69: error: 'struct 
radeon_encoder_lvds' has no member named 'bl_dev'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:70: error: 'struct 
radeon_encoder_lvds' has no member named 'backlight_level'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c: In function 
'radeon_legacy_lvds_dpms':
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:144: error: 'struct 
radeon_encoder_atom_dig' has no member named 'dpms_mode'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:147: error: 'struct 
radeon_encoder_lvds' has no member named 'dpms_mode'



Re: [PATCH 3/5] radeon: Expose backlight class device for legacy LVDS encoder

2011-01-19 Thread Andrew Morton
On Fri, 14 Jan 2011 14:24:23 -0500
Matthew Garrett m...@redhat.com wrote:

 From: Michel D__nzer mic...@daenzer.net
 
 Allows e.g. power management daemons to control the backlight level. Inspired
 by the corresponding code in radeonfb.
 
 (Updated to add backlight type and make the connector the parent device - mjg)
 

x86_64 allmodconfig:

drivers/gpu/drm/radeon/radeon_legacy_encoders.c: In function 
'radeon_legacy_lvds_update':
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:64: error: 'struct 
radeon_encoder_atom_dig' has no member named 'bl_dev'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:65: error: 'struct 
radeon_encoder_atom_dig' has no member named 'backlight_level'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:69: error: 'struct 
radeon_encoder_lvds' has no member named 'bl_dev'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:70: error: 'struct 
radeon_encoder_lvds' has no member named 'backlight_level'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c: In function 
'radeon_legacy_lvds_dpms':
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:144: error: 'struct 
radeon_encoder_atom_dig' has no member named 'dpms_mode'
drivers/gpu/drm/radeon/radeon_legacy_encoders.c:147: error: 'struct 
radeon_encoder_lvds' has no member named 'dpms_mode'

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


VERY slow scrolling on radeon graphics card: debugging a timing issue?

2010-12-22 Thread Andrew Morton
(cc dri-devel)

On Mon, 20 Dec 2010 23:58:21 +0300 Michael Tokarev  wrote:

> Hello.
> 
> A weird problem here, and I'm looking for help in
> an attempt to solve it.
> 
> Ever since KMS went into kernel and I tried turning
> it on, the scrolling speed on the resulting "text"
> console (with kms it works in one of graphics modes
> hence "text" in quotes) become really _awful_.
> 
> For example, running `dmesg' (which has ~2000 lines)
> on the console takes about 2.5 _minutes_ (!) to
> complete, -- which means the speed is about 10 lines
> per second.  On an old notebook I have, with some also
> nvidia card, the same operation completes in about
> 0.8 sec.
> 
> The lines goes up in a slow motion, I can watch every
> new line appearing and scrolling.
> 
> It was this way for a long time, and I almost gave up --
> in X everything works ok, and in order to speed up
> booting again I just added "quiet" option to the kernel
> command line, to avoid scrolling of kernel messages.
> 
> But yesterday I noticed something else entirely, which
> make me hope the problem actually _can_ be solved.
> 
> The thing is: that same scrolling becomes much faster
> when I "do something" else while it scrolls up.  First
> I noticed this when I wanted to switch to another vt
> while it were scrolling -- I held down Ctrl key on my
> keyboard, and out of the sudden the scroll speed up
> dramatically.
> 
> It turned out I can speed the thing to about 10 times
> by generating some load: hit and hold a key on the
> keyboard (generates interrupts?), run kernel compile
> in the background (generates disk interrupts?), move
> mouse...
> 
> While doing "something", the same scrolling completes
> in about 8 seconds instead of 2m30s.  Dramatic improvement.
> 
> Now, when I hold a key or move mouse, the scrolling
> is "jumpy" - sometimes it slows down back to original
> "slow" form for a bit, and sometimes it jumps a few
> lines in one go.
> 
> I tried to disable cpufreq (selecting "performance"
> governor) - this changes exactly nothing.
> 
> Next someone suggested the "perf" tool.  And this one
> is even more interesting: while `perf top' is running
> (on another console or X), the scrolling is.. fast
> again, as if I were moving my mouse!  Once I stop
> `perf top', it becomes slow again.  So the bug
> disappears while you watch it.
> 
> And there I'm stuck again.  I asked in #radeon, but
> there, Alex Deucher told me that he has no clue and
> that the behavour is weird (it is weird indeed).
> 
> Any hints on where to go from there are apprecated.
> 
> The hardware is an AMD780g-based motherboard with
> and Athlon CPU, I've seen the same behavour from
> many other similar boards.  Kernels - all up to
> the current 2.6.36.2, sine the old days when kms
> for radeon first appeared in staging.
> 
> I know kms/fbcon scrolling is slow on radeon because
> it uses completely unoptimized bitblt routines (even
> when the hw is pretty much capable of doing all that
> stuff internally).  But what I see here is something
> different - the 8 sec to scroll 2000 lines is the
> result from the un-optimal bitblt, not the 2m30s.
> 
> Thanks!
> 
> /mjt
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo at vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/


[PATCH 1/4] drm: fix headers to include linux/types.h

2010-12-01 Thread Andrew Morton
On Wed, 1 Dec 2010 17:54:18 +0100
Julien Cristau  wrote:

> On Wed, Dec  1, 2010 at 17:10:42 +0200, Alexander Shishkin wrote:
> 
> > For headers that get exported to userland and make use of u32 style
> > type names, it is advised to include linux/types.h.
> > 
> > This fixes 5 headers_check warnings.
> > 
> How many times does this need to be NAKed?

Until someone gets a clue and puts comments in there explaining this?


Re: [PATCH 1/4] drm: fix headers to include linux/types.h

2010-12-01 Thread Andrew Morton
On Wed, 1 Dec 2010 17:54:18 +0100
Julien Cristau jcris...@debian.org wrote:

 On Wed, Dec  1, 2010 at 17:10:42 +0200, Alexander Shishkin wrote:
 
  For headers that get exported to userland and make use of u32 style
  type names, it is advised to include linux/types.h.
  
  This fixes 5 headers_check warnings.
  
 How many times does this need to be NAKed?

Until someone gets a clue and puts comments in there explaining this?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH 1/5] Backlight: Add backlight type

2010-11-19 Thread Andrew Morton
On Fri, 19 Nov 2010 20:25:59 +
Matthew Garrett  wrote:

> On Fri, Nov 19, 2010 at 12:05:23PM -0800, Andrew Morton wrote:
> > On Fri, 19 Nov 2010 10:53:52 -0500
> > Matthew Garrett  wrote:
> > 
> > > There may be multiple ways of controlling the backlight on a given 
> > > machine.
> > > Allow drivers to expose the type of interface they are providing, making
> > > it possible for userspace to make appropriate policy decisions.
> > > 
> > > ...
> > >
> > >  60 files changed, 102 insertions(+), 0 deletions(-)
> > 
> > This patch has a pretty short half-life.
> 
> Well, ideally it would have landed in the backlight tree when I sent it 
> months ago. Then we'd have the opportunity to ensure that everything was 
> fixed up before it went in in the merge window.

Richard got distracted.  At present I'm grabbing the leds and backlight
patches and Richard is reviewing them as they fly past.

I don't see there's much point in me merging this patch series so if it
survives review, I'd suggest that you put it into an mjg tree and
thence into linux-next and mainline?

> > > @@ -62,6 +68,8 @@ struct backlight_properties {
> > >   /* FB Blanking active? (values as for power) */
> > >   /* Due to be removed, please use (state & BL_CORE_FBBLANK) */
> > >   int fb_blank;
> > > + /* Backlight type */
> > > + enum backlight_type type;
> > >   /* Flags used to signal drivers of state changes */
> > >   /* Upper 4 bits are reserved for driver internal use */
> > >   unsigned int state;
> > 
> > And if/when the half-life expires, we'll have drivers in-tree which
> > forget to set backlight_properties.type.  I haven't checked, but if
> > we're lucky they will default to "0".
> 
> Depends entirely on whether they kzalloc the structure or not before
> calling backlight_device_register(). 

Well.  Even if it's uninitialised, the chances of the value being 1, 2
or 3 for all users are pretty small, so we'll still get to hear about
it if the runtime check is appropriately implemented.

> > What will be the runtime effects upon such unconverted drivers? 
> > Ideally we'd like them to continue to work OK, and to emit a runtime
> > warning.  In which case you'll need BACKLIGHT_RAW=1 so the unconverted
> > driver can be detected, warned about and fixed up by the core code.
> 
> The worst case I can think of is that we walk off the array - I guess 
> there's an argument for sanity checking that in backlight_show_type().

OK, well please have a think about it, see what you can do to handle
unconverted (and possibly out-of-tree) drivers in a friendly fashion.



[PATCH 1/5] Backlight: Add backlight type

2010-11-19 Thread Andrew Morton
On Fri, 19 Nov 2010 10:53:52 -0500
Matthew Garrett  wrote:

> There may be multiple ways of controlling the backlight on a given machine.
> Allow drivers to expose the type of interface they are providing, making
> it possible for userspace to make appropriate policy decisions.
> 
> ...
>
>  60 files changed, 102 insertions(+), 0 deletions(-)

This patch has a pretty short half-life.

>
> ...
>
> --- a/include/linux/backlight.h
> +++ b/include/linux/backlight.h
> @@ -32,6 +32,12 @@ enum backlight_update_reason {
>   BACKLIGHT_UPDATE_SYSFS,
>  };
>  
> +enum backlight_type {
> + BACKLIGHT_RAW,
> + BACKLIGHT_PLATFORM,
> + BACKLIGHT_FIRMWARE,
> +};
> +
>  struct backlight_device;
>  struct fb_info;
>  
> @@ -62,6 +68,8 @@ struct backlight_properties {
>   /* FB Blanking active? (values as for power) */
>   /* Due to be removed, please use (state & BL_CORE_FBBLANK) */
>   int fb_blank;
> + /* Backlight type */
> + enum backlight_type type;
>   /* Flags used to signal drivers of state changes */
>   /* Upper 4 bits are reserved for driver internal use */
>   unsigned int state;

And if/when the half-life expires, we'll have drivers in-tree which
forget to set backlight_properties.type.  I haven't checked, but if
we're lucky they will default to "0".

What will be the runtime effects upon such unconverted drivers? 
Ideally we'd like them to continue to work OK, and to emit a runtime
warning.  In which case you'll need BACKLIGHT_RAW=1 so the unconverted
driver can be detected, warned about and fixed up by the core code.



Re: [PATCH 1/5] Backlight: Add backlight type

2010-11-19 Thread Andrew Morton
On Fri, 19 Nov 2010 10:53:52 -0500
Matthew Garrett m...@redhat.com wrote:

 There may be multiple ways of controlling the backlight on a given machine.
 Allow drivers to expose the type of interface they are providing, making
 it possible for userspace to make appropriate policy decisions.
 
 ...

  60 files changed, 102 insertions(+), 0 deletions(-)

This patch has a pretty short half-life.


 ...

 --- a/include/linux/backlight.h
 +++ b/include/linux/backlight.h
 @@ -32,6 +32,12 @@ enum backlight_update_reason {
   BACKLIGHT_UPDATE_SYSFS,
  };
  
 +enum backlight_type {
 + BACKLIGHT_RAW,
 + BACKLIGHT_PLATFORM,
 + BACKLIGHT_FIRMWARE,
 +};
 +
  struct backlight_device;
  struct fb_info;
  
 @@ -62,6 +68,8 @@ struct backlight_properties {
   /* FB Blanking active? (values as for power) */
   /* Due to be removed, please use (state  BL_CORE_FBBLANK) */
   int fb_blank;
 + /* Backlight type */
 + enum backlight_type type;
   /* Flags used to signal drivers of state changes */
   /* Upper 4 bits are reserved for driver internal use */
   unsigned int state;

And if/when the half-life expires, we'll have drivers in-tree which
forget to set backlight_properties.type.  I haven't checked, but if
we're lucky they will default to 0.

What will be the runtime effects upon such unconverted drivers? 
Ideally we'd like them to continue to work OK, and to emit a runtime
warning.  In which case you'll need BACKLIGHT_RAW=1 so the unconverted
driver can be detected, warned about and fixed up by the core code.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH 1/5] Backlight: Add backlight type

2010-11-19 Thread Andrew Morton
On Fri, 19 Nov 2010 20:25:59 +
Matthew Garrett mj...@srcf.ucam.org wrote:

 On Fri, Nov 19, 2010 at 12:05:23PM -0800, Andrew Morton wrote:
  On Fri, 19 Nov 2010 10:53:52 -0500
  Matthew Garrett m...@redhat.com wrote:
  
   There may be multiple ways of controlling the backlight on a given 
   machine.
   Allow drivers to expose the type of interface they are providing, making
   it possible for userspace to make appropriate policy decisions.
   
   ...
  
60 files changed, 102 insertions(+), 0 deletions(-)
  
  This patch has a pretty short half-life.
 
 Well, ideally it would have landed in the backlight tree when I sent it 
 months ago. Then we'd have the opportunity to ensure that everything was 
 fixed up before it went in in the merge window.

Richard got distracted.  At present I'm grabbing the leds and backlight
patches and Richard is reviewing them as they fly past.

I don't see there's much point in me merging this patch series so if it
survives review, I'd suggest that you put it into an mjg tree and
thence into linux-next and mainline?

   @@ -62,6 +68,8 @@ struct backlight_properties {
 /* FB Blanking active? (values as for power) */
 /* Due to be removed, please use (state  BL_CORE_FBBLANK) */
 int fb_blank;
   + /* Backlight type */
   + enum backlight_type type;
 /* Flags used to signal drivers of state changes */
 /* Upper 4 bits are reserved for driver internal use */
 unsigned int state;
  
  And if/when the half-life expires, we'll have drivers in-tree which
  forget to set backlight_properties.type.  I haven't checked, but if
  we're lucky they will default to 0.
 
 Depends entirely on whether they kzalloc the structure or not before
 calling backlight_device_register(). 

Well.  Even if it's uninitialised, the chances of the value being 1, 2
or 3 for all users are pretty small, so we'll still get to hear about
it if the runtime check is appropriately implemented.

  What will be the runtime effects upon such unconverted drivers? 
  Ideally we'd like them to continue to work OK, and to emit a runtime
  warning.  In which case you'll need BACKLIGHT_RAW=1 so the unconverted
  driver can be detected, warned about and fixed up by the core code.
 
 The worst case I can think of is that we walk off the array - I guess 
 there's an argument for sanity checking that in backlight_show_type().

OK, well please have a think about it, see what you can do to handle
unconverted (and possibly out-of-tree) drivers in a friendly fashion.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH RESEND] drm: include missing types header to drm_mode.h

2010-10-22 Thread Andrew Morton
On Fri, 22 Oct 2010 10:13:19 -0300
Davidlohr Bueso  wrote:

> drm: include missing types header to drm_mode.h
> 
> Signed-off-by: Davidlohr Bueso 
> ---
>  include/drm/drm_mode.h |2 ++
>  1 files changed, 2 insertions(+), 0 deletions(-)
> 
> diff --git a/include/drm/drm_mode.h b/include/drm/drm_mode.h
> index 0fc7397..eddd7f4 100644
> --- a/include/drm/drm_mode.h
> +++ b/include/drm/drm_mode.h
> @@ -24,6 +24,8 @@
>   * IN THE SOFTWARE.
>   */
>  
> +#include 
> +
>  #ifndef _DRM_MODE_H
>  #define _DRM_MODE_H
>  

Does this fix a build error?  If so, please send along the compiler
error output.



Re: [PATCH RESEND] drm: include missing types header to drm_mode.h

2010-10-22 Thread Andrew Morton
On Fri, 22 Oct 2010 10:13:19 -0300
Davidlohr Bueso d...@gnu.org wrote:

 drm: include missing types header to drm_mode.h
 
 Signed-off-by: Davidlohr Bueso d...@gnu.org
 ---
  include/drm/drm_mode.h |2 ++
  1 files changed, 2 insertions(+), 0 deletions(-)
 
 diff --git a/include/drm/drm_mode.h b/include/drm/drm_mode.h
 index 0fc7397..eddd7f4 100644
 --- a/include/drm/drm_mode.h
 +++ b/include/drm/drm_mode.h
 @@ -24,6 +24,8 @@
   * IN THE SOFTWARE.
   */
  
 +#include linux/types.h
 +
  #ifndef _DRM_MODE_H
  #define _DRM_MODE_H
  

Does this fix a build error?  If so, please send along the compiler
error output.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


[PATCH] drm: Prune GEM vma entries

2010-09-27 Thread Andrew Morton
That was quick, thanks.

On Mon, 27 Sep 2010 21:08:36 +0100
Chris Wilson  wrote:

> Hook the GEM vm open/close ops into the generic drm vm open/close so
> that the vma entries are created and destroy appropriately.

Please update the changelog to indicate that it fixes a memory leak.

> Reported-by: Matt Mackall 
> Cc: Dave Airlie 
> Cc: Jesse Barnes 
> Signed-off-by: Chris Wilson 

And here please add

Cc: 

so that earlier kernels get reliably fixed.

Thanks.




DRM-related kmalloc-32 memory leak in 2.6.35

2010-09-27 Thread Andrew Morton
On Wed, 25 Aug 2010 15:59:09 -0500
Matt Mackall  wrote:

> On Tue, 2010-08-24 at 10:37 -0500, Christoph Lameter wrote:
> > On Tue, 24 Aug 2010, Matt Mackall wrote:
> > 
> > > kmalloc-321113344 1113344 32  1281 : tunables00
> > > 0 : slabdata   8698   8698  0
> > >
> > > That's /proc/slabinfo on my laptop with SLUB. It looks like my last
> > > reboot popped me back to 2.6.33 so it may also be old news, but I
> > > couldn't spot any reports with Google.
> > 
> > Boot with "slub_debug" as a kernel parameter
> > 
> > and then do a
> > 
> > cat /sys/kernel/slab/kmalloc-32/alloc_calls
> > 
> > to find the caller allocating the objets.
> 
> Still present in 2.6.35. Appears to be DRM:
> 
> 845 drm_vm_open_locked+0x72/0x109 age=43/37572/59269 pid=2089
> cpus=0-1
> 
> That's after about a minute of uptime. Grows to 100k in about a day.
> 
> dmesg bits:
> [0.834653] [drm] Initialized drm 1.1.0 20060810
> [0.834986] pci :00:02.0: PCI INT A -> GSI 16 (level, low) -> IRQ 16
> [0.834995] pci :00:02.0: setting latency timer to 64
> [1.002572] mtrr: type mismatch for e000,1000 old: write-back new: 
> write-combining
> [1.002580] [drm] MTRR allocation failed.  Graphics performance may suffer.
> [1.019880] acpi device:03: registered as cooling_device2
> [1.021520] input: Video Bus as 
> /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input3
> [1.021543] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
> [1.021855] [drm] Initialized i915 1.6.0 20080730 for :00:02.0 on 
> minor 0
> 
> This is with:
> 
> 00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960
> Integrated Graphics Controller (rev 03)
> 00:02.1 Display controller: Intel Corporation Mobile GM965/GL960
> Integrated Graphics Controller (rev 03)
> 

Matt tells me that this (drop-dead box-killing!) bug is still present
in current kernels.  Could someone take a look please?

The code seems very simple, and I couldn't spot a problem.  Probably
this means that I'm even simpler.



Re: DRM-related kmalloc-32 memory leak in 2.6.35

2010-09-27 Thread Andrew Morton
On Wed, 25 Aug 2010 15:59:09 -0500
Matt Mackall m...@selenic.com wrote:

 On Tue, 2010-08-24 at 10:37 -0500, Christoph Lameter wrote:
  On Tue, 24 Aug 2010, Matt Mackall wrote:
  
   kmalloc-321113344 1113344 32  1281 : tunables00
   0 : slabdata   8698   8698  0
  
   That's /proc/slabinfo on my laptop with SLUB. It looks like my last
   reboot popped me back to 2.6.33 so it may also be old news, but I
   couldn't spot any reports with Google.
  
  Boot with slub_debug as a kernel parameter
  
  and then do a
  
  cat /sys/kernel/slab/kmalloc-32/alloc_calls
  
  to find the caller allocating the objets.
 
 Still present in 2.6.35. Appears to be DRM:
 
 845 drm_vm_open_locked+0x72/0x109 age=43/37572/59269 pid=2089
 cpus=0-1
 
 That's after about a minute of uptime. Grows to 100k in about a day.
 
 dmesg bits:
 [0.834653] [drm] Initialized drm 1.1.0 20060810
 [0.834986] pci :00:02.0: PCI INT A - GSI 16 (level, low) - IRQ 16
 [0.834995] pci :00:02.0: setting latency timer to 64
 [1.002572] mtrr: type mismatch for e000,1000 old: write-back new: 
 write-combining
 [1.002580] [drm] MTRR allocation failed.  Graphics performance may suffer.
 [1.019880] acpi device:03: registered as cooling_device2
 [1.021520] input: Video Bus as 
 /devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input3
 [1.021543] ACPI: Video Device [GFX0] (multi-head: yes  rom: no  post: no)
 [1.021855] [drm] Initialized i915 1.6.0 20080730 for :00:02.0 on 
 minor 0
 
 This is with:
 
 00:02.0 VGA compatible controller: Intel Corporation Mobile GM965/GL960
 Integrated Graphics Controller (rev 03)
 00:02.1 Display controller: Intel Corporation Mobile GM965/GL960
 Integrated Graphics Controller (rev 03)
 

Matt tells me that this (drop-dead box-killing!) bug is still present
in current kernels.  Could someone take a look please?

The code seems very simple, and I couldn't spot a problem.  Probably
this means that I'm even simpler.

___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


Re: [PATCH] drm: Prune GEM vma entries

2010-09-27 Thread Andrew Morton
That was quick, thanks.

On Mon, 27 Sep 2010 21:08:36 +0100
Chris Wilson ch...@chris-wilson.co.uk wrote:

 Hook the GEM vm open/close ops into the generic drm vm open/close so
 that the vma entries are created and destroy appropriately.

Please update the changelog to indicate that it fixes a memory leak.

 Reported-by: Matt Mackall m...@selenic.com
 Cc: Dave Airlie airl...@redhat.com
 Cc: Jesse Barnes jbar...@virtuousgeek.org
 Signed-off-by: Chris Wilson ch...@chris-wilson.co.uk

And here please add

Cc: sta...@kernel.org

so that earlier kernels get reliably fixed.

Thanks.


___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


idr_remove called for id=0 which is not allocated

2010-09-22 Thread Andrew Morton
On Wed, 22 Sep 2010 23:10:10 +0200 Alessandro Guido  wrote:

> I have this traces in my logs (full dmesg attached).
> 
> idr_remove called for id=0 which is not allocated.
> Pid: 1136, comm: Xorg Not tainted 2.6.36-rc5-49-gc79bd89 #1
> Call Trace:
>   [] ? printk+0x18/0x1a
>   [] idr_remove+0x73/0x1c0
>   [] drm_mode_object_put+0x2f/0x50
>   [] drm_mode_destroy+0xe/0x20
>   [] nouveau_connector_get_modes+0x2b/0x390
>   [] ? acpi_lid_open+0x22/0x3c
>   [] ? queue_delayed_work+0x1b/0x30
>   [] drm_helper_probe_single_connector_modes+0xc4/0x360
>   [] drm_mode_getconnector+0x2a7/0x350
>   [] drm_ioctl+0x1c2/0x4b0
>   [] ? filemap_fault+0x81/0x3c0
>   [] ? drm_mode_getconnector+0x0/0x350
>   [] ? handle_mm_fault+0x13f/0x670
>   [] ? drm_ioctl+0x0/0x4b0
>   [] do_vfs_ioctl+0x7d/0x5f0
>   [] ? do_page_fault+0x17c/0x3c0
>   [] ? vfs_write+0xfd/0x140
>   [] ? do_sync_write+0x0/0xe0
>   [] sys_ioctl+0x39/0x60
>   [] sysenter_do_call+0x12/0x26

I assume this is a regression.  2.6.35 didn't do this?


Re: idr_remove called for id=0 which is not allocated

2010-09-22 Thread Andrew Morton
On Wed, 22 Sep 2010 23:10:10 +0200 Alessandro Guido a...@alessandroguido.name 
wrote:

 I have this traces in my logs (full dmesg attached).
 
 idr_remove called for id=0 which is not allocated.
 Pid: 1136, comm: Xorg Not tainted 2.6.36-rc5-49-gc79bd89 #1
 Call Trace:
   [c1379e16] ? printk+0x18/0x1a
   [c113b8f3] idr_remove+0x73/0x1c0
   [c11b8d6f] drm_mode_object_put+0x2f/0x50
   [c11b8f6e] drm_mode_destroy+0xe/0x20
   [c11eb24b] nouveau_connector_get_modes+0x2b/0x390
   [c1185b6f] ? acpi_lid_open+0x22/0x3c
   [c103800b] ? queue_delayed_work+0x1b/0x30
   [c11abf34] drm_helper_probe_single_connector_modes+0xc4/0x360
   [c11bb6a7] drm_mode_getconnector+0x2a7/0x350
   [c11b00d2] drm_ioctl+0x1c2/0x4b0
   [c1068071] ? filemap_fault+0x81/0x3c0
   [c11bb400] ? drm_mode_getconnector+0x0/0x350
   [c107bedf] ? handle_mm_fault+0x13f/0x670
   [c11aff10] ? drm_ioctl+0x0/0x4b0
   [c109c90d] do_vfs_ioctl+0x7d/0x5f0
   [c101a14c] ? do_page_fault+0x17c/0x3c0
   [c108fc6d] ? vfs_write+0xfd/0x140
   [c108f1c0] ? do_sync_write+0x0/0xe0
   [c109ceb9] sys_ioctl+0x39/0x60
   [c1002b90] sysenter_do_call+0x12/0x26

I assume this is a regression.  2.6.35 didn't do this?
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel


  1   2   >