[Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE
drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes. I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE. Also kill a copypasted spurious space in both functions while at it. v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed. Cc: linux...@kvack.org Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/gpu/drm/i915/i915_gem.c|4 +- drivers/gpu/drm/i915/i915_gem_execbuffer.c |2 +- fs/pipe.c |4 +- fs/splice.c|2 +- include/linux/pagemap.h| 39 ++- mm/filemap.c |4 +- 6 files changed, 34 insertions(+), 21 deletions(-) diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c index 544e528..9b200f4e 100644 --- a/drivers/gpu/drm/i915/i915_gem.c +++ b/drivers/gpu/drm/i915/i915_gem.c @@ -436,7 +436,7 @@ i915_gem_shmem_pread(struct drm_device *dev, mutex_unlock(dev-struct_mutex); if (!prefaulted) { - ret = fault_in_pages_writeable(user_data, remain); + ret = fault_in_pages_writeable(user_data, remain, true); /* Userspace is tricking us, but we've already clobbered * its pages with the prefault and promised to write the * data up to the first fault. Hence ignore any errors @@ -823,7 +823,7 @@ i915_gem_pwrite_ioctl(struct drm_device *dev, void *data, return -EFAULT; ret = fault_in_pages_readable((char __user *)(uintptr_t)args-data_ptr, - args-size); + args-size, true); if (ret) return -EFAULT; diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c index 81687af..5f0b685 100644 --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c @@ -955,7 +955,7 @@ validate_exec_list(struct drm_i915_gem_exec_object2 *exec, if (!access_ok(VERIFY_WRITE, ptr, length)) return -EFAULT; - if (fault_in_pages_readable(ptr, length)) + if (fault_in_pages_readable(ptr, length, true)) return -EFAULT; } diff --git a/fs/pipe.c b/fs/pipe.c index a932ced..b29f71c 100644 --- a/fs/pipe.c +++ b/fs/pipe.c @@ -167,7 +167,7 @@ static int iov_fault_in_pages_write(struct iovec *iov, unsigned long len) unsigned long this_len; this_len = min_t(unsigned long, len, iov-iov_len); - if (fault_in_pages_writeable(iov-iov_base, this_len)) + if (fault_in_pages_writeable(iov-iov_base, this_len, false)) break; len -= this_len; @@ -189,7 +189,7 @@ static void iov_fault_in_pages_read(struct iovec *iov, unsigned long len) unsigned long this_len; this_len = min_t(unsigned long, len, iov-iov_len); - fault_in_pages_readable(iov-iov_base, this_len); + fault_in_pages_readable(iov-iov_base, this_len, false); len -= this_len; iov++; } diff --git a/fs/splice.c b/fs/splice.c index 1ec0493..e919d78 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -1491,7 +1491,7 @@ static int pipe_to_user(struct pipe_inode_info *pipe, struct pipe_buffer *buf, * See if we can use the atomic maps, by prefaulting in the * pages and doing an atomic copy */ - if (!fault_in_pages_writeable(sd-u.userptr, sd-len)) { + if (!fault_in_pages_writeable(sd-u.userptr, sd-len, false)) { src = buf-ops-map(pipe, buf, 1); ret = __copy_to_user_inatomic(sd-u.userptr, src + buf-offset, sd-len); diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h index cfaaa69..60ac5c5 100644 --- a/include/linux/pagemap.h +++ b/include/linux/pagemap.h @@ -403,11 +403,14 @@ extern void add_page_wait_queue(struct page *page, wait_queue_t *waiter); * Fault a userspace page into pagetables. Return non-zero on a fault. * * This assumes that two userspace pages are always sufficient. That's - * not true if PAGE_CACHE_SIZE PAGE_SIZE. + * not true if PAGE_CACHE_SIZE PAGE_SIZE. If more than PAGE_SIZE needs to be + * prefaulted, set multipage to true. */ -static inline int fault_in_pages_writeable(char __user *uaddr, int size) +static inline int
Re: [Intel-gfx] HDMI colour space and depth questions (YCbCr, xvYCC, Deep Colour)
On 28 February 2012 17:59, Jesse Barnes jbar...@virtuousgeek.org wrote: There are several places we need to set extended vs normal range: DSP*CNTR (bit 25) PIPE*CONF (bits 26 and 13) TRANS*CONF (bit 10, for xvYCC DP configs) DVS*CNTR (for sprites, bit 21) Okay - good learning exercise for me this! So by default, latest 3.3 kernel (github / master) upon boot xrandr reports Broadcast RGB as Full. For that intel_reg_read reports the following (hope I'm looking at the correct registers): intel_reg_read 0x70008 (PIPEACONF): 0xC000 / bits 26 and 13 aren't set intel_reg_read 0x70180 (DSPACNTR) : 0xD8004400 / bit 25 isn't set Okay - so just as an experiment, running xrandr -d :0 --output HDMI3 --set Broadcast RGB Limited 16-235 and rerunning the above gives exactly the same result, i.e. no bits set. So trying intel_reg_write with the following commands (hope my binary is correct!): intel_reg_write 0x70008 0xC4002000 intel_reg_write 0x70180 0xDA004400 Seems to have the desired effect - that is video seems to have the correct colour range - this is by eye since my TV doesn't seem to report the actual input range anywhere. Running just the first command seems to raise the brightness/decrease the contrast (or just raise gamma - not sure) - the second brings it back down - so both as you say are needed. Changing refresh seems to knock out the effect of the second command. Re-running that second command fixes it once more. I've posted on the XBMC forum in the hope others can try this to see what effect it has. Yeah defaulting to the limited range for TVs may make sense. Paulo has been looking at TV detection and HDMI infoframes a bit, so may be able to whip up a patch. That would be superb if possible since it removes all faffing about with xorg or whatever. Many thanks for this. Paul ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 1/3] intel_reg_read: add support for getopt
This will allow us to pass more options to it in the future. v2: fix whitespacing issues and improve scary warning text as suggested by Paul Menzel. Reviewed-by: Paul Menzel paulepan...@users.sourceforge.net Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com --- tools/intel_reg_read.c | 44 +++- 1 file changed, 35 insertions(+), 9 deletions(-) diff --git a/tools/intel_reg_read.c b/tools/intel_reg_read.c index cad30ff..95d28c5 100644 --- a/tools/intel_reg_read.c +++ b/tools/intel_reg_read.c @@ -41,21 +41,45 @@ static void dump_range(uint32_t start, uint32_t end) *(volatile uint32_t *)((volatile char*)mmio + i)); } +static void usage(char *cmdname) +{ + printf(Usage: %s [-f | addr]\n, cmdname); + printf(\t -f : read back full range of registers.\n); + printf(\t WARNING! This option may result in a machine hang!\n); + printf(\t addr : in 0x format\n); +} + int main(int argc, char** argv) { + int ret = 0; uint32_t reg; + int ch; + char *cmdname = strdup(argv[0]); + int full_dump = 0; + + while ((ch = getopt(argc, argv, fh)) != -1) { + switch(ch) { + case 'f': + full_dump = 1; + break; + case 'h': + usage(cmdname); + ret = 1; + goto out; + } + } + argc -= optind; + argv += optind; - if (argc != 2) { - printf(Usage: %s [-f | addr]\n, argv[0]); - printf(\t -f : read back full range of registers.\n); - printf(\t WARNING! This could be danger to hang the machine!\n); - printf(\t addr : in 0x format\n); - exit(1); + if (argc != 1) { + usage(cmdname); + ret = 1; + goto out; } intel_register_access_init(intel_get_pci_device(), 0); - if (!strcmp(argv[1], -f)) { + if (full_dump) { dump_range(0x0, 0x00fff); /* VGA registers */ dump_range(0x02000, 0x02fff); /* instruction, memory, interrupt control registers */ dump_range(0x03000, 0x031ff); /* FENCE and PPGTT control registers */ @@ -71,12 +95,14 @@ int main(int argc, char** argv) dump_range(0x7, 0x72fff); /* display and cursor registers */ dump_range(0x73000, 0x73fff); /* performance counters */ } else { - sscanf(argv[1], 0x%x, reg); + sscanf(argv[0], 0x%x, reg); dump_range(reg, reg + 4); } intel_register_access_fini(); - return 0; +out: + free(cmdname); + return ret; } -- 1.7.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 2/3] intel_reg_read: support reading multiple registers
The registers must be passed on the command line and will be read sequentially, one at a time. Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com --- tools/intel_reg_read.c | 12 +++- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/tools/intel_reg_read.c b/tools/intel_reg_read.c index 95d28c5..5f94fed 100644 --- a/tools/intel_reg_read.c +++ b/tools/intel_reg_read.c @@ -43,7 +43,7 @@ static void dump_range(uint32_t start, uint32_t end) static void usage(char *cmdname) { - printf(Usage: %s [-f | addr]\n, cmdname); + printf(Usage: %s [-f] [addr1] [addr2] .. [addrN]\n, cmdname); printf(\t -f : read back full range of registers.\n); printf(\t WARNING! This option may result in a machine hang!\n); printf(\t addr : in 0x format\n); @@ -53,7 +53,7 @@ int main(int argc, char** argv) { int ret = 0; uint32_t reg; - int ch; + int i, ch; char *cmdname = strdup(argv[0]); int full_dump = 0; @@ -71,7 +71,7 @@ int main(int argc, char** argv) argc -= optind; argv += optind; - if (argc != 1) { + if (argc 1) { usage(cmdname); ret = 1; goto out; @@ -95,8 +95,10 @@ int main(int argc, char** argv) dump_range(0x7, 0x72fff); /* display and cursor registers */ dump_range(0x73000, 0x73fff); /* performance counters */ } else { - sscanf(argv[0], 0x%x, reg); - dump_range(reg, reg + 4); + for (i=0; i argc; i++) { + sscanf(argv[i], 0x%x, reg); + dump_range(reg, reg + 4); + } } intel_register_access_fini(); -- 1.7.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [PATCH 3/3] intel_reg_read: add a flag to simplify bit decoding
This allows to specify '-d' parameter which will decode individual bits in each register being read. The register bits are printed horizontally for space reasons. This requires more than 80x25 terminal to see them all. An alternative solution would be to print them vertically, but this will become much more difficult to read when printing multiple registers at the same time. v2: fix spacing to get us a bit closer to the code nirvana. Signed-off-by: Eugeni Dodonov eugeni.dodo...@intel.com --- tools/intel_reg_read.c | 25 +++-- 1 file changed, 23 insertions(+), 2 deletions(-) diff --git a/tools/intel_reg_read.c b/tools/intel_reg_read.c index 5f94fed..6187a4b 100644 --- a/tools/intel_reg_read.c +++ b/tools/intel_reg_read.c @@ -32,6 +32,19 @@ #include string.h #include intel_gpu_tools.h +static void bit_decode(uint32_t reg) +{ + int i; + + for (i=31; i = 0; i--) + printf( %2d, i); + printf(\n); + + for (i=31; i = 0; i--) + printf( %2d, (reg (1 i)) 1); + printf(\n); +} + static void dump_range(uint32_t start, uint32_t end) { int i; @@ -43,9 +56,10 @@ static void dump_range(uint32_t start, uint32_t end) static void usage(char *cmdname) { - printf(Usage: %s [-f] [addr1] [addr2] .. [addrN]\n, cmdname); + printf(Usage: %s [-f|-d] [addr1] [addr2] .. [addrN]\n, cmdname); printf(\t -f : read back full range of registers.\n); printf(\t WARNING! This option may result in a machine hang!\n); + printf(\t -d : decode register bits.\n); printf(\t addr : in 0x format\n); } @@ -56,9 +70,13 @@ int main(int argc, char** argv) int i, ch; char *cmdname = strdup(argv[0]); int full_dump = 0; + int decode_bits = 0; - while ((ch = getopt(argc, argv, fh)) != -1) { + while ((ch = getopt(argc, argv, dfh)) != -1) { switch(ch) { + case 'd': + decode_bits = 1; + break; case 'f': full_dump = 1; break; @@ -98,6 +116,9 @@ int main(int argc, char** argv) for (i=0; i argc; i++) { sscanf(argv[i], 0x%x, reg); dump_range(reg, reg + 4); + + if (decode_bits) + bit_decode(*(volatile uint32_t *)((volatile char*)mmio + reg)); } } -- 1.7.9.2 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] HDMI colour space and depth questions (YCbCr, xvYCC, Deep Colour)
On Wed, 29 Feb 2012 15:38:44 + Paul Owen p...@starstreak.net wrote: On 28 February 2012 17:59, Jesse Barnes jbar...@virtuousgeek.org wrote: There are several places we need to set extended vs normal range: DSP*CNTR (bit 25) PIPE*CONF (bits 26 and 13) TRANS*CONF (bit 10, for xvYCC DP configs) DVS*CNTR (for sprites, bit 21) Okay - good learning exercise for me this! So by default, latest 3.3 kernel (github / master) upon boot xrandr reports Broadcast RGB as Full. For that intel_reg_read reports the following (hope I'm looking at the correct registers): intel_reg_read 0x70008 (PIPEACONF): 0xC000 / bits 26 and 13 aren't set intel_reg_read 0x70180 (DSPACNTR) : 0xD8004400 / bit 25 isn't set Okay - so just as an experiment, running xrandr -d :0 --output HDMI3 --set Broadcast RGB Limited 16-235 and rerunning the above gives exactly the same result, i.e. no bits set. So trying intel_reg_write with the following commands (hope my binary is correct!): intel_reg_write 0x70008 0xC4002000 intel_reg_write 0x70180 0xDA004400 Seems to have the desired effect - that is video seems to have the correct colour range - this is by eye since my TV doesn't seem to report the actual input range anywhere. Running just the first command seems to raise the brightness/decrease the contrast (or just raise gamma - not sure) - the second brings it back down - so both as you say are needed. Changing refresh seems to knock out the effect of the second command. Re-running that second command fixes it once more. I've posted on the XBMC forum in the hope others can try this to see what effect it has. Yeah defaulting to the limited range for TVs may make sense. Paulo has been looking at TV detection and HDMI infoframes a bit, so may be able to whip up a patch. That would be superb if possible since it removes all faffing about with xorg or whatever. Many thanks for this. Great, glad it wasn't hiding anywhere more obscure. Can you file a bug with the findings above just so we don't lose it? I expect the fix should be pretty easy, but I don't want to lose track of this. Thanks, Jesse ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/nouveau: do a better job at hiding the NIH i2c bit-banging algo
On Tue, Feb 28, 2012 at 12:42:19AM +0100, Daniel Vetter wrote: I'd like to export the corresponding functions from the i2c core so that I can use them in fallback bit-banging in i915.ko v2: Adapt to new i2c export patch. Cc: nouv...@lists.freedesktop.org Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch I've applied this patch to d-i-n with Dave's irc-acked. -Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] i2c: export bit-banging algo functions
On Tue, Feb 28, 2012 at 09:08:17AM +0100, Jean Delvare wrote: On Tue, 28 Feb 2012 00:39:39 +0100, Daniel Vetter wrote: i915 has a hw i2c controller (gmbus) but for a bunch of stupid reasons we need to be able to fall back to the bit-banging algo on gpio pins. The current code sets up a 2nd i2c controller for the same i2c bus using the bit-banging algo. This has a bunch of issues, the major one being that userspace can directly access this fallback i2c adaptor behind the drivers back. But we need to frob a few registers before and after using fallback gpio bit-banging, so this horribly fails. The new plan is to only set up one i2c adaptor and transparently fall back to bit-banging by directly calling the xfer function of the bit- banging algo in the i2c core. To make that possible, export the 2 i2c algo functions. v2: As suggested by Jean Delvare, simply export the i2c_bit_algo vtable instead of the individual functions. Signed-off-by: Daniel Vetter daniel.vet...@ffwll.ch --- drivers/i2c/algos/i2c-algo-bit.c |3 ++- include/linux/i2c-algo-bit.h |1 + 2 files changed, 3 insertions(+), 1 deletions(-) diff --git a/drivers/i2c/algos/i2c-algo-bit.c b/drivers/i2c/algos/i2c-algo-bit.c index 525c734..ad0459c 100644 --- a/drivers/i2c/algos/i2c-algo-bit.c +++ b/drivers/i2c/algos/i2c-algo-bit.c @@ -610,10 +610,11 @@ static u32 bit_func(struct i2c_adapter *adap) /* -exported algorithm data: - */ -static const struct i2c_algorithm i2c_bit_algo = { +const struct i2c_algorithm i2c_bit_algo = { .master_xfer= bit_xfer, .functionality = bit_func, }; +EXPORT_SYMBOL(i2c_bit_algo); /* * registering functions to load algorithms at runtime diff --git a/include/linux/i2c-algo-bit.h b/include/linux/i2c-algo-bit.h index 4f98148..584ffa0 100644 --- a/include/linux/i2c-algo-bit.h +++ b/include/linux/i2c-algo-bit.h @@ -49,5 +49,6 @@ struct i2c_algo_bit_data { int i2c_bit_add_bus(struct i2c_adapter *); int i2c_bit_add_numbered_bus(struct i2c_adapter *); +extern const struct i2c_algorithm i2c_bit_algo; #endif /* _LINUX_I2C_ALGO_BIT_H */ Acked-by: Jean Delvare kh...@linux-fr.org I've applied this patch to drm-intel-next, thanks for the feedback. -Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH 1/7] drm/i915: add dev_priv to intel_gmbus
On Mon, Feb 27, 2012 at 02:53:37PM -0300, Eugeni Dodonov wrote: On Tue, Feb 14, 2012 at 19:37, Daniel Vetter daniel.vet...@ffwll.ch wrote: This way we can free up the bus-adaptor.algo_data pointer and make it available for use with the bitbanging fallback algo. Signed-Off-by: Daniel Vetter daniel.vet...@ffwll.ch Reviewed-by: Eugeni Dodonov eugeni.dodo...@intel.com I've slurped this series into -next, thanks for reviewing. -Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE
On Wed, 29 Feb 2012 15:03:31 +0100 Daniel Vetter daniel.vet...@ffwll.ch wrote: drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes. I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE. Also kill a copypasted spurious space in both functions while at it. v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed. I don't think this produced a very good result :( ... -static inline int fault_in_pages_writeable(char __user *uaddr, int size) +static inline int fault_in_pages_writeable(char __user *uaddr, int size, +bool multipage) { int ret; + char __user *end = uaddr + size - 1; if (unlikely(size == 0)) return 0; @@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */ - ret = __put_user(0, uaddr); - if (ret == 0) { - char __user *end = uaddr + size - 1; + do { + ret = __put_user(0, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } while (multipage uaddr = end); + if (ret == 0) { /* * If the page was already mapped, this will get a cache miss * for sure, so try to avoid doing it. */ - if (((unsigned long)uaddr PAGE_MASK) != + if (((unsigned long)uaddr PAGE_MASK) == ((unsigned long)end PAGE_MASK)) - ret = __put_user(0, end); + ret = __put_user(0, end); } return ret; } One effect of this change for the filemap.c callsite is that `uaddr' now gets incremented by PAGE_SIZE. That happens to not break anything because we then mask `uaddr' with PAGE_MASK, and if gcc were really smart, it could remove that addition. But it's a bit ugly. Ideally the patch would have no effect upon filemap.o size, but with an allmodconfig config I'm seeing textdata bss dec hex filename 22876 1187344 303387682 mm/filemap.o(before) 22925 1187392 3043576e3 mm/filemap.o(after) so we are adding read()/write() overhead, and bss mysteriously got larger. Can we improve on this? Even if it's some dumb static inline int fault_in_pages_writeable(char __user *uaddr, int size, bool multipage) { if (multipage) { do-this } else { do-that } } the code duplication between do-this and do-that is regrettable, but at least it's all in the same place in the same file, so we won't accidentally introduce skew later on. Alternatively, add a separate fault_in_multi_pages_writeable() to pagemap.h. I have a bad feeling this is what your original patch did! (But we *should* be able to make this work! Why did this version of the patch go so wrong?) ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE
On Wed, Feb 29, 2012 at 03:01:46PM -0800, Andrew Morton wrote: On Wed, 29 Feb 2012 15:03:31 +0100 Daniel Vetter daniel.vet...@ffwll.ch wrote: drm/i915 wants to read/write more than one page in its fastpath and hence needs to prefault more than PAGE_SIZE bytes. I've checked the callsites and they all already clamp size when calling fault_in_pages_* to the same as for the subsequent __copy_to|from_user and hence don't rely on the implicit clamping to PAGE_SIZE. Also kill a copypasted spurious space in both functions while at it. v2: As suggested by Andrew Morton, add a multipage parameter to both functions to avoid the additional branch for the pagemap.c hotpath. My gcc 4.6 here seems to dtrt and indeed reap these branches where not needed. I don't think this produced a very good result :( And I halfway expected this mail here ;-) ... -static inline int fault_in_pages_writeable(char __user *uaddr, int size) +static inline int fault_in_pages_writeable(char __user *uaddr, int size, + bool multipage) { int ret; + char __user *end = uaddr + size - 1; if (unlikely(size == 0)) return 0; @@ -416,36 +419,46 @@ static inline int fault_in_pages_writeable(char __user *uaddr, int size) * Writing zeroes into userspace here is OK, because we know that if * the zero gets there, we'll be overwriting it. */ - ret = __put_user(0, uaddr); - if (ret == 0) { - char __user *end = uaddr + size - 1; + do { + ret = __put_user(0, uaddr); + if (ret != 0) + return ret; + uaddr += PAGE_SIZE; + } while (multipage uaddr = end); + if (ret == 0) { /* * If the page was already mapped, this will get a cache miss * for sure, so try to avoid doing it. */ - if (((unsigned long)uaddr PAGE_MASK) != + if (((unsigned long)uaddr PAGE_MASK) == ((unsigned long)end PAGE_MASK)) - ret = __put_user(0, end); + ret = __put_user(0, end); } return ret; } One effect of this change for the filemap.c callsite is that `uaddr' now gets incremented by PAGE_SIZE. That happens to not break anything because we then mask `uaddr' with PAGE_MASK, and if gcc were really smart, it could remove that addition. But it's a bit ugly. Yep, gcc is not clever enough to reap the addl on uaddr (and change the check for 'do we need to fault the 2nd page to' from jne to je again). I've checked that before submitting - maybe should have mentioned this. Ideally the patch would have no effect upon filemap.o size, but with an allmodconfig config I'm seeing textdata bss dec hex filename 22876 1187344 303387682 mm/filemap.o (before) 22925 1187392 3043576e3 mm/filemap.o (after) so we are adding read()/write() overhead, and bss mysteriously got larger. Can we improve on this? Even if it's some dumb static inline int fault_in_pages_writeable(char __user *uaddr, int size, bool multipage) { if (multipage) { do-this } else { do-that } } the code duplication between do-this and do-that is regrettable, but at least it's all in the same place in the same file, so we won't accidentally introduce skew later on. Alternatively, add a separate fault_in_multi_pages_writeable() to pagemap.h. I have a bad feeling this is what your original patch did! (But we *should* be able to make this work! Why did this version of the patch go so wrong?) Well, I couldn't reconcile the non-multipage with the multipage versions of these functions - at least not without changing them slightly (like this patch here does). Which is why I've asked you whether I should just add a new multipage version of these. I personally deem your proposal of using and if (multipage) with no shared code too ugly. But you've shot at it a bit, so I've figured that this version here is what you want. I'll redo this patch by adding _multipage versions of these 2 functions for i915. Yours, Daniel -- Daniel Vetter Mail: dan...@ffwll.ch Mobile: +41 (0)79 365 57 48 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] mm: extend prefault helpers to fault in more than PAGE_SIZE
On Thu, 1 Mar 2012 00:14:53 +0100 Daniel Vetter dan...@ffwll.ch wrote: I'll redo this patch by adding _multipage versions of these 2 functions for i915. OK, but I hope for i915 doesn't mean private to! Put 'em in pagemap.h, for maintenance reasons and because they are generic. Making them inline is a bit sad, but that's OK for now - they have few callsites. ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Don't write DSPSURF for old chips
On Tue, 28 Feb 2012 07:43:55 +0100, Takashi Iwai ti...@suse.de wrote: At Mon, 27 Feb 2012 14:08:28 +0100, Takashi Iwai wrote: At Mon, 27 Feb 2012 12:17:57 +, Chris Wilson wrote: On Mon, 27 Feb 2012 12:39:24 +0100, Takashi Iwai ti...@suse.de wrote: It seems that writing DSPSURF in intel_flush_display_plane() causes the blank screen on some old laptops like Dell D630 with 965GM. Since this operation is needed only for ILK+, make it conditional. The specs say that DSPASURF is the latch register for updates of the DSPA registers on gen4 (including 965gm) as well. Presumably the bug is that we only partially update the DSPA registers prior to the first call to intel_flush_display_plane() which this papers over by disabling the update until a valid address is written to DSPASURF. And there is such a spurious call to intel_enable_plane() prior to us setting a valid scanout surface: Sounds reasonable. FWIW, the change was first introduced in commit [b24e7179: drm/i915: add pipe/plane enable/disable functions], then in commit [efc2924e: drm/i915: Call intel_enable_plane from i9xx_crtc_mode_set (again)], it's placed into i9xx_crtc_mode_set(). This explains the fact that it was discovered only on old machines as i9xx_crtc_mode_set() is the only crtc_mode_set op calling intel_enable_plane(). BTW, the bisection leaded to a merge commit, so the bug is really depending on the activation path or timing. I'll ask a tester to try your patch. He reported back that it reduces the failure rate but doesn't fix completely. Still get a blank screen in 20% rate. Any other clue? I haven't yet found anything else in the same vein as this, but the 965gm does ring a warning bell for this recent regression: commit 5ca0c34ae28344b6b4ca3036bc82f89c8db16a59 Author: Dave Airlie airl...@redhat.com Date: Thu Feb 23 15:33:40 2012 + drm/i915: fix mode set on load pipe. (v2) Booted my i965 machine and it started printing the unsupported pixel format of 0 message (once I added content to it). Oh looksie here, we pass 0. fix. v2: compile it. Buzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45966 Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Dave Airlie airl...@redhat.com Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org which is currently sitting in airlied/drm-fixes. -Chris -- Chris Wilson, Intel Open Source Technology Centre ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] HDMI colour space and depth questions (YCbCr, xvYCC, Deep Colour)
On 29 February 2012 18:36, Jesse Barnes jbar...@virtuousgeek.org wrote: Can you file a bug with the findings above just so we don't lose it? I expect the fix should be pretty easy, but I don't want to lose track of this. Done - hope that I submitted to the right place in the right way! Thanks again. https://bugs.freedesktop.org/show_bug.cgi?id=46800 Paul ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Don't write DSPSURF for old chips
At Wed, 29 Feb 2012 23:54:46 +, Chris Wilson wrote: On Tue, 28 Feb 2012 07:43:55 +0100, Takashi Iwai ti...@suse.de wrote: At Mon, 27 Feb 2012 14:08:28 +0100, Takashi Iwai wrote: At Mon, 27 Feb 2012 12:17:57 +, Chris Wilson wrote: On Mon, 27 Feb 2012 12:39:24 +0100, Takashi Iwai ti...@suse.de wrote: It seems that writing DSPSURF in intel_flush_display_plane() causes the blank screen on some old laptops like Dell D630 with 965GM. Since this operation is needed only for ILK+, make it conditional. The specs say that DSPASURF is the latch register for updates of the DSPA registers on gen4 (including 965gm) as well. Presumably the bug is that we only partially update the DSPA registers prior to the first call to intel_flush_display_plane() which this papers over by disabling the update until a valid address is written to DSPASURF. And there is such a spurious call to intel_enable_plane() prior to us setting a valid scanout surface: Sounds reasonable. FWIW, the change was first introduced in commit [b24e7179: drm/i915: add pipe/plane enable/disable functions], then in commit [efc2924e: drm/i915: Call intel_enable_plane from i9xx_crtc_mode_set (again)], it's placed into i9xx_crtc_mode_set(). This explains the fact that it was discovered only on old machines as i9xx_crtc_mode_set() is the only crtc_mode_set op calling intel_enable_plane(). BTW, the bisection leaded to a merge commit, so the bug is really depending on the activation path or timing. I'll ask a tester to try your patch. He reported back that it reduces the failure rate but doesn't fix completely. Still get a blank screen in 20% rate. Any other clue? I haven't yet found anything else in the same vein as this, but the 965gm does ring a warning bell for this recent regression: commit 5ca0c34ae28344b6b4ca3036bc82f89c8db16a59 Author: Dave Airlie airl...@redhat.com Date: Thu Feb 23 15:33:40 2012 + drm/i915: fix mode set on load pipe. (v2) Booted my i965 machine and it started printing the unsupported pixel format of 0 message (once I added content to it). Oh looksie here, we pass 0. fix. v2: compile it. Buzilla: https://bugs.freedesktop.org/show_bug.cgi?id=45966 Reviewed-by: Daniel Vetter daniel.vet...@ffwll.ch Reviewed-by: Chris Wilson ch...@chris-wilson.co.uk Signed-off-by: Dave Airlie airl...@redhat.com Signed-off-by: Jesse Barnes jbar...@virtuousgeek.org which is currently sitting in airlied/drm-fixes. Hm, this must be irrelevant (inapplicable) because the tests were done with 3.1, 3.2 (and SLE11-SP2 kernels which has backports from 3.3-rc2 but without these fixes). The bug seems to have been introduced between 3.0 and 3.1. thanks, Takashi ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx