[PATCH] drm/tegra: Support kernel mappings with IOMMU
Host1x command buffer patching requires that the buffer object can be mapped into kernel address space, however, the recent addition of IOMMU did not account to this requirement. Therefore Host1x engines cannot be used if IOMMU is enabled. This patch implements kmap, kunmap, mmap and munmap functions to host1x bo objects. Signed-off-by: Arto Merilainen --- drivers/gpu/drm/tegra/gem.c | 34 +++--- 1 file changed, 31 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/drm/tegra/gem.c b/drivers/gpu/drm/tegra/gem.c index 1217272a51f2..89ca8d35f555 100644 --- a/drivers/gpu/drm/tegra/gem.c +++ b/drivers/gpu/drm/tegra/gem.c @@ -2,7 +2,7 @@ * NVIDIA Tegra DRM GEM helper functions * * Copyright (C) 2012 Sascha Hauer, Pengutronix - * Copyright (C) 2013 NVIDIA CORPORATION, All rights reserved. + * Copyright (C) 2013-2015 NVIDIA CORPORATION, All rights reserved. * * Based on the GEM/CMA helpers * @@ -50,23 +50,51 @@ static void *tegra_bo_mmap(struct host1x_bo *bo) { struct tegra_bo *obj = host1x_to_tegra_bo(bo); - return obj->vaddr; + if (obj->vaddr) + return obj->vaddr; + else if (obj->gem.import_attach) + return dma_buf_vmap(obj->gem.import_attach->dmabuf); + else + return vmap(obj->pages, obj->num_pages, VM_MAP, + pgprot_writecombine(PAGE_KERNEL)); } static void tegra_bo_munmap(struct host1x_bo *bo, void *addr) { + struct tegra_bo *obj = host1x_to_tegra_bo(bo); + + if (obj->vaddr) + return; + else if (obj->gem.import_attach) + dma_buf_vunmap(obj->gem.import_attach->dmabuf, addr); + else + vunmap(addr); } static void *tegra_bo_kmap(struct host1x_bo *bo, unsigned int page) { struct tegra_bo *obj = host1x_to_tegra_bo(bo); - return obj->vaddr + page * PAGE_SIZE; + if (obj->vaddr) + return obj->vaddr + page * PAGE_SIZE; + else if (obj->gem.import_attach) + return dma_buf_kmap(obj->gem.import_attach->dmabuf, page); + else + return vmap(obj->pages + page, 1, VM_MAP, + pgprot_writecombine(PAGE_KERNEL)); } static void tegra_bo_kunmap(struct host1x_bo *bo, unsigned int page, void *addr) { + struct tegra_bo *obj = host1x_to_tegra_bo(bo); + + if (obj->vaddr) + return; + else if (obj->gem.import_attach) + dma_buf_kunmap(obj->gem.import_attach->dmabuf, page, addr); + else + vunmap(addr); } static struct host1x_bo *tegra_bo_get(struct host1x_bo *bo) -- 1.8.1.5
[PATCH 3/4] drm/tegra: Add VIC support
Hi Thierry, Thank you for your thorough analysis - and sorry for a bunch of very silly mistakes. I am skipping most trivial parts and focus on the trickier comments and questions. On 05/22/2015 02:47 PM, Thierry Reding wrote: >> + >> +struct tegra_bo *ucode_bo; >> +bool ucode_valid; >> +void *ucode_vaddr; >> + >> +bool booted; > > There are a bunch of other drivers that use a Falcon and they will all > need to use similar data to this to deal with the firmware and all. I > would like to see that code to be made into a Falcon library so that it > can be reused in a meaningful way. > > Roughly this would look like this: > > struct falcon { > struct device *dev; > ... > }; > > int falcon_init(struct falcon *falcon, struct device *dev, > void __iomem *regs); > int falcon_load_firmware(struct falcon *falcon, const char *filename); > int falcon_exit(struct falcon *falcon); > int falcon_boot(struct falcon *falcon); > There are two issues in above scheme..: - Memory allocation. Despite I have explicitly mentioned that the series has been tested only with iommu disabled, I would prefer trying to keep an option to use it later. Depending how well we want to isolate the falcon library from other parts of tegradrm, the library may not have ability to map anything to the tegradrm iommu domain. - The firmware images may not hold only Falcon firmware. Already in VIC case we have two firmwares: One for Falcon, another for FCE. To overcome the above issues, I would prefer dropping falcon_load_firmware() and keeping firmware image specifics inside the client driver and giving data related to Falcon as a parameter for falcon_boot(). Would this be ok? >> + >> +/* for firewall - this determines if method 1 should be regarded >> + * as an address register */ >> +bool method_data_is_addr_reg; >> +}; > > I think it'd be best to incorporate that functionality into the firewall > so that we can deal with it more centrally rather that duplicate this in > all drivers. > Do you have a concrete suggestion how this should be done? Firewall has no access to the driver specifics and in VIC case the VIC methods themselves define whether the METHOD1 includes address or not. >> +static int vic_dma_pa_to_internal_256b(struct vic *vic, phys_addr_t pa, >> + u32 internal_offset, bool imem) > > The name is confusing. Without looking at the code I'd expect this to > perform some kind of conversion from a physical address to some internal > address, but if I understand correctly this actually copies code into > the Falcon instruction or data memories. A better name I think would be: > > static int falcon_load_chunk(struct falcon *falcon, phys_addr_t phys, >unsigned long offset, enum falcon_memory > target); > > Note that this is now part of the Falcon library because I've seen > identical code used in several other drivers. Also the bool argument is > now an enumeration, which makes it much easier to read. Compare: > > err = vic_dma_pa_to_internal_256b(vic, phys, offset, true); > > and > > err = falcon_load_chunk(>falcon, phys, offset, FALCON_MEMORY_IMEM); > Sounds ok. > Is there a specific term in Falcon-speak for these 256 byte blocks? > "chunk" is a little awkward. > Unfortunately I am not aware that there would be - maybe "block"? >> +if (ucode.bin_header->bin_ver != 1) { >> +dev_err(vic->dev, "unsupported firmware version"); >> +return -ENOENT; > > Why not -EINVAL here, too? > We can interpret the issue two ways..: The given firmware is invalid (-EINVAL) or that there was no supported firmware entry available (-ENOENT). I do not have strong opinions and will change this to -EINVAL. >> +vic->ucode_bo = tegra_bo_create(dev, ucode_fw->size, 0); >> +if (IS_ERR(vic->ucode_bo)) { >> +dev_err(vic->dev, "dma memory allocation failed"); >> +err = PTR_ERR(vic->ucode_bo); >> +goto err_alloc_iova; >> +} > > Erm... no. Please don't use tegra_bo_create() to allocate these buffers. > tegra_bo_create() creates GEM objects and firmware doesn't qualify as a > GEM object. > > Can't you use the DMA API here? > The firmware must be mapped to the IOMMU domain into which VIC is attached - and I would prefer keeping the door open for enabling iommu on VIC. This was the simplest way to get a buffer that is allocated to the tegradrm iommu domain. Should I add a function for allocating memory without making a gem object or should I keep memory allocation here and simply add functions for mapping it into tegradrm domain? >> +static int vic_boot(struct device *dev) >> +{ >> +struct vic *vic = dev_get_drvdata(dev); >> +u32 offset; >> +int err = 0; >> + >> +if (vic->booted) >> +return 0; >> + >> +if (!vic->ucode_valid) { >> +
[PATCH 3/4] drm/tegra: Add VIC support
On 05/22/2015 01:25 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > On Thu, May 21, 2015 at 05:40:31PM +0300, Mikko Perttunen wrote: >> On 05/21/2015 04:20 PM, Arto Merilainen wrote: > [...] >>> +static int vic_is_addr_reg(struct device *dev, u32 class, u32 offset, u32 >>> val) >>> +{ >>> + struct vic *vic = dev_get_drvdata(dev); >>> + >>> + /* handle host class */ >>> + if (class == HOST1X_CLASS_HOST1X) { >>> + if (offset == 0x2b) >>> + return true; >>> + return false; >> >> "return (offset == 0x2b);" perhaps? > > I think this should really be extracted into a separate helper. If we > ever need to take into account additional offsets we would otherwise > have to extend every driver rather than just the helper. I agree, that would be better. > > Also I think the 0x2b should be replaced by some symbolic name. > According to the TRM 0x2b is the host1x class method named > NV_CLASS_HOST_INDCTRL_0. Oddly enough that doesn't seem to be an address > register. Instead the address seems to be in the INDOFF2 and INDOFF > methods (0x2c and 0x2d). I also can't tell from the TRM what exactly > these are supposed to do. > > Arto, can you clarify? This looks like an unfortunate mistake that got reproduced from gr2d and gr3d. The INDCTRL method is used for indirect register accessing and it allows Host1x to read registers of an engine - or write data directly to memory. It allow implementing context switch for the clients whose state should be not change between jobs from the same application. > >>> + if (IS_ERR(vic->rst)) { >>> + dev_err(>dev, "cannot get reset\n"); >>> + return PTR_ERR(vic->rst); >>> + } >>> + >>> + platform_set_drvdata(pdev, vic); >>> + >>> + INIT_LIST_HEAD(>client.base.list); >>> + vic->client.base.ops = _client_ops; >>> + vic->client.base.dev = dev; >>> + vic->client.base.class = vic_config->class_id; >>> + vic->client.base.syncpts = syncpts; >>> + vic->client.base.num_syncpts = 1; >>> + vic->dev = dev; >>> + vic->config = vic_config; >>> + >>> + INIT_LIST_HEAD(>client.list); >>> + vic->client.ops = _ops; >>> + >>> + err = tegra_powergate_sequence_power_up(vic->config->powergate_id, >>> + vic->clk, vic->rst); >>> + if (err) { >>> + dev_err(dev, "cannot turn on the device\n"); >>> + return err; >>> + } >>> + >>> + err = host1x_client_register(>client.base); >>> + if (err < 0) { >> >> You used 'if (err) {' previously, so maybe also here. > > For consistency with other Tegra DRM code these checks should use (at > least where possible) the (err < 0) notation. > Will fix. - Arto
[PATCH 3/4] drm/tegra: Add VIC support
Hi Thierry, On 05/22/2015 01:02 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > On Thu, May 21, 2015 at 06:44:08PM +0300, Mikko Perttunen wrote: >> On 05/21/2015 06:10 PM, Arto Merilainen wrote: >>> ... >>>>> + >>>>> +vic->rst = devm_reset_control_get(dev, "vic03"); >>>> >>>> I might prefer just "vic" as the clock/reset name. The name is often >>>> used as a sort of "role" for the clock/reset for the device, not >>>> necessarily the raw name of the "correct" clock/reset. >>>> >>> >>> I considered that - but I then noticed that >>> drivers/clk/tegra/clk-tegra124.c was already using vic03 variant. I can >>> write a patch for changing that too. >> >> Well, the two can be different; the clock-name in device tree kind of means >> "string that i use to refer to a clock that powers the VIC unit". It's not >> really a big deal though, both ways are used in DT bindings. > > I'll insist on calling this vic in the clock-names property. The 03 is > as far as I can tell an encoding of the version number, so if you want > to call this vic04 in some future version we'll have to needlessly > patch the driver. I agree, this is better without 04 postfix and I will remove it in the next version. - Arto
[PATCH 3/4] drm/tegra: Add VIC support
Hi Thierry, On 05/22/2015 01:02 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > On Thu, May 21, 2015 at 06:44:08PM +0300, Mikko Perttunen wrote: >> On 05/21/2015 06:10 PM, Arto Merilainen wrote: >>> ... >>>>> + >>>>> +vic->rst = devm_reset_control_get(dev, "vic03"); >>>> >>>> I might prefer just "vic" as the clock/reset name. The name is often >>>> used as a sort of "role" for the clock/reset for the device, not >>>> necessarily the raw name of the "correct" clock/reset. >>>> >>> >>> I considered that - but I then noticed that >>> drivers/clk/tegra/clk-tegra124.c was already using vic03 variant. I can >>> write a patch for changing that too. >> >> Well, the two can be different; the clock-name in device tree kind of means >> "string that i use to refer to a clock that powers the VIC unit". It's not >> really a big deal though, both ways are used in DT bindings. > > I'll insist on calling this vic in the clock-names property. The 03 is > as far as I can tell an encoding of the version number, so if you want > to call this vic04 in some future version we'll have to needlessly > patch the driver. I agree, this is better without 03 postfix will fix this in the next version. - Arto
[PATCH libdrm] tegra: Add VIC clear test
Thank you Emil for your feedback, On 05/21/2015 05:58 PM, Emil Velikov wrote: > Hi Arto, > > On 21 May 2015 at 14:18, Arto Merilainen wrote: >> This patch adds a simple test for testing Video-Image-Compositor >> engine on Tegra124 SoC. The test case opens a channel and performs >> a surface clear. >> >> Signed-off-by: Arto Merilainen >> --- >> tests/tegra/Makefile.am | 3 +- >> tests/tegra/host1x.h | 52 ++ >> tests/tegra/submit_vic.c | 315 +++ >> tests/tegra/vic.h| 426 >> +++ >> 4 files changed, 795 insertions(+), 1 deletion(-) >> create mode 100644 tests/tegra/host1x.h >> create mode 100644 tests/tegra/submit_vic.c >> create mode 100644 tests/tegra/vic.h >> > > Please add the two new headers into the noinst_HEADERS list like > below. Otherwise they won''t end up in the tarball, thus the test will > fail to build. I will fix this in the next version. - Arto
[PATCH 3/4] drm/tegra: Add VIC support
Thank you Mikko for your comments! Please see my answers inline. - Arto On 05/21/2015 05:40 PM, Mikko Perttunen wrote: > Hi, very good patch! > > Here are a few small comments. Aside those, you should also add a > section to > Documentation/devicetree/bindings/gpu/nvidia,tegra20-host1x.txt in a > separate patch. Will do. > On 05/21/2015 04:20 PM, Arto Merilainen wrote: >> This patch adds support for Video Image Compositor engine which >> can be used for 2d operations. >> >> The engine has a microcontroller (Falcon) that acts as a frontend >> for the rest of the unit. In order to properly utilize the engine, >> the frontend must be booted before pushing any commands. >> >> Signed-off-by: Andrew Chew >> Signed-off-by: Arto Merilainen >> --- >> drivers/gpu/drm/tegra/Makefile | 3 +- >> drivers/gpu/drm/tegra/drm.c| 9 +- >> drivers/gpu/drm/tegra/drm.h| 1 + >> drivers/gpu/drm/tegra/vic.c| 593 >> + >> drivers/gpu/drm/tegra/vic.h| 116 >> include/linux/host1x.h | 1 + >> 6 files changed, 721 insertions(+), 2 deletions(-) >> create mode 100644 drivers/gpu/drm/tegra/vic.c >> create mode 100644 drivers/gpu/drm/tegra/vic.h >> >> diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile >> index 2c66a8db9da4..3bc3566e00b6 100644 >> --- a/drivers/gpu/drm/tegra/Makefile >> +++ b/drivers/gpu/drm/tegra/Makefile >> @@ -13,6 +13,7 @@ tegra-drm-y := \ >> sor.o \ >> dpaux.o \ >> gr2d.o \ >> -gr3d.o >> +gr3d.o \ >> +vic.o >> >> obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o >> diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c >> index bfad15a913a0..d947f5f4d801 100644 >> --- a/drivers/gpu/drm/tegra/drm.c >> +++ b/drivers/gpu/drm/tegra/drm.c >> @@ -1,6 +1,6 @@ >> /* >>* Copyright (C) 2012 Avionic Design GmbH >> - * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved. >> + * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved. >>* >>* This program is free software; you can redistribute it and/or modify >>* it under the terms of the GNU General Public License version 2 as >> @@ -1048,6 +1048,7 @@ static const struct of_device_id host1x_drm_subdevs[] >> = { >> { .compatible = "nvidia,tegra124-dc", }, >> { .compatible = "nvidia,tegra124-sor", }, >> { .compatible = "nvidia,tegra124-hdmi", }, >> +{ .compatible = "nvidia,tegra124-vic", }, >> { /* sentinel */ } >> }; >> >> @@ -1097,8 +1098,14 @@ static int __init host1x_drm_init(void) >> if (err < 0) >> goto unregister_gr2d; >> >> +err = platform_driver_register(_vic_driver); >> +if (err < 0) >> +goto unregister_gr3d; >> + >> return 0; >> >> +unregister_gr3d: >> +platform_driver_unregister(_gr3d_driver); >> unregister_gr2d: >> platform_driver_unregister(_gr2d_driver); >> unregister_dpaux: >> diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h >> index 0e7756e720c5..a9c02a80d6bf 100644 >> --- a/drivers/gpu/drm/tegra/drm.h >> +++ b/drivers/gpu/drm/tegra/drm.h >> @@ -275,5 +275,6 @@ extern struct platform_driver tegra_hdmi_driver; >> extern struct platform_driver tegra_dpaux_driver; >> extern struct platform_driver tegra_gr2d_driver; >> extern struct platform_driver tegra_gr3d_driver; >> +extern struct platform_driver tegra_vic_driver; >> >> #endif /* HOST1X_DRM_H */ >> diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c >> new file mode 100644 >> index ..b7c5a96697ed >> --- /dev/null >> +++ b/drivers/gpu/drm/tegra/vic.c >> @@ -0,0 +1,593 @@ >> +/* >> + * Copyright (c) 2015, NVIDIA Corporation. >> + * >> + * This program is free software; you can redistribute it and/or modify >> + * it under the terms of the GNU General Public License version 2 as >> + * published by the Free Software Foundation. >> + */ >> + >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> +#include >> + >> +#include "drm.h" >> +#include "gem.h" >> +#include "vic.h" >> + >> +#define VIC_IDLE_TIMEOUT_DEFAULT1 /* 10 mil
[PATCH 4/4] ARM: tegra: Add VIC for Tegra124
This patch adds VIC device tree node for Video Image Compositor. Signed-off-by: Arto Merilainen --- arch/arm/boot/dts/tegra124.dtsi | 11 +++ 1 file changed, 11 insertions(+) diff --git a/arch/arm/boot/dts/tegra124.dtsi b/arch/arm/boot/dts/tegra124.dtsi index 13cc7ca5e031..626355693a41 100644 --- a/arch/arm/boot/dts/tegra124.dtsi +++ b/arch/arm/boot/dts/tegra124.dtsi @@ -136,6 +136,17 @@ status = "disabled"; }; + vic at 5434 { + compatible = "nvidia,tegra124-vic"; + reg = <0x0 0x5434 0x0 0x0004>; + clocks = <_car TEGRA124_CLK_VIC03>; + clock-names = "vic03"; + resets = <_car TEGRA124_CLK_VIC03>; + reset-names = "vic03"; + + iommus = < TEGRA_SWGROUP_VIC>; + }; + sor at 0,5454 { compatible = "nvidia,tegra124-sor"; reg = <0x0 0x5454 0x0 0x0004>; -- 1.8.1.5
[PATCH 3/4] drm/tegra: Add VIC support
This patch adds support for Video Image Compositor engine which can be used for 2d operations. The engine has a microcontroller (Falcon) that acts as a frontend for the rest of the unit. In order to properly utilize the engine, the frontend must be booted before pushing any commands. Signed-off-by: Andrew Chew Signed-off-by: Arto Merilainen --- drivers/gpu/drm/tegra/Makefile | 3 +- drivers/gpu/drm/tegra/drm.c| 9 +- drivers/gpu/drm/tegra/drm.h| 1 + drivers/gpu/drm/tegra/vic.c| 593 + drivers/gpu/drm/tegra/vic.h| 116 include/linux/host1x.h | 1 + 6 files changed, 721 insertions(+), 2 deletions(-) create mode 100644 drivers/gpu/drm/tegra/vic.c create mode 100644 drivers/gpu/drm/tegra/vic.h diff --git a/drivers/gpu/drm/tegra/Makefile b/drivers/gpu/drm/tegra/Makefile index 2c66a8db9da4..3bc3566e00b6 100644 --- a/drivers/gpu/drm/tegra/Makefile +++ b/drivers/gpu/drm/tegra/Makefile @@ -13,6 +13,7 @@ tegra-drm-y := \ sor.o \ dpaux.o \ gr2d.o \ - gr3d.o + gr3d.o \ + vic.o obj-$(CONFIG_DRM_TEGRA) += tegra-drm.o diff --git a/drivers/gpu/drm/tegra/drm.c b/drivers/gpu/drm/tegra/drm.c index bfad15a913a0..d947f5f4d801 100644 --- a/drivers/gpu/drm/tegra/drm.c +++ b/drivers/gpu/drm/tegra/drm.c @@ -1,6 +1,6 @@ /* * Copyright (C) 2012 Avionic Design GmbH - * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved. + * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -1048,6 +1048,7 @@ static const struct of_device_id host1x_drm_subdevs[] = { { .compatible = "nvidia,tegra124-dc", }, { .compatible = "nvidia,tegra124-sor", }, { .compatible = "nvidia,tegra124-hdmi", }, + { .compatible = "nvidia,tegra124-vic", }, { /* sentinel */ } }; @@ -1097,8 +1098,14 @@ static int __init host1x_drm_init(void) if (err < 0) goto unregister_gr2d; + err = platform_driver_register(_vic_driver); + if (err < 0) + goto unregister_gr3d; + return 0; +unregister_gr3d: + platform_driver_unregister(_gr3d_driver); unregister_gr2d: platform_driver_unregister(_gr2d_driver); unregister_dpaux: diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 0e7756e720c5..a9c02a80d6bf 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -275,5 +275,6 @@ extern struct platform_driver tegra_hdmi_driver; extern struct platform_driver tegra_dpaux_driver; extern struct platform_driver tegra_gr2d_driver; extern struct platform_driver tegra_gr3d_driver; +extern struct platform_driver tegra_vic_driver; #endif /* HOST1X_DRM_H */ diff --git a/drivers/gpu/drm/tegra/vic.c b/drivers/gpu/drm/tegra/vic.c new file mode 100644 index ..b7c5a96697ed --- /dev/null +++ b/drivers/gpu/drm/tegra/vic.c @@ -0,0 +1,593 @@ +/* + * Copyright (c) 2015, NVIDIA Corporation. + * + * This program is free software; you can redistribute it and/or modify + * it under the terms of the GNU General Public License version 2 as + * published by the Free Software Foundation. + */ + +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "drm.h" +#include "gem.h" +#include "vic.h" + +#define VIC_IDLE_TIMEOUT_DEFAULT 1 /* 10 milliseconds */ +#define VIC_IDLE_CHECK_PERIOD 10 /* 10 usec */ + +struct vic; + +struct vic_config { + /* firmware name */ + char *ucode_name; + + /* class id */ + u32 class_id; + + /* powergate id */ + int powergate_id; +}; + +struct vic { + struct { + u32 bin_data_offset; + u32 data_offset; + u32 data_size; + u32 code_offset; + u32 size; + } os, fce; + + struct tegra_bo *ucode_bo; + bool ucode_valid; + void *ucode_vaddr; + + bool booted; + + void __iomem *regs; + struct tegra_drm_client client; + struct host1x_channel *channel; + struct iommu_domain *domain; + struct device *dev; + struct clk *clk; + struct reset_control *rst; + + /* Platform configuration */ + struct vic_config *config; + + /* for firewall - this determines if method 1 should be regarded +* as an address register */ + bool method_data_is_addr_reg; +}; + +static inline struct vic *to_vic(struct tegra_drm_client *client) +{ + return container_of(client, struct vic, client); +} + +void vic_writel(struct vic *vic, u32 v, u32 r) +{ + writel(v, vic->regs + r); +} + +u32 vic_readl(struct vic *vic, u32 r) +{ + return readl(vic->
[PATCH 2/4] host1x: Pass register value in firewall
In gr2d and gr3d units the register offset was sufficient for determining if the register in interest is used for storing a register value. However, in VIC this is not the case. The operations are passed through two registers, METHOD0 and METHOD1. Depending on content of METHOD0, METHOD1 can be either address or data field. This patch updates the firewall interface to deliver also the register value to allow book-keeping inside the engine driver. Signed-off-by: Arto Merilainen --- drivers/gpu/drm/tegra/drm.h | 4 ++-- drivers/gpu/drm/tegra/gr2d.c | 4 ++-- drivers/gpu/drm/tegra/gr3d.c | 4 ++-- drivers/gpu/host1x/job.c | 35 +++ include/linux/host1x.h | 4 ++-- 5 files changed, 31 insertions(+), 20 deletions(-) diff --git a/drivers/gpu/drm/tegra/drm.h b/drivers/gpu/drm/tegra/drm.h index 659b2fcc986d..0e7756e720c5 100644 --- a/drivers/gpu/drm/tegra/drm.h +++ b/drivers/gpu/drm/tegra/drm.h @@ -1,6 +1,6 @@ /* * Copyright (C) 2012 Avionic Design GmbH - * Copyright (C) 2012-2013 NVIDIA CORPORATION. All rights reserved. + * Copyright (C) 2012-2015 NVIDIA CORPORATION. All rights reserved. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -70,7 +70,7 @@ struct tegra_drm_client_ops { int (*open_channel)(struct tegra_drm_client *client, struct tegra_drm_context *context); void (*close_channel)(struct tegra_drm_context *context); - int (*is_addr_reg)(struct device *dev, u32 class, u32 offset); + int (*is_addr_reg)(struct device *dev, u32 class, u32 offset, u32 val); int (*submit)(struct tegra_drm_context *context, struct drm_tegra_submit *args, struct drm_device *drm, struct drm_file *file); diff --git a/drivers/gpu/drm/tegra/gr2d.c b/drivers/gpu/drm/tegra/gr2d.c index 02cd3e37a6ec..7e4424f15cdf 100644 --- a/drivers/gpu/drm/tegra/gr2d.c +++ b/drivers/gpu/drm/tegra/gr2d.c @@ -1,5 +1,5 @@ /* - * Copyright (c) 2012-2013, NVIDIA Corporation. + * Copyright (c) 2012-2015, NVIDIA Corporation. * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -84,7 +84,7 @@ static void gr2d_close_channel(struct tegra_drm_context *context) host1x_channel_put(context->channel); } -static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset) +static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 offset, u32 val) { struct gr2d *gr2d = dev_get_drvdata(dev); diff --git a/drivers/gpu/drm/tegra/gr3d.c b/drivers/gpu/drm/tegra/gr3d.c index 0b3f2b977ba0..9ceaf35a856b 100644 --- a/drivers/gpu/drm/tegra/gr3d.c +++ b/drivers/gpu/drm/tegra/gr3d.c @@ -1,6 +1,6 @@ /* * Copyright (C) 2013 Avionic Design GmbH - * Copyright (C) 2013 NVIDIA Corporation + * Copyright (C) 2013-2015 NVIDIA Corporation * * This program is free software; you can redistribute it and/or modify * it under the terms of the GNU General Public License version 2 as @@ -94,7 +94,7 @@ static void gr3d_close_channel(struct tegra_drm_context *context) host1x_channel_put(context->channel); } -static int gr3d_is_addr_reg(struct device *dev, u32 class, u32 offset) +static int gr3d_is_addr_reg(struct device *dev, u32 class, u32 offset, u32 val) { struct gr3d *gr3d = dev_get_drvdata(dev); diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index b72aa918fa69..77d977bbf922 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -295,9 +295,10 @@ struct host1x_firewall { u32 count; }; -static int check_register(struct host1x_firewall *fw, unsigned long offset) +static int check_register(struct host1x_firewall *fw, + unsigned long offset, u32 val) { - if (fw->job->is_addr_reg(fw->dev, fw->class, offset)) { + if (fw->job->is_addr_reg(fw->dev, fw->class, offset, val)) { if (!fw->num_relocs) return -EINVAL; @@ -311,18 +312,21 @@ static int check_register(struct host1x_firewall *fw, unsigned long offset) return 0; } -static int check_mask(struct host1x_firewall *fw) +static int check_mask(struct host1x_firewall *fw, struct host1x_job_gather *g) { + u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + + (g->offset / sizeof(u32)); u32 mask = fw->mask; u32 reg = fw->reg; int ret; while (mask) { + u32 val = cmdbuf_base[fw->offset]; if (fw->words == 0) return -EINVAL; if (mask & 1) { - ret = check_register(fw, reg); + ret = check_register(fw, reg, val); if (ret < 0) re
[PATCH 1/4] host1x: Store device address to all bufs
Currently job pinning is optimized to handle only the first buffer using a certain host1x_bo object and all subsequent buffers using the same host1x_bo are considered done. In most cases this is correct, however, in case the same host1x_bo is used in multiple gathers inside the same job, we skip also storing the device address (physical or iova) to this buffer. This patch reworks the host1x_job_pin() to store the device address to all gathers. Signed-off-by: Andrew Chew Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/job.c | 9 ++--- 1 file changed, 6 insertions(+), 3 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 63bd63f3c7df..b72aa918fa69 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -1,7 +1,7 @@ /* * Tegra host1x Job * - * Copyright (c) 2010-2013, NVIDIA Corporation. + * Copyright (c) 2010-2015, NVIDIA Corporation. * * This program is free software; you can redistribute it and/or modify it * under the terms and conditions of the GNU General Public License, @@ -538,9 +538,12 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) g->base = job->gather_addr_phys[i]; - for (j = i + 1; j < job->num_gathers; j++) - if (job->gathers[j].bo == g->bo) + for (j = i + 1; j < job->num_gathers; j++) { + if (job->gathers[j].bo == g->bo) { job->gathers[j].handled = true; + job->gathers[j].base = g->base; + } + } err = do_relocs(job, g->bo); if (err) -- 1.8.1.5
[PATCH 0/4] Add VIC support for Tegra124
This series adds Video-Image-Compositor (VIC) support for Tegra124. The unit replaced gr2d engine on T124 and it is effectively used for similar operations: making simple surface copy and fill operations. The series consists of four patches: The first patch fixes an issue in the host1x submit routine which has prevented using same buffer object for multiple command buffers. The second patch makes a simple improvement to the firewall to allow implementing address register validator for VIC. The last two patches in the series make the real modifications to Tegra DRM and device tree to enable the engine on T124. The series has been tested on Jetson TK1 by first disabling IOMMU (*), enabling CMA and running a VIC clear test case that is posted to dri-devel and linux-tegra mailing lists. The firmware image for VIC is publicly available as part of Linux For Tegra driver package [0]. [0] https://developer.nvidia.com/linux-tegra (*) Currently Tegra DRM does not support mapping the host1x command buffers into kernel address space in case IOMMU is enabled. Arto Merilainen (4): host1x: Store device address to all bufs host1x: Pass register value in firewall drm/tegra: Add VIC support ARM: tegra: Add VIC for Tegra124 arch/arm/boot/dts/tegra124.dtsi | 11 + drivers/gpu/drm/tegra/Makefile | 3 +- drivers/gpu/drm/tegra/drm.c | 9 +- drivers/gpu/drm/tegra/drm.h | 5 +- drivers/gpu/drm/tegra/gr2d.c| 4 +- drivers/gpu/drm/tegra/gr3d.c| 4 +- drivers/gpu/drm/tegra/vic.c | 593 drivers/gpu/drm/tegra/vic.h | 116 drivers/gpu/host1x/job.c| 44 ++- include/linux/host1x.h | 5 +- 10 files changed, 769 insertions(+), 25 deletions(-) create mode 100644 drivers/gpu/drm/tegra/vic.c create mode 100644 drivers/gpu/drm/tegra/vic.h -- 1.8.1.5
[PATCH libdrm] tegra: Add VIC clear test
This patch adds a simple test for testing Video-Image-Compositor engine on Tegra124 SoC. The test case opens a channel and performs a surface clear. Signed-off-by: Arto Merilainen --- tests/tegra/Makefile.am | 3 +- tests/tegra/host1x.h | 52 ++ tests/tegra/submit_vic.c | 315 +++ tests/tegra/vic.h| 426 +++ 4 files changed, 795 insertions(+), 1 deletion(-) create mode 100644 tests/tegra/host1x.h create mode 100644 tests/tegra/submit_vic.c create mode 100644 tests/tegra/vic.h diff --git a/tests/tegra/Makefile.am b/tests/tegra/Makefile.am index 8e625c8fedb8..e3ef60e42039 100644 --- a/tests/tegra/Makefile.am +++ b/tests/tegra/Makefile.am @@ -10,4 +10,5 @@ LDADD = \ ../../libdrm.la noinst_PROGRAMS = \ - openclose + openclose \ + submit_vic diff --git a/tests/tegra/host1x.h b/tests/tegra/host1x.h new file mode 100644 index ..9a72c8e69f85 --- /dev/null +++ b/tests/tegra/host1x.h @@ -0,0 +1,52 @@ +/* + * Copyright © 2015 NVIDIA Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#ifndef HOST1X_H +#define HOST1X_H + +#include + +enum host1x_class { + HOST1X_CLASS_HOST1X = 0x1, + HOST1X_CLASS_GR2D = 0x51, + HOST1X_CLASS_GR2D_SB = 0x52, + HOST1X_CLASS_VIC = 0x5D, + HOST1X_CLASS_GR3D = 0x60, +}; + +static inline uint32_t host1x_opcode_setclass( + unsigned class_id, unsigned offset, unsigned mask) +{ + return (0 << 28) | (offset << 16) | (class_id << 6) | mask; +} + +static inline uint32_t host1x_opcode_incr(unsigned offset, unsigned count) +{ + return (1 << 28) | (offset << 16) | count; +} + +static inline uint32_t host1x_opcode_nonincr(unsigned offset, unsigned count) +{ + return (2 << 28) | (offset << 16) | count; +} + +#endif diff --git a/tests/tegra/submit_vic.c b/tests/tegra/submit_vic.c new file mode 100644 index ..472fde93a528 --- /dev/null +++ b/tests/tegra/submit_vic.c @@ -0,0 +1,315 @@ +/* + * Copyright © 2015 NVIDIA Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice shall be included in + * all copies or substantial portions of the Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR + * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, + * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR + * OTHER DEALINGS IN THE SOFTWARE. + */ + +#ifdef HAVE_CONFIG_H +# include "config.h" +#endif + +#include +#include +#include +#include +#include +#include + +#include + +#include "host1x.h" +#include "vic.h" +#include "xf86drm.h" +#include "tegra.h" + +static const char default_device[] = "/dev/dri/card0"; + +int main(int argc, char *argv[]) +{ + struct drm_tegra_get_syncpt get_syncpt_args; + struct drm_tegra_bo *cmdbuf_bo, *histbuf_bo, *outbuf_bo, + *configbuf_bo; + uint32_t cmdbuf_bo_handle, histbuf_bo_handle, + outbuf_bo_handle, configbuf_bo_handle; + struct drm_tegra_open
[PATCHv2 0/4] gpu: host1x: Add syncpoint base support
The host1x driver uses currently syncpoints statically from host1x point of view. If we do a wait inside a job, it always has a constant value to wait. host1x supports also doing relative syncpoint waits with respect to syncpoint bases. This allows doing multiple operations inside a single submit and waiting an operation to complete before moving to next one. This set of patches adds support for syncpoint bases to host1x driver and enables the support for gr2d client. I have tested the series using the host1x test application (available at [0], function test_wait_base() in tests/tegra/host1x/tegra_host1x_test.c) on cardhu. I would appreciate help in reviewing the series and testing the patches on other boards. Changes in v2: - Reordered various code blocks to improve code consistency - Functions host1x_syncpt_alloc() and host1x_syncpt_request() take now a single bitfield argument instead of separate boolean arguments - Added a separate ioctl call for querying the base associated with some syncpoint [0] https://gitorious.org/linux-host1x/libdrm-host1x Arto Merilainen (4): gpu: host1x: Add 'flags' field to syncpt request gpu: host1x: Add syncpoint base support drm/tegra: Deliver syncpoint base to user space drm/tegra: Reserve base for gr2d drivers/gpu/host1x/dev.h | 2 ++ drivers/gpu/host1x/drm/drm.c | 25 + drivers/gpu/host1x/drm/gr2d.c | 2 +- drivers/gpu/host1x/hw/channel_hw.c | 19 ++ drivers/gpu/host1x/hw/hw_host1x01_uclass.h | 6 drivers/gpu/host1x/syncpt.c| 58 +- drivers/gpu/host1x/syncpt.h| 10 +- include/uapi/drm/tegra_drm.h | 26 +- 8 files changed, 128 insertions(+), 20 deletions(-) -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 1/4] gpu: host1x: Add 'flags' field to syncpt request
Functions host1x_syncpt_request() and _host1x_syncpt_alloc() have been taking a separate boolean flag ('client_managed') for indicating if the syncpoint value should be tracked by the host1x driver. This patch converts the field into generic 'flags' field so that we can easily add more information while requesting a syncpoint. Clients are adapted to use the new interface accordingly. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 2 +- drivers/gpu/host1x/syncpt.c | 14 +++--- drivers/gpu/host1x/syncpt.h | 3 ++- 3 files changed, 10 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 27ffcf1..7efd97b 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -285,7 +285,7 @@ static int gr2d_probe(struct platform_device *pdev) if (!gr2d-channel) return -ENOMEM; - *syncpts = host1x_syncpt_request(dev, false); + *syncpts = host1x_syncpt_request(dev, 0); if (!(*syncpts)) { host1x_channel_free(gr2d-channel); return -ENOMEM; diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 409745b..d376cd4 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -30,9 +30,9 @@ #define SYNCPT_CHECK_PERIOD (2 * HZ) #define MAX_STUCK_CHECK_COUNT 15 -static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, - struct device *dev, - bool client_managed) +static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host, +struct device *dev, +unsigned long flags) { int i; struct host1x_syncpt *sp = host-syncpt; @@ -51,7 +51,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, sp-dev = dev; sp-name = name; - sp-client_managed = client_managed; + sp-client_managed = !!(flags HOST1X_SYNCPT_CLIENT_MANAGED); return sp; } @@ -321,7 +321,7 @@ int host1x_syncpt_init(struct host1x *host) host1x_syncpt_restore(host); /* Allocate sync point to use for clearing waits for expired fences */ - host-nop_sp = _host1x_syncpt_alloc(host, NULL, false); + host-nop_sp = host1x_syncpt_alloc(host, NULL, 0); if (!host-nop_sp) return -ENOMEM; @@ -329,10 +329,10 @@ int host1x_syncpt_init(struct host1x *host) } struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - bool client_managed) + unsigned long flags) { struct host1x *host = dev_get_drvdata(dev-parent); - return _host1x_syncpt_alloc(host, dev, client_managed); + return host1x_syncpt_alloc(host, dev, flags); } void host1x_syncpt_free(struct host1x_syncpt *sp) diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index 267c0b9..1de7b58 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -153,8 +153,9 @@ int host1x_syncpt_patch_wait(struct host1x_syncpt *sp, void *patch_addr); u32 host1x_syncpt_id(struct host1x_syncpt *sp); /* Allocate a sync point for a device. */ +#define HOST1X_SYNCPT_CLIENT_MANAGED (1 0) struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - bool client_managed); + unsigned long flags); /* Free a sync point. */ void host1x_syncpt_free(struct host1x_syncpt *sp); -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 3/4] drm/tegra: Deliver syncpoint base to user space
This patch adds a separate ioctl for delivering syncpoint base number to user space. If the syncpoint does not have an associated base, the function returns -ENXIO. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/drm.c | 25 + include/uapi/drm/tegra_drm.h | 26 +- 2 files changed, 42 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 8c61cee..c11cbf5 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -472,6 +472,30 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data, return 0; } +static int tegra_get_syncpt_base(struct drm_device *drm, void *data, +struct drm_file *file) +{ + struct drm_tegra_get_syncpt_base *args = data; + struct host1x_drm_file *fpriv = file-driver_priv; + struct host1x_drm_context *context = + (struct host1x_drm_context *)(uintptr_t)args-context; + struct host1x_syncpt_base *base; + + if (!host1x_drm_file_owns_context(fpriv, context)) + return -ENODEV; + + if (args-index = context-client-num_syncpts) + return -EINVAL; + + base = context-client-syncpts[args-index]-base; + if (!base) + return -ENXIO; + + args-base_id = base-id; + + return 0; +} + static int tegra_submit(struct drm_device *drm, void *data, struct drm_file *file) { @@ -498,6 +522,7 @@ static const struct drm_ioctl_desc tegra_drm_ioctls[] = { DRM_IOCTL_DEF_DRV(TEGRA_CLOSE_CHANNEL, tegra_close_channel, DRM_UNLOCKED), DRM_IOCTL_DEF_DRV(TEGRA_GET_SYNCPT, tegra_get_syncpt, DRM_UNLOCKED), DRM_IOCTL_DEF_DRV(TEGRA_SUBMIT, tegra_submit, DRM_UNLOCKED), + DRM_IOCTL_DEF_DRV(TEGRA_GET_SYNCPT_BASE, tegra_get_syncpt_base, DRM_UNLOCKED), #endif }; diff --git a/include/uapi/drm/tegra_drm.h b/include/uapi/drm/tegra_drm.h index 73bde4e..8b8094c 100644 --- a/include/uapi/drm/tegra_drm.h +++ b/include/uapi/drm/tegra_drm.h @@ -65,6 +65,12 @@ struct drm_tegra_get_syncpt { __u32 id; }; +struct drm_tegra_get_syncpt_base { + __u64 context; + __u32 index; + __u32 base_id; +}; + struct drm_tegra_syncpt { __u32 id; __u32 incrs; @@ -115,15 +121,16 @@ struct drm_tegra_submit { __u32 reserved[5]; /* future expansion */ }; -#define DRM_TEGRA_GEM_CREATE 0x00 -#define DRM_TEGRA_GEM_MMAP 0x01 -#define DRM_TEGRA_SYNCPT_READ 0x02 -#define DRM_TEGRA_SYNCPT_INCR 0x03 -#define DRM_TEGRA_SYNCPT_WAIT 0x04 -#define DRM_TEGRA_OPEN_CHANNEL 0x05 -#define DRM_TEGRA_CLOSE_CHANNEL0x06 -#define DRM_TEGRA_GET_SYNCPT 0x07 -#define DRM_TEGRA_SUBMIT 0x08 +#define DRM_TEGRA_GEM_CREATE 0x00 +#define DRM_TEGRA_GEM_MMAP 0x01 +#define DRM_TEGRA_SYNCPT_READ 0x02 +#define DRM_TEGRA_SYNCPT_INCR 0x03 +#define DRM_TEGRA_SYNCPT_WAIT 0x04 +#define DRM_TEGRA_OPEN_CHANNEL 0x05 +#define DRM_TEGRA_CLOSE_CHANNEL0x06 +#define DRM_TEGRA_GET_SYNCPT 0x07 +#define DRM_TEGRA_SUBMIT 0x08 +#define DRM_TEGRA_GET_SYNCPT_BASE 0x09 #define DRM_IOCTL_TEGRA_GEM_CREATE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_CREATE, struct drm_tegra_gem_create) #define DRM_IOCTL_TEGRA_GEM_MMAP DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GEM_MMAP, struct drm_tegra_gem_mmap) @@ -134,5 +141,6 @@ struct drm_tegra_submit { #define DRM_IOCTL_TEGRA_CLOSE_CHANNEL DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_CLOSE_CHANNEL, struct drm_tegra_open_channel) #define DRM_IOCTL_TEGRA_GET_SYNCPT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT, struct drm_tegra_get_syncpt) #define DRM_IOCTL_TEGRA_SUBMIT DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_SUBMIT, struct drm_tegra_submit) +#define DRM_IOCTL_TEGRA_GET_SYNCPT_BASE DRM_IOWR(DRM_COMMAND_BASE + DRM_TEGRA_GET_SYNCPT_BASE, struct drm_tegra_get_syncpt_base) #endif -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 2/4] gpu: host1x: Add syncpoint base support
This patch adds support for hardware syncpoint bases. This creates a simple mechanism to stall the command FIFO until an operation is completed. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/dev.h | 2 ++ drivers/gpu/host1x/hw/channel_hw.c | 19 + drivers/gpu/host1x/hw/hw_host1x01_uclass.h | 6 drivers/gpu/host1x/syncpt.c| 44 -- drivers/gpu/host1x/syncpt.h| 7 + 5 files changed, 76 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index bed90a8..516ce0a 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -27,6 +27,7 @@ #include job.h struct host1x_syncpt; +struct host1x_syncpt_base; struct host1x_channel; struct host1x_cdma; struct host1x_job; @@ -102,6 +103,7 @@ struct host1x { void __iomem *regs; struct host1x_syncpt *syncpt; + struct host1x_syncpt_base *bases; struct device *dev; struct clk *clk; diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index ee19962..06f44bf 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -67,6 +67,21 @@ static void submit_gathers(struct host1x_job *job) } } +static inline void synchronize_syncpt_base(struct host1x_job *job) +{ + struct host1x_channel *ch = job-channel; + struct host1x *host = dev_get_drvdata(ch-dev-parent); + struct host1x_syncpt *sp = host-syncpt + job-syncpt_id; + u32 base_id = sp-base-id; + u32 base_val = host1x_syncpt_read_max(sp); + + host1x_cdma_push(ch-cdma, +host1x_opcode_setclass(HOST1X_CLASS_HOST1X, + HOST1X_UCLASS_LOAD_SYNCPT_BASE, 1), +HOST1X_UCLASS_LOAD_SYNCPT_BASE_BASE_INDX_F(base_id) | +HOST1X_UCLASS_LOAD_SYNCPT_BASE_VALUE_F(base_val)); +} + static int channel_submit(struct host1x_job *job) { struct host1x_channel *ch = job-channel; @@ -118,6 +133,10 @@ static int channel_submit(struct host1x_job *job) host1x_syncpt_read_max(sp))); } + /* Synchronize base register to allow using it for relative waiting */ + if (sp-base) + synchronize_syncpt_base(job); + syncval = host1x_syncpt_incr_max(sp, user_syncpt_incrs); job-syncpt_end = syncval; diff --git a/drivers/gpu/host1x/hw/hw_host1x01_uclass.h b/drivers/gpu/host1x/hw/hw_host1x01_uclass.h index 42f3ce1..f755359 100644 --- a/drivers/gpu/host1x/hw/hw_host1x01_uclass.h +++ b/drivers/gpu/host1x/hw/hw_host1x01_uclass.h @@ -111,6 +111,12 @@ static inline u32 host1x_uclass_wait_syncpt_base_offset_f(u32 v) } #define HOST1X_UCLASS_WAIT_SYNCPT_BASE_OFFSET_F(v) \ host1x_uclass_wait_syncpt_base_offset_f(v) +static inline u32 host1x_uclass_load_syncpt_base_r(void) +{ + return 0xb; +} +#define HOST1X_UCLASS_LOAD_SYNCPT_BASE \ + host1x_uclass_load_syncpt_base_r() static inline u32 host1x_uclass_load_syncpt_base_base_indx_f(u32 v) { return (v 0xff) 24; diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index d376cd4..b5cb97c 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -30,6 +30,28 @@ #define SYNCPT_CHECK_PERIOD (2 * HZ) #define MAX_STUCK_CHECK_COUNT 15 +static struct host1x_syncpt_base *host1x_base_alloc(struct host1x *host) +{ + struct host1x_syncpt_base *bases = host-bases; + unsigned int i; + + for (i = 0; i host-info-nb_bases; i++) + if (!bases[i].requested) + break; + + if (i = host-info-nb_bases) + return NULL; + + bases[i].requested = true; + return bases[i]; +} + +static void host1x_syncpt_base_free(struct host1x_syncpt_base *base) +{ + if (base) + base-requested = false; +} + static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host, struct device *dev, unsigned long flags) @@ -44,6 +66,12 @@ static struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host, if (i = host-info-nb_pts) return NULL; + if (flags HOST1X_SYNCPT_HAS_BASE) { + sp-base = host1x_base_alloc(host); + if (!sp-base) + return NULL; + } + name = kasprintf(GFP_KERNEL, %02d-%s, sp-id, dev ? dev_name(dev) : NULL); if (!name) @@ -304,19 +332,29 @@ int host1x_syncpt_patch_wait(struct host1x_syncpt *sp, void *patch_addr) int host1x_syncpt_init(struct host1x *host) { struct host1x_syncpt *syncpt; + struct host1x_syncpt_base *bases; int i; syncpt = devm_kzalloc(host-dev, sizeof(*syncpt) * host
[PATCHv2 4/4] drm/tegra: Reserve base for gr2d
This patch modifies the gr2d to reserve a base for syncpoint. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 7efd97b..337e1ad 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -285,7 +285,7 @@ static int gr2d_probe(struct platform_device *pdev) if (!gr2d-channel) return -ENOMEM; - *syncpts = host1x_syncpt_request(dev, 0); + *syncpts = host1x_syncpt_request(dev, HOST1X_SYNCPT_HAS_BASE); if (!(*syncpts)) { host1x_channel_free(gr2d-channel); return -ENOMEM; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/3] drm/tegra: Deliver syncpoint base to user space
On 10/11/2013 12:43 PM, Thierry Reding wrote: * PGP Signed by an unknown key On Wed, Oct 09, 2013 at 02:54:09PM +0300, Arto Merilainen wrote: This patch makes the necessary additions to deliver syncpoint base to the user space. This patch splits the index field in the drm_tegra_get_syncpt structure into three separate fields (index, support_base, base_id). This allows to keep compatibility over kernel versions: - The placing of index field stays constant and therefore the interface should stay compatible with earlier userspace versions. - The old index field was input-only and it was not supposed to be read afterwards. Delivering data back in the same field should not introduce regressions to any old user space applications. - Old kernels left support_base and base_id fields intact (zero) and hence the new user space applications get the correct response from the kernel (=syncpoint does not have a base). Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/drm.c | 2 ++ include/uapi/drm/tegra_drm.h | 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 8c61cee..4251563 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -468,6 +468,8 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data, syncpt = context-client-syncpts[args-index]; args-id = host1x_syncpt_id(syncpt); + args-support_base = syncpt-base ? 1 : 0; + args-base_id = syncpt-base ? syncpt-base-id : 0; return 0; } diff --git a/include/uapi/drm/tegra_drm.h b/include/uapi/drm/tegra_drm.h index 73bde4e..7a79ca5 100644 --- a/include/uapi/drm/tegra_drm.h +++ b/include/uapi/drm/tegra_drm.h @@ -61,7 +61,9 @@ struct drm_tegra_close_channel { struct drm_tegra_get_syncpt { __u64 context; - __u32 index; + __u16 index; + __u8 support_base; + __u8 base_id; __u32 id; I don't like this. Can't we just add a separate IOCTL? Something like: Sure. This felt like a good idea, but I must admit I had a bit mixed feelings after looking my own explanation why this would be ok... - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 1/3] gpu: host1x: Add syncpoint base support
On 10/11/2013 12:39 PM, Thierry Reding wrote: * PGP Signed by an unknown key On Wed, Oct 09, 2013 at 02:54:08PM +0300, Arto Merilainen wrote: This patch adds support for hardware syncpoint bases. This creates a simple mechanism for waiting an operation to complete in the middle of the command buffer. Perhaps ... simple mechanism to stall the command FIFO until an operation is completed. That's what the TRM contains and more accurately describes the hardware functionality. diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h [...] @@ -26,6 +26,7 @@ #include cdma.h #include job.h +struct host1x_base; host1x_syncpt_base is more explicit, host1x_base sounds very generic. I agree. diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index ee19962..5f9f735 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -67,6 +67,21 @@ static void submit_gathers(struct host1x_job *job) } } +static inline void synchronize_syncpt_base(struct host1x_job *job) +{ + struct host1x_channel *ch = job-channel; + struct host1x *host = dev_get_drvdata(ch-dev-parent); + struct host1x_syncpt *sp = host-syncpt + job-syncpt_id; + u32 base_id = sp-base-id; + u32 base_val = host1x_syncpt_read_max(sp); + + host1x_cdma_push(ch-cdma, +host1x_opcode_setclass(HOST1X_CLASS_HOST1X, + host1x_uclass_load_syncpt_base_r(), 1), +host1x_uclass_load_syncpt_base_base_indx_f(base_id) | +host1x_uclass_load_syncpt_base_value_f(base_val)); Please use the all-caps version of the register definitions. The lowercase variants were only introduced to allow profiling and coverage testing, which I think nobody's been doing and I'm in fact thinking about removing them. Will fix. diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c [...] +static struct host1x_base *host1x_base_alloc(struct host1x *host) +{ + struct host1x_base *base = host-bases; + int i; unsigned int Ops. Will fix. + + for (i = 0; i host-info-nb_bases base-reserved; i++, base++) + ; I'd like to see this rewritten as: for (i = 0; i host-info-nb_bases; i++) { if (!host-bases[i].reserved) break; } I agree, that looks less obfuscated. +static void host1x_base_free(struct host1x_base *base) +{ + if (!base) + return; + base-reserved = false; +} The following would be somewhat shorter: if (base) base-reserved = false; static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, struct device *dev, - bool client_managed) + bool client_managed, + bool support_base) I think at this point we probably want to introduce a flags argument instead of adding all those boolean parameters. Something like: #define HOST1X_SYNCPT_CLIENT_MANAGED(1 0) #define HOST1X_SYNCPT_HAS_BASE (1 1) struct host1x_syncpt *host1x_syncpt_alloc(struct host1x *host, struct device *dev, unsigned long flags); This is not a bad idea... I will write a patch for that. int host1x_syncpt_init(struct host1x *host) { struct host1x_syncpt *syncpt; + struct host1x_base *bases; int i; + bases = devm_kzalloc(host-dev, sizeof(*bases) * host-info-nb_bases, + GFP_KERNEL); syncpt = devm_kzalloc(host-dev, sizeof(*syncpt) * host-info-nb_pts, GFP_KERNEL); I'd prefer these to be checked separately. Also the argument alignment is wrong here. Align with the first function argument. And for consistency, allocate bases after syncpoints... Oh. Will fix. - if (!syncpt) + if (!syncpt || !bases) return -ENOMEM; - for (i = 0; i host-info-nb_pts; ++i) { + for (i = 0; i host-info-nb_bases; i++) + bases[i].id = i; + + for (i = 0; i host-info-nb_pts; i++) { syncpt[i].id = i; syncpt[i].host = host; } ... and initialize them after the syncpoints... host-syncpt = syncpt; + host-bases = bases; ... to match the assignment order. Will fix. @@ -332,7 +368,14 @@ struct host1x_syncpt *host1x_syncpt_request(struct device *dev, bool client_managed) { struct host1x *host = dev_get_drvdata(dev-parent); - return _host1x_syncpt_alloc(host, dev, client_managed); + return _host1x_syncpt_alloc(host, dev, client_managed, false
[PATCH] drm/tegra: Use dma_mapping API in mmap
This far we have used the remap_pfn_range() function directly to map buffers to user space. Calling this function has worked as all memory allocations have been contiguous. However, the function must support also non-contiguous memory allocations as we later want to turn on IOMMU. This patch modifies the code to use dma_mapping API for mapping buffers to user space. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- I tested this patch on cardhu using Hiroshi Doyu's series Unified SMMU driver among Tegra SoCs and using just pure Linux 3.12rc4. I have not tested this on T20 so I would appreaciate help in here (although I do not think this should affect the behavior on T20 at all). I also would like to hear suggestions on better approaches for using dma_mmap_writecombine(). If IOMMU is not used, this function call ends up calling arm_iommu_mmap() which assumes that vm_pgoff value is valid. I see that other drm drivers simply implement fault callbacks and avoid this problem completely. drivers/gpu/host1x/drm/gem.c | 16 ++-- 1 file changed, 14 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/host1x/drm/gem.c b/drivers/gpu/host1x/drm/gem.c index 59623de..82ae3d5 100644 --- a/drivers/gpu/host1x/drm/gem.c +++ b/drivers/gpu/host1x/drm/gem.c @@ -240,6 +240,7 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma) { struct drm_gem_object *gem; struct tegra_bo *bo; + unsigned long vm_pgoff; int ret; ret = drm_gem_mmap(file, vma); @@ -249,8 +250,19 @@ int tegra_drm_mmap(struct file *file, struct vm_area_struct *vma) gem = vma-vm_private_data; bo = to_tegra_bo(gem); - ret = remap_pfn_range(vma, vma-vm_start, bo-paddr PAGE_SHIFT, - vma-vm_end - vma-vm_start, vma-vm_page_prot); + /* the pages are real */ + vma-vm_flags = ~VM_PFNMAP; + + /* drm holds a fake offset in vm_pgoff... dma_mapping assumes that +* vm_pgoff contains data related to the buffer = clear the cookie +* temporarily. */ + + vm_pgoff = vma-vm_pgoff; + vma-vm_pgoff = 0; + ret = dma_mmap_writecombine(bo-gem.dev-dev, vma, bo-vaddr, + bo-paddr, bo-gem.size); + vma-vm_pgoff = vm_pgoff; + if (ret) drm_gem_vm_close(vma); -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 0/3] gpu: host1x: Add syncpoint base support
The host1x driver uses currently syncpoints statically from host1x point of view. If we do a wait inside a job, it always has a constant value to wait. host1x supports also doing relative syncpoint waits with respect to syncpoint bases. This allows doing multiple operations inside a single submit and waiting an operation to complete before moving to next one. This set of patches adds support for syncpoint bases to host1x driver and enables the support for gr2d client. The series is based on 3.12-rc4. I have tested the series using the host1x test application (available at [0], function test_wait_base() in tests/tegra/host1x/tegra_host1x_test.c) on cardhu. I would appreciate help in reviewing the series and testing the patches on other boards. [0] https://gitorious.org/linux-host1x/libdrm-host1x Arto Merilainen (3): gpu: host1x: Add syncpoint base support drm/tegra: Deliver syncpoint base to user space drm/tegra: Reserve base for gr2d drivers/gpu/host1x/dev.h | 2 ++ drivers/gpu/host1x/drm/drm.c | 2 ++ drivers/gpu/host1x/drm/gr2d.c | 2 +- drivers/gpu/host1x/hw/channel_hw.c | 19 +++ drivers/gpu/host1x/hw/hw_host1x01_uclass.h | 6 drivers/gpu/host1x/syncpt.c| 55 +++--- drivers/gpu/host1x/syncpt.h| 10 ++ include/uapi/drm/tegra_drm.h | 4 ++- 8 files changed, 93 insertions(+), 7 deletions(-) -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 1/3] gpu: host1x: Add syncpoint base support
This patch adds support for hardware syncpoint bases. This creates a simple mechanism for waiting an operation to complete in the middle of the command buffer. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/dev.h | 2 ++ drivers/gpu/host1x/hw/channel_hw.c | 19 +++ drivers/gpu/host1x/hw/hw_host1x01_uclass.h | 6 drivers/gpu/host1x/syncpt.c| 55 +++--- drivers/gpu/host1x/syncpt.h| 10 ++ 5 files changed, 87 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index bed90a8..89a3c1e 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -26,6 +26,7 @@ #include cdma.h #include job.h +struct host1x_base; struct host1x_syncpt; struct host1x_channel; struct host1x_cdma; @@ -102,6 +103,7 @@ struct host1x { void __iomem *regs; struct host1x_syncpt *syncpt; + struct host1x_base *bases; struct device *dev; struct clk *clk; diff --git a/drivers/gpu/host1x/hw/channel_hw.c b/drivers/gpu/host1x/hw/channel_hw.c index ee19962..5f9f735 100644 --- a/drivers/gpu/host1x/hw/channel_hw.c +++ b/drivers/gpu/host1x/hw/channel_hw.c @@ -67,6 +67,21 @@ static void submit_gathers(struct host1x_job *job) } } +static inline void synchronize_syncpt_base(struct host1x_job *job) +{ + struct host1x_channel *ch = job-channel; + struct host1x *host = dev_get_drvdata(ch-dev-parent); + struct host1x_syncpt *sp = host-syncpt + job-syncpt_id; + u32 base_id = sp-base-id; + u32 base_val = host1x_syncpt_read_max(sp); + + host1x_cdma_push(ch-cdma, +host1x_opcode_setclass(HOST1X_CLASS_HOST1X, + host1x_uclass_load_syncpt_base_r(), 1), +host1x_uclass_load_syncpt_base_base_indx_f(base_id) | +host1x_uclass_load_syncpt_base_value_f(base_val)); +} + static int channel_submit(struct host1x_job *job) { struct host1x_channel *ch = job-channel; @@ -118,6 +133,10 @@ static int channel_submit(struct host1x_job *job) host1x_syncpt_read_max(sp))); } + /* Synchronize base register to allow using it for relative waiting */ + if (sp-base) + synchronize_syncpt_base(job); + syncval = host1x_syncpt_incr_max(sp, user_syncpt_incrs); job-syncpt_end = syncval; diff --git a/drivers/gpu/host1x/hw/hw_host1x01_uclass.h b/drivers/gpu/host1x/hw/hw_host1x01_uclass.h index 42f3ce1..f755359 100644 --- a/drivers/gpu/host1x/hw/hw_host1x01_uclass.h +++ b/drivers/gpu/host1x/hw/hw_host1x01_uclass.h @@ -111,6 +111,12 @@ static inline u32 host1x_uclass_wait_syncpt_base_offset_f(u32 v) } #define HOST1X_UCLASS_WAIT_SYNCPT_BASE_OFFSET_F(v) \ host1x_uclass_wait_syncpt_base_offset_f(v) +static inline u32 host1x_uclass_load_syncpt_base_r(void) +{ + return 0xb; +} +#define HOST1X_UCLASS_LOAD_SYNCPT_BASE \ + host1x_uclass_load_syncpt_base_r() static inline u32 host1x_uclass_load_syncpt_base_base_indx_f(u32 v) { return (v 0xff) 24; diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 409745b..48927b1 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -30,9 +30,32 @@ #define SYNCPT_CHECK_PERIOD (2 * HZ) #define MAX_STUCK_CHECK_COUNT 15 +static struct host1x_base *host1x_base_alloc(struct host1x *host) +{ + struct host1x_base *base = host-bases; + int i; + + for (i = 0; i host-info-nb_bases base-reserved; i++, base++) + ; + + if (i = host-info-nb_bases) + return NULL; + + base-reserved = true; + return base; +} + +static void host1x_base_free(struct host1x_base *base) +{ + if (!base) + return; + base-reserved = false; +} + static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, struct device *dev, - bool client_managed) + bool client_managed, + bool support_base) { int i; struct host1x_syncpt *sp = host-syncpt; @@ -44,6 +67,12 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, if (i = host-info-nb_pts) return NULL; + if (support_base) { + sp-base = host1x_base_alloc(host); + if (!sp-base) + return NULL; + } + name = kasprintf(GFP_KERNEL, %02d-%s, sp-id, dev ? dev_name(dev) : NULL); if (!name) @@ -304,24 +333,31 @@ int host1x_syncpt_patch_wait(struct host1x_syncpt *sp, void *patch_addr) int host1x_syncpt_init(struct host1x *host) { struct host1x_syncpt
[PATCH 2/3] drm/tegra: Deliver syncpoint base to user space
This patch makes the necessary additions to deliver syncpoint base to the user space. This patch splits the index field in the drm_tegra_get_syncpt structure into three separate fields (index, support_base, base_id). This allows to keep compatibility over kernel versions: - The placing of index field stays constant and therefore the interface should stay compatible with earlier userspace versions. - The old index field was input-only and it was not supposed to be read afterwards. Delivering data back in the same field should not introduce regressions to any old user space applications. - Old kernels left support_base and base_id fields intact (zero) and hence the new user space applications get the correct response from the kernel (=syncpoint does not have a base). Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/drm.c | 2 ++ include/uapi/drm/tegra_drm.h | 4 +++- 2 files changed, 5 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 8c61cee..4251563 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -468,6 +468,8 @@ static int tegra_get_syncpt(struct drm_device *drm, void *data, syncpt = context-client-syncpts[args-index]; args-id = host1x_syncpt_id(syncpt); + args-support_base = syncpt-base ? 1 : 0; + args-base_id = syncpt-base ? syncpt-base-id : 0; return 0; } diff --git a/include/uapi/drm/tegra_drm.h b/include/uapi/drm/tegra_drm.h index 73bde4e..7a79ca5 100644 --- a/include/uapi/drm/tegra_drm.h +++ b/include/uapi/drm/tegra_drm.h @@ -61,7 +61,9 @@ struct drm_tegra_close_channel { struct drm_tegra_get_syncpt { __u64 context; - __u32 index; + __u16 index; + __u8 support_base; + __u8 base_id; __u32 id; }; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 3/3] drm/tegra: Reserve base for gr2d
This patch modifies the gr2d to reserve a base for syncpoint. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 27ffcf1..f0c3fd5 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -285,7 +285,7 @@ static int gr2d_probe(struct platform_device *pdev) if (!gr2d-channel) return -ENOMEM; - *syncpts = host1x_syncpt_request(dev, false); + *syncpts = host1x_syncpt_request_based(dev, false); if (!(*syncpts)) { host1x_channel_free(gr2d-channel); return -ENOMEM; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv4 0/5] gpu: host1x: Add runtime pm support
This series adds runtime pm support for host1x, gr2d and dc. It retains the current behaviour if CONFIG_PM_RUNTIME is not enabled. The gr2d clock is enabled when a new job is submitted and disabled when the work is done. Due to parent-child relations between host1x-gr2d, this scheme enables and disables host1x clock. For dc, the clocks are enabled in .probe and disabled in .remove via runtime pm instead of direct clock APIs. Mayuresh is unfortunately not available to continue with the series and hence I will continue working on the patches. Changes in v4: - Fixed initialisation clean up in host1x and gr2d drivers - Runtime pm support removal follows now the same convention as other Tegra drivers - Code shuffling to prevent unnecessary function prototypes - Removed unnecessary NULL pointer checks. - Rebased on top of 3.12-rc4 Changes in v3: - Rebased patches on top of 3.12-rc2 - Removed unnecessary #ifdefs - Added descriptions to commit messages - If runtime pm is disabled, the code calls suspend/resume functions for enabling/disabling the clocks instead of repeating the functions Arto Merilainen (1): drm/tegra: Fix gr2d initialisation clean up Mayuresh Kulkarni (4): gpu: host1x: shuffle job APIs drm/tegra: Add runtime pm support for gr2d drm/tegra: Add runtime pm support for dc gpu: host1x: Add runtime pm support for host1x drivers/gpu/host1x/cdma.c | 2 ++ drivers/gpu/host1x/channel.c | 8 - drivers/gpu/host1x/channel.h | 1 - drivers/gpu/host1x/dev.c | 56 +- drivers/gpu/host1x/drm/dc.c | 59 ++- drivers/gpu/host1x/drm/gr2d.c | 71 --- drivers/gpu/host1x/job.c | 13 drivers/gpu/host1x/job.h | 3 ++ 8 files changed, 177 insertions(+), 36 deletions(-) -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv4 1/5] drm/tegra: Fix gr2d initialisation clean up
gr2d initialisation clean up had missing steps (i.e. host1x channel was not released if we could not register gr2d as a host1x client) that could have lead to weird issues. This patch reworks the initialisation sequence to do clean up correctly. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 21 - 1 file changed, 16 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 27ffcf1..60150ad 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -282,13 +282,15 @@ static int gr2d_probe(struct platform_device *pdev) } gr2d-channel = host1x_channel_request(dev); - if (!gr2d-channel) - return -ENOMEM; + if (!gr2d-channel) { + err = -ENOMEM; + goto err_channel_request; + } *syncpts = host1x_syncpt_request(dev, false); if (!(*syncpts)) { - host1x_channel_free(gr2d-channel); - return -ENOMEM; + err = -ENOMEM; + goto err_syncpt_request; } gr2d-client.ops = gr2d_client_ops; @@ -300,7 +302,7 @@ static int gr2d_probe(struct platform_device *pdev) err = host1x_register_client(host1x, gr2d-client); if (err 0) { dev_err(dev, failed to register host1x client: %d\n, err); - return err; + goto err_register_client; } gr2d_init_addr_reg_map(dev, gr2d); @@ -308,6 +310,15 @@ static int gr2d_probe(struct platform_device *pdev) platform_set_drvdata(pdev, gr2d); return 0; + +err_register_client: + host1x_syncpt_free(*gr2d-client.syncpts); +err_syncpt_request: + host1x_channel_free(gr2d-channel); +err_channel_request: + clk_disable_unprepare(gr2d-clk); + + return err; } static int __exit gr2d_remove(struct platform_device *pdev) -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv4 3/5] drm/tegra: Add runtime pm support for gr2d
From: Mayuresh Kulkarni mkulka...@nvidia.com This far we have enabled gr2d clock on device probe and disabled it on device deinitialisation. This patch adds runtime pm support for the hardware unit allowing dynamic power management. If pm runtime is not enabled, gr2d clock is enabled in device probe and disabled in remove. Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 52 +-- drivers/gpu/host1x/job.c | 5 +++-- 2 files changed, 48 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 60150ad..ba2928e 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -22,6 +22,7 @@ #include linux/of.h #include linux/of_device.h #include linux/clk.h +#include linux/pm_runtime.h #include channel.h #include drm.h @@ -46,6 +47,27 @@ static inline struct gr2d *to_gr2d(struct host1x_client *client) static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 reg); +static int gr2d_runtime_suspend(struct device *dev) +{ + struct gr2d *gr2d = dev_get_drvdata(dev); + + clk_disable_unprepare(gr2d-clk); + + return 0; +} + +static int gr2d_runtime_resume(struct device *dev) +{ + int err = 0; + struct gr2d *gr2d = dev_get_drvdata(dev); + + err = clk_prepare_enable(gr2d-clk); + if (err 0) + dev_err(dev, failed to enable clock\n); + + return err; +} + static int gr2d_client_init(struct host1x_client *client, struct drm_device *drm) { @@ -275,12 +297,18 @@ static int gr2d_probe(struct platform_device *pdev) return PTR_ERR(gr2d-clk); } - err = clk_prepare_enable(gr2d-clk); - if (err) { - dev_err(dev, cannot turn on clock\n); - return err; + /* pm runtime accesses the clock through driver data */ + platform_set_drvdata(pdev, gr2d); + + pm_runtime_enable(pdev-dev); + if (!pm_runtime_enabled(pdev-dev)) { + err = gr2d_runtime_resume(pdev-dev); + if (err 0) + return err; } + pm_runtime_get_sync(pdev-dev); + gr2d-channel = host1x_channel_request(dev); if (!gr2d-channel) { err = -ENOMEM; @@ -307,7 +335,7 @@ static int gr2d_probe(struct platform_device *pdev) gr2d_init_addr_reg_map(dev, gr2d); - platform_set_drvdata(pdev, gr2d); + pm_runtime_put(pdev-dev); return 0; @@ -316,7 +344,9 @@ err_register_client: err_syncpt_request: host1x_channel_free(gr2d-channel); err_channel_request: - clk_disable_unprepare(gr2d-clk); + pm_runtime_disable(pdev-dev); + if (!pm_runtime_status_suspended(pdev-dev)) + gr2d_runtime_suspend(pdev-dev); return err; } @@ -338,11 +368,18 @@ static int __exit gr2d_remove(struct platform_device *pdev) host1x_syncpt_free(gr2d-client.syncpts[i]); host1x_channel_free(gr2d-channel); - clk_disable_unprepare(gr2d-clk); + + pm_runtime_disable(pdev-dev); + if (!pm_runtime_status_suspended(pdev-dev)) + gr2d_runtime_suspend(pdev-dev); return 0; } +static const struct dev_pm_ops gr2d_pm_ops = { + SET_RUNTIME_PM_OPS(gr2d_runtime_suspend, gr2d_runtime_resume, NULL) +}; + struct platform_driver tegra_gr2d_driver = { .probe = gr2d_probe, .remove = __exit_p(gr2d_remove), @@ -350,5 +387,6 @@ struct platform_driver tegra_gr2d_driver = { .owner = THIS_MODULE, .name = gr2d, .of_match_table = gr2d_match, + .pm = gr2d_pm_ops, } }; diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 3928b4e..e4148e0 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -23,6 +23,7 @@ #include linux/scatterlist.h #include linux/slab.h #include linux/vmalloc.h +#include linux/pm_runtime.h #include trace/events/host1x.h #include channel.h @@ -589,11 +590,11 @@ void host1x_job_dump(struct device *dev, struct host1x_job *job) int host1x_job_submit(struct host1x_job *job) { struct host1x *host = dev_get_drvdata(job-channel-dev-parent); - + pm_runtime_get_sync(job-channel-dev); return host1x_hw_channel_submit(host, job); } int host1x_job_complete(struct host1x_job *job) { - return 0; + return pm_runtime_put(job-channel-dev); } -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv4 4/5] drm/tegra: Add runtime pm support for dc
From: Mayuresh Kulkarni mkulka...@nvidia.com This patch adds initial runtime pm support for Tegra display controller. As of now, the dc clock is enabled in device probe via runtime pm and disabled in device remove. In case pm runtime is not configured, we simply enable the clock in device probe (..and disable it in remove). Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/dc.c | 59 +++-- 1 file changed, 51 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/host1x/drm/dc.c b/drivers/gpu/host1x/drm/dc.c index b1a05ad..ca09ceb 100644 --- a/drivers/gpu/host1x/drm/dc.c +++ b/drivers/gpu/host1x/drm/dc.c @@ -13,6 +13,7 @@ #include linux/of.h #include linux/platform_device.h #include linux/clk/tegra.h +#include linux/pm_runtime.h #include host1x_client.h #include dc.h @@ -24,6 +25,27 @@ struct tegra_plane { unsigned int index; }; +static int tegra_dc_runtime_suspend(struct device *dev) +{ + struct tegra_dc *dc = dev_get_drvdata(dev); + + clk_disable_unprepare(dc-clk); + + return 0; +} + +static int tegra_dc_runtime_resume(struct device *dev) +{ + int err = 0; + struct tegra_dc *dc = dev_get_drvdata(dev); + + err = clk_prepare_enable(dc-clk); + if (err 0) + dev_err(dev, failed to enable clock\n); + + return err; +} + static inline struct tegra_plane *to_tegra_plane(struct drm_plane *plane) { return container_of(plane, struct tegra_plane, base); @@ -1128,9 +1150,7 @@ static int tegra_dc_probe(struct platform_device *pdev) return PTR_ERR(dc-clk); } - err = clk_prepare_enable(dc-clk); - if (err 0) - return err; + platform_set_drvdata(pdev, dc); regs = platform_get_resource(pdev, IORESOURCE_MEM, 0); dc-regs = devm_ioremap_resource(pdev-dev, regs); @@ -1147,22 +1167,37 @@ static int tegra_dc_probe(struct platform_device *pdev) dc-client.ops = dc_client_ops; dc-client.dev = pdev-dev; + pm_runtime_enable(pdev-dev); + if (!pm_runtime_enabled(pdev-dev)) { + err = tegra_dc_runtime_resume(pdev-dev); + if (err 0) + return err; + } + + pm_runtime_get_sync(pdev-dev); + err = tegra_dc_rgb_probe(dc); if (err 0 err != -ENODEV) { dev_err(pdev-dev, failed to probe RGB output: %d\n, err); - return err; + goto err_dc_rgb_probe; } err = host1x_register_client(host1x, dc-client); if (err 0) { dev_err(pdev-dev, failed to register host1x client: %d\n, err); - return err; + goto err_register_client; } - platform_set_drvdata(pdev, dc); - return 0; + +err_register_client: +err_dc_rgb_probe: + pm_runtime_disable(pdev-dev); + if (!pm_runtime_status_suspended(pdev-dev)) + tegra_dc_runtime_suspend(pdev-dev); + + return err; } static int tegra_dc_remove(struct platform_device *pdev) @@ -1178,11 +1213,18 @@ static int tegra_dc_remove(struct platform_device *pdev) return err; } - clk_disable_unprepare(dc-clk); + pm_runtime_disable(pdev-dev); + if (!pm_runtime_status_suspended(pdev-dev)) + tegra_dc_runtime_suspend(pdev-dev); return 0; } +static const struct dev_pm_ops tegra_dc_pm_ops = { + SET_RUNTIME_PM_OPS(tegra_dc_runtime_suspend, + tegra_dc_runtime_resume, NULL) +}; + static struct of_device_id tegra_dc_of_match[] = { { .compatible = nvidia,tegra30-dc, }, { .compatible = nvidia,tegra20-dc, }, @@ -1194,6 +1236,7 @@ struct platform_driver tegra_dc_driver = { .name = tegra-dc, .owner = THIS_MODULE, .of_match_table = tegra_dc_of_match, + .pm = tegra_dc_pm_ops, }, .probe = tegra_dc_probe, .remove = tegra_dc_remove, -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv4 5/5] gpu: host1x: Add runtime pm support for host1x
From: Mayuresh Kulkarni mkulka...@nvidia.com This patch adds runtime pm support for host1x hardware unit. This allows host1x clock to be turned off when it is idle. If pm runtime is not configured, we enable host1x clock in device probe and disable it in remove. Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/dev.c | 56 +--- 1 file changed, 48 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 4716302..1179b78 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -23,6 +23,7 @@ #include linux/of_device.h #include linux/clk.h #include linux/io.h +#include linux/pm_runtime.h #define CREATE_TRACE_POINTS #include trace/events/host1x.h @@ -34,6 +35,27 @@ #include hw/host1x01.h #include host1x_client.h +static int host1x_runtime_suspend(struct device *dev) +{ + struct host1x *host = dev_get_drvdata(dev); + + clk_disable_unprepare(host-clk); + + return 0; +} + +static int host1x_runtime_resume(struct device *dev) +{ + int err = 0; + struct host1x *host = dev_get_drvdata(dev); + + err = clk_prepare_enable(host-clk); + if (err 0) + dev_err(dev, failed to enable clock\n); + + return err; +} + void host1x_set_drm_data(struct device *dev, void *data) { struct host1x *host1x = dev_get_drvdata(dev); @@ -143,32 +165,42 @@ static int host1x_probe(struct platform_device *pdev) return err; } - err = clk_prepare_enable(host-clk); - if (err 0) { - dev_err(pdev-dev, failed to enable clock\n); - return err; + pm_runtime_enable(pdev-dev); + if (!pm_runtime_enabled(pdev-dev)) { + err = host1x_runtime_resume(pdev-dev); + if (err 0) + return err; } + pm_runtime_get_sync(pdev-dev); + err = host1x_syncpt_init(host); if (err) { dev_err(pdev-dev, failed to initialize syncpts\n); - return err; + goto err_syncpt_init; } err = host1x_intr_init(host, syncpt_irq); if (err) { dev_err(pdev-dev, failed to initialize interrupts\n); - goto fail_deinit_syncpt; + goto err_intr_init; } host1x_debug_init(host); host1x_drm_alloc(pdev); + pm_runtime_put(pdev-dev); + return 0; -fail_deinit_syncpt: +err_intr_init: host1x_syncpt_deinit(host); +err_syncpt_init: + pm_runtime_disable(pdev-dev); + if (!pm_runtime_status_suspended(pdev-dev)) + pm_runtime_disable(pdev-dev); + return err; } @@ -178,11 +210,18 @@ static int __exit host1x_remove(struct platform_device *pdev) host1x_intr_deinit(host); host1x_syncpt_deinit(host); - clk_disable_unprepare(host-clk); + + pm_runtime_disable(pdev-dev); + if (!pm_runtime_status_suspended(pdev-dev)) + host1x_runtime_suspend(pdev-dev); return 0; } +static const struct dev_pm_ops host1x_pm_ops = { + SET_RUNTIME_PM_OPS(host1x_runtime_suspend, host1x_runtime_resume, NULL) +}; + static struct platform_driver tegra_host1x_driver = { .probe = host1x_probe, .remove = __exit_p(host1x_remove), @@ -190,6 +229,7 @@ static struct platform_driver tegra_host1x_driver = { .owner = THIS_MODULE, .name = tegra-host1x, .of_match_table = host1x_of_match, + .pm = host1x_pm_ops, }, }; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH] gpu: host1x: check relocs after all gathers are consumed
On 10/04/2013 11:18 PM, Erik Faye-Lund wrote: The num_relocs count are passed to the kernel per job, not per gather. For multi-gather jobs, we would previously fail if there were relocs in other gathers aside from the first one. Fix this by simply moving the check until all gathers have been consumed. Signed-off-by: Erik Faye-Lund kusmab...@gmail.com Reviewed-by: Arto Merilainen amerilai...@nvidia.com - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCHv3 2/4] drm/tegra: Add runtime pm support for gr2d
On 10/01/2013 09:14 PM, Stephen Warren wrote: On 09/24/2013 06:05 AM, Arto Merilainen wrote: diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c @@ -327,11 +336,48 @@ static int __exit gr2d_remove(struct platform_device *pdev) host1x_syncpt_free(gr2d-client.syncpts[i]); host1x_channel_free(gr2d-channel); + + if (pm_runtime_enabled(pdev-dev)) + pm_runtime_disable(pdev-dev); + else + gr2d_runtime_suspend(pdev-dev); This code is slightly different to the code in e.g. sound/soc/tegra/tegra20_i2s.c:remove(), whereas the code in probe() is identical. I'm not sure whether there's some advantage in this version? If so, perhaps the sound drivers should be updated to be consistent. If not, perhaps this driver should do the same thing as the I2S driver, so we keep the drivers consistent, and provide the same example code everywhere. Hmm. I cannot immediately see any advantage in this version... I will update the patch. +static int gr2d_runtime_suspend(struct device *dev) +{ + struct gr2d *gr2d; + + gr2d = dev_get_drvdata(dev); + if (!gr2d) + return -EINVAL; Presumably, gr2d will never be NULL here, unless there's some chronic bug. Can't we re-write those last 5 lines as simply: struct gr2d *grd2 = dev_get_drvdata(dev); If that's not valid, we should probably update the audio drivers (and perhaps others) too. I think you are correct. I cannot see any reason why that check is required. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCHv3 4/4] gpu: host1x: Add runtime pm support for host1x
On 10/01/2013 09:17 PM, Stephen Warren wrote: On 09/24/2013 06:05 AM, Arto Merilainen wrote: From: Mayuresh Kulkarni mkulka...@nvidia.com This patch adds runtime pm support for host1x hardware unit. This allows host1x clock to be turned off when it is idle. If pm runtime is not configured, we enable host1x clock in device probe and disable it in remove. diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c +static int host1x_runtime_suspend(struct device *dev); +static int host1x_runtime_resume(struct device *dev); You could avoid having these prototypes by simply putting the function bodies earlier on in the file, somewhere before they're used. I don't care much either way, but I've certainly seen some people care about this and ask for them to be moved. Will fix. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv3 0/4] gpu: host1x: Add runtime pm support
This series adds runtime pm support for host1x, gr2d and dc. It retains the current behaviour if CONFIG_PM_RUNTIME is not enabled. The gr2d clock is enabled when a new job is submitted and disabled when the work is done. Due to parent-child relations between host1x-gr2d, this scheme enables and disables host1x clock. For dc, the clocks are enabled in .probe and disabled in .remove via runtime pm instead of direct clock APIs. Mayuresh is unfortunately not available to continue with the series and hence I will continue working on the patches. Changes in v3: - Rebased patches on top of 3.12-rc2 - Removed unnecessary #ifdefs - Added descriptions to commit messages - If runtime pm is disabled, the code calls suspend/resume functions for enabling/disabling the clocks instead of repeating the functions Mayuresh Kulkarni (4): gpu: host1x: shuffle job APIs drm/tegra: Add runtime pm support for gr2d drm/tegra: Add runtime pm support for dc gpu: host1x: Add runtime pm support for host1x drivers/gpu/host1x/cdma.c | 2 ++ drivers/gpu/host1x/channel.c | 8 -- drivers/gpu/host1x/channel.h | 1 - drivers/gpu/host1x/dev.c | 57 +++--- drivers/gpu/host1x/drm/dc.c | 58 +++ drivers/gpu/host1x/drm/gr2d.c | 57 ++ drivers/gpu/host1x/job.c | 13 ++ drivers/gpu/host1x/job.h | 3 +++ 8 files changed, 176 insertions(+), 23 deletions(-) -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv3 2/4] drm/tegra: Add runtime pm support for gr2d
From: Mayuresh Kulkarni mkulka...@nvidia.com This far we have enabled gr2d clock on device probe and disabled it on device deinitialisation. This patch adds runtime pm support for the hardware unit allowing dynamic power management. If pm runtime is not enabled, gr2d clock is enabled in device probe and disabled in remove. Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 57 +++ drivers/gpu/host1x/job.c | 5 ++-- 2 files changed, 55 insertions(+), 7 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 27ffcf1..8d92925 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -22,6 +22,7 @@ #include linux/of.h #include linux/of_device.h #include linux/clk.h +#include linux/pm_runtime.h #include channel.h #include drm.h @@ -45,6 +46,8 @@ static inline struct gr2d *to_gr2d(struct host1x_client *client) } static int gr2d_is_addr_reg(struct device *dev, u32 class, u32 reg); +static int gr2d_runtime_suspend(struct device *dev); +static int gr2d_runtime_resume(struct device *dev); static int gr2d_client_init(struct host1x_client *client, struct drm_device *drm) @@ -275,12 +278,18 @@ static int gr2d_probe(struct platform_device *pdev) return PTR_ERR(gr2d-clk); } - err = clk_prepare_enable(gr2d-clk); - if (err) { - dev_err(dev, cannot turn on clock\n); - return err; + /* pm runtime accesses the clock through driver data */ + platform_set_drvdata(pdev, gr2d); + + pm_runtime_enable(pdev-dev); + if (!pm_runtime_enabled(pdev-dev)) { + err = gr2d_runtime_resume(pdev-dev); + if (err 0) + return err; } + pm_runtime_get_sync(pdev-dev); + gr2d-channel = host1x_channel_request(dev); if (!gr2d-channel) return -ENOMEM; @@ -305,7 +314,7 @@ static int gr2d_probe(struct platform_device *pdev) gr2d_init_addr_reg_map(dev, gr2d); - platform_set_drvdata(pdev, gr2d); + pm_runtime_put(pdev-dev); return 0; } @@ -327,11 +336,48 @@ static int __exit gr2d_remove(struct platform_device *pdev) host1x_syncpt_free(gr2d-client.syncpts[i]); host1x_channel_free(gr2d-channel); + + if (pm_runtime_enabled(pdev-dev)) + pm_runtime_disable(pdev-dev); + else + gr2d_runtime_suspend(pdev-dev); + + return 0; +} + +static int gr2d_runtime_suspend(struct device *dev) +{ + struct gr2d *gr2d; + + gr2d = dev_get_drvdata(dev); + if (!gr2d) + return -EINVAL; + clk_disable_unprepare(gr2d-clk); return 0; } +static int gr2d_runtime_resume(struct device *dev) +{ + int err = 0; + struct gr2d *gr2d; + + gr2d = dev_get_drvdata(dev); + if (!gr2d) + return -EINVAL; + + err = clk_prepare_enable(gr2d-clk); + if (err 0) + dev_err(dev, failed to enable clock\n); + + return err; +} + +static const struct dev_pm_ops gr2d_pm_ops = { + SET_RUNTIME_PM_OPS(gr2d_runtime_suspend, gr2d_runtime_resume, NULL) +}; + struct platform_driver tegra_gr2d_driver = { .probe = gr2d_probe, .remove = __exit_p(gr2d_remove), @@ -339,5 +385,6 @@ struct platform_driver tegra_gr2d_driver = { .owner = THIS_MODULE, .name = gr2d, .of_match_table = gr2d_match, + .pm = gr2d_pm_ops, } }; diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 3928b4e..e4148e0 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -23,6 +23,7 @@ #include linux/scatterlist.h #include linux/slab.h #include linux/vmalloc.h +#include linux/pm_runtime.h #include trace/events/host1x.h #include channel.h @@ -589,11 +590,11 @@ void host1x_job_dump(struct device *dev, struct host1x_job *job) int host1x_job_submit(struct host1x_job *job) { struct host1x *host = dev_get_drvdata(job-channel-dev-parent); - + pm_runtime_get_sync(job-channel-dev); return host1x_hw_channel_submit(host, job); } int host1x_job_complete(struct host1x_job *job) { - return 0; + return pm_runtime_put(job-channel-dev); } -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv3 4/4] gpu: host1x: Add runtime pm support for host1x
From: Mayuresh Kulkarni mkulka...@nvidia.com This patch adds runtime pm support for host1x hardware unit. This allows host1x clock to be turned off when it is idle. If pm runtime is not configured, we enable host1x clock in device probe and disable it in remove. Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/dev.c | 57 1 file changed, 53 insertions(+), 4 deletions(-) diff --git a/drivers/gpu/host1x/dev.c b/drivers/gpu/host1x/dev.c index 4716302..5f0b91e 100644 --- a/drivers/gpu/host1x/dev.c +++ b/drivers/gpu/host1x/dev.c @@ -23,6 +23,7 @@ #include linux/of_device.h #include linux/clk.h #include linux/io.h +#include linux/pm_runtime.h #define CREATE_TRACE_POINTS #include trace/events/host1x.h @@ -34,6 +35,9 @@ #include hw/host1x01.h #include host1x_client.h +static int host1x_runtime_suspend(struct device *dev); +static int host1x_runtime_resume(struct device *dev); + void host1x_set_drm_data(struct device *dev, void *data) { struct host1x *host1x = dev_get_drvdata(dev); @@ -143,12 +147,15 @@ static int host1x_probe(struct platform_device *pdev) return err; } - err = clk_prepare_enable(host-clk); - if (err 0) { - dev_err(pdev-dev, failed to enable clock\n); - return err; + pm_runtime_enable(pdev-dev); + if (!pm_runtime_enabled(pdev-dev)) { + err = host1x_runtime_resume(pdev-dev); + if (err 0) + return err; } + pm_runtime_get_sync(pdev-dev); + err = host1x_syncpt_init(host); if (err) { dev_err(pdev-dev, failed to initialize syncpts\n); @@ -165,10 +172,14 @@ static int host1x_probe(struct platform_device *pdev) host1x_drm_alloc(pdev); + pm_runtime_put(pdev-dev); + return 0; fail_deinit_syncpt: host1x_syncpt_deinit(host); + pm_runtime_put(pdev-dev); + pm_runtime_disable(pdev-dev); return err; } @@ -178,11 +189,48 @@ static int __exit host1x_remove(struct platform_device *pdev) host1x_intr_deinit(host); host1x_syncpt_deinit(host); + + if (pm_runtime_enabled(pdev-dev)) + pm_runtime_disable(pdev-dev); + else + host1x_runtime_suspend(pdev-dev); + + return 0; +} + +static int host1x_runtime_suspend(struct device *dev) +{ + struct host1x *host; + + host = dev_get_drvdata(dev); + if (!host) + return -EINVAL; + clk_disable_unprepare(host-clk); return 0; } +static int host1x_runtime_resume(struct device *dev) +{ + int err = 0; + struct host1x *host; + + host = dev_get_drvdata(dev); + if (!host) + return -EINVAL; + + err = clk_prepare_enable(host-clk); + if (err 0) + dev_err(dev, failed to enable clock\n); + + return err; +} + +static const struct dev_pm_ops host1x_pm_ops = { + SET_RUNTIME_PM_OPS(host1x_runtime_suspend, host1x_runtime_resume, NULL) +}; + static struct platform_driver tegra_host1x_driver = { .probe = host1x_probe, .remove = __exit_p(host1x_remove), @@ -190,6 +238,7 @@ static struct platform_driver tegra_host1x_driver = { .owner = THIS_MODULE, .name = tegra-host1x, .of_match_table = host1x_of_match, + .pm = host1x_pm_ops, }, }; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv3 1/4] gpu: host1x: shuffle job APIs
From: Mayuresh Kulkarni mkulka...@nvidia.com This patch moves function host1x_job_submit() to job.c file where all other host1x_job_* functions are placed. This patch also introduces function host1x_job_complete(). Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/cdma.c| 2 ++ drivers/gpu/host1x/channel.c | 8 drivers/gpu/host1x/channel.h | 1 - drivers/gpu/host1x/job.c | 12 drivers/gpu/host1x/job.h | 3 +++ 5 files changed, 17 insertions(+), 9 deletions(-) diff --git a/drivers/gpu/host1x/cdma.c b/drivers/gpu/host1x/cdma.c index de72172..910087b 100644 --- a/drivers/gpu/host1x/cdma.c +++ b/drivers/gpu/host1x/cdma.c @@ -252,6 +252,8 @@ static void update_cdma_locked(struct host1x_cdma *cdma) signal = true; } + host1x_job_complete(job); + list_del(job-list); host1x_job_put(job); } diff --git a/drivers/gpu/host1x/channel.c b/drivers/gpu/host1x/channel.c index 83ea51b..c381441 100644 --- a/drivers/gpu/host1x/channel.c +++ b/drivers/gpu/host1x/channel.c @@ -21,7 +21,6 @@ #include channel.h #include dev.h -#include job.h /* Constructor for the host1x device list */ int host1x_channel_list_init(struct host1x *host) @@ -37,13 +36,6 @@ int host1x_channel_list_init(struct host1x *host) return 0; } -int host1x_job_submit(struct host1x_job *job) -{ - struct host1x *host = dev_get_drvdata(job-channel-dev-parent); - - return host1x_hw_channel_submit(host, job); -} - struct host1x_channel *host1x_channel_get(struct host1x_channel *channel) { int err = 0; diff --git a/drivers/gpu/host1x/channel.h b/drivers/gpu/host1x/channel.h index 48723b8..8401f25 100644 --- a/drivers/gpu/host1x/channel.h +++ b/drivers/gpu/host1x/channel.h @@ -44,7 +44,6 @@ struct host1x_channel *host1x_channel_request(struct device *dev); void host1x_channel_free(struct host1x_channel *channel); struct host1x_channel *host1x_channel_get(struct host1x_channel *channel); void host1x_channel_put(struct host1x_channel *channel); -int host1x_job_submit(struct host1x_job *job); #define host1x_for_each_channel(host, channel) \ list_for_each_entry(channel, host-chlist.list, list) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index c4e1050..3928b4e 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -585,3 +585,15 @@ void host1x_job_dump(struct device *dev, struct host1x_job *job) dev_dbg(dev, NUM_SLOTS %d\n, job-num_slots); dev_dbg(dev, NUM_HANDLES %d\n, job-num_unpins); } + +int host1x_job_submit(struct host1x_job *job) +{ + struct host1x *host = dev_get_drvdata(job-channel-dev-parent); + + return host1x_hw_channel_submit(host, job); +} + +int host1x_job_complete(struct host1x_job *job) +{ + return 0; +} diff --git a/drivers/gpu/host1x/job.h b/drivers/gpu/host1x/job.h index fba45f2..e0249c3 100644 --- a/drivers/gpu/host1x/job.h +++ b/drivers/gpu/host1x/job.h @@ -159,4 +159,7 @@ void host1x_job_unpin(struct host1x_job *job); */ void host1x_job_dump(struct device *dev, struct host1x_job *job); +int host1x_job_submit(struct host1x_job *job); +int host1x_job_complete(struct host1x_job *job); + #endif -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv3 3/4] drm/tegra: Add runtime pm support for dc
From: Mayuresh Kulkarni mkulka...@nvidia.com This patch adds initial runtime pm support for Tegra display controller. As of now, the dc clock is enabled in device probe via runtime pm and disabled in device remove. In case pm runtime is not configured, we simply enable the clock in device probe (..and disable it in remove). Signed-off-by: Mayuresh Kulkarni mkulka...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/dc.c | 58 + 1 file changed, 53 insertions(+), 5 deletions(-) diff --git a/drivers/gpu/host1x/drm/dc.c b/drivers/gpu/host1x/drm/dc.c index b1a05ad..6d1d5fc 100644 --- a/drivers/gpu/host1x/drm/dc.c +++ b/drivers/gpu/host1x/drm/dc.c @@ -13,6 +13,7 @@ #include linux/of.h #include linux/platform_device.h #include linux/clk/tegra.h +#include linux/pm_runtime.h #include host1x_client.h #include dc.h @@ -24,6 +25,9 @@ struct tegra_plane { unsigned int index; }; +static int tegra_dc_runtime_suspend(struct device *dev); +static int tegra_dc_runtime_resume(struct device *dev); + static inline struct tegra_plane *to_tegra_plane(struct drm_plane *plane) { return container_of(plane, struct tegra_plane, base); @@ -1128,9 +1132,7 @@ static int tegra_dc_probe(struct platform_device *pdev) return PTR_ERR(dc-clk); } - err = clk_prepare_enable(dc-clk); - if (err 0) - return err; + platform_set_drvdata(pdev, dc); regs = platform_get_resource(pdev, IORESOURCE_MEM, 0); dc-regs = devm_ioremap_resource(pdev-dev, regs); @@ -1147,6 +1149,15 @@ static int tegra_dc_probe(struct platform_device *pdev) dc-client.ops = dc_client_ops; dc-client.dev = pdev-dev; + pm_runtime_enable(pdev-dev); + if (!pm_runtime_enabled(pdev-dev)) { + err = tegra_dc_runtime_resume(pdev-dev); + if (err 0) + return err; + } + + pm_runtime_get_sync(pdev-dev); + err = tegra_dc_rgb_probe(dc); if (err 0 err != -ENODEV) { dev_err(pdev-dev, failed to probe RGB output: %d\n, err); @@ -1160,8 +1171,6 @@ static int tegra_dc_probe(struct platform_device *pdev) return err; } - platform_set_drvdata(pdev, dc); - return 0; } @@ -1178,11 +1187,49 @@ static int tegra_dc_remove(struct platform_device *pdev) return err; } + pm_runtime_put(pdev-dev); + if (pm_runtime_enabled(pdev-dev)) + pm_runtime_disable(pdev-dev); + else + tegra_dc_runtime_suspend(pdev-dev); + + return 0; +} + +static int tegra_dc_runtime_suspend(struct device *dev) +{ + struct tegra_dc *dc; + + dc = dev_get_drvdata(dev); + if (!dc) + return -EINVAL; + clk_disable_unprepare(dc-clk); return 0; } +static int tegra_dc_runtime_resume(struct device *dev) +{ + int err = 0; + struct tegra_dc *dc; + + dc = dev_get_drvdata(dev); + if (!dc) + return -EINVAL; + + err = clk_prepare_enable(dc-clk); + if (err 0) + dev_err(dev, failed to enable clock\n); + + return err; +} + +static const struct dev_pm_ops tegra_dc_pm_ops = { + SET_RUNTIME_PM_OPS(tegra_dc_runtime_suspend, + tegra_dc_runtime_resume, NULL) +}; + static struct of_device_id tegra_dc_of_match[] = { { .compatible = nvidia,tegra30-dc, }, { .compatible = nvidia,tegra20-dc, }, @@ -1194,6 +1241,7 @@ struct platform_driver tegra_dc_driver = { .name = tegra-dc, .owner = THIS_MODULE, .of_match_table = tegra_dc_of_match, + .pm = tegra_dc_pm_ops, }, .probe = tegra_dc_probe, .remove = tegra_dc_remove, -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 3/7] gpu: host1x: Don't reset firewall between gathers
On 05/29/2013 02:21 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > * Arto Merilainen wrote: > [...] >> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c > [...] >> @@ -553,7 +549,6 @@ int host1x_job_pin(struct host1x_job *job, struct device >> *dev) >> >> if (!err) >> err = do_waitchks(job, host, g->bo); >> - >> if (err) >> break; >> } > > This is another unnecessary whitespace change, but since it was the only > issue I saw with the series I fixed it up myself while applying. > Thanks! I thought I got them all before posting but apparently not... - Arto
[PATCHv2 7/7] gpu: host1x: Rework CPU syncpoint increment
This patch merges host1x_syncpt_cpu_incr to host1x_syncpt_incr() as they are in practise doing the same thing. host1x_syncpt_incr() is also modified to return error codes. User space interface is modified accordingly to pass return values. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/dev.h | 8 drivers/gpu/host1x/drm/drm.c | 3 +-- drivers/gpu/host1x/hw/cdma_hw.c | 2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 +--- drivers/gpu/host1x/syncpt.c | 15 ++- drivers/gpu/host1x/syncpt.h | 7 ++- 6 files changed, 15 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index a1607d6..790ddf1 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -73,7 +73,7 @@ struct host1x_syncpt_ops { void (*restore_wait_base)(struct host1x_syncpt *syncpt); void (*load_wait_base)(struct host1x_syncpt *syncpt); u32 (*load)(struct host1x_syncpt *syncpt); - void (*cpu_incr)(struct host1x_syncpt *syncpt); + int (*cpu_incr)(struct host1x_syncpt *syncpt); int (*patch_wait)(struct host1x_syncpt *syncpt, void *patch_addr); }; @@ -157,10 +157,10 @@ static inline u32 host1x_hw_syncpt_load(struct host1x *host, return host->syncpt_op->load(sp); } -static inline void host1x_hw_syncpt_cpu_incr(struct host1x *host, -struct host1x_syncpt *sp) +static inline int host1x_hw_syncpt_cpu_incr(struct host1x *host, + struct host1x_syncpt *sp) { - host->syncpt_op->cpu_incr(sp); + return host->syncpt_op->cpu_incr(sp); } static inline int host1x_hw_syncpt_patch_wait(struct host1x *host, diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 2b561c9..1dfd454 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -378,8 +378,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data, if (!sp) return -EINVAL; - host1x_syncpt_incr(sp); - return 0; + return host1x_syncpt_incr(sp); } static int tegra_syncpt_wait(struct drm_device *drm, void *data, diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c index 590b69d..2ee4ad5 100644 --- a/drivers/gpu/host1x/hw/cdma_hw.c +++ b/drivers/gpu/host1x/hw/cdma_hw.c @@ -44,7 +44,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr, u32 i; for (i = 0; i < syncpt_incrs; i++) - host1x_syncpt_cpu_incr(cdma->timeout.syncpt); + host1x_syncpt_incr(cdma->timeout.syncpt); /* after CPU incr, ensure shadow is up to date */ host1x_syncpt_load(cdma->timeout.syncpt); diff --git a/drivers/gpu/host1x/hw/syncpt_hw.c b/drivers/gpu/host1x/hw/syncpt_hw.c index 6117499..0cf6095 100644 --- a/drivers/gpu/host1x/hw/syncpt_hw.c +++ b/drivers/gpu/host1x/hw/syncpt_hw.c @@ -77,21 +77,19 @@ static u32 syncpt_load(struct host1x_syncpt *sp) * Write a cpu syncpoint increment to the hardware, without touching * the cache. */ -static void syncpt_cpu_incr(struct host1x_syncpt *sp) +static int syncpt_cpu_incr(struct host1x_syncpt *sp) { struct host1x *host = sp->host; u32 reg_offset = sp->id / 32; if (!host1x_syncpt_client_managed(sp) && - host1x_syncpt_idle(sp)) { - dev_err(host->dev, "Trying to increment syncpoint id %d beyond max\n", - sp->id); - host1x_debug_dump(sp->host); - return; - } + host1x_syncpt_idle(sp)) + return -EINVAL; host1x_sync_writel(host, BIT_MASK(sp->id), HOST1X_SYNC_SYNCPT_CPU_INCR(reg_offset)); wmb(); + + return 0; } /* remove a wait pointed to by patch_addr */ diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 27201b5..409745b 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -129,22 +129,11 @@ u32 host1x_syncpt_load_wait_base(struct host1x_syncpt *sp) } /* - * Write a cpu syncpoint increment to the hardware, without touching - * the cache. Caller is responsible for host being powered. - */ -void host1x_syncpt_cpu_incr(struct host1x_syncpt *sp) -{ - host1x_hw_syncpt_cpu_incr(sp->host, sp); -} - -/* * Increment syncpoint value from cpu, updating cache */ -void host1x_syncpt_incr(struct host1x_syncpt *sp) +int host1x_syncpt_incr(struct host1x_syncpt *sp) { - if (host1x_syncpt_client_managed(sp)) - host1x_syncpt_incr_max(sp, 1); - host1x_syncpt_cpu_incr(sp); + return host1x_hw_syncpt_cpu_incr(sp->host, sp); } /* diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index d00e758..267c0b9 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/s
[PATCHv2 5/7] gpu: host1x: Fix memory access in syncpt request
This patch fixes a bad memory access in syncpoint request code. If no syncpoints were available, the code accessed unreserved memory area causing unexpected behaviour. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/syncpt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 4b49345..2b03f1b 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -40,7 +40,8 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++) ; - if (sp->dev) + + if (i >= host->info->nb_pts) return NULL; name = kasprintf(GFP_KERNEL, "%02d-%s", sp->id, -- 1.8.1.5
[PATCHv2 4/7] gpu: host1x: Copy gathers before verification
The firewall verified gather buffers before copying them. This allowed a malicious application to rewrite the buffer content by timing the rewrite carefully. This patch makes the buffer validation occur after copying the buffers. Signed-off-by: Arto Merilainen Signed-off-by: Terje Bergstrom --- drivers/gpu/host1x/job.c | 50 +++- 1 file changed, 20 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 028d2d4..cc80766 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -228,17 +228,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) void *cmdbuf_page_addr = NULL; /* pin & patch the relocs for one gather */ - while (i < job->num_relocs) { + for (i = 0; i < job->num_relocs; i++) { struct host1x_reloc *reloc = >relocarray[i]; u32 reloc_addr = (job->reloc_addr_phys[i] + reloc->target_offset) >> reloc->shift; u32 *target; /* skip all other gathers */ - if (!(reloc->cmdbuf && cmdbuf == reloc->cmdbuf)) { - i++; + if (cmdbuf != reloc->cmdbuf) continue; - } if (last_page != reloc->cmdbuf_offset >> PAGE_SHIFT) { if (cmdbuf_page_addr) @@ -257,9 +255,6 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) target = cmdbuf_page_addr + (reloc->cmdbuf_offset & ~PAGE_MASK); *target = reloc_addr; - - /* mark this gather as handled */ - reloc->cmdbuf = 0; } if (cmdbuf_page_addr) @@ -378,15 +373,13 @@ static int check_nonincr(struct host1x_firewall *fw) static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { - u32 *cmdbuf_base; + u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + + (g->offset / sizeof(u32)); int err = 0; if (!fw->job->is_addr_reg) return 0; - cmdbuf_base = host1x_bo_mmap(g->bo); - if (!cmdbuf_base) - return -ENOMEM; fw->words = g->words; fw->cmdbuf_id = g->bo; fw->offset = 0; @@ -453,10 +446,17 @@ out: static inline int copy_gathers(struct host1x_job *job, struct device *dev) { + struct host1x_firewall fw; size_t size = 0; size_t offset = 0; int i; + fw.job = job; + fw.dev = dev; + fw.reloc = job->relocarray; + fw.num_relocs = job->num_relocs; + fw.class = 0; + for (i = 0; i < job->num_gathers; i++) { struct host1x_job_gather *g = >gathers[i]; size += g->words * sizeof(u32); @@ -477,14 +477,19 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) struct host1x_job_gather *g = >gathers[i]; void *gather; + /* Copy the gather */ gather = host1x_bo_mmap(g->bo); memcpy(job->gather_copy_mapped + offset, gather + g->offset, g->words * sizeof(u32)); host1x_bo_munmap(g->bo, gather); + /* Store the location in the buffer */ g->base = job->gather_copy; g->offset = offset; - g->bo = NULL; + + /* Validate the job */ + if (validate(, g)) + return -EINVAL; offset += g->words * sizeof(u32); } @@ -497,15 +502,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) int err; unsigned int i, j; struct host1x *host = dev_get_drvdata(dev->parent); - struct host1x_firewall fw; DECLARE_BITMAP(waitchk_mask, host1x_syncpt_nb_pts(host)); - fw.job = job; - fw.dev = dev; - fw.reloc = job->relocarray; - fw.num_relocs = job->num_relocs; - fw.class = 0; - bitmap_zero(waitchk_mask, host1x_syncpt_nb_pts(host)); for (i = 0; i < job->num_waitchk; i++) { u32 syncpt_id = job->waitchk[i].syncpt_id; @@ -536,19 +534,11 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (job->gathers[j].bo == g->bo) job->gathers[j].handled = true; - err = 0; - - if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) - err = validate(, g); - + err = do_relocs(job, g->bo); if (err) - dev_err(dev, "Job invalid (err=%d)\n", err); - - if (!err) - err
[PATCHv2 3/7] gpu: host1x: Don't reset firewall between gathers
From: Terje Bergstrom <tbergst...@nvidia.com> The firewall was reinitialised for each gather. Because the filter was reinitialised, it did not track the class over gather boundaries. This allowed the user application to set host1x class to one class in one gather and use that class in another gather without firewall having knowledge about that. Signed-off-by: Terje Bergstrom Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/job.c | 73 ++-- 1 file changed, 34 insertions(+), 39 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 83804fd..028d2d4 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -376,69 +376,60 @@ static int check_nonincr(struct host1x_firewall *fw) return 0; } -static int validate(struct host1x_job *job, struct device *dev, - struct host1x_job_gather *g) +static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base; int err = 0; - struct host1x_firewall fw; - - fw.job = job; - fw.dev = dev; - fw.reloc = job->relocarray; - fw.num_relocs = job->num_relocs; - fw.cmdbuf_id = g->bo; - - fw.offset = 0; - fw.class = 0; - if (!job->is_addr_reg) + if (!fw->job->is_addr_reg) return 0; cmdbuf_base = host1x_bo_mmap(g->bo); if (!cmdbuf_base) return -ENOMEM; + fw->words = g->words; + fw->cmdbuf_id = g->bo; + fw->offset = 0; - fw.words = g->words; - while (fw.words && !err) { - u32 word = cmdbuf_base[fw.offset]; + while (fw->words && !err) { + u32 word = cmdbuf_base[fw->offset]; u32 opcode = (word & 0xf000) >> 28; - fw.mask = 0; - fw.reg = 0; - fw.count = 0; - fw.words--; - fw.offset++; + fw->mask = 0; + fw->reg = 0; + fw->count = 0; + fw->words--; + fw->offset++; switch (opcode) { case 0: - fw.class = word >> 6 & 0x3ff; - fw.mask = word & 0x3f; - fw.reg = word >> 16 & 0xfff; - err = check_mask(); + fw->class = word >> 6 & 0x3ff; + fw->mask = word & 0x3f; + fw->reg = word >> 16 & 0xfff; + err = check_mask(fw); if (err) goto out; break; case 1: - fw.reg = word >> 16 & 0xfff; - fw.count = word & 0x; - err = check_incr(); + fw->reg = word >> 16 & 0xfff; + fw->count = word & 0x; + err = check_incr(fw); if (err) goto out; break; case 2: - fw.reg = word >> 16 & 0xfff; - fw.count = word & 0x; - err = check_nonincr(); + fw->reg = word >> 16 & 0xfff; + fw->count = word & 0x; + err = check_nonincr(fw); if (err) goto out; break; case 3: - fw.mask = word & 0x; - fw.reg = word >> 16 & 0xfff; - err = check_mask(); + fw->mask = word & 0x; + fw->reg = word >> 16 & 0xfff; + err = check_mask(fw); if (err) goto out; break; @@ -453,12 +444,10 @@ static int validate(struct host1x_job *job, struct device *dev, } /* No relocs should remain at this point */ - if (fw.num_relocs) + if (fw->num_relocs) err = -EINVAL; out: - host1x_bo_munmap(g->bo, cmdbuf_base); - return err; } @@ -508,8 +497,15 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) int err; unsigned int i, j; struct host1x *host = dev_get_drvdata(dev->parent); + struct host1x_firewall fw; DECLARE_BITMAP(waitchk_mask, host1x_syncpt_nb_pts(host)); + fw.job = job; + fw.dev = dev; + fw.reloc = job->relocarray; + fw.num_relocs = job->num_relocs; + fw.class = 0; +
[PATCHv2 2/7] gpu: host1x: Check reloc table before usage
The firewall assumed that the user space always delivers a relocation table when it is accessing address registers. If userspace did not deliver a relocation table and tried to access the address registers, the code performed bad memory accesses. This patch modifies the firewall to check correctly that the firewall table is available before accessing it. In addition, check_reloc() is converted to use boolean return value (true when the reloc is valid, false when invalid). Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/job.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 2974ac8..83804fd 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -268,15 +268,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) return 0; } -static int check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, +static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, unsigned int offset) { offset *= sizeof(u32); if (reloc->cmdbuf != cmdbuf || reloc->cmdbuf_offset != offset) - return -EINVAL; + return false; - return 0; + return true; } struct host1x_firewall { @@ -307,10 +307,10 @@ static int check_mask(struct host1x_firewall *fw) if (mask & 1) { if (fw->job->is_addr_reg(fw->dev, fw->class, reg)) { - bool bad_reloc = check_reloc(fw->reloc, -fw->cmdbuf_id, -fw->offset); - if (!fw->num_relocs || bad_reloc) + if (!fw->num_relocs) + return -EINVAL; + if (!check_reloc(fw->reloc, fw->cmdbuf_id, +fw->offset)) return -EINVAL; fw->reloc++; fw->num_relocs--; @@ -335,9 +335,9 @@ static int check_incr(struct host1x_firewall *fw) return -EINVAL; if (fw->job->is_addr_reg(fw->dev, fw->class, reg)) { - bool bad_reloc = check_reloc(fw->reloc, fw->cmdbuf_id, -fw->offset); - if (!fw->num_relocs || bad_reloc) + if (!fw->num_relocs) + return -EINVAL; + if (!check_reloc(fw->reloc, fw->cmdbuf_id, fw->offset)) return -EINVAL; fw->reloc++; fw->num_relocs--; @@ -361,9 +361,9 @@ static int check_nonincr(struct host1x_firewall *fw) return -EINVAL; if (is_addr_reg) { - bool bad_reloc = check_reloc(fw->reloc, fw->cmdbuf_id, -fw->offset); - if (!fw->num_relocs || bad_reloc) + if (!fw->num_relocs) + return -EINVAL; + if (!check_reloc(fw->reloc, fw->cmdbuf_id, fw->offset)) return -EINVAL; fw->reloc++; fw->num_relocs--; -- 1.8.1.5
[PATCHv2 1/7] gpu: host1x: Check INCR opcode correctly
From: Terje Bergstrom <tbergst...@nvidia.com> The firewall code used a wrong loop condition (pointer to a structure) while checking INCR opcode. This patch fixes the code to use correct loop condition (number of words remaining). Signed-off-by: Terje Bergstrom Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/job.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index f665d67..2974ac8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -330,7 +330,7 @@ static int check_incr(struct host1x_firewall *fw) u32 count = fw->count; u32 reg = fw->reg; - while (fw) { + while (count) { if (fw->words == 0) return -EINVAL; -- 1.8.1.5
[PATCHv2 0/7] Miscellaneous fixes to host1x
This patch series fixes two issues in the host1x driver: First, the command buffer validation routine had vulnerabilities that were not detected in earlier testing. Second, the return codes of some functions were misleading or completely missing. This caused the driver to give wrong return codes also to the userspace. The series is based on top of 3.10rc3. I have tested the patch series on cardhu by running host1x and gr2d test cases (available at [0]). I would appreciate any help in testing/reviewing these patches. Changes from v1: * Rebased on top of 3.10rc3 * Split firewall fixes to smaller patches * Reworked no-reloc case in firewall code. Fix in v1 was not sufficient in all cases * Dropped patch "Fix syncpoint wait return value" as it is not critical and discussion on it has not yet settled. * Fixed style and whitespace issues [0] https://gitorious.org/linux-host1x/libdrm-host1x Arto Merilainen (5): gpu: host1x: Check reloc table before usage gpu: host1x: Copy gathers before verification gpu: host1x: Fix memory access in syncpt request gpu: host1x: Fix client_managed type gpu: host1x: Rework CPU syncpoint increment Terje Bergstrom (2): gpu: host1x: Check INCR opcode correctly gpu: host1x: Don't reset firewall between gathers drivers/gpu/host1x/dev.h | 8 +-- drivers/gpu/host1x/drm/drm.c | 3 +- drivers/gpu/host1x/drm/gr2d.c | 2 +- drivers/gpu/host1x/hw/cdma_hw.c | 2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 ++-- drivers/gpu/host1x/job.c | 135 +- drivers/gpu/host1x/syncpt.c | 26 +++- drivers/gpu/host1x/syncpt.h | 13 ++-- 8 files changed, 85 insertions(+), 116 deletions(-) -- 1.8.1.5
[PATCHv2 0/7] Miscellaneous fixes to host1x
This patch series fixes two issues in the host1x driver: First, the command buffer validation routine had vulnerabilities that were not detected in earlier testing. Second, the return codes of some functions were misleading or completely missing. This caused the driver to give wrong return codes also to the userspace. The series is based on top of 3.10rc3. I have tested the patch series on cardhu by running host1x and gr2d test cases (available at [0]). I would appreciate any help in testing/reviewing these patches. Changes from v1: * Rebased on top of 3.10rc3 * Split firewall fixes to smaller patches * Reworked no-reloc case in firewall code. Fix in v1 was not sufficient in all cases * Dropped patch Fix syncpoint wait return value as it is not critical and discussion on it has not yet settled. * Fixed style and whitespace issues [0] https://gitorious.org/linux-host1x/libdrm-host1x Arto Merilainen (5): gpu: host1x: Check reloc table before usage gpu: host1x: Copy gathers before verification gpu: host1x: Fix memory access in syncpt request gpu: host1x: Fix client_managed type gpu: host1x: Rework CPU syncpoint increment Terje Bergstrom (2): gpu: host1x: Check INCR opcode correctly gpu: host1x: Don't reset firewall between gathers drivers/gpu/host1x/dev.h | 8 +-- drivers/gpu/host1x/drm/drm.c | 3 +- drivers/gpu/host1x/drm/gr2d.c | 2 +- drivers/gpu/host1x/hw/cdma_hw.c | 2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 ++-- drivers/gpu/host1x/job.c | 135 +- drivers/gpu/host1x/syncpt.c | 26 +++- drivers/gpu/host1x/syncpt.h | 13 ++-- 8 files changed, 85 insertions(+), 116 deletions(-) -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 1/7] gpu: host1x: Check INCR opcode correctly
From: Terje Bergstrom tbergst...@nvidia.com The firewall code used a wrong loop condition (pointer to a structure) while checking INCR opcode. This patch fixes the code to use correct loop condition (number of words remaining). Signed-off-by: Terje Bergstrom tbergst...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/job.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index f665d67..2974ac8 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -330,7 +330,7 @@ static int check_incr(struct host1x_firewall *fw) u32 count = fw-count; u32 reg = fw-reg; - while (fw) { + while (count) { if (fw-words == 0) return -EINVAL; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 2/7] gpu: host1x: Check reloc table before usage
The firewall assumed that the user space always delivers a relocation table when it is accessing address registers. If userspace did not deliver a relocation table and tried to access the address registers, the code performed bad memory accesses. This patch modifies the firewall to check correctly that the firewall table is available before accessing it. In addition, check_reloc() is converted to use boolean return value (true when the reloc is valid, false when invalid). Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/job.c | 26 +- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 2974ac8..83804fd 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -268,15 +268,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) return 0; } -static int check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, +static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, unsigned int offset) { offset *= sizeof(u32); if (reloc-cmdbuf != cmdbuf || reloc-cmdbuf_offset != offset) - return -EINVAL; + return false; - return 0; + return true; } struct host1x_firewall { @@ -307,10 +307,10 @@ static int check_mask(struct host1x_firewall *fw) if (mask 1) { if (fw-job-is_addr_reg(fw-dev, fw-class, reg)) { - bool bad_reloc = check_reloc(fw-reloc, -fw-cmdbuf_id, -fw-offset); - if (!fw-num_relocs || bad_reloc) + if (!fw-num_relocs) + return -EINVAL; + if (!check_reloc(fw-reloc, fw-cmdbuf_id, +fw-offset)) return -EINVAL; fw-reloc++; fw-num_relocs--; @@ -335,9 +335,9 @@ static int check_incr(struct host1x_firewall *fw) return -EINVAL; if (fw-job-is_addr_reg(fw-dev, fw-class, reg)) { - bool bad_reloc = check_reloc(fw-reloc, fw-cmdbuf_id, -fw-offset); - if (!fw-num_relocs || bad_reloc) + if (!fw-num_relocs) + return -EINVAL; + if (!check_reloc(fw-reloc, fw-cmdbuf_id, fw-offset)) return -EINVAL; fw-reloc++; fw-num_relocs--; @@ -361,9 +361,9 @@ static int check_nonincr(struct host1x_firewall *fw) return -EINVAL; if (is_addr_reg) { - bool bad_reloc = check_reloc(fw-reloc, fw-cmdbuf_id, -fw-offset); - if (!fw-num_relocs || bad_reloc) + if (!fw-num_relocs) + return -EINVAL; + if (!check_reloc(fw-reloc, fw-cmdbuf_id, fw-offset)) return -EINVAL; fw-reloc++; fw-num_relocs--; -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 3/7] gpu: host1x: Don't reset firewall between gathers
From: Terje Bergstrom tbergst...@nvidia.com The firewall was reinitialised for each gather. Because the filter was reinitialised, it did not track the class over gather boundaries. This allowed the user application to set host1x class to one class in one gather and use that class in another gather without firewall having knowledge about that. Signed-off-by: Terje Bergstrom tbergst...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/job.c | 73 ++-- 1 file changed, 34 insertions(+), 39 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 83804fd..028d2d4 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -376,69 +376,60 @@ static int check_nonincr(struct host1x_firewall *fw) return 0; } -static int validate(struct host1x_job *job, struct device *dev, - struct host1x_job_gather *g) +static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { u32 *cmdbuf_base; int err = 0; - struct host1x_firewall fw; - - fw.job = job; - fw.dev = dev; - fw.reloc = job-relocarray; - fw.num_relocs = job-num_relocs; - fw.cmdbuf_id = g-bo; - - fw.offset = 0; - fw.class = 0; - if (!job-is_addr_reg) + if (!fw-job-is_addr_reg) return 0; cmdbuf_base = host1x_bo_mmap(g-bo); if (!cmdbuf_base) return -ENOMEM; + fw-words = g-words; + fw-cmdbuf_id = g-bo; + fw-offset = 0; - fw.words = g-words; - while (fw.words !err) { - u32 word = cmdbuf_base[fw.offset]; + while (fw-words !err) { + u32 word = cmdbuf_base[fw-offset]; u32 opcode = (word 0xf000) 28; - fw.mask = 0; - fw.reg = 0; - fw.count = 0; - fw.words--; - fw.offset++; + fw-mask = 0; + fw-reg = 0; + fw-count = 0; + fw-words--; + fw-offset++; switch (opcode) { case 0: - fw.class = word 6 0x3ff; - fw.mask = word 0x3f; - fw.reg = word 16 0xfff; - err = check_mask(fw); + fw-class = word 6 0x3ff; + fw-mask = word 0x3f; + fw-reg = word 16 0xfff; + err = check_mask(fw); if (err) goto out; break; case 1: - fw.reg = word 16 0xfff; - fw.count = word 0x; - err = check_incr(fw); + fw-reg = word 16 0xfff; + fw-count = word 0x; + err = check_incr(fw); if (err) goto out; break; case 2: - fw.reg = word 16 0xfff; - fw.count = word 0x; - err = check_nonincr(fw); + fw-reg = word 16 0xfff; + fw-count = word 0x; + err = check_nonincr(fw); if (err) goto out; break; case 3: - fw.mask = word 0x; - fw.reg = word 16 0xfff; - err = check_mask(fw); + fw-mask = word 0x; + fw-reg = word 16 0xfff; + err = check_mask(fw); if (err) goto out; break; @@ -453,12 +444,10 @@ static int validate(struct host1x_job *job, struct device *dev, } /* No relocs should remain at this point */ - if (fw.num_relocs) + if (fw-num_relocs) err = -EINVAL; out: - host1x_bo_munmap(g-bo, cmdbuf_base); - return err; } @@ -508,8 +497,15 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) int err; unsigned int i, j; struct host1x *host = dev_get_drvdata(dev-parent); + struct host1x_firewall fw; DECLARE_BITMAP(waitchk_mask, host1x_syncpt_nb_pts(host)); + fw.job = job; + fw.dev = dev; + fw.reloc = job-relocarray; + fw.num_relocs = job-num_relocs; + fw.class = 0; + bitmap_zero(waitchk_mask, host1x_syncpt_nb_pts(host)); for (i = 0; i job-num_waitchk; i++) { u32 syncpt_id = job-waitchk[i].syncpt_id; @@ -543,7 +539,7 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) err = 0
[PATCHv2 4/7] gpu: host1x: Copy gathers before verification
The firewall verified gather buffers before copying them. This allowed a malicious application to rewrite the buffer content by timing the rewrite carefully. This patch makes the buffer validation occur after copying the buffers. Signed-off-by: Arto Merilainen amerilai...@nvidia.com Signed-off-by: Terje Bergstrom tbergst...@nvidia.com --- drivers/gpu/host1x/job.c | 50 +++- 1 file changed, 20 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index 028d2d4..cc80766 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -228,17 +228,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) void *cmdbuf_page_addr = NULL; /* pin patch the relocs for one gather */ - while (i job-num_relocs) { + for (i = 0; i job-num_relocs; i++) { struct host1x_reloc *reloc = job-relocarray[i]; u32 reloc_addr = (job-reloc_addr_phys[i] + reloc-target_offset) reloc-shift; u32 *target; /* skip all other gathers */ - if (!(reloc-cmdbuf cmdbuf == reloc-cmdbuf)) { - i++; + if (cmdbuf != reloc-cmdbuf) continue; - } if (last_page != reloc-cmdbuf_offset PAGE_SHIFT) { if (cmdbuf_page_addr) @@ -257,9 +255,6 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) target = cmdbuf_page_addr + (reloc-cmdbuf_offset ~PAGE_MASK); *target = reloc_addr; - - /* mark this gather as handled */ - reloc-cmdbuf = 0; } if (cmdbuf_page_addr) @@ -378,15 +373,13 @@ static int check_nonincr(struct host1x_firewall *fw) static int validate(struct host1x_firewall *fw, struct host1x_job_gather *g) { - u32 *cmdbuf_base; + u32 *cmdbuf_base = (u32 *)fw-job-gather_copy_mapped + + (g-offset / sizeof(u32)); int err = 0; if (!fw-job-is_addr_reg) return 0; - cmdbuf_base = host1x_bo_mmap(g-bo); - if (!cmdbuf_base) - return -ENOMEM; fw-words = g-words; fw-cmdbuf_id = g-bo; fw-offset = 0; @@ -453,10 +446,17 @@ out: static inline int copy_gathers(struct host1x_job *job, struct device *dev) { + struct host1x_firewall fw; size_t size = 0; size_t offset = 0; int i; + fw.job = job; + fw.dev = dev; + fw.reloc = job-relocarray; + fw.num_relocs = job-num_relocs; + fw.class = 0; + for (i = 0; i job-num_gathers; i++) { struct host1x_job_gather *g = job-gathers[i]; size += g-words * sizeof(u32); @@ -477,14 +477,19 @@ static inline int copy_gathers(struct host1x_job *job, struct device *dev) struct host1x_job_gather *g = job-gathers[i]; void *gather; + /* Copy the gather */ gather = host1x_bo_mmap(g-bo); memcpy(job-gather_copy_mapped + offset, gather + g-offset, g-words * sizeof(u32)); host1x_bo_munmap(g-bo, gather); + /* Store the location in the buffer */ g-base = job-gather_copy; g-offset = offset; - g-bo = NULL; + + /* Validate the job */ + if (validate(fw, g)) + return -EINVAL; offset += g-words * sizeof(u32); } @@ -497,15 +502,8 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) int err; unsigned int i, j; struct host1x *host = dev_get_drvdata(dev-parent); - struct host1x_firewall fw; DECLARE_BITMAP(waitchk_mask, host1x_syncpt_nb_pts(host)); - fw.job = job; - fw.dev = dev; - fw.reloc = job-relocarray; - fw.num_relocs = job-num_relocs; - fw.class = 0; - bitmap_zero(waitchk_mask, host1x_syncpt_nb_pts(host)); for (i = 0; i job-num_waitchk; i++) { u32 syncpt_id = job-waitchk[i].syncpt_id; @@ -536,19 +534,11 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) if (job-gathers[j].bo == g-bo) job-gathers[j].handled = true; - err = 0; - - if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL)) - err = validate(fw, g); - + err = do_relocs(job, g-bo); if (err) - dev_err(dev, Job invalid (err=%d)\n, err); - - if (!err) - err = do_relocs(job, g-bo); + break; - if (!err) - err = do_waitchks(job, host, g-bo); + err = do_waitchks(job, host
[PATCHv2 5/7] gpu: host1x: Fix memory access in syncpt request
This patch fixes a bad memory access in syncpoint request code. If no syncpoints were available, the code accessed unreserved memory area causing unexpected behaviour. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/syncpt.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 4b49345..2b03f1b 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -40,7 +40,8 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, for (i = 0; i host-info-nb_pts sp-name; i++, sp++) ; - if (sp-dev) + + if (i = host-info-nb_pts) return NULL; name = kasprintf(GFP_KERNEL, %02d-%s, sp-id, -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 6/7] gpu: host1x: Fix client_managed type
client_managed field in syncpoint structure was defined as an integer. The field holds, however, only a boolean value. This patch modifies the type to boolean. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c | 2 +- drivers/gpu/host1x/syncpt.c | 8 drivers/gpu/host1x/syncpt.h | 6 +++--- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 6a45ae0..6c3a27b 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -281,7 +281,7 @@ static int gr2d_probe(struct platform_device *pdev) if (!gr2d-channel) return -ENOMEM; - *syncpts = host1x_syncpt_request(dev, 0); + *syncpts = host1x_syncpt_request(dev, false); if (!(*syncpts)) { host1x_channel_free(gr2d-channel); return -ENOMEM; diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 2b03f1b..27201b5 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -32,7 +32,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, struct device *dev, - int client_managed) + bool client_managed) { int i; struct host1x_syncpt *sp = host-syncpt; @@ -332,7 +332,7 @@ int host1x_syncpt_init(struct host1x *host) host1x_syncpt_restore(host); /* Allocate sync point to use for clearing waits for expired fences */ - host-nop_sp = _host1x_syncpt_alloc(host, NULL, 0); + host-nop_sp = _host1x_syncpt_alloc(host, NULL, false); if (!host-nop_sp) return -ENOMEM; @@ -340,7 +340,7 @@ int host1x_syncpt_init(struct host1x *host) } struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - int client_managed) + bool client_managed) { struct host1x *host = dev_get_drvdata(dev-parent); return _host1x_syncpt_alloc(host, dev, client_managed); @@ -354,7 +354,7 @@ void host1x_syncpt_free(struct host1x_syncpt *sp) kfree(sp-name); sp-dev = NULL; sp-name = NULL; - sp-client_managed = 0; + sp-client_managed = false; } void host1x_syncpt_deinit(struct host1x *host) diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index c998061..d00e758 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -36,7 +36,7 @@ struct host1x_syncpt { atomic_t max_val; u32 base_val; const char *name; - int client_managed; + bool client_managed; struct host1x *host; struct device *dev; @@ -94,7 +94,7 @@ static inline bool host1x_syncpt_check_max(struct host1x_syncpt *sp, u32 real) } /* Return true if sync point is client managed. */ -static inline int host1x_syncpt_client_managed(struct host1x_syncpt *sp) +static inline bool host1x_syncpt_client_managed(struct host1x_syncpt *sp) { return sp-client_managed; } @@ -157,7 +157,7 @@ u32 host1x_syncpt_id(struct host1x_syncpt *sp); /* Allocate a sync point for a device. */ struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - int client_managed); + bool client_managed); /* Free a sync point. */ void host1x_syncpt_free(struct host1x_syncpt *sp); -- 1.8.1.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCHv2 7/7] gpu: host1x: Rework CPU syncpoint increment
This patch merges host1x_syncpt_cpu_incr to host1x_syncpt_incr() as they are in practise doing the same thing. host1x_syncpt_incr() is also modified to return error codes. User space interface is modified accordingly to pass return values. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/dev.h | 8 drivers/gpu/host1x/drm/drm.c | 3 +-- drivers/gpu/host1x/hw/cdma_hw.c | 2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 +--- drivers/gpu/host1x/syncpt.c | 15 ++- drivers/gpu/host1x/syncpt.h | 7 ++- 6 files changed, 15 insertions(+), 32 deletions(-) diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index a1607d6..790ddf1 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -73,7 +73,7 @@ struct host1x_syncpt_ops { void (*restore_wait_base)(struct host1x_syncpt *syncpt); void (*load_wait_base)(struct host1x_syncpt *syncpt); u32 (*load)(struct host1x_syncpt *syncpt); - void (*cpu_incr)(struct host1x_syncpt *syncpt); + int (*cpu_incr)(struct host1x_syncpt *syncpt); int (*patch_wait)(struct host1x_syncpt *syncpt, void *patch_addr); }; @@ -157,10 +157,10 @@ static inline u32 host1x_hw_syncpt_load(struct host1x *host, return host-syncpt_op-load(sp); } -static inline void host1x_hw_syncpt_cpu_incr(struct host1x *host, -struct host1x_syncpt *sp) +static inline int host1x_hw_syncpt_cpu_incr(struct host1x *host, + struct host1x_syncpt *sp) { - host-syncpt_op-cpu_incr(sp); + return host-syncpt_op-cpu_incr(sp); } static inline int host1x_hw_syncpt_patch_wait(struct host1x *host, diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 2b561c9..1dfd454 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -378,8 +378,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data, if (!sp) return -EINVAL; - host1x_syncpt_incr(sp); - return 0; + return host1x_syncpt_incr(sp); } static int tegra_syncpt_wait(struct drm_device *drm, void *data, diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c index 590b69d..2ee4ad5 100644 --- a/drivers/gpu/host1x/hw/cdma_hw.c +++ b/drivers/gpu/host1x/hw/cdma_hw.c @@ -44,7 +44,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr, u32 i; for (i = 0; i syncpt_incrs; i++) - host1x_syncpt_cpu_incr(cdma-timeout.syncpt); + host1x_syncpt_incr(cdma-timeout.syncpt); /* after CPU incr, ensure shadow is up to date */ host1x_syncpt_load(cdma-timeout.syncpt); diff --git a/drivers/gpu/host1x/hw/syncpt_hw.c b/drivers/gpu/host1x/hw/syncpt_hw.c index 6117499..0cf6095 100644 --- a/drivers/gpu/host1x/hw/syncpt_hw.c +++ b/drivers/gpu/host1x/hw/syncpt_hw.c @@ -77,21 +77,19 @@ static u32 syncpt_load(struct host1x_syncpt *sp) * Write a cpu syncpoint increment to the hardware, without touching * the cache. */ -static void syncpt_cpu_incr(struct host1x_syncpt *sp) +static int syncpt_cpu_incr(struct host1x_syncpt *sp) { struct host1x *host = sp-host; u32 reg_offset = sp-id / 32; if (!host1x_syncpt_client_managed(sp) - host1x_syncpt_idle(sp)) { - dev_err(host-dev, Trying to increment syncpoint id %d beyond max\n, - sp-id); - host1x_debug_dump(sp-host); - return; - } + host1x_syncpt_idle(sp)) + return -EINVAL; host1x_sync_writel(host, BIT_MASK(sp-id), HOST1X_SYNC_SYNCPT_CPU_INCR(reg_offset)); wmb(); + + return 0; } /* remove a wait pointed to by patch_addr */ diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 27201b5..409745b 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -129,22 +129,11 @@ u32 host1x_syncpt_load_wait_base(struct host1x_syncpt *sp) } /* - * Write a cpu syncpoint increment to the hardware, without touching - * the cache. Caller is responsible for host being powered. - */ -void host1x_syncpt_cpu_incr(struct host1x_syncpt *sp) -{ - host1x_hw_syncpt_cpu_incr(sp-host, sp); -} - -/* * Increment syncpoint value from cpu, updating cache */ -void host1x_syncpt_incr(struct host1x_syncpt *sp) +int host1x_syncpt_incr(struct host1x_syncpt *sp) { - if (host1x_syncpt_client_managed(sp)) - host1x_syncpt_incr_max(sp, 1); - host1x_syncpt_cpu_incr(sp); + return host1x_hw_syncpt_cpu_incr(sp-host, sp); } /* diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index d00e758..267c0b9 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -115,9 +115,6 @@ static inline bool host1x_syncpt_idle
[PATCH 3/6] gpu: host1x: Fix memory access in syncpt request
On 05/26/2013 01:15 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > On Fri, May 17, 2013 at 02:49:45PM +0300, Arto Merilainen wrote: >> This patch fixes a bad memory access in syncpoint request code. If >> no syncpoints were available, the code accessed unreserved memory >> area causing unexpected behaviour. >> >> Signed-off-by: Arto Merilainen >> --- >> drivers/gpu/host1x/syncpt.c |2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) >> >> diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c >> index 5bf5366..6b7ee88 100644 >> --- a/drivers/gpu/host1x/syncpt.c >> +++ b/drivers/gpu/host1x/syncpt.c >> @@ -40,7 +40,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct >> host1x *host, >> >> for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++) >> ; >> -if (sp->dev) >> +if (i >= host->info->nb_pts) >> return NULL; > > While changing this, can you please add a blank line between the loop > and the new 'if (...)'? > Yep. Will do. - Arto
[PATCH 2/6] gpu: host1x: Fix syncpoint wait return value
On 05/26/2013 01:12 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > On Fri, May 17, 2013 at 02:49:44PM +0300, Arto Merilainen wrote: >> Syncpoint wait returned EAGAIN if it was called with zero timeout. >> This patch modifies the function to return ETIMEDOUT. > > This description is a bit redundant, because it repeats in prose what > the code does. I'd rather see a description of why the change is > necessary. True. Will fix. > > Thinking about it, maybe it would be good to have two separate error > codes. Keeping -EAGAIN for the case where a zero timeout was passed > doesn't sound too bad to differentiate it from the case where a non- > zero timeout was passed and it actually timed out. What do you think? I agree, in this case it would not look bad at all. However, user space libraries may loop until the ioctl return code is something else than -EAGAIN or -EINTR. Especially function drmIoctl() in libdrm does this which is why I noted this isssue in the first place. If user space uses zero timeout to just check if a syncpoint value has already passed the library continues looping until the syncpoint value actually passes. Of course, we could just modify the ioctl interface to "cast" this return code to something else but that does not seem correct. - Arto
[PATCH 1/6] gpu: host1x: Fixes to host1x firewall
On 05/26/2013 01:02 PM, Thierry Reding wrote: > * PGP Signed by an unknown key > > On Fri, May 17, 2013 at 02:49:43PM +0300, Arto Merilainen wrote: >> From: Terje Bergstrom >> >> This patch adds several fixes to host1x firewall: >> - Host1x firewall does not survive if it expects a reloc, but user >>space didn't pass any relocs. Also it reset the reloc table for >>each gather, whereas they should be reset only per submit. Also >>class does not need to be reset for each class - once per submit >>is enough. >> - For INCR opcode the check was not working properly at all. >> - The firewall verified gather buffers before copying them. This >>allowed a malicious application to rewrite the buffer content by >>timing the rewrite carefully. This patch makes the buffer >>validation occur after copying the buffers. > > Can these be split into separate patches, please? It's not only always > good to split logical changes into separate patches but it also makes > reviewing a lot more pleasant. It's hard to tell from this combined > patch which changes belong together. Sure. > > I have a few additional comments inline. > >> diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c >> index f665d67..4f3c004 100644 >> --- a/drivers/gpu/host1x/job.c >> +++ b/drivers/gpu/host1x/job.c >> @@ -228,17 +228,15 @@ static unsigned int do_relocs(struct host1x_job *job, >> struct host1x_bo *cmdbuf) >> void *cmdbuf_page_addr = NULL; >> >> /* pin & patch the relocs for one gather */ >> -while (i < job->num_relocs) { >> +for (i = 0; i < job->num_relocs; ++i) { > > Nit: I prefer post-increment where possible. For consistency. Will do. > >> @@ -268,15 +263,15 @@ static unsigned int do_relocs(struct host1x_job *job, >> struct host1x_bo *cmdbuf) >> return 0; >> } >> >> -static int check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, >> - unsigned int offset) >> +static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo >> *cmdbuf, >> +unsigned int offset) >> { >> offset *= sizeof(u32); >> >> -if (reloc->cmdbuf != cmdbuf || reloc->cmdbuf_offset != offset) >> -return -EINVAL; >> +if (!reloc || reloc->cmdbuf != cmdbuf || reloc->cmdbuf_offset != offset) > > Is the additional !reloc check really necessary? Looking at the callers, > they always pass in fw->relocarray, which in turn is only NULL if no > buffers are to be relocated. Yes, the additional check is necessary exactly for that reason. The code will fail if the userspace does not deliver a relocation array and still pushes data to an address register. However, the code *should* check that there are relocations left before even coming here so I probably just fix this differently. > >> +return true; >> >> -return 0; >> +return false; >> } > > I wonder whether we should be changing the logic here and have the > check_reloc() function return true if the relocation is good. I find > that to be more intuitive. > I was also thinking that earlier. Will do. >> @@ -508,6 +502,7 @@ int host1x_job_pin(struct host1x_job *job, struct device >> *dev) >> int err; >> unsigned int i, j; >> struct host1x *host = dev_get_drvdata(dev->parent); >> + >> DECLARE_BITMAP(waitchk_mask, host1x_syncpt_nb_pts(host)); > > This is an unnecessary whitespace change. Ops. Will fix. - Arto
Re: [PATCH 1/6] gpu: host1x: Fixes to host1x firewall
On 05/26/2013 01:02 PM, Thierry Reding wrote: * PGP Signed by an unknown key On Fri, May 17, 2013 at 02:49:43PM +0300, Arto Merilainen wrote: From: Terje Bergstrom tbergst...@nvidia.com This patch adds several fixes to host1x firewall: - Host1x firewall does not survive if it expects a reloc, but user space didn't pass any relocs. Also it reset the reloc table for each gather, whereas they should be reset only per submit. Also class does not need to be reset for each class - once per submit is enough. - For INCR opcode the check was not working properly at all. - The firewall verified gather buffers before copying them. This allowed a malicious application to rewrite the buffer content by timing the rewrite carefully. This patch makes the buffer validation occur after copying the buffers. Can these be split into separate patches, please? It's not only always good to split logical changes into separate patches but it also makes reviewing a lot more pleasant. It's hard to tell from this combined patch which changes belong together. Sure. I have a few additional comments inline. diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index f665d67..4f3c004 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -228,17 +228,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) void *cmdbuf_page_addr = NULL; /* pin patch the relocs for one gather */ - while (i job-num_relocs) { + for (i = 0; i job-num_relocs; ++i) { Nit: I prefer post-increment where possible. For consistency. Will do. @@ -268,15 +263,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) return 0; } -static int check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, - unsigned int offset) +static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, + unsigned int offset) { offset *= sizeof(u32); - if (reloc-cmdbuf != cmdbuf || reloc-cmdbuf_offset != offset) - return -EINVAL; + if (!reloc || reloc-cmdbuf != cmdbuf || reloc-cmdbuf_offset != offset) Is the additional !reloc check really necessary? Looking at the callers, they always pass in fw-relocarray, which in turn is only NULL if no buffers are to be relocated. Yes, the additional check is necessary exactly for that reason. The code will fail if the userspace does not deliver a relocation array and still pushes data to an address register. However, the code *should* check that there are relocations left before even coming here so I probably just fix this differently. + return true; - return 0; + return false; } I wonder whether we should be changing the logic here and have the check_reloc() function return true if the relocation is good. I find that to be more intuitive. I was also thinking that earlier. Will do. @@ -508,6 +502,7 @@ int host1x_job_pin(struct host1x_job *job, struct device *dev) int err; unsigned int i, j; struct host1x *host = dev_get_drvdata(dev-parent); + DECLARE_BITMAP(waitchk_mask, host1x_syncpt_nb_pts(host)); This is an unnecessary whitespace change. Ops. Will fix. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 2/6] gpu: host1x: Fix syncpoint wait return value
On 05/26/2013 01:12 PM, Thierry Reding wrote: * PGP Signed by an unknown key On Fri, May 17, 2013 at 02:49:44PM +0300, Arto Merilainen wrote: Syncpoint wait returned EAGAIN if it was called with zero timeout. This patch modifies the function to return ETIMEDOUT. This description is a bit redundant, because it repeats in prose what the code does. I'd rather see a description of why the change is necessary. True. Will fix. Thinking about it, maybe it would be good to have two separate error codes. Keeping -EAGAIN for the case where a zero timeout was passed doesn't sound too bad to differentiate it from the case where a non- zero timeout was passed and it actually timed out. What do you think? I agree, in this case it would not look bad at all. However, user space libraries may loop until the ioctl return code is something else than -EAGAIN or -EINTR. Especially function drmIoctl() in libdrm does this which is why I noted this isssue in the first place. If user space uses zero timeout to just check if a syncpoint value has already passed the library continues looping until the syncpoint value actually passes. Of course, we could just modify the ioctl interface to cast this return code to something else but that does not seem correct. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [PATCH 3/6] gpu: host1x: Fix memory access in syncpt request
On 05/26/2013 01:15 PM, Thierry Reding wrote: * PGP Signed by an unknown key On Fri, May 17, 2013 at 02:49:45PM +0300, Arto Merilainen wrote: This patch fixes a bad memory access in syncpoint request code. If no syncpoints were available, the code accessed unreserved memory area causing unexpected behaviour. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/syncpt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 5bf5366..6b7ee88 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -40,7 +40,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, for (i = 0; i host-info-nb_pts sp-name; i++, sp++) ; - if (sp-dev) + if (i = host-info-nb_pts) return NULL; While changing this, can you please add a blank line between the loop and the new 'if (...)'? Yep. Will do. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 6/6] drm/tegra: Fix syncpoint increment return code
Add syncpoint increment to return a proper return code based on the return value from host1x. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/drm/drm.c |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 2b561c9..1dfd454 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -378,8 +378,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data, if (!sp) return -EINVAL; - host1x_syncpt_incr(sp); - return 0; + return host1x_syncpt_incr(sp); } static int tegra_syncpt_wait(struct drm_device *drm, void *data, -- 1.7.9.5
[PATCH 5/6] gpu: host1x: Rework CPU syncpoint increment
This patch merges host1x_syncpt_cpu_incr to host1x_syncpt_incr() as they are in practise doing the same thing. host1x_syncpt_incr() is also modified to return error codes. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/dev.h |8 drivers/gpu/host1x/hw/cdma_hw.c |2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 +--- drivers/gpu/host1x/syncpt.c | 15 ++- drivers/gpu/host1x/syncpt.h |7 ++- 5 files changed, 14 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index a1607d6..790ddf1 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -73,7 +73,7 @@ struct host1x_syncpt_ops { void (*restore_wait_base)(struct host1x_syncpt *syncpt); void (*load_wait_base)(struct host1x_syncpt *syncpt); u32 (*load)(struct host1x_syncpt *syncpt); - void (*cpu_incr)(struct host1x_syncpt *syncpt); + int (*cpu_incr)(struct host1x_syncpt *syncpt); int (*patch_wait)(struct host1x_syncpt *syncpt, void *patch_addr); }; @@ -157,10 +157,10 @@ static inline u32 host1x_hw_syncpt_load(struct host1x *host, return host->syncpt_op->load(sp); } -static inline void host1x_hw_syncpt_cpu_incr(struct host1x *host, -struct host1x_syncpt *sp) +static inline int host1x_hw_syncpt_cpu_incr(struct host1x *host, + struct host1x_syncpt *sp) { - host->syncpt_op->cpu_incr(sp); + return host->syncpt_op->cpu_incr(sp); } static inline int host1x_hw_syncpt_patch_wait(struct host1x *host, diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c index 590b69d..2ee4ad5 100644 --- a/drivers/gpu/host1x/hw/cdma_hw.c +++ b/drivers/gpu/host1x/hw/cdma_hw.c @@ -44,7 +44,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr, u32 i; for (i = 0; i < syncpt_incrs; i++) - host1x_syncpt_cpu_incr(cdma->timeout.syncpt); + host1x_syncpt_incr(cdma->timeout.syncpt); /* after CPU incr, ensure shadow is up to date */ host1x_syncpt_load(cdma->timeout.syncpt); diff --git a/drivers/gpu/host1x/hw/syncpt_hw.c b/drivers/gpu/host1x/hw/syncpt_hw.c index 6117499..0cf6095 100644 --- a/drivers/gpu/host1x/hw/syncpt_hw.c +++ b/drivers/gpu/host1x/hw/syncpt_hw.c @@ -77,21 +77,19 @@ static u32 syncpt_load(struct host1x_syncpt *sp) * Write a cpu syncpoint increment to the hardware, without touching * the cache. */ -static void syncpt_cpu_incr(struct host1x_syncpt *sp) +static int syncpt_cpu_incr(struct host1x_syncpt *sp) { struct host1x *host = sp->host; u32 reg_offset = sp->id / 32; if (!host1x_syncpt_client_managed(sp) && - host1x_syncpt_idle(sp)) { - dev_err(host->dev, "Trying to increment syncpoint id %d beyond max\n", - sp->id); - host1x_debug_dump(sp->host); - return; - } + host1x_syncpt_idle(sp)) + return -EINVAL; host1x_sync_writel(host, BIT_MASK(sp->id), HOST1X_SYNC_SYNCPT_CPU_INCR(reg_offset)); wmb(); + + return 0; } /* remove a wait pointed to by patch_addr */ diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index e560d67..afb631a 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -128,22 +128,11 @@ u32 host1x_syncpt_load_wait_base(struct host1x_syncpt *sp) } /* - * Write a cpu syncpoint increment to the hardware, without touching - * the cache. Caller is responsible for host being powered. - */ -void host1x_syncpt_cpu_incr(struct host1x_syncpt *sp) -{ - host1x_hw_syncpt_cpu_incr(sp->host, sp); -} - -/* * Increment syncpoint value from cpu, updating cache */ -void host1x_syncpt_incr(struct host1x_syncpt *sp) +int host1x_syncpt_incr(struct host1x_syncpt *sp) { - if (host1x_syncpt_client_managed(sp)) - host1x_syncpt_incr_max(sp, 1); - host1x_syncpt_cpu_incr(sp); + return host1x_hw_syncpt_cpu_incr(sp->host, sp); } /* diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index d00e758..267c0b9 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -115,9 +115,6 @@ static inline bool host1x_syncpt_idle(struct host1x_syncpt *sp) /* Return pointer to struct denoting sync point id. */ struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id); -/* Request incrementing a sync point. */ -void host1x_syncpt_cpu_incr(struct host1x_syncpt *sp); - /* Load current value from hardware to the shadow register. */ u32 host1x_syncpt_load(struct host1x_syncpt *sp); @@ -133,8 +130,8 @@ void host1x_syncpt_restore(struct host1x *host); /* Read current wait base value into shadow register and
[PATCH 4/6] gpu: host1x: Fix client_managed type
client_managed field in syncpoint structure was defined as an integer. The field holds, however, only a boolean value. This patch modifies the type to boolean. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/drm/gr2d.c |2 +- drivers/gpu/host1x/syncpt.c |8 drivers/gpu/host1x/syncpt.h |6 +++--- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 6a45ae0..6c3a27b 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -281,7 +281,7 @@ static int gr2d_probe(struct platform_device *pdev) if (!gr2d->channel) return -ENOMEM; - *syncpts = host1x_syncpt_request(dev, 0); + *syncpts = host1x_syncpt_request(dev, false); if (!(*syncpts)) { host1x_channel_free(gr2d->channel); return -ENOMEM; diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 6b7ee88..e560d67 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -32,7 +32,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, struct device *dev, - int client_managed) + bool client_managed) { int i; struct host1x_syncpt *sp = host->syncpt; @@ -331,7 +331,7 @@ int host1x_syncpt_init(struct host1x *host) host1x_syncpt_restore(host); /* Allocate sync point to use for clearing waits for expired fences */ - host->nop_sp = _host1x_syncpt_alloc(host, NULL, 0); + host->nop_sp = _host1x_syncpt_alloc(host, NULL, false); if (!host->nop_sp) return -ENOMEM; @@ -339,7 +339,7 @@ int host1x_syncpt_init(struct host1x *host) } struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - int client_managed) + bool client_managed) { struct host1x *host = dev_get_drvdata(dev->parent); return _host1x_syncpt_alloc(host, dev, client_managed); @@ -353,7 +353,7 @@ void host1x_syncpt_free(struct host1x_syncpt *sp) kfree(sp->name); sp->dev = NULL; sp->name = NULL; - sp->client_managed = 0; + sp->client_managed = false; } void host1x_syncpt_deinit(struct host1x *host) diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index c998061..d00e758 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -36,7 +36,7 @@ struct host1x_syncpt { atomic_t max_val; u32 base_val; const char *name; - int client_managed; + bool client_managed; struct host1x *host; struct device *dev; @@ -94,7 +94,7 @@ static inline bool host1x_syncpt_check_max(struct host1x_syncpt *sp, u32 real) } /* Return true if sync point is client managed. */ -static inline int host1x_syncpt_client_managed(struct host1x_syncpt *sp) +static inline bool host1x_syncpt_client_managed(struct host1x_syncpt *sp) { return sp->client_managed; } @@ -157,7 +157,7 @@ u32 host1x_syncpt_id(struct host1x_syncpt *sp); /* Allocate a sync point for a device. */ struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - int client_managed); + bool client_managed); /* Free a sync point. */ void host1x_syncpt_free(struct host1x_syncpt *sp); -- 1.7.9.5
[PATCH 3/6] gpu: host1x: Fix memory access in syncpt request
This patch fixes a bad memory access in syncpoint request code. If no syncpoints were available, the code accessed unreserved memory area causing unexpected behaviour. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/syncpt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 5bf5366..6b7ee88 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -40,7 +40,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, for (i = 0; i < host->info->nb_pts && sp->name; i++, sp++) ; - if (sp->dev) + if (i >= host->info->nb_pts) return NULL; name = kasprintf(GFP_KERNEL, "%02d-%s", sp->id, -- 1.7.9.5
[PATCH 2/6] gpu: host1x: Fix syncpoint wait return value
Syncpoint wait returned EAGAIN if it was called with zero timeout. This patch modifies the function to return ETIMEDOUT. Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/syncpt.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 4b49345..5bf5366 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -187,7 +187,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout, } if (!timeout) { - err = -EAGAIN; + err = -ETIMEDOUT; goto done; } @@ -205,7 +205,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout, if (err) goto done; - err = -EAGAIN; + err = -ETIMEDOUT; /* Caller-specified timeout may be impractically low */ if (timeout < 0) timeout = LONG_MAX; -- 1.7.9.5
[PATCH 1/6] gpu: host1x: Fixes to host1x firewall
From: Terje Bergstrom <tbergst...@nvidia.com> This patch adds several fixes to host1x firewall: - Host1x firewall does not survive if it expects a reloc, but user space didn't pass any relocs. Also it reset the reloc table for each gather, whereas they should be reset only per submit. Also class does not need to be reset for each class - once per submit is enough. - For INCR opcode the check was not working properly at all. - The firewall verified gather buffers before copying them. This allowed a malicious application to rewrite the buffer content by timing the rewrite carefully. This patch makes the buffer validation occur after copying the buffers. Signed-off-by: Terje Bergstrom Signed-off-by: Arto Merilainen --- drivers/gpu/host1x/job.c | 120 -- 1 file changed, 53 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index f665d67..4f3c004 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -228,17 +228,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) void *cmdbuf_page_addr = NULL; /* pin & patch the relocs for one gather */ - while (i < job->num_relocs) { + for (i = 0; i < job->num_relocs; ++i) { struct host1x_reloc *reloc = >relocarray[i]; u32 reloc_addr = (job->reloc_addr_phys[i] + reloc->target_offset) >> reloc->shift; u32 *target; /* skip all other gathers */ - if (!(reloc->cmdbuf && cmdbuf == reloc->cmdbuf)) { - i++; + if (cmdbuf != reloc->cmdbuf) continue; - } if (last_page != reloc->cmdbuf_offset >> PAGE_SHIFT) { if (cmdbuf_page_addr) @@ -257,9 +255,6 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) target = cmdbuf_page_addr + (reloc->cmdbuf_offset & ~PAGE_MASK); *target = reloc_addr; - - /* mark this gather as handled */ - reloc->cmdbuf = 0; } if (cmdbuf_page_addr) @@ -268,15 +263,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) return 0; } -static int check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, - unsigned int offset) +static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, + unsigned int offset) { offset *= sizeof(u32); - if (reloc->cmdbuf != cmdbuf || reloc->cmdbuf_offset != offset) - return -EINVAL; + if (!reloc || reloc->cmdbuf != cmdbuf || reloc->cmdbuf_offset != offset) + return true; - return 0; + return false; } struct host1x_firewall { @@ -330,7 +325,7 @@ static int check_incr(struct host1x_firewall *fw) u32 count = fw->count; u32 reg = fw->reg; - while (fw) { + while (count) { if (fw->words == 0) return -EINVAL; @@ -376,69 +371,58 @@ static int check_nonincr(struct host1x_firewall *fw) return 0; } -static int validate(struct host1x_job *job, struct device *dev, - struct host1x_job_gather *g) +static int validate_gather(struct host1x_firewall *fw, + struct host1x_job_gather *g) { - u32 *cmdbuf_base; + u32 *cmdbuf_base = (u32 *)fw->job->gather_copy_mapped + (g->offset / 4); int err = 0; - struct host1x_firewall fw; - fw.job = job; - fw.dev = dev; - fw.reloc = job->relocarray; - fw.num_relocs = job->num_relocs; - fw.cmdbuf_id = g->bo; - - fw.offset = 0; - fw.class = 0; - - if (!job->is_addr_reg) + if (!fw->job->is_addr_reg) return 0; - cmdbuf_base = host1x_bo_mmap(g->bo); - if (!cmdbuf_base) - return -ENOMEM; + fw->words = g->words; + fw->cmdbuf_id = g->bo; + fw->offset = 0; - fw.words = g->words; - while (fw.words && !err) { - u32 word = cmdbuf_base[fw.offset]; + while (fw->words && !err) { + u32 word = cmdbuf_base[fw->offset]; u32 opcode = (word & 0xf000) >> 28; - fw.mask = 0; - fw.reg = 0; - fw.count = 0; - fw.words--; - fw.offset++; + fw->mask = 0; + fw->reg = 0; + fw->count = 0; + fw->words--; + fw->offset++; switch (opcode) { case 0: - fw.class = word >&g
[PATCH 0/6] Miscellaneous fixes to host1x
This patch series fixes two issues in the host1x driver: First, the command buffer validation routine had vulnerabilities that were not detected in earlier testing. Second, the return codes of some functions were misleading or completely missing. This caused the driver to give wrong return codes also to the userspace. The series is based on top of 3.10rc1. I have tested the patch series on cardhu by running host1x and gr2d test cases (available at [0]). I would appreciate any help in testing/reviewing these patches. [0] https://gitorious.org/linux-host1x/libdrm-host1x Arto Merilainen (5): gpu: host1x: Fix syncpoint wait return value gpu: host1x: Fix memory access in syncpt request gpu: host1x: Fix client_managed type gpu: host1x: Rework CPU syncpoint increment drm/tegra: Fix syncpoint increment return code Terje Bergstrom (1): gpu: host1x: Fixes to host1x firewall drivers/gpu/host1x/dev.h |8 +-- drivers/gpu/host1x/drm/drm.c |3 +- drivers/gpu/host1x/drm/gr2d.c |2 +- drivers/gpu/host1x/hw/cdma_hw.c |2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 ++-- drivers/gpu/host1x/job.c | 120 - drivers/gpu/host1x/syncpt.c | 29 +++-- drivers/gpu/host1x/syncpt.h | 13 ++-- 8 files changed, 79 insertions(+), 110 deletions(-) -- 1.7.9.5
[PATCH 0/6] Miscellaneous fixes to host1x
This patch series fixes two issues in the host1x driver: First, the command buffer validation routine had vulnerabilities that were not detected in earlier testing. Second, the return codes of some functions were misleading or completely missing. This caused the driver to give wrong return codes also to the userspace. The series is based on top of 3.10rc1. I have tested the patch series on cardhu by running host1x and gr2d test cases (available at [0]). I would appreciate any help in testing/reviewing these patches. [0] https://gitorious.org/linux-host1x/libdrm-host1x Arto Merilainen (5): gpu: host1x: Fix syncpoint wait return value gpu: host1x: Fix memory access in syncpt request gpu: host1x: Fix client_managed type gpu: host1x: Rework CPU syncpoint increment drm/tegra: Fix syncpoint increment return code Terje Bergstrom (1): gpu: host1x: Fixes to host1x firewall drivers/gpu/host1x/dev.h |8 +-- drivers/gpu/host1x/drm/drm.c |3 +- drivers/gpu/host1x/drm/gr2d.c |2 +- drivers/gpu/host1x/hw/cdma_hw.c |2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 ++-- drivers/gpu/host1x/job.c | 120 - drivers/gpu/host1x/syncpt.c | 29 +++-- drivers/gpu/host1x/syncpt.h | 13 ++-- 8 files changed, 79 insertions(+), 110 deletions(-) -- 1.7.9.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 1/6] gpu: host1x: Fixes to host1x firewall
From: Terje Bergstrom tbergst...@nvidia.com This patch adds several fixes to host1x firewall: - Host1x firewall does not survive if it expects a reloc, but user space didn't pass any relocs. Also it reset the reloc table for each gather, whereas they should be reset only per submit. Also class does not need to be reset for each class - once per submit is enough. - For INCR opcode the check was not working properly at all. - The firewall verified gather buffers before copying them. This allowed a malicious application to rewrite the buffer content by timing the rewrite carefully. This patch makes the buffer validation occur after copying the buffers. Signed-off-by: Terje Bergstrom tbergst...@nvidia.com Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/job.c | 120 -- 1 file changed, 53 insertions(+), 67 deletions(-) diff --git a/drivers/gpu/host1x/job.c b/drivers/gpu/host1x/job.c index f665d67..4f3c004 100644 --- a/drivers/gpu/host1x/job.c +++ b/drivers/gpu/host1x/job.c @@ -228,17 +228,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) void *cmdbuf_page_addr = NULL; /* pin patch the relocs for one gather */ - while (i job-num_relocs) { + for (i = 0; i job-num_relocs; ++i) { struct host1x_reloc *reloc = job-relocarray[i]; u32 reloc_addr = (job-reloc_addr_phys[i] + reloc-target_offset) reloc-shift; u32 *target; /* skip all other gathers */ - if (!(reloc-cmdbuf cmdbuf == reloc-cmdbuf)) { - i++; + if (cmdbuf != reloc-cmdbuf) continue; - } if (last_page != reloc-cmdbuf_offset PAGE_SHIFT) { if (cmdbuf_page_addr) @@ -257,9 +255,6 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) target = cmdbuf_page_addr + (reloc-cmdbuf_offset ~PAGE_MASK); *target = reloc_addr; - - /* mark this gather as handled */ - reloc-cmdbuf = 0; } if (cmdbuf_page_addr) @@ -268,15 +263,15 @@ static unsigned int do_relocs(struct host1x_job *job, struct host1x_bo *cmdbuf) return 0; } -static int check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, - unsigned int offset) +static bool check_reloc(struct host1x_reloc *reloc, struct host1x_bo *cmdbuf, + unsigned int offset) { offset *= sizeof(u32); - if (reloc-cmdbuf != cmdbuf || reloc-cmdbuf_offset != offset) - return -EINVAL; + if (!reloc || reloc-cmdbuf != cmdbuf || reloc-cmdbuf_offset != offset) + return true; - return 0; + return false; } struct host1x_firewall { @@ -330,7 +325,7 @@ static int check_incr(struct host1x_firewall *fw) u32 count = fw-count; u32 reg = fw-reg; - while (fw) { + while (count) { if (fw-words == 0) return -EINVAL; @@ -376,69 +371,58 @@ static int check_nonincr(struct host1x_firewall *fw) return 0; } -static int validate(struct host1x_job *job, struct device *dev, - struct host1x_job_gather *g) +static int validate_gather(struct host1x_firewall *fw, + struct host1x_job_gather *g) { - u32 *cmdbuf_base; + u32 *cmdbuf_base = (u32 *)fw-job-gather_copy_mapped + (g-offset / 4); int err = 0; - struct host1x_firewall fw; - fw.job = job; - fw.dev = dev; - fw.reloc = job-relocarray; - fw.num_relocs = job-num_relocs; - fw.cmdbuf_id = g-bo; - - fw.offset = 0; - fw.class = 0; - - if (!job-is_addr_reg) + if (!fw-job-is_addr_reg) return 0; - cmdbuf_base = host1x_bo_mmap(g-bo); - if (!cmdbuf_base) - return -ENOMEM; + fw-words = g-words; + fw-cmdbuf_id = g-bo; + fw-offset = 0; - fw.words = g-words; - while (fw.words !err) { - u32 word = cmdbuf_base[fw.offset]; + while (fw-words !err) { + u32 word = cmdbuf_base[fw-offset]; u32 opcode = (word 0xf000) 28; - fw.mask = 0; - fw.reg = 0; - fw.count = 0; - fw.words--; - fw.offset++; + fw-mask = 0; + fw-reg = 0; + fw-count = 0; + fw-words--; + fw-offset++; switch (opcode) { case 0: - fw.class = word 6 0x3ff; - fw.mask = word 0x3f; - fw.reg = word 16 0xfff; - err = check_mask(fw); + fw-class = word 6
[PATCH 2/6] gpu: host1x: Fix syncpoint wait return value
Syncpoint wait returned EAGAIN if it was called with zero timeout. This patch modifies the function to return ETIMEDOUT. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/syncpt.c |4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 4b49345..5bf5366 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -187,7 +187,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout, } if (!timeout) { - err = -EAGAIN; + err = -ETIMEDOUT; goto done; } @@ -205,7 +205,7 @@ int host1x_syncpt_wait(struct host1x_syncpt *sp, u32 thresh, long timeout, if (err) goto done; - err = -EAGAIN; + err = -ETIMEDOUT; /* Caller-specified timeout may be impractically low */ if (timeout 0) timeout = LONG_MAX; -- 1.7.9.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 3/6] gpu: host1x: Fix memory access in syncpt request
This patch fixes a bad memory access in syncpoint request code. If no syncpoints were available, the code accessed unreserved memory area causing unexpected behaviour. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/syncpt.c |2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 5bf5366..6b7ee88 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -40,7 +40,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, for (i = 0; i host-info-nb_pts sp-name; i++, sp++) ; - if (sp-dev) + if (i = host-info-nb_pts) return NULL; name = kasprintf(GFP_KERNEL, %02d-%s, sp-id, -- 1.7.9.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 6/6] drm/tegra: Fix syncpoint increment return code
Add syncpoint increment to return a proper return code based on the return value from host1x. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/drm.c |3 +-- 1 file changed, 1 insertion(+), 2 deletions(-) diff --git a/drivers/gpu/host1x/drm/drm.c b/drivers/gpu/host1x/drm/drm.c index 2b561c9..1dfd454 100644 --- a/drivers/gpu/host1x/drm/drm.c +++ b/drivers/gpu/host1x/drm/drm.c @@ -378,8 +378,7 @@ static int tegra_syncpt_incr(struct drm_device *drm, void *data, if (!sp) return -EINVAL; - host1x_syncpt_incr(sp); - return 0; + return host1x_syncpt_incr(sp); } static int tegra_syncpt_wait(struct drm_device *drm, void *data, -- 1.7.9.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 4/6] gpu: host1x: Fix client_managed type
client_managed field in syncpoint structure was defined as an integer. The field holds, however, only a boolean value. This patch modifies the type to boolean. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/drm/gr2d.c |2 +- drivers/gpu/host1x/syncpt.c |8 drivers/gpu/host1x/syncpt.h |6 +++--- 3 files changed, 8 insertions(+), 8 deletions(-) diff --git a/drivers/gpu/host1x/drm/gr2d.c b/drivers/gpu/host1x/drm/gr2d.c index 6a45ae0..6c3a27b 100644 --- a/drivers/gpu/host1x/drm/gr2d.c +++ b/drivers/gpu/host1x/drm/gr2d.c @@ -281,7 +281,7 @@ static int gr2d_probe(struct platform_device *pdev) if (!gr2d-channel) return -ENOMEM; - *syncpts = host1x_syncpt_request(dev, 0); + *syncpts = host1x_syncpt_request(dev, false); if (!(*syncpts)) { host1x_channel_free(gr2d-channel); return -ENOMEM; diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index 6b7ee88..e560d67 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -32,7 +32,7 @@ static struct host1x_syncpt *_host1x_syncpt_alloc(struct host1x *host, struct device *dev, - int client_managed) + bool client_managed) { int i; struct host1x_syncpt *sp = host-syncpt; @@ -331,7 +331,7 @@ int host1x_syncpt_init(struct host1x *host) host1x_syncpt_restore(host); /* Allocate sync point to use for clearing waits for expired fences */ - host-nop_sp = _host1x_syncpt_alloc(host, NULL, 0); + host-nop_sp = _host1x_syncpt_alloc(host, NULL, false); if (!host-nop_sp) return -ENOMEM; @@ -339,7 +339,7 @@ int host1x_syncpt_init(struct host1x *host) } struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - int client_managed) + bool client_managed) { struct host1x *host = dev_get_drvdata(dev-parent); return _host1x_syncpt_alloc(host, dev, client_managed); @@ -353,7 +353,7 @@ void host1x_syncpt_free(struct host1x_syncpt *sp) kfree(sp-name); sp-dev = NULL; sp-name = NULL; - sp-client_managed = 0; + sp-client_managed = false; } void host1x_syncpt_deinit(struct host1x *host) diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index c998061..d00e758 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -36,7 +36,7 @@ struct host1x_syncpt { atomic_t max_val; u32 base_val; const char *name; - int client_managed; + bool client_managed; struct host1x *host; struct device *dev; @@ -94,7 +94,7 @@ static inline bool host1x_syncpt_check_max(struct host1x_syncpt *sp, u32 real) } /* Return true if sync point is client managed. */ -static inline int host1x_syncpt_client_managed(struct host1x_syncpt *sp) +static inline bool host1x_syncpt_client_managed(struct host1x_syncpt *sp) { return sp-client_managed; } @@ -157,7 +157,7 @@ u32 host1x_syncpt_id(struct host1x_syncpt *sp); /* Allocate a sync point for a device. */ struct host1x_syncpt *host1x_syncpt_request(struct device *dev, - int client_managed); + bool client_managed); /* Free a sync point. */ void host1x_syncpt_free(struct host1x_syncpt *sp); -- 1.7.9.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[PATCH 5/6] gpu: host1x: Rework CPU syncpoint increment
This patch merges host1x_syncpt_cpu_incr to host1x_syncpt_incr() as they are in practise doing the same thing. host1x_syncpt_incr() is also modified to return error codes. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- drivers/gpu/host1x/dev.h |8 drivers/gpu/host1x/hw/cdma_hw.c |2 +- drivers/gpu/host1x/hw/syncpt_hw.c | 12 +--- drivers/gpu/host1x/syncpt.c | 15 ++- drivers/gpu/host1x/syncpt.h |7 ++- 5 files changed, 14 insertions(+), 30 deletions(-) diff --git a/drivers/gpu/host1x/dev.h b/drivers/gpu/host1x/dev.h index a1607d6..790ddf1 100644 --- a/drivers/gpu/host1x/dev.h +++ b/drivers/gpu/host1x/dev.h @@ -73,7 +73,7 @@ struct host1x_syncpt_ops { void (*restore_wait_base)(struct host1x_syncpt *syncpt); void (*load_wait_base)(struct host1x_syncpt *syncpt); u32 (*load)(struct host1x_syncpt *syncpt); - void (*cpu_incr)(struct host1x_syncpt *syncpt); + int (*cpu_incr)(struct host1x_syncpt *syncpt); int (*patch_wait)(struct host1x_syncpt *syncpt, void *patch_addr); }; @@ -157,10 +157,10 @@ static inline u32 host1x_hw_syncpt_load(struct host1x *host, return host-syncpt_op-load(sp); } -static inline void host1x_hw_syncpt_cpu_incr(struct host1x *host, -struct host1x_syncpt *sp) +static inline int host1x_hw_syncpt_cpu_incr(struct host1x *host, + struct host1x_syncpt *sp) { - host-syncpt_op-cpu_incr(sp); + return host-syncpt_op-cpu_incr(sp); } static inline int host1x_hw_syncpt_patch_wait(struct host1x *host, diff --git a/drivers/gpu/host1x/hw/cdma_hw.c b/drivers/gpu/host1x/hw/cdma_hw.c index 590b69d..2ee4ad5 100644 --- a/drivers/gpu/host1x/hw/cdma_hw.c +++ b/drivers/gpu/host1x/hw/cdma_hw.c @@ -44,7 +44,7 @@ static void cdma_timeout_cpu_incr(struct host1x_cdma *cdma, u32 getptr, u32 i; for (i = 0; i syncpt_incrs; i++) - host1x_syncpt_cpu_incr(cdma-timeout.syncpt); + host1x_syncpt_incr(cdma-timeout.syncpt); /* after CPU incr, ensure shadow is up to date */ host1x_syncpt_load(cdma-timeout.syncpt); diff --git a/drivers/gpu/host1x/hw/syncpt_hw.c b/drivers/gpu/host1x/hw/syncpt_hw.c index 6117499..0cf6095 100644 --- a/drivers/gpu/host1x/hw/syncpt_hw.c +++ b/drivers/gpu/host1x/hw/syncpt_hw.c @@ -77,21 +77,19 @@ static u32 syncpt_load(struct host1x_syncpt *sp) * Write a cpu syncpoint increment to the hardware, without touching * the cache. */ -static void syncpt_cpu_incr(struct host1x_syncpt *sp) +static int syncpt_cpu_incr(struct host1x_syncpt *sp) { struct host1x *host = sp-host; u32 reg_offset = sp-id / 32; if (!host1x_syncpt_client_managed(sp) - host1x_syncpt_idle(sp)) { - dev_err(host-dev, Trying to increment syncpoint id %d beyond max\n, - sp-id); - host1x_debug_dump(sp-host); - return; - } + host1x_syncpt_idle(sp)) + return -EINVAL; host1x_sync_writel(host, BIT_MASK(sp-id), HOST1X_SYNC_SYNCPT_CPU_INCR(reg_offset)); wmb(); + + return 0; } /* remove a wait pointed to by patch_addr */ diff --git a/drivers/gpu/host1x/syncpt.c b/drivers/gpu/host1x/syncpt.c index e560d67..afb631a 100644 --- a/drivers/gpu/host1x/syncpt.c +++ b/drivers/gpu/host1x/syncpt.c @@ -128,22 +128,11 @@ u32 host1x_syncpt_load_wait_base(struct host1x_syncpt *sp) } /* - * Write a cpu syncpoint increment to the hardware, without touching - * the cache. Caller is responsible for host being powered. - */ -void host1x_syncpt_cpu_incr(struct host1x_syncpt *sp) -{ - host1x_hw_syncpt_cpu_incr(sp-host, sp); -} - -/* * Increment syncpoint value from cpu, updating cache */ -void host1x_syncpt_incr(struct host1x_syncpt *sp) +int host1x_syncpt_incr(struct host1x_syncpt *sp) { - if (host1x_syncpt_client_managed(sp)) - host1x_syncpt_incr_max(sp, 1); - host1x_syncpt_cpu_incr(sp); + return host1x_hw_syncpt_cpu_incr(sp-host, sp); } /* diff --git a/drivers/gpu/host1x/syncpt.h b/drivers/gpu/host1x/syncpt.h index d00e758..267c0b9 100644 --- a/drivers/gpu/host1x/syncpt.h +++ b/drivers/gpu/host1x/syncpt.h @@ -115,9 +115,6 @@ static inline bool host1x_syncpt_idle(struct host1x_syncpt *sp) /* Return pointer to struct denoting sync point id. */ struct host1x_syncpt *host1x_syncpt_get(struct host1x *host, u32 id); -/* Request incrementing a sync point. */ -void host1x_syncpt_cpu_incr(struct host1x_syncpt *sp); - /* Load current value from hardware to the shadow register. */ u32 host1x_syncpt_load(struct host1x_syncpt *sp); @@ -133,8 +130,8 @@ void host1x_syncpt_restore(struct host1x *host); /* Read current wait base value into shadow register and return it. */ u32 host1x_syncpt_load_wait_base(struct host1x_syncpt
[RFCv2,libdrm 2/2] tests: tegra: Add stream library test
This patch adds a minimal test set for the stream library and host1x kernel interface. The test verifies that the driver (or library) is able to: - Increment, read and wait for syncpoint values - Use a host1x channel to do host1x operations - Handle submit timeout correctly - Do relocations to buffer - Allocate and release memory - Use stream pools correctly Signed-off-by: Arto Merilainen --- configure.ac |1 + tests/tegra/host1x/Makefile.am | 12 + tests/tegra/host1x/tegra_host1x_test.c | 893 3 files changed, 906 insertions(+) create mode 100644 tests/tegra/host1x/Makefile.am create mode 100644 tests/tegra/host1x/tegra_host1x_test.c diff --git a/configure.ac b/configure.ac index e55e9c1..a678bcd 100644 --- a/configure.ac +++ b/configure.ac @@ -399,6 +399,7 @@ AC_CONFIG_FILES([ tests/radeon/Makefile tests/vbltest/Makefile tests/exynos/Makefile + tests/tegra/host1x/Makefile include/Makefile include/drm/Makefile man/Makefile diff --git a/tests/tegra/host1x/Makefile.am b/tests/tegra/host1x/Makefile.am new file mode 100644 index 000..700f764 --- /dev/null +++ b/tests/tegra/host1x/Makefile.am @@ -0,0 +1,12 @@ +AM_CFLAGS = \ + -I $(top_srcdir)/include/drm \ + -I $(top_srcdir) + +LDADD = \ + $(top_builddir)/libdrm.la + +noinst_PROGRAMS = \ + tegra_host1x_test + +tegra_host1x_test_SOURCES = \ + tegra_host1x_test.c diff --git a/tests/tegra/host1x/tegra_host1x_test.c b/tests/tegra/host1x/tegra_host1x_test.c new file mode 100644 index 000..548f422 --- /dev/null +++ b/tests/tegra/host1x/tegra_host1x_test.c @@ -0,0 +1,893 @@ +/* + * Copyright (C) 2012-2013 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + * Arto Merilainen + */ + +#include +#include +#include +#include + +/* Include the code file to access the internals of the library */ +#include "tegra/tegra_drm.c" + +#define HOST1X_OPCODE_NOP host1x_opcode_nonincr(0, 0) + +/* + * test_oversized_submit(channel) - Do a submit that does not fit into + * preallocated stream buffer + */ + +int test_oversized_submit(struct tegra_channel *channel) +{ + struct tegra_stream *stream = NULL; + struct tegra_fence fence; + unsigned int diff_ms; + int i; + + /* Create a really small buffer */ + if (!(stream = tegra_stream_create(channel, 4, 0, 0))) + return -1; + + if (tegra_stream_begin(stream, 100, NULL, 0, 0, + HOST1X_CLASS_HOST1X)) + goto destroy; + for (i = 0; i < 100; ++i) { + if (tegra_stream_push(stream, HOST1X_OPCODE_NOP)) + goto destroy; + } + if (tegra_stream_end(stream)) + goto destroy; + if (tegra_stream_flush(stream, )) + goto destroy; + if (!tegra_fence_is_valid()) + goto destroy; + if (tegra_fence_waitex(channel, , 15000, NULL)) + goto destroy; + + tegra_stream_destroy(stream); + return 0; + +destroy: + tegra_stream_destroy(stream); + return -1; + +} + +/* + * test_huge_submit(channel) - Do single huge submit and wait for completion + */ + +int test_huge_submit(struct tegra_channel *channel) +{ + struct tegra_stream *stream = NULL; + struct tegra_fence fence; + const unsigned int submit_count = 1000; + struct timespec tp_begin, tp_end; + unsigned int diff_ms; + int i; + + clock_gettime(CLOCK_MONOTONIC, _begin); + + if (!(stream = tegra_stream_create(channel, 0, 0, 0))) + return -1; + + /* Create many small submits */ + for (i = 0; i < submit_count; ++i) { + if (tegr
[RFCv2,libdrm 1/2] tegra: Add stream library
This patch introduces tegra stream library. The library is used for buffer management, command stream construction and work synchronization. Signed-off-by: Arto Merilainen --- Makefile.am|6 +- configure.ac | 13 + tegra/Makefile.am | 25 ++ tegra/class_ids.h | 36 ++ tegra/host1x01_hardware.h | 125 ++ tegra/hw_host1x01_uclass.h | 155 +++ tegra/libdrm_tegra.pc.in | 10 + tegra/tegra_drm.c | 998 tegra/tegra_drm.h | 136 ++ tegra/tegra_drmif.h| 110 + 10 files changed, 1613 insertions(+), 1 deletion(-) create mode 100644 tegra/Makefile.am create mode 100644 tegra/class_ids.h create mode 100644 tegra/host1x01_hardware.h create mode 100644 tegra/hw_host1x01_uclass.h create mode 100644 tegra/libdrm_tegra.pc.in create mode 100644 tegra/tegra_drm.c create mode 100644 tegra/tegra_drm.h create mode 100644 tegra/tegra_drmif.h diff --git a/Makefile.am b/Makefile.am index f726036..bd942e7 100644 --- a/Makefile.am +++ b/Makefile.am @@ -51,7 +51,11 @@ if HAVE_FREEDRENO FREEDRENO_SUBDIR = freedreno endif -SUBDIRS = . $(LIBKMS_SUBDIR) $(INTEL_SUBDIR) $(NOUVEAU_SUBDIR) $(RADEON_SUBDIR) $(OMAP_SUBDIR) $(EXYNOS_SUBDIR) $(FREEDRENO_SUBDIR) tests include man +if HAVE_TEGRA +TEGRA_SUBDIR = tegra +endif + +SUBDIRS = . $(LIBKMS_SUBDIR) $(INTEL_SUBDIR) $(NOUVEAU_SUBDIR) $(RADEON_SUBDIR) $(OMAP_SUBDIR) $(EXYNOS_SUBDIR) $(FREEDRENO_SUBDIR) $(TEGRA_SUBDIR) tests include man libdrm_la_LTLIBRARIES = libdrm.la libdrm_ladir = $(libdir) diff --git a/configure.ac b/configure.ac index 2786c87..e55e9c1 100644 --- a/configure.ac +++ b/configure.ac @@ -103,6 +103,11 @@ AC_ARG_ENABLE(install-test-programs, [Install test programs (default: no)]), [INSTALL_TESTS=$enableval], [INSTALL_TESTS=no]) +AC_ARG_ENABLE(tegra, + AS_HELP_STRING([--enable-tegra], + [Enable support for tegra's API (default: disabled)]), + [TEGRA=$enableval], [TEGRA=no]) + dnl === dnl check compiler flags AC_DEFUN([LIBDRM_CC_TRY_FLAG], [ @@ -216,6 +221,11 @@ if test "x$FREEDRENO" = xyes; then AC_DEFINE(HAVE_FREEDRENO, 1, [Have freedreno support]) fi +AM_CONDITIONAL(HAVE_TEGRA, [test "x$TEGRA" = xyes]) +if test "x$TEGRA" = xyes; then + AC_DEFINE(HAVE_TEGRA, 1, [Have TEGRA support]) +fi + AM_CONDITIONAL(HAVE_INSTALL_TESTS, [test "x$INSTALL_TESTS" = xyes]) if test "x$INSTALL_TESTS" = xyes; then AC_DEFINE(HAVE_INSTALL_TESTS, 1, [Install test programs]) @@ -380,6 +390,8 @@ AC_CONFIG_FILES([ exynos/libdrm_exynos.pc freedreno/Makefile freedreno/libdrm_freedreno.pc + tegra/Makefile + tegra/libdrm_tegra.pc tests/Makefile tests/modeprint/Makefile tests/modetest/Makefile @@ -404,4 +416,5 @@ echo " Nouveau API$NOUVEAU" echo " OMAP API $OMAP" echo " EXYNOS API $EXYNOS" echo " Freedreno API $FREEDRENO" +echo " TEGRA API $TEGRA" echo "" diff --git a/tegra/Makefile.am b/tegra/Makefile.am new file mode 100644 index 000..72675e5 --- /dev/null +++ b/tegra/Makefile.am @@ -0,0 +1,25 @@ +AM_CFLAGS = \ + $(WARN_CFLAGS) \ + -I$(top_srcdir) \ + -I$(top_srcdir)/tegra \ + $(PTHREADSTUBS_CFLAGS) \ + -I$(top_srcdir)/include/drm + +libdrm_tegra_la_LTLIBRARIES = libdrm_tegra.la +libdrm_tegra_ladir = $(libdir) +libdrm_tegra_la_LDFLAGS = -version-number 1:0:0 -no-undefined +libdrm_tegra_la_LIBADD = ../libdrm.la @PTHREADSTUBS_LIBS@ + +libdrm_tegra_la_SOURCES = \ + tegra_drm.c + +libdrm_tegracommonincludedir = ${includedir}/tegra +libdrm_tegracommoninclude_HEADERS = \ + tegra_drm.h + +libdrm_tegraincludedir = ${includedir}/libdrm +libdrm_tegrainclude_HEADERS = \ + tegra_drmif.h + +pkgconfigdir = @pkgconfigdir@ +pkgconfig_DATA = libdrm_tegra.pc diff --git a/tegra/class_ids.h b/tegra/class_ids.h new file mode 100644 index 000..b512306 --- /dev/null +++ b/tegra/class_ids.h @@ -0,0 +1,36 @@ +/* + * Copyright (C) 2012-2013 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the "Software"), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED "
[RFCv2,libdrm 0/2] NVIDIA Tegra support
This patch series adds application level support for host1x hardware on Tegra SoCs. This set of patches can be used in combination with host1x kernel patches. The most recent version of the kernel patch series is available at [0]. An example of using 2d hardware acceleration with this library is available at [1]. Changes in version 2: - Instead of using assertations, the library now returns error codes - Added a minimal set of tests to test common use cases - The size of a stream buffer pool can be set runtime - The library keeps track of syncpoint increments - Added reference counting to buffer management - Removed 2d related patches from the series - Rebased to latest libdrm - Updated ioctl interface - Fixed stylish issues [0]: http://gitorious.org/linux-host1x/linux-host1x [1]: http://gitorious.org/linux-host1x/libdrm-host1x/commits/2d Arto Merilainen (2): tegra: Add stream library tests: tegra: Add stream library test Makefile.am|6 +- configure.ac | 14 + tegra/Makefile.am | 25 + tegra/class_ids.h | 36 ++ tegra/host1x01_hardware.h | 125 tegra/hw_host1x01_uclass.h | 155 + tegra/libdrm_tegra.pc.in | 10 + tegra/tegra_drm.c | 998 tegra/tegra_drm.h | 136 + tegra/tegra_drmif.h| 110 tests/tegra/host1x/Makefile.am | 12 + tests/tegra/host1x/tegra_host1x_test.c | 893 12 files changed, 2519 insertions(+), 1 deletion(-) create mode 100644 tegra/Makefile.am create mode 100644 tegra/class_ids.h create mode 100644 tegra/host1x01_hardware.h create mode 100644 tegra/hw_host1x01_uclass.h create mode 100644 tegra/libdrm_tegra.pc.in create mode 100644 tegra/tegra_drm.c create mode 100644 tegra/tegra_drm.h create mode 100644 tegra/tegra_drmif.h create mode 100644 tests/tegra/host1x/Makefile.am create mode 100644 tests/tegra/host1x/tegra_host1x_test.c -- 1.7.9.5
[RFCv2,libdrm 0/2] NVIDIA Tegra support
This patch series adds application level support for host1x hardware on Tegra SoCs. This set of patches can be used in combination with host1x kernel patches. The most recent version of the kernel patch series is available at [0]. An example of using 2d hardware acceleration with this library is available at [1]. Changes in version 2: - Instead of using assertations, the library now returns error codes - Added a minimal set of tests to test common use cases - The size of a stream buffer pool can be set runtime - The library keeps track of syncpoint increments - Added reference counting to buffer management - Removed 2d related patches from the series - Rebased to latest libdrm - Updated ioctl interface - Fixed stylish issues [0]: http://gitorious.org/linux-host1x/linux-host1x [1]: http://gitorious.org/linux-host1x/libdrm-host1x/commits/2d Arto Merilainen (2): tegra: Add stream library tests: tegra: Add stream library test Makefile.am|6 +- configure.ac | 14 + tegra/Makefile.am | 25 + tegra/class_ids.h | 36 ++ tegra/host1x01_hardware.h | 125 tegra/hw_host1x01_uclass.h | 155 + tegra/libdrm_tegra.pc.in | 10 + tegra/tegra_drm.c | 998 tegra/tegra_drm.h | 136 + tegra/tegra_drmif.h| 110 tests/tegra/host1x/Makefile.am | 12 + tests/tegra/host1x/tegra_host1x_test.c | 893 12 files changed, 2519 insertions(+), 1 deletion(-) create mode 100644 tegra/Makefile.am create mode 100644 tegra/class_ids.h create mode 100644 tegra/host1x01_hardware.h create mode 100644 tegra/hw_host1x01_uclass.h create mode 100644 tegra/libdrm_tegra.pc.in create mode 100644 tegra/tegra_drm.c create mode 100644 tegra/tegra_drm.h create mode 100644 tegra/tegra_drmif.h create mode 100644 tests/tegra/host1x/Makefile.am create mode 100644 tests/tegra/host1x/tegra_host1x_test.c -- 1.7.9.5 ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[RFCv2,libdrm 1/2] tegra: Add stream library
This patch introduces tegra stream library. The library is used for buffer management, command stream construction and work synchronization. Signed-off-by: Arto Merilainen amerilai...@nvidia.com --- Makefile.am|6 +- configure.ac | 13 + tegra/Makefile.am | 25 ++ tegra/class_ids.h | 36 ++ tegra/host1x01_hardware.h | 125 ++ tegra/hw_host1x01_uclass.h | 155 +++ tegra/libdrm_tegra.pc.in | 10 + tegra/tegra_drm.c | 998 tegra/tegra_drm.h | 136 ++ tegra/tegra_drmif.h| 110 + 10 files changed, 1613 insertions(+), 1 deletion(-) create mode 100644 tegra/Makefile.am create mode 100644 tegra/class_ids.h create mode 100644 tegra/host1x01_hardware.h create mode 100644 tegra/hw_host1x01_uclass.h create mode 100644 tegra/libdrm_tegra.pc.in create mode 100644 tegra/tegra_drm.c create mode 100644 tegra/tegra_drm.h create mode 100644 tegra/tegra_drmif.h diff --git a/Makefile.am b/Makefile.am index f726036..bd942e7 100644 --- a/Makefile.am +++ b/Makefile.am @@ -51,7 +51,11 @@ if HAVE_FREEDRENO FREEDRENO_SUBDIR = freedreno endif -SUBDIRS = . $(LIBKMS_SUBDIR) $(INTEL_SUBDIR) $(NOUVEAU_SUBDIR) $(RADEON_SUBDIR) $(OMAP_SUBDIR) $(EXYNOS_SUBDIR) $(FREEDRENO_SUBDIR) tests include man +if HAVE_TEGRA +TEGRA_SUBDIR = tegra +endif + +SUBDIRS = . $(LIBKMS_SUBDIR) $(INTEL_SUBDIR) $(NOUVEAU_SUBDIR) $(RADEON_SUBDIR) $(OMAP_SUBDIR) $(EXYNOS_SUBDIR) $(FREEDRENO_SUBDIR) $(TEGRA_SUBDIR) tests include man libdrm_la_LTLIBRARIES = libdrm.la libdrm_ladir = $(libdir) diff --git a/configure.ac b/configure.ac index 2786c87..e55e9c1 100644 --- a/configure.ac +++ b/configure.ac @@ -103,6 +103,11 @@ AC_ARG_ENABLE(install-test-programs, [Install test programs (default: no)]), [INSTALL_TESTS=$enableval], [INSTALL_TESTS=no]) +AC_ARG_ENABLE(tegra, + AS_HELP_STRING([--enable-tegra], + [Enable support for tegra's API (default: disabled)]), + [TEGRA=$enableval], [TEGRA=no]) + dnl === dnl check compiler flags AC_DEFUN([LIBDRM_CC_TRY_FLAG], [ @@ -216,6 +221,11 @@ if test x$FREEDRENO = xyes; then AC_DEFINE(HAVE_FREEDRENO, 1, [Have freedreno support]) fi +AM_CONDITIONAL(HAVE_TEGRA, [test x$TEGRA = xyes]) +if test x$TEGRA = xyes; then + AC_DEFINE(HAVE_TEGRA, 1, [Have TEGRA support]) +fi + AM_CONDITIONAL(HAVE_INSTALL_TESTS, [test x$INSTALL_TESTS = xyes]) if test x$INSTALL_TESTS = xyes; then AC_DEFINE(HAVE_INSTALL_TESTS, 1, [Install test programs]) @@ -380,6 +390,8 @@ AC_CONFIG_FILES([ exynos/libdrm_exynos.pc freedreno/Makefile freedreno/libdrm_freedreno.pc + tegra/Makefile + tegra/libdrm_tegra.pc tests/Makefile tests/modeprint/Makefile tests/modetest/Makefile @@ -404,4 +416,5 @@ echo Nouveau API$NOUVEAU echo OMAP API $OMAP echo EXYNOS API $EXYNOS echo Freedreno API $FREEDRENO +echo TEGRA API $TEGRA echo diff --git a/tegra/Makefile.am b/tegra/Makefile.am new file mode 100644 index 000..72675e5 --- /dev/null +++ b/tegra/Makefile.am @@ -0,0 +1,25 @@ +AM_CFLAGS = \ + $(WARN_CFLAGS) \ + -I$(top_srcdir) \ + -I$(top_srcdir)/tegra \ + $(PTHREADSTUBS_CFLAGS) \ + -I$(top_srcdir)/include/drm + +libdrm_tegra_la_LTLIBRARIES = libdrm_tegra.la +libdrm_tegra_ladir = $(libdir) +libdrm_tegra_la_LDFLAGS = -version-number 1:0:0 -no-undefined +libdrm_tegra_la_LIBADD = ../libdrm.la @PTHREADSTUBS_LIBS@ + +libdrm_tegra_la_SOURCES = \ + tegra_drm.c + +libdrm_tegracommonincludedir = ${includedir}/tegra +libdrm_tegracommoninclude_HEADERS = \ + tegra_drm.h + +libdrm_tegraincludedir = ${includedir}/libdrm +libdrm_tegrainclude_HEADERS = \ + tegra_drmif.h + +pkgconfigdir = @pkgconfigdir@ +pkgconfig_DATA = libdrm_tegra.pc diff --git a/tegra/class_ids.h b/tegra/class_ids.h new file mode 100644 index 000..b512306 --- /dev/null +++ b/tegra/class_ids.h @@ -0,0 +1,36 @@ +/* + * Copyright (C) 2012-2013 NVIDIA Corporation. + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY
[RFC,libdrm 1/3] tegra: Add stream library
On 12/28/2012 11:04 AM, Mark Zhang wrote: > On 12/28/2012 04:50 PM, Arto Merilainen wrote: >> >> In my opinion asking tegra_stream_begin() to put a bad fence into the >> stream is a case we should never be. assert() kills the application >> immediately (in debug builds) and usually this helps the programmer for >> 1) finding bugs 2) not doing bad code. >> > > Yep, I agree. But in release builds, assert does nothing. So this > checking doesn't make sense and also a wrong fence will be pushed into > command buffer silently. And we always use release version in real > products, so we can't count on this "assert". The only pro of using assert is low (=no :-) ) overhead in release builds. > >> "Silencing" is not a good solution especially in this case: >> tegra_stream_flush() returns an invalid fence when flushing fails. If >> the application chains submits (i.e. do a blit and then do another using >> the output of the first blit) it is crucial to be sure the first submit >> has been performed before starting the second one. >> > > Yes. So I suggest doing fence checking at the beginning of the > "tegra_stream_begin", if invalid fence found, return an error. > This sounds reasonable. - Arto
[RFC,libdrm 1/3] tegra: Add stream library
On 12/28/2012 09:57 AM, Mark Zhang wrote: > On 12/28/2012 03:45 PM, Arto Merilainen wrote: >> On 12/28/2012 08:47 AM, Mark Zhang wrote: >>>> + >>>> +/* Add fences */ >>>> +if (num_fences) { >>>> + >>>> +tegra_stream_push(stream, >>>> +nvhost_opcode_setclass(NV_HOST1X_CLASS_ID, >>>> +host1x_uclass_wait_syncpt_r(), num_fences)); >>>> + >>>> +for (; num_fences; num_fences--, fence++) { >>>> +assert(tegra_fence_is_valid(fence)); >>> >>> This is useless. We already add "1 + num_fences" to num_words above. So >>> move this "assert" before adding "1 + num_fences" to num_words makes >>> sense. >>> >> >> The assertion checks the validity of a single fence - not if there is >> room in the command buffer. >> >> The goal is to prevent having invalid fences in the command stream. If >> this check were not here it would be possible to initialise a fence with >> tegra_fence_clear() and put that fence into the stream. > > My idea is, if one fence is invalid, then we should not count this in > "num_words". In current code, if one fence is invalid, then this fence > will not be pushed into the command stream, and the "num_words" shows a > wrong command buffer size. > > So I think we should: > - validate the fences, remove the invalid fence > - update num_words > - then you don't need to check fence here - I mean, before push a host1x > syncpt wait command into the active buffer of stream. > In my opinion asking tegra_stream_begin() to put a bad fence into the stream is a case we should never be. assert() kills the application immediately (in debug builds) and usually this helps the programmer for 1) finding bugs 2) not doing bad code. "Silencing" is not a good solution especially in this case: tegra_stream_flush() returns an invalid fence when flushing fails. If the application chains submits (i.e. do a blit and then do another using the output of the first blit) it is crucial to be sure the first submit has been performed before starting the second one. - Arto
[RFC,libdrm 1/3] tegra: Add stream library
On 12/28/2012 08:47 AM, Mark Zhang wrote: >> +int tegra_fence_is_valid(const struct tegra_fence *fence) >> +{ >> +int valid = fence ? 1 : 0; >> +valid = valid && fence->id != (uint32_t) -1; >> +valid = valid && fence->id < 32; > > Hardcode here. Assume always has 32 syncpts. > Change to a micro wrapped with tegra version ifdef will be better. > You are correct. I will fix this. >> +return valid; >> +} >> + > [...] >> + >> +/* Add fences */ >> +if (num_fences) { >> + >> +tegra_stream_push(stream, >> +nvhost_opcode_setclass(NV_HOST1X_CLASS_ID, >> +host1x_uclass_wait_syncpt_r(), num_fences)); >> + >> +for (; num_fences; num_fences--, fence++) { >> +assert(tegra_fence_is_valid(fence)); > > This is useless. We already add "1 + num_fences" to num_words above. So > move this "assert" before adding "1 + num_fences" to num_words makes sense. > The assertion checks the validity of a single fence - not if there is room in the command buffer. The goal is to prevent having invalid fences in the command stream. If this check were not here it would be possible to initialise a fence with tegra_fence_clear() and put that fence into the stream. >> + >> +tegra_stream_push(stream, >> nvhost_class_host_wait_syncpt(fence->id, >> +fence->value)); >> +} >> +} >> + >> +if (class_id) >> +tegra_stream_push(stream, nvhost_opcode_setclass(class_id, 0, 0)); >> + >> +return 0; >> +} >> + > [...] >> + >> +#endif /* TEGRA_DRMIF_H */ >> - Arto
Re: [RFC,libdrm 1/3] tegra: Add stream library
On 12/28/2012 09:57 AM, Mark Zhang wrote: On 12/28/2012 03:45 PM, Arto Merilainen wrote: On 12/28/2012 08:47 AM, Mark Zhang wrote: + +/* Add fences */ +if (num_fences) { + +tegra_stream_push(stream, +nvhost_opcode_setclass(NV_HOST1X_CLASS_ID, +host1x_uclass_wait_syncpt_r(), num_fences)); + +for (; num_fences; num_fences--, fence++) { +assert(tegra_fence_is_valid(fence)); This is useless. We already add 1 + num_fences to num_words above. So move this assert before adding 1 + num_fences to num_words makes sense. The assertion checks the validity of a single fence - not if there is room in the command buffer. The goal is to prevent having invalid fences in the command stream. If this check were not here it would be possible to initialise a fence with tegra_fence_clear() and put that fence into the stream. My idea is, if one fence is invalid, then we should not count this in num_words. In current code, if one fence is invalid, then this fence will not be pushed into the command stream, and the num_words shows a wrong command buffer size. So I think we should: - validate the fences, remove the invalid fence - update num_words - then you don't need to check fence here - I mean, before push a host1x syncpt wait command into the active buffer of stream. In my opinion asking tegra_stream_begin() to put a bad fence into the stream is a case we should never be. assert() kills the application immediately (in debug builds) and usually this helps the programmer for 1) finding bugs 2) not doing bad code. Silencing is not a good solution especially in this case: tegra_stream_flush() returns an invalid fence when flushing fails. If the application chains submits (i.e. do a blit and then do another using the output of the first blit) it is crucial to be sure the first submit has been performed before starting the second one. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [RFC,libdrm 1/3] tegra: Add stream library
On 12/28/2012 11:04 AM, Mark Zhang wrote: On 12/28/2012 04:50 PM, Arto Merilainen wrote: In my opinion asking tegra_stream_begin() to put a bad fence into the stream is a case we should never be. assert() kills the application immediately (in debug builds) and usually this helps the programmer for 1) finding bugs 2) not doing bad code. Yep, I agree. But in release builds, assert does nothing. So this checking doesn't make sense and also a wrong fence will be pushed into command buffer silently. And we always use release version in real products, so we can't count on this assert. The only pro of using assert is low (=no :-) ) overhead in release builds. Silencing is not a good solution especially in this case: tegra_stream_flush() returns an invalid fence when flushing fails. If the application chains submits (i.e. do a blit and then do another using the output of the first blit) it is crucial to be sure the first submit has been performed before starting the second one. Yes. So I suggest doing fence checking at the beginning of the tegra_stream_begin, if invalid fence found, return an error. This sounds reasonable. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
Re: [RFC,libdrm 1/3] tegra: Add stream library
On 12/28/2012 08:47 AM, Mark Zhang wrote: +int tegra_fence_is_valid(const struct tegra_fence *fence) +{ +int valid = fence ? 1 : 0; +valid = valid fence-id != (uint32_t) -1; +valid = valid fence-id 32; Hardcode here. Assume always has 32 syncpts. Change to a micro wrapped with tegra version ifdef will be better. You are correct. I will fix this. +return valid; +} + [...] + +/* Add fences */ +if (num_fences) { + +tegra_stream_push(stream, +nvhost_opcode_setclass(NV_HOST1X_CLASS_ID, +host1x_uclass_wait_syncpt_r(), num_fences)); + +for (; num_fences; num_fences--, fence++) { +assert(tegra_fence_is_valid(fence)); This is useless. We already add 1 + num_fences to num_words above. So move this assert before adding 1 + num_fences to num_words makes sense. The assertion checks the validity of a single fence - not if there is room in the command buffer. The goal is to prevent having invalid fences in the command stream. If this check were not here it would be possible to initialise a fence with tegra_fence_clear() and put that fence into the stream. + +tegra_stream_push(stream, nvhost_class_host_wait_syncpt(fence-id, +fence-value)); +} +} + +if (class_id) +tegra_stream_push(stream, nvhost_opcode_setclass(class_id, 0, 0)); + +return 0; +} + [...] + +#endif /* TEGRA_DRMIF_H */ - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[RFC v2 6/8] gpu: drm: tegra: Remove redundant host1x
On 12/20/2012 07:14 PM, Stephen Warren wrote: > > What's wrong with just having each device ask the host1x (its parent) > for a pointer to the (dummy) tegradrm device. That seems extremely > We are talking about creating a dummy device because: - we need to give something for drm_platform_init(), - drm related data should be stored somewhere, - some data must be passed to all driver probe() functions. This is not hw-related data, but just some lists that are used to ensure that all drivers have been initialised before calling drm_platform_init(). All these issues are purely tegra-drm related and solving them elsewhere (in host1x) would be just wrong! host1x would not even use the dummy device for anything so creating it there seems bizarre. - Arto
Re: [RFC v2 6/8] gpu: drm: tegra: Remove redundant host1x
On 12/20/2012 07:14 PM, Stephen Warren wrote: What's wrong with just having each device ask the host1x (its parent) for a pointer to the (dummy) tegradrm device. That seems extremely We are talking about creating a dummy device because: - we need to give something for drm_platform_init(), - drm related data should be stored somewhere, - some data must be passed to all driver probe() functions. This is not hw-related data, but just some lists that are used to ensure that all drivers have been initialised before calling drm_platform_init(). All these issues are purely tegra-drm related and solving them elsewhere (in host1x) would be just wrong! host1x would not even use the dummy device for anything so creating it there seems bizarre. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel
[RFC libdrm] Add NVIDIA Tegra support
On 12/04/2012 05:13 PM, Thierry Reding wrote: > +int drm_tegra_open(const char *path, struct drm_tegra **devicep) > +{ > + struct drm_tegra *device; > + int err; > + > + if (!path || !devicep) > + return -EINVAL; > + > + device = calloc(1, sizeof(*device)); > + if (!device) > + return -ENOMEM; > + > + DRMINITLISTHEAD(>bo_list); > + > + device->fd = open(path, O_RDWR); > + if (device->fd < 0) { > + err = -errno; > + free(device); > + return err; > + } > + > + *devicep = device; > + > + return 0; > +} I think you shouldn't ask the path from the application (=DDX) here, but use drmOpen() that automatically finds the correct device for you. I'd also prefer letting the application open and close the device and modify drm_tegra_open() to take the fd as a parameter. That way the DDX could easily access also all generic libdrm functions. - Arto
Re: [RFC libdrm] Add NVIDIA Tegra support
On 12/04/2012 05:13 PM, Thierry Reding wrote: +int drm_tegra_open(const char *path, struct drm_tegra **devicep) +{ + struct drm_tegra *device; + int err; + + if (!path || !devicep) + return -EINVAL; + + device = calloc(1, sizeof(*device)); + if (!device) + return -ENOMEM; + + DRMINITLISTHEAD(device-bo_list); + + device-fd = open(path, O_RDWR); + if (device-fd 0) { + err = -errno; + free(device); + return err; + } + + *devicep = device; + + return 0; +} I think you shouldn't ask the path from the application (=DDX) here, but use drmOpen() that automatically finds the correct device for you. I'd also prefer letting the application open and close the device and modify drm_tegra_open() to take the fd as a parameter. That way the DDX could easily access also all generic libdrm functions. - Arto ___ dri-devel mailing list dri-devel@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/dri-devel