[Bug 41740] New: Mesa 7.12-devel gallium/state_trackers/d3d1x compilation error
https://bugs.freedesktop.org/show_bug.cgi?id=41740

Summary: Mesa 7.12-devel gallium/state_trackers/d3d1x compilation error
Product: Mesa
Version: git
Platform: x86-64 (AMD64)
OS/Version: Linux (All)
Status: NEW
Severity: normal
Priority: medium
Component: Drivers/Gallium/r600
AssignedTo: dri-devel at lists.freedesktop.org
ReportedBy: wolput at onsneteindhoven.nl

Compiling Mesa 7.12-devel configured with --enable-d3d1x shows the following error:

---
In file included from d3d11.cpp:220:0:
d3d11_context.h: In member function 'void GalliumD3D11DeviceContext::init_context()':
d3d11_context.h:153:34: error: 'screen' was not declared in this scope
d3d11.cpp: In function 'HRESULT GalliumD3D11DeviceCreate(pipe_screen*, pipe_context*, BOOL, unsigned int, IDXGIAdapter*, ID3D11Device**)':
d3d11.cpp:224:200: error: new declaration 'HRESULT GalliumD3D11DeviceCreate(pipe_screen*, pipe_context*, BOOL, unsigned int, IDXGIAdapter*, ID3D11Device**)'
../gd3dapi/galliumd3d11.h:65:10: error: ambiguates old declaration 'HRESULT GalliumD3D11DeviceCreate(pipe_screen*, pipe_context*, BOOL, unsigned int, IDXGIAdapter*, ID3D11Device**)'
In file included from d3d11.cpp:220:0:
d3d11_context.h: In member function 'HRESULT GalliumD3D11DeviceContext::Map(ID3D11Resource*, unsigned int, D3D11_MAP, unsigned int, D3D11_MAPPED_SUBRESOURCE*) [with PtrTraits = nonatomic_device_child_ptr_traits, HRESULT = int, ID3D11Resource = ID3D11Resource, D3D11_MAP = D3D11_MAP, D3D11_MAPPED_SUBRESOURCE = D3D11_MAPPED_SUBRESOURCE]':
d3d11.cpp:231:1: instantiated from here
d3d11_context.h:1484:12: warning: unused variable 'face' [-Wunused-variable]
d3d11_context.h: In member function 'void GalliumD3D11DeviceContext::CopySubresourceRegion(ID3D11Resource*, unsigned int, unsigned int, unsigned int, unsigned int, ID3D11Resource*, unsigned int, const D3D11_BOX*) [with PtrTraits = nonatomic_device_child_ptr_traits, ID3D11Resource = ID3D11Resource, D3D11_BOX = D3D11_BOX]':
d3d11.cpp:231:1: instantiated from here
d3d11_context.h:1545:12: warning: unused variable 'dst_face' [-Wunused-variable]
d3d11_context.h:1547:12: warning: unused variable 'src_face' [-Wunused-variable]
make[5]: *** [d3d11.o] Error 1
make[5]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src/gallium/state_trackers/d3d1x/gd3d11'
make[4]: *** [all] Error 2
make[4]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src/gallium/state_trackers/d3d1x'
make[3]: *** [subdirs] Error 1
make[3]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src/gallium/state_trackers'
make[2]: *** [default] Error 1
make[2]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src/gallium'
make[1]: *** [subdirs] Error 1
make[1]: Leaving directory `/home/jos/src/xorg/git-master/mesa/src'
make: *** [default] Error 1
---

Building Mesa 7.11 produces a similar error.
Power profiles low and mid are identical on Radeon HD6470M
On 11.10.2011 23:53, Alex Deucher wrote:
> On Sat, Oct 8, 2011 at 2:25 PM, Wolfgang Fritz wrote:
>> Hello,
>>
>> I have an HP Elitebook 8560p with Radeon HD6470M graphics, running Debian
>> sid with kernel 3.0.4.
>>
>> I noticed that the power profiles low and mid are setting identical clocks
>> and voltage, the lowest possible values:
>>
>> default engine clock: 75 kHz
>> current engine clock: 0 kHz
>> default memory clock: 90 kHz
>> current memory clock: 149970 kHz
>> voltage: 900 mV
>>
>> Looking at the code, this seems to be intentional at least for the mobility
>> chips, but the chip provides more modes:
>>
>> [9.361401] [drm] R600: Number of power states = 7
>> [9.361402] [drm] Is mobility = YES
>> [9.361403] [drm] ps #0 type 0, modes=3
>> [9.361404] [drm] 0: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361406] [drm] 1: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361407] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361409] [drm] ps #1 type 4, modes=3
>> [9.361410] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0
>> [9.361411] [drm] 1: mclk=9, sclk=4, volt=1000, vddci=0
>> [9.361413] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0
>> [9.361414] [drm] ps #2 type 0, modes=3
>> [9.361415] [drm] 0: mclk=9, sclk=7, volt=1100, vddci=0
>> [9.361417] [drm] 1: mclk=9, sclk=7, volt=1100, vddci=0
>> [9.361418] [drm] 2: mclk=9, sclk=7, volt=1100, vddci=0
>> [9.361419] [drm] ps #3 type 2, modes=3
>> [9.361420] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0
>> [9.361422] [drm] 1: mclk=15000, sclk=1, volt=900, vddci=0
>> [9.361423] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361424] [drm] ps #4 type 2, modes=3
>> [9.361426] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361427] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361428] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361430] [drm] ps #5 type 2, modes=3
>> [9.361431] [drm] 0: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361433] [drm] 1: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361434] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0
>> [9.361435] [drm] ps #6 type 0, modes=3
>> [9.361436] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361438] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361439] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
>> [9.361440] [drm] NOT CHIP_R600
>>
>> (dmesg output from patched radeon module)
>>
>> Questions:
>> 1. Is this a bug or a feature? (I see that it is not obvious which power
>> state to choose)
>
> It's the way it is.
>

:-)

>> 2. What do the 3 clock/voltage modes per power state mean?
>
> On r6xx+, each power state defines an operating state (e.g., single
> head battery, multi-head battery, single head performance, multi-head
> performance, etc.). Within each operating state, there are
> high/mid/low clock modes that define that operating state. So if
> you have one head active and are on battery, the driver should switch
> between the high/mid/low clock modes defined in that power state based
> on the GPU load. If you enable multi-head and are still on battery,
> the driver would switch to the multi-head battery state and switch
> between the high/mid/low modes in that state.
>

OK. That's what I assumed after a short code inspection.

So, this is not cooperating well with the current dynamic clock interface
in sysfs (at least as I understand it now).

I understand that there are the dynamic and the profile power methods.

In dynamic, I see the clocks switching, probably using the 3 power states
in the second operation state in the list above (maximum performance).
This results in an average power consumption similar to the catalyst
driver (the fan is off most of the time). But it is not usable because the
screen flickers when the clock state is changed, and this happens quite
frequently. Also it seems to be independent of battery/mains mode.

In the profile power mode, the clocks are at full speed with clock
profiles default and high, and at lowest speed with profiles mid and low.
The high profile keeps the fan running continuously. This seems to be
independent of mains or battery mode (I have to double check this).

Low and mid profiles are unusably slow with 3D effects enabled, but work
quite well with effects disabled, so this would be a suitable profile on
low battery.

With power profile auto, power state is high performance in mains mode and
low in battery mode.

So, as long as true dynamic clocking is not working flicker-free, it would
be nice to be able to change the clock modes manually to a value that
keeps the fan quiet but is sufficient for ordinary work with effects
enabled. I am currently running at 400/650 MHz @ 900mV with a patched
driver.

Finally some questions:

Q1: Are all the power modes safe
[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
From: Alex Deucher

Settings in this table reflect the physical panel/connector
rather than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..eb3f6dc 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder *encoder)
 			break;
 		case 2:
 			args.v2.ucCRTC = radeon_crtc->crtc_id;
-			args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			if (radeon_encoder_is_dp_bridge(encoder)) {
+				struct drm_connector *connector = radeon_get_connector_for_encoder(encoder);
+
+				if (connector->connector_type == DRM_MODE_CONNECTOR_LVDS)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else if (connector->connector_type == DRM_MODE_CONNECTOR_VGA)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_CRT;
+				else
+					args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			} else
+				args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
-- 
1.7.1.1
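The selection rule in the v2 patch can be tried outside the kernel with a small stand-alone sketch. This is only an illustration: the enums and the function name below are hypothetical stand-ins, not the driver's DRM_MODE_CONNECTOR_* or ATOM_ENCODER_MODE_* definitions, and the caller is assumed to have already computed the internal mode that atombios_get_encoder_mode() would return.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the connector types the patch checks. */
enum connector_type { CONNECTOR_LVDS, CONNECTOR_VGA, CONNECTOR_OTHER };

/* Hypothetical stand-ins for the encode modes written to ucEncodeMode. */
enum encode_mode { MODE_LVDS, MODE_CRT, MODE_DP };

/*
 * Behind a DP bridge, report the mode of the physical panel/connector;
 * in every other case keep the internal dig encoding mode.
 */
enum encode_mode crtc_source_encode_mode(bool is_dp_bridge,
                                         enum connector_type connector,
                                         enum encode_mode internal_mode)
{
    if (is_dp_bridge) {
        if (connector == CONNECTOR_LVDS)
            return MODE_LVDS;
        if (connector == CONNECTOR_VGA)
            return MODE_CRT;
    }
    return internal_mode;
}
```

With these stand-ins, a VGA connector behind a bridge yields MODE_CRT, which is the case the v2 note corrects.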
[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges
From: Alex Deucher

Settings in this table reflect the physical panel/connector
rather than the internal dig encoding.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..bfe1662 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder *encoder)
 			break;
 		case 2:
 			args.v2.ucCRTC = radeon_crtc->crtc_id;
-			args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			if (radeon_encoder_is_dp_bridge(encoder)) {
+				struct drm_connector *connector = radeon_get_connector_for_encoder(encoder);
+
+				if (connector->connector_type == DRM_MODE_CONNECTOR_LVDS)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else if (connector->connector_type == DRM_MODE_CONNECTOR_VGA)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else
+					args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			} else
+				args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
-- 
1.7.1.1
[PATCH 2/3] drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
From: Alex Deucher

It's handled via external clock. It should already be protected
by the external ss flag, but add an explicit check just in case.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/radeon/atombios_crtc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index c742944..a515b2a 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -466,7 +466,7 @@ static void atombios_crtc_program_ss(struct drm_crtc *crtc,
 			return;
 		}
 		args.v2.ucEnable = enable;
-		if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK))
+		if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK) || ASIC_IS_DCE41(rdev))
 			args.v2.ucEnable = ATOM_DISABLE;
 	} else if (ASIC_IS_DCE3(rdev)) {
 		args.v1.usSpreadSpectrumPercentage = cpu_to_le16(ss->percentage);
-- 
1.7.1.1
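As a rough stand-alone sketch of the condition this patch changes, the enable decision can be written as a pure function. The function name and the mask value are assumptions made for illustration (the real mask is ATOM_EXTERNAL_SS_MASK from the driver headers), and is_dce41 stands in for ASIC_IS_DCE41(rdev).

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit for "external spread spectrum"; an assumed value. */
#define EXTERNAL_SS_MASK 0x02

/*
 * Returns whether spread spectrum ends up enabled on the internal PPLL:
 * it is forced off when the percentage is zero, when the table entry is
 * flagged as external, or (the new check) when the ASIC is DCE4.1.
 */
bool internal_ss_enabled(bool enable, uint16_t percentage,
                         uint8_t type, bool is_dce41)
{
    if (percentage == 0 || (type & EXTERNAL_SS_MASK) || is_dce41)
        return false;
    return enable;
}
```

The third operand is redundant whenever the external flag is already set, which matches the "just in case" wording of the commit message.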
[PATCH 1/3] drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
From: Alex Deucher

llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.

Signed-off-by: Alex Deucher
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   13 +
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index 8a171b2..a90d9ee 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1756,10 +1756,15 @@ static int radeon_atom_pick_dig_encoder(struct drm_encoder *encoder)
 	if (ASIC_IS_DCE4(rdev)) {
 		dig = radeon_encoder->enc_priv;
 		if (ASIC_IS_DCE41(rdev)) {
-			if (dig->linkb)
-				return 1;
-			else
-				return 0;
+			/* ontario follows DCE4 */
+			if (rdev->family == CHIP_PALM) {
+				if (dig->linkb)
+					return 1;
+				else
+					return 0;
+			} else
+				/* llano follows DCE3.2 */
+				return radeon_crtc->crtc_id;
 		} else {
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
-- 
1.7.1.1
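The DCE4.1 branch added here boils down to a two-way choice, which the following stand-alone sketch mirrors. The enum and function name are hypothetical and exist only for this illustration; in the driver the test is rdev->family == CHIP_PALM and the inputs come from the encoder and CRTC structures.

```c
#include <stdbool.h>

/* Hypothetical family tags for the two DCE4.1 parts named in the patch. */
enum dce41_family { FAMILY_ONTARIO, FAMILY_LLANO };

/* Picks the DIG encoder index for a DCE4.1 part. */
int pick_dig_encoder_dce41(enum dce41_family family, bool linkb, int crtc_id)
{
    if (family == FAMILY_ONTARIO)
        return linkb ? 1 : 0;   /* hardcoded mapping, as on DCE4.0 */
    return crtc_id;             /* routeable encoders, as on DCE3.2 */
}
```

On the routeable part the link no longer matters, so the encoder simply follows the CRTC it is driven by.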
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 03:34:54PM +0100, Dave Airlie wrote:
> On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark wrote:
> > On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie wrote:
> >>> But then we'd need a different set of accessors for every different
> >>> drm/v4l/etc driver, wouldn't we?
> >>
> >> Not any more different than you need for this, you just have a new
> >> interface that you request a sw object from,
> >> then mmap that object, and underneath it knows who owns it in the kernel.
> >
> > oh, ok, so you are talking about a kernel level interface, rather than
> > userspace..
> >
> > but I guess in this case I don't quite see the difference.  It amounts
> > to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
> > directly then you don't have to pass around a 2nd fd.
> >
> > [*] there is nothing stopping defining some dmabuf ioctls (such as for
> > synchronization).. although the thinking was to keep it simple for
> > first version of dmabuf
> >
>
> Yes a separate kernel level interface.
>
> Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
> shoehorning in a sw mapping API isn't making it simpler.
>
> The problem I have with implementing mmap on the sharing fd, is that
> nothing says this should be purely optional and userspace shouldn't
> rely on it.
>
> In the Intel GEM space alone you have two types of mapping, one direct
> to shmem one via GTT, the GTT could even be a linear view. The
> intel guys initially did GEM mmaps direct to the shmem pages because
> it seemed simple, up until they
> had to do step two which was do mmaps on the GTT copy and ended up
> having two separate mmap methods. I think the problem here is it seems
> deceptively simple to add this to the API now because the API is
> simple, however I think in the future it'll become a burden that we'll
> have to work around.

Yeah, that's my feeling, too. Adding mmap sounds like a neat, simple idea,
that could simplify things for simple devices like v4l. But as soon as
you're dealing with a real gpu, nothing is simple. Those who don't believe
this, just take a look at the data upload/download paths in the
open-source i915, nouveau, radeon drivers. Making this fast (and for gpus,
it needs to be fast) requires tons of tricks, special-cases and jumping
through hoops.

You absolutely want the device-specific ioctls to do that. Adding a
generic mmap just makes matters worse, especially if userspace expects
this to work synchronized with everything else that is going on.

Cheers, Daniel
-- 
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
[PATCH 19/21] drm/i915: Asynchronous eDP panel power off
> Using the same basic plan as the VDD force delayed power off, make
> turning the panel power off asynchronous.

NAK, tested on my 2540p, up to this patch in macbook-air branch stuff
worked, after this I just get black screen on resume.

Dave.
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark wrote:
> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie wrote:
>>> But then we'd need a different set of accessors for every different
>>> drm/v4l/etc driver, wouldn't we?
>>
>> Not any more different than you need for this, you just have a new
>> interface that you request a sw object from,
>> then mmap that object, and underneath it knows who owns it in the kernel.
>
> oh, ok, so you are talking about a kernel level interface, rather than
> userspace..
>
> but I guess in this case I don't quite see the difference.  It amounts
> to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
> directly then you don't have to pass around a 2nd fd.
>
> [*] there is nothing stopping defining some dmabuf ioctls (such as for
> synchronization).. although the thinking was to keep it simple for
> first version of dmabuf
>

Yes a separate kernel level interface.

Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
shoehorning in a sw mapping API isn't making it simpler.

The problem I have with implementing mmap on the sharing fd, is that
nothing says this should be purely optional and userspace shouldn't
rely on it.

In the Intel GEM space alone you have two types of mapping, one direct
to shmem one via GTT, the GTT could even be a linear view. The
intel guys initially did GEM mmaps direct to the shmem pages because
it seemed simple, up until they
had to do step two which was do mmaps on the GTT copy and ended up
having two separate mmap methods. I think the problem here is it seems
deceptively simple to add this to the API now because the API is
simple, however I think in the future it'll become a burden that we'll
have to work around.

Dave.
[RFC 2/2] dma-buf: Documentation for buffer sharing framework
On 10/11/2011 02:23 AM, Sumit Semwal wrote:
> Add documentation for dma buffer sharing framework, explaining the
> various operations, members and API of the dma buffer sharing
> framework.
>
> Signed-off-by: Sumit Semwal
> Signed-off-by: Sumit Semwal
> ---
>  Documentation/dma-buf-sharing.txt |  210 +
>  1 files changed, 210 insertions(+), 0 deletions(-)
>  create mode 100644 Documentation/dma-buf-sharing.txt
>
> diff --git a/Documentation/dma-buf-sharing.txt b/Documentation/dma-buf-sharing.txt
> new file mode 100644
> index 000..4da6644
> --- /dev/null
> +++ b/Documentation/dma-buf-sharing.txt
> @@ -0,0 +1,210 @@
> +DMA Buffer Sharing API Guide
> +
> +
> +Sumit Semwal
> +
> +
> +
> +This document serves as a guide to device-driver writers on what is the dma-buf
> +buffer sharing API, how to use it for exporting and using shared buffers.
> +
> +Any device driver which wishes to be a part of dma buffer sharing, can do so as

Please use DMA instead of dma (except combinations like dma-buf are OK).
[multiple]

> +either the 'exporter' of buffers, or the 'user' of buffers.
> +
> +Say a driver A wants to use buffers created by driver B, then we call B as the
> +exporter, and B as buffer-user.

and A

> +
> +The exporter
> +- implements and manages operations[1] for the buffer
> +- allows other users to share the buffer by using dma_buf sharing APIs,
> +- manages the details of buffer allocation,
> +- decides about the actual backing storage where this allocation happens,
> +- takes care of any migration of scatterlist - for all (shared) users of this
> +  buffer,
> +- optionally, provides mmap capability for drivers that need it.
> +
> +The buffer-user
> +- is one of (many) sharing users of the buffer.
> +- doesn't need to worry about how the buffer is allocated, or where.
> +- needs a mechanism to get access to the scatterlist that makes up this buffer
> +  in memory, mapped into its own address space, so it can access the same area
> +  of memory.
> +
> +
> +The dma_buf buffer sharing API usage contains the following steps:
> +
> +1. Exporter announces that it wishes to export a buffer
> +2. Userspace gets the file descriptor associated with the exported buffer, and
> +   passes it around to potential buffer-users based on use case
> +3. Each buffer-user 'connects' itself to the buffer
> +4. When needed, buffer-user requests access to the buffer from exporter
> +5. When finished with its use, the buffer-user notifies end-of-dma to exporter
> +6. when buffer-user is done using this buffer completely, it 'disconnects'
> +   itself from the buffer.
> +
> +
> +1. Exporter's announcement of buffer export
> +
> +   The buffer exporter announces its wish to export a buffer. In this, it
> +   connects its own private buffer data, provides implementation for operations
> +   that can be performed on the exported dma_buf, and flags for the file
> +   associated with this buffer.
> +
> +   Interface:
> +      struct dma_buf *dma_buf_export(void *priv, struct dma_buf_ops *ops,
> +                                     int flags)
> +
> +   If this succeeds, dma_buf_export allocates a dma_buf structure, and returns a
> +   pointer to the same. It also associates an anon file with this buffer, so it

s/anon/anonymous/ (multiple)

> +   can be exported. On failure to allocate the dma_buf object, it returns NULL.
> +
> +2. Userspace gets a handle to pass around to potential buffer-users
> +
> +   Userspace entity requests for a file-descriptor (fd) which is a handle to the
> +   anon file associated with the buffer. It can then share the fd with other
> +   drivers and/or processes.
> +
> +   Interface:
> +      int dma_buf_fd(struct dma_buf *dmabuf)
> +
> +   This API installs an fd for the anon file associated with this buffer;
> +   returns either 'fd', or error.
> +
> +3. Each buffer-user 'connects' itself to the buffer
> +
> +   Each buffer-user now gets a reference to the buffer, using the fd passed to
> +   it.
> +
> +   Interface:
> +      struct dma_buf *dma_buf_get(int fd)
> +
> +   This API will return a reference to the dma_buf, and increment refcount for
> +   it.
> +
> +   After this, the buffer-user needs to attach its device with the buffer, which
> +   helps the exporter to know of device buffer constraints.
> +
> +   Interface:
> +      struct dma_buf_attachment *dma_buf_attach(struct dma_buf *dmabuf,
> +                                                struct device *dev)
> +
> +   This API returns reference to an attachment structure, which is then used
> +   for scatterlist operations. It will optionally call the 'attach' dma_buf
> +   operation, if provided by the exporter.
> +
> +   The dma-buf sharing framework does the book-keeping bits related to keeping
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
> But then we'd need a different set of accessors for every different
> drm/v4l/etc driver, wouldn't we?

Not any more different than you need for this, you just have a new
interface that you request a sw object from,
then mmap that object, and underneath it knows who owns it in the kernel.

mmap just feels wrong in this API, which is a buffer sharing API not a
buffer mapping API.

> I guess if sharing a buffer between multiple drm devices, there is
> nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
> pass when the buffer is allocated, then you don't have to support
> dmabuf->mmap(), and instead mmap via device and use some sort of
> DRM_CPU_PREP/FINI ioctls for synchronization..

Or we could make a generic CPU accessor that we don't have to worry about.

Dave.
[PATCH 1/2] drm/radeon: allow pcie gen2 speed on NI
On Wed, Oct 12, 2011 at 2:25 PM, Ilija Hadzic wrote:
>
> Hi Dave,
>
> A few weeks ago I sent the two patches that allow PCI Express interface to
> run at Gen 2 speed on NI parts. Links to the patches in the mailing list
> archive + review from Alex quoted below:
>
> http://lists.freedesktop.org/archives/dri-devel/2011-September/014474.html
> http://lists.freedesktop.org/archives/dri-devel/2011-September/014475.html
>
> I saw some activity on drm-next and drm-core-next branches, but I have not
> seen these two patches merged yet. Just wondering if they are in the queue
> for merging or if they may have fallen through the cracks?

/me misses patchwork a lot.

I've picked them up now.

Thanks,
Dave.
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
>
> well, the mmap is actually implemented by the buffer allocator
> (v4l/drm).. although not sure if this was the point

Then why not use the correct interface? Doing some sort of not-quite
generic interface isn't really helping anyone except adding an ABI
that we have to support.

If someone wants to bypass the current kernel APIs we should add a new
API for them, not shove it into this generic buffer sharing layer.

> The intent was that this is for well defined formats.. ie. it would
> need to be a format that both v4l and drm understood in the first
> place for sharing to make sense at all..

How will you know the stride, to take a simple example? The userspace
had to create this buffer somehow and wants to share it with
"something", you sound like
you really need another API that is a simple accessor API that can
handle mmaps.

> Anyways, the basic reason is to handle random edge cases where you
> need sw access to the buffer.  For example, you are decoding video and
> pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

Again, doesn't sound like it should be part of this API, and also
sounds like the sw jpeg encoder will need more info about the buffer
anyways like stride and format.

> With this current scheme, synchronization could be handled in
> dmabufops->mmap() and vm_ops->close()..  it is perhaps a bit heavy to
> require mmap/munmap for each sw access, but I suppose this isn't
> really for the high-performance use case.  It is just so that some
> random bit of sw that gets passed a dmabuf handle without knowing who
> allocated it can have sw access if really needed.

So I think that's fine, write a sw accessor provider, don't go
overloading the buffer sharing code.

This API will limit what people can use this buffer sharing for with
pure hw accessors, you might say, oh but it's okay to fail the mmap
then, but the chances of sw handling that I'm not so sure of.

Dave.
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Tue, Oct 11, 2011 at 10:23 AM, Sumit Semwal wrote:
> This is the first step in defining a dma buffer sharing mechanism.
>
> A new buffer object dma_buf is added, with operations and API to allow easy
> sharing of this buffer object across devices.
>
> The framework allows:
> - a new buffer-object to be created with fixed size.
> - different devices to 'attach' themselves to this buffer, to facilitate
>   backing storage negotiation, using dma_buf_attach() API.
> - association of a file pointer with each user-buffer and associated
>   allocator-defined operations on that buffer. This operation is called the
>   'export' operation.
> - this exported buffer-object to be shared with the other entity by asking for
>   its 'file-descriptor (fd)', and sharing the fd across.
> - a received fd to get the buffer object back, where it can be accessed using
>   the associated exporter-defined operations.
> - the exporter and user to share the scatterlist using get_scatterlist and
>   put_scatterlist operations.
>
> At least one 'attach()' call is required to be made prior to calling the
> get_scatterlist() operation.
>
> Couple of building blocks in get_scatterlist() are added to ease introduction
> of sync'ing across exporter and users, and late allocation by the exporter.
>
> mmap() file operation is provided for the associated 'fd', as wrapper over the
> optional allocator defined mmap(), to be used by devices that might need one.

Why is this needed? It really doesn't make sense to be mmaping objects
independent of some front-end like drm or v4l.

How will you know what contents are in them? How will you synchronise
access? Unless someone has a hard use-case for this I'd say we drop it
until someone does.

Dave.
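The ordering rule in the quoted description (at least one attach() before get_scatterlist()) can be modelled with a tiny userspace toy. This is not the RFC's kernel code: struct toy_dma_buf and both functions are invented for illustration, and returning -EINVAL for the unattached case is an assumption, since the quoted text does not say which error is used.

```c
#include <errno.h>
#include <stddef.h>

/* Toy stand-in for the RFC's dma_buf; NOT the kernel structure. */
struct toy_dma_buf {
    size_t size;        /* fixed at creation, per the description */
    int attach_count;   /* number of devices currently attached */
};

/* Models dma_buf_attach(): records one more attached device. */
void toy_attach(struct toy_dma_buf *buf)
{
    buf->attach_count++;
}

/* Models get_scatterlist(): refused until some device has attached. */
int toy_get_scatterlist(const struct toy_dma_buf *buf)
{
    return buf->attach_count > 0 ? 0 : -EINVAL;   /* assumed error code */
}
```

The model leaves out mmap() entirely, which is the part of the proposal being questioned in this reply.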
[PATCH 19/21] drm/i915: Asynchronous eDP panel power off
On Wed, 12 Oct 2011 15:41:11 +0100, Dave Airlie wrote:
> > Using the same basic plan as the VDD force delayed power off, make
> > turning the panel power off asynchronous.
>
> NAK, tested on my 2540p, up to this patch in macbook-air branch stuff
> worked, after this I just get black screen on resume.

Thanks for testing. I've created a new edp-training-fixes branch that
removes the async panel power off and leaves the rest of the
branch. I'll test on a 2540p that I've got access to today, and on the
MBA when I get home this evening.

-- 
keith.packard at intel.com
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 9:34 AM, Dave Airlie wrote:
> On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark wrote:
>> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie wrote:
>>>> But then we'd need a different set of accessors for every different
>>>> drm/v4l/etc driver, wouldn't we?
>>>
>>> Not any more different than you need for this, you just have a new
>>> interface that you request a sw object from,
>>> then mmap that object, and underneath it knows who owns it in the kernel.
>>
>> oh, ok, so you are talking about a kernel level interface, rather than
>> userspace..
>>
>> but I guess in this case I don't quite see the difference.  It amounts
>> to which fd you call mmap (or ioctl[*]) on..  If you use the dmabuf fd
>> directly then you don't have to pass around a 2nd fd.
>>
>> [*] there is nothing stopping defining some dmabuf ioctls (such as for
>> synchronization).. although the thinking was to keep it simple for
>> first version of dmabuf
>>
>
> Yes a separate kernel level interface.

I'm not against it, but if it is a device-independent interface, it
just seems like six of one, half-dozen of the other..

Ie. how does it differ if the dmabuf fd is the fd used for ioctl/mmap,
vs if some other /dev/buffer-sharer file that you open?

But I think maybe I'm misunderstanding what you have in mind?

BR,
-R

> Well I'd like to keep it even simpler. dmabuf is a buffer sharing API,
> shoehorning in a sw mapping API isn't making it simpler.
>
> The problem I have with implementing mmap on the sharing fd, is that
> nothing says this should be purely optional and userspace shouldn't
> rely on it.
>
> In the Intel GEM space alone you have two types of mapping, one direct
> to shmem one via GTT, the GTT could even be a linear view. The
> intel guys initially did GEM mmaps direct to the shmem pages because
> it seemed simple, up until they
> had to do step two which was do mmaps on the GTT copy and ended up
> having two separate mmap methods. I think the problem here is it seems
> deceptively simple to add this to the API now because the API is
> simple, however I think in the future it'll become a burden that we'll
> have to work around.
>
> Dave.
>
[Bug 36003] [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)
https://bugs.freedesktop.org/show_bug.cgi?id=36003

Jeremy Huddleston changed:

           What    |Removed |Added
     Status        |NEW     |RESOLVED
     Resolution    |        |FIXED

--- Comment #20 from Jeremy Huddleston 2011-10-12 10:00:48 PDT ---
Thanks. Closing based on the above comment.
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie wrote:
>> But then we'd need a different set of accessors for every different
>> drm/v4l/etc driver, wouldn't we?
>
> Not any more different than you need for this, you just have a new
> interface that you request a sw object from,
> then mmap that object, and underneath it knows who owns it in the kernel.

oh, ok, so you are talking about a kernel level interface, rather than
userspace..

but I guess in this case I don't quite see the difference. It amounts
to which fd you call mmap (or ioctl[*]) on.. If you use the dmabuf fd
directly then you don't have to pass around a 2nd fd.

[*] there is nothing stopping defining some dmabuf ioctls (such as for
synchronization).. although the thinking was to keep it simple for
first version of dmabuf

BR,
-R

> mmap just feels wrong in this API, which is a buffer sharing API not a
> buffer mapping API.
>
>> I guess if sharing a buffer between multiple drm devices, there is
>> nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
>> pass when the buffer is allocated, then you don't have to support
>> dmabuf->mmap(), and instead mmap via device and use some sort of
>> DRM_CPU_PREP/FINI ioctls for synchronization..
>
> Or we could make a generic CPU accessor that we don't have to worry about.
>
> Dave.
>
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 8:35 AM, Dave Airlie wrote:
>> well, the mmap is actually implemented by the buffer allocator
>> (v4l/drm).. although not sure if this was the point
>
> Then why not use the correct interface? doing some sort of not-quite
> generic interface isn't really helping anyone except adding an ABI
> that we have to support.

But what if you don't know who allocated the buffer? How do you know
what interface to use to mmap?

> If someone wants to bypass the current kernel APIs we should add a new
> API for them not shove it into this generic buffer sharing layer.
>
>> The intent was that this is for well defined formats.. ie. it would
>> need to be a format that both v4l and drm understood in the first
>> place for sharing to make sense at all..
>
> How will you know the stride to take a simple example? The userspace
> had to create this buffer somehow and wants to share it with
> "something", you sound like
> you really needs another API that is a simple accessor API that can
> handle mmaps.

Well, things like stride, width, height, color format, userspace needs
to know all this already, even for malloc()'d sw buffers. The
assumption is userspace already has a way to pass this information
around so it was not required to be duplicated by dmabuf.

>> Anyways, the basic reason is to handle random edge cases where you
>> need sw access to the buffer. For example, you are decoding video and
>> pull out a frame to generate a thumbnail w/ a sw jpeg encoder..
>
> Again, doesn't sound like it should be part of this API, and also
> sounds like the sw jpeg encoder will need more info about the buffer
> anyways like stride and format.
>
>> With this current scheme, synchronization could be handled in
>> dmabufops->mmap() and vm_ops->close().. it is perhaps a bit heavy to
>> require mmap/munmap for each sw access, but I suppose this isn't
>> really for the high-performance use case. It is just so that some
>> random bit of sw that gets passed a dmabuf handle without knowing who
>> allocated it can have sw access if really needed.
>
> So I think thats fine, write a sw accessor providers, don't go
> overloading the buffer sharing code.

But then we'd need a different set of accessors for every different
drm/v4l/etc driver, wouldn't we?

> This API will limit what people can use this buffer sharing for with
> pure hw accessors, you might say, oh buts its okay to fail the mmap
> then, but the chances of sw handling that I'm not so sure off.

I'm not entirely sure the case you are worried about.. sharing buffers
between multiple GPU's that understand same tiled formats? I guess
that is a bit different from a case like a jpeg encoder that is passed
a dmabuf handle without any idea where it came from..

I guess if sharing a buffer between multiple drm devices, there is
nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you
pass when the buffer is allocated, then you don't have to support
dmabuf->mmap(), and instead mmap via device and use some sort of
DRM_CPU_PREP/FINI ioctls for synchronization..

BR,
-R

> Dave.
[Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 7:41 AM, Dave Airlie wrote:
> On Tue, Oct 11, 2011 at 10:23 AM, Sumit Semwal wrote:
>> This is the first step in defining a dma buffer sharing mechanism.
>>
>> A new buffer object dma_buf is added, with operations and API to allow easy
>> sharing of this buffer object across devices.
>>
>> The framework allows:
>> - a new buffer-object to be created with fixed size.
>> - different devices to 'attach' themselves to this buffer, to facilitate
>>   backing storage negotiation, using dma_buf_attach() API.
>> - association of a file pointer with each user-buffer and associated
>>   allocator-defined operations on that buffer. This operation is called the
>>   'export' operation.
>> - this exported buffer-object to be shared with the other entity by asking for
>>   its 'file-descriptor (fd)', and sharing the fd across.
>> - a received fd to get the buffer object back, where it can be accessed using
>>   the associated exporter-defined operations.
>> - the exporter and user to share the scatterlist using get_scatterlist and
>>   put_scatterlist operations.
>>
>> Atleast one 'attach()' call is required to be made prior to calling the
>> get_scatterlist() operation.
>>
>> Couple of building blocks in get_scatterlist() are added to ease introduction
>> of sync'ing across exporter and users, and late allocation by the exporter.
>>
>> mmap() file operation is provided for the associated 'fd', as wrapper over the
>> optional allocator defined mmap(), to be used by devices that might need one.
>
> Why is this needed? it really doesn't make sense to be mmaping objects
> independent of some front-end like drm or v4l.

well, the mmap is actually implemented by the buffer allocator
(v4l/drm).. although not sure if this was the point

> how will you know what contents are in them, how will you synchronise
> access. Unless someone has a hard use-case for this I'd say we drop it
> until someone does.

The intent was that this is for well defined formats.. ie. it would
need to be a format that both v4l and drm understood in the first
place for sharing to make sense at all..

Anyways, the basic reason is to handle random edge cases where you
need sw access to the buffer. For example, you are decoding video and
pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

On gstreamer 0.11 branch, for example, there is already a map/unmap
virtual method on the gst buffer for sw access (ie. same purpose as
PrepareAccess/FinishAccess in EXA). The idea w/ dmabuf mmap() support
is that we could implement support to mmap()/munmap() before/after sw
access.

With this current scheme, synchronization could be handled in
dmabufops->mmap() and vm_ops->close().. it is perhaps a bit heavy to
require mmap/munmap for each sw access, but I suppose this isn't
really for the high-performance use case. It is just so that some
random bit of sw that gets passed a dmabuf handle without knowing who
allocated it can have sw access if really needed.

BR,
-R

> Dave.
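The mmap()/munmap() bracketing described in this reply can be sketched from the userspace side. This is only an illustration under stated assumptions: no dma-buf fd exists outside the RFC, so a temporary file stands in for the shared buffer, and `make_stand_in_fd()` and `read_via_mmap()` are hypothetical helper names. In the scheme discussed above the exporter would synchronize inside its mmap and vm_ops->close hooks; nothing of that kind happens here.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

/* Hypothetical stand-in for a shared buffer fd: a temporary file holding
 * 'len' bytes of 'data'. Returns an fd the caller must close, or -1. */
static int make_stand_in_fd(const void *data, size_t len)
{
    FILE *f = tmpfile();
    int fd;

    if (!f)
        return -1;
    if (fwrite(data, 1, len, f) != len || fflush(f) != 0) {
        fclose(f);
        return -1;
    }
    fd = dup(fileno(f));   /* keep the file alive after fclose() */
    fclose(f);
    return fd;
}

/* Software access bracketed by mmap()/munmap(). The caller knows only the
 * fd and the size, not which driver allocated the buffer.
 * Returns 0 on success, -1 if the buffer cannot be mapped. */
static int read_via_mmap(int fd, size_t len, void *out)
{
    void *p = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);

    if (p == MAP_FAILED)
        return -1;         /* software must cope with non-mappable buffers */
    memcpy(out, p, len);   /* the actual sw access */
    return munmap(p, len); /* end of the access window */
}
```

The failure path is the case raised in the objection: a caller that is handed a buffer which refuses mmap() has to handle the error rather than assume CPU access is always available.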
[patch] drm/nva3: checking the wrong variable
On Tue, 2011-10-11 at 17:34 +0300, Dan Carpenter wrote:
> "id" is unsigned here and it's never less than zero. I believe the
> intent was to check the return value from nva3_pm_pll_offset().
> Also I've changed it to pass on the -ENOENT error code from the lower
> levels instead of returning -EINVAL.

The patch looks correct. It's worth noting though that a complete
rewrite of that particular code is queued for 3.2 already.

Ben.

>
> Signed-off-by: Dan Carpenter
>
> diff --git a/drivers/gpu/drm/nouveau/nva3_pm.c b/drivers/gpu/drm/nouveau/nva3_pm.c
> index e4b2b9e..0be517d 100644
> --- a/drivers/gpu/drm/nouveau/nva3_pm.c
> +++ b/drivers/gpu/drm/nouveau/nva3_pm.c
> @@ -112,8 +112,8 @@ nva3_pm_clock_pre(struct drm_device *dev, struct nouveau_pm_level *perflvl,
>  		return (ret == -ENOENT) ? NULL : ERR_PTR(ret);
>
>  	off = nva3_pm_pll_offset(id);
> -	if (id < 0)
> -		return ERR_PTR(-EINVAL);
> +	if (off < 0)
> +		return ERR_PTR(off);
>
>
>  	pll = kzalloc(sizeof(*pll), GFP_KERNEL);
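The class of bug this patch fixes can be reproduced outside the kernel. The following is a stand-alone sketch, not the nouveau code: `lookup_offset()` is a hypothetical stand-in for nva3_pm_pll_offset(), and the ids and `REG_BASE` are invented for illustration. It shows why a `< 0` test on the unsigned id can never fire, and what changes when the signed return value is tested and passed on instead.

```c
#include <errno.h>
#include <stdint.h>

#define REG_BASE 0x4000   /* invented base address, illustration only */

/* Hypothetical stand-in for nva3_pm_pll_offset(): a non-negative offset
 * for a known id, a negative errno for an unknown one. */
static int lookup_offset(uint32_t id)
{
    switch (id) {
    case 0x10: return 0;
    case 0x11: return 1;
    default:   return -ENOENT;
    }
}

/* Buggy pattern: 'id' is unsigned, so 'id < 0' is always false and a
 * negative offset is silently folded into the address. */
static int reg_addr_buggy(uint32_t id)
{
    int off = lookup_offset(id);

    if (id < 0)
        return -EINVAL;
    return REG_BASE + off;
}

/* Fixed pattern: test the signed return value and propagate it. */
static int reg_addr_fixed(uint32_t id)
{
    int off = lookup_offset(id);

    if (off < 0)
        return off;
    return REG_BASE + off;
}
```

For an unknown id the buggy variant returns a plausible-looking positive address, while the fixed variant returns -ENOENT. Compilers typically flag the dead comparison only when a warning such as -Wtype-limits is enabled.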
[PATCH 1/2] drm/radeon: allow pcie gen2 speed on NI
Hi Dave,

A few weeks ago I sent the two patches that allow PCI Express interface
to run at Gen 2 speed on NI parts. Links to the patches in the mailing
list archive + review from Alex quoted below:

http://lists.freedesktop.org/archives/dri-devel/2011-September/014474.html
http://lists.freedesktop.org/archives/dri-devel/2011-September/014475.html

I saw some activity on drm-next and drm-core-next branches, but I have
not seen these two patches merge yet. Just wondering if they are in the
queue for merging or if they may have fell through the cracks?

thanks,
Ilija

On Tue, 20 Sep 2011, Alex Deucher wrote:

> On Tue, Sep 20, 2011 at 10:22 AM, Ilija Hadzic wrote:
>> Enabling pcie gen2 speed was skipped for Northern Islands
>> AISCs, although it looks like it works just fine with the same
>> initialization sequence used for evergreen.
>>
>> According to Alex D. gen2 init was skipped to prevent a crash
>> that has been caused by some other bug that has been
>> fixed in the meantime; so now it should be safe to enable it.
>>
>> Signed-off-by: Ilija Hadzic
>
> I just double checked and BTC and cayman use the same programming
> method. Both patches:
>
> Reviewed-by: Alex Deucher
>
> Thanks!
>
> Alex
>
>> ---
>>  drivers/gpu/drm/radeon/evergreen.c |    3 +--
>>  1 files changed, 1 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
>> index f09bace..208b59c 100644
>> --- a/drivers/gpu/drm/radeon/evergreen.c
>> +++ b/drivers/gpu/drm/radeon/evergreen.c
>> @@ -2987,8 +2987,7 @@ static int evergreen_startup(struct radeon_device *rdev)
>>  	int r;
>>
>>  	/* enable pcie gen2 link */
>> -	if (!ASIC_IS_DCE5(rdev))
>> -		evergreen_pcie_gen2_enable(rdev);
>> +	evergreen_pcie_gen2_enable(rdev);
>>
>>  	if (ASIC_IS_DCE5(rdev)) {
>>  		if (!rdev->me_fw || !rdev->pfp_fw || !rdev->rlc_fw || !rdev->mc_fw) {
>> --
>> 1.7.6
[Bug 41698] [r300g] Flickering user interface in WoW
https://bugs.freedesktop.org/show_bug.cgi?id=41698 --- Comment #1 from Chris Rankin 2011-10-12 03:35:09 PDT --- Reverting this single patch in git (with the exception of the header file that no longer exists, of course) has fixed the flickering problem. So far, anyway. -- Configure bugmail: https://bugs.freedesktop.org/userprefs.cgi?tab=email --- You are receiving this mail because: --- You are the assignee for the bug.
[Bug 41668] Screen locks up at random points when using a 3D compositing wm (gnome-shell) on an rv515 (radeon mobility x1300)
https://bugs.freedesktop.org/show_bug.cgi?id=41668

--- Comment #11 from dmotd 2011-10-12 03:24:07 PDT ---
(In reply to comment #10)
> (In reply to comment #6)
> > > > running glxgears just shows an empty black box.. all other glx demos are the
> > > > same empty boxes..
> > >
> > > Do they work with the environment variable vblank_mode=0? If yes, does the
> > > number for radeon increase in /proc/interrupts once the problem occurs?
> >
> > setting vblank_mode=0 works and displays an output.. but not much change in
> > /proc/interrupts (irq 46 for radeon)
>
> Not much change for the radeon number, or none at all? If the latter,
> apparently the IRQ for the radeon card stops working for some reason, which
> would explain the core symptoms of the freeze.

no change to the radeon irq number.
[Bug 23103] screen not lighting up on resume when using kms
https://bugs.freedesktop.org/show_bug.cgi?id=23103

Michel Dänzer changed:

       What|Removed                        |Added
    Product|xorg                           |DRI
    Version|7.4                            |unspecified
  Component|Driver/Radeon                  |DRM/Radeon
 AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
  QAContact|xorg-team at lists.x.org       |
[Bug 24097] screen backlight off after resume-from-suspend when using ATI KMS
https://bugs.freedesktop.org/show_bug.cgi?id=24097

Michel Dänzer changed:

             What|Removed                        |Added
          Product|xorg                           |DRI
          Version|7.4                            |unspecified
Status Whiteboard|2011BRB_Reviewed               |
        Component|Driver/Radeon                  |DRM/Radeon
       AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
        QAContact|xorg-team at lists.x.org       |
[Bug 36003] [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)
https://bugs.freedesktop.org/show_bug.cgi?id=36003

Michel Dänzer changed:

             What|Removed                        |Added
          Product|xorg                           |DRI
          Version|7.6                            |unspecified
Status Whiteboard|2011BRB_Reviewed               |
        Component|Driver/Radeon                  |DRM/Radeon
       AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
        QAContact|xorg-team at lists.x.org       |
[Bug 38694] Server freezes with latest commit on 22/06/2011
https://bugs.freedesktop.org/show_bug.cgi?id=38694

Michel Dänzer changed:

       What|Removed                        |Added
    Product|xorg                           |DRI
    Version|git                            |unspecified
  Component|Driver/Radeon                  |DRM/Radeon
 AssignedTo|xorg-driver-ati at lists.x.org |dri-devel at lists.freedesktop.org
  QAContact|xorg-team at lists.x.org       |
[Bug 41668] Screen locks up at random points when using a 3D compositing wm (gnome-shell) on an rv515 (radeon mobility x1300)
https://bugs.freedesktop.org/show_bug.cgi?id=41668

--- Comment #10 from Michel Dänzer 2011-10-12 03:07:04 PDT ---
(In reply to comment #6)
> > > running glxgears just shows an empty black box.. all other glx demos are the
> > > same empty boxes..
> >
> > Do they work with the environment variable vblank_mode=0? If yes, does the
> > number for radeon increase in /proc/interrupts once the problem occurs?
>
> setting vblank_mode=0 works and displays an output.. but not much change in
> /proc/interrupts (irq 46 for radeon)

Not much change for the radeon number, or none at all? If the latter,
apparently the IRQ for the radeon card stops working for some reason, which
would explain the core symptoms of the freeze.
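The check asked for here, whether the radeon count in /proc/interrupts still increases, comes down to comparing two snapshots of one line. A minimal sketch follows, assuming the usual layout of an IRQ label ending in a colon, one counter per CPU, then the controller and device names. `sum_irq_counts()` is a hypothetical helper, and the sample line in the note below is made up rather than taken from the reporter's machine.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Sum the per-CPU counters on one /proc/interrupts line.
 * Parsing stops at the first token that is not a plain decimal number,
 * so controller names such as "PCI-MSI-edge" are never counted.
 * Returns 0 if the line has no "NN:" label. */
static unsigned long sum_irq_counts(const char *line)
{
    const char *p = strchr(line, ':');
    unsigned long total = 0;

    if (!p)
        return 0;
    p++;
    for (;;) {
        char *end;
        unsigned long n;

        while (*p == ' ' || *p == '\t')
            p++;
        if (!isdigit((unsigned char)*p))
            break;
        n = strtoul(p, &end, 10);
        /* Reject tokens that only start with digits, e.g. "2-edge". */
        if (*end != ' ' && *end != '\t' && *end != '\n' && *end != '\0')
            break;
        total += n;
        p = end;
    }
    return total;
}
```

Calling it on the radeon line before and after the freeze, for example on a line like " 46: 100 23 PCI-MSI-edge radeon", and getting the same total both times corresponds to the "none at all" case in the comment.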
[Bug 41579] R300 Segfaults when using mupen64plus
https://bugs.freedesktop.org/show_bug.cgi?id=41579

--- Comment #3 from Michel Dänzer 2011-10-12 02:59:21 PDT ---
(In reply to comment #2)
> Just tested current Git, and it seems to be working.

Great. If it's still broken with the current 7.11 branch, and you can isolate
the change that fixed it, maybe we can backport it.
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Tue, Oct 11, 2011 at 10:23 AM, Sumit Semwal sumit.sem...@ti.com wrote:
> This is the first step in defining a dma buffer sharing mechanism.
>
> A new buffer object dma_buf is added, with operations and API to allow easy
> sharing of this buffer object across devices.
>
> The framework allows:
> - a new buffer-object to be created with fixed size.
> - different devices to 'attach' themselves to this buffer, to facilitate
>   backing storage negotiation, using dma_buf_attach() API.
> - association of a file pointer with each user-buffer and associated
>   allocator-defined operations on that buffer. This operation is called the
>   'export' operation.
> - this exported buffer-object to be shared with the other entity by asking for
>   its 'file-descriptor (fd)', and sharing the fd across.
> - a received fd to get the buffer object back, where it can be accessed using
>   the associated exporter-defined operations.
> - the exporter and user to share the scatterlist using get_scatterlist and
>   put_scatterlist operations.
>
> Atleast one 'attach()' call is required to be made prior to calling the
> get_scatterlist() operation.
>
> Couple of building blocks in get_scatterlist() are added to ease introduction
> of sync'ing across exporter and users, and late allocation by the exporter.
>
> mmap() file operation is provided for the associated 'fd', as wrapper over the
> optional allocator defined mmap(), to be used by devices that might need one.

Why is this needed? it really doesn't make sense to be mmaping objects
independent of some front-end like drm or v4l.

how will you know what contents are in them, how will you synchronise
access. Unless someone has a hard use-case for this I'd say we drop it
until someone does.

Dave.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel
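The ordering rule in the quoted RFC description, at least one attach() before get_scatterlist(), can be modelled with a small userspace mock. Everything below is hypothetical: the `mock_` names, the struct layout, and the choice of -EINVAL are assumptions made for illustration and are not taken from the patch, which is not reproduced in this thread.

```c
#include <errno.h>
#include <stddef.h>

/* Userspace mock only; none of these names are kernel API. */
struct mock_dma_buf {
    size_t size;       /* fixed at creation, as in the description */
    int attachments;   /* devices that have attached so far */
    int sg_users;      /* outstanding get_scatterlist calls */
};

static void mock_buf_create(struct mock_dma_buf *buf, size_t size)
{
    buf->size = size;
    buf->attachments = 0;
    buf->sg_users = 0;
}

/* A device announces itself so backing storage can be negotiated. */
static void mock_buf_attach(struct mock_dma_buf *buf)
{
    buf->attachments++;
}

/* Refuse to hand out a scatterlist until some device has attached. */
static int mock_get_scatterlist(struct mock_dma_buf *buf)
{
    if (buf->attachments == 0)
        return -EINVAL;   /* error code chosen for the mock only */
    buf->sg_users++;
    return 0;
}

/* Every successful get must be balanced by exactly one put. */
static int mock_put_scatterlist(struct mock_dma_buf *buf)
{
    if (buf->sg_users == 0)
        return -EINVAL;
    buf->sg_users--;
    return 0;
}
```

The mock has no mmap path at all, which is the configuration argued for in the reply: sharing and device access work without any CPU mapping being part of the interface.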
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
> well, the mmap is actually implemented by the buffer allocator
> (v4l/drm).. although not sure if this was the point

Then why not use the correct interface? doing some sort of not-quite
generic interface isn't really helping anyone except adding an ABI
that we have to support.

If someone wants to bypass the current kernel APIs we should add a new
API for them not shove it into this generic buffer sharing layer.

> The intent was that this is for well defined formats.. ie. it would
> need to be a format that both v4l and drm understood in the first
> place for sharing to make sense at all..

How will you know the stride to take a simple example? The userspace
had to create this buffer somehow and wants to share it with
"something", you sound like
you really needs another API that is a simple accessor API that can
handle mmaps.

> Anyways, the basic reason is to handle random edge cases where you
> need sw access to the buffer. For example, you are decoding video and
> pull out a frame to generate a thumbnail w/ a sw jpeg encoder..

Again, doesn't sound like it should be part of this API, and also
sounds like the sw jpeg encoder will need more info about the buffer
anyways like stride and format.

> With this current scheme, synchronization could be handled in
> dmabufops->mmap() and vm_ops->close().. it is perhaps a bit heavy to
> require mmap/munmap for each sw access, but I suppose this isn't
> really for the high-performance use case. It is just so that some
> random bit of sw that gets passed a dmabuf handle without knowing who
> allocated it can have sw access if really needed.

So I think thats fine, write a sw accessor providers, don't go
overloading the buffer sharing code.

This API will limit what people can use this buffer sharing for with
pure hw accessors, you might say, oh buts its okay to fail the mmap
then, but the chances of sw handling that I'm not so sure off.

Dave.
Re: [PATCH 1/2] drm/radeon: allow pcie gen2 speed on NI
On Wed, Oct 12, 2011 at 2:25 PM, Ilija Hadzic ihad...@research.bell-labs.com wrote:
> Hi Dave,
>
> A few weeks ago I sent the two patches that allow PCI Express interface
> to run at Gen 2 speed on NI parts. Links to the patches in the mailing
> list archive + review from Alex quoted below:
>
> http://lists.freedesktop.org/archives/dri-devel/2011-September/014474.html
> http://lists.freedesktop.org/archives/dri-devel/2011-September/014475.html
>
> I saw some activity on drm-next and drm-core-next branches, but I have
> not seen these two patches merge yet. Just wondering if they are in the
> queue for merging or if they may have fell through the cracks?

/me misses patchwork a lot.

I've picked them up now.

Thanks,
Dave.
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
> But then we'd need a different set of accessors for every different drm/v4l/etc driver, wouldn't we?

Not any more different than you need for this, you just have a new interface that you request a sw object from, then mmap that object, and underneath it knows who owns it in the kernel.

mmap just feels wrong in this API, which is a buffer sharing API not a buffer mapping API.

> I guess if sharing a buffer between multiple drm devices, there is nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you pass when the buffer is allocated, then you don't have to support dmabuf->mmap(), and instead mmap via device and use some sort of DRM_CPU_PREP/FINI ioctls for synchronization..

Or we could make a generic CPU accessor that we don't have to worry about.

Dave.
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
>> But then we'd need a different set of accessors for every different drm/v4l/etc driver, wouldn't we?
>
> Not any more different than you need for this, you just have a new interface that you request a sw object from, then mmap that object, and underneath it knows who owns it in the kernel.

oh, ok, so you are talking about a kernel level interface, rather than userspace..

but I guess in this case I don't quite see the difference. It amounts to which fd you call mmap (or ioctl[*]) on.. If you use the dmabuf fd directly then you don't have to pass around a 2nd fd.

[*] there is nothing stopping defining some dmabuf ioctls (such as for synchronization).. although the thinking was to keep it simple for first version of dmabuf

BR,
-R

> mmap just feels wrong in this API, which is a buffer sharing API not a buffer mapping API.
>
>> I guess if sharing a buffer between multiple drm devices, there is nothing stopping you from having some NOT_DMABUF_MMAPABLE flag you pass when the buffer is allocated, then you don't have to support dmabuf->mmap(), and instead mmap via device and use some sort of DRM_CPU_PREP/FINI ioctls for synchronization..
>
> Or we could make a generic CPU accessor that we don't have to worry about.
>
> Dave.
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark robdcl...@gmail.com wrote:
> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
>>> But then we'd need a different set of accessors for every different drm/v4l/etc driver, wouldn't we?
>>
>> Not any more different than you need for this, you just have a new interface that you request a sw object from, then mmap that object, and underneath it knows who owns it in the kernel.
>
> oh, ok, so you are talking about a kernel level interface, rather than userspace..
>
> but I guess in this case I don't quite see the difference. It amounts to which fd you call mmap (or ioctl[*]) on.. If you use the dmabuf fd directly then you don't have to pass around a 2nd fd.
>
> [*] there is nothing stopping defining some dmabuf ioctls (such as for synchronization).. although the thinking was to keep it simple for first version of dmabuf

Yes, a separate kernel level interface.

Well I'd like to keep it even simpler. dmabuf is a buffer sharing API, shoehorning in a sw mapping API isn't making it simpler.

The problem I have with implementing mmap on the sharing fd is that nothing says this should be purely optional and userspace shouldn't rely on it.

In the Intel GEM space alone you have two types of mapping, one direct to shmem, one via GTT; the GTT could even be a linear view. The intel guys initially did GEM mmaps direct to the shmem pages because it seemed simple, up until they had to do step two, which was to do mmaps on the GTT copy, and ended up having two separate mmap methods. I think the problem here is it seems deceptively simple to add this to the API now because the API is simple, however I think in the future it'll become a burden that we'll have to work around.

Dave.
Re: [PATCH 19/21] drm/i915: Asynchronous eDP panel power off
> Using the same basic plan as the VDD force delayed power off, make turning the panel power off asynchronous.

NAK, tested on my 2540p, up to this patch in macbook-air branch stuff worked, after this I just get black screen on resume.

Dave.
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 03:34:54PM +0100, Dave Airlie wrote:
> On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark robdcl...@gmail.com wrote:
>> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
>>>> But then we'd need a different set of accessors for every different drm/v4l/etc driver, wouldn't we?
>>>
>>> Not any more different than you need for this, you just have a new interface that you request a sw object from, then mmap that object, and underneath it knows who owns it in the kernel.
>>
>> oh, ok, so you are talking about a kernel level interface, rather than userspace..
>>
>> but I guess in this case I don't quite see the difference. It amounts to which fd you call mmap (or ioctl[*]) on.. If you use the dmabuf fd directly then you don't have to pass around a 2nd fd.
>>
>> [*] there is nothing stopping defining some dmabuf ioctls (such as for synchronization).. although the thinking was to keep it simple for first version of dmabuf
>
> Yes a separate kernel level interface.
>
> Well I'd like to keep it even simpler. dmabuf is a buffer sharing API, shoehorning in a sw mapping API isn't making it simpler.
>
> The problem I have with implementing mmap on the sharing fd, is that nothing says this should be purely optional and userspace shouldn't rely on it.
>
> In the Intel GEM space alone you have two types of mapping, one direct to shmem one via GTT, the GTT could be even be a linear view. The intel guys initially did GEM mmaps direct to the shmem pages because it seemed simple, up until they had to do step two which was do mmaps on the GTT copy and ended up having two separate mmap methods. I think the problem here is it seems deceptively simple to add this to the API now because the API is simple, however I think in the future it'll become a burden that we'll have to workaround.

Yeah, that's my feeling, too. Adding mmap sounds like a neat, simple idea, that could simplify things for simple devices like v4l. But as soon as you're dealing with a real gpu, nothing is simple.
Those who don't believe this, just take a look at the data upload/download paths in the open-source i915, nouveau and radeon drivers. Making this fast (and for gpus, it needs to be fast) requires tons of tricks, special-cases and jumping through hoops.

You absolutely want the device-specific ioctls to do that. Adding a generic mmap just makes matters worse, especially if userspace expects this to work synchronized with everything else that is going on.

Cheers, Daniel
--
Daniel Vetter
Mail: dan...@ffwll.ch
Mobile: +41 (0)79 365 57 48
Re: [Linaro-mm-sig] [RFC 1/2] dma-buf: Introduce dma buffer sharing mechanism
On Wed, Oct 12, 2011 at 9:34 AM, Dave Airlie airl...@gmail.com wrote:
> On Wed, Oct 12, 2011 at 3:24 PM, Rob Clark robdcl...@gmail.com wrote:
>> On Wed, Oct 12, 2011 at 9:01 AM, Dave Airlie airl...@gmail.com wrote:
>>>> But then we'd need a different set of accessors for every different drm/v4l/etc driver, wouldn't we?
>>>
>>> Not any more different than you need for this, you just have a new interface that you request a sw object from, then mmap that object, and underneath it knows who owns it in the kernel.
>>
>> oh, ok, so you are talking about a kernel level interface, rather than userspace..
>>
>> but I guess in this case I don't quite see the difference. It amounts to which fd you call mmap (or ioctl[*]) on.. If you use the dmabuf fd directly then you don't have to pass around a 2nd fd.
>>
>> [*] there is nothing stopping defining some dmabuf ioctls (such as for synchronization).. although the thinking was to keep it simple for first version of dmabuf
>
> Yes a separate kernel level interface.

I'm not against it, but if it is a device-independent interface, it just seems like six of one, half-dozen of the other..

Ie. how does it differ if the dmabuf fd is the fd used for ioctl/mmap, vs if some other /dev/buffer-sharer file that you open?

But I think maybe I'm misunderstanding what you have in mind?

BR,
-R

> Well I'd like to keep it even simpler. dmabuf is a buffer sharing API, shoehorning in a sw mapping API isn't making it simpler.
>
> The problem I have with implementing mmap on the sharing fd, is that nothing says this should be purely optional and userspace shouldn't rely on it.
>
> In the Intel GEM space alone you have two types of mapping, one direct to shmem one via GTT, the GTT could be even be a linear view. The intel guys initially did GEM mmaps direct to the shmem pages because it seemed simple, up until they had to do step two which was do mmaps on the GTT copy and ended up having two separate mmap methods.
> I think the problem here is it seems deceptively simple to add this to the API now because the API is simple, however I think in the future it'll become a burden that we'll have to workaround.
>
> Dave.
Re: [PATCH 19/21] drm/i915: Asynchronous eDP panel power off
On Wed, 12 Oct 2011 15:41:11 +0100, Dave Airlie airl...@gmail.com wrote:
>> Using the same basic plan as the VDD force delayed power off, make turning the panel power off asynchronous.
>
> NAK, tested on my 2540p, up to this patch in macbook-air branch stuff worked, after this I just get black screen on resume.

Thanks for testing. I've created a new edp-training-fixes branch that removes the async panel power off and leaves the rest of the branch.

I'll test on a 2540p that I've got access to today, and on the MBA when I get home this evening.

--
keith.pack...@intel.com
[Bug 36003] [Radeon HD 5650 and 5470] Driver crash during recovery boot and in normal boot (Regression from 2.6.38-3 to -4)
https://bugs.freedesktop.org/show_bug.cgi?id=36003

Jeremy Huddleston jerem...@freedesktop.org changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |FIXED

--- Comment #20 from Jeremy Huddleston jerem...@freedesktop.org 2011-10-12 10:00:48 PDT ---
Thanks. Closing based on the above comment.
Re: Power profiles low and mid are identical on Radeon HD6470M
Am 11.10.2011 23:53, schrieb Alex Deucher: On Sat, Oct 8, 2011 at 2:25 PM, Wolfgang Fritzwolfgang.fr...@gmx.net wrote: Hello, I have an HP Elitebook 8560p with Radeon HD7470M graphics, running Debian sid with kernel 3.0.4. I noticed that the power profiles low and mid are setting identical clocks and voltage, the lowest possible values: default engine clock: 75 kHz current engine clock: 0 kHz default memory clock: 90 kHz current memory clock: 149970 kHz voltage: 900 mV Looking at the code, this seems to be intentional at least for the mobility chips, but the chip provides more modes: [9.361401] [drm] R600: Number of power states = 7 [9.361402] [drm] Is mobility = YES [9.361403] [drm] ps #0 type 0, modes=3 [9.361404] [drm] 0: mclk=9, sclk=75000, volt=1100, vddci=0 [9.361406] [drm] 1: mclk=9, sclk=75000, volt=1100, vddci=0 [9.361407] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0 [9.361409] [drm] ps #1 type 4, modes=3 [9.361410] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0 [9.361411] [drm] 1: mclk=9, sclk=4, volt=1000, vddci=0 [9.361413] [drm] 2: mclk=9, sclk=75000, volt=1100, vddci=0 [9.361414] [drm] ps #2 type 0, modes=3 [9.361415] [drm] 0: mclk=9, sclk=7, volt=1100, vddci=0 [9.361417] [drm] 1: mclk=9, sclk=7, volt=1100, vddci=0 [9.361418] [drm] 2: mclk=9, sclk=7, volt=1100, vddci=0 [9.361419] [drm] ps #3 type 2, modes=3 [9.361420] [drm] 0: mclk=15000, sclk=1, volt=900, vddci=0 [9.361422] [drm] 1: mclk=15000, sclk=1, volt=900, vddci=0 [9.361423] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0 [9.361424] [drm] ps #4 type 2, modes=3 [9.361426] [drm] 0: mclk=65000, sclk=4, volt=900, vddci=0 [9.361427] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0 [9.361428] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0 [9.361430] [drm] ps #5 type 2, modes=3 [9.361431] [drm] 0: mclk=3, sclk=3, volt=900, vddci=0 [9.361433] [drm] 1: mclk=3, sclk=3, volt=900, vddci=0 [9.361434] [drm] 2: mclk=3, sclk=3, volt=900, vddci=0 [9.361435] [drm] ps #6 type 0, modes=3 [9.361436] [drm] 0: 
mclk=65000, sclk=4, volt=900, vddci=0
[9.361438] [drm] 1: mclk=65000, sclk=4, volt=900, vddci=0
[9.361439] [drm] 2: mclk=65000, sclk=4, volt=900, vddci=0
[9.361440] [drm] NOT CHIP_R600
(dmesg output from patched radeon module)

Questions:

1. Is this a bug or a feature? (I see that it is not obvious which power state to choose)

It's the way it is. :-)

2. What do the 3 clock/voltage modes per power state mean?

On r6xx+, each power state defines an operating state (e.g., single head battery, multi-head battery, single head performance, multi-head performance, etc.). Within each operating state, there are high/mid/low clock modes that define that operating state. So if you have one head active and are on battery, the driver should switch between the high/mid/low clock modes defined in that power state based on the GPU load. If you enable multi-head and are still on battery, the driver would switch to the multi-head battery state and switch between the high/mid/low modes in that state.

OK. That's what I assumed after short code inspection. So, this is not cooperating well with the current dynamic clock interface in sysfs (at least as I understand it now).

I understand that there are the dynamic and the profile power methods. In dynamic, I see the clocks switching, probably using the 3 power states in the second operation state in the list above (maximum performance). This results in an average power consumption similar to the catalyst driver (the fan is off most of the time). But it is not usable because the screen flickers when the clock state is changed, and this happens quite frequently. Also it seems to be independent of battery/mains mode.

In the profile power mode, the clocks are at full speed with clock profiles default and high, and at lowest speed with profiles mid and low. The high profile keeps the fan running continuously.
This seems to be independent of mains or battery mode (I have to double check this).

Low and mid profiles are unusably slow with 3D effects enabled, but work quite well with effects disabled, so this would be a suitable profile on low battery.

With power profile auto, power state is high performance in mains mode and low in battery mode.

So, as long as true dynamic clocking is not working flicker-free, it would be nice to be able to change the clock modes manually to a value that keeps the fan quiet but is sufficient for ordinary work with effects enabled. I am currently running at 400/650 MHz @ 900mV with a patched driver.

Finally some questions:

Q1: Are all the power modes safe (maybe not optimal) to be used in all configurations (dual/single, battery/mains) or is it dangerous (meaning for the HW) using for example a dual
[PATCH 1/3] drm/radeon/kms/DCE4.1: fix dig encoder to transmitter mapping
From: Alex Deucher alexander.deuc...@amd.com

llano has fully routeable dig encoders similar to DCE3.2 while
ontario has a hardcoded mapping similar to DCE4.0.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   13 +++++++++----
 1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index 8a171b2..a90d9ee 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1756,10 +1756,15 @@ static int radeon_atom_pick_dig_encoder(struct drm_encoder *encoder)
 	if (ASIC_IS_DCE4(rdev)) {
 		dig = radeon_encoder->enc_priv;
 		if (ASIC_IS_DCE41(rdev)) {
-			if (dig->linkb)
-				return 1;
-			else
-				return 0;
+			/* ontario follows DCE4 */
+			if (rdev->family == CHIP_PALM) {
+				if (dig->linkb)
+					return 1;
+				else
+					return 0;
+			} else
+				/* llano follows DCE3.2 */
+				return radeon_crtc->crtc_id;
 		} else {
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
--
1.7.1.1
[PATCH 2/3] drm/radeon/kms/DCE4.1: ss is not supported on the internal pplls
From: Alex Deucher alexander.deuc...@amd.com

It's handled via external clock. It should already be
protected by the external ss flag, but add an explicit check
just in case.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/atombios_crtc.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/atombios_crtc.c b/drivers/gpu/drm/radeon/atombios_crtc.c
index c742944..a515b2a 100644
--- a/drivers/gpu/drm/radeon/atombios_crtc.c
+++ b/drivers/gpu/drm/radeon/atombios_crtc.c
@@ -466,7 +466,7 @@ static void atombios_crtc_program_ss(struct drm_crtc *crtc,
 			return;
 		}
 		args.v2.ucEnable = enable;
-		if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK))
+		if ((ss->percentage == 0) || (ss->type & ATOM_EXTERNAL_SS_MASK) || ASIC_IS_DCE41(rdev))
 			args.v2.ucEnable = ATOM_DISABLE;
 	} else if (ASIC_IS_DCE3(rdev)) {
 		args.v1.usSpreadSpectrumPercentage = cpu_to_le16(ss->percentage);
--
1.7.1.1
[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges
From: Alex Deucher alexander.deuc...@amd.com

Settings in this table reflect the physical panel/connector
rather than the internal dig encoding.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..bfe1662 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder *encoder)
 			break;
 		case 2:
 			args.v2.ucCRTC = radeon_crtc->crtc_id;
-			args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			if (radeon_encoder_is_dp_bridge(encoder)) {
+				struct drm_connector *connector = radeon_get_connector_for_encoder(encoder);
+
+				if (connector->connector_type == DRM_MODE_CONNECTOR_LVDS)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else if (connector->connector_type == DRM_MODE_CONNECTOR_VGA)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else
+					args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			} else
+				args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
--
1.7.1.1
[PATCH 3/3] drm/radeon/kms/DCE4.1: fix Select_CrtcSource EncodeMode setting for DP bridges (v2)
From: Alex Deucher alexander.deuc...@amd.com

Settings in this table reflect the physical panel/connector
rather than the internal dig encoding.

v2: fix typo for DRM_MODE_CONNECTOR_VGA case.

Signed-off-by: Alex Deucher alexander.deuc...@amd.com
---
 drivers/gpu/drm/radeon/radeon_encoders.c |   12 +++++++++++-
 1 files changed, 11 insertions(+), 1 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_encoders.c b/drivers/gpu/drm/radeon/radeon_encoders.c
index a90d9ee..eb3f6dc 100644
--- a/drivers/gpu/drm/radeon/radeon_encoders.c
+++ b/drivers/gpu/drm/radeon/radeon_encoders.c
@@ -1638,7 +1638,17 @@ atombios_set_encoder_crtc_source(struct drm_encoder *encoder)
 			break;
 		case 2:
 			args.v2.ucCRTC = radeon_crtc->crtc_id;
-			args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			if (radeon_encoder_is_dp_bridge(encoder)) {
+				struct drm_connector *connector = radeon_get_connector_for_encoder(encoder);
+
+				if (connector->connector_type == DRM_MODE_CONNECTOR_LVDS)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_LVDS;
+				else if (connector->connector_type == DRM_MODE_CONNECTOR_VGA)
+					args.v2.ucEncodeMode = ATOM_ENCODER_MODE_CRT;
+				else
+					args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
+			} else
+				args.v2.ucEncodeMode = atombios_get_encoder_mode(encoder);
 			switch (radeon_encoder->encoder_id) {
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY:
 			case ENCODER_OBJECT_ID_INTERNAL_UNIPHY1:
--
1.7.1.1
drm/radeon/kms: improve performance of blit-copy
The following set of patches will improve the performance of blit-copy functions for Radeon GPUs based on R600, R700, Evergreen and NI ASICs. The foundation for improvement is the use of tiled mode access (which for copying bo's can be used regardless of whether the content is tiled or not), and segmenting the memory block being copied into rectangles whose edge ratio is between 1:1 and 1:2. This maximizes the number of PCIe transactions that use maximum payload size (typically 128 bytes) and also creates a memory access pattern that is more favorable for both VRAM and host DRAM than what's currently in the kernel.

To come up with the new blit-copy code, I did a lot of PCIe traffic analysis with the bus analyzer and also had many discussions with Alex, trying to explain what's going on (thanks to Alex for his time).

Below (at the end of this note) are the results of some benchmarks that I did with various GPUs (all in the same host: Intel i7 CPU, X58 chipset, three DRAM channels). To run the tests on your machine, load the radeon module with the 'benchmark=1 pcie_gen2=1' parameters.

The most significant improvement is in the upstream (VRAM to GART) direction because that's where the PCIe transactions were fragmented and also where the memory access pattern was such that it created a lot of backpressure from the host.

It is also interesting that high-end devices (e.g. Cayman) exhibit the least improvement and were the worst to begin with. This is because high-end devices copy more tiles in parallel, which in turn can create bank conflicts on host memory and cause the host to do lots of bank-close/precharge/bank-open cycles.

As an added bonus, I also did some code cleanup and consolidated the repeated code into a common function, so r600 and evergreen/NI parts now share the blit-copy code. I also expanded the benchmark coverage, so the module now takes a benchmark parameter value between 1 and 8, and each value runs a different benchmark.
For details, see the commit log messages and the code.

I have been running with these patches for a few months (and I kept rebasing them to drm-core-next as the public git progressed) and I used them in a system setup that does *many* copying of this kind (and does them frequently); I have not seen instabilities introduced by these patches. I also verified the correctness of the copy using test=1 parameter for each GPU that I had and the test passed.

I would welcome some feedback and if you run the benchmarks with the new blit code, I would very much like to hear what kind of improvement you are seeing.

BENCHMARK RESULTS:
==================

1) VRAM to GTT
==============

Card (ASIC)      VRAM            Before   After
-----------------------------------------------
5570 (Redwood)   DDR3 1600MHz       454    3912
6450 (Caicos)    DDR5 3200MHz      3718    5090
6570 (Turks)     DDR3 1800MHz       484    4144
5450 (Cedar)     DDR3 1600MHz      3679    5090
5450 (Cedar)     DDR2 800MHz       2695    4639
E4690 (RV730)    DDR3 1400MHz       485    4969
E6760 (Turks)    DDR5 3200MHz       474    4177
V5700 (RV730)    DDR3 MHz           488    4297
2260 (RV620)     DDR2 MHz           494    3093
6870 (Barts)     DDR5 4200MHz       475    1113
6970 (Cayman)    DDR5 4200MHz       473     710

2) GTT to VRAM
==============

Card (ASIC)      VRAM            Before   After
-----------------------------------------------
5570 (Redwood)   DDR3 1600MHz      3158    3360
6450 (Caicos)    DDR5 3200MHz      2995    3393
6570 (Turks)     DDR3 1800MHz      3039    3339
5450 (Cedar)     DDR3 1600MHz      3246    3404
5450 (Cedar)     DDR2 800MHz       2614    3371
E4690 (RV730)    DDR3 1400MHz      3084    3426
E6760 (Turks)    DDR5 3200MHz      2443    2570
V5700 (RV730)    DDR3 MHz          3187    3506
2260 (RV620)     DDR2 MHz           584    3246
6870 (Barts)     DDR5 4200MHz      2472    2601
6970 (Cayman)    DDR5 4200MHz      2460    2737
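The rectangle segmentation described in this cover letter can be sketched in plain user-space C. This is only an illustration adapted from the r600_blit_create_rect() helper in patch 2/9, not the kernel code: the name blit_rect_for_pages, the unsigned types, and returning 0 for an empty request (instead of WARN_ON/BUG_ON) are assumptions, and the constants assume 4 KiB GPU pages, 4 bytes per pixel and the r6xx limit of 8192.

```c
/* Assumed constants: 4 KiB GPU page, 4 bytes per pixel, r6xx dimension limit. */
#define GPU_PAGE_SIZE 4096u
#define RECT_UNIT_H   32u
#define RECT_UNIT_W   (GPU_PAGE_SIZE / 4u / RECT_UNIT_H) /* 32 pixels: one page is a 32x32 unit */
#define MAX_RECT_DIM  8192u

/*
 * Hypothetical user-space version of the rectangle sizing.
 * Returns the number of pages that fit into one rectangle and stores the
 * rectangle's width and height (in pixels) through the two pointers.
 */
static unsigned blit_rect_for_pages(unsigned num_pages,
				    unsigned *width, unsigned *height)
{
	unsigned pages = num_pages;
	unsigned max_pages, w;
	unsigned h = RECT_UNIT_H;
	unsigned rect_order = 2;

	if (num_pages == 0) {
		*width = 0;
		*height = 0;
		return 0;
	}

	/* Double the height each time the page count grows fourfold. */
	while (num_pages / rect_order) {
		h *= 2;
		rect_order *= 4;
		if (h >= MAX_RECT_DIM) {
			h = MAX_RECT_DIM;
			break;
		}
	}

	/* Clamp to what fits when the width is at the hardware limit. */
	max_pages = (MAX_RECT_DIM * h) / (RECT_UNIT_W * RECT_UNIT_H);
	if (pages > max_pages)
		pages = max_pages;

	/* Width from the page count, rounded down to a whole unit width. */
	w = (pages * RECT_UNIT_W * RECT_UNIT_H) / h;
	w = (w / RECT_UNIT_W) * RECT_UNIT_W;

	/* Pages actually covered after rounding. */
	pages = (w * h) / (RECT_UNIT_W * RECT_UNIT_H);

	*width = w;
	*height = h;
	return pages;
}
```

With these assumed constants, a request for 8 pages gives a 64x128 rectangle covering all 8 pages, while a request for 7 pages gives a 96x64 rectangle covering 6 pages, so the remaining page would need a further rectangle.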
[PATCH 1/9] drm/radeon/kms: improve evergreen blit code
start with first-cut conceptual patch from Alex Deucher (commit info below);
turn on 1D tiling
make rectangular buffer always 2:1 or 1:2 ratio
make buffer dimensions an integer multiple of unit dimensions
make sure that integral number of pages map to the buffer
fix a few bugs that resulted in incorrect dimensions
tidy up a little bit to get rid of an ugly if/else
parametrize some magic constants
add protections from illegal buffer sizes etc.

From 77e6703c37f0ad8673b9ab285589d5c26782a515 Mon Sep 17 00:00:00 2001
From: Alex Deucher alexdeuc...@gmail.com
Date: Tue, 17 May 2011 05:08:58 -0400
Subject: [PATCH 1/2] drm/radeon/kms: simplify evergreen blit code

Convert 4k pages to multiples of 64x64x4 tiles. This is also more
efficient than a scanline based approach from the MC's perspective.

Signed-off-by: Alex Deucher alexdeuc...@gmail.com
Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com
---
 drivers/gpu/drm/radeon/evergreen.c          |    4 +-
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |  295 +++
 drivers/gpu/drm/radeon/radeon_asic.h        |    4 +-
 3 files changed, 123 insertions(+), 180 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 5df39bf..5f0ecc7 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3180,14 +3180,14 @@ int evergreen_copy_blit(struct radeon_device *rdev,
 
 	mutex_lock(&rdev->r600_blit.mutex);
 	rdev->r600_blit.vb_ib = NULL;
-	r = evergreen_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE);
+	r = evergreen_blit_prepare_copy(rdev, num_pages);
 	if (r) {
 		if (rdev->r600_blit.vb_ib)
 			radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 		mutex_unlock(&rdev->r600_blit.mutex);
 		return r;
 	}
-	evergreen_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE);
+	evergreen_kms_blit_copy(rdev, src_offset, dst_offset, num_pages);
 	evergreen_blit_done_copy(rdev, fence);
 	mutex_unlock(&rdev->r600_blit.mutex);
 	return 0;
diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 2eb2518..3b24137 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -44,6 +44,10 @@
 #define COLOR_5_6_5           0x8
 #define COLOR_8_8_8_8         0x1a
 
+#define RECT_UNIT_H           32
+#define RECT_UNIT_W           (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
+#define MAX_RECT_DIM          16384
+
 /* emits 17 */
 static void
 set_render_target(struct radeon_device *rdev, int format,
@@ -56,7 +60,7 @@ set_render_target(struct radeon_device *rdev, int format,
 	if (h < 8)
 		h = 8;
 
-	cb_color_info = ((format << 2) | (1 << 24) | (1 << 8));
+	cb_color_info = ((format << 2) | (1 << 24) | (2 << 8));
 	pitch = (w / 8) - 1;
 	slice = ((w * h) / 64) - 1;
 
@@ -67,7 +71,7 @@ set_render_target(struct radeon_device *rdev, int format,
 	radeon_ring_write(rdev, slice);
 	radeon_ring_write(rdev, 0);
 	radeon_ring_write(rdev, cb_color_info);
-	radeon_ring_write(rdev, (1 << 4));
+	radeon_ring_write(rdev, 0);
 	radeon_ring_write(rdev, (w - 1) | ((h - 1) << 16));
 	radeon_ring_write(rdev, 0);
 	radeon_ring_write(rdev, 0);
@@ -179,7 +183,7 @@ set_tex_resource(struct radeon_device *rdev,
 	sq_tex_resource_word0 = (1 << 0); /* 2D */
 	sq_tex_resource_word0 |= ((((pitch >> 3) - 1) << 6) |
 				  ((w - 1) << 18));
-	sq_tex_resource_word1 = ((h - 1) << 0) | (1 << 28);
+	sq_tex_resource_word1 = ((h - 1) << 0) | (2 << 28);
 	/* xyzw swizzles */
 	sq_tex_resource_word4 = (0 << 16) | (1 << 19) | (2 << 22) | (3 << 25);
 
@@ -751,30 +755,80 @@ static void evergreen_vb_ib_put(struct radeon_device *rdev)
 	radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 }
 
-int evergreen_blit_prepare_copy(struct radeon_device *rdev, int size_bytes)
+
+/* maps the rectangle to the buffer so that satisfies the following properties:
+ * - dimensions are less or equal to the hardware limit (MAX_RECT_DIM)
+ * - rectangle consists of integer number of pages
+ * - height is an integer multiple of RECT_UNIT_H
+ * - width is an integer multiple of RECT_UNIT_W
+ * - (the above three conditions also guarantee tile-aligned size)
+ * - it is as square as possible (sides ratio never greater than 2:1)
+ * - uses maximum number of pages that fit the above constraints
+ *
+ * input:  buffer size, pointers to width/height variables
+ * return: number of pages that were successfully mapped to the rectangle
+ *         width/height of the rectangle
+ */
+static unsigned evergreen_blit_create_rect(unsigned num_pages, int *width, int *height)
+{
+	unsigned max_pages;
+	unsigned pages = num_pages;
+	int w, h;
+
+	if (num_pages
[PATCH 2/9] drm/radeon/kms: improve r6xx blit code
start with first-cut conceptual patch from Alex Deucher (commit info below); turn on 1D tiling make rectangular buffer always 2:1 or 1:2 ratio make buffer dimenstions an integer multiple of unit dimensionsmake sures that integral number of pages map to the buffer fix a few bugs that resulted in incorrect dimensions tidy up a little bit to get rid of an ugly if/else parametrize some magic constants add protections from illegal buffer sizes etc. From 2cd7a267d6cbcdf414b7a724237aa24525c12b54 Mon Sep 17 00:00:00 2001 From: Alex Deucher alexdeuc...@gmail.com Date: Tue, 17 May 2011 05:09:43 -0400 Subject: [PATCH 2/2] drm/radeon/kms: simplify r6xx blit code Covert 4k pages to multiples of 64x64x4 tiles. This is also more efficient than a scanline based approach from the MC's perspective. Signed-off-by: Alex Deucher alexdeuc...@gmail.com Signed-off-by: Ilija Hadzic ihad...@research.bell-labs.com --- drivers/gpu/drm/radeon/r600.c |4 +- drivers/gpu/drm/radeon/r600_blit_kms.c | 276 drivers/gpu/drm/radeon/radeon_asic.h |4 +- 3 files changed, 109 insertions(+), 175 deletions(-) diff --git a/drivers/gpu/drm/radeon/r600.c b/drivers/gpu/drm/radeon/r600.c index 334aee6..9fc6844 100644 --- a/drivers/gpu/drm/radeon/r600.c +++ b/drivers/gpu/drm/radeon/r600.c @@ -2363,14 +2363,14 @@ int r600_copy_blit(struct radeon_device *rdev, mutex_lock(rdev-r600_blit.mutex); rdev-r600_blit.vb_ib = NULL; - r = r600_blit_prepare_copy(rdev, num_pages * RADEON_GPU_PAGE_SIZE); + r = r600_blit_prepare_copy(rdev, num_pages); if (r) { if (rdev-r600_blit.vb_ib) radeon_ib_free(rdev, rdev-r600_blit.vb_ib); mutex_unlock(rdev-r600_blit.mutex); return r; } - r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages * RADEON_GPU_PAGE_SIZE); + r600_kms_blit_copy(rdev, src_offset, dst_offset, num_pages); r600_blit_done_copy(rdev, fence); mutex_unlock(rdev-r600_blit.mutex); return 0; diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c index 9aa74c3..d9994c9 100644 --- 
a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -42,6 +42,10 @@
 #define COLOR_5_6_5           0x8
 #define COLOR_8_8_8_8         0x1a
 
+#define RECT_UNIT_H           32
+#define RECT_UNIT_W           (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
+#define MAX_RECT_DIM          8192
+
 /* emits 21 on rv770+, 23 on r600 */
 static void
 set_render_target(struct radeon_device *rdev, int format,
@@ -600,13 +604,59 @@ static void r600_vb_ib_put(struct radeon_device *rdev)
 	radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 }
 
-int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes)
+/* FIXME: the function is very similar to evergreen_blit_create_rect, except
+   that it different predefined constants; consider commonizing */
+static unsigned r600_blit_create_rect(unsigned num_pages, int *width, int *height)
+{
+	unsigned max_pages;
+	unsigned pages = num_pages;
+	int w, h;
+
+	if (num_pages == 0) {
+		/* not supposed to be called with no pages, but just in case */
+		h = 0;
+		w = 0;
+		pages = 0;
+		WARN_ON(1);
+	} else {
+		int rect_order = 2;
+		h = RECT_UNIT_H;
+		while (num_pages / rect_order) {
+			h *= 2;
+			rect_order *= 4;
+			if (h >= MAX_RECT_DIM) {
+				h = MAX_RECT_DIM;
+				break;
+			}
+		}
+		max_pages = (MAX_RECT_DIM * h) / (RECT_UNIT_W * RECT_UNIT_H);
+		if (pages > max_pages)
+			pages = max_pages;
+		w = (pages * RECT_UNIT_W * RECT_UNIT_H) / h;
+		w = (w / RECT_UNIT_W) * RECT_UNIT_W;
+		pages = (w * h) / (RECT_UNIT_W * RECT_UNIT_H);
+		BUG_ON(pages == 0);
+	}
+
+
+	DRM_DEBUG("blit_rectangle: h=%d, w=%d, pages=%d\n", h, w, pages);
+
+	/* return width and height only of the caller wants it */
+	if (height)
+		*height = h;
+	if (width)
+		*width = w;
+
+	return pages;
+}
+
+
+int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_pages)
 {
 	int r;
-	int ring_size, line_size;
-	int max_size;
+	int ring_size;
 	/* loops of emits 64 + fence emit possible */
-	int dwords_per_loop = 76, num_loops;
+	int dwords_per_loop = 76, num_loops = 0;
 
 	r = r600_vb_ib_get(rdev);
 	if (r)
@@ -616,18 +666,12 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, int size_bytes)
 	if (rdev->family > CHIP_R600 && rdev->family < CHIP_RV770)
 		dwords_per_loop += 2;
 
-	/* 8
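
For readers following the sizing logic in r600_blit_create_rect(), the arithmetic can be tried in userspace with a sketch like the one below. It assumes RADEON_GPU_PAGE_SIZE is 4096, so that one 32x32 unit of 4-byte pixels is exactly one GPU page. The names GPU_PAGE_SIZE, blit_create_rect() and count_blit_loops() are illustrative and not part of the patch, and the WARN_ON/BUG_ON/DRM_DEBUG calls are left out.

```c
#include <stddef.h>

#define GPU_PAGE_SIZE 4096	/* assumed value of RADEON_GPU_PAGE_SIZE */
#define RECT_UNIT_H   32
#define RECT_UNIT_W   (GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
#define MAX_RECT_DIM  8192

/* Pick a rectangle for up to num_pages pages; returns the pages it covers. */
static unsigned blit_create_rect(unsigned num_pages, int *width, int *height)
{
	unsigned max_pages;
	unsigned pages = num_pages;
	int w = 0, h = 0;

	if (num_pages != 0) {
		int rect_order = 2;

		/* double the height for every 4x growth in page count */
		h = RECT_UNIT_H;
		while (num_pages / rect_order) {
			h *= 2;
			rect_order *= 4;
			if (h >= MAX_RECT_DIM) {
				h = MAX_RECT_DIM;
				break;
			}
		}
		/* clamp to what fits in MAX_RECT_DIM columns */
		max_pages = (MAX_RECT_DIM * h) / (RECT_UNIT_W * RECT_UNIT_H);
		if (pages > max_pages)
			pages = max_pages;
		/* round the width down to a whole number of units */
		w = (pages * RECT_UNIT_W * RECT_UNIT_H) / h;
		w = (w / RECT_UNIT_W) * RECT_UNIT_W;
		pages = (w * h) / (RECT_UNIT_W * RECT_UNIT_H);
	}

	if (height)
		*height = h;
	if (width)
		*width = w;
	return pages;
}

/* Mirrors the "num loops" counting done in r600_blit_prepare_copy(). */
static int count_blit_loops(unsigned num_pages)
{
	int loops = 0;

	while (num_pages) {
		num_pages -= blit_create_rect(num_pages, NULL, NULL);
		loops++;
	}
	return loops;
}
```

Under these assumptions a 256-page (1 MiB) copy becomes a single 512x512 rectangle, while a 3-page copy needs two loops because the width is rounded down to a multiple of RECT_UNIT_W.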
[PATCH 3/9] drm/radeon/kms: demystify evergreen blit code
some bits in 3D registers used by blit functions look like magic
and this is hard to follow; change them to a little bit more
meaningful pre-defined constants

Signed-off-by: Ilija Hadzic <ihad...@research.bell-labs.com>
---
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |   29 +--
 drivers/gpu/drm/radeon/evergreend.h         |   42 +++
 2 files changed, 62 insertions(+), 9 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 3b24137..68d0de2 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -60,7 +60,9 @@ set_render_target(struct radeon_device *rdev, int format,
 	if (h < 8)
 		h = 8;
 
-	cb_color_info = ((format << 2) | (1 << 24) | (2 << 8));
+	cb_color_info = CB_FORMAT(format) |
+		CB_SOURCE_FORMAT(CB_SF_EXPORT_NORM) |
+		CB_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
 	pitch = (w / 8) - 1;
 	slice = ((w * h) / 64) - 1;
 
@@ -137,12 +139,16 @@ set_vtx_resource(struct radeon_device *rdev, u64 gpu_addr)
 	u32 sq_vtx_constant_word2, sq_vtx_constant_word3;
 
 	/* high addr, stride */
-	sq_vtx_constant_word2 = ((upper_32_bits(gpu_addr) & 0xff) | (16 << 8));
+	sq_vtx_constant_word2 = SQ_VTXC_BASE_ADDR_HI(upper_32_bits(gpu_addr) & 0xff) |
+		SQ_VTXC_STRIDE(16);
 #ifdef __BIG_ENDIAN
-	sq_vtx_constant_word2 |= (2 << 30);
+	sq_vtx_constant_word2 |= SQ_VTXC_ENDIAN_SWAP(SQ_ENDIAN_8IN32);
 #endif
 	/* xyzw swizzles */
-	sq_vtx_constant_word3 = (0 << 3) | (1 << 6) | (2 << 9) | (3 << 12);
+	sq_vtx_constant_word3 = SQ_VTCX_SEL_X(SQ_SEL_X) |
+		SQ_VTCX_SEL_Y(SQ_SEL_Y) |
+		SQ_VTCX_SEL_Z(SQ_SEL_Z) |
+		SQ_VTCX_SEL_W(SQ_SEL_W);
 
 	radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 8));
 	radeon_ring_write(rdev, 0x580);
@@ -153,7 +159,7 @@ set_vtx_resource(struct radeon_device *rdev, u64 gpu_addr)
 	radeon_ring_write(rdev, 0);
 	radeon_ring_write(rdev, 0);
 	radeon_ring_write(rdev, 0);
-	radeon_ring_write(rdev, SQ_TEX_VTX_VALID_BUFFER << 30);
+	radeon_ring_write(rdev, S__SQ_CONSTANT_TYPE(SQ_TEX_VTX_VALID_BUFFER));
 
 	if ((rdev->family == CHIP_CEDAR) ||
 	    (rdev->family == CHIP_PALM) ||
@@ -180,14 +186,19 @@ set_tex_resource(struct radeon_device *rdev,
 	if (h < 1)
 		h = 1;
 
-	sq_tex_resource_word0 = (1 << 0); /* 2D */
+	sq_tex_resource_word0 = TEX_DIM(SQ_TEX_DIM_2D);
 	sq_tex_resource_word0 |= ((((pitch >> 3) - 1) << 6) |
 				  ((w - 1) << 18));
-	sq_tex_resource_word1 = ((h - 1) << 0) | (2 << 28);
+	sq_tex_resource_word1 = ((h - 1) << 0) |
+				TEX_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
 	/* xyzw swizzles */
-	sq_tex_resource_word4 = (0 << 16) | (1 << 19) | (2 << 22) | (3 << 25);
+	sq_tex_resource_word4 = TEX_DST_SEL_X(SQ_SEL_X) |
+				TEX_DST_SEL_Y(SQ_SEL_Y) |
+				TEX_DST_SEL_Z(SQ_SEL_Z) |
+				TEX_DST_SEL_W(SQ_SEL_W);
 
-	sq_tex_resource_word7 = format | (SQ_TEX_VTX_VALID_TEXTURE << 30);
+	sq_tex_resource_word7 = format |
+		S__SQ_CONSTANT_TYPE(SQ_TEX_VTX_VALID_TEXTURE);
 
 	radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 8));
 	radeon_ring_write(rdev, 0);
diff --git a/drivers/gpu/drm/radeon/evergreend.h b/drivers/gpu/drm/radeon/evergreend.h
index 7363d9d..b937c49 100644
--- a/drivers/gpu/drm/radeon/evergreend.h
+++ b/drivers/gpu/drm/radeon/evergreend.h
@@ -941,11 +941,15 @@
 #define	CB_COLOR0_SLICE					0x28c68
 #define	CB_COLOR0_VIEW					0x28c6c
 #define	CB_COLOR0_INFO					0x28c70
+#	define CB_FORMAT(x)				((x) << 2)
 #	define CB_ARRAY_MODE(x)				((x) << 8)
 #	define ARRAY_LINEAR_GENERAL			0
 #	define ARRAY_LINEAR_ALIGNED			1
 #	define ARRAY_1D_TILED_THIN1			2
 #	define ARRAY_2D_TILED_THIN1			4
+#	define CB_SOURCE_FORMAT(x)			((x) << 24)
+#	define CB_SF_EXPORT_FULL			0
+#	define CB_SF_EXPORT_NORM			1
 #define	CB_COLOR0_ATTRIB				0x28c74
 #define	CB_COLOR0_DIM					0x28c78
 /* only CB0-7 blocks have these regs */
@@ -1107,15 +1111,53 @@
 #define	CB_COLOR7_CLEAR_WORD3				0x28e3c
 
 #define SQ_TEX_RESOURCE_WORD0_0				0x30000
+#	define TEX_DIM(x)				((x) << 0)
+#	define SQ_TEX_DIM_1D				0
+#	define SQ_TEX_DIM_2D
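
As a quick sanity check on this kind of cleanup, the named-field form of cb_color_info can be compared against the old magic-number form outside the kernel. The sketch below copies the CB_* field macros and COLOR_8_8_8_8 from the evergreen hunks above; the two helper functions are illustrative names only and do not exist in the driver.

```c
#include <stdint.h>

/* Field macros as added to evergreend.h by this patch. */
#define CB_FORMAT(x)		((x) << 2)
#define CB_ARRAY_MODE(x)	((x) << 8)
#define ARRAY_1D_TILED_THIN1	2
#define CB_SOURCE_FORMAT(x)	((x) << 24)
#define CB_SF_EXPORT_NORM	1

#define COLOR_8_8_8_8		0x1a

/* Old expression from set_render_target(), magic numbers and all. */
static uint32_t cb_color_info_magic(uint32_t format)
{
	return (format << 2) | (1 << 24) | (2 << 8);
}

/* New expression, built from the named fields. */
static uint32_t cb_color_info_named(uint32_t format)
{
	return CB_FORMAT(format) |
		CB_SOURCE_FORMAT(CB_SF_EXPORT_NORM) |
		CB_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
}
```

Both helpers should produce the same register value for any format, which is what makes the change a pure readability cleanup on evergreen.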
[PATCH 4/9] drm/radeon/kms: demystify r600 blit code
some 3d register bits look like magic in r600 blit functions
use predefined constants to make it more intuitive what they are

Signed-off-by: Ilija Hadzic <ihad...@research.bell-labs.com>
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   30 +-
 drivers/gpu/drm/radeon/r600d.h         |   22 ++
 2 files changed, 39 insertions(+), 13 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index d9994c9..71fec92 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -58,7 +58,9 @@ set_render_target(struct radeon_device *rdev, int format,
 	if (h < 8)
 		h = 8;
 
-	cb_color_info = ((format << 2) | (1 << 27) | (1 << 8));
+	cb_color_info = CB_FORMAT(format) |
+		CB_SOURCE_FORMAT(CB_SF_EXPORT_NORM) |
+		CB_ARRAY_MODE(ARRAY_1D_TILED_THIN1);
 	pitch = (w / 8) - 1;
 	slice = ((w * h) / 64) - 1;
 
@@ -168,9 +170,10 @@ set_vtx_resource(struct radeon_device *rdev, u64 gpu_addr)
 {
 	u32 sq_vtx_constant_word2;
 
-	sq_vtx_constant_word2 = ((upper_32_bits(gpu_addr) & 0xff) | (16 << 8));
+	sq_vtx_constant_word2 = SQ_VTXC_BASE_ADDR_HI(upper_32_bits(gpu_addr) & 0xff) |
+		SQ_VTXC_STRIDE(16);
 #ifdef __BIG_ENDIAN
-	sq_vtx_constant_word2 |= (2 << 30);
+	sq_vtx_constant_word2 |= SQ_VTXC_ENDIAN_SWAP(SQ_ENDIAN_8IN32);
 #endif
 
 	radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 7));
@@ -206,18 +209,19 @@ set_tex_resource(struct radeon_device *rdev,
 	if (h < 1)
 		h = 1;
 
-	sq_tex_resource_word0 = (1 << 0) | (1 << 3);
-	sq_tex_resource_word0 |= ((((pitch >> 3) - 1) << 8) |
-				  ((w - 1) << 19));
+	sq_tex_resource_word0 = S_038000_DIM(V_038000_SQ_TEX_DIM_2D) |
+		S_038000_TILE_MODE(V_038000_ARRAY_1D_TILED_THIN1);
+	sq_tex_resource_word0 |= S_038000_PITCH((pitch >> 3) - 1) |
+		S_038000_TEX_WIDTH(w - 1);
 
-	sq_tex_resource_word1 = (format << 26);
-	sq_tex_resource_word1 |= ((h - 1) << 0);
+	sq_tex_resource_word1 = S_038004_DATA_FORMAT(format);
+	sq_tex_resource_word1 |= S_038004_TEX_HEIGHT(h - 1);
 
-	sq_tex_resource_word4 = ((1 << 14) |
-				 (0 << 16) |
-				 (1 << 19) |
-				 (2 << 22) |
-				 (3 << 25));
+	sq_tex_resource_word4 = S_038010_REQUEST_SIZE(1) |
+		S_038010_DST_SEL_X(SQ_SEL_X) |
+		S_038010_DST_SEL_Y(SQ_SEL_Y) |
+		S_038010_DST_SEL_Z(SQ_SEL_Z) |
+		S_038010_DST_SEL_W(SQ_SEL_W);
 
 	radeon_ring_write(rdev, PACKET3(PACKET3_SET_RESOURCE, 7));
 	radeon_ring_write(rdev, 0);
diff --git a/drivers/gpu/drm/radeon/r600d.h b/drivers/gpu/drm/radeon/r600d.h
index 0245ae6..bfe1b5d 100644
--- a/drivers/gpu/drm/radeon/r600d.h
+++ b/drivers/gpu/drm/radeon/r600d.h
@@ -79,6 +79,11 @@
 #define	CB_COLOR0_SIZE					0x28060
 #define	CB_COLOR0_VIEW					0x28080
 #define	CB_COLOR0_INFO					0x280a0
+#	define CB_FORMAT(x)				((x) << 2)
+#	define CB_ARRAY_MODE(x)				((x) << 8)
+#	define CB_SOURCE_FORMAT(x)			((x) << 27)
+#	define CB_SF_EXPORT_FULL			0
+#	define CB_SF_EXPORT_NORM			1
 #define	CB_COLOR0_TILE					0x280c0
 #define	CB_COLOR0_FRAG					0x280e0
 #define	CB_COLOR0_MASK					0x28100
@@ -417,6 +422,17 @@
 #define	SQ_PGM_START_VS					0x28858
 #define	SQ_PGM_RESOURCES_VS				0x28868
 #define	SQ_PGM_CF_OFFSET_VS				0x288d0
+
+#define	SQ_VTX_CONSTANT_WORD0_0				0x30000
+#define	SQ_VTX_CONSTANT_WORD1_0				0x30004
+#define	SQ_VTX_CONSTANT_WORD2_0				0x30008
+#	define SQ_VTXC_BASE_ADDR_HI(x)			((x) << 0)
+#	define SQ_VTXC_STRIDE(x)			((x) << 8)
+#	define SQ_VTXC_ENDIAN_SWAP(x)			((x) << 30)
+#	define SQ_ENDIAN_NONE				0
+#	define SQ_ENDIAN_8IN16				1
+#	define SQ_ENDIAN_8IN32				2
+#define	SQ_VTX_CONSTANT_WORD3_0				0x3000c
 #define	SQ_VTX_CONSTANT_WORD6_0				0x38018
 #define	S__SQ_VTX_CONSTANT_TYPE(x)			(((x) & 3) << 30)
 #define	G__SQ_VTX_CONSTANT_TYPE(x)			(((x) >> 30) & 3)
@@ -1352,6 +1368,12 @@
 #define   S_038010_DST_SEL_W(x)				(((x) & 0x7) << 25)
 #define   G_038010_DST_SEL_W(x)
[PATCH 5/9] drm/radeon/kms: cleanup benchmark code
factor out repeated code into functions

fix units in which the throughput is reported (megabytes per
second and megabits per second make sense, others are kind of
confusing)

make report more amenable to awk and friends (e.g. whitespace
is always the separator, unit is separated from the number, etc)

add #defines for some hard coded constants

besides beautification this reorg is done in preparation for
writing more elaborate benchmarks

Signed-off-by: Ilija Hadzic <ihad...@research.bell-labs.com>
---
 drivers/gpu/drm/radeon/radeon_benchmark.c |  156 -
 1 files changed, 86 insertions(+), 70 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c
index 10191d9..6951426 100644
--- a/drivers/gpu/drm/radeon/radeon_benchmark.c
+++ b/drivers/gpu/drm/radeon/radeon_benchmark.c
@@ -26,21 +26,80 @@
 #include "radeon_reg.h"
 #include "radeon.h"
 
-void radeon_benchmark_move(struct radeon_device *rdev, unsigned bsize,
-			   unsigned sdomain, unsigned ddomain)
+#define RADEON_BENCHMARK_COPY_BLIT 1
+#define RADEON_BENCHMARK_COPY_DMA  0
+
+#define RADEON_BENCHMARK_ITERATIONS 1024
+
+static int radeon_benchmark_do_move(struct radeon_device *rdev, unsigned size,
+				    uint64_t saddr, uint64_t daddr,
+				    int flag, int n)
+{
+	unsigned long start_jiffies;
+	unsigned long end_jiffies;
+	struct radeon_fence *fence = NULL;
+	int i, r;
+
+	start_jiffies = jiffies;
+	for (i = 0; i < n; i++) {
+		r = radeon_fence_create(rdev, &fence);
+		if (r)
+			return r;
+
+		switch (flag) {
+		case RADEON_BENCHMARK_COPY_DMA:
+			r = radeon_copy_dma(rdev, saddr, daddr,
+					    size / RADEON_GPU_PAGE_SIZE,
+					    fence);
+			break;
+		case RADEON_BENCHMARK_COPY_BLIT:
+			r = radeon_copy_blit(rdev, saddr, daddr,
+					     size / RADEON_GPU_PAGE_SIZE,
+					     fence);
+			break;
+		default:
+			DRM_ERROR("Unknown copy method\n");
+			r = -EINVAL;
+		}
+		if (r)
+			goto exit_do_move;
+		r = radeon_fence_wait(fence, false);
+		if (r)
+			goto exit_do_move;
+		radeon_fence_unref(&fence);
+	}
+	end_jiffies = jiffies;
+	r = jiffies_to_msecs(end_jiffies - start_jiffies);
+
+exit_do_move:
+	if (fence)
+		radeon_fence_unref(&fence);
+	return r;
+}
+
+
+static void radeon_benchmark_log_results(int n, unsigned size,
+					 unsigned int time,
+					 unsigned sdomain, unsigned ddomain,
+					 char *kind)
+{
+	unsigned int throughput = (n * (size >> 10)) / time;
+	DRM_INFO("radeon: %s %u bo moves of %u kB from"
+		 " %d to %d in %u ms, throughput: %u Mb/s or %u MB/s\n",
+		 kind, n, size >> 10, sdomain, ddomain, time,
+		 throughput * 8, throughput);
+}
+
+static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
+				  unsigned sdomain, unsigned ddomain)
 {
 	struct radeon_bo *dobj = NULL;
 	struct radeon_bo *sobj = NULL;
-	struct radeon_fence *fence = NULL;
 	uint64_t saddr, daddr;
-	unsigned long start_jiffies;
-	unsigned long end_jiffies;
-	unsigned long time;
-	unsigned i, n, size;
-	int r;
+	int r, n;
+	unsigned int time;
 
-	size = bsize;
-	n = 1024;
+	n = RADEON_BENCHMARK_ITERATIONS;
 	r = radeon_bo_create(rdev, size, PAGE_SIZE, true, sdomain, &sobj);
 	if (r) {
 		goto out_cleanup;
@@ -68,64 +127,23 @@ void radeon_benchmark_move(struct radeon_device *rdev, unsigned bsize,
 
 	/* r100 doesn't have dma engine so skip the test */
 	if (rdev->asic->copy_dma) {
-
-		start_jiffies = jiffies;
-		for (i = 0; i < n; i++) {
-			r = radeon_fence_create(rdev, &fence);
-			if (r) {
-				goto out_cleanup;
-			}
-
-			r = radeon_copy_dma(rdev, saddr, daddr,
-					size / RADEON_GPU_PAGE_SIZE, fence);
-
-			if (r) {
-				goto out_cleanup;
-			}
-			r = radeon_fence_wait(fence, false);
-			if (r) {
-				goto out_cleanup;
-			}
-
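
The unit handling in radeon_benchmark_log_results() may be easier to see in isolation: n moves of (size >> 10) kB in time milliseconds gives kB/ms, which is reported as MB/s, and multiplying by 8 gives Mb/s. The following is a minimal userspace sketch of that arithmetic; struct bench_result and bench_throughput() are made-up names, and the zero-time guard is an addition of the sketch, not something the patch shows in this function.

```c
/* Hypothetical holder for the two figures the log line reports. */
struct bench_result {
	unsigned int mbytes_per_s;
	unsigned int mbits_per_s;
};

static struct bench_result bench_throughput(int n, unsigned size,
					    unsigned int time_ms)
{
	struct bench_result res = { 0, 0 };

	/* guard added for the sketch: avoid dividing by a zero interval */
	if (time_ms == 0)
		return res;

	/* kB moved divided by ms elapsed, reported as MB/s */
	res.mbytes_per_s = (n * (size >> 10)) / time_ms;
	res.mbits_per_s = res.mbytes_per_s * 8;
	return res;
}
```

For example, 1024 moves of a 1 MiB buffer in 500 ms would be reported as 2097 MB/s or 16776 Mb/s, with the integer division truncating any fractional part.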
[PATCH 6/9] drm/radeon/kms: add more elaborate benchmarks
Lots of new (and hopefully useful) benchmark. Load the driver
with radeon_benchmark=test_number and enjoy.

Among tests added are VRAM to VRAM blits and blits with buffer
size sweeps. The latter can be from GTT to VRAM, VRAM to GTT,
and VRAM to VRAM and there are two types of sweeps: powers of
two and (probably more interesting) buffers sizes that correspond
to common modes.

Signed-off-by: Ilija Hadzic <ihad...@research.bell-labs.com>
---
 drivers/gpu/drm/radeon/radeon.h           |    2 +-
 drivers/gpu/drm/radeon/radeon_benchmark.c |   91 +++--
 drivers/gpu/drm/radeon/radeon_device.c    |    2 +-
 3 files changed, 87 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/radeon/radeon.h b/drivers/gpu/drm/radeon/radeon.h
index ff5424e..5361dd7 100644
--- a/drivers/gpu/drm/radeon/radeon.h
+++ b/drivers/gpu/drm/radeon/radeon.h
@@ -868,7 +868,7 @@ struct radeon_pm {
 /*
  * Benchmarking
  */
-void radeon_benchmark(struct radeon_device *rdev);
+void radeon_benchmark(struct radeon_device *rdev, int test_number);
 
 
 /*
diff --git a/drivers/gpu/drm/radeon/radeon_benchmark.c b/drivers/gpu/drm/radeon/radeon_benchmark.c
index 6951426..5cafc90 100644
--- a/drivers/gpu/drm/radeon/radeon_benchmark.c
+++ b/drivers/gpu/drm/radeon/radeon_benchmark.c
@@ -30,6 +30,7 @@
 #define RADEON_BENCHMARK_COPY_DMA  0
 
 #define RADEON_BENCHMARK_ITERATIONS 1024
+#define RADEON_BENCHMARK_COMMON_MODES_N 17
 
 static int radeon_benchmark_do_move(struct radeon_device *rdev, unsigned size,
 				    uint64_t saddr, uint64_t daddr,
@@ -126,7 +127,9 @@ static void radeon_benchmark_move(struct radeon_device *rdev, unsigned size,
 	}
 
 	/* r100 doesn't have dma engine so skip the test */
-	if (rdev->asic->copy_dma) {
+	/* also, VRAM-to-VRAM test doesn't make much sense for DMA */
+	/* skip it as well if domains are the same */
+	if ((rdev->asic->copy_dma) && (sdomain != ddomain)) {
 		time = radeon_benchmark_do_move(rdev, size, saddr, daddr,
 						RADEON_BENCHMARK_COPY_DMA, n);
 		if (time < 0)
@@ -167,10 +170,86 @@ out_cleanup:
 	}
 }
 
-void radeon_benchmark(struct radeon_device *rdev)
+void radeon_benchmark(struct radeon_device *rdev, int test_number)
 {
-	radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_GTT,
-			      RADEON_GEM_DOMAIN_VRAM);
-	radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_VRAM,
-			      RADEON_GEM_DOMAIN_GTT);
+	int i;
+	int common_modes[RADEON_BENCHMARK_COMMON_MODES_N] = {
+		640 * 480 * 4,
+		720 * 480 * 4,
+		800 * 600 * 4,
+		848 * 480 * 4,
+		1024 * 768 * 4,
+		1152 * 768 * 4,
+		1280 * 720 * 4,
+		1280 * 800 * 4,
+		1280 * 854 * 4,
+		1280 * 960 * 4,
+		1280 * 1024 * 4,
+		1440 * 900 * 4,
+		1400 * 1050 * 4,
+		1680 * 1050 * 4,
+		1600 * 1200 * 4,
+		1920 * 1080 * 4,
+		1920 * 1200 * 4
+	};
+
+	switch (test_number) {
+	case 1:
+		/* simple test, VRAM to GTT and GTT to VRAM */
+		radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_GTT,
+				      RADEON_GEM_DOMAIN_VRAM);
+		radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_VRAM,
+				      RADEON_GEM_DOMAIN_GTT);
+		break;
+	case 2:
+		/* simple test, VRAM to VRAM */
+		radeon_benchmark_move(rdev, 1024*1024, RADEON_GEM_DOMAIN_VRAM,
+				      RADEON_GEM_DOMAIN_VRAM);
+		break;
+	case 3:
+		/* GTT to VRAM, buffer size sweep, powers of 2 */
+		for (i = 1; i <= 65536; i <<= 1)
+			radeon_benchmark_move(rdev, i*1024,
+					      RADEON_GEM_DOMAIN_GTT,
+					      RADEON_GEM_DOMAIN_VRAM);
+		break;
+	case 4:
+		/* VRAM to GTT, buffer size sweep, powers of 2 */
+		for (i = 1; i <= 65536; i <<= 1)
+			radeon_benchmark_move(rdev, i*1024,
+					      RADEON_GEM_DOMAIN_VRAM,
+					      RADEON_GEM_DOMAIN_GTT);
+		break;
+	case 5:
+		/* VRAM to VRAM, buffer size sweep, powers of 2 */
+		for (i = 1; i <= 65536; i <<= 1)
+			radeon_benchmark_move(rdev, i*1024,
+					      RADEON_GEM_DOMAIN_VRAM,
+					      RADEON_GEM_DOMAIN_VRAM);
+		break;
+	case 6:
+		/* GTT to VRAM, buffer size sweep, common modes */
+
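
To get a feel for what the powers-of-two sweep (tests 3 to 5) covers, the loop bounds can be replayed in userspace. This is only a sketch: sweep_pow2_sizes() is a made-up helper that records the buffer sizes instead of calling radeon_benchmark_move().

```c
/*
 * Record the buffer sizes (in bytes) visited by the powers-of-two sweep.
 * Returns how many sizes were written to out, never more than cap.
 */
static int sweep_pow2_sizes(unsigned *out, int cap)
{
	int i, count = 0;

	/* same bounds as the patch: 1 kB up to 65536 kB, doubling each step */
	for (i = 1; i <= 65536; i <<= 1) {
		if (count == cap)
			break;
		out[count++] = (unsigned)i * 1024;
	}
	return count;
}
```

With a large enough array this yields 17 sizes, from 1 kB to 64 MB, so each sweep test performs 17 benchmark runs per copy method.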
[PATCH 7/9] drm/radeon/kms: cleanup r600 blit code
reorganize the code such that only the primitives (i.e., the
functions that load the CP ring) are hardware specific;
dynamically link the primitives in a (new) pointer structure
inside r600_blit at blit initialization time so that the
functions that control the blit operations can be made
common for r600 and evergreen parts

Signed-off-by: Ilija Hadzic <ihad...@research.bell-labs.com>
---
 drivers/gpu/drm/radeon/r600_blit_kms.c |   94 +---
 drivers/gpu/drm/radeon/radeon.h        |   21 +++
 2 files changed, 70 insertions(+), 45 deletions(-)

diff --git a/drivers/gpu/drm/radeon/r600_blit_kms.c b/drivers/gpu/drm/radeon/r600_blit_kms.c
index 71fec92..07e3df4 100644
--- a/drivers/gpu/drm/radeon/r600_blit_kms.c
+++ b/drivers/gpu/drm/radeon/r600_blit_kms.c
@@ -44,7 +44,6 @@
 
 #define RECT_UNIT_H           32
 #define RECT_UNIT_W           (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
-#define MAX_RECT_DIM          8192
 
 /* emits 21 on rv770+, 23 on r600 */
 static void
@@ -491,6 +490,27 @@ int r600_blit_init(struct radeon_device *rdev)
 	u32 packet2s[16];
 	int num_packet2s = 0;
 
+	rdev->r600_blit.primitives.set_render_target = set_render_target;
+	rdev->r600_blit.primitives.cp_set_surface_sync = cp_set_surface_sync;
+	rdev->r600_blit.primitives.set_shaders = set_shaders;
+	rdev->r600_blit.primitives.set_vtx_resource = set_vtx_resource;
+	rdev->r600_blit.primitives.set_tex_resource = set_tex_resource;
+	rdev->r600_blit.primitives.set_scissors = set_scissors;
+	rdev->r600_blit.primitives.draw_auto = draw_auto;
+	rdev->r600_blit.primitives.set_default_state = set_default_state;
+
+	rdev->r600_blit.ring_size_common = 40; /* shaders + def state */
+	rdev->r600_blit.ring_size_common += 10; /* fence emit for VB IB */
+	rdev->r600_blit.ring_size_common += 5; /* done copy */
+	rdev->r600_blit.ring_size_common += 10; /* fence emit for done copy */
+
+	rdev->r600_blit.ring_size_per_loop = 76;
+	/* set_render_target emits 2 extra dwords on rv6xx */
+	if (rdev->family > CHIP_R600 && rdev->family < CHIP_RV770)
+		rdev->r600_blit.ring_size_per_loop += 2;
+
+	rdev->r600_blit.max_dim = 8192;
+
 	/* pin copy shader into vram if already initialized */
 	if (rdev->r600_blit.shader_obj)
 		goto done;
@@ -608,9 +628,8 @@ static void r600_vb_ib_put(struct radeon_device *rdev)
 	radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
 }
 
-/* FIXME: the function is very similar to evergreen_blit_create_rect, except
-   that it different predefined constants; consider commonizing */
-static unsigned r600_blit_create_rect(unsigned num_pages, int *width, int *height)
+static unsigned r600_blit_create_rect(unsigned num_pages,
+				      int *width, int *height, int max_dim)
 {
 	unsigned max_pages;
 	unsigned pages = num_pages;
@@ -628,12 +647,12 @@ static unsigned r600_blit_create_rect(unsigned num_pages, int *width, int *heigh
 		while (num_pages / rect_order) {
 			h *= 2;
 			rect_order *= 4;
-			if (h >= MAX_RECT_DIM) {
-				h = MAX_RECT_DIM;
+			if (h >= max_dim) {
+				h = max_dim;
 				break;
 			}
 		}
-		max_pages = (MAX_RECT_DIM * h) / (RECT_UNIT_W * RECT_UNIT_H);
+		max_pages = (max_dim * h) / (RECT_UNIT_W * RECT_UNIT_H);
 		if (pages > max_pages)
 			pages = max_pages;
 		w = (pages * RECT_UNIT_W * RECT_UNIT_H) / h;
@@ -659,36 +678,29 @@ int r600_blit_prepare_copy(struct radeon_device *rdev, unsigned num_pages)
 {
 	int r;
 	int ring_size;
-	/* loops of emits 64 + fence emit possible */
-	int dwords_per_loop = 76, num_loops = 0;
+	int num_loops = 0;
+	int dwords_per_loop = rdev->r600_blit.ring_size_per_loop;
 
 	r = r600_vb_ib_get(rdev);
 	if (r)
 		return r;
 
-	/* set_render_target emits 2 extra dwords on rv6xx */
-	if (rdev->family > CHIP_R600 && rdev->family < CHIP_RV770)
-		dwords_per_loop += 2;
-
 	/* num loops */
 	while (num_pages) {
-		num_pages -= r600_blit_create_rect(num_pages, NULL, NULL);
+		num_pages -= r600_blit_create_rect(num_pages, NULL, NULL,
+						   rdev->r600_blit.max_dim);
 		num_loops++;
 	}
 
 	/* calculate number of loops correctly */
 	ring_size = num_loops * dwords_per_loop;
-	/* set default  + shaders */
-	ring_size += 40; /* shaders + def state */
-	ring_size += 10; /* fence emit for VB IB */
-	ring_size += 5; /* done copy */
-	ring_size += 10; /* fence emit for done copy */
+	ring_size +=
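
The indirection described in the commit message can be pictured with a small stand-alone sketch: hardware-specific primitives are stored as function pointers at init time, and the common code only goes through those pointers and the per-device sizes. All names below (struct blit_dev, r600_like_blit_init(), blit_ring_size(), blit_copy()) are hypothetical, and only one primitive stands in for the eight that the patch registers. The numbers 40 + 10 + 5 + 10, 76 and 8192 are the r600 values from the hunk above; the ring-size formula is an assumption, since the diff is cut off right after "ring_size +=".

```c
struct blit_dev;

/* Hardware-specific hooks; the patch registers eight, one stands in here. */
struct blit_primitives {
	void (*draw_auto)(struct blit_dev *dev);
};

struct blit_dev {
	struct blit_primitives primitives;
	int ring_size_common;	/* dwords needed once per copy */
	int ring_size_per_loop;	/* dwords needed per rectangle */
	int max_dim;		/* largest rectangle dimension */
	int draws;		/* bookkeeping for this sketch only */
};

/* Stand-in for a hardware-specific primitive. */
static void r600_like_draw_auto(struct blit_dev *dev)
{
	dev->draws++;
}

/* Init-time linking, in the spirit of r600_blit_init(). */
static void r600_like_blit_init(struct blit_dev *dev)
{
	dev->primitives.draw_auto = r600_like_draw_auto;
	dev->ring_size_common = 40 + 10 + 5 + 10;
	dev->ring_size_per_loop = 76;
	dev->max_dim = 8192;
	dev->draws = 0;
}

/* Common code: assumed to be per-loop cost times loops plus the common part. */
static int blit_ring_size(const struct blit_dev *dev, int num_loops)
{
	return num_loops * dev->ring_size_per_loop + dev->ring_size_common;
}

/* Common code: drives the copy purely through the function pointers. */
static void blit_copy(struct blit_dev *dev, int num_loops)
{
	int i;

	for (i = 0; i < num_loops; i++)
		dev->primitives.draw_auto(dev);
}
```

A second init function for another chip family would fill in different pointers and sizes, and blit_ring_size() and blit_copy() would work unchanged, which is the point of the reorganization.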
[PATCH 8/9] drm/radeon/kms: blit code commoning
factor out most of evergreen blit code and use the refactored code
from r600 that is now common for both r600 and evergreen

Signed-off-by: Ilija Hadzic <ihad...@research.bell-labs.com>
---
 drivers/gpu/drm/radeon/evergreen.c          |   25 +---
 drivers/gpu/drm/radeon/evergreen_blit_kms.c |  260 ++-
 drivers/gpu/drm/radeon/ni.c                 |    4 +-
 drivers/gpu/drm/radeon/radeon_asic.c        |   16 +-
 drivers/gpu/drm/radeon/radeon_asic.h        |   10 -
 5 files changed, 30 insertions(+), 285 deletions(-)

diff --git a/drivers/gpu/drm/radeon/evergreen.c b/drivers/gpu/drm/radeon/evergreen.c
index 5f0ecc7..69dded2 100644
--- a/drivers/gpu/drm/radeon/evergreen.c
+++ b/drivers/gpu/drm/radeon/evergreen.c
@@ -3087,7 +3087,7 @@ static int evergreen_startup(struct radeon_device *rdev)
 
 	r = evergreen_blit_init(rdev);
 	if (r) {
-		evergreen_blit_fini(rdev);
+		r600_blit_fini(rdev);
 		rdev->asic->copy = NULL;
 		dev_warn(rdev->dev, "failed blitter (%d) falling back to memcpy\n", r);
 	}
@@ -3172,27 +3172,6 @@ int evergreen_suspend(struct radeon_device *rdev)
 	return 0;
 }
 
-int evergreen_copy_blit(struct radeon_device *rdev,
-			uint64_t src_offset, uint64_t dst_offset,
-			unsigned num_pages, struct radeon_fence *fence)
-{
-	int r;
-
-	mutex_lock(&rdev->r600_blit.mutex);
-	rdev->r600_blit.vb_ib = NULL;
-	r = evergreen_blit_prepare_copy(rdev, num_pages);
-	if (r) {
-		if (rdev->r600_blit.vb_ib)
-			radeon_ib_free(rdev, &rdev->r600_blit.vb_ib);
-		mutex_unlock(&rdev->r600_blit.mutex);
-		return r;
-	}
-	evergreen_kms_blit_copy(rdev, src_offset, dst_offset, num_pages);
-	evergreen_blit_done_copy(rdev, fence);
-	mutex_unlock(&rdev->r600_blit.mutex);
-	return 0;
-}
-
 /* Plan is to move initialization in that function and use
  * helper function so that radeon_device_init pretty much
  * do nothing more than calling asic specific function. This
@@ -3301,7 +3280,7 @@ int evergreen_init(struct radeon_device *rdev)
 
 void evergreen_fini(struct radeon_device *rdev)
 {
-	evergreen_blit_fini(rdev);
+	r600_blit_fini(rdev);
 	r700_cp_fini(rdev);
 	r600_irq_fini(rdev);
 	radeon_wb_fini(rdev);
diff --git a/drivers/gpu/drm/radeon/evergreen_blit_kms.c b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
index 68d0de2..dcf11bb 100644
--- a/drivers/gpu/drm/radeon/evergreen_blit_kms.c
+++ b/drivers/gpu/drm/radeon/evergreen_blit_kms.c
@@ -44,10 +44,6 @@
 #define COLOR_5_6_5           0x8
 #define COLOR_8_8_8_8         0x1a
 
-#define RECT_UNIT_H           32
-#define RECT_UNIT_W           (RADEON_GPU_PAGE_SIZE / 4 / RECT_UNIT_H)
-#define MAX_RECT_DIM          16384
-
 /* emits 17 */
 static void
 set_render_target(struct radeon_device *rdev, int format,
@@ -599,31 +595,6 @@ set_default_state(struct radeon_device *rdev)
 
 }
 
-static inline uint32_t i2f(uint32_t input)
-{
-	u32 result, i, exponent, fraction;
-
-	if ((input & 0x3fff) == 0)
-		result = 0; /* 0 is a special case */
-	else {
-		exponent = 140; /* exponent biased by 127; */
-		fraction = (input & 0x3fff) << 10; /* cheat and only
-						      handle numbers below 2^^15 */
-		for (i = 0; i < 14; i++) {
-			if (fraction & 0x800000)
-				break;
-			else {
-				fraction = fraction << 1; /* keep
-							     shifting left until top bit = 1 */
-				exponent = exponent - 1;
-			}
-		}
-		result = exponent << 23 | (fraction & 0x7fffff); /* mask
-								    off top bit; assumed 1 */
-	}
-	return result;
-}
-
 int evergreen_blit_init(struct radeon_device *rdev)
 {
 	u32 obj_size;
@@ -632,6 +603,24 @@ int evergreen_blit_init(struct radeon_device *rdev)
 	u32 packet2s[16];
 	int num_packet2s = 0;
 
+	rdev->r600_blit.primitives.set_render_target = set_render_target;
+	rdev->r600_blit.primitives.cp_set_surface_sync = cp_set_surface_sync;
+	rdev->r600_blit.primitives.set_shaders = set_shaders;
+	rdev->r600_blit.primitives.set_vtx_resource = set_vtx_resource;
+	rdev->r600_blit.primitives.set_tex_resource = set_tex_resource;
+	rdev->r600_blit.primitives.set_scissors = set_scissors;
+	rdev->r600_blit.primitives.draw_auto = draw_auto;
+	rdev->r600_blit.primitives.set_default_state = set_default_state;
+
+	rdev->r600_blit.ring_size_common = 55; /* shaders + def state */
+	rdev->r600_blit.ring_size_common += 10; /* fence emit for VB IB */
+
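
The i2f() helper that this patch removes from the evergreen file converts a small integer into the bit pattern of an IEEE-754 single-precision float without using the FPU. For anyone who wants to experiment with it outside the kernel, a userspace sketch follows. The masks 0x800000 (implicit leading one at bit 23) and 0x7fffff (23-bit mantissa) are taken to be the intended constants, which is an assumption on our part based on the single-precision layout; i2f_as_float() is a made-up helper for checking the result.

```c
#include <stdint.h>
#include <string.h>

/* Integer (low 14 bits only) to IEEE-754 single-precision bit pattern. */
static uint32_t i2f(uint32_t input)
{
	uint32_t result, i, exponent, fraction;

	if ((input & 0x3fff) == 0) {
		result = 0;	/* 0 is a special case */
	} else {
		exponent = 140;	/* 127 bias + 13, for a value with bit 13 set */
		fraction = (input & 0x3fff) << 10;	/* bit 13 lands on bit 23 */
		for (i = 0; i < 14; i++) {
			if (fraction & 0x800000)
				break;
			/* keep shifting left until the top bit is 1 */
			fraction <<= 1;
			exponent -= 1;
		}
		/* mask off the top bit; it is the implicit leading 1 */
		result = exponent << 23 | (fraction & 0x7fffff);
	}
	return result;
}

/* Reinterpret the bit pattern as a float so it can be compared directly. */
static float i2f_as_float(uint32_t input)
{
	uint32_t bits = i2f(input);
	float f;

	memcpy(&f, &bits, sizeof(f));
	return f;
}
```

Because only the low 14 bits are considered, the helper is exact for values below 16384, which matches the rectangle dimensions used by the blit code.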
Re: [RFC 2/2] dma-buf: Documentation for buffer sharing framework
Hi Randy,

On Thu, Oct 13, 2011 at 4:00 AM, Randy Dunlap <rdun...@xenotime.net> wrote:
> On 10/11/2011 02:23 AM, Sumit Semwal wrote:
>> Add documentation for dma buffer sharing framework, explaining the
>> various operations, members and API of the dma buffer sharing
>> framework.
>>
>> Signed-off-by: Sumit Semwal <sumit.sem...@linaro.org>
>> Signed-off-by: Sumit Semwal <sumit.sem...@ti.com>
>> ---
>>  Documentation/dma-buf-sharing.txt |  210 +
<snip>
>> +     if the new buffer-user has stricter 'backing-storage constraints', and the
>> +     exporter can handle these constraints, the exporter can just stall on the
>> +     get_scatterlist till all outstanding access is completed (as signalled by
>
>                        until
Thanks for your review; I will update all these in the next version.
>
>> +     put_scatterlist).
>> +     Once all ongoing access is completed, the exporter could potentially move
>> +     the buffer to the stricter backing-storage, and then allow further
>> +     {get,put}_scatterlist operations from any buffer-user from the migrated
>> +     backing-storage.
>> +
>> +   If the exporter cannot fulfill the backing-storage constraints of the new
>> +   buffer-user device as requested, dma_buf_attach() would return an error to
>> +   denote non-compatibility of the new buffer-sharing request with the current
>> +   buffer.
>> +
>> +   If the exporter chooses not to allow an attach() operation once a
>> +   get_scatterlist has been called, it simply returns an error.
>> +
>> +- mmap file operation
>> +   An mmap() file operation is provided for the fd associated with the buffer.
>> +   If the exporter defines an mmap operation, the mmap() fop calls this to allow
>> +   mmap for devices that might need it; if not, it returns an error.
>> +
>> +References:
>> +[1] struct dma_buf_ops in include/linux/dma-buf.h
>> +[2] All interfaces mentioned above defined in include/linux/dma-buf.h
>
> --
> ~Randy
> *** Remember to use Documentation/SubmitChecklist when testing your code ***

Best regards,
~Sumit.
___
dri-devel mailing list
dri-devel@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/dri-devel