Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
On Thu, Feb 6, 2014 at 2:40 AM, Roland Scheidegger srol...@vmware.com wrote:
> I don't think that would work. The reason for this stuff to exist is
> because new hw makes that possible on the hw level directly.

I don't think this has anything to do with new hardware. This stuff has always been possible, and it's a shame it wasn't exposed by DX9 and GL 1.5 or even earlier versions.

Marek
_______________________________________________
mesa-dev mailing list
mesa-dev@lists.freedesktop.org
http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
On 06.02.2014 12:42, Marek Olšák wrote:
> On Thu, Feb 6, 2014 at 2:40 AM, Roland Scheidegger srol...@vmware.com wrote:
>> I don't think that would work. The reason for this stuff to exist is
>> because new hw makes that possible on the hw level directly.
>
> I don't think this has anything to do with new hardware. This stuff has
> always been possible, and it's a shame it wasn't exposed by DX9 and
> GL 1.5 or even earlier versions.

Yes, you are quite right. You could potentially use cached memory with new APUs, though that is probably not useful for streaming vertex data... might not be useful outside of compute.

Roland
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Hello Marek,

Nice to hear the extension is being tackled! It took me a while to get Mesa building again, and I did a quick test with your patches; unfortunately they outright crash Dolphin at the moment. I'm not quite sure whether you have sent an updated patch series yet, so I used the one you sent on Jan 29, applied on current git mesa (5c975966) from today. Most importantly, our buffer streaming code uses persistent (and coherent) mapping in the buffer_storage code path, so maybe that's an issue? If you like, I can do some additional debugging, but I'm not very familiar with Mesa debugging, so I'd need some help. For what it's worth, I'm usually hanging around in #dri-devel under the nick neobrain.

Regards,
Tony

On 29.01.2014 01:49, Marek Olšák wrote:
> On Wed, Jan 29, 2014 at 1:42 AM, Ian Romanick i...@freedesktop.org wrote:
>> On 01/28/2014 05:35 PM, Marek Olšák wrote:
>>> Yes, GL_ARB_buffer_storage is being worked on. We'll support it on
>>> all Radeon cards, R300 and up.
>>
>> Are you guys working on that? Have an ETA? :)
>
> It's done. I'm writing piglit tests at the moment. I'll send my patches
> tomorrow.
>
> Marek
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
I honestly hope that GL_AMD_pinned_memory doesn't become popular.

It would have been all right if it weren't for this bit in http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:

    2) Can the application still use the buffer using the CPU address?

    RESOLVED: YES. However, this access would be completely
    non-synchronized to the OpenGL pipeline, unless explicit
    synchronization is being used (for example, through glFinish or by
    using sync objects).

And I'm imagining apps which are streaming vertex data doing precisely just that...

This means that, in order to obtain traces of applications that use AMD_pinned_memory like that with Apitrace, we'll need to use heuristics to determine when applications touch the memory behind the scenes and emit fake memcpys -- which means slow tracing and/or bloated trace files... just like user memory pointer arrays... :(

Instead of that ugliness, maybe Apitrace should just mask out GL_AMD_pinned_memory support, so that I don't have to worry about it, and let apps and OpenGL drivers support/use it at their own peril.

Jose

----- Original Message -----
> Yes, GL_ARB_buffer_storage is being worked on. We'll support it on all
> Radeon cards, R300 and up.
>
> Anyway, GL_STREAM_DRAW should give you the same behavior as
> GL_CLIENT_STORAGE_BIT on open source Radeon drivers.
>
> Marek
>
> On Sun, Nov 24, 2013 at 1:19 PM, Tony Wasserka neobra...@googlemail.com wrote:
>> Hello everyone,
>>
>> I was told on IRC that my question would get the most attention around
>> here -- so bear with me if this is the wrong place to ask. I'm one of
>> the developers of the GC/Wii emulator Dolphin. We recently rewrote our
>> OpenGL renderer to use modern OpenGL 3 features; however, one thing we
>> stumbled upon is the lack of efficient (vertex/index) buffer data
>> streaming mechanisms in OpenGL.
>>
>> Basically, most of our vertex data is used once and never again after
>> that (we have to do this for accurate emulation), so all vertex data
>> gets streamed into one huge ring buffer (and analogously for index
>> data, which uses its own huge ring buffer). For buffer streaming, we
>> have multiple code paths using a combination of glMapBufferRange,
>> glBufferSubData, fences and buffer orphaning, yet none of these come
>> anywhere close to the performance of (legacy) rendering from a vertex
>> array stored in RAM.
>>
>> There are two OpenGL extensions which greatly help us in this
>> situation: AMD's pinned memory [1], and buffer storage [2] in GL 4.4.
>> We currently have no buffer storage code path, but usage of pinned
>> memory gave us a speedup of up to 60% under heavy workloads when
>> working with AMD's Catalyst driver under Windows. We expect the same
>> speedup when using buffer storage (specifically, we need
>> CLIENT_STORAGE_BIT, if I recall correctly).
>>
>> So the natural question that arises is: is either of these two
>> extensions going to be supported in Mesa anytime soon, or is it of
>> lower priority than other extensions? Also, is the pinned memory
>> extension AMD hardware specific, or would it be possible to support it
>> for other hardware, too? I'm not sure if buffer storage (being a GL
>> 4.4 extension, and I read that it might actually depend on some other
>> GL 4.3 extension) is possible to implement on older hardware, yet it
>> would be very useful for us to have efficient streaming methods for
>> old GPUs, too.
>>
>> I hope this mail doesn't sound too commanding or anything; it's just
>> supposed to be a friendly question about improving the emulator
>> experience for our user base. Thanks in advance!
>>
>> Best regards,
>> Tony
>>
>> [1] http://www.opengl.org/registry/specs/AMD/pinned_memory.txt
>> [2] http://www.opengl.org/registry/specs/ARB/buffer_storage.txt
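[Editor's note: the ring-buffer streaming Tony describes can be sketched in plain C. This is a minimal sketch with hypothetical names; the GL mapping/fencing calls are deliberately elided so that the allocation arithmetic stands on its own, and alignment is assumed to be a power of two.]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring-buffer allocator for streamed vertex/index data:
 * each draw's data is appended at an aligned offset; when a request
 * would run past the end of the buffer, the allocator wraps back to
 * offset 0. A real renderer would, at the wrap point, either wait on a
 * fence guarding the old data or orphan the buffer. */
typedef struct {
    size_t size;   /* total buffer size in bytes */
    size_t head;   /* next free offset */
} stream_ring;

/* Returns the offset to write at, advancing the head.
 * '*wrapped' is set to 1 when the allocation restarted at offset 0. */
static size_t ring_alloc(stream_ring *r, size_t bytes, size_t align,
                         int *wrapped)
{
    size_t off = (r->head + align - 1) & ~(align - 1); /* align up */
    assert(bytes <= r->size);
    *wrapped = 0;
    if (off + bytes > r->size) {   /* no room left: wrap around */
        off = 0;
        *wrapped = 1;
    }
    r->head = off + bytes;
    return off;
}
```

In the pinned-memory or persistent-mapping paths, the returned offset would index directly into the permanently mapped storage; in a glMapBufferRange path it would become the range passed to the map call.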
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
On 05.02.2014 18:08, Jose Fonseca wrote:
> I honestly hope that GL_AMD_pinned_memory doesn't become popular. It
> would have been all right if it weren't for this bit in
> http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:
>
>     2) Can the application still use the buffer using the CPU address?
>
>     RESOLVED: YES. However, this access would be completely
>     non-synchronized to the OpenGL pipeline, unless explicit
>     synchronization is being used (for example, through glFinish or by
>     using sync objects).
>
> And I'm imagining apps which are streaming vertex data doing precisely
> just that...

I don't understand your concern; this is exactly the same behavior GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that properly. How does apitrace handle it?

Grigori
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
My understanding is that this is like having MAP_UNSYNCHRONIZED on at all times, even when it isn't mapped, because it is always mapped (into memory). Is that correct, Jose?

Patrick

On Wed, Feb 5, 2014 at 11:53 AM, Grigori Goronzy g...@chown.ath.cx wrote:
> On 05.02.2014 18:08, Jose Fonseca wrote:
>> I honestly hope that GL_AMD_pinned_memory doesn't become popular. [...]
>
> I don't understand your concern; this is exactly the same behavior
> GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that
> properly. How does apitrace handle it?
>
> Grigori
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
----- Original Message -----
> On 05.02.2014 18:08, Jose Fonseca wrote:
>> I honestly hope that GL_AMD_pinned_memory doesn't become popular. It
>> would have been all right if it weren't for this bit in
>> http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says:
>>
>>     2) Can the application still use the buffer using the CPU address?
>>
>>     RESOLVED: YES. However, this access would be completely
>>     non-synchronized to the OpenGL pipeline, unless explicit
>>     synchronization is being used (for example, through glFinish or by
>>     using sync objects).
>>
>> And I'm imagining apps which are streaming vertex data doing precisely
>> just that...
>
> I don't understand your concern; this is exactly the same behavior
> GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that
> properly. How does apitrace handle it?

GL_AMD_pinned_memory is nothing like GL_ARB_map_buffer_range's GL_MAP_UNSYNCHRONIZED_BIT:

- When an app touches memory returned by glMapBufferRange(GL_MAP_UNSYNCHRONIZED_BIT), it communicates back to the OpenGL driver which bytes it actually touched via glFlushMappedBufferRange (unless the app doesn't care about performance and doesn't call glFlushMappedBufferRange at all, which is silly, as it forces the OpenGL driver to assume the whole range changed). In this case, the OpenGL driver (and hence apitrace) gets all the information it needs about which bytes were updated between glMap/glUnmap.

- When an app touches memory bound via GL_AMD_pinned_memory outside glMap/glUnmap, there are _no_ hints whatsoever. The OpenGL driver might not care, as the memory is shared between CPU and GPU, so all is good as far as it is concerned; but all the changes the app makes are invisible at the API level, hence apitrace will not be able to catch them unless it uses onerous heuristics.

So both extensions allow unsynchronized access, but the lack of synchronization is not my concern. My concern is that GL_AMD_pinned_memory allows *hidden* access to GPU memory.

Jose
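[Editor's note: the contrast Jose draws can be made concrete. With explicit flushes, a tracer only has to record the ranges the app announces. A minimal sketch (hypothetical names) of the bookkeeping an interception of glFlushMappedBufferRange could feed:]

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical dirty-range accumulator: a tracer records each explicit
 * flush and, at unmap time, needs only the union of the flushed ranges
 * instead of re-serializing the whole mapping. */
typedef struct {
    size_t lo, hi;   /* [lo, hi): bytes known to be dirty */
    int    any;      /* 1 once at least one flush was seen */
} dirty_span;

/* Called once per intercepted glFlushMappedBufferRange(offset, length). */
static void note_flush(dirty_span *d, size_t offset, size_t length)
{
    if (!d->any) {
        d->lo  = offset;
        d->hi  = offset + length;
        d->any = 1;
        return;
    }
    if (offset < d->lo)
        d->lo = offset;
    if (offset + length > d->hi)
        d->hi = offset + length;
}
```

With GL_AMD_pinned_memory there is no flush call to intercept, so a hook like this never fires -- which is exactly the problem Jose raises.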
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Yes, precisely.

Jose

----- Original Message -----
> My understanding is that this is like having MAP_UNSYNCHRONIZED on at
> all times, even when it isn't mapped, because it is always mapped (into
> memory). Is that correct, Jose?
>
> Patrick
> [...]
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
----- Original Message -----
> GL_AMD_pinned_memory is nothing like GL_ARB_map_buffer_range's
> GL_MAP_UNSYNCHRONIZED_BIT: [...]
>
> So both extensions allow unsynchronized access, but the lack of
> synchronization is not my concern. My concern is that
> GL_AMD_pinned_memory allows *hidden* access to GPU memory.

Just for the record, the challenges GL_AMD_pinned_memory presents to Apitrace are much like those of the old-fashioned OpenGL user array pointers: an app is free to change the contents of memory pointed to by user array pointers at any point in time, except during a draw call. This means that before every draw call, Apitrace needs to scavenge all the user memory pointers and write their contents to the trace file, just in case the app changed them.

In order to support GL_AMD_pinned_memory, for every draw call Apitrace would also need to walk over all bound GL_AMD_pinned_memory (and nowadays there are loads of binding points!), check whether the data changed, and serialize it into the trace file if it did...

I never cared much about the performance of Apitrace with user array pointers: it is an old paradigm; only old apps use it, or programmers who don't particularly care about performance -- either way, a performance-conscious app developer would use VBOs and hence never hit the problem at all. My displeasure with GL_AMD_pinned_memory is that it essentially flips everything on its head -- it encourages a paradigm which apitrace will never be able to handle properly.

People often complain that OpenGL development tools are poor compared with Direct3D's. An important fact they often miss is that the Direct3D API is orders of magnitude more tool-friendly: it's clear that the Direct3D API cares about things like allowing all state to be queried back, whereas OpenGL is more fire-and-forget. The main concern in OpenGL is ensuring that state can go from app to driver fast, but little thought is given to ensuring that one can read the whole state back, or that one can intercept all state as it goes between the app and the driver...

In this particular case, if the answer to "Can the application still use the buffer using the CPU address?" were NO, the world would be a much better place.

Jose
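[Editor's note: the "onerous heuristics" mentioned above would amount to roughly the following shadow-copy diff. This is a hedged sketch; diff_region and its callers are hypothetical, not Apitrace code. Before each draw, the tracer compares each pinned region against a private shadow copy and serializes only the span that changed.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical change detector for one pinned region: compares the live
 * memory against a shadow copy, reports the smallest [first, last)
 * changed span, and refreshes the shadow. Returns 1 if anything changed
 * since the previous draw, 0 otherwise. */
static int diff_region(const unsigned char *live, unsigned char *shadow,
                       size_t size, size_t *first, size_t *last)
{
    size_t lo = 0, hi = size;

    while (lo < size && live[lo] == shadow[lo])
        lo++;                        /* scan for first differing byte */
    if (lo == size)
        return 0;                    /* nothing changed */
    while (hi > lo && live[hi - 1] == shadow[hi - 1])
        hi--;                        /* scan back for last differing byte */

    memcpy(shadow + lo, live + lo, hi - lo);  /* bring shadow up to date */
    *first = lo;
    *last  = hi;
    return 1;
}
```

Running something like this over every bound pinned buffer before every draw is what makes the tracing cost proportional to total buffer size rather than to the data actually modified.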
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
However, GL_ARB_buffer_storage (OpenGL 4.4) with GL_MAP_PERSISTENT_BIT isn't much different. The only difference I see between ARB_buffer_storage and AMD_pinned_memory is that AMD_pinned_memory allows mapping CPU memory into the GPU address space permanently, while ARB_buffer_storage allows mapping GPU memory into the CPU address space permanently. At the end of the day, both the GPU and the CPU can read and modify the same buffer, and all they need to use for synchronization is fences.

Marek

On Wed, Feb 5, 2014 at 8:10 PM, Jose Fonseca jfons...@vmware.com wrote:
> GL_AMD_pinned_memory is nothing like GL_ARB_map_buffer_range's
> GL_MAP_UNSYNCHRONIZED_BIT: [...]
>
> So both extensions allow unsynchronized access, but the lack of
> synchronization is not my concern. My concern is that
> GL_AMD_pinned_memory allows *hidden* access to GPU memory.
>
> Jose
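[Editor's note: a hedged sketch of the fence scheme Marek describes. The persistently (or pinned-)mapped buffer is split into per-frame regions; a fence would be inserted after each frame's uploads (glFenceSync) and waited on before the region is reused (glClientWaitSync). The GL sync objects are modeled here by a simple pending flag so the rotation logic is self-contained and testable; all names are hypothetical.]

```c
#include <assert.h>

#define NUM_REGIONS 3   /* classic triple buffering */

/* Simulated fence-guarded region rotation over a persistent mapping. */
typedef struct {
    int frame;                  /* frames submitted so far */
    int pending[NUM_REGIONS];   /* 1 = GPU may still be reading region */
    int waits;                  /* how many times the CPU had to block */
} region_ring;

/* Pick the region for this frame's uploads; block on its fence first
 * if the GPU may still be consuming it (glClientWaitSync in real GL). */
static int begin_frame(region_ring *r)
{
    int i = r->frame % NUM_REGIONS;
    if (r->pending[i]) {
        r->waits++;             /* stand-in for glClientWaitSync() */
        r->pending[i] = 0;
    }
    return i;
}

/* After submitting the frame's draws, guard the region with a fence
 * (glFenceSync in real GL) so it is not overwritten prematurely. */
static void end_frame(region_ring *r)
{
    r->pending[r->frame % NUM_REGIONS] = 1;
    r->frame++;
}
```

With enough regions (or a ring large enough for a few frames of data), the wait in begin_frame almost never actually blocks, which is what makes this cheaper than orphaning or synchronized maps.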
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
I hadn't looked at GL_ARB_buffer_storage. I need to read it more closely, but at a glance it looks like GL_MAP_PERSISTENT_BIT alone is okay (the app must call FlushMappedBufferRange to guarantee coherence), but if GL_MAP_COHERENT_BIT is set we are indeed facing the same issue... :-(

Even worse, with this being part of GL 4.4 and there being no way for the implementation to fail GL_MAP_COHERENT_BIT mappings, there is no way to avoid supporting it...

Jose

Note to self: my time would be better spent reviewing extensions before they are ratified than ranting after the fact...

----- Original Message -----
> However, GL_ARB_buffer_storage (OpenGL 4.4) with GL_MAP_PERSISTENT_BIT
> isn't much different. The only difference I see between
> ARB_buffer_storage and AMD_pinned_memory is that AMD_pinned_memory
> allows mapping CPU memory into the GPU address space permanently, while
> ARB_buffer_storage allows mapping GPU memory into the CPU address space
> permanently. At the end of the day, both the GPU and the CPU can read
> and modify the same buffer, and all they need to use for
> synchronization is fences.
>
> Marek
> [...]
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
The synchronization for non-coherent persistent mappings can also be done using:

    glMemoryBarrier(GL_CLIENT_MAPPED_BUFFER_BARRIER_BIT);

in which case you don't know the range either. However, I fully support the addition of coherent persistent mappings to GL. It's perfect for uploading data without the GL API overhead.

Marek

On Thu, Feb 6, 2014 at 12:49 AM, Jose Fonseca jfons...@vmware.com wrote:
> I hadn't looked at GL_ARB_buffer_storage. I need to read it more
> closely, but at a glance it looks like GL_MAP_PERSISTENT_BIT alone is
> okay (the app must call FlushMappedBufferRange to guarantee coherence),
> but if GL_MAP_COHERENT_BIT is set we are indeed facing the same
> issue... :-(
>
> Even worse, with this being part of GL 4.4 and there being no way for
> the implementation to fail GL_MAP_COHERENT_BIT mappings, there is no
> way to avoid supporting it...
>
> Jose
> [...]
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
On 02/05/2014 11:10 AM, Jose Fonseca wrote: - Original Message - - Original Message - On 05.02.2014 18:08, Jose Fonseca wrote: I honestly hope that GL_AMD_pinned_memory doesn't become popular. It would have been alright if it wasn't for this bit in http://www.opengl.org/registry/specs/AMD/pinned_memory.txt which says: 2) Can the application still use the buffer using the CPU address? RESOLVED: YES. However, this access would be completely non synchronized to the OpenGL pipeline, unless explicit synchronization is being used (for example, through glFinish or by using sync objects). And I'm imagining apps which are streaming vertex data doing precisely just that... I don't understand your concern; this is exactly the same behavior GL_MAP_UNSYNCHRONIZED_BIT has, and apps are supposedly using that properly. How does apitrace handle it? GL_AMD_pinned_memory is nothing like GL_ARB_map_buffer_range's GL_MAP_UNSYNCHRONIZED_BIT: - When an app touches memory returned by glMapBufferRange(GL_MAP_UNSYNCHRONIZED_BIT), it communicates back to the OpenGL driver which bytes it actually touched via glFlushMappedBufferRange (unless the app doesn't care about performance and doesn't call glFlushMappedBufferRange at all, which is silly, as it forces the OpenGL driver to assume the whole range changed). In this case, the OpenGL driver (hence apitrace) gets all the information it needs about which bytes were updated between glMap/glUnmap. - When an app touches memory bound via GL_AMD_pinned_memory outside glMap/glUnmap, there are _no_ hints whatsoever.
The OpenGL driver might not care, as the memory is shared between CPU and GPU, so all is good as far as it is concerned, but all the changes the app makes are invisible at the API level, hence apitrace will not be able to catch them unless it uses onerous heuristics. So both extensions allow unsynchronized access, but lack of synchronization is not my concern. My concern is that GL_AMD_pinned_memory allows *hidden* access to GPU memory. Just for the record, the challenges GL_AMD_pinned_memory presents to Apitrace are very similar to those of old-fashioned OpenGL user array pointers: an app is free to change the contents of memory pointed to by user array pointers at any point in time, except during a draw call. This means that before every draw call, Apitrace needs to scavenge all the user memory pointers and write their contents to the trace file, just in case the app changed them. In order to support GL_AMD_pinned_memory, for every draw call Apitrace would also need to walk over all bound GL_AMD_pinned_memory ranges (and nowadays there are loads of binding points!), check whether the data changed, and serialize it into the trace file if it did... I never cared much about the performance of Apitrace with user array pointers: it is an old paradigm; only old apps use it, or programmers who don't particularly care about performance -- either way, a performance-conscious app developer would use VBOs and hence never hit the problem at all. My displeasure with GL_AMD_pinned_memory is that it essentially flips everything on its head -- it encourages a paradigm which apitrace will never be able to handle properly. People often complain that OpenGL development tools are poor compared with Direct3D's.
An important fact they often miss is that the Direct3D API is several orders of magnitude more tool-friendly: it's clear that the Direct3D API cares about things like allowing all state to be queried back, whereas OpenGL is more fire-and-forget and never look back -- the main concern in OpenGL is ensuring that state can go from app to driver fast, but little thought is given to ensuring that the whole state can be read back, or that all state can be intercepted as it goes between the app and the driver... In this particular case, if the answer to "Can the application still use the buffer using the CPU address?" had been NO, the world would be a much better place. I suspect the reason they didn't do that is that it would imply a very expensive validation step at draw time. There are a whole bunch of technologies in newer GL implementations that will make tracing a miserable prospect. :( Jose ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
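[Editorial note] Jose's distinction can be made concrete: with an explicit-flush mapping, the only bytes a tracer must record are the ranges the app reports through glFlushMappedBufferRange. Below is a minimal sketch of that bookkeeping in plain C, with no GL calls; `Range` and `merge_range` are illustrative names, not apitrace internals:

```c
#include <assert.h>
#include <stddef.h>

/* One flushed byte range, as an app would report it via
 * glFlushMappedBufferRange(target, offset, length). */
typedef struct { size_t offset, length; } Range;

/* Merge a newly flushed range into a sorted, non-overlapping list and
 * return the new element count.  A tracer keeping such a list knows
 * exactly which bytes of an explicit-flush mapping were touched. */
static size_t merge_range(Range *r, size_t n, Range add)
{
    size_t i = 0, j;
    /* Skip ranges that end strictly before the new range starts. */
    while (i < n && r[i].offset + r[i].length < add.offset)
        i++;
    j = i;
    /* Absorb every existing range that overlaps or touches `add`. */
    while (j < n && r[j].offset <= add.offset + add.length) {
        size_t end = r[j].offset + r[j].length;
        if (r[j].offset < add.offset) {
            add.length += add.offset - r[j].offset;
            add.offset = r[j].offset;
        }
        if (end > add.offset + add.length)
            add.length = end - add.offset;
        j++;
    }
    if (j == i) {            /* disjoint: shift the tail right */
        for (size_t k = n; k > i; k--)
            r[k] = r[k - 1];
        r[i] = add;
        return n + 1;
    }
    /* Absorbed j-i ranges: close the gap they leave behind. */
    for (size_t k = j; k < n; k++)
        r[i + 1 + (k - j)] = r[k];
    r[i] = add;
    return n - (j - i) + 1;
}
```

A tracer keeping such a sorted list per mapping can serialize exactly the touched bytes at glUnmapBuffer time -- precisely the information that pinned or coherent mappings never deliver.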
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Am 06.02.2014 00:49, schrieb Jose Fonseca: I hadn't looked at GL_ARB_buffer_storage. I need to read it more closely, but at a glance it looks like GL_MAP_PERSISTENT_BIT alone is okay (the app must call FlushMappedBufferRange to guarantee coherence), but if GL_MAP_COHERENT_BIT is set we are indeed facing the same issue... :-( Even worse, with it being part of GL 4.4 and there being no way for the implementation to fail GL_MAP_COHERENT_BIT mappings, there is no way to avoid supporting it... Jose Note to self: my time would be better spent reviewing extensions before they are ratified than ranting after the fact... I don't think that would work. The reason for this stuff to exist is that new hw makes this possible at the hw level directly. Some APUs might even be able to share such buffers in LLC (I don't know if Haswell can do that, and AMD APUs lack a common cache level, but they can actually do fully coherent memory access from the CPU and GPU side). Now with discrete chips it's not that easy, but everybody is doing unified memory these days. I don't know how to solve this for tracing, though; it indeed seems impossible... Roland - Original Message - However, GL_ARB_buffer_storage (OpenGL 4.4) with GL_MAP_PERSISTENT_BIT isn't much different. The only difference I see between ARB_buffer_storage and AMD_pinned_memory is that AMD_pinned_memory allows mapping CPU memory into the GPU address space permanently, while ARB_buffer_storage allows mapping GPU memory into the CPU address space permanently. At the end of the day, both the GPU and the CPU can read and modify the same buffer, and all they need to use for synchronization is fences. Marek On Wed, Feb 5, 2014 at 8:10 PM, Jose Fonseca jfons...@vmware.com wrote: [earlier quoted message trimmed]
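[Editorial note] Marek's point that fences are the only synchronization left boils down to the now-standard pattern: split a persistently mapped buffer into a few per-frame sections, fence each section after submitting its draws, and wait on that fence before reusing the section. A GL-free model of the bookkeeping follows; glFenceSync/glClientWaitSync are stood in for by plain frame numbers, and all names are illustrative:

```c
#include <assert.h>

#define SECTIONS 3  /* triple-buffer the persistently mapped ring */

typedef struct {
    long fence[SECTIONS];   /* frame number fenced per section, -1 = never used */
    int  cur;               /* section the CPU is writing this frame */
} FrameRing;

/* Begin a frame: before writing section `cur`, the app must wait until
 * the GPU has passed that section's fence (glClientWaitSync in real
 * code; modeled here by comparing against the last completed frame).
 * Returns nonzero when the section is safe to overwrite. */
static int ring_begin_frame(const FrameRing *r, long gpu_completed_frame)
{
    return r->fence[r->cur] < 0 || r->fence[r->cur] <= gpu_completed_frame;
}

/* End a frame: fence the section just written (glFenceSync in real
 * code) and advance to the next section. */
static void ring_end_frame(FrameRing *r, long current_frame)
{
    r->fence[r->cur] = current_frame;
    r->cur = (r->cur + 1) % SECTIONS;
}
```

With GL_MAP_COHERENT_BIT the CPU writes into section `cur` are visible to the GPU directly; with GL_MAP_PERSISTENT_BIT alone, an explicit glFlushMappedBufferRange over the written section is additionally needed before the draw.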
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Yes, GL_ARB_buffer_storage is being worked on. We'll support it on all Radeon cards, R300 and up. Anyway, GL_STREAM_DRAW should give you the same behavior as GL_CLIENT_STORAGE_BIT on the open source Radeon drivers. Marek On Sun, Nov 24, 2013 at 1:19 PM, Tony Wasserka neobra...@googlemail.com wrote: [quoted original request trimmed; see the full message at the end of the thread] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
On 01/28/2014 05:35 PM, Marek Olšák wrote: Yes, GL_ARB_buffer_storage is being worked on. We'll support it on all Radeon cards R300 and up. Are you guys working on that? Have an ETA? :) Anyway, GL_STREAM_DRAW should give you the same behavior as GL_CLIENT_STORAGE_BIT on open source Radeon drivers. I think a big piece of functionality that Tony wants is the ability to have CPU pointers that persist for the lifetime of the context. Without GL_ARB_buffer_storage or GL_AMD_pinned_memory, the application has to MapBuffer and UnmapBuffer around draw calls. Marek On Sun, Nov 24, 2013 at 1:19 PM, Tony Wasserka neobra...@googlemail.com wrote: [quoted original request trimmed; see the full message at the end of the thread] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
On Wed, Jan 29, 2014 at 1:42 AM, Ian Romanick i...@freedesktop.org wrote: On 01/28/2014 05:35 PM, Marek Olšák wrote: Yes, GL_ARB_buffer_storage is being worked on. We'll support it on all Radeon cards R300 and up. Are you guys working on that? Have an ETA? :) It's done. I'm writing piglit tests at the moment. I'll send my patches tomorrow. Marek ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Hey Matt, The speedup was only observed on discrete GPUs so far; I have no data about APUs. Best regards, Tony Am 26.11.2013 04:50, schrieb Matt Harvey: Hi Tony, I guess the lack of response means that neither of those extensions is on anyone's road map right now. I have a quick question: were you seeing those speedups only on the AMD APUs, or also on the discrete cards? Thanks, Matt On Sun, Nov 24, 2013 at 7:19 AM, Tony Wasserka neobra...@googlemail.com wrote: [quoted original request trimmed; see the full message at the end of the thread] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
Re: [Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Hi Tim, I've given your suggestion some thought - and while it looks like something which would work, my time schedule is currently too tight (regardless of money involved) to implement support for the extension. I might think about it again in three months, if the extension has not been implemented by someone else by then. Thanks for bringing this up though; it definitely would've been an option I hadn't thought of, if only I had any time to spend on it :) Best regards, Tony Am 25.11.2013 22:45, schrieb Timothy Arceri: Hi Tony, I'm not one of the main Mesa devs, just an independent developer who works on Mesa in my spare time. All I can suggest is that you have a go at implementing the features yourself. You obviously have a lot of talent and I'm sure you would be able to accomplish the task. If time is an issue, there is a lot of interest in the Linux community in improving Mesa, and I myself have run two successful crowd funding campaigns to support some full time work on Mesa. See: http://www.indiegogo.com/projects/improve-opengl-support-for-the-linux-graphics-drivers-mesa/x/2053460 Maybe you could do something similar. If you do decide to do this, I find it's useful to start working on the extension (showing work on github etc.) before running the campaign, as people like to be sure you can accomplish what you are promising. Anyway, this is just an option for you to think about. Tim On Sunday, 24 November 2013 11:57 PM, Tony Wasserka neobra...@googlemail.com wrote: [quoted original request trimmed; see the full message at the end of the thread] ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
[Mesa-dev] Request for support of GL_AMD_pinned_memory and GL_ARB_buffer_storage extensions
Hello everyone, I was told on IRC that my question would get the most attention around here - so bear with me if this is the wrong place to ask. I'm one of the developers of the GC/Wii emulator Dolphin. We recently rewrote our OpenGL renderer to use modern OpenGL 3 features; however, one thing we stumbled upon is the lack of efficient (vertex/index) buffer data streaming mechanisms in OpenGL. Basically, most of our vertex data is used once and never again after that (we have to do this for accurate emulation) - so all vertex data gets streamed into one huge ring buffer (and analogously for index data, which uses its own huge ring buffer). For buffer streaming, we have multiple code paths using a combination of glMapBufferRange, glBufferSubData, fences and buffer orphaning, yet none of these comes anywhere close to the performance of (legacy) rendering from a vertex array stored in RAM. There are two OpenGL extensions which greatly help us in this situation: AMD's pinned memory [1], and buffer storage [2] in GL 4.4. We currently have no buffer storage code path, but usage of pinned memory gave us a speedup of up to 60% under heavy workloads when working with AMD's Catalyst driver under Windows. We expect the same speedup when using buffer storage (specifically we need CLIENT_STORAGE_BIT, if I recall correctly). So the natural question that arises is: is either of these two extensions going to be supported in Mesa anytime soon, or is it of lower priority than other extensions? Also, is the pinned memory extension AMD hardware specific, or would it be possible to support it for other hardware, too?
I'm not sure if buffer storage (being a GL 4.4 extension, and I read that it might actually depend on some other GL 4.3 extension) is possible to implement on older hardware, yet it would be very useful for us to have efficient streaming methods for old GPUs, too. I hope this mail doesn't sound too commanding or anything; it's just supposed to be a friendly question about improving the emulator experience for our user base. Thanks in advance! Best regards, Tony [1] http://www.opengl.org/registry/specs/AMD/pinned_memory.txt [2] http://www.opengl.org/registry/specs/ARB/buffer_storage.txt ___ mesa-dev mailing list mesa-dev@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/mesa-dev
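[Editorial note] The streaming scheme Tony describes -- once-used vertex data appended to one huge ring buffer -- reduces to an offset allocator that aligns each request and wraps at the end of the buffer. A minimal GL-free sketch (names are hypothetical; in a real code path the wrap is exactly where buffer orphaning or a fence wait must guarantee the GPU no longer reads the reused region):

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
    size_t size;    /* total ring size in bytes */
    size_t head;    /* next free offset */
} StreamRing;

/* Reserve `bytes` at `align` alignment (align must be a power of two);
 * returns the offset to write at, or (size_t)-1 if the request can
 * never fit.  Sets *wrapped when the allocation restarted at offset 0,
 * i.e. when the caller must orphan the buffer or wait on a fence
 * before overwriting data the GPU may still be reading. */
static size_t ring_reserve(StreamRing *r, size_t bytes, size_t align,
                           int *wrapped)
{
    if (bytes > r->size)
        return (size_t)-1;
    size_t off = (r->head + align - 1) & ~(align - 1);
    *wrapped = 0;
    if (off + bytes > r->size) {   /* doesn't fit at the tail: wrap */
        off = 0;
        *wrapped = 1;
    }
    r->head = off + bytes;
    return off;
}
```

The returned offset would feed a glBufferSubData call or a memcpy into a mapped pointer, plus the matching offset in the draw call; the index-data ring works identically.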