Re: [Intel-gfx] [Libva] keep Nalu start code in VASliceDataBufferType data
On Tue, 2016-08-30 at 08:59 +0800, Randy Li wrote:
> Hi all:
> While writing the driver for our chip, we need the NALU header to be
> present in the data to be processed. But I found that the data rendered
> with type VASliceDataBufferType has the NALU start code removed. Is
> there any way to make the client send the data without removing the
> start code? Thank you.

Do you mean the start code prefix 0x01? It depends on the codec: VA-API
has this requirement for HEVC, but not for AVC. For AVC, I don't think
the existing software using vaapi sends the prefix to the driver, so you
have to work around it in your driver. But if the application is
developed by yourself, you may send the prefix and set the right
slice_data_offset, so other drivers can work well with your application.

> P.S Thank you for the Intel guys help, I decided not to use the DRM
> framework to implement the interface in kernel after I talked to the
> kernel upstream. But the request API would be used.
___
Intel-gfx mailing list
Intel-gfx@lists.freedesktop.org
https://lists.freedesktop.org/mailman/listinfo/intel-gfx
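A minimal sketch of the suggestion above, for an application you control:
copy the NAL unit behind a start code prefix and point the offset past
the prefix. The struct and function names are hypothetical stand-ins, not
the VA-API types, and the assumption that the size excludes the prefix is
mine; check the codec-specific slice parameter documentation before
relying on it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the two slice parameter fields discussed
 * above; a real client would fill the codec-specific VA-API structure. */
struct slice_params_sketch {
	uint32_t slice_data_size;   /* bytes of slice data (assumed: without prefix) */
	uint32_t slice_data_offset; /* where the driver should start reading */
};

/* Copy one NAL unit into `out`, preceded by a 3-byte start code prefix.
 * Returns the number of bytes written, or 0 if `out` is too small. */
static size_t pack_slice_with_prefix(const uint8_t *nal, size_t nal_size,
				     uint8_t *out, size_t out_cap,
				     struct slice_params_sketch *sp)
{
	static const uint8_t prefix[3] = { 0x00, 0x00, 0x01 };

	assert(nal && out && sp);
	if (out_cap < sizeof(prefix) + nal_size)
		return 0;

	memcpy(out, prefix, sizeof(prefix));
	memcpy(out + sizeof(prefix), nal, nal_size);

	/* Assumption: the offset skips the prefix, so a driver that expects
	 * prefix-free data still starts at the NAL header. */
	sp->slice_data_offset = (uint32_t)sizeof(prefix);
	sp->slice_data_size = (uint32_t)nal_size;
	return sizeof(prefix) + nal_size;
}
```

A driver that needs the prefix can look at the bytes just before
slice_data_offset, while a driver that does not simply never reads them.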
Re: [Intel-gfx] [PATCH 4/6] drm/i915/huc: Add debugfs for HuC loading status check
Hi Peter,

Could you provide the interface for UMD driver to detect HuC FW loading
status in your new patch set? A normal user doesn't have permission to
read files in debugfs. Reusing I915_GETPARAM is fine for me.

Thanks
Haihao

> > -----Original Message-----
> > From: Thierry, Michel
> > Sent: Thursday, June 23, 2016 3:48 AM
> > To: Antoine, Peter <peter.anto...@intel.com>; Xiang, Haihao
> > <haihao.xi...@intel.com>; daniel.vet...@ffwll.ch
> > Cc: Kelley, Sean V <sean.v.kel...@intel.com>;
> > intel-g...@lists.freedesktop.org; Li, Lawrence T
> > <lawrence.t...@intel.com>; Vivi, Rodrigo <rodrigo.v...@intel.com>
> > Subject: Re: [Intel-gfx] [PATCH 4/6] drm/i915/huc: Add debugfs for
> > HuC loading status check
> >
> > On 6/23/2016 11:01 AM, Peter Antoine wrote:
> > > Daniel,
> > >
> > > Is this suggestion acceptable? I don't want to waste time and
> > > effort writing code that is not going to be accepted?
> > >
> > > Peter.
> > >
> >
> > Reuse I915_GETPARAM and do more-less what Chris did for
> > i915.enable_gvt? [1]
> >
> > [1] https://cgit.freedesktop.org/drm-intel/commit/?id=7822492fd21a44eeb3568082b0ab915df7388061
> >
> > Something along those lines would work for me with our media UMD.
> >
> > Thanks,
> > Sean
> >
> > > On Thu, 23 Jun 2016, Xiang, Haihao wrote:
> > >
> > > > Hi Peter,
> > > >
> > > > Besides debugfs, could you add a IOCTL to check HuC loading
> > > > status? Userspace media driver needs to advertise the features
> > > > based on HuC to user.
> > > >
> > > > Thanks
> > > > Haihao
> > > >
> > > > > From: Alex Dai <yu@intel.com>
> > > > >
> > > > > Add debugfs entry for HuC loading status check.
> > > > >
> > > > > Signed-off-by: Alex Dai <yu@intel.com>
> > > > > Signed-off-by: Peter Antoine <peter.anto...@intel.com>
> > > > > ---
> > > > >  drivers/gpu/drm/i915/i915_debugfs.c | 32
> > > > >  1 file changed, 32 insertions(+)
> > > > >
> > > > > diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > index 69964c2..f5976f8 100644
> > > > > --- a/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> > > > > @@ -2479,6 +2479,37 @@ static int i915_llc(struct seq_file *m, void *data)
> > > > >  	return 0;
> > > > >  }
> > > > >
> > > > > +static int i915_huc_load_status_info(struct seq_file *m, void *data)
> > > > > +{
> > > > > +	struct drm_info_node *node = m->private;
> > > > > +	struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
> > > > > +	struct intel_uc_fw *huc_fw = &dev_priv->huc.huc_fw;
> > > > > +
> > > > > +	if (!HAS_HUC_UCODE(dev_priv->dev))
> > > > > +		return 0;
> > > > > +
> > > > > +	seq_puts(m, "HuC firmware status:\n");
> > > > > +	seq_printf(m, "\tpath: %s\n", huc_fw->uc_fw_path);
> > > > > +	seq_printf(m, "\tfetch: %s\n",
> > > > > +		intel_uc_fw_status_repr(huc_fw->fetch_status));
> > > > > +	seq_printf(m, "\tload: %s\n",
> > > > > +		intel_uc_fw_status_repr(huc_fw->load_status));
> > > > > +	seq_printf(m, "\tversion wanted: %d.%d\n",
> > > > > +		huc_fw->major_ver_wanted, huc_fw->minor_ver_wanted);
> > > > > +	seq_printf(m, "\tversion found: %d.%d\n",
> > > > > +		huc_fw->major_ver_found, huc_fw->minor_ver_found);
> > > > > +	seq_printf(m, "\theader: offset is %d; size = %d\n",
> > > > > +		huc_fw->header_offset, huc_fw->header_size);
> > > > > +	seq_printf(m, "\tuCode: offset is %d; size = %d\n",
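A minimal sketch of how a userspace driver might consume such a
get-parameter style query once it exists. Everything named here is
hypothetical: the parameter number, the struct, and the callback that
stands in for the ioctl so the example stays self-contained. The thread
itself does not define the final interface.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical parameter number; the thread does not define one. */
#define SKETCH_PARAM_HUC_STATUS 1000

/* Mirrors the shape of a get-parameter request: an id in, a value out. */
struct getparam_sketch {
	int param;
	int *value;
};

/* Callback standing in for the ioctl call. */
typedef int (*getparam_fn)(void *ctx, struct getparam_sketch *gp);

/* Returns 1 if the HuC firmware is reported as loaded, 0 otherwise
 * (including when the kernel rejects the parameter). */
static int huc_is_loaded(getparam_fn query, void *ctx)
{
	int value = 0;
	struct getparam_sketch gp = { SKETCH_PARAM_HUC_STATUS, &value };

	assert(query != NULL);
	if (query(ctx, &gp) != 0)
		return 0; /* unknown parameter: advertise no HuC features */
	return value != 0;
}

/* Test double: ctx points at the status to report, or is NULL to
 * simulate a kernel that does not know the parameter. */
static int fake_query(void *ctx, struct getparam_sketch *gp)
{
	if (ctx == NULL || gp->param != SKETCH_PARAM_HUC_STATUS)
		return -1;
	*gp->value = *(const int *)ctx;
	return 0;
}
```

Treating a failed query as "not loaded" lets the same media driver run on
kernels that predate the parameter without advertising HuC features.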
Re: [Intel-gfx] [PATCH v3 6/6] drm/i915/huc: Add BXT HuC Loading Support
Hi Rodrigo,

We will use HuC on BXT.

Thanks
Haihao

> vaapi-intel-driver, the userspace component here is only using HuC for
> SKL for now, so I believe this one will be on hold for now, right?
>
> On Wed, 2016-07-06 at 15:24 +0100, Peter Antoine wrote:
> > This patch adds the HuC Loading for the BXT.
> > Version 1.7 of the HuC firmware.
> >
> > v2: rebased.
> > v3: rebased.
> >     changed file name to match the install package format.
> >
> > Signed-off-by: Peter Antoine
> > ---
> >  drivers/gpu/drm/i915/intel_huc_loader.c | 7 +++
> >  1 file changed, 7 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_huc_loader.c b/drivers/gpu/drm/i915/intel_huc_loader.c
> > index 96cd9d8..c6d53b3 100644
> > --- a/drivers/gpu/drm/i915/intel_huc_loader.c
> > +++ b/drivers/gpu/drm/i915/intel_huc_loader.c
> > @@ -49,6 +49,9 @@
> >  #define I915_SKL_HUC_UCODE "i915/skl_huc_ver01_07_1398.bin"
> >  MODULE_FIRMWARE(I915_SKL_HUC_UCODE);
> >
> > +#define I915_BXT_HUC_UCODE "i915/bxt_huc_ver01_07_1398.bin"
> > +MODULE_FIRMWARE(I915_BXT_HUC_UCODE);
> > +
> >  /**
> >   * intel_huc_load_ucode() - DMA's the firmware
> >   * @dev: the drm device
> > @@ -157,6 +160,10 @@ void intel_huc_init(struct drm_device *dev)
> >  		fw_path = I915_SKL_HUC_UCODE;
> >  		huc_fw->major_ver_wanted = 1;
> >  		huc_fw->minor_ver_wanted = 7;
> > +	} else if (IS_BROXTON(dev_priv)) {
> > +		fw_path = I915_BXT_HUC_UCODE;
> > +		huc_fw->major_ver_wanted = 1;
> > +		huc_fw->minor_ver_wanted = 7;
> >  	}
> >
> >  	if (fw_path == NULL)
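The two firmware paths in the quoted patch follow one naming pattern. As
a small sketch, the helper below rebuilds such a name from its parts. The
function is hypothetical, and reading the trailing number as a build
number is an assumption; the patch only states that the firmware is
version 1.7.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Builds "i915/<platform>_huc_ver<MM>_<mm>_<build>.bin" into buf.
 * Returns 0 on success, -1 if buf is too small. */
static int huc_fw_name(char *buf, size_t len, const char *platform,
		       unsigned major, unsigned minor, unsigned build)
{
	int n;

	assert(buf && platform);
	n = snprintf(buf, len, "i915/%s_huc_ver%02u_%02u_%u.bin",
		     platform, major, minor, build);
	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Calling huc_fw_name(buf, sizeof(buf), "bxt", 1, 7, 1398) reproduces the
BXT path from the patch.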
Re: [Intel-gfx] [PATCH 4/6] drm/i915/huc: Add debugfs for HuC loading status check
Hi Peter,

Besides debugfs, could you add an IOCTL to check HuC loading status?
The userspace media driver needs to advertise the HuC-based features to
the user.

Thanks
Haihao

> From: Alex Dai
>
> Add debugfs entry for HuC loading status check.
>
> Signed-off-by: Alex Dai
> Signed-off-by: Peter Antoine
> ---
>  drivers/gpu/drm/i915/i915_debugfs.c | 32
>  1 file changed, 32 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_debugfs.c b/drivers/gpu/drm/i915/i915_debugfs.c
> index 69964c2..f5976f8 100644
> --- a/drivers/gpu/drm/i915/i915_debugfs.c
> +++ b/drivers/gpu/drm/i915/i915_debugfs.c
> @@ -2479,6 +2479,37 @@ static int i915_llc(struct seq_file *m, void *data)
>  	return 0;
>  }
>
> +static int i915_huc_load_status_info(struct seq_file *m, void *data)
> +{
> +	struct drm_info_node *node = m->private;
> +	struct drm_i915_private *dev_priv = node->minor->dev->dev_private;
> +	struct intel_uc_fw *huc_fw = &dev_priv->huc.huc_fw;
> +
> +	if (!HAS_HUC_UCODE(dev_priv->dev))
> +		return 0;
> +
> +	seq_puts(m, "HuC firmware status:\n");
> +	seq_printf(m, "\tpath: %s\n", huc_fw->uc_fw_path);
> +	seq_printf(m, "\tfetch: %s\n",
> +		intel_uc_fw_status_repr(huc_fw->fetch_status));
> +	seq_printf(m, "\tload: %s\n",
> +		intel_uc_fw_status_repr(huc_fw->load_status));
> +	seq_printf(m, "\tversion wanted: %d.%d\n",
> +		huc_fw->major_ver_wanted, huc_fw->minor_ver_wanted);
> +	seq_printf(m, "\tversion found: %d.%d\n",
> +		huc_fw->major_ver_found, huc_fw->minor_ver_found);
> +	seq_printf(m, "\theader: offset is %d; size = %d\n",
> +		huc_fw->header_offset, huc_fw->header_size);
> +	seq_printf(m, "\tuCode: offset is %d; size = %d\n",
> +		huc_fw->ucode_offset, huc_fw->ucode_size);
> +	seq_printf(m, "\tRSA: offset is %d; size = %d\n",
> +		huc_fw->rsa_offset, huc_fw->rsa_size);
> +
> +	seq_printf(m, "\nHuC status 0x%08x:\n", I915_READ(HUC_STATUS2));
> +
> +	return 0;
> +}
> +
>  static int i915_guc_load_status_info(struct seq_file *m, void *data)
>  {
>  	struct drm_info_node *node = m->private;
> @@ -5432,6 +5463,7 @@ static const struct drm_info_list i915_debugfs_list[] = {
>  	{"i915_guc_info", i915_guc_info, 0},
>  	{"i915_guc_load_status", i915_guc_load_status_info, 0},
>  	{"i915_guc_log_dump", i915_guc_log_dump, 0},
> +	{"i915_huc_load_status", i915_huc_load_status_info, 0},
>  	{"i915_frequency_info", i915_frequency_info, 0},
>  	{"i915_hangcheck_info", i915_hangcheck_info, 0},
>  	{"i915_drpc_info", i915_drpc_info, 0},
Re: [Intel-gfx] [PATCH] drm/i915: Make sample_c messages go faster on Haswell.
On Mon, 2015-01-05 at 21:54 -0800, Kenneth Graunke wrote:
> On Tuesday, January 06, 2015 01:11:53 PM Xiang, Haihao wrote:
> > Hi Kenneth,
> >
> > How did you test OSD ? I can't reproduce the issue you mentioned, OSD
> > works well for me when using mplayer-vaapi with the latest
> > libva/libva-intel-driver master branch.
> >
> > I tried your patch, what surprised me is OSD still works well after
> > applying your patch. It seems your patch didn't disable the palette.
> >
> > Thanks
> > Haihao
>
> I ran:
>
> mplayer -osdlevel 3 -vo vaapi big_buck_bunny_720p_stereo.ogg
>
> For me, the OSD text is solid green, with hard edges.

The OSD text is white for me when using mplayer -osdlevel 3 -vo vaapi
xxx. If possible, could you update your mplayer ?

> If you use -vo gl or -vo xv, the OSD is solid white text with a black
> border around it. I presume that it's supposed to be white with vaapi
> as well, but I guess I'm not entirely sure.
>
> It's possible that the optimization doesn't affect the palette as long
> as you never use sample_c with the paletted textures.

I verified the palette takes effect in the following way:

1. Only support P8A8 format in the driver
2. ran the above command and I saw white OSD text
3. Only support P4A4 format in the driver and don't use
   3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture
   palette, so the palette keeps unchanged.
4. ran the above command and I saw black OSD text.
5. Load the right value to the texture palette and ran the above
   command again, I saw white OSD text.

Hence I think sample_c with the paletted textures is used in the driver.

> --Ken
Re: [Intel-gfx] [PATCH] drm/i915: Make sample_c messages go faster on Haswell.
On Tue, 2015-01-06 at 14:39 +0800, Xiang, Haihao wrote:
> On Mon, 2015-01-05 at 21:54 -0800, Kenneth Graunke wrote:
> > On Tuesday, January 06, 2015 01:11:53 PM Xiang, Haihao wrote:
> > > Hi Kenneth,
> > >
> > > How did you test OSD ? I can't reproduce the issue you mentioned,
> > > OSD works well for me when using mplayer-vaapi with the latest
> > > libva/libva-intel-driver master branch.
> > >
> > > I tried your patch, what surprised me is OSD still works well
> > > after applying your patch. It seems your patch didn't disable the
> > > palette.
> > >
> > > Thanks
> > > Haihao
> >
> > I ran:
> >
> > mplayer -osdlevel 3 -vo vaapi big_buck_bunny_720p_stereo.ogg
> >
> > For me, the OSD text is solid green, with hard edges.
>
> The OSD text is white for me when using mplayer -osdlevel 3 -vo vaapi
> xxx. If possible, could you update your mplayer ?
>
> > If you use -vo gl or -vo xv, the OSD is solid white text with a
> > black border around it. I presume that it's supposed to be white
> > with vaapi as well, but I guess I'm not entirely sure.
> >
> > It's possible that the optimization doesn't affect the palette as
> > long as you never use sample_c with the paletted textures.
>
> I verified the palette takes effect in the following way:
>
> 1. Only support P8A8 format in the driver
> 2. ran the above command and I saw white OSD text
> 3. Only support P4A4 format in the driver and don't use
>    3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture
>    palette, so the palette keeps unchanged.
> 4. ran the above command and I saw black OSD text.
> 5. Load the right value to the texture palette and ran the above
>    command again, I saw white OSD text.
>
> Hence I think sample_c with the paletted textures is used in the
> driver.

Sorry, the libva driver doesn't use the sample_c message; I meant that
the paletted texture is used. However, according to the doc, the palette
is disabled for fast mode.

> > --Ken
Re: [Intel-gfx] [PATCH] drm/i915: Make sample_c messages go faster on Haswell.
On Mon, 2015-01-05 at 23:03 -0800, Kenneth Graunke wrote:
> On Tuesday, January 06, 2015 02:39:36 PM Xiang, Haihao wrote:
> > On Mon, 2015-01-05 at 21:54 -0800, Kenneth Graunke wrote:
> > > On Tuesday, January 06, 2015 01:11:53 PM Xiang, Haihao wrote:
> > > > Hi Kenneth,
> > > >
> > > > How did you test OSD ? I can't reproduce the issue you
> > > > mentioned, OSD works well for me when using mplayer-vaapi with
> > > > the latest libva/libva-intel-driver master branch.
> > > >
> > > > I tried your patch, what surprised me is OSD still works well
> > > > after applying your patch. It seems your patch didn't disable
> > > > the palette.
> > > >
> > > > Thanks
> > > > Haihao
> > >
> > > I ran:
> > >
> > > mplayer -osdlevel 3 -vo vaapi big_buck_bunny_720p_stereo.ogg
> > >
> > > For me, the OSD text is solid green, with hard edges.
> >
> > The OSD text is white for me when using mplayer -osdlevel 3 -vo
> > vaapi xxx. If possible, could you update your mplayer ?
>
> Huh. I'm using the Arch Linux package of mplayer-vaapi 36265-13, which
> seems to be the most recent subversion commit ID. I've never seen
> white text on my Haswell system - it seems to be consistently dark
> green.
>
> > > If you use -vo gl or -vo xv, the OSD is solid white text with a
> > > black border around it. I presume that it's supposed to be white
> > > with vaapi as well, but I guess I'm not entirely sure.
> > >
> > > It's possible that the optimization doesn't affect the palette as
> > > long as you never use sample_c with the paletted textures.
> >
> > I verified the palette takes effect in the following way:
> >
> > 1. Only support P8A8 format in the driver
> > 2. ran the above command and I saw white OSD text
> > 3. Only support P4A4 format in the driver and don't use
> >    3DSTATE_SAMPLER_PALETTE_LOAD0 to load the value to the texture
> >    palette, so the palette keeps unchanged.
> > 4. ran the above command and I saw black OSD text.
> > 5. Load the right value to the texture palette and ran the above
> >    command again, I saw white OSD text.
> >
> > Hence I think sample_c with the paletted textures is used in the
> > driver.
>
> That sounds like the palette is actually working, then. Great :)
>
> I doubt that libva would use sample_c - sampling with a shadow
> comparison? It looks like it just uses sample and sample+killpix.

You are right, libva driver doesn't use sample_c message.

> I'm pretty sure the sample_c optimization just uses the palette memory
> as storage for some stuff, so it's quite possible it just works if
> you're only using sample and sample+killpix.

Thanks for the explanation, it makes sense to me.

> --Ken
Re: [Intel-gfx] [PATCH] drm/i915: Make sample_c messages go faster on Haswell.
Hi Kenneth, How did you test OSD ? I can't reproduce the issue you mentioned, OSD works well for me when using mplayer-vaapi with the latest libva/libva-intel-driver master branch. I tried your patch, what surprised me is OSD still works well after applying your patch. It seems your patch didn't disable the palette. Thanks Haihao On Monday, January 05, 2015 02:19:15 PM Daniel Vetter wrote: On Wed, Dec 31, 2014 at 04:23:00PM -0800, Kenneth Graunke wrote: Haswell significantly improved the performance of sampler_c messages, but the optimization appears to be off by default. Later platforms remove this bit, and apparently always enable the optimization. Improves performance in Counter Strike: Global Offensive by 18% at default settings on Iris Pro. This may break sampling of paletted formats (P8/A8P8/P8A8). It's unclear whether it affects sampling of paletted formats in general, or just the sample_c message (which is never used). While libva does have support for using paletted formats (primarily for OSDs), that support appears to have been broken for at least a year, so I couldn't observe a regression from this. Signed-off-by: Kenneth Graunke kenn...@whitecape.org --- drivers/gpu/drm/i915/i915_reg.h | 1 + drivers/gpu/drm/i915/intel_pm.c | 4 2 files changed, 5 insertions(+) Resubmitting the patch to unconditionally enable this. I tried to get libva-intel to use paletted formats, and observe a regression...but the only thing I found that used it was mplayer's OSD (on screen display). Even without my patch, the colors were totally wrong with that, and it's according to a few distro wikis, that's been the case for over a year. If libva's code for paletted formats /is/ broken, they could always add code to disable this bit using the command validator when fixing it. Could we try merging this, and back it out if someone reports a regression? I haven't observed any problems. It's also been quite stable. Yeah makes sense. 
When resending please incorporated review feedback (Ville dug out the wa name), I've done that. And I've pasted the additional detail about the libva saga, just for reference (since no one will remember that it's mplayer's OSD which uses this 2 months down the road). Also please cc libva mailing lists next time around as an fyi. Done that too. Queued for -next, thanks for the patch. -Daniel Oh, sorry, I missed that in the review. Thanks, Daniel! --Ken ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [PATCH] drm/i915: Make sample_c messages go faster on Haswell.
On Mon, 2014-11-03 at 13:48 +0100, Daniel Vetter wrote:
> On Fri, Oct 31, 2014 at 11:27:33AM +0200, Ville Syrjälä wrote:
> > On Thu, Oct 30, 2014 at 12:57:04PM -0700, Kenneth Graunke wrote:
> > > Before we get too much further...we should check if libva is
> > > actually broken. I don't know if this means the sampler palette
> > > completely doesn't work, or if it just means sample_c doesn't work
> > > with the palette. If it's the latter, we're probably fine, because
> > > I doubt libva uses sample_c.
> >
> > Yeah if we wouldn't break any existing userspace I guess we could
> > just flip the switch in the kernel. If anyone later wants to start
> > doing something that no longer works they'd have to deal with
> > disabling the bit using an LRI.
>
> It very much looks like libva uses palettes, since it supports C8 and
> C4 image formats (well some crazy fourcc nonsense, but meh). And it
> does so on all generations support by the libva driver, i.e. including
> hsw afaict
>
> Cc'ing people and lists with more clue who should be able to tell
> whether its not just there but actually works ...

Yes, libva driver uses sample (0) and palette on HSW

Thanks
Haihao

> -Daniel
Re: [Intel-gfx] [PATCH v2 i-g-t] lib/chv: CHV media pipeline command sequence
> +gen8_emit_curbe_load(struct intel_batchbuffer *batch, uint32_t curbe_buffer)
> +{
> +	OUT_BATCH(GEN8_MEDIA_CURBE_LOAD | (4 - 2));
> +	OUT_BATCH(0);
> +	/* curbe total data length */
> +	OUT_BATCH(64);
> +	/* curbe data start address, is relative to the dynamics base address */
> +	OUT_BATCH(curbe_buffer);
> +}
> +
> +static void
> +gen8_emit_interface_descriptor_load(struct intel_batchbuffer *batch, uint32_t interface_descriptor)
> +{
> +	OUT_BATCH(GEN8_MEDIA_INTERFACE_DESCRIPTOR_LOAD | (4 - 2));
> +	OUT_BATCH(0);
> +	/* interface descriptor data length */
> +	OUT_BATCH(sizeof(struct gen8_interface_descriptor_data));
> +	/* interface descriptor address, is relative to the dynamics base address */
> +	OUT_BATCH(interface_descriptor);
> +}
> +
> +static void
> +gen8lp_emit_media_objects(struct intel_batchbuffer *batch,
> +			  unsigned x, unsigned y,
> +			  unsigned width, unsigned height)
> +{
> +	int i, j;
> +
> +	for (i = 0; i < width / 16; i++) {
> +		for (j = 0; j < height / 16; j++) {
> +			OUT_BATCH(GEN8_MEDIA_OBJECT | (8 - 2));
> +
> +			/* interface descriptor offset */
> +			OUT_BATCH(0);
> +
> +			/* without indirect data */
> +			OUT_BATCH(0);
> +			OUT_BATCH(0);
> +
> +			/* scoreboard */
> +			OUT_BATCH(0);
> +			OUT_BATCH(0);
> +
> +			/* inline data (xoffset, yoffset) */
> +			OUT_BATCH(x + i * 16);
> +			OUT_BATCH(y + j * 16);
> +		}
> +	}
> +}
> +
> +/*
> + * This sets up the media pipeline,
> + *
> + * +---------------+ <---- 4096
> + * |       ^       |
> + * |       |       |
> + * |    various    |
> + * |      state    |
> + * |       |       |
> + * |_______|_______| <---- 2048 + ?
> + * |       ^       |
> + * |       |       |
> + * |   batch       |
> + * |    commands   |
> + * |       |       |
> + * |       |       |
> + * +---------------+ <---- 0 + ?
> + *
> + */
> +
> +#define BATCH_STATE_SPLIT 2048
> +
> +void
> +gen8lp_media_fillfunc(struct intel_batchbuffer *batch,
> +		      struct igt_buf *dst,
> +		      unsigned x, unsigned y,
> +		      unsigned width, unsigned height,
> +		      uint8_t color)
> +{
> +	uint32_t curbe_buffer, interface_descriptor;
> +	uint32_t batch_end;
> +
> +	intel_batchbuffer_flush(batch);
> +
> +	/* setup states */
> +	batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
> +
> +	curbe_buffer = gen8_fill_curbe_buffer_data(batch, color);
> +	interface_descriptor = gen8_fill_interface_descriptor(batch, dst);
> +	assert(batch->ptr < &batch->buffer[4095]);
> +
> +	/* media pipeline */
> +	batch->ptr = batch->buffer;
> +	OUT_BATCH(GEN8_PIPELINE_SELECT | PIPELINE_SELECT_MEDIA);
> +	gen8_emit_state_base_address(batch);
> +
> +	gen8_emit_vfe_state(batch);
> +
> +	gen8_emit_curbe_load(batch, curbe_buffer);
> +
> +	gen8_emit_interface_descriptor_load(batch, interface_descriptor);
> +
> +	gen8lp_emit_media_objects(batch, x, y, width, height);
> +
> +	OUT_BATCH(MI_BATCH_BUFFER_END);
> +
> +	batch_end = batch_align(batch, 8);
> +	assert(batch_end < BATCH_STATE_SPLIT);
> +
> +	gen8_render_flush(batch, batch_end);
> +	intel_batchbuffer_reset(batch);
> +}

LGTM.

Reviewed-by: Xiang, Haihao <haihao.xi...@intel.com>
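The quoted loop emits one 8-dword MEDIA_OBJECT per 16x16 block, and the
final assert requires the commands to end before BATCH_STATE_SPLIT. A
rough sketch of that budget follows; the helper names are hypothetical,
and the size of the other setup commands is left as a caller-supplied
guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define BLOCK_SIZE          16   /* pixels per side covered by one object */
#define MEDIA_OBJECT_DWORDS 8    /* dwords per MEDIA_OBJECT in the patch */
#define BATCH_STATE_SPLIT   2048 /* byte offset where state data starts */

/* Bytes of batch space taken by the MEDIA_OBJECT loop for a fill area. */
static size_t media_objects_bytes(unsigned width, unsigned height)
{
	size_t objects = (size_t)(width / BLOCK_SIZE) * (height / BLOCK_SIZE);

	return objects * MEDIA_OBJECT_DWORDS * sizeof(uint32_t);
}

/* True if the loop plus `other_bytes` of setup commands stays below
 * the split between commands and state. */
static int fits_in_command_area(unsigned width, unsigned height,
				size_t other_bytes)
{
	return media_objects_bytes(width, height) + other_bytes <
	       BATCH_STATE_SPLIT;
}
```

A 64x64 fill needs 16 objects (512 bytes) and fits easily, while a
128x128 fill would need 64 objects (2048 bytes) and trip the assert on
its own, so the fill function is only suitable for small areas.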
Re: [Intel-gfx] [PATCH 2/3] drm/i915: Introduce dual_bsd_ring parameter.
On Wed, 2014-07-02 at 07:52 +0800, Zhao, Yakui wrote:
> On Tue, 2014-07-01 at 09:26 -0600, Vivi, Rodrigo wrote:
> > It seems the flexibility on rings is more wanted and needed than I
> > imagined. Please ignore this patch here...
> >
> > I liked both execution flag or debugfs, but exec flag would cover
> > this case of different applications using different command
> > streamers.
> >
> > With flags Would it be something like:
> > Execution without flag = ping-pong
> > Flag BSD1 use only VCS1
> > Flag BSD2 use only VCS2
>
> IMO the execution flag looks reasonable. It can cover the flexibility
> of different applications. In such case it can determine which ring is
> used to dispatch command at runtime.

I prefer the execution flag too.

Thanks
Haihao

> > Haihao, what do you think?
> >
> > With debugfs would be something like i195_dual_bsd_ring file with 3
> > options:
> > all
> > bsd1
> > bsd2
> >
> > Thanks,
> > Rodrigo.
> >
> > -----Original Message-----
> > From: Zhao, Yakui
> > Sent: Monday, June 30, 2014 6:37 PM
> > To: Vivi, Rodrigo
> > Cc: intel-gfx@lists.freedesktop.org
> > Subject: Re: [Intel-gfx] [PATCH 2/3] drm/i915: Introduce
> > dual_bsd_ring parameter.
> >
> > On Mon, 2014-06-30 at 10:51 -0600, Rodrigo Vivi wrote:
> > > On Broadwell GT3 we have 2 Video Command Streamers (VCS), but
> > > userspace has no control when using VCS1 or VCS2. So we cannot
> > > test, validate or debug specific changes or workaround that might
> > > affect only one or another ring. So this patch introduces a
> > > mechanism to avoid the ping-pong selection and use one specific
> > > ring given at boot time.
> >
> > If it is mainly used for the test/validation, can we add one
> > override flag so that the user-space app can explicitly declare
> > which BSD ring is used to dispatch the corresponding BSD commands?
> > In such case it will force to dispatch the corresponding commands on
> > the ring passed by user-application.
> >
> > At the same time this patch is not helpful under the following
> > scenario. For example: One application hopes to use the BSD Ring 0
> > while another application hopes to use the BSD ring 1.
> >
> > > Signed-off-by: Rodrigo Vivi <rodrigo.v...@intel.com>
> > > ---
> > >  drivers/gpu/drm/i915/i915_drv.h            |  1 +
> > >  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 34 ++
> > >  drivers/gpu/drm/i915/i915_params.c         |  6 ++
> > >  3 files changed, 27 insertions(+), 14 deletions(-)
> > >
> > > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > > index 8cea596..7b6614f 100644
> > > --- a/drivers/gpu/drm/i915/i915_drv.h
> > > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > > @@ -2069,6 +2069,7 @@ struct i915_params {
> > >  	int panel_ignore_lid;
> > >  	unsigned int powersave;
> > >  	int semaphores;
> > > +	int dual_bsd_ring;
> > >  	unsigned int lvds_downclock;
> > >  	int lvds_channel_mode;
> > >  	int panel_use_ssc;
> > > diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > index d815ef5..09f350e 100644
> > > --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> > > @@ -1035,26 +1035,32 @@ static int gen8_dispatch_bsd_ring(struct drm_device *dev,
> > >  {
> > >  	struct drm_i915_private *dev_priv = dev->dev_private;
> > >  	struct drm_i915_file_private *file_priv = file->driver_priv;
> > > +	int ring_id;
> > > +	int dual = i915.dual_bsd_ring;
> > >
> > >  	/* Check whether the file_priv is using one ring */
> > >  	if (file_priv->bsd_ring)
> > >  		return file_priv->bsd_ring->id;
> > > -	else {
> > > -		/* If no, use the ping-pong mechanism to select one ring */
> > > -		int ring_id;
> > > -		mutex_lock(&dev->struct_mutex);
> > > -		if (dev_priv->mm.bsd_ring_dispatch_index == 0) {
> > > -			ring_id = VCS;
> > > -			dev_priv->mm.bsd_ring_dispatch_index = 1;
> > > -		} else {
> > > -			ring_id = VCS2;
> > > -			dev_priv->mm.bsd_ring_dispatch_index = 0;
> > > -		}
> > > -		file_priv->bsd_ring = &dev_priv->ring[ring_id];
> > > -		mutex_unlock(&dev->struct_mutex);
> > > -		return ring_id;
> > > +	/* If no, use the parameter defined or ping-pong mechanism
> > > +	 * to select one ring */
> > > +	mutex_lock(&dev->struct_mutex);
> > > +
> > > +	if (dual == 1 || (dual != 2 &&
> > > +	    dev_priv->mm.bsd_ring_dispatch_index == 0)) {
> > > +		ring_id = VCS;
> > > +		dev_priv->mm.bsd_ring_dispatch_index = 1;
> > > +	} else {
> > > +		ring_id = VCS2;
> > > +		dev_priv->mm.bsd_ring_dispatch_index = 0;
> > >  	}
> > > +
> > > +	file_priv->bsd_ring = &dev_priv->ring[ring_id];
> > > +	mutex_unlock(&dev->struct_mutex);
> > > +
> > > +	WARN(dual, "Forcibly trying to use only one bsd ring. Using: %s\n",
> > > +	     file_priv->bsd_ring->name);
> > > +	return ring_id;
> > >  }
> > >
> > >  static struct drm_i915_gem_object *
> > > diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> > > index 8145729..d4871c8 100644
> > > --- a/drivers/gpu/drm/i915/i915_params.c
> > > +++ b/drivers/gpu/drm/i915/i915_params.c
> > > @@ -29,6 +29,7 @@
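A small model of the execution-flag idea preferred above: an explicit
flag pins the submission to one ring, and no flag falls back to
ping-pong. The flag and ring names are hypothetical and do not come from
the thread. The sketch also leaves out the locking and the per-file ring
caching that the quoted kernel code performs.

```c
#include <assert.h>
#include <stddef.h>

enum bsd_ring { BSD_RING_1 = 0, BSD_RING_2 = 1 };

/* Hypothetical per-submission flags; names are illustrative only. */
enum bsd_flag { BSD_FLAG_DEFAULT = 0, BSD_FLAG_RING1 = 1, BSD_FLAG_RING2 = 2 };

struct dispatch_state {
	int next_index; /* ping-pong toggle, 0 or 1 */
};

/* Pick a ring: honour an explicit flag, otherwise alternate. */
static enum bsd_ring select_bsd_ring(struct dispatch_state *st,
				     enum bsd_flag flag)
{
	enum bsd_ring ring;

	assert(st != NULL);
	switch (flag) {
	case BSD_FLAG_RING1:
		return BSD_RING_1;
	case BSD_FLAG_RING2:
		return BSD_RING_2;
	default:
		break;
	}

	/* No explicit request: ping-pong between the two rings. */
	ring = st->next_index == 0 ? BSD_RING_1 : BSD_RING_2;
	st->next_index ^= 1;
	return ring;
}
```

Because the flag travels with each submission, two applications can pin
themselves to different rings at the same time, which a boot-time module
parameter cannot express.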
Re: [Intel-gfx] [PATCH 2/3] drm/i915: Introduce dual_bsd_ring parameter.
On Mon, 2014-06-30 at 09:51 -0700, Rodrigo Vivi wrote:
> On Broadwell GT3 we have 2 Video Command Streamers (VCS), but userspace
> has no control when using VCS1 or VCS2. So we cannot test, validate or
> debug specific changes or workaround that might affect only one or
> another ring. So this patch introduces a mechanism to avoid the
> ping-pong selection and use one specific ring given at boot time.

Hi Rodrigo,

Could you use a mechanism to specify the ring at runtime? If so, it
would be more flexible for us to use VCS.

Thanks
Haihao

> Signed-off-by: Rodrigo Vivi <rodrigo.v...@intel.com>
> ---
>  drivers/gpu/drm/i915/i915_drv.h            |  1 +
>  drivers/gpu/drm/i915/i915_gem_execbuffer.c | 34 ++
>  drivers/gpu/drm/i915/i915_params.c         |  6 ++
>  3 files changed, 27 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> index 8cea596..7b6614f 100644
> --- a/drivers/gpu/drm/i915/i915_drv.h
> +++ b/drivers/gpu/drm/i915/i915_drv.h
> @@ -2069,6 +2069,7 @@ struct i915_params {
>  	int panel_ignore_lid;
>  	unsigned int powersave;
>  	int semaphores;
> +	int dual_bsd_ring;
>  	unsigned int lvds_downclock;
>  	int lvds_channel_mode;
>  	int panel_use_ssc;
> diff --git a/drivers/gpu/drm/i915/i915_gem_execbuffer.c b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> index d815ef5..09f350e 100644
> --- a/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> +++ b/drivers/gpu/drm/i915/i915_gem_execbuffer.c
> @@ -1035,26 +1035,32 @@ static int gen8_dispatch_bsd_ring(struct drm_device *dev,
>  {
>  	struct drm_i915_private *dev_priv = dev->dev_private;
>  	struct drm_i915_file_private *file_priv = file->driver_priv;
> +	int ring_id;
> +	int dual = i915.dual_bsd_ring;
>
>  	/* Check whether the file_priv is using one ring */
>  	if (file_priv->bsd_ring)
>  		return file_priv->bsd_ring->id;
> -	else {
> -		/* If no, use the ping-pong mechanism to select one ring */
> -		int ring_id;
> -		mutex_lock(&dev->struct_mutex);
> -		if (dev_priv->mm.bsd_ring_dispatch_index == 0) {
> -			ring_id = VCS;
> -			dev_priv->mm.bsd_ring_dispatch_index = 1;
> -		} else {
> -			ring_id = VCS2;
> -			dev_priv->mm.bsd_ring_dispatch_index = 0;
> -		}
> -		file_priv->bsd_ring = &dev_priv->ring[ring_id];
> -		mutex_unlock(&dev->struct_mutex);
> -		return ring_id;
> +	/* If no, use the parameter defined or ping-pong mechanism
> +	 * to select one ring */
> +	mutex_lock(&dev->struct_mutex);
> +
> +	if (dual == 1 || (dual != 2 &&
> +	    dev_priv->mm.bsd_ring_dispatch_index == 0)) {
> +		ring_id = VCS;
> +		dev_priv->mm.bsd_ring_dispatch_index = 1;
> +	} else {
> +		ring_id = VCS2;
> +		dev_priv->mm.bsd_ring_dispatch_index = 0;
>  	}
> +
> +	file_priv->bsd_ring = &dev_priv->ring[ring_id];
> +	mutex_unlock(&dev->struct_mutex);
> +
> +	WARN(dual, "Forcibly trying to use only one bsd ring. Using: %s\n",
> +	     file_priv->bsd_ring->name);
> +	return ring_id;
>  }
>
>  static struct drm_i915_gem_object *
> diff --git a/drivers/gpu/drm/i915/i915_params.c b/drivers/gpu/drm/i915/i915_params.c
> index 8145729..d4871c8 100644
> --- a/drivers/gpu/drm/i915/i915_params.c
> +++ b/drivers/gpu/drm/i915/i915_params.c
> @@ -29,6 +29,7 @@ struct i915_params i915 __read_mostly = {
>  	.panel_ignore_lid = 1,
>  	.powersave = 1,
>  	.semaphores = -1,
> +	.dual_bsd_ring = 0,
>  	.lvds_downclock = 0,
>  	.lvds_channel_mode = 0,
>  	.panel_use_ssc = -1,
> @@ -70,6 +71,11 @@ MODULE_PARM_DESC(semaphores,
>  	"Use semaphores for inter-ring sync "
>  	"(default: -1 (use per-chip defaults))");
>
> +module_param_named(dual_bsd_ring, i915.dual_bsd_ring, int, 0600);
> +MODULE_PARM_DESC(dual_bsd_ring,
> +	"Specify bds rings for VCS when there are multiple VCSs available. "
> +	"(0=All available bsd rings [default], 1=only VCS1, 2=only VCS2)");
> +
>  module_param_named(enable_rc6, i915.enable_rc6, int, 0400);
>  MODULE_PARM_DESC(enable_rc6,
>  	"Enable power-saving render C-state 6. "
Re: [Intel-gfx] [PATCH] intel_audio_dump: fix CTS/M value index
On Thu, 2014-03-13 at 16:38 -0400, mengdong@intel.com wrote:
> From: Mengdong Lin <mengdong@intel.com>
>
> This patch fixes the reversed CTS/M value index when dumping the
> 'audio M/CTS programing enable' register.
>
> Signed-off-by: Mengdong Lin <mengdong@intel.com>
>
> diff --git a/tools/intel_audio_dump.c b/tools/intel_audio_dump.c
> index 46eebdb..3ed2918 100644
> --- a/tools/intel_audio_dump.c
> +++ b/tools/intel_audio_dump.c
> @@ -97,6 +97,11 @@ static int get_num_pipes(void)
>  	return num_pipes;
>  }
>
> +static const char * const cts_m_value_index[] = {
> +	[0] = "CTS",
> +	[1] = "M",
> +};
> +
>  static const char * const pixel_clock[] = {
>  	[0] = "25.2 / 1.001 MHz",
>  	[1] = "25.2 MHz",
> @@ -1408,7 +1413,8 @@ static void dump_aud_m_cts_enable(int index)
>  	printf("%s CTS_programming\t\t\t%#lx\n", prefix, BITS(dword, 19, 0));
>  	printf("%s Enable_CTS_or_M_programming\t%lu\n", prefix, BIT(dword, 20));
> -	printf("%s CTS_M value Index\t\t\t%s\n", prefix, BIT(dword, 21) ? "CTS" : "M");
> +	printf("%s CTS_M value Index\t\t\t[0x%lx] %s\n", prefix, BIT(dword, 21),
> +		OPNAME(cts_m_value_index, BIT(dword, 21)));
>  }
>
>  static void dump_aud_power_state(void)

It is OK for me.

Reviewed-by: Haihao Xiang <haihao.xi...@intel.com>
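The fix replaces a hand-written ternary with a table indexed by the
register bit, so the mapping is stated once. A standalone sketch of the
same lookup; BIT_OF and cts_m_name are local names for this example and
not the tool's own macros.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Extract bit n of a register value as 0 or 1. */
#define BIT_OF(word, n) (((word) >> (n)) & 1u)

/* Same mapping as the patched table: bit clear = CTS, bit set = M. */
static const char * const cts_m_value_index[] = {
	[0] = "CTS",
	[1] = "M",
};

/* Name selected by bit 21 of the register value. */
static const char *cts_m_name(uint32_t dword)
{
	return cts_m_value_index[BIT_OF(dword, 21)];
}
```

The ternary removed by the patch printed "CTS" when the bit was set,
which is exactly the reversal the table corrects.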
Re: [Intel-gfx] [PATCH Inte-gpu-tools 1/5] Assembler/bdw: Remove the unsupported cache agent for WRITE(...)
On Tue, 2014-01-28 at 09:53 +0800, yakui.z...@intel.com wrote:
> From: Zhao Yakui <yakui.z...@intel.com>
>
> The Sampler/Constant cache is read-only. And it can't be used as the
> target cache agent of WRITE message.
>
> Signed-off-by: Zhao Yakui <yakui.z...@intel.com>
> ---
>  assembler/gram.y | 4
>  1 file changed, 4 deletions(-)
>
> diff --git a/assembler/gram.y b/assembler/gram.y
> index ad4cb29..589a0fe 100644
> --- a/assembler/gram.y
> +++ b/assembler/gram.y
> @@ -1652,9 +1652,7 @@ msgtarget: NULL_TOKEN {
>  	if (IS_GENp(8)) {
>  		if ($9 != 0 &&
> -			$9 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
>  			$9 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
> -			$9 != GEN6_SFID_DATAPORT_CONSTANT_CACHE &&
>  			$9 != GEN7_SFID_DATAPORT_DATA_CACHE &&
>  			$9 != HSW_SFID_DATAPORT_DATA_CACHE1) {
>  			error (@9, "error: wrong cache type\n");
> @@ -1715,9 +1713,7 @@ msgtarget: NULL_TOKEN {
>  	if (IS_GENp(8)) {
>  		if ($9 != 0 &&
> -			$9 != GEN6_SFID_DATAPORT_SAMPLER_CACHE &&
>  			$9 != GEN6_SFID_DATAPORT_RENDER_CACHE &&
> -			$9 != GEN6_SFID_DATAPORT_CONSTANT_CACHE &&
>  			$9 != GEN7_SFID_DATAPORT_DATA_CACHE &&
>  			$9 != HSW_SFID_DATAPORT_DATA_CACHE1) {
>  			error (@9, "error: wrong cache type\n");

It is OK for me

Reviewed-by: Xiang, Haihao <haihao.xi...@intel.com>
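The check left after the patch can be read as a small predicate: a WRITE
message may only target agents that are writable. A sketch with
placeholder identifiers; the enum values are not the hardware SFID
encodings and the names are invented for this example.

```c
#include <assert.h>
#include <stdbool.h>

/* Placeholder identifiers; the numeric values are not the hardware SFIDs. */
enum cache_agent_sketch {
	AGENT_UNSPECIFIED = 0,
	AGENT_SAMPLER_CACHE,
	AGENT_RENDER_CACHE,
	AGENT_CONSTANT_CACHE,
	AGENT_DATA_CACHE,
	AGENT_DATA_CACHE1,
};

/* True if a WRITE message may target this agent; the read-only sampler
 * and constant caches are rejected. */
static bool write_target_ok(enum cache_agent_sketch agent)
{
	switch (agent) {
	case AGENT_UNSPECIFIED:
	case AGENT_RENDER_CACHE:
	case AGENT_DATA_CACHE:
	case AGENT_DATA_CACHE1:
		return true;
	case AGENT_SAMPLER_CACHE:
	case AGENT_CONSTANT_CACHE:
	default:
		return false;
	}
}
```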
Re: [Intel-gfx] [PATCH 1/2] drm/i915/bdw: Force all Data Cache Data Port access to be Non-Coherent
On Thu, 2013-12-12 at 15:28 -0800, Ben Widawsky wrote:
> I stumbled on to some unimplemented errata. To be honest, I am not
> really sure of the impact, just that the docs say to do.
>
> No w/a name for this one.
>
> Cc: Kenneth Graunke <kenn...@whitecape.org>
> Signed-off-by: Ben Widawsky <b...@bwidawsk.net>
> ---
>  drivers/gpu/drm/i915/i915_reg.h | 4 ++++
>  drivers/gpu/drm/i915/intel_pm.c | 7 +++++++
>  2 files changed, 11 insertions(+)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index e8cc27c..3259e83 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -4167,6 +4167,10 @@
>  #define GEN7_L3SQCREG4 0xb034
>  #define L3SQ_URB_READ_CAM_MATCH_DISABLE (1<<27)
>
> +/* GEN8 chicken */
> +#define HDC_CHICKEN0 0x7300
> +#define HDC_FORCE_NON_COHERENT (1<<4)
> +
>  /* WaCatErrorRejectionIssue */
>  #define GEN7_SQ_CHICKEN_MBCUNIT_CONFIG 0x9030
>  #define GEN7_SQ_CHICKEN_MBCUNIT_SQINTMOB (1<<11)
> diff --git a/drivers/gpu/drm/i915/intel_pm.c b/drivers/gpu/drm/i915/intel_pm.c
> index ac9dd46..7e2a0e9 100644
> --- a/drivers/gpu/drm/i915/intel_pm.c
> +++ b/drivers/gpu/drm/i915/intel_pm.c
> @@ -5279,6 +5279,13 @@ static void gen8_init_clock_gating(struct drm_device *dev)
>              I915_READ(CHICKEN_PIPESL_1(i) | DPRS_MASK_VBLANK_SRD));
>      }
> +
> +    /* Use Force Non-Coherent whenever executing a 3D context. This is a
> +     * workaround for for a possible hang in the unlikely event a TLB
> +     * invalidation occurs during a PSD flush.
> +     */
> +    I915_WRITE(HDC_FORCE_NON_COHERENT,

It should be HDC_CHICKEN0 instead of HDC_FORCE_NON_COHERENT.

> +           I915_READ(HDC_CHICKEN0) | HDC_FORCE_NON_COHERENT);

It has a mask bit which should be set for writing.

>  }
>
>  static void haswell_init_clock_gating(struct drm_device *dev)
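The two review comments can be illustrated with a small model of a masked register write. This sketch assumes HDC_CHICKEN0 follows the usual i915 masked-register convention, where the upper 16 bits of the written value select which of the lower 16 bits the hardware updates. The helper names and `apply_masked_write` (a software model of the hardware behaviour) are hypothetical; the register offset and bit come from the quoted patch.

```c
#include <stdint.h>

/* Values from the quoted patch. */
#define HDC_CHICKEN0            0x7300u
#define HDC_FORCE_NON_COHERENT  (1u << 4)

/* Value that sets 'bit' in a masked register: the same bit is repeated
 * in the upper half as its write-enable mask. */
static inline uint32_t masked_bit_enable(uint32_t bit)
{
    return (bit << 16) | bit;
}

/* Value that clears 'bit': only the mask bit is set. */
static inline uint32_t masked_bit_disable(uint32_t bit)
{
    return bit << 16;
}

/* Hypothetical model of the hardware side: only the low bits whose
 * mask bit is set in 'value' are updated; the rest keep their state. */
static inline uint32_t apply_masked_write(uint32_t old, uint32_t value)
{
    uint32_t mask = value >> 16;

    return (old & 0xffffu & ~mask) | (value & mask);
}
```

Under this model, writing plain `HDC_FORCE_NON_COHERENT` without its mask bit changes nothing, which is why the second comment matters as much as targeting the right register offset (`HDC_CHICKEN0`) in the first place.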
Re: [Intel-gfx] [PATCH 1/2] rendercopy/bdw: Emit 3DSTATE_WM_HZ_OP.
On Tue, 2013-12-10 at 09:04 -0800, Kenneth Graunke wrote:
> On 12/10/2013 03:40 AM, Damien Lespiau wrote:
> > On Mon, Dec 09, 2013 at 11:29:35PM -0800, Kenneth Graunke wrote:
> > > We don't want depth/stencil fast clears or HiZ resolves; we want
> > > normal drawing. Without this, the pixel pipeline doesn't work.
> > >
> > > Signed-off-by: Kenneth Graunke <kenn...@whitecape.org>
> > > Cc: Ben Widawsky <b...@bwidawsk.net>
> > > Cc: Damien Lespiau <damien.lesp...@intel.com>
> >
> > Both patches reviewed and pushed, thanks a lot for doing this.
> >
> > Does it mean rendercopy run for you now? (I don't have silicon to test
> > myself).
> >
> > I've also taught my command parser to warn harder about instruction
> > lengths, it was missing the cases you fixed.
>
> It still hangs for me. I also tried Haihao's patch, though I guess I
> didn't try both together...

Now gem_render_copy works fine for me without my workaround.

Thanks
Haihao

> --Ken
Re: [Intel-gfx] [Intel gfx][i-g-t PATCH 3/4] rendercopy/bdw: A workaround for 3D pipeline
On Fri, 2013-12-06 at 13:30 +, Damien Lespiau wrote:
> On Fri, Dec 06, 2013 at 04:54:46PM +0800, Xiang, Haihao wrote:
> > From: "Xiang, Haihao" <haihao.xi...@intel.com>
> >
> > Emit PIPELINE_SELECT twice and make sure the commands in the first
> > batch buffer have been done. However I don't know why this works !!!
>
> Hum :) on one hand, it's great that you found this w/a, on the other
> hand, I'm not comfortable with not understanding why this works.

Thanks for the comments, actually I am not comfortable with it either.
gem_render_copy passed after I happened to run gem_media_fill first, so I
am curious which setting in gem_media_fill impacts the result. Finally I
found it works if I emit PIPELINE_SELECT in a separate batch first.

> So far what we know (I don't have silicon, so I can't test anything):
>   - Ken was saying that mesa doesn't need this.
>   - There are a bunch of W/A around FF units clock gating, might be
>     worth checking that we're not hitting WaDisableFfDopClockGating or
>     one of those 3D Vs GPGPU pipelines ones. This could happen to you
>     but not to Ken because you have been switching between 3D and media
>     pipeline with the 2 igt tests.
>   - In any case, doing a pass on the W/A sounds like a good idea
>   - I'd be interested to know if there is an even more minimal batch
>     that works (say an empty batch), or if the active ingredient is the
>     pipeline switch.

Oh, it works even with a batch which has only MI_BATCH_BUFFER_END.

> If people want to push the patch to make progress on other parts, I
> guess that's fine, but we'll need to dig deeper here.

Agree, we should look into the issue to find the real root cause.
[Intel-gfx] [Intel gfx][i-g-t PATCH 2/4] rendercopy/bdw: Set Instruction Buffer size Modify Enable to 1
From: "Xiang, Haihao" <haihao.xi...@intel.com>

Otherwise it may result in GPU hang

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/rendercopy_gen8.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/rendercopy_gen8.c b/lib/rendercopy_gen8.c
index 43e962c..1a137dd 100644
--- a/lib/rendercopy_gen8.c
+++ b/lib/rendercopy_gen8.c
@@ -526,7 +526,7 @@ gen8_emit_state_base_address(struct intel_batchbuffer *batch) {
     /* indirect object buffer size */
     OUT_BATCH(0xf000 | 1);
     /* intruction buffer size */
-    OUT_BATCH(1 << 12);
+    OUT_BATCH(1 << 12 | 1);
 }

 static void
--
1.7.9.5
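A short sketch of what the one-character change does to the emitted dword. It assumes the instruction buffer size dword of STATE_BASE_ADDRESS carries the size in bits 31:12 and a "modify enable" flag in bit 0; that layout and the helper name `buffer_size_dword` are assumptions made for illustration, not taken from the patch.

```c
#include <stdint.h>

/* Assumed layout of a buffer-size dword:
 *   bits 31:12  size field
 *   bit  0      modify enable (hardware ignores the size when clear) */
#define BUFFER_SIZE_SHIFT          12
#define BUFFER_SIZE_MODIFY_ENABLE  (1u << 0)

/* Build the dword; without the modify-enable bit the new size is not
 * taken into account by the hardware. */
static uint32_t buffer_size_dword(uint32_t size_field, int modify_enable)
{
    uint32_t dword = size_field << BUFFER_SIZE_SHIFT;

    if (modify_enable)
        dword |= BUFFER_SIZE_MODIFY_ENABLE;
    return dword;
}
```

With these assumptions, `buffer_size_dword(1, 0)` corresponds to the old `OUT_BATCH(1 << 12)` and `buffer_size_dword(1, 1)` to the fixed `OUT_BATCH(1 << 12 | 1)`.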
[Intel-gfx] [Intel gfx][i-g-t PATCH 3/4] rendercopy/bdw: A workaround for 3D pipeline
From: "Xiang, Haihao" <haihao.xi...@intel.com>

Emit PIPELINE_SELECT twice and make sure the commands in the first batch
buffer have been done. However I don't know why this works !!!

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/rendercopy_gen8.c |   19 +++++++++++++++++--
 1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/lib/rendercopy_gen8.c b/lib/rendercopy_gen8.c
index 1a137dd..6eb1051 100644
--- a/lib/rendercopy_gen8.c
+++ b/lib/rendercopy_gen8.c
@@ -148,7 +148,8 @@ batch_copy(struct intel_batchbuffer *batch, const void *ptr, uint32_t size, uint

 static void
 gen6_render_flush(struct intel_batchbuffer *batch,
-          drm_intel_context *context, uint32_t batch_end)
+          drm_intel_context *context, uint32_t batch_end,
+          int waiting)
 {
     int ret;

@@ -157,6 +158,11 @@ gen6_render_flush(struct intel_batchbuffer *batch,
         ret = drm_intel_gem_bo_context_exec(batch->bo, context,
                             batch_end, 0);
     assert(ret == 0);
+
+    if (waiting) {
+        dri_bo_map(batch->bo, 0);
+        dri_bo_unmap(batch->bo);
+    }
 }

 /* Mostly copy+paste from gen6, except height, width, pitch moved */
@@ -880,6 +886,15 @@ void gen8_render_copyfunc(struct intel_batchbuffer *batch,

     intel_batchbuffer_flush_with_context(batch, context);

+    /* I don't know why it works !!! */
+    batch->ptr = batch->buffer;
+    OUT_BATCH(GEN6_PIPELINE_SELECT | PIPELINE_SELECT_3D);
+    OUT_BATCH(MI_BATCH_BUFFER_END);
+    batch_end = batch_align(batch, 8);
+    assert(batch_end < BATCH_STATE_SPLIT);
+    gen6_render_flush(batch, context, batch_end, 1);
+    intel_batchbuffer_reset(batch);
+
     batch_align(batch, 8);

     batch->ptr = &batch->buffer[BATCH_STATE_SPLIT];
@@ -968,6 +983,6 @@ void gen8_render_copyfunc(struct intel_batchbuffer *batch,

     annotation_flush(aub_annotations, batch);

-    gen6_render_flush(batch, context, batch_end);
+    gen6_render_flush(batch, context, batch_end, 0);
     intel_batchbuffer_reset(batch);
 }
--
1.7.9.5
[Intel-gfx] [Intel gfx][i-g-t PATCH 4/4] Revert "gen8 rendercpy: temporarily disable"
From: "Xiang, Haihao" <haihao.xi...@intel.com>

This reverts commit e41928e6c9bb3f24833a827903f1afeda83592d6.

Now the case no longer causes GPU hang on GEN
---
 lib/rendercopy_i830.c |    6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/lib/rendercopy_i830.c b/lib/rendercopy_i830.c
index 5dd67b2..73edcfa 100644
--- a/lib/rendercopy_i830.c
+++ b/lib/rendercopy_i830.c
@@ -241,10 +241,8 @@ render_copyfunc_t get_render_copyfunc(int devid)
         copy = gen6_render_copyfunc;
     else if (IS_GEN7(devid))
         copy = gen7_render_copyfunc;
-    else if (IS_GEN8(devid)) {
-        fprintf(stderr, "Temporarily disabled\n");
-        //copy = gen8_render_copyfunc;
-    }
+    else if (IS_GEN8(devid))
+        copy = gen8_render_copyfunc;

     return copy;
 }
--
1.7.9.5
[Intel-gfx] [Intel gfx][i-g-t PATCH 1/4] lib: Clean the batch buffer store after reset
From: "Xiang, Haihao" <haihao.xi...@intel.com>

Otherwise the stale data remains in the buffer

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/intel_batchbuffer.c |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/lib/intel_batchbuffer.c b/lib/intel_batchbuffer.c
index 06a5437..9ce7424 100644
--- a/lib/intel_batchbuffer.c
+++ b/lib/intel_batchbuffer.c
@@ -50,6 +50,8 @@ intel_batchbuffer_reset(struct intel_batchbuffer *batch)
     batch->bo = drm_intel_bo_alloc(batch->bufmgr, "batchbuffer",
                        BATCH_SZ, 4096);

+    memset(batch->buffer, 0, sizeof(batch->buffer));
+
     batch->ptr = batch->buffer;
 }

--
1.7.9.5
[Intel-gfx] [Intel gfx][i-g-t PATCH (v3) 1/4] tests: add gem_media_fill
From: Xiang, Haihao haihao.xi...@intel.com It is to check whether media pipeline on render ring works. Codes are copied and modified from the rendercopy case which uses 3D pipeline. However media pipeline is simpler than 3D pipeline and there is few changes between gen6,gen7 and gen8 Reviewed-by: Zhao Yakui yakui.z...@intel.com Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- lib/Makefile.sources |2 + lib/media_fill.c |9 lib/media_fill.h | 50 ++ tests/Makefile.sources |1 + tests/gem_media_fill.c | 132 5 files changed, 194 insertions(+) create mode 100644 lib/media_fill.c create mode 100644 lib/media_fill.h create mode 100644 tests/gem_media_fill.c diff --git a/lib/Makefile.sources b/lib/Makefile.sources index 699621b..cad238a 100644 --- a/lib/Makefile.sources +++ b/lib/Makefile.sources @@ -19,6 +19,8 @@ libintel_tools_la_SOURCES = \ intel_mmio.c\ intel_pci.c \ intel_reg.h \ + media_fill.c\ + media_fill.h\ rendercopy_i915.c \ rendercopy_i830.c \ gen6_render.h \ diff --git a/lib/media_fill.c b/lib/media_fill.c new file mode 100644 index 000..8ee5db6 --- /dev/null +++ b/lib/media_fill.c @@ -0,0 +1,9 @@ +#include i830_reg.h +#include media_fill.h + +media_fillfunc_t get_media_fillfunc(int devid) +{ + media_fillfunc_t fill = NULL; + + return fill; +} diff --git a/lib/media_fill.h b/lib/media_fill.h new file mode 100644 index 000..2e058cb --- /dev/null +++ b/lib/media_fill.h @@ -0,0 +1,50 @@ +#ifndef RENDE_MEDIA_FILL_H +#define RENDE_MEDIA_FILL_H + +#include stdlib.h +#include sys/ioctl.h +#include stdio.h +#include string.h +#include assert.h +#include fcntl.h +#include inttypes.h +#include errno.h +#include sys/stat.h +#include sys/time.h +#include getopt.h +#include drm.h +#include i915_drm.h +#include drmtest.h +#include intel_bufmgr.h +#include intel_batchbuffer.h +#include intel_gpu_tools.h + +struct scratch_buf { +drm_intel_bo *bo; +uint32_t stride; +uint32_t tiling; +uint32_t *data; +uint32_t *cpu_mapping; +uint32_t size; +unsigned num_tiles; +}; + +static 
inline unsigned buf_width(struct scratch_buf *buf) +{ + return buf-stride/sizeof(uint8_t); +} + +static inline unsigned buf_height(struct scratch_buf *buf) +{ + return buf-size/buf-stride; +} + +typedef void (*media_fillfunc_t)(struct intel_batchbuffer *batch, + struct scratch_buf *dst, + unsigned x, unsigned y, + unsigned width, unsigned height, + uint8_t color); + +media_fillfunc_t get_media_fillfunc(int devid); + +#endif /* RENDE_MEDIA_FILL_H */ diff --git a/tests/Makefile.sources b/tests/Makefile.sources index d201809..0ff0e37 100644 --- a/tests/Makefile.sources +++ b/tests/Makefile.sources @@ -87,6 +87,7 @@ TESTS_progs = \ gem_largeobject \ gem_lut_handle \ gem_mmap_offset_exhaustion \ + gem_media_fill \ gem_pin \ gem_pipe_control_store_loop \ gem_reg_read \ diff --git a/tests/gem_media_fill.c b/tests/gem_media_fill.c new file mode 100644 index 000..40b391d --- /dev/null +++ b/tests/gem_media_fill.c @@ -0,0 +1,132 @@ +/* + * Copyright © 2013 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. 
IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. + * + * Authors: + *Damien Lespiau damien.lesp...@intel.com + *Xiang, Haihao haihao.xi...@intel.com + */ + +/* + * This file is a basic test for the media_fill() function, a very simple + * workload for the Media pipeline. + */ + +#include stdbool.h +#include unistd.h +#include cairo.h + +#include media_fill.h + +#define WIDTH 64 +#define STRIDE (WIDTH) +#define HEIGHT 64 +#define SIZE
[Intel-gfx] [Intel gfx][i-g-t PATCH (v3) 2/4] tests/gem_media_fill: add support for gen8
From: Xiang, Haihao haihao.xi...@intel.com v2: Fixed the source register used for the send with EOT Fixed the posted destination operand for the send with EOT v3: Workaround: Insert MEDIA_STATE_FLUSH after MEDIA_OBJECT. Fixed the cache agent used in media_block_write message Set Instruction Buffer size Modify Enable to 1, otherwise it may result in GPU hang Reviewed-by: Zhao Yakui yakui.z...@intel.com Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- lib/Makefile.sources |2 + lib/gen8_media.h | 372 lib/media_fill.c |3 + lib/media_fill.h |7 + lib/media_fill_gen8.c | 374 + 5 files changed, 758 insertions(+) create mode 100644 lib/gen8_media.h create mode 100644 lib/media_fill_gen8.c diff --git a/lib/Makefile.sources b/lib/Makefile.sources index cad238a..95ccb2f 100644 --- a/lib/Makefile.sources +++ b/lib/Makefile.sources @@ -21,6 +21,8 @@ libintel_tools_la_SOURCES = \ intel_reg.h \ media_fill.c\ media_fill.h\ + media_fill_gen8.c \ + gen8_media.h\ rendercopy_i915.c \ rendercopy_i830.c \ gen6_render.h \ diff --git a/lib/gen8_media.h b/lib/gen8_media.h new file mode 100644 index 000..b890df4 --- /dev/null +++ b/lib/gen8_media.h @@ -0,0 +1,372 @@ +#ifndef GEN8_MEDIA_H +#define GEN8_MEDIA_H + +#define GEN8_SURFACEFORMAT_R32G32B32A32_FLOAT 0x000 +#define GEN8_SURFACEFORMAT_R32G32B32A32_SINT 0x001 +#define GEN8_SURFACEFORMAT_R32G32B32A32_UINT 0x002 +#define GEN8_SURFACEFORMAT_R32G32B32A32_UNORM 0x003 +#define GEN8_SURFACEFORMAT_R32G32B32A32_SNORM 0x004 +#define GEN8_SURFACEFORMAT_R64G64_FLOAT 0x005 +#define GEN8_SURFACEFORMAT_R32G32B32X32_FLOAT 0x006 +#define GEN8_SURFACEFORMAT_R32G32B32A32_SSCALED 0x007 +#define GEN8_SURFACEFORMAT_R32G32B32A32_USCALED 0x008 +#define GEN8_SURFACEFORMAT_R32G32B32_FLOAT0x040 +#define GEN8_SURFACEFORMAT_R32G32B32_SINT 0x041 +#define GEN8_SURFACEFORMAT_R32G32B32_UINT 0x042 +#define GEN8_SURFACEFORMAT_R32G32B32_UNORM0x043 +#define GEN8_SURFACEFORMAT_R32G32B32_SNORM0x044 +#define GEN8_SURFACEFORMAT_R32G32B32_SSCALED 0x045 +#define 
GEN8_SURFACEFORMAT_R32G32B32_USCALED 0x046 +#define GEN8_SURFACEFORMAT_R16G16B16A16_UNORM 0x080 +#define GEN8_SURFACEFORMAT_R16G16B16A16_SNORM 0x081 +#define GEN8_SURFACEFORMAT_R16G16B16A16_SINT 0x082 +#define GEN8_SURFACEFORMAT_R16G16B16A16_UINT 0x083 +#define GEN8_SURFACEFORMAT_R16G16B16A16_FLOAT 0x084 +#define GEN8_SURFACEFORMAT_R32G32_FLOAT 0x085 +#define GEN8_SURFACEFORMAT_R32G32_SINT0x086 +#define GEN8_SURFACEFORMAT_R32G32_UINT0x087 +#define GEN8_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS 0x088 +#define GEN8_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT0x089 +#define GEN8_SURFACEFORMAT_L32A32_FLOAT 0x08A +#define GEN8_SURFACEFORMAT_R32G32_UNORM 0x08B +#define GEN8_SURFACEFORMAT_R32G32_SNORM 0x08C +#define GEN8_SURFACEFORMAT_R64_FLOAT 0x08D +#define GEN8_SURFACEFORMAT_R16G16B16X16_UNORM 0x08E +#define GEN8_SURFACEFORMAT_R16G16B16X16_FLOAT 0x08F +#define GEN8_SURFACEFORMAT_A32X32_FLOAT 0x090 +#define GEN8_SURFACEFORMAT_L32X32_FLOAT 0x091 +#define GEN8_SURFACEFORMAT_I32X32_FLOAT 0x092 +#define GEN8_SURFACEFORMAT_R16G16B16A16_SSCALED 0x093 +#define GEN8_SURFACEFORMAT_R16G16B16A16_USCALED 0x094 +#define GEN8_SURFACEFORMAT_R32G32_SSCALED 0x095 +#define GEN8_SURFACEFORMAT_R32G32_USCALED 0x096 +#define GEN8_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0 +#define GEN8_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB0x0C1 +#define GEN8_SURFACEFORMAT_R10G10B10A2_UNORM 0x0C2 +#define GEN8_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB 0x0C3 +#define GEN8_SURFACEFORMAT_R10G10B10A2_UINT 0x0C4 +#define GEN8_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM 0x0C5 +#define GEN8_SURFACEFORMAT_R8G8B8A8_UNORM 0x0C7 +#define GEN8_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB0x0C8 +#define GEN8_SURFACEFORMAT_R8G8B8A8_SNORM 0x0C9 +#define GEN8_SURFACEFORMAT_R8G8B8A8_SINT 0x0CA +#define GEN8_SURFACEFORMAT_R8G8B8A8_UINT 0x0CB +#define GEN8_SURFACEFORMAT_R16G16_UNORM 0x0CC +#define GEN8_SURFACEFORMAT_R16G16_SNORM 0x0CD +#define GEN8_SURFACEFORMAT_R16G16_SINT0x0CE +#define GEN8_SURFACEFORMAT_R16G16_UINT
[Intel-gfx] [Intel gfx][i-g-t PATCH (v3) 3/4] tests/gem_media_fill: add support for gen7
From: Xiang, Haihao haihao.xi...@intel.com v2: Fixed the source register used for the send with EOT Fixed the posted destination operand for the send with EOT Reviewed-by: Zhao Yakui yakui.z...@intel.com Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- lib/Makefile.sources |2 + lib/gen7_media.h | 323 + lib/media_fill.c |2 + lib/media_fill.h |7 + lib/media_fill_gen7.c | 351 + 5 files changed, 685 insertions(+) create mode 100644 lib/gen7_media.h create mode 100644 lib/media_fill_gen7.c diff --git a/lib/Makefile.sources b/lib/Makefile.sources index 95ccb2f..fd08c1f 100644 --- a/lib/Makefile.sources +++ b/lib/Makefile.sources @@ -21,7 +21,9 @@ libintel_tools_la_SOURCES = \ intel_reg.h \ media_fill.c\ media_fill.h\ + media_fill_gen7.c \ media_fill_gen8.c \ + gen7_media.h\ gen8_media.h\ rendercopy_i915.c \ rendercopy_i830.c \ diff --git a/lib/gen7_media.h b/lib/gen7_media.h new file mode 100644 index 000..d75ee1b --- /dev/null +++ b/lib/gen7_media.h @@ -0,0 +1,323 @@ +#ifndef GEN7_MEDIA_H +#define GEN7_MEDIA_H + +#define GEN7_SURFACEFORMAT_R32G32B32A32_FLOAT 0x000 +#define GEN7_SURFACEFORMAT_R32G32B32A32_SINT 0x001 +#define GEN7_SURFACEFORMAT_R32G32B32A32_UINT 0x002 +#define GEN7_SURFACEFORMAT_R32G32B32A32_UNORM 0x003 +#define GEN7_SURFACEFORMAT_R32G32B32A32_SNORM 0x004 +#define GEN7_SURFACEFORMAT_R64G64_FLOAT 0x005 +#define GEN7_SURFACEFORMAT_R32G32B32X32_FLOAT 0x006 +#define GEN7_SURFACEFORMAT_R32G32B32A32_SSCALED 0x007 +#define GEN7_SURFACEFORMAT_R32G32B32A32_USCALED 0x008 +#define GEN7_SURFACEFORMAT_R32G32B32_FLOAT0x040 +#define GEN7_SURFACEFORMAT_R32G32B32_SINT 0x041 +#define GEN7_SURFACEFORMAT_R32G32B32_UINT 0x042 +#define GEN7_SURFACEFORMAT_R32G32B32_UNORM0x043 +#define GEN7_SURFACEFORMAT_R32G32B32_SNORM0x044 +#define GEN7_SURFACEFORMAT_R32G32B32_SSCALED 0x045 +#define GEN7_SURFACEFORMAT_R32G32B32_USCALED 0x046 +#define GEN7_SURFACEFORMAT_R16G16B16A16_UNORM 0x080 +#define GEN7_SURFACEFORMAT_R16G16B16A16_SNORM 0x081 +#define 
GEN7_SURFACEFORMAT_R16G16B16A16_SINT 0x082 +#define GEN7_SURFACEFORMAT_R16G16B16A16_UINT 0x083 +#define GEN7_SURFACEFORMAT_R16G16B16A16_FLOAT 0x084 +#define GEN7_SURFACEFORMAT_R32G32_FLOAT 0x085 +#define GEN7_SURFACEFORMAT_R32G32_SINT0x086 +#define GEN7_SURFACEFORMAT_R32G32_UINT0x087 +#define GEN7_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS 0x088 +#define GEN7_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT0x089 +#define GEN7_SURFACEFORMAT_L32A32_FLOAT 0x08A +#define GEN7_SURFACEFORMAT_R32G32_UNORM 0x08B +#define GEN7_SURFACEFORMAT_R32G32_SNORM 0x08C +#define GEN7_SURFACEFORMAT_R64_FLOAT 0x08D +#define GEN7_SURFACEFORMAT_R16G16B16X16_UNORM 0x08E +#define GEN7_SURFACEFORMAT_R16G16B16X16_FLOAT 0x08F +#define GEN7_SURFACEFORMAT_A32X32_FLOAT 0x090 +#define GEN7_SURFACEFORMAT_L32X32_FLOAT 0x091 +#define GEN7_SURFACEFORMAT_I32X32_FLOAT 0x092 +#define GEN7_SURFACEFORMAT_R16G16B16A16_SSCALED 0x093 +#define GEN7_SURFACEFORMAT_R16G16B16A16_USCALED 0x094 +#define GEN7_SURFACEFORMAT_R32G32_SSCALED 0x095 +#define GEN7_SURFACEFORMAT_R32G32_USCALED 0x096 +#define GEN7_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0 +#define GEN7_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB0x0C1 +#define GEN7_SURFACEFORMAT_R10G10B10A2_UNORM 0x0C2 +#define GEN7_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB 0x0C3 +#define GEN7_SURFACEFORMAT_R10G10B10A2_UINT 0x0C4 +#define GEN7_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM 0x0C5 +#define GEN7_SURFACEFORMAT_R8G8B8A8_UNORM 0x0C7 +#define GEN7_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB0x0C8 +#define GEN7_SURFACEFORMAT_R8G8B8A8_SNORM 0x0C9 +#define GEN7_SURFACEFORMAT_R8G8B8A8_SINT 0x0CA +#define GEN7_SURFACEFORMAT_R8G8B8A8_UINT 0x0CB +#define GEN7_SURFACEFORMAT_R16G16_UNORM 0x0CC +#define GEN7_SURFACEFORMAT_R16G16_SNORM 0x0CD +#define GEN7_SURFACEFORMAT_R16G16_SINT0x0CE +#define GEN7_SURFACEFORMAT_R16G16_UINT0x0CF +#define GEN7_SURFACEFORMAT_R16G16_FLOAT 0x0D0 +#define GEN7_SURFACEFORMAT_B10G10R10A2_UNORM 0x0D1 +#define
[Intel-gfx] [Intel gfx][i-g-t PATCH (v3) 4/4] tests/gem_media_fill: the assembly code for the shader used in the case
From: Xiang, Haihao haihao.xi...@intel.com The code is for reference only v2: Fixed the source register used for the send with EOT Fixed the posted destination operand for the send with EOT v3: Fixed the cache agent used in media_block_write message on GEN8 Reviewed-by: Zhao Yakui yakui.z...@intel.com Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- shaders/media/README |5 + shaders/media/media_fill.gxa | 44 ++ 2 files changed, 49 insertions(+) create mode 100644 shaders/media/README create mode 100644 shaders/media/media_fill.gxa diff --git a/shaders/media/README b/shaders/media/README new file mode 100644 index 000..9f29601 --- /dev/null +++ b/shaders/media/README @@ -0,0 +1,5 @@ +These files are here for reference only. + +Commands used to generate the shader on gen8 +$ m4 media_fill.gxa media_fill.gxm +$ intel-gen4asm -g 8 -o output media_fill.gxm diff --git a/shaders/media/media_fill.gxa b/shaders/media/media_fill.gxa new file mode 100644 index 000..7578890 --- /dev/null +++ b/shaders/media/media_fill.gxa @@ -0,0 +1,44 @@ +/* + * Registers + * g0 -- header + * g1 -- constant + * g2 -- inline data + * g3 -- reserved + * g4-g12 payload for write message + */ +define(`ORIG', `g2.02,2,1UD') +define(`COLOR', `g1.0') +define(`COLORUB', `COLOR0,1,0UB') +define(`COLORUD', `COLOR0,1,0UD') + +mov(4) COLOR1UB COLORUB {align1}; + +/* WRITE */ +mov(8) g4.01UD g0.08,8,1UD {align1}; +mov(2) g4.01UD ORIG{align1}; +mov(1) g4.81UD 0x000f000fUD{align1}; + +mov(16) g5.01UD COLORUD {align1 compr}; +mov(16) g7.01UD COLORUD {align1 compr}; +mov(16) g9.01UD COLORUD {align1 compr}; +mov(16) g11.01UD COLORUD {align1 compr}; + +/* + * comment out the following instruction on Gen7 + * write(0, 0, 10, 12) + * 10: media_block_write + * 12: data cache data port 1 + */ +send(16) 4 acc01UW null write(0, 0, 10, 12) mlen 9 rlen 0 {align1}; + +/* + * uncomment the following instruction on Gen7 + * write(0, 0, 10, 0) + * 10: media_block_write + *0: reander cache data port + */ +/* send(16) 
4 acc01UW null write(0, 0, 10, 0) mlen 9 rlen 0 {align1}; */ + +/* EOT */ +mov(8) g112.01UD g0.08,8,1UD {align1}; +send(16) 112 null1UW null thread_spawner(0, 0, 1) mlen 1 rlen 0 {align1 EOT}; -- 1.7.9.5 ___ Intel-gfx mailing list Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
[Intel-gfx] [Intel gfx][assembler][i-g-t PATCH] assembler/bdw: Update write(...)
From: Xiang, Haihao haihao.xi...@intel.com write(...) is used for Render Target Write and Media Block Write. The two message types no longer share the same cache agent on GEN8, So a parameter is needed for cache agent. The 4th parameter of write() is used for write commit bit which has been removed since GEN7. Hence we can re-use the 4th parameter as cache agent on GEN8 Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- assembler/gram.y | 30 -- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/assembler/gram.y b/assembler/gram.y index bdcfe79..ad4cb29 100644 --- a/assembler/gram.y +++ b/assembler/gram.y @@ -1651,7 +1651,20 @@ msgtarget: NULL_TOKEN INTEGER RPAREN { if (IS_GENp(8)) { - gen8_set_sfid(GEN8($$), GEN6_SFID_DATAPORT_RENDER_CACHE); + if ($9 != 0 + $9 != GEN6_SFID_DATAPORT_SAMPLER_CACHE + $9 != GEN6_SFID_DATAPORT_RENDER_CACHE + $9 != GEN6_SFID_DATAPORT_CONSTANT_CACHE + $9 != GEN7_SFID_DATAPORT_DATA_CACHE + $9 != HSW_SFID_DATAPORT_DATA_CACHE1) { + error (@9, error: wrong cache type\n); + } + + if ($9 == 0) + gen8_set_sfid(GEN8($$), GEN6_SFID_DATAPORT_RENDER_CACHE); + else + gen8_set_sfid(GEN8($$), $9); + gen8_set_header_present(GEN8($$), 1); gen8_set_dp_binding_table_index(GEN8($$), $3); gen8_set_dp_message_control(GEN8($$), $5); @@ -1701,7 +1714,20 @@ msgtarget: NULL_TOKEN INTEGER COMMA INTEGER RPAREN { if (IS_GENp(8)) { - gen8_set_sfid(GEN8($$), GEN6_SFID_DATAPORT_RENDER_CACHE); + if ($9 != 0 + $9 != GEN6_SFID_DATAPORT_SAMPLER_CACHE + $9 != GEN6_SFID_DATAPORT_RENDER_CACHE + $9 != GEN6_SFID_DATAPORT_CONSTANT_CACHE + $9 != GEN7_SFID_DATAPORT_DATA_CACHE + $9 != HSW_SFID_DATAPORT_DATA_CACHE1) { + error (@9, error: wrong cache type\n); + } + + if ($9 == 0) + gen8_set_sfid(GEN8($$), GEN6_SFID_DATAPORT_RENDER_CACHE); + else + gen8_set_sfid(GEN8($$), $9); + gen8_set_header_present(GEN8($$), ($11 != 0)); gen8_set_dp_binding_table_index(GEN8($$), $3); gen8_set_dp_message_control(GEN8($$), $5); -- 1.7.9.5 ___ Intel-gfx mailing list 
Intel-gfx@lists.freedesktop.org http://lists.freedesktop.org/mailman/listinfo/intel-gfx
Re: [Intel-gfx] [Intel gfx][i-g-t PATCH 4/4] tests/gem_media_fill: the assembly code for the shader used in the case
On Thu, 2013-11-28 at 23:57 -0700, Xiang, Haihao wrote:
> > From: "Xiang, Haihao" <haihao.xi...@intel.com>
> >
> > The code is for reference only
> >
> > Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
> > ---
> >  shaders/media/README         |    6 ++++++
> >  shaders/media/media_fill.gxa |   30 ++++++++++++++++++++++++++++++
> >  2 files changed, 36 insertions(+)
> >  create mode 100644 shaders/media/README
> >  create mode 100644 shaders/media/media_fill.gxa
> >
> > diff --git a/shaders/media/README b/shaders/media/README
> > new file mode 100644
> > index 000..334106c
> > --- /dev/null
> > +++ b/shaders/media/README
> > @@ -0,0 +1,6 @@
> > +These files are here for reference only.
> > +
> > +Commands used to generate the shader on gen8
> > +$ m4 media_fill.gxa > media_fill.gxm
> > +$ intel-gen4asm -g 8 -o output media_fill.gxm
> > +
> > diff --git a/shaders/media/media_fill.gxa b/shaders/media/media_fill.gxa
> > new file mode 100644
> > index 000..d2931d4
> > --- /dev/null
> > +++ b/shaders/media/media_fill.gxa
> > @@ -0,0 +1,30 @@
> > +/*
> > + * Registers
> > + * g0 -- header
> > + * g1 -- constant
> > + * g2 -- inline data
> > + * g3 -- reserved
> > + * g4-g12 message payload
> > + */
> > +define(`ORIG', `g2.0<2,2,1>UD')
> > +define(`COLOR', `g1.0')
> > +define(`COLORUB', `COLOR<0,1,0>UB')
> > +define(`COLORUD', `COLOR<0,1,0>UD')
> > +
> > +mov(4) COLOR<1>UB COLORUB {align1};
> > +
> > +/* WRITE */
> > +mov(8) g4.0<1>UD g0.0<8,8,1>UD {align1};
> > +mov(2) g4.0<1>UD ORIG {align1};
> > +mov(1) g4.8<1>UD 0x000f000fUD {align1};
> > +
> > +mov(16) g5.0<1>UD COLORUD {align1 compr};
> > +mov(16) g7.0<1>UD COLORUD {align1 compr};
> > +mov(16) g9.0<1>UD COLORUD {align1 compr};
> > +mov(16) g11.0<1>UD COLORUD {align1 compr};
> > +
> > +send(16) 4 acc0<1>UW null write(0, 0, 10, 0) mlen 9 rlen 0 {align1};
> > +
> > +/* EOT */
> > +mov(8) g4.0<1>UD g0.0<8,8,1>UD {align1};
> > +send(16) 4 acc0<1>UW null thread_spawner(0, 0, 1) mlen 1 rlen 0 {align1 EOT};
>
> Based on the spec the send with EOT flag should use the register space
> r112-r127 for src. So 4 had better be changed to 127.

Thanks for pointing out the issue, I will fix it in the new version of
patches.

> Thanks.
>     Yakui
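The register constraint raised in this review can be written down as a small check. This is a sketch only: the function name is hypothetical, and requiring the whole `mlen`-register payload (not just its first register) to fit inside r112-r127 is an assumption layered on top of the reviewer's statement.

```c
#include <stdbool.h>

/* Register window that a send with EOT must use for its source,
 * as stated in the review comment. */
#define EOT_SRC_FIRST_REG 112u
#define EOT_SRC_LAST_REG  127u

/* Check that a payload of 'mlen' registers starting at 'first_reg'
 * lies entirely inside r112-r127 (assumed interpretation). */
static bool eot_send_src_valid(unsigned int first_reg, unsigned int mlen)
{
    if (mlen == 0)
        return false;
    return first_reg >= EOT_SRC_FIRST_REG &&
           first_reg + mlen - 1 <= EOT_SRC_LAST_REG;
}
```

Under this check the original `send(16) 4 ... mlen 1 ... {align1 EOT}` fails, while a one-register payload placed at 112 (as in the later v3 shader) or at 127 passes.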
Re: [Intel-gfx] [Intel gfx][i-g-t PATCH 1/4] tests: add gem_media_fill
On Fri, 2013-11-29 at 09:02 +0100, Daniel Vetter wrote:
> On Fri, Nov 29, 2013 at 02:57:13PM +0800, Xiang, Haihao wrote:
> > From: Xiang, Haihao <haihao.xi...@intel.com>
> >
> > It is to check whether the media pipeline on the render ring works.
> > The code is copied and modified from the rendercopy case, which uses
> > the 3D pipeline. However, the media pipeline is simpler than the 3D
> > pipeline, and there are few changes between gen6, gen7 and gen8.
> >
> > Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
>
> Really awesome. This should also help in writing crazy multi-ring tests
> which check correctness. I don't have any clue about media stuff, so
> please let someone else from your team quickly review it before you
> push. Otherwise lgtm.

Thanks for your comments. I will fix the issue Yakui pointed out first,
then push the code.

> -Daniel
>
> > ---
> >  lib/Makefile.sources   |   2 +
> >  lib/media_fill.c       |   9
> >  lib/media_fill.h       |  50 ++
> >  tests/Makefile.sources |   1 +
> >  tests/gem_media_fill.c | 132
> >  5 files changed, 194 insertions(+)
> >  create mode 100644 lib/media_fill.c
> >  create mode 100644 lib/media_fill.h
> >  create mode 100644 tests/gem_media_fill.c
> >
> > diff --git a/lib/Makefile.sources b/lib/Makefile.sources
> > index 699621b..cad238a 100644
> > --- a/lib/Makefile.sources
> > +++ b/lib/Makefile.sources
> > @@ -19,6 +19,8 @@ libintel_tools_la_SOURCES = \
> >  	intel_mmio.c \
> >  	intel_pci.c \
> >  	intel_reg.h \
> > +	media_fill.c \
> > +	media_fill.h \
> >  	rendercopy_i915.c \
> >  	rendercopy_i830.c \
> >  	gen6_render.h \
> > diff --git a/lib/media_fill.c b/lib/media_fill.c
> > new file mode 100644
> > index 000..8ee5db6
> > --- /dev/null
> > +++ b/lib/media_fill.c
> > @@ -0,0 +1,9 @@
> > +#include "i830_reg.h"
> > +#include "media_fill.h"
> > +
> > +media_fillfunc_t get_media_fillfunc(int devid)
> > +{
> > +	media_fillfunc_t fill = NULL;
> > +
> > +	return fill;
> > +}
> > diff --git a/lib/media_fill.h b/lib/media_fill.h
> > new file mode 100644
> > index 000..2e058cb
> > --- /dev/null
> > +++ b/lib/media_fill.h
> > @@ -0,0 +1,50 @@
> > +#ifndef RENDE_MEDIA_FILL_H
> > +#define RENDE_MEDIA_FILL_H
> > +
> > +#include <stdlib.h>
> > +#include <sys/ioctl.h>
> > +#include <stdio.h>
> > +#include <string.h>
> > +#include <assert.h>
> > +#include <fcntl.h>
> > +#include <inttypes.h>
> > +#include <errno.h>
> > +#include <sys/stat.h>
> > +#include <sys/time.h>
> > +#include <getopt.h>
> > +#include "drm.h"
> > +#include "i915_drm.h"
> > +#include "drmtest.h"
> > +#include "intel_bufmgr.h"
> > +#include "intel_batchbuffer.h"
> > +#include "intel_gpu_tools.h"
> > +
> > +struct scratch_buf {
> > +    drm_intel_bo *bo;
> > +    uint32_t stride;
> > +    uint32_t tiling;
> > +    uint32_t *data;
> > +    uint32_t *cpu_mapping;
> > +    uint32_t size;
> > +    unsigned num_tiles;
> > +};
> > +
> > +static inline unsigned buf_width(struct scratch_buf *buf)
> > +{
> > +	return buf->stride/sizeof(uint8_t);
> > +}
> > +
> > +static inline unsigned buf_height(struct scratch_buf *buf)
> > +{
> > +	return buf->size/buf->stride;
> > +}
> > +
> > +typedef void (*media_fillfunc_t)(struct intel_batchbuffer *batch,
> > +                                 struct scratch_buf *dst,
> > +                                 unsigned x, unsigned y,
> > +                                 unsigned width, unsigned height,
> > +                                 uint8_t color);
> > +
> > +media_fillfunc_t get_media_fillfunc(int devid);
> > +
> > +#endif /* RENDE_MEDIA_FILL_H */
> > diff --git a/tests/Makefile.sources b/tests/Makefile.sources
> > index d201809..0ff0e37 100644
> > --- a/tests/Makefile.sources
> > +++ b/tests/Makefile.sources
> > @@ -87,6 +87,7 @@ TESTS_progs = \
> >  	gem_largeobject \
> >  	gem_lut_handle \
> >  	gem_mmap_offset_exhaustion \
> > +	gem_media_fill \
> >  	gem_pin \
> >  	gem_pipe_control_store_loop \
> >  	gem_reg_read \
> > diff --git a/tests/gem_media_fill.c b/tests/gem_media_fill.c
> > new file mode 100644
> > index 000..40b391d
> > --- /dev/null
> > +++ b/tests/gem_media_fill.c
> > @@ -0,0 +1,132 @@
> > +/*
> > + * Copyright © 2013 Intel Corporation
> > + *
> > + * Permission is hereby granted, free of charge, to any person obtaining a
> > + * copy of this software and associated documentation files (the "Software"),
> > + * to deal in the Software without restriction, including without limitation
> > + * the rights to use, copy, modify, merge, publish, distribute, sublicense,
> > + * and/or sell copies of the Software, and to permit persons to whom the
> > + * Software is furnished to do so, subject to the following conditions:
> > + *
> > + * The above copyright notice and this permission notice (including the next
> > + * paragraph) shall be included in all copies or substantial portions of the
> > + * Software.
> > + *
> > + * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
> > + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
> > + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
> > + * THE AUTHORS OR COPYRIGHT
[Intel-gfx] [Intel gfx][i-g-t PATCH (v2) 1/4] tests: add gem_media_fill
From: Xiang, Haihao <haihao.xi...@intel.com>

It is to check whether the media pipeline on the render ring works.
The code is copied and modified from the rendercopy case, which uses
the 3D pipeline. However, the media pipeline is simpler than the 3D
pipeline, and there are few changes between gen6, gen7 and gen8.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/Makefile.sources   |   2 +
 lib/media_fill.c       |   9
 lib/media_fill.h       |  50 ++
 tests/Makefile.sources |   1 +
 tests/gem_media_fill.c | 132
 5 files changed, 194 insertions(+)
 create mode 100644 lib/media_fill.c
 create mode 100644 lib/media_fill.h
 create mode 100644 tests/gem_media_fill.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 699621b..cad238a 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -19,6 +19,8 @@ libintel_tools_la_SOURCES = \
 	intel_mmio.c \
 	intel_pci.c \
 	intel_reg.h \
+	media_fill.c \
+	media_fill.h \
 	rendercopy_i915.c \
 	rendercopy_i830.c \
 	gen6_render.h \
diff --git a/lib/media_fill.c b/lib/media_fill.c
new file mode 100644
index 000..8ee5db6
--- /dev/null
+++ b/lib/media_fill.c
@@ -0,0 +1,9 @@
+#include "i830_reg.h"
+#include "media_fill.h"
+
+media_fillfunc_t get_media_fillfunc(int devid)
+{
+	media_fillfunc_t fill = NULL;
+
+	return fill;
+}
diff --git a/lib/media_fill.h b/lib/media_fill.h
new file mode 100644
index 000..2e058cb
--- /dev/null
+++ b/lib/media_fill.h
@@ -0,0 +1,50 @@
+#ifndef RENDE_MEDIA_FILL_H
+#define RENDE_MEDIA_FILL_H
+
+#include <stdlib.h>
+#include <sys/ioctl.h>
+#include <stdio.h>
+#include <string.h>
+#include <assert.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <getopt.h>
+#include "drm.h"
+#include "i915_drm.h"
+#include "drmtest.h"
+#include "intel_bufmgr.h"
+#include "intel_batchbuffer.h"
+#include "intel_gpu_tools.h"
+
+struct scratch_buf {
+    drm_intel_bo *bo;
+    uint32_t stride;
+    uint32_t tiling;
+    uint32_t *data;
+    uint32_t *cpu_mapping;
+    uint32_t size;
+    unsigned num_tiles;
+};
+
+static inline unsigned buf_width(struct scratch_buf *buf)
+{
+	return buf->stride/sizeof(uint8_t);
+}
+
+static inline unsigned buf_height(struct scratch_buf *buf)
+{
+	return buf->size/buf->stride;
+}
+
+typedef void (*media_fillfunc_t)(struct intel_batchbuffer *batch,
+                                 struct scratch_buf *dst,
+                                 unsigned x, unsigned y,
+                                 unsigned width, unsigned height,
+                                 uint8_t color);
+
+media_fillfunc_t get_media_fillfunc(int devid);
+
+#endif /* RENDE_MEDIA_FILL_H */
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index d201809..0ff0e37 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -87,6 +87,7 @@ TESTS_progs = \
 	gem_largeobject \
 	gem_lut_handle \
 	gem_mmap_offset_exhaustion \
+	gem_media_fill \
 	gem_pin \
 	gem_pipe_control_store_loop \
 	gem_reg_read \
diff --git a/tests/gem_media_fill.c b/tests/gem_media_fill.c
new file mode 100644
index 000..40b391d
--- /dev/null
+++ b/tests/gem_media_fill.c
@@ -0,0 +1,132 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Damien Lespiau <damien.lesp...@intel.com>
+ *    Xiang, Haihao <haihao.xi...@intel.com>
+ */
+
+/*
+ * This file is a basic test for the media_fill() function, a very simple
+ * workload for the Media pipeline.
+ */
+
+#include <stdbool.h>
+#include <unistd.h>
+#include <cairo.h>
+
+#include "media_fill.h"
+
+#define WIDTH 64
+#define STRIDE (WIDTH)
+#define HEIGHT 64
+#define SIZE (HEIGHT*STRIDE)
+
+#define COLOR_C4 0xc4
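Editorial note: the rest of the test is cut off in this archive, so the following is only a CPU-side sketch of the kind of check a fill test like this could perform, not the code from the patch. It reuses the WIDTH, STRIDE, HEIGHT, SIZE and COLOR_C4 macros shown above. The helper names cpu_fill and count_mismatches and the background value COLOR_4C are hypothetical; in the real test the fill would be done by the function returned from get_media_fillfunc() on the GPU.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define WIDTH 64
#define STRIDE (WIDTH)
#define HEIGHT 64
#define SIZE (HEIGHT*STRIDE)

#define COLOR_C4 0xc4
#define COLOR_4C 0x4c /* assumed background value, not shown in the excerpt */

/* Hypothetical CPU stand-in taking the same geometry arguments as
 * media_fillfunc_t: fill a width x height rectangle at (x, y). */
static void cpu_fill(uint8_t *buf, unsigned stride,
                     unsigned x, unsigned y,
                     unsigned width, unsigned height, uint8_t color)
{
    assert(x + width <= stride);
    for (unsigned row = y; row < y + height; row++)
        memset(buf + row * stride + x, color, width);
}

/* Count bytes that differ from the expected picture: `color` inside the
 * rectangle, `background` everywhere else. */
static unsigned count_mismatches(const uint8_t *buf, unsigned stride,
                                 unsigned rows,
                                 unsigned x, unsigned y,
                                 unsigned width, unsigned height,
                                 uint8_t color, uint8_t background)
{
    unsigned bad = 0;

    for (unsigned row = 0; row < rows; row++) {
        for (unsigned col = 0; col < stride; col++) {
            int inside = col >= x && col < x + width &&
                         row >= y && row < y + height;
            uint8_t expected = inside ? color : background;

            if (buf[row * stride + col] != expected)
                bad++;
        }
    }
    return bad;
}
```

A caller would clear a SIZE-byte buffer to COLOR_4C, fill the top-left WIDTH/2 x HEIGHT/2 quarter with COLOR_C4, and expect count_mismatches() to return 0. Checking the bytes outside the rectangle as well catches a fill that writes too much.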
[Intel-gfx] [Intel gfx][i-g-t PATCH (v2) 4/4] tests/gem_media_fill: the assembly code for the shader used in the case
From: Xiang, Haihao <haihao.xi...@intel.com>

The code is for reference only

v2: Fixed the source register used for the send with EOT
    Fixed the posted destination operand for the send with EOT

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 shaders/media/README         |  6 ++
 shaders/media/media_fill.gxa | 30 ++
 2 files changed, 36 insertions(+)
 create mode 100644 shaders/media/README
 create mode 100644 shaders/media/media_fill.gxa

diff --git a/shaders/media/README b/shaders/media/README
new file mode 100644
index 000..334106c
--- /dev/null
+++ b/shaders/media/README
@@ -0,0 +1,6 @@
+These files are here for reference only.
+
+Commands used to generate the shader on gen8
+$ m4 media_fill.gxa > media_fill.gxm
+$ intel-gen4asm -g 8 -o output media_fill.gxm
+
diff --git a/shaders/media/media_fill.gxa b/shaders/media/media_fill.gxa
new file mode 100644
index 000..53e2c9f
--- /dev/null
+++ b/shaders/media/media_fill.gxa
@@ -0,0 +1,30 @@
+/*
+ * Registers
+ * g0 -- header
+ * g1 -- constant
+ * g2 -- inline data
+ * g3 -- reserved
+ * g4-g12 payload for write message
+ */
+define(`ORIG', `g2.0<2,2,1>UD')
+define(`COLOR', `g1.0')
+define(`COLORUB', `COLOR<0,1,0>UB')
+define(`COLORUD', `COLOR<0,1,0>UD')
+
+mov(4) COLOR<1>UB COLORUB {align1};
+
+/* WRITE */
+mov(8) g4.0<1>UD g0.0<8,8,1>UD {align1};
+mov(2) g4.0<1>UD ORIG {align1};
+mov(1) g4.8<1>UD 0x000f000fUD {align1};
+
+mov(16) g5.0<1>UD COLORUD {align1 compr};
+mov(16) g7.0<1>UD COLORUD {align1 compr};
+mov(16) g9.0<1>UD COLORUD {align1 compr};
+mov(16) g11.0<1>UD COLORUD {align1 compr};
+
+send(16) 4 acc0<1>UW null write(0, 0, 10, 0) mlen 9 rlen 0 {align1};
+
+/* EOT */
+mov(8) g112.0<1>UD g0.0<8,8,1>UD {align1};
+send(16) 112 null<1>UW null thread_spawner(0, 0, 1) mlen 1 rlen 0 {align1 EOT};
--
1.7.9.5
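Editorial note: for readers who do not read gen assembly, here is a rough CPU emulation in C of what one thread of the shader above appears to do. It is a sketch built on assumptions, not a description confirmed by the patch. It assumes the first mov(4) replicates the color byte into all four bytes of a dword, that g5 to g12 form a 256-byte data payload (eight 32-byte registers), that the immediate 0x000f000f encodes a block size as "size minus one" in each 16-bit half (so 16x16 bytes), and that the two dwords copied from the inline data are the x and y origin of the block. All function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Emulates the first mov(4): one color byte broadcast to four bytes. */
static uint32_t replicate_color(uint8_t color)
{
    return (uint32_t)color * 0x01010101u;
}

/* Decodes a "size minus one" pair packed into two 16-bit halves.
 * Which half is width and which is height is an assumption; for
 * 0x000f000f both are 16, so the result is the same either way. */
static void decode_block_size(uint32_t dw, unsigned *width, unsigned *height)
{
    *width = (dw & 0xffffu) + 1;
    *height = (dw >> 16) + 1;
}

/* Emulates one shader thread: build the payload, then write it as a
 * block at (orig_x, orig_y) of a linear byte surface. */
static void emulate_media_fill_thread(uint8_t *surface, unsigned stride,
                                      unsigned orig_x, unsigned orig_y,
                                      uint8_t color)
{
    uint32_t payload[8 * 8]; /* g5..g12: 8 registers x 8 dwords */
    uint32_t dword = replicate_color(color);
    unsigned width, height;

    for (unsigned i = 0; i < 8 * 8; i++)
        payload[i] = dword;

    decode_block_size(0x000f000fu, &width, &height);
    assert(width * height == sizeof(payload));

    for (unsigned row = 0; row < height; row++)
        memcpy(surface + (orig_y + row) * stride + orig_x,
               (const uint8_t *)payload + row * width, width);
}
```

Under these assumptions each thread fills one 16x16-byte block, so covering a larger rectangle would take one thread per block.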
[Intel-gfx] [Intel gfx][i-g-t PATCH (v2) 2/4] tests/gem_media_fill: add support for gen8
From: Xiang, Haihao <haihao.xi...@intel.com>

v2: Fixed the source register used for the send with EOT
    Fixed the posted destination operand for the send with EOT

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/Makefile.sources  |   2 +
 lib/gen8_media.h      | 371 +
 lib/media_fill.c      |   3 +
 lib/media_fill.h      |   7 +
 lib/media_fill_gen8.c | 366
 5 files changed, 749 insertions(+)
 create mode 100644 lib/gen8_media.h
 create mode 100644 lib/media_fill_gen8.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index cad238a..95ccb2f 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -21,6 +21,8 @@ libintel_tools_la_SOURCES = \
 	intel_reg.h \
 	media_fill.c \
 	media_fill.h \
+	media_fill_gen8.c \
+	gen8_media.h \
 	rendercopy_i915.c \
 	rendercopy_i830.c \
 	gen6_render.h \
diff --git a/lib/gen8_media.h b/lib/gen8_media.h
new file mode 100644
index 000..c61aed2
--- /dev/null
+++ b/lib/gen8_media.h
@@ -0,0 +1,371 @@
+#ifndef GEN8_MEDIA_H
+#define GEN8_MEDIA_H
+
+#define GEN8_SURFACEFORMAT_R32G32B32A32_FLOAT 0x000
+#define GEN8_SURFACEFORMAT_R32G32B32A32_SINT 0x001
+#define GEN8_SURFACEFORMAT_R32G32B32A32_UINT 0x002
+#define GEN8_SURFACEFORMAT_R32G32B32A32_UNORM 0x003
+#define GEN8_SURFACEFORMAT_R32G32B32A32_SNORM 0x004
+#define GEN8_SURFACEFORMAT_R64G64_FLOAT 0x005
+#define GEN8_SURFACEFORMAT_R32G32B32X32_FLOAT 0x006
+#define GEN8_SURFACEFORMAT_R32G32B32A32_SSCALED 0x007
+#define GEN8_SURFACEFORMAT_R32G32B32A32_USCALED 0x008
+#define GEN8_SURFACEFORMAT_R32G32B32_FLOAT 0x040
+#define GEN8_SURFACEFORMAT_R32G32B32_SINT 0x041
+#define GEN8_SURFACEFORMAT_R32G32B32_UINT 0x042
+#define GEN8_SURFACEFORMAT_R32G32B32_UNORM 0x043
+#define GEN8_SURFACEFORMAT_R32G32B32_SNORM 0x044
+#define GEN8_SURFACEFORMAT_R32G32B32_SSCALED 0x045
+#define GEN8_SURFACEFORMAT_R32G32B32_USCALED 0x046
+#define GEN8_SURFACEFORMAT_R16G16B16A16_UNORM 0x080
+#define GEN8_SURFACEFORMAT_R16G16B16A16_SNORM 0x081
+#define GEN8_SURFACEFORMAT_R16G16B16A16_SINT 0x082
+#define GEN8_SURFACEFORMAT_R16G16B16A16_UINT 0x083
+#define GEN8_SURFACEFORMAT_R16G16B16A16_FLOAT 0x084
+#define GEN8_SURFACEFORMAT_R32G32_FLOAT 0x085
+#define GEN8_SURFACEFORMAT_R32G32_SINT 0x086
+#define GEN8_SURFACEFORMAT_R32G32_UINT 0x087
+#define GEN8_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS 0x088
+#define GEN8_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT 0x089
+#define GEN8_SURFACEFORMAT_L32A32_FLOAT 0x08A
+#define GEN8_SURFACEFORMAT_R32G32_UNORM 0x08B
+#define GEN8_SURFACEFORMAT_R32G32_SNORM 0x08C
+#define GEN8_SURFACEFORMAT_R64_FLOAT 0x08D
+#define GEN8_SURFACEFORMAT_R16G16B16X16_UNORM 0x08E
+#define GEN8_SURFACEFORMAT_R16G16B16X16_FLOAT 0x08F
+#define GEN8_SURFACEFORMAT_A32X32_FLOAT 0x090
+#define GEN8_SURFACEFORMAT_L32X32_FLOAT 0x091
+#define GEN8_SURFACEFORMAT_I32X32_FLOAT 0x092
+#define GEN8_SURFACEFORMAT_R16G16B16A16_SSCALED 0x093
+#define GEN8_SURFACEFORMAT_R16G16B16A16_USCALED 0x094
+#define GEN8_SURFACEFORMAT_R32G32_SSCALED 0x095
+#define GEN8_SURFACEFORMAT_R32G32_USCALED 0x096
+#define GEN8_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0
+#define GEN8_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB 0x0C1
+#define GEN8_SURFACEFORMAT_R10G10B10A2_UNORM 0x0C2
+#define GEN8_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB 0x0C3
+#define GEN8_SURFACEFORMAT_R10G10B10A2_UINT 0x0C4
+#define GEN8_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM 0x0C5
+#define GEN8_SURFACEFORMAT_R8G8B8A8_UNORM 0x0C7
+#define GEN8_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB 0x0C8
+#define GEN8_SURFACEFORMAT_R8G8B8A8_SNORM 0x0C9
+#define GEN8_SURFACEFORMAT_R8G8B8A8_SINT 0x0CA
+#define GEN8_SURFACEFORMAT_R8G8B8A8_UINT 0x0CB
+#define GEN8_SURFACEFORMAT_R16G16_UNORM 0x0CC
+#define GEN8_SURFACEFORMAT_R16G16_SNORM 0x0CD
+#define GEN8_SURFACEFORMAT_R16G16_SINT 0x0CE
+#define GEN8_SURFACEFORMAT_R16G16_UINT 0x0CF
+#define GEN8_SURFACEFORMAT_R16G16_FLOAT 0x0D0
+#define GEN8_SURFACEFORMAT_B10G10R10A2_UNORM 0x0D1
+#define GEN8_SURFACEFORMAT_B10G10R10A2_UNORM_SRGB 0x0D2
+#define GEN8_SURFACEFORMAT_R11G11B10_FLOAT
[Intel-gfx] [Intel gfx][i-g-t PATCH (v2) 3/4] tests/gem_media_fill: add support for gen7
From: Xiang, Haihao <haihao.xi...@intel.com>

v2: Fixed the source register used for the send with EOT
    Fixed the posted destination operand for the send with EOT

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/Makefile.sources  |   2 +
 lib/gen7_media.h      | 323 +
 lib/media_fill.c      |   2 +
 lib/media_fill.h      |   7 +
 lib/media_fill_gen7.c | 351 +
 5 files changed, 685 insertions(+)
 create mode 100644 lib/gen7_media.h
 create mode 100644 lib/media_fill_gen7.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 95ccb2f..fd08c1f 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -21,7 +21,9 @@ libintel_tools_la_SOURCES = \
 	intel_reg.h \
 	media_fill.c \
 	media_fill.h \
+	media_fill_gen7.c \
 	media_fill_gen8.c \
+	gen7_media.h \
 	gen8_media.h \
 	rendercopy_i915.c \
 	rendercopy_i830.c \
diff --git a/lib/gen7_media.h b/lib/gen7_media.h
new file mode 100644
index 000..d75ee1b
--- /dev/null
+++ b/lib/gen7_media.h
@@ -0,0 +1,323 @@
+#ifndef GEN7_MEDIA_H
+#define GEN7_MEDIA_H
+
+#define GEN7_SURFACEFORMAT_R32G32B32A32_FLOAT 0x000
+#define GEN7_SURFACEFORMAT_R32G32B32A32_SINT 0x001
+#define GEN7_SURFACEFORMAT_R32G32B32A32_UINT 0x002
+#define GEN7_SURFACEFORMAT_R32G32B32A32_UNORM 0x003
+#define GEN7_SURFACEFORMAT_R32G32B32A32_SNORM 0x004
+#define GEN7_SURFACEFORMAT_R64G64_FLOAT 0x005
+#define GEN7_SURFACEFORMAT_R32G32B32X32_FLOAT 0x006
+#define GEN7_SURFACEFORMAT_R32G32B32A32_SSCALED 0x007
+#define GEN7_SURFACEFORMAT_R32G32B32A32_USCALED 0x008
+#define GEN7_SURFACEFORMAT_R32G32B32_FLOAT 0x040
+#define GEN7_SURFACEFORMAT_R32G32B32_SINT 0x041
+#define GEN7_SURFACEFORMAT_R32G32B32_UINT 0x042
+#define GEN7_SURFACEFORMAT_R32G32B32_UNORM 0x043
+#define GEN7_SURFACEFORMAT_R32G32B32_SNORM 0x044
+#define GEN7_SURFACEFORMAT_R32G32B32_SSCALED 0x045
+#define GEN7_SURFACEFORMAT_R32G32B32_USCALED 0x046
+#define GEN7_SURFACEFORMAT_R16G16B16A16_UNORM 0x080
+#define GEN7_SURFACEFORMAT_R16G16B16A16_SNORM 0x081
+#define GEN7_SURFACEFORMAT_R16G16B16A16_SINT 0x082
+#define GEN7_SURFACEFORMAT_R16G16B16A16_UINT 0x083
+#define GEN7_SURFACEFORMAT_R16G16B16A16_FLOAT 0x084
+#define GEN7_SURFACEFORMAT_R32G32_FLOAT 0x085
+#define GEN7_SURFACEFORMAT_R32G32_SINT 0x086
+#define GEN7_SURFACEFORMAT_R32G32_UINT 0x087
+#define GEN7_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS 0x088
+#define GEN7_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT 0x089
+#define GEN7_SURFACEFORMAT_L32A32_FLOAT 0x08A
+#define GEN7_SURFACEFORMAT_R32G32_UNORM 0x08B
+#define GEN7_SURFACEFORMAT_R32G32_SNORM 0x08C
+#define GEN7_SURFACEFORMAT_R64_FLOAT 0x08D
+#define GEN7_SURFACEFORMAT_R16G16B16X16_UNORM 0x08E
+#define GEN7_SURFACEFORMAT_R16G16B16X16_FLOAT 0x08F
+#define GEN7_SURFACEFORMAT_A32X32_FLOAT 0x090
+#define GEN7_SURFACEFORMAT_L32X32_FLOAT 0x091
+#define GEN7_SURFACEFORMAT_I32X32_FLOAT 0x092
+#define GEN7_SURFACEFORMAT_R16G16B16A16_SSCALED 0x093
+#define GEN7_SURFACEFORMAT_R16G16B16A16_USCALED 0x094
+#define GEN7_SURFACEFORMAT_R32G32_SSCALED 0x095
+#define GEN7_SURFACEFORMAT_R32G32_USCALED 0x096
+#define GEN7_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0
+#define GEN7_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB 0x0C1
+#define GEN7_SURFACEFORMAT_R10G10B10A2_UNORM 0x0C2
+#define GEN7_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB 0x0C3
+#define GEN7_SURFACEFORMAT_R10G10B10A2_UINT 0x0C4
+#define GEN7_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM 0x0C5
+#define GEN7_SURFACEFORMAT_R8G8B8A8_UNORM 0x0C7
+#define GEN7_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB 0x0C8
+#define GEN7_SURFACEFORMAT_R8G8B8A8_SNORM 0x0C9
+#define GEN7_SURFACEFORMAT_R8G8B8A8_SINT 0x0CA
+#define GEN7_SURFACEFORMAT_R8G8B8A8_UINT 0x0CB
+#define GEN7_SURFACEFORMAT_R16G16_UNORM 0x0CC
+#define GEN7_SURFACEFORMAT_R16G16_SNORM 0x0CD
+#define GEN7_SURFACEFORMAT_R16G16_SINT 0x0CE
+#define GEN7_SURFACEFORMAT_R16G16_UINT 0x0CF
+#define GEN7_SURFACEFORMAT_R16G16_FLOAT 0x0D0
+#define GEN7_SURFACEFORMAT_B10G10R10A2_UNORM 0x0D1
+#define GEN7_SURFACEFORMAT_B10G10R10A2_UNORM_SRGB 0x0D2
+#define
[Intel-gfx] [Intel gfx][i-g-t PATCH 4/4] tests/gem_media_fill: the assembly code for the shader used in the case
From: Xiang, Haihao <haihao.xi...@intel.com>

The code is for reference only

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 shaders/media/README         |  6 ++
 shaders/media/media_fill.gxa | 30 ++
 2 files changed, 36 insertions(+)
 create mode 100644 shaders/media/README
 create mode 100644 shaders/media/media_fill.gxa

diff --git a/shaders/media/README b/shaders/media/README
new file mode 100644
index 000..334106c
--- /dev/null
+++ b/shaders/media/README
@@ -0,0 +1,6 @@
+These files are here for reference only.
+
+Commands used to generate the shader on gen8
+$ m4 media_fill.gxa > media_fill.gxm
+$ intel-gen4asm -g 8 -o output media_fill.gxm
+
diff --git a/shaders/media/media_fill.gxa b/shaders/media/media_fill.gxa
new file mode 100644
index 000..d2931d4
--- /dev/null
+++ b/shaders/media/media_fill.gxa
@@ -0,0 +1,30 @@
+/*
+ * Registers
+ * g0 -- header
+ * g1 -- constant
+ * g2 -- inline data
+ * g3 -- reserved
+ * g4-g12 message payload
+ */
+define(`ORIG', `g2.0<2,2,1>UD')
+define(`COLOR', `g1.0')
+define(`COLORUB', `COLOR<0,1,0>UB')
+define(`COLORUD', `COLOR<0,1,0>UD')
+
+mov(4) COLOR<1>UB COLORUB {align1};
+
+/* WRITE */
+mov(8) g4.0<1>UD g0.0<8,8,1>UD {align1};
+mov(2) g4.0<1>UD ORIG {align1};
+mov(1) g4.8<1>UD 0x000f000fUD {align1};
+
+mov(16) g5.0<1>UD COLORUD {align1 compr};
+mov(16) g7.0<1>UD COLORUD {align1 compr};
+mov(16) g9.0<1>UD COLORUD {align1 compr};
+mov(16) g11.0<1>UD COLORUD {align1 compr};
+
+send(16) 4 acc0<1>UW null write(0, 0, 10, 0) mlen 9 rlen 0 {align1};
+
+/* EOT */
+mov(8) g4.0<1>UD g0.0<8,8,1>UD {align1};
+send(16) 4 acc0<1>UW null thread_spawner(0, 0, 1) mlen 1 rlen 0 {align1 EOT};
--
1.7.9.5
[Intel-gfx] [Intel gfx][i-g-t PATCH 3/4] tests/gem_media_fill: add support for gen7
From: Xiang, Haihao <haihao.xi...@intel.com>

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/Makefile.sources  |   2 +
 lib/gen7_media.h      | 323 +
 lib/media_fill.c      |   2 +
 lib/media_fill.h      |   7 +
 lib/media_fill_gen7.c | 351 +
 5 files changed, 685 insertions(+)
 create mode 100644 lib/gen7_media.h
 create mode 100644 lib/media_fill_gen7.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 95ccb2f..fd08c1f 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -21,7 +21,9 @@ libintel_tools_la_SOURCES = \
 	intel_reg.h \
 	media_fill.c \
 	media_fill.h \
+	media_fill_gen7.c \
 	media_fill_gen8.c \
+	gen7_media.h \
 	gen8_media.h \
 	rendercopy_i915.c \
 	rendercopy_i830.c \
diff --git a/lib/gen7_media.h b/lib/gen7_media.h
new file mode 100644
index 000..d75ee1b
--- /dev/null
+++ b/lib/gen7_media.h
@@ -0,0 +1,323 @@
+#ifndef GEN7_MEDIA_H
+#define GEN7_MEDIA_H
+
+#define GEN7_SURFACEFORMAT_R32G32B32A32_FLOAT 0x000
+#define GEN7_SURFACEFORMAT_R32G32B32A32_SINT 0x001
+#define GEN7_SURFACEFORMAT_R32G32B32A32_UINT 0x002
+#define GEN7_SURFACEFORMAT_R32G32B32A32_UNORM 0x003
+#define GEN7_SURFACEFORMAT_R32G32B32A32_SNORM 0x004
+#define GEN7_SURFACEFORMAT_R64G64_FLOAT 0x005
+#define GEN7_SURFACEFORMAT_R32G32B32X32_FLOAT 0x006
+#define GEN7_SURFACEFORMAT_R32G32B32A32_SSCALED 0x007
+#define GEN7_SURFACEFORMAT_R32G32B32A32_USCALED 0x008
+#define GEN7_SURFACEFORMAT_R32G32B32_FLOAT 0x040
+#define GEN7_SURFACEFORMAT_R32G32B32_SINT 0x041
+#define GEN7_SURFACEFORMAT_R32G32B32_UINT 0x042
+#define GEN7_SURFACEFORMAT_R32G32B32_UNORM 0x043
+#define GEN7_SURFACEFORMAT_R32G32B32_SNORM 0x044
+#define GEN7_SURFACEFORMAT_R32G32B32_SSCALED 0x045
+#define GEN7_SURFACEFORMAT_R32G32B32_USCALED 0x046
+#define GEN7_SURFACEFORMAT_R16G16B16A16_UNORM 0x080
+#define GEN7_SURFACEFORMAT_R16G16B16A16_SNORM 0x081
+#define GEN7_SURFACEFORMAT_R16G16B16A16_SINT 0x082
+#define GEN7_SURFACEFORMAT_R16G16B16A16_UINT 0x083
+#define GEN7_SURFACEFORMAT_R16G16B16A16_FLOAT 0x084
+#define GEN7_SURFACEFORMAT_R32G32_FLOAT 0x085
+#define GEN7_SURFACEFORMAT_R32G32_SINT 0x086
+#define GEN7_SURFACEFORMAT_R32G32_UINT 0x087
+#define GEN7_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS 0x088
+#define GEN7_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT 0x089
+#define GEN7_SURFACEFORMAT_L32A32_FLOAT 0x08A
+#define GEN7_SURFACEFORMAT_R32G32_UNORM 0x08B
+#define GEN7_SURFACEFORMAT_R32G32_SNORM 0x08C
+#define GEN7_SURFACEFORMAT_R64_FLOAT 0x08D
+#define GEN7_SURFACEFORMAT_R16G16B16X16_UNORM 0x08E
+#define GEN7_SURFACEFORMAT_R16G16B16X16_FLOAT 0x08F
+#define GEN7_SURFACEFORMAT_A32X32_FLOAT 0x090
+#define GEN7_SURFACEFORMAT_L32X32_FLOAT 0x091
+#define GEN7_SURFACEFORMAT_I32X32_FLOAT 0x092
+#define GEN7_SURFACEFORMAT_R16G16B16A16_SSCALED 0x093
+#define GEN7_SURFACEFORMAT_R16G16B16A16_USCALED 0x094
+#define GEN7_SURFACEFORMAT_R32G32_SSCALED 0x095
+#define GEN7_SURFACEFORMAT_R32G32_USCALED 0x096
+#define GEN7_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0
+#define GEN7_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB 0x0C1
+#define GEN7_SURFACEFORMAT_R10G10B10A2_UNORM 0x0C2
+#define GEN7_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB 0x0C3
+#define GEN7_SURFACEFORMAT_R10G10B10A2_UINT 0x0C4
+#define GEN7_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM 0x0C5
+#define GEN7_SURFACEFORMAT_R8G8B8A8_UNORM 0x0C7
+#define GEN7_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB 0x0C8
+#define GEN7_SURFACEFORMAT_R8G8B8A8_SNORM 0x0C9
+#define GEN7_SURFACEFORMAT_R8G8B8A8_SINT 0x0CA
+#define GEN7_SURFACEFORMAT_R8G8B8A8_UINT 0x0CB
+#define GEN7_SURFACEFORMAT_R16G16_UNORM 0x0CC
+#define GEN7_SURFACEFORMAT_R16G16_SNORM 0x0CD
+#define GEN7_SURFACEFORMAT_R16G16_SINT 0x0CE
+#define GEN7_SURFACEFORMAT_R16G16_UINT 0x0CF
+#define GEN7_SURFACEFORMAT_R16G16_FLOAT 0x0D0
+#define GEN7_SURFACEFORMAT_B10G10R10A2_UNORM 0x0D1
+#define GEN7_SURFACEFORMAT_B10G10R10A2_UNORM_SRGB 0x0D2
+#define GEN7_SURFACEFORMAT_R11G11B10_FLOAT 0x0D3
+#define GEN7_SURFACEFORMAT_R32_SINT 0x0D6
[Intel-gfx] [Intel gfx][i-g-t PATCH 2/4] tests/gem_media_fill: add support for gen8
From: Xiang, Haihao <haihao.xi...@intel.com>

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/Makefile.sources  |   2 +
 lib/gen8_media.h      | 371 +
 lib/media_fill.c      |   3 +
 lib/media_fill.h      |   7 +
 lib/media_fill_gen8.c | 366
 5 files changed, 749 insertions(+)
 create mode 100644 lib/gen8_media.h
 create mode 100644 lib/media_fill_gen8.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index cad238a..95ccb2f 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -21,6 +21,8 @@ libintel_tools_la_SOURCES = \
 	intel_reg.h \
 	media_fill.c \
 	media_fill.h \
+	media_fill_gen8.c \
+	gen8_media.h \
 	rendercopy_i915.c \
 	rendercopy_i830.c \
 	gen6_render.h \
diff --git a/lib/gen8_media.h b/lib/gen8_media.h
new file mode 100644
index 000..c61aed2
--- /dev/null
+++ b/lib/gen8_media.h
@@ -0,0 +1,371 @@
+#ifndef GEN8_MEDIA_H
+#define GEN8_MEDIA_H
+
+#define GEN8_SURFACEFORMAT_R32G32B32A32_FLOAT 0x000
+#define GEN8_SURFACEFORMAT_R32G32B32A32_SINT 0x001
+#define GEN8_SURFACEFORMAT_R32G32B32A32_UINT 0x002
+#define GEN8_SURFACEFORMAT_R32G32B32A32_UNORM 0x003
+#define GEN8_SURFACEFORMAT_R32G32B32A32_SNORM 0x004
+#define GEN8_SURFACEFORMAT_R64G64_FLOAT 0x005
+#define GEN8_SURFACEFORMAT_R32G32B32X32_FLOAT 0x006
+#define GEN8_SURFACEFORMAT_R32G32B32A32_SSCALED 0x007
+#define GEN8_SURFACEFORMAT_R32G32B32A32_USCALED 0x008
+#define GEN8_SURFACEFORMAT_R32G32B32_FLOAT 0x040
+#define GEN8_SURFACEFORMAT_R32G32B32_SINT 0x041
+#define GEN8_SURFACEFORMAT_R32G32B32_UINT 0x042
+#define GEN8_SURFACEFORMAT_R32G32B32_UNORM 0x043
+#define GEN8_SURFACEFORMAT_R32G32B32_SNORM 0x044
+#define GEN8_SURFACEFORMAT_R32G32B32_SSCALED 0x045
+#define GEN8_SURFACEFORMAT_R32G32B32_USCALED 0x046
+#define GEN8_SURFACEFORMAT_R16G16B16A16_UNORM 0x080
+#define GEN8_SURFACEFORMAT_R16G16B16A16_SNORM 0x081
+#define GEN8_SURFACEFORMAT_R16G16B16A16_SINT 0x082
+#define GEN8_SURFACEFORMAT_R16G16B16A16_UINT 0x083
+#define GEN8_SURFACEFORMAT_R16G16B16A16_FLOAT 0x084
+#define GEN8_SURFACEFORMAT_R32G32_FLOAT 0x085
+#define GEN8_SURFACEFORMAT_R32G32_SINT 0x086
+#define GEN8_SURFACEFORMAT_R32G32_UINT 0x087
+#define GEN8_SURFACEFORMAT_R32_FLOAT_X8X24_TYPELESS 0x088
+#define GEN8_SURFACEFORMAT_X32_TYPELESS_G8X24_UINT 0x089
+#define GEN8_SURFACEFORMAT_L32A32_FLOAT 0x08A
+#define GEN8_SURFACEFORMAT_R32G32_UNORM 0x08B
+#define GEN8_SURFACEFORMAT_R32G32_SNORM 0x08C
+#define GEN8_SURFACEFORMAT_R64_FLOAT 0x08D
+#define GEN8_SURFACEFORMAT_R16G16B16X16_UNORM 0x08E
+#define GEN8_SURFACEFORMAT_R16G16B16X16_FLOAT 0x08F
+#define GEN8_SURFACEFORMAT_A32X32_FLOAT 0x090
+#define GEN8_SURFACEFORMAT_L32X32_FLOAT 0x091
+#define GEN8_SURFACEFORMAT_I32X32_FLOAT 0x092
+#define GEN8_SURFACEFORMAT_R16G16B16A16_SSCALED 0x093
+#define GEN8_SURFACEFORMAT_R16G16B16A16_USCALED 0x094
+#define GEN8_SURFACEFORMAT_R32G32_SSCALED 0x095
+#define GEN8_SURFACEFORMAT_R32G32_USCALED 0x096
+#define GEN8_SURFACEFORMAT_B8G8R8A8_UNORM 0x0C0
+#define GEN8_SURFACEFORMAT_B8G8R8A8_UNORM_SRGB 0x0C1
+#define GEN8_SURFACEFORMAT_R10G10B10A2_UNORM 0x0C2
+#define GEN8_SURFACEFORMAT_R10G10B10A2_UNORM_SRGB 0x0C3
+#define GEN8_SURFACEFORMAT_R10G10B10A2_UINT 0x0C4
+#define GEN8_SURFACEFORMAT_R10G10B10_SNORM_A2_UNORM 0x0C5
+#define GEN8_SURFACEFORMAT_R8G8B8A8_UNORM 0x0C7
+#define GEN8_SURFACEFORMAT_R8G8B8A8_UNORM_SRGB 0x0C8
+#define GEN8_SURFACEFORMAT_R8G8B8A8_SNORM 0x0C9
+#define GEN8_SURFACEFORMAT_R8G8B8A8_SINT 0x0CA
+#define GEN8_SURFACEFORMAT_R8G8B8A8_UINT 0x0CB
+#define GEN8_SURFACEFORMAT_R16G16_UNORM 0x0CC
+#define GEN8_SURFACEFORMAT_R16G16_SNORM 0x0CD
+#define GEN8_SURFACEFORMAT_R16G16_SINT 0x0CE
+#define GEN8_SURFACEFORMAT_R16G16_UINT 0x0CF
+#define GEN8_SURFACEFORMAT_R16G16_FLOAT 0x0D0
+#define GEN8_SURFACEFORMAT_B10G10R10A2_UNORM 0x0D1
+#define GEN8_SURFACEFORMAT_B10G10R10A2_UNORM_SRGB 0x0D2
+#define GEN8_SURFACEFORMAT_R11G11B10_FLOAT 0x0D3
+#define GEN8_SURFACEFORMAT_R32_SINT 0x0D6
+#define
[Intel-gfx] [Intel gfx][i-g-t PATCH 1/4] tests: add gem_media_fill
From: Xiang, Haihao <haihao.xi...@intel.com>

It is to check whether the media pipeline on the render ring works.
The code is copied and modified from the rendercopy case, which uses
the 3D pipeline. However, the media pipeline is simpler than the 3D
pipeline, and there are few changes between gen6, gen7 and gen8.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 lib/Makefile.sources   |   2 +
 lib/media_fill.c       |   9
 lib/media_fill.h       |  50 ++
 tests/Makefile.sources |   1 +
 tests/gem_media_fill.c | 132
 5 files changed, 194 insertions(+)
 create mode 100644 lib/media_fill.c
 create mode 100644 lib/media_fill.h
 create mode 100644 tests/gem_media_fill.c

diff --git a/lib/Makefile.sources b/lib/Makefile.sources
index 699621b..cad238a 100644
--- a/lib/Makefile.sources
+++ b/lib/Makefile.sources
@@ -19,6 +19,8 @@ libintel_tools_la_SOURCES = \
 	intel_mmio.c \
 	intel_pci.c \
 	intel_reg.h \
+	media_fill.c \
+	media_fill.h \
 	rendercopy_i915.c \
 	rendercopy_i830.c \
 	gen6_render.h \
diff --git a/lib/media_fill.c b/lib/media_fill.c
new file mode 100644
index 000..8ee5db6
--- /dev/null
+++ b/lib/media_fill.c
@@ -0,0 +1,9 @@
+#include "i830_reg.h"
+#include "media_fill.h"
+
+media_fillfunc_t get_media_fillfunc(int devid)
+{
+	media_fillfunc_t fill = NULL;
+
+	return fill;
+}
diff --git a/lib/media_fill.h b/lib/media_fill.h
new file mode 100644
index 000..2e058cb
--- /dev/null
+++ b/lib/media_fill.h
@@ -0,0 +1,50 @@
+#ifndef RENDE_MEDIA_FILL_H
+#define RENDE_MEDIA_FILL_H
+
+#include <stdlib.h>
+#include <sys/ioctl.h>
+#include <stdio.h>
+#include <string.h>
+#include <assert.h>
+#include <fcntl.h>
+#include <inttypes.h>
+#include <errno.h>
+#include <sys/stat.h>
+#include <sys/time.h>
+#include <getopt.h>
+#include "drm.h"
+#include "i915_drm.h"
+#include "drmtest.h"
+#include "intel_bufmgr.h"
+#include "intel_batchbuffer.h"
+#include "intel_gpu_tools.h"
+
+struct scratch_buf {
+    drm_intel_bo *bo;
+    uint32_t stride;
+    uint32_t tiling;
+    uint32_t *data;
+    uint32_t *cpu_mapping;
+    uint32_t size;
+    unsigned num_tiles;
+};
+
+static inline unsigned buf_width(struct scratch_buf *buf)
+{
+	return buf->stride/sizeof(uint8_t);
+}
+
+static inline unsigned buf_height(struct scratch_buf *buf)
+{
+	return buf->size/buf->stride;
+}
+
+typedef void (*media_fillfunc_t)(struct intel_batchbuffer *batch,
+                                 struct scratch_buf *dst,
+                                 unsigned x, unsigned y,
+                                 unsigned width, unsigned height,
+                                 uint8_t color);
+
+media_fillfunc_t get_media_fillfunc(int devid);
+
+#endif /* RENDE_MEDIA_FILL_H */
diff --git a/tests/Makefile.sources b/tests/Makefile.sources
index d201809..0ff0e37 100644
--- a/tests/Makefile.sources
+++ b/tests/Makefile.sources
@@ -87,6 +87,7 @@ TESTS_progs = \
 	gem_largeobject \
 	gem_lut_handle \
 	gem_mmap_offset_exhaustion \
+	gem_media_fill \
 	gem_pin \
 	gem_pipe_control_store_loop \
 	gem_reg_read \
diff --git a/tests/gem_media_fill.c b/tests/gem_media_fill.c
new file mode 100644
index 000..40b391d
--- /dev/null
+++ b/tests/gem_media_fill.c
@@ -0,0 +1,132 @@
+/*
+ * Copyright © 2013 Intel Corporation
+ *
+ * Permission is hereby granted, free of charge, to any person obtaining a
+ * copy of this software and associated documentation files (the "Software"),
+ * to deal in the Software without restriction, including without limitation
+ * the rights to use, copy, modify, merge, publish, distribute, sublicense,
+ * and/or sell copies of the Software, and to permit persons to whom the
+ * Software is furnished to do so, subject to the following conditions:
+ *
+ * The above copyright notice and this permission notice (including the next
+ * paragraph) shall be included in all copies or substantial portions of the
+ * Software.
+ *
+ * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+ * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+ * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
+ * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+ * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
+ * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
+ * IN THE SOFTWARE.
+ *
+ * Authors:
+ *    Damien Lespiau <damien.lesp...@intel.com>
+ *    Xiang, Haihao <haihao.xi...@intel.com>
+ */
+
+/*
+ * This file is a basic test for the media_fill() function, a very simple
+ * workload for the Media pipeline.
+ */
+
+#include <stdbool.h>
+#include <unistd.h>
+#include <cairo.h>
+
+#include "media_fill.h"
+
+#define WIDTH 64
+#define STRIDE (WIDTH)
+#define HEIGHT 64
+#define SIZE (HEIGHT*STRIDE)
+
+#define COLOR_C4 0xc4
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Wed, 2013-11-27 at 09:10 +0100, Daniel Vetter wrote: On Wed, Nov 27, 2013 at 09:32:32AM +0800, ykzhao wrote: On Tue, 2013-11-26 at 13:24 -0700, Volkin, Bradley D wrote: On Tue, Nov 26, 2013 at 11:35:38AM -0800, Daniel Vetter wrote: Hi Brad, On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. I think long-term we should even scan secure batches. We'd need to allow some registers which only the drm master (i.e. owner of the display hardware) is allowed to do, e.g. 
for scanline waits. But once we have that we should be able to port all current users of secure batches over to scanned batches and so enforce this everywhere by default. The other issue is that igt tests assume to be able to run some evil tests, so maybe we don't actually want this. Agreed. I thought we could handle this as a follow-up task once the basic stuff is in place, particularly given that we'd want to modify at least some users to test. I also wasn't sure if we would want the check to be root master, as in the current secure flag, or just master. W.r.t. the tests, I suppose we can just turn checking on for secure batches and see what happens. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. Yeah, I agree that just checking whether commands all go through (or not) as expected adds very little value on top of the few tests you have done. I think we should take a look at some corner cases which might trip up your checker a bit though: - I think we should check batchbuffer chaining and make sure it works on the vcs ring and not anywhere else (we can't ever break shipping libva which uses this). - Some tests to trip up your parser should be done, like 3D commands that fall off the end of the batch bo. Or commands that span page boundaries. The later isn't an issue atm since you use vmap, but we should switch to per-page kmap since the vmap overhead is fairly horrible. Good suggestions. I'll look into these. Hi, Brad More inputs from libva about the batchbuffer chaining. Now the batchbuffer chaining is widely used in libva driver. This is related with how the libva driver processes the image. For the encoding purpose, it needs to be handled based on macroblock(16x16).And every macroblock needs a group of GPU commands. 
So the GPU commands for all the macroblocks will be constructed in the second-level batchbuffer. The mode of batchbuffer chaining will bring the following benefits: a. The size of second-level batch buffer can be allocated based on the size of handled image. For example: 1080p/720p/480p can use the different size. b. The gpu commands in second-level batchbuffer can be constructed by using GPU instead of CPU, which is helpful to improve the performance.
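Benefit (a) above can be made concrete with a little arithmetic. The following is a minimal sketch, not code from the libva driver: it counts the 16x16 macroblocks in a frame and scales that by a per-macroblock command size. `CMD_BYTES_PER_MB` is a hypothetical figure chosen only for illustration, and 720x480 is assumed for "480p".

```c
#include <assert.h>
#include <stddef.h>

/* Macroblocks are 16x16 pixels; partial blocks at the frame edge still
 * need a full macroblock. */
#define MB_SIZE 16

/* HYPOTHETICAL: bytes of GPU commands emitted per macroblock. The real
 * value depends on the codec and the hardware generation. */
#define CMD_BYTES_PER_MB 64

/* Number of macroblocks needed to cover a width x height frame. */
static size_t mb_count(unsigned width, unsigned height)
{
    assert(width > 0 && height > 0);
    size_t mbs_wide = (width + MB_SIZE - 1) / MB_SIZE;
    size_t mbs_high = (height + MB_SIZE - 1) / MB_SIZE;
    return mbs_wide * mbs_high;
}

/* Size of a chained batch buffer that holds one command group per
 * macroblock, under the assumption above. */
static size_t chained_batch_bytes(unsigned width, unsigned height)
{
    return mb_count(width, height) * CMD_BYTES_PER_MB;
}
```

Under these assumptions a 1080p frame needs roughly six times the buffer of a 720x480 frame, which is why a single fixed-size batch buffer fits this workload poorly.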
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Wed, 2013-11-27 at 09:31 +0100, Daniel Vetter wrote:
> On Wed, Nov 27, 2013 at 9:23 AM, Xiang, Haihao <haihao.xi...@intel.com> wrote:
> > > So are these 2nd level batches constructed by the gpu in some cases? That would be fairly horribly to take into account with the batch checker ...
> >
> > It is *not* the 2nd level batch buffer (bit 22 isn't set). Only batch buffer chain is used.
>
> That's not really the hard part for the command checker, the important question is whether the gpu generates these batches or whether they're constructed by the cpu.

Yes, some batches are generated by GPU, either by EU shaders or by BSD unit (batch buffer for MC on ILK).

> -Daniel
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Wed, 2013-11-27 at 09:47 +0100, Daniel Vetter wrote:
> On Wed, Nov 27, 2013 at 04:42:11PM +0800, Xiang, Haihao wrote:
> > On Wed, 2013-11-27 at 09:31 +0100, Daniel Vetter wrote:
> > > On Wed, Nov 27, 2013 at 9:23 AM, Xiang, Haihao <haihao.xi...@intel.com> wrote:
> > > > > So are these 2nd level batches constructed by the gpu in some cases? That would be fairly horribly to take into account with the batch checker ...
> > > >
> > > > It is *not* the 2nd level batch buffer (bit 22 isn't set). Only batch buffer chain is used.
> > >
> > > That's not really the hard part for the command checker, the important question is whether the gpu generates these batches or whether they're constructed by the cpu.
> >
> > Yes, some batches are generated by GPU, either by EU shaders or by BSD unit (batch buffer for MC on ILK).
>
> So is ilk the only platform which does that? The command checker is only for gen7+ (and maybe gen6).

No. In libva some batches are generated by the BSD unit on ILK. On Gen6+, some batches are constructed by GPU shaders.

> -Daniel
Re: [Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser
On Tue, 2013-11-26 at 20:35 +0100, Daniel Vetter wrote: Hi Brad, On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.vol...@intel.com wrote: From: Brad Volkin bradley.d.vol...@intel.com Certain OpenGL features (e.g. transform feedback, performance monitoring) require userspace code to submit batches containing commands such as MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some generations of the hardware will noop these commands in unsecure batches (which includes all userspace batches submitted via i915) even though the commands may be safe and represent the intended programming model of the device. This series introduces a software command parser similar in operation to the command parsing done in hardware for unsecure batches. However, the software parser allows some operations that would be noop'd by hardware, if the parser determines the operation is safe, and submits the batch as secure to prevent hardware parsing. Currently the series implements this on IVB and HSW. The series is divided into several phases: patches 01-09: These implement infrastructure and the command parsing algorithm, all behind a module parameter. I expect some discussion and rework, but hopefully there's nothing too controversial. patches 10-17: These define the checks performed by the parser. I expect much discussion :) patches 18-20: In a final pass over the command checks, I found some issues with the definitions. They looked painful to rebase in, so I've added them here. patches 21-22: These enable the parser by default. It runs on all batches except those that set the I915_EXEC_SECURE flag in the execbuffer2 call. I think long-term we should even scan secure batches. We'd need to allow some registers which only the drm master (i.e. owner of the display hardware) is allowed to do, e.g. for scanline waits. But once we have that we should be able to port all current users of secure batches over to scanned batches and so enforce this everywhere by default. 
The other issue is that igt tests assume to be able to run some evil tests, so maybe we don't actually want this. There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very basic and do not test all of the commands used by the parser on the assumption that I'm likely to make the same mistakes in both the parser and the test. Yeah, I agree that just checking whether commands all go through (or not) as expected adds very little value on top of the few tests you have done. I think we should take a look at some corner cases which might trip up your checker a bit though: - I think we should check batchbuffer chaining and make sure it works on the vcs ring and not anywhere else (we can't ever break shipping libva which uses this). Besides the vcs ring, we also use batchbuffer chaining on the render ring for video post processing, video motion estimation and motion compensation(on ILK), A fixed length batch buffer isn't suitable for those operations as those operations are based on a macroblock instead of a frame. It would be better to make sure batchbuffer chaining works on the render ring too. - Some tests to trip up your parser should be done, like 3D commands that fall off the end of the batch bo. Or commands that span page boundaries. The later isn't an issue atm since you use vmap, but we should switch to per-page kmap since the vmap overhead is fairly horrible. I've run the i-g-t gem_* tests, the piglit quick tests (w/Mesa git from a few days ago), and generally used an Ubuntu 13.10 IVB system with the parser running. Aside from a failure described below, I don't think there are any regressions. That is, piglit claims some regressions, but from manually running the tests I think these are false positives. However, I could use help in getting broader testing, particularly around performance. In general, I see less than 3% performance impact on HSW, with more like 10% impact for pathological batch sizes. 
But we'll certainly want to run relevant benchmarks beyond what I've done. Yeah, a microbenchmark that just shovels MI_NOP batches of various sizes through the checker and bypassing it (with EXEC_SECURE) would be really good. Maybe even some variable-sized commands (all the state setup stuff should be useful for that) to keep things interesting. Some variation is also important to have some good cache thrasing going on (since your check tables are fairly large I think). At this point there are a couple of known issues and potential improvements. 1) VLV. The parser is currently disabled for VLV. One type of check performed by the parser is that commands which access memory do so via PPGTT. VLV does not have PPGTT enabled at this time. I chose to implement the
[Intel-gfx] [i-g-t][PATH] debugger: Include path for cairo to fix compiler error
From: Xiang, Haihao <haihao.xi...@intel.com>

  CC     eudb.o
In file included from eudb.c:44:0:
../lib/drmtest.h:34:19: fatal error: cairo.h: No such file or directory
compilation terminated.
make[3]: *** [eudb.o] Error 1

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 debugger/Makefile.am |    3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/debugger/Makefile.am b/debugger/Makefile.am
index d76e2ac..fde6e02 100644
--- a/debugger/Makefile.am
+++ b/debugger/Makefile.am
@@ -11,6 +11,7 @@ AM_CPPFLAGS = \
 AM_CFLAGS = \
 	$(DRM_CFLAGS) \
 	$(PCIACCESS_CFLAGS) \
-	$(CWARNFLAGS)
+	$(CWARNFLAGS) \
+	$(CAIRO_CFLAGS)
 
 LDADD = $(top_builddir)/lib/libintel_tools.la $(DRM_LIBS) $(PCIACCESS_LIBS) $(CAIRO_LIBS)
-- 
1.7.9.5
[Intel-gfx] [PATCH 1/2] gem_ring_sync_loop: check the rings supported by the kernel
From: Xiang, Haihao <haihao.xi...@intel.com>

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 tests/gem_ring_sync_loop.c |   37 ++---
 1 file changed, 34 insertions(+), 3 deletions(-)

diff --git a/tests/gem_ring_sync_loop.c b/tests/gem_ring_sync_loop.c
index b689bcd..2875cf3 100644
--- a/tests/gem_ring_sync_loop.c
+++ b/tests/gem_ring_sync_loop.c
@@ -55,15 +55,46 @@ static drm_intel_bo *target_buffer;
 #define MI_COND_BATCH_BUFFER_END	(0x36<<23 | 1)
 #define MI_DO_COMPARE			(1<<21)
 
+static int
+get_num_rings(int fd)
+{
+	int num_rings = 1;	/* render ring is always available */
+	drm_i915_getparam_t gp;
+	int ret, tmp;
+
+	memset(&gp, 0, sizeof(gp));
+	gp.value = &tmp;
+
+	gp.param = I915_PARAM_HAS_BSD;
+	ret = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
+
+	if ((ret == 0) & (*gp.value > 0))
+		num_rings++;
+	else
+		goto skip;
+
+	gp.param = I915_PARAM_HAS_BLT;
+	ret = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
+
+	if ((ret == 0) & (*gp.value > 0))
+		num_rings++;
+	else
+		goto skip;
+
+skip:
+	return num_rings;
+}
+
 static void
-store_dword_loop(void)
+store_dword_loop(int fd)
 {
 	int i;
+	int num_rings = get_num_rings(fd);
 
 	srandom(0xdeadbeef);
 
 	for (i = 0; i < 0x10; i++) {
-		int ring = random() % 3 + 1;
+		int ring = random() % num_rings + 1;
 
 		if (ring == I915_EXEC_RENDER) {
 			BEGIN_BATCH(4);
@@ -127,7 +158,7 @@ int main(int argc, char **argv)
 		exit(-1);
 	}
 
-	store_dword_loop();
+	store_dword_loop(fd);
 
 	drm_intel_bo_unreference(target_buffer);
 	intel_batchbuffer_free(batch);
-- 
1.7.9.5
Re: [Intel-gfx] [PATCH 0/2] test cases for the new ring on Haswell
On Wed, 2012-11-14 at 08:23 +, Chris Wilson wrote:
> On Wed, 14 Nov 2012 12:55:54 +0800, Xiang, Haihao <haihao.xi...@intel.com> wrote:
> > From: Xiang, Haihao <haihao.xi...@intel.com>
> >
> > Xiang, Haihao (2):
> >   tests: storedw on VEBOX
> >   Update gem_ring_sync_loop to support VEBOX ring (the 4th ring) on Haswell
>
> Should be using the GET_PARAM to determine support for the various rings.

Thanks for your comment. I have split it into 2 patches: one checks the rings supported by drm/i915, the other tests the new ring.

Thanks
Haihao
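The ring detection suggested here can be sketched without a real device. This is only an illustration, assuming the GETPARAM ioctl is hidden behind a callback: `enum ring_param`, `getparam_fn`, `count_rings` and the two `fake_*` query functions are hypothetical names, not part of libdrm or i-g-t.

```c
#include <assert.h>
#include <stddef.h>

/* HYPOTHETICAL ids standing in for I915_PARAM_HAS_BSD, _BLT and _VEBOX,
 * listed in the order the rings are numbered after the render ring. */
enum ring_param { PARAM_HAS_BSD, PARAM_HAS_BLT, PARAM_HAS_VEBOX, PARAM_COUNT };

/* Stand-in for the GETPARAM ioctl: returns 0 and stores the value on
 * success, nonzero if the kernel does not know the parameter. */
typedef int (*getparam_fn)(enum ring_param param, int *value);

/* Count the usable rings. Probing stops at the first missing ring so the
 * result can be used as a modulus over contiguous ring ids. */
static int count_rings(getparam_fn query)
{
    int num_rings = 1; /* the render ring is always available */

    assert(query != NULL);
    for (int p = PARAM_HAS_BSD; p < PARAM_COUNT; p++) {
        int value = 0;

        if (query((enum ring_param)p, &value) != 0 || value <= 0)
            break;
        num_rings++;
    }
    return num_rings;
}

/* Fake kernel without VEBOX support: the parameter is rejected. */
static int fake_ivb_getparam(enum ring_param param, int *value)
{
    if (param == PARAM_HAS_VEBOX)
        return -1;
    *value = 1;
    return 0;
}

/* Fake kernel that reports every ring as present. */
static int fake_hsw_getparam(enum ring_param param, int *value)
{
    (void)param;
    *value = 1;
    return 0;
}
```

Stopping at the first unsupported ring mirrors the `goto skip` in the test patch: the test picks a ring with `random() % num_rings + 1`, so the count is only meaningful if the detected rings form a contiguous range.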
[Intel-gfx] [PATCH 2/2] gem_ring_sync_loop: test the new ring
From: Xiang, Haihao <haihao.xi...@intel.com>

The code is surrounded by #ifdef...#endif to avoid breaking compilation against the current libdrm release.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 tests/gem_ring_sync_loop.c |   12
 1 file changed, 12 insertions(+)

diff --git a/tests/gem_ring_sync_loop.c b/tests/gem_ring_sync_loop.c
index 2875cf3..955bf34 100644
--- a/tests/gem_ring_sync_loop.c
+++ b/tests/gem_ring_sync_loop.c
@@ -81,6 +81,18 @@ get_num_rings(int fd)
 	else
 		goto skip;
 
+#ifdef I915_PARAM_HAS_VEBOX	/* remove it once the upstream libdrm support VEBOX */
+
+	gp.param = I915_PARAM_HAS_VEBOX;
+	ret = drmIoctl(fd, DRM_IOCTL_I915_GETPARAM, &gp);
+
+	if ((ret == 0) & (*gp.value > 0))
+		num_rings++;
+	else
+		goto skip;
+
+#endif
+
 skip:
 	return num_rings;
 }
-- 
1.7.9.5
Re: [Intel-gfx] [PATCH 1/2] intel: Sync the parameter of i915_getparma with the kernel
On Thu, 2012-11-15 at 12:30 +0100, Daniel Vetter wrote:
> On Wed, Nov 14, 2012 at 12:46:38PM +0800, Xiang, Haihao wrote:
> > From: Zhao Yakui <yakui.z...@intel.com>
> >
> > Signed-off-by: Zhao Yakui <yakui.z...@intel.com>
>
> Fyi the best way is to simply run
>
> $ make headers_install
>
> in the latest kernel tree and copy the resulting userspace header from usr/include/drm/i915_drm.h to libdrm. Otherwise things tend to get out of sync. In the commit message you can then mention up to which kernel commit you've synced.
> -Daniel

Thanks for your comment. But syncing this way would also change existing data structures, such as

-struct drm_i915_gem_cacheing {
+struct drm_i915_gem_caching {

Is this what we want?

Thanks
Haihao

> > ---
> >  include/drm/i915_drm.h |    2 ++
> >  1 file changed, 2 insertions(+)
> >
> > diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
> > index 7e9e9bd..8b069ac 100644
> > --- a/include/drm/i915_drm.h
> > +++ b/include/drm/i915_drm.h
> > @@ -303,6 +303,8 @@ typedef struct drm_i915_irq_wait {
> >  #define I915_PARAM_HAS_LLC		 17
> >  #define I915_PARAM_HAS_ALIASING_PPGTT	 18
> >  #define I915_PARAM_HAS_WAIT_TIMEOUT	 19
> > +#define I915_PARAM_HAS_SEMAPHORES	 20
> > +#define I915_PARAM_HAS_PRIME_VMAP_FLUSH	 21
> >
> >  typedef struct drm_i915_getparam {
> >  	int param;
> > --
> > 1.7.9.5
[Intel-gfx] [PATCH 1/2] intel: Sync the parameter of i915_getparma with the kernel
From: Zhao Yakui <yakui.z...@intel.com>

Signed-off-by: Zhao Yakui <yakui.z...@intel.com>
---
 include/drm/i915_drm.h |    2 ++
 1 file changed, 2 insertions(+)

diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 7e9e9bd..8b069ac 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -303,6 +303,8 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_LLC		 17
 #define I915_PARAM_HAS_ALIASING_PPGTT	 18
 #define I915_PARAM_HAS_WAIT_TIMEOUT	 19
+#define I915_PARAM_HAS_SEMAPHORES	 20
+#define I915_PARAM_HAS_PRIME_VMAP_FLUSH	 21
 
 typedef struct drm_i915_getparam {
 	int param;
-- 
1.7.9.5
[Intel-gfx] [PATCH 2/2] intel: Add support for VEBOX ring (v2)
From: Xiang, Haihao <haihao.xi...@intel.com>

v2: Fix the test for has_vebox

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 include/drm/i915_drm.h   |    2 ++
 intel/intel_bufmgr_gem.c |    9
 2 files changed, 11 insertions(+)

diff --git a/include/drm/i915_drm.h b/include/drm/i915_drm.h
index 8b069ac..2341d2a 100644
--- a/include/drm/i915_drm.h
+++ b/include/drm/i915_drm.h
@@ -305,6 +305,7 @@ typedef struct drm_i915_irq_wait {
 #define I915_PARAM_HAS_WAIT_TIMEOUT	 19
 #define I915_PARAM_HAS_SEMAPHORES	 20
 #define I915_PARAM_HAS_PRIME_VMAP_FLUSH	 21
+#define I915_PARAM_HAS_VEBOX		 22
 
 typedef struct drm_i915_getparam {
 	int param;
@@ -651,6 +652,7 @@ struct drm_i915_gem_execbuffer2 {
 #define I915_EXEC_RENDER                 (1<<0)
 #define I915_EXEC_BSD                    (2<<0)
 #define I915_EXEC_BLT                    (3<<0)
+#define I915_EXEC_VEBOX                  (4<<0)
 
 /* Used for switching the constants addressing mode on gen4+ RENDER ring.
  * Gen6+ only supports relative addressing to dynamic state (default) and
diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
index 512bc6f..758cc52 100644
--- a/intel/intel_bufmgr_gem.c
+++ b/intel/intel_bufmgr_gem.c
@@ -125,6 +125,7 @@ typedef struct _drm_intel_bufmgr_gem {
 	unsigned int has_wait_timeout : 1;
 	unsigned int bo_reuse : 1;
 	unsigned int no_exec : 1;
+	unsigned int has_vebox : 1;
 	bool fenced_relocs;
 
 	FILE *aub_file;
@@ -2210,6 +2211,10 @@ do_exec2(drm_intel_bo *bo, int used, drm_intel_context *ctx,
 		if (!bufmgr_gem->has_bsd)
 			return -EINVAL;
 		break;
+	case I915_EXEC_VEBOX:
+		if (!bufmgr_gem->has_vebox)
+			return -EINVAL;
+		break;
 	case I915_EXEC_RENDER:
 	case I915_EXEC_DEFAULT:
 		break;
@@ -3123,6 +3128,10 @@ drm_intel_bufmgr_gem_init(int fd, int batch_size)
 	} else
 		bufmgr_gem->has_llc = *gp.value;
 
+	gp.param = I915_PARAM_HAS_VEBOX;
+	ret = drmIoctl(bufmgr_gem->fd, DRM_IOCTL_I915_GETPARAM, &gp);
+	bufmgr_gem->has_vebox = (ret == 0) & (*gp.value > 0);
+
 	if (bufmgr_gem->gen < 4) {
 		gp.param = I915_PARAM_NUM_FENCES_AVAIL;
 		gp.value = &bufmgr_gem->available_fences;
-- 
1.7.9.5
[Intel-gfx] [PATCH 0/2] test cases for the new ring on Haswell
From: Xiang, Haihao <haihao.xi...@intel.com>

Xiang, Haihao (2):
  tests: storedw on VEBOX
  Update gem_ring_sync_loop to support VEBOX ring (the 4th ring) on Haswell

 lib/intel_chipset.h            |    2 +
 tests/Makefile.am              |    1 +
 tests/gem_ring_sync_loop.c     |   18
 tests/gem_storedw_loop_vebox.c |  153
 4 files changed, 171 insertions(+), 3 deletions(-)
 create mode 100644 tests/gem_storedw_loop_vebox.c

-- 
1.7.9.5
Re: [Intel-gfx] REg: Doubt in drm kernel driver
No, you don't need to modify the drm kernel for the decoded output data. You can directly map the corresponding buffer (GEM buffer) after executing the batchbuffer. BTW, the decoded frame is specified in MFX_PIPE_BUF_ADDR_STATE too.

Thanks
Haihao

> Hi all,
>
> Anybody has any suggestions or any information for my doubts below, it will be really helpful.
>
> Thanks and regards
> Srinath.D
>
> From: Duraisamy, Srinath
> Sent: Wednesday, November 07, 2012 8:41 PM
> To: intel-gfx@lists.freedesktop.org
> Subject: REg: Doubt in drm kernel driver
>
> Hi all,
>
> I am working on a project related to MFX part of Gen7 GPU. I am working on dumping the input commands and data being passed to MFX pipeline while decoding. And to dump the decoded output data from the MFX pipeline.
>
> I am using mplayer to decode the h264 file with vaapi support to use the MFX pipeline. I have modified the intel_driver code - gen7_mfd.c file, and I am able to dump the input commands and data being sent to the MFX into a file. Now I need to dump the decoded output of the MFX pipeline. I am not able to find it in user space intel-driver and drm driver code.
>
> I believe we need to modify the drm kernel driver code for dumping the decoded output data. It will be really helpful if anyone can provide me information on where to modify the drm kernel driver code to dump the following data
>
> 1. The output of the current decoded frame.
> 2. The output of the decoded motion vectors of the current frame. The decoded motion vector will be stored in address passed in “MFX_AVC_DIRECTMODE_STATE” command.
> 3. The motion vector information for the reference frame needed for decoding the current frame. This information will be in the address passed in “MFX_AVC_REF_IDX_STATE” command.
> 4. The decoder reference picture data used for motion compensation. These data can be read from the “MFX_PIPE_BUF_ADDR_STATE“ command.
>
> Please let me know if you have any questions. Thanks in advance.
>
> Srinath
Re: [Intel-gfx] Question regarding libva encoding
The support for Main/High profile has been done in the staging branch. We will merge the interfaces for Main/High profile back into the master branch.

Thanks
Haihao

> Hello my name is Charlie Good and I am the CTO of Wowza Media System. We are the authors of Wowza Media Server. Our product includes a transcoder for transcoding incoming streams to adaptive bitrate stream sets. We are only using the AVC/H.264 encoder at this time. We are looking to use libva for accelerated encoding on Linux leveraging the Quick Sync technology. We are already doing this on Windows using the Intel Media SDK.
>
> I have an implementation that is currently working when using the baseline profile. I would also like to support main and high profile. It looks like these profiles are not yet implemented. I can see this in gen6_mfc.c in the pipeline code where it looks like only baseline encoding is supported:
>
> VAStatus
> gen6_mfc_pipeline(VADriverContextP ctx,
>                   VAProfile profile,
>                   struct encode_state *encode_state,
>                   struct gen6_encoder_context *gen6_encoder_context)
> {
>     VAStatus vaStatus;
>
>     switch (profile) {
>     case VAProfileH264Baseline:
>         vaStatus = gen6_mfc_avc_encode_picture(ctx, encode_state, gen6_encoder_context);
>         break;
>
>         /* FIXME: add for other profile */
>     default:
>         vaStatus = VA_STATUS_ERROR_UNSUPPORTED_PROFILE;
>         break;
>     }
>
>     return vaStatus;
> }
>
> Is there a plan to add support for main and high encoding in a future release? Do you have any estimate of when this might be added?
>
> BTW, I am a huge fan of Intel Quick Sync. Very cool technology. Amazing performance and quality.
>
> Charlie
Re: [Intel-gfx] [libva] GPU hung
On Mon, 2012-07-02 at 18:49 +, Christophe Oosterlynck wrote:
> Hi,
>
> Is there any update on this issue or has a bug been reported? I seem to have a similar issue ([drm:i915_hangcheck_hung] *ERROR* Hangcheck timer) when using vaapi with gstreamer.

https://bugs.freedesktop.org/show_bug.cgi?id=51061

Angela and I can't reproduce this issue with MPlayer vaapi. Could you also give it a try?

Thanks
Haihao

> Best regards,
> Christophe
Re: [Intel-gfx] [libva] GPU hung
Hi, Angela Could you file a bug and provide more details how to reproduce this issue ? Thanks Haihao Am Mittwoch, den 13.06.2012, 00:09 +0200 schrieb Angela: For gpu hangs the important thing is the i915_error_state file from sysfs (the files you've attached are mainly interesting for modeset issues). I guess the best thing would be to file a bug on bugs.freedesktop.org with that. # cat /sys/kernel/debug/dri/0/i915_error_state no error state collected # cat /sys/kernel/debug/dri/64/i915_error_state no error state collected I copied the wrong line from mount, debugfs is always mounted on Ubuntu none on /sys/kernel/debug type debugfs (rw) Still, I don't have any output right after the crash, see above. Tried several times 1080i recordings, either no error state collected Below error is with a 720p recording, which behaves and looks different cat .../debug/dri/0/i915_error_state, blocks Jun 13 17:35:39 minerva11 kernel: [68682.433743] [drm:i915_driver_open], Jun 13 17:35:39 minerva11 kernel: [68682.691256] [drm:i915_driver_open], Jun 13 17:35:39 minerva11 kernel: [68682.860834] [drm:i915_driver_open], Jun 13 17:35:40 minerva11 kernel: [68683.185483] [drm:drm_mode_addfb], [FB:27] Jun 13 17:35:46 minerva11 kernel: [68689.196485] [drm:i915_hangcheck_hung] *ERROR* Hangcheck timer elapsed... 
GPU hung Jun 13 17:35:46 minerva11 kernel: [68689.196946] [drm] capturing error event; look for more information in /debug/dri/0/i915_error_state Jun 13 17:35:46 minerva11 kernel: [68689.199780] [drm:i915_error_work_func], resetting chip Jun 13 17:35:46 minerva11 kernel: [68689.199843] [drm:drm_crtc_helper_set_config], Jun 13 17:35:46 minerva11 kernel: [68689.199845] [drm:drm_crtc_helper_set_config], [CRTC:3] [NOFB] Jun 13 17:35:46 minerva11 kernel: [68689.199867] [drm:ironlake_crtc_dpms], crtc 0/0 dpms off Jun 13 17:35:46 minerva11 kernel: [68689.199869] [drm:i915_get_vblank_timestamp], crtc 0 is disabled Jun 13 17:35:46 minerva11 kernel: [68689.205723] [drm:intel_prepare_page_flip], preparing flip with no unpin work? Jun 13 17:35:46 minerva11 udevd[19509]: failed to execute '/usr/share/apport/apport-gpu-error-intel.py' '/usr/share/apport/apport-gpu-error-intel.py': No such file or directory Jun 13 17:36:46 dhclient: last message repeated 4 times Jun 13 17:36:46 minerva11 kernel: [68689.252474] [drm:intel_disable_pch_pll], disable PCH PLL c6014 (active 1, on? 1) for crtc 3 Jun 13 17:36:46 minerva11 kernel: [68689.252476] [drm:intel_disable_pch_pll], disabling PCH PLL c6014 Jun 13 17:36:46 minerva11 kernel: [68689.252882] [drm:intel_update_fbc], Jun 13 17:36:46 minerva11 kernel: [68689.252906] [ cut here ] Jun 13 17:36:46 minerva11 kernel: [68689.253355] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:3084! 
Jun 13 17:36:46 minerva11 kernel: [68689.253795] invalid opcode: [#1] SMP Jun 13 17:36:46 minerva11 kernel: [68689.254227] CPU 3 Jun 13 17:36:46 minerva11 kernel: [68689.254672] Modules linked in: des_generic md4 nls_utf8 cifs xts gf128mul autofs4 dm_crypt binfmt_misc ipt_MASQUERADE xt_conntrack snd_hda_codec_hdmi snd_hda_codec_realtek iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables tda18271c2dd(O) snd_hda_intel snd_hda_codec snd_hwdep snd_pcm snd_seq_midi snd_rawmidi snd_seq_midi_event drxk(O) snd_seq arc4 ath9k mac80211 ath9k_common ath9k_hw eeepc_wmi ddbridge(O) snd_timer snd_seq_device dvb_core(O) asus_wmi cxd2099(O) coretemp hid_generic ath btusb snd mei bluetooth psmouse serio_raw lpc_ich sparse_keymap soundcore cfg80211 snd_page_alloc microcode lp parport usbhid hid usb_storage uas 8139too mxm_wmi 8139cp ghash_clmulni_intel aesni_intel cryptd firewire_ohci aes_x86_64 firewire_core crc_itu_t i915 drm_kms_helper drm ahci e1000e libahci i2c_algo_bit xhci_hcd video wmi [last unloaded: kvm] Jun 13 17:36:46 minerva11 kernel: [68689.256832] Jun 13 17:36:46 miJun 13 17:44:59 minerva11 kernel: imklog 5.8.6, log source = /proc/kmsg started. Again with 1080i, I also noticed the stacktrace is not always the same, however ends always at the same point (intel_unpin_fb_obj): Jun 13 18:13:36 minerva11 kernel: [ 1737.267614] [ cut here ] Jun 13 18:13:36 minerva11 kernel: [ 1737.267630] kernel BUG at drivers/gpu/drm/i915/i915_gem.c:3084! 
Jun 13 18:13:36 minerva11 kernel: [ 1737.267646] invalid opcode: [#1] SMP Jun 13 18:13:36 minerva11 kernel: [ 1737.267657] CPU 2 Jun 13 18:13:36 minerva11 kernel: [ 1737.267683] Modules linked in: xts gf128mul autofs4 binfmt_misc dm_crypt btusb ipt_MASQUERADE xt_conntrack iptable_nat nf_nat nf_conntrack_ipv4 nf_conntrack nf_defrag_ipv4 iptable_filter ip_tables x_tables snd_hda_codec_hdmi snd_hda_codec_realtek tda18271c2dd(O) arc4 ath9k mac80211 snd_hda_intel ath9k_common snd_hda_codec snd_hwdep snd_pcm snd_seq_midi ath9k_hw
Re: [Intel-gfx] [Q77 express][x86_64][DRI][DRM/intel]Xorg loads intel_drv.soerror, IVYBRIDGE_S_GT2
On Thu, 2012-05-24 at 08:40 +0800, 袁竞杰 wrote:
> At last, I found the answer. I was using libdrm-2.4.27, which does not support IVYBRIDGE_S_GT2. Using the latest libdrm-2.4.34 solved this problem.
>
> VAAPI still complained that it cannot open i965_drv_video.so. I found that the latest intel-driver-1.0.17 doesn't support IVYBRIDGE_S_GT2 either. I added #define PCI_CHIP_IVYBRIDGE_S_GT2 016a (line 175) and devid==PCI_CHIP_IVYBRIDGE_S_GT2 (line 205) in intel-driver-1.0.17/src/intel_driver.h. Now it works!

The support for this GPU was added to the master branch a month ago. The master branch contains more up-to-date code; you can give it a try.

Thanks
Haihao

> Ginger Yuan
>
> From: 袁竞杰
> Sent: 2012-05-17 09:47
> To: intel-gfx
> Subject: [Q77 express][x86_64][DRI][DRM/intel]Xorg loads intel_drv.soerror
>
> > System environment:
> > xf86-video-intel: 2.19.0
> > intel-driver (for vaapi): 1.0.15
> > libva: 1.0.15
> > libdrm: 2.4.32
> > xserver: 11.0
> > Mesa: 8.0.2
> > kernel version: 3.4 rc6
> > system: ubuntu 12.04
> > VGA: intel Ivy Bridge Graphics Controller
> > lspci: 00:02.0 0300: 8086:016a (rev 08)
> >
> > Bug description:
> > In order to use VAAPI to accelerate decoding on Ivybridge, I installed xf86-video-intel 2.19, which supports PCI_DEVICE_ID_INTEL_IVYBRIDGE_S_GT2_IG, and restarted the system. The system stopped with a blank screen after loading intel_drv.so. Xorg.0.log is in the attachment.
> >
> > By the way, there seems to be little information about VAAPI apart from the source code. Could you send me some documents about VAAPI or Intel graphics card decoding? Thank you very much!
> >
> > End of Xorg.0.log
> > [12.259] (II) Loading extension DRI2
> > [12.259] (II) LoadModule: intel
> > [12.344] (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
> > [12.344] (II) Module intel: vendor=X.Org Foundation
> > [12.344] compiled for 1.11.3, module version = 2.19.0
> > [12.344] Module class: X.Org Video Driver
> > [12.344] ABI class: X.Org Video Driver, version 11.0
> > [12.344] (II) intel: Driver for Intel Integrated Graphics Chipsets: i810, i810-dc100, i810e, i815, i830M, 845G, 854, 852GM/855GM, 865G, 915G, E7221 (i915), 915GM, 945G, 945GM, 945GME, Pineview GM, Pineview G, 965G, G35, 965Q, 946GZ, 965GM, 965GME/GLE, G33, Q35, Q33, GM45, 4 Series, G45/G43, Q45/Q43, G41, B43, B43, Clarkdale, Arrandale, Sandybridge Desktop (GT1), Sandybridge Desktop (GT2), Sandybridge Desktop (GT2+), Sandybridge Mobile (GT1), Sandybridge Mobile (GT2), Sandybridge Mobile (GT2+), Sandybridge Server, Ivybridge Mobile (GT1), Ivybridge Mobile (GT2), Ivybridge Desktop (GT1), Ivybridge Desktop (GT2), Ivybridge Server, Ivybridge Server (GT2)
> > [12.344] (++) using VT number 7
> > [12.345] (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
> > [12.345] drmOpenDevice: node name is /dev/dri/card0
> > [12.345] drmOpenDevice: open result is 9, (OK)
> > [12.345] drmOpenByBusid: Searching for BusID pci::00:02.0
> > [12.345] drmOpenDevice: node name is /dev/dri/card0
> > [12.345] drmOpenDevice: open result is 9, (OK)
> > [12.345] drmOpenByBusid: drmOpenMinor returns 9
> > [12.345] drmOpenByBusid: drmGetBusid reports pci::00:02.0
> > [12.345] (II) intel(0): Creating default Display subsection in Screen section Default Screen for depth/fbbpp 24/32
> > [12.345] (==) intel(0): Depth 24, (--) framebuffer bpp 32
> > [12.345] (==) intel(0): RGB weight 888
> > [12.345] (==) intel(0): Default visual is TrueColor
> > [12.345] (II) intel(0): Integrated Graphics Chipset: Intel(R) Ivybridge Server (GT2)
> > [12.345] (--) intel(0): Chipset: Ivybridge Server (GT2)
Re: [Intel-gfx] Does intel GM35 support libva?
The libva driver for Intel doesn't support GM35.

Thanks
Haihao

> Howdy,
>
> Xorg says:
> [ 36196.880] (II) Loading /usr/lib/xorg/modules/drivers/intel_drv.so
> [ 36196.880] drmOpenDevice: node name is /dev/dri/card0
> [ 36196.880] drmOpenDevice: open result is 9, (OK)
> [ 36196.880] drmOpenByBusid: Searching for BusID pci::00:02.0
> [ 36196.880] drmOpenDevice: node name is /dev/dri/card0
> [ 36196.880] drmOpenDevice: open result is 9, (OK)
> [ 36196.880] drmOpenByBusid: drmOpenMinor returns 9
> [ 36196.880] drmOpenByBusid: drmGetBusid reports pci::00:02.0
> [ 36196.880] (**) intel(0): Depth 24, (--) framebuffer bpp 32
> [ 36196.880] (==) intel(0): RGB weight 888
> [ 36196.880] (==) intel(0): Default visual is TrueColor
> [ 36196.880] (II) intel(0): Integrated Graphics Chipset: Intel(R) G35
> [ 36196.880] (--) intel(0): Chipset: G35
>
> myth:~$ vainfo
> libva: libva version 0.32.0
> libva: va_getDriverName() returns 0
> libva: Trying to open /usr/lib/dri/i965_drv_video.so
> libva error: /usr/lib/dri/i965_drv_video.so init failed
> libva: va_openDriver() returns -1
> vaInitialize failed with error code -1 (unknown libva error),exit
>
> Is this an error on my system, or my chip is just not supported?
> http://www.x.org/wiki/IntelGraphicsDriver says 'unknown' :)
>
> Thanks,
> Marc
Re: [Intel-gfx] VA-API brightness property
> Please, can anyone tell me if there is a list for user questions about VAAPI? I have several questions...

The list for VAAPI is li...@lists.freedesktop.org

As for the brightness property, the driver doesn't support it.

Thanks
Haihao

> Greets,
> Kiste
>
> On 09.03.2012 08:04, Oliver Seitz wrote:
> > Hi!
> >
> > I'm using the patched, VA-API enabled MPlayer on SandyBridge. System is Debian Wheezy. Works great, decodes BluRay-like quality with negligible CPU load (below 5%).
> >
> > Now, I'm trying to adjust brightness. I think I've read that VA-API defines a brightness property, but MPlayer can not set it. Is this feature still missing in the driver, or is MPlayer calling for it in the wrong way?
> >
> > Thanks for any hints!
> >
> > Greets,
> > Kiste
Re: [Intel-gfx] VAAPI (master or ext) no deinterlacing with Clarkdale GPU
Hi,

I know what the problem is. For some reason, the native pixel format for MPEG-2 decoding on Clarkdale is I420, whereas the input pixel format of deinterlacing in the driver is NV12, so the driver doesn't support deinterlacing for MPEG-2 on Clarkdale. We will try to fix this issue, but don't expect it too soon.

BTW you can send all VAAPI related mail to li...@lists.freedesktop.org as well.

Thanks
Haihao

> Atechsystem <Atechsystem at freenet.de> writes:
> > Hello,
> >
> > I've written an email to Haihao Xiang regarding the "no deinterlacing"
> > bug on Clarkdale a week ago and he answered today. He will check this
> > issue. I hope he can fix it.
> >
> > Will the extended vaapi-ext deinterlacers (temporal or spatial I guess)
> > also be available on the Clarkdale platform? I tried the extended DXVA
> > hardware deinterlacing on Windows today and it worked fine on my Core i3
> > Clarkdale laptop CPU :)
> >
> > Best regards
> > Atech
>
> Hi Atech,
>
> same problem over here. I am on vaapi-ext and latest xbmc. I can select
> Bob and Bob (inverted) as deinterlacers in xbmc. With MPEG2 material
> deinterlacing doesn't work at all. With H.264 video it seems to
> deinterlace, but video is stuttering and I have lots of frame drops.
>
> Hardware: Core i3 530
> Software: Ubuntu 11.11 x86 minimal with latest xorg stuff (edgers:ppa)
> and vaapi-ext branch for libva and intel-vaapi
>
> Regards,
> Christoph / Flachzange
Re: [Intel-gfx] vaapi intel-driver (vaapi-ext): assertion failed
On Tue, 2012-01-31 at 17:32 +0100, Christoph Evers wrote:
> Hi folks,
>
> i am not sure whether this issue belongs to the libva mailinglist or
> this one. I'll give it a try :-)

You can send all VAAPI related mail to li...@lists.freedesktop.org

> For testing purpose I switched to vaapi-ext branch of intel-driver and
> libva (master is working fine). I am using xine-lib-vaapi for playback
> with xine which uses a vaapi accelerated ffmpeg:
> https://github.com/huceke/xine-lib-vaapi
>
> H.264 (non interlaced) files are played fine, but with MPEG2
> (interlaced/non-interlaced) the vaapi intel driver throws an exception:

Do you mean playing with a frame coded video also triggers the Assert?

> xine: i965_drv_video.c:2075: i965_check_alloc_surface_bo: assertion
> obj_surface->fourcc == fourcc failed.
>
> Obviously, the obj_surface fourcc (NV12) does not fit to the fourcc of
> the media (I420).

It seems there are other operations on this surface before decoding in xine-lib-vaapi. I can't reproduce this issue with mplayer-vaapi.

> If I force NV12 in i965_media_mpeg2.c:517 playback is working (even
> deinterlaced on Clarkdale) but naturally with the wrong color space.

The native pixel format for MPEG-2 decoding on Core i3 530 is I420.

> Do you guys have any idea?
>
> Christoph
>
> Hardware: Core i3 530
> Software: Ubuntu 11.11 x86 minimal with latest xorg stuff (edgers:ppa)
> and vaapi-ext branch for libva and intel-vaapi (git 30012012)
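The assertion discussed above fires when a surface that already carries one pixel format is asked to hold another. A minimal sketch of that check follows, reporting the mismatch instead of aborting. All names here (`sketch_surface`, `sketch_check_alloc`, the `SKETCH_FOURCC` macro) are hypothetical and are not taken from the i965 driver; only the NV12/I420 pairing comes from the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Pack four characters into a little-endian FOURCC code. */
#define SKETCH_FOURCC(a, b, c, d) \
    ((uint32_t)(a) | ((uint32_t)(b) << 8) | \
     ((uint32_t)(c) << 16) | ((uint32_t)(d) << 24))

#define SKETCH_FOURCC_NV12 SKETCH_FOURCC('N', 'V', '1', '2')
#define SKETCH_FOURCC_I420 SKETCH_FOURCC('I', '4', '2', '0')

/* Hypothetical stand-in for the driver's surface object. */
struct sketch_surface {
    uint32_t fourcc; /* 0 means no buffer has been allocated yet */
};

/*
 * Bind a pixel format to the surface on first use. Returns 0 on success
 * and -1 if the surface was already allocated with a different format,
 * which is the condition the quoted assertion guards against.
 */
static int sketch_check_alloc(struct sketch_surface *s, uint32_t fourcc)
{
    assert(s != NULL);

    if (s->fourcc == 0) {
        s->fourcc = fourcc;
        return 0;
    }
    return s->fourcc == fourcc ? 0 : -1;
}
```

In this sketch, a surface first touched as NV12 (for example by a render or post-processing step) and later handed to an I420 decode would return -1, which mirrors the situation suspected in xine-lib-vaapi.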
[Intel-gfx] [PATCH 0/5] enable Xv on Ivybridge
Xiang, Haihao (5):
  Xv: separate fragments from M4 macros
  Xv: New shaders for Xv on Ivybridge
  Xv: update SURFACE_STATE SAMPLER_STATE for Xv on Ivybridge
  Xv: upload new shaders to GEM objects for Xv on Ivybridge
  Xv: set up pipeline for Xv on Ivybridge

 configure.ac                                    |    2 +-
 src/brw_structs.h                               |  124 +
 src/i965_reg.h                                  |  132 +
 src/i965_video.c                                |  632 +--
 src/render_program/Makefile.am                  |   38 ++-
 src/render_program/exa_wm_affine.g6i            |   35 ++
 src/render_program/exa_wm_mask_affine.g6a       |    8 +-
 src/render_program/exa_wm_sample_planar.g4i     |   64 +++
 src/render_program/exa_wm_src_affine.g6a        |    8 +-
 src/render_program/exa_wm_src_affine.g7a        |   41 ++
 src/render_program/exa_wm_src_affine.g7b        |    4 +
 src/render_program/exa_wm_src_sample_argb.g4a   |   18 +-
 src/render_program/exa_wm_src_sample_argb.g4i   |   44 ++
 src/render_program/exa_wm_src_sample_argb.g7a   |   38 ++
 src/render_program/exa_wm_src_sample_argb.g7b   |    3 +
 src/render_program/exa_wm_src_sample_planar.g4a |   36 +--
 src/render_program/exa_wm_src_sample_planar.g7a |   38 ++
 src/render_program/exa_wm_src_sample_planar.g7b |    5 +
 src/render_program/exa_wm_write.g6a             |   38 +--
 src/render_program/exa_wm_write.g6i             |   61 +++
 src/render_program/exa_wm_write.g7a             |   41 ++
 src/render_program/exa_wm_write.g7b             |   17 +
 src/render_program/exa_wm_yuv_rgb.g7a           |    1 +
 src/render_program/exa_wm_yuv_rgb.g7b           |   12 +
 24 files changed, 1292 insertions(+), 148 deletions(-)
 create mode 100644 src/render_program/exa_wm_affine.g6i
 create mode 100644 src/render_program/exa_wm_sample_planar.g4i
 create mode 100644 src/render_program/exa_wm_src_affine.g7a
 create mode 100644 src/render_program/exa_wm_src_affine.g7b
 create mode 100644 src/render_program/exa_wm_src_sample_argb.g4i
 create mode 100644 src/render_program/exa_wm_src_sample_argb.g7a
 create mode 100644 src/render_program/exa_wm_src_sample_argb.g7b
 create mode 100644 src/render_program/exa_wm_src_sample_planar.g7a
 create mode 100644 src/render_program/exa_wm_src_sample_planar.g7b
 create mode 100644 src/render_program/exa_wm_write.g6i
 create mode 100644 src/render_program/exa_wm_write.g7a
 create mode 100644 src/render_program/exa_wm_write.g7b
 create mode 12 src/render_program/exa_wm_yuv_rgb.g7a
 create mode 100644 src/render_program/exa_wm_yuv_rgb.g7b
[Intel-gfx] [PATCH 1/5] Xv: separate fragments from M4 macros
It is to prepare for Xv on Ivybridge. The difference from Sandybridge is that all message payload must be in GRF registers instead of MRF registers on Ivybridge. We will only redefine some M4 macros for Ivybridge Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/render_program/Makefile.am | 13 - src/render_program/exa_wm_affine.g6i| 35 src/render_program/exa_wm_mask_affine.g6a |8 +--- src/render_program/exa_wm_sample_planar.g4i | 64 +++ src/render_program/exa_wm_src_affine.g6a|8 +--- src/render_program/exa_wm_src_sample_argb.g4a | 18 +-- src/render_program/exa_wm_src_sample_argb.g4i | 44 src/render_program/exa_wm_src_sample_planar.g4a | 36 + src/render_program/exa_wm_write.g6a | 38 +- src/render_program/exa_wm_write.g6i | 61 + 10 files changed, 219 insertions(+), 106 deletions(-) create mode 100644 src/render_program/exa_wm_affine.g6i create mode 100644 src/render_program/exa_wm_sample_planar.g4i create mode 100644 src/render_program/exa_wm_src_sample_argb.g4i create mode 100644 src/render_program/exa_wm_write.g6i diff --git a/src/render_program/Makefile.am b/src/render_program/Makefile.am index 1a19437..8e48d27 100644 --- a/src/render_program/Makefile.am +++ b/src/render_program/Makefile.am @@ -20,7 +20,9 @@ INTEL_G4A = \ INTEL_G4I =\ exa_wm.g4i \ exa_wm_affine.g4i \ - exa_wm_projective.g4i + exa_wm_projective.g4i \ + exa_wm_sample_planar.g4i\ + exa_wm_src_sample_argb.g4i INTEL_G4B =\ exa_sf.g4b \ @@ -61,6 +63,10 @@ INTEL_G4B_GEN5 = \ exa_wm_yuv_rgb.g4b.gen5 \ exa_wm_xy.g4b.gen5 +INTEL_G6I =\ + exa_wm_affine.g6i \ + exa_wm_write.g6i + INTEL_G6A =\ exa_wm_src_affine.g6a \ exa_wm_src_projective.g6a \ @@ -99,7 +105,8 @@ EXTRA_DIST = \ $(INTEL_G4B)\ $(INTEL_G4B_GEN5)\ $(INTEL_G6A)\ - $(INTEL_G6B) + $(INTEL_G6B)\ + $(INTEL_G6I) if HAVE_GEN4ASM @@ -111,7 +118,7 @@ SUFFIXES = .g4a .g4b .g6a .g6b m4 -I$(srcdir) -s $ $*.g6m intel-gen4asm -g 6 -o $@ $*.g6m rm $*.g6m $(INTEL_G4B): $(INTEL_G4I) -$(INTEL_G6B): $(INTEL_G4I) +$(INTEL_G6B): $(INTEL_G4I) $(INTEL_G6I) 
BUILT_SOURCES= $(INTEL_G4B) $(INTEL_G6B) diff --git a/src/render_program/exa_wm_affine.g6i b/src/render_program/exa_wm_affine.g6i new file mode 100644 index 000..9ac21d5 --- /dev/null +++ b/src/render_program/exa_wm_affine.g6i @@ -0,0 +1,35 @@ +/* + * Copyright © 2010-2011 Intel Corporation + * + * Permission is hereby granted, free of charge, to any person obtaining a + * copy of this software and associated documentation files (the Software), + * to deal in the Software without restriction, including without limitation + * the rights to use, copy, modify, merge, publish, distribute, sublicense, + * and/or sell copies of the Software, and to permit persons to whom the + * Software is furnished to do so, subject to the following conditions: + * + * The above copyright notice and this permission notice (including the next + * paragraph) shall be included in all copies or substantial portions of the + * Software. + * + * THE SOFTWARE IS PROVIDED AS IS, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR + * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, + * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL + * THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER + * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING + * FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS + * IN THE SOFTWARE. 
+ * + */ + +/* + * Fragment to compute src u/v values + */ + +/* U */ +pln (8) ul1F a0_a_x bl { align1 }; /* pixel 0-7 */ +pln (8) uh1F a0_a_x bh { align1 }; /* pixel 8-15 */ + +/* V */ +pln (8) vl1F a0_a_y bl { align1 }; /* pixel 0-7 */ +pln (8) vh1F a0_a_y bh { align1 }; /* pixel 8-15 */ diff --git a/src/render_program/exa_wm_mask_affine.g6a b/src/render_program/exa_wm_mask_affine.g6a index 2daf4e2..04ad2a2 100644 --- a/src/render_program/exa_wm_mask_affine.g6a +++ b/src/render_program/exa_wm_mask_affine.g6a @@ -38,10 +38,4 @@ define(`bh',`g4.08,8,1F') define(`a0_a_x',`g8.00,1,0F') define(`a0_a_y',`g8.160,1,0F') -/* U */ -pln (8) ul1F a0_a_x bl { align1 }; /* pixel 0-7 */ -pln (8) uh1F a0_a_x bh { align1 }; /* pixel 8-15 */ - -/* V */ -pln (8) vl1F a0_a_y bl { align1 }; /* pixel 0-7 */ -pln (8) vh1F a0_a_y bh { align1 }; /* pixel 8-15 */ +include
[Intel-gfx] [PATCH 4/5] Xv: upload new shaders to GEM objects for Xv on Ivybridge
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/i965_video.c |   36 ++++++++++++++++++++++++++++++------
 1 files changed, 30 insertions(+), 6 deletions(-)

diff --git a/src/i965_video.c b/src/i965_video.c
index 9fbba40..84230a1 100644
--- a/src/i965_video.c
+++ b/src/i965_video.c
@@ -149,6 +149,21 @@ static const uint32_t ps_kernel_planar_static_gen6[][4] = {
 #include "exa_wm_write.g6b"
 };
 
+/* programs for Ivybridge */
+static const uint32_t ps_kernel_packed_static_gen7[][4] = {
+#include "exa_wm_src_affine.g7b"
+#include "exa_wm_src_sample_argb.g7b"
+#include "exa_wm_yuv_rgb.g7b"
+#include "exa_wm_write.g7b"
+};
+
+static const uint32_t ps_kernel_planar_static_gen7[][4] = {
+#include "exa_wm_src_affine.g7b"
+#include "exa_wm_src_sample_planar.g7b"
+#include "exa_wm_yuv_rgb.g7b"
+#include "exa_wm_write.g7b"
+};
+
 #ifndef MAX2
 #define MAX2(a,b) ((a) > (b) ? (a) : (b))
 #endif
@@ -1459,28 +1474,37 @@ gen6_create_vidoe_objects(ScrnInfoPtr scrn)
 {
 	intel_screen_private *intel = intel_get_screen_private(scrn);
 	drm_intel_bo *(*create_sampler_state)(ScrnInfoPtr);
-
+	const uint32_t *packed_ps_kernel, *planar_ps_kernel;
+	unsigned int packed_ps_size, planar_ps_size;
+
 	if (INTEL_INFO(intel)->gen >= 70) {
 		create_sampler_state = gen7_create_sampler_state;
+		packed_ps_kernel = &ps_kernel_packed_static_gen7[0][0];
+		packed_ps_size = sizeof(ps_kernel_packed_static_gen7);
+		planar_ps_kernel = &ps_kernel_planar_static_gen7[0][0];
+		planar_ps_size = sizeof(ps_kernel_planar_static_gen7);
 	} else {
 		create_sampler_state = i965_create_sampler_state;
+		packed_ps_kernel = &ps_kernel_packed_static_gen6[0][0];
+		packed_ps_size = sizeof(ps_kernel_packed_static_gen6);
+		planar_ps_kernel = &ps_kernel_planar_static_gen6[0][0];
+		planar_ps_size = sizeof(ps_kernel_planar_static_gen6);
 	}
-
 	if (intel->video.gen4_sampler_bo == NULL)
 		intel->video.gen4_sampler_bo = create_sampler_state(scrn);
 
 	if (intel->video.wm_prog_packed_bo == NULL)
 		intel->video.wm_prog_packed_bo =
 			i965_create_program(scrn,
-					    &ps_kernel_packed_static_gen6[0][0],
-					    sizeof(ps_kernel_packed_static_gen6));
+					    packed_ps_kernel,
+					    packed_ps_size);
 
 	if (intel->video.wm_prog_planar_bo == NULL)
 		intel->video.wm_prog_planar_bo =
 			i965_create_program(scrn,
-					    &ps_kernel_planar_static_gen6[0][0],
-					    sizeof(ps_kernel_planar_static_gen6));
+					    planar_ps_kernel,
+					    planar_ps_size);
 
 	if (intel->video.gen4_cc_vp_bo == NULL)
 		intel->video.gen4_cc_vp_bo = i965_create_cc_vp_state(scrn);
-- 
1.7.0.4
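The idea in the patch above is to pick a kernel table and its byte size once, based on the hardware generation, and then pass that pair to the upload routine. A minimal stand-alone sketch of the selection step is shown below. The table contents are placeholders (the real tables come from the generated .g6b/.g7b includes), and `sketch_program` and `sketch_select_program` are hypothetical names; only the `gen >= 70` threshold is taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Placeholder kernels: each row stands for one 128-bit instruction. */
static const uint32_t sketch_ps_kernel_gen6[][4] = {
    { 0x1, 0x2, 0x3, 0x4 },
};

static const uint32_t sketch_ps_kernel_gen7[][4] = {
    { 0x5, 0x6, 0x7, 0x8 },
    { 0x9, 0xa, 0xb, 0xc },
};

/* Pointer to the first dword of a kernel plus its size in bytes. */
struct sketch_program {
    const uint32_t *kernel;
    size_t size;
};

/* Choose the kernel for the given generation (60 = gen6, 70 = gen7). */
static struct sketch_program sketch_select_program(int gen)
{
    struct sketch_program p;

    if (gen >= 70) {
        p.kernel = &sketch_ps_kernel_gen7[0][0];
        p.size = sizeof(sketch_ps_kernel_gen7);
    } else {
        p.kernel = &sketch_ps_kernel_gen6[0][0];
        p.size = sizeof(sketch_ps_kernel_gen6);
    }

    /* Every instruction is four dwords, so the size must be a multiple. */
    assert(p.size % (4 * sizeof(uint32_t)) == 0);
    return p;
}
```

Note that `sizeof` is applied to the array itself rather than to the pointer, which is why the size has to be captured at the point where the table is chosen.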
[Intel-gfx] [PATCH 5/5] Xv: set up pipeline for Xv on Ivybridge
The configuration is same as that on Sandybridge, but many state commands are changed Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/i965_reg.h | 132 src/i965_video.c | 446 +++--- 2 files changed, 554 insertions(+), 24 deletions(-) diff --git a/src/i965_reg.h b/src/i965_reg.h index df41fba..ab6c020 100644 --- a/src/i965_reg.h +++ b/src/i965_reg.h @@ -136,6 +136,138 @@ # define GEN6_3DSTATE_MULTISAMPLE_NUMSAMPLES_4 (2 1) # define GEN6_3DSTATE_MULTISAMPLE_NUMSAMPLES_8 (3 1) +/* on GEN7+ */ +/* _3DSTATE_VERTEX_BUFFERS on GEN7*/ +/* DW1 */ +#define GEN7_VB0_ADDRESS_MODIFYENABLE (1 14) + +/* _3DPRIMITIVE on GEN7 */ +/* DW1 */ +# define GEN7_3DPRIM_VERTEXBUFFER_ACCESS_SEQUENTIAL (0 8) +# define GEN7_3DPRIM_VERTEXBUFFER_ACCESS_RANDOM (1 8) + +/* 3DSTATE_WM on GEN7 */ +/* DW1 */ +# define GEN7_WM_STATISTICS_ENABLE (1 31) +# define GEN7_WM_DEPTH_CLEAR(1 30) +# define GEN7_WM_DISPATCH_ENABLE(1 29) +# define GEN6_WM_DEPTH_RESOLVE (1 28) +# define GEN7_WM_HIERARCHICAL_DEPTH_RESOLVE (1 27) +# define GEN7_WM_KILL_ENABLE(1 25) +# define GEN7_WM_PSCDEPTH_OFF (0 23) +# define GEN7_WM_PSCDEPTH_ON(1 23) +# define GEN7_WM_PSCDEPTH_ON_GE (2 23) +# define GEN7_WM_PSCDEPTH_ON_LE (3 23) +# define GEN7_WM_USES_SOURCE_DEPTH (1 20) +# define GEN7_WM_USES_SOURCE_W (1 19) +# define GEN7_WM_POSITION_ZW_PIXEL (0 17) +# define GEN7_WM_POSITION_ZW_CENTROID (2 17) +# define GEN7_WM_POSITION_ZW_SAMPLE (3 17) +# define GEN7_WM_NONPERSPECTIVE_SAMPLE_BARYCENTRIC (1 16) +# define GEN7_WM_NONPERSPECTIVE_CENTROID_BARYCENTRIC(1 15) +# define GEN7_WM_NONPERSPECTIVE_PIXEL_BARYCENTRIC (1 14) +# define GEN7_WM_PERSPECTIVE_SAMPLE_BARYCENTRIC (1 13) +# define GEN7_WM_PERSPECTIVE_CENTROID_BARYCENTRIC (1 12) +# define GEN7_WM_PERSPECTIVE_PIXEL_BARYCENTRIC (1 11) +# define GEN7_WM_USES_INPUT_COVERAGE_MASK (1 10) +# define GEN7_WM_LINE_END_CAP_AA_WIDTH_0_5 (0 8) +# define GEN7_WM_LINE_END_CAP_AA_WIDTH_1_0 (1 8) +# define GEN7_WM_LINE_END_CAP_AA_WIDTH_2_0 (2 8) +# define GEN7_WM_LINE_END_CAP_AA_WIDTH_4_0 
(3 8) +# define GEN7_WM_LINE_AA_WIDTH_0_5 (0 6) +# define GEN7_WM_LINE_AA_WIDTH_1_0 (1 6) +# define GEN7_WM_LINE_AA_WIDTH_2_0 (2 6) +# define GEN7_WM_LINE_AA_WIDTH_4_0 (3 6) +# define GEN7_WM_POLYGON_STIPPLE_ENABLE (1 4) +# define GEN7_WM_LINE_STIPPLE_ENABLE(1 3) +# define GEN7_WM_POINT_RASTRULE_UPPER_RIGHT (1 2) +# define GEN7_WM_MSRAST_OFF_PIXEL (0 0) +# define GEN7_WM_MSRAST_OFF_PATTERN (1 0) +# define GEN7_WM_MSRAST_ON_PIXEL(2 0) +# define GEN7_WM_MSRAST_ON_PATTERN (3 0) +/* DW2 */ +# define GEN7_WM_MSDISPMODE_PERPIXEL(1 31) + +#define GEN7_3DSTATE_CLEAR_PARAMS BRW_3D(3, 0, 0x04) +#define GEN7_3DSTATE_DEPTH_BUFFER BRW_3D(3, 0, 0x05) + +#define GEN7_3DSTATE_CONSTANT_HSBRW_3D(3, 0, 0x19) +#define GEN7_3DSTATE_CONSTANT_DSBRW_3D(3, 0, 0x1a) + +#define GEN7_3DSTATE_HS BRW_3D(3, 0, 0x1b) +#define GEN7_3DSTATE_TE BRW_3D(3, 0, 0x1c) +#define GEN7_3DSTATE_DS BRW_3D(3, 0, 0x1d) +#define GEN7_3DSTATE_STREAMOUT BRW_3D(3, 0, 0x1e) +#define GEN7_3DSTATE_SBEBRW_3D(3, 0, 0x1f) + +/* DW1 */ +# define GEN7_SBE_SWIZZLE_CONTROL_MODE (1 28) +# define GEN7_SBE_NUM_OUTPUTS_SHIFT 22 +# define GEN7_SBE_SWIZZLE_ENABLE(1 21) +# define GEN7_SBE_POINT_SPRITE_LOWERLEFT(1 20) +# define GEN7_SBE_URB_ENTRY_READ_LENGTH_SHIFT 11 +# define GEN7_SBE_URB_ENTRY_READ_OFFSET_SHIFT 4 + +#define GEN7_3DSTATE_PS BRW_3D(3, 0, 0x20) +/* DW1: kernel pointer */ +/* DW2 */ +# define GEN7_PS_SPF_MODE (1 31) +# define GEN7_PS_VECTOR_MASK_ENABLE (1 30) +# define GEN7_PS_SAMPLER_COUNT_SHIFT27 +# define
[Intel-gfx] errors when building the latest intel DDX driver
Hi,

I got the following error message when running 'autogen.sh':

checking for XORG... configure: error: Package requirements (xorg-server >= xproto fontsproto randrproto renderproto xextproto x11 xextproto) were not met:

Requested 'xorg-server >= xproto' but version of xorg-server is 1.10.2.901

I don't enable SNA, so the required xserver version is 1.6. This issue is caused by:

commit 585667c2f9f88554ed89ff21ae38600f761d964c
Author: Chris Wilson ch...@chris-wilson.co.uk
Date: Sat Jun 18 15:52:22 2011 +0100

    sna: Bump the required xserver version to 1.10

After reverting this commit, I can build the DDX driver normally.

Thanks
Haihao
Re: [Intel-gfx] [PATCH] drm/i915: Update the location of the ringbuffers' HWS_PGA registers for IVB.
On Wed, 2011-05-11 at 02:24 +0800, Eric Anholt wrote:
> They have been moved from the ringbuffer groups to their own group it
> looks like. Fixes GPU hangs on gnome startup.
>
> Signed-off-by: Eric Anholt <e...@anholt.net>
> ---
>  drivers/gpu/drm/i915/i915_reg.h         |    3 +++
>  drivers/gpu/drm/i915/intel_ringbuffer.c |   27 ++++++++++++++++++++++++---
>  2 files changed, 27 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> index f12c291..9cb6353 100644
> --- a/drivers/gpu/drm/i915/i915_reg.h
> +++ b/drivers/gpu/drm/i915/i915_reg.h
> @@ -291,6 +291,9 @@
>  #define RING_MAX_IDLE(base) ((base)+0x54)
>  #define RING_HWS_PGA(base) ((base)+0x80)
>  #define RING_HWS_PGA_GEN6(base) ((base)+0x2080)
> +#define RENDER_HWS_PGA_GEN7 (0x04080)
> +#define BSD_HWS_PGA_GEN7 (0x04180)

The documentation says the BSD HWS_PGA register has been at 0x4180 since GEN6, but we found that 0x4180 causes a GPU hang when using the BSD ring, whereas the BSD ring works fine with 0x14080. I am not sure whether GEN7 has the same problem. We currently don't have a machine; could you help verify it?

> +#define BLT_HWS_PGA_GEN7 (0x04280)
>  #define RING_ACTHD(base) ((base)+0x74)
>  #define RING_NOPID(base) ((base)+0x94)
>  #define RING_IMR(base) ((base)+0xa8)
> diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> index a32dc71..5edb512 100644
> --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> @@ -551,10 +551,31 @@ render_ring_put_irq(struct intel_ring_buffer *ring)
>  
>  void intel_ring_setup_status_page(struct intel_ring_buffer *ring)
>  {
> +	struct drm_device *dev = ring->dev;
>  	drm_i915_private_t *dev_priv = ring->dev->dev_private;
> -	u32 mmio = (IS_GEN6(ring->dev) || IS_GEN7(ring->dev)) ?
> -			RING_HWS_PGA_GEN6(ring->mmio_base) :
> -			RING_HWS_PGA(ring->mmio_base);
> +	u32 mmio = 0;
> +
> +	/* The ring status page addresses are no longer next to the rest of
> +	 * the ring registers as of gen7.
> +	 */
> +	if (IS_GEN7(dev)) {
> +		switch (ring->id) {
> +		case RING_RENDER:
> +			mmio = RENDER_HWS_PGA_GEN7;
> +			break;
> +		case RING_BLT:
> +			mmio = BLT_HWS_PGA_GEN7;
> +			break;
> +		case RING_BSD:
> +			mmio = BSD_HWS_PGA_GEN7;
> +			break;
> +		}
> +	} else if (IS_GEN6(ring->dev)) {
> +		mmio = RING_HWS_PGA_GEN6(ring->mmio_base);
> +	} else {
> +		mmio = RING_HWS_PGA(ring->mmio_base);
> +	}
> +
>  	I915_WRITE(mmio, (u32)ring->status_page.gfx_addr);
>  	POSTING_READ(mmio);
>  }
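To make the register selection in the quoted patch easier to follow, here is a small user-space sketch of the same decision. The offsets are copied from the patch; the function name, the enum, and the plain integer `gen` argument are hypothetical simplifications, not kernel API.

```c
#include <assert.h>
#include <stdint.h>

/* Offsets as they appear in the quoted patch. */
#define SKETCH_RING_HWS_PGA(base)      ((base) + 0x80)
#define SKETCH_RING_HWS_PGA_GEN6(base) ((base) + 0x2080)
#define SKETCH_RENDER_HWS_PGA_GEN7     0x04080
#define SKETCH_BSD_HWS_PGA_GEN7        0x04180
#define SKETCH_BLT_HWS_PGA_GEN7        0x04280

enum sketch_ring_id {
    SKETCH_RING_RENDER,
    SKETCH_RING_BSD,
    SKETCH_RING_BLT
};

/*
 * Return the MMIO offset of the hardware status page address register.
 * On gen7 the register no longer sits relative to the ring's MMIO base,
 * so it is looked up per ring; older generations derive it from the base.
 */
static uint32_t sketch_hws_pga_reg(int gen, enum sketch_ring_id id,
                                   uint32_t mmio_base)
{
    if (gen == 7) {
        switch (id) {
        case SKETCH_RING_RENDER:
            return SKETCH_RENDER_HWS_PGA_GEN7;
        case SKETCH_RING_BSD:
            return SKETCH_BSD_HWS_PGA_GEN7;
        case SKETCH_RING_BLT:
            return SKETCH_BLT_HWS_PGA_GEN7;
        }
        assert(!"unknown ring id");
        return 0;
    }
    if (gen == 6)
        return SKETCH_RING_HWS_PGA_GEN6(mmio_base);
    return SKETCH_RING_HWS_PGA(mmio_base);
}
```

As an illustration only: assuming a BSD ring MMIO base of 0x12000 (a value not stated in this thread), the GEN6 formula yields 0x12000 + 0x2080 = 0x14080, which matches the address reported above to work on GEN6, while the gen7 table yields 0x4180.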
Re: [Intel-gfx] SandyBridge encoding code merged to libva master
On Wed, 2011-04-27 at 11:07 +0800, Zou, Nanhai wrote:
> Hi,
>
> We have merged HW accelerated SandyBridge encoding code to libva master
> branch. At this point we support I frame and P frame encoding for H.264
> main profile. B frame support, frame rate control, performance tuning
> and quality improvement will come in next quarter.
>
> We will provide a simple test program soon to demonstrate how to use
> the encoding API.

You can use the simple program 'avcenc' under <libva dir>/test/encode for testing. The usage is:

avcenc <width> <height> <input file> <output file> [qp]

qp is optional.

Thanks
Haihao
Re: [Intel-gfx] Crash in Gstreamer Vaapi Application for playing multiple videos
The issue on my box with libva-1.0.7:

libva: libva version 0.31.1
libva: va_getDriverName() returns 0
libva: Trying to open /opt/X11R7/lib/dri/i965_drv_video.so
libva: va_openDriver() returns 0
[New Thread 0xaeafeb70 (LWP 12165)]
[Thread 0xb4afeb70 (LWP 12151) exited]

Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0xb5cedb70 (LWP 12149)]
i965_PutSurface (ctx=0xaf310148, surface=67108883, draw=0xc5, srcx=0, srcy=0, srcw=1920, srch=796, destx=0, desty=0, destw=1920, desth=796, cliprects=0x0, number_cliprects=0, flags=0) at i965_drv_video.c:1725
1725        if (obj_surface->bo == NULL)
(gdb) bt
#0  i965_PutSurface (ctx=0xaf310148, surface=67108883, draw=0xc5, srcx=0, srcy=0, srcw=1920, srch=796, destx=0, desty=0, destw=1920, desth=796, cliprects=0x0, number_cliprects=0, flags=0) at i965_drv_video.c:1725
#1  0xb6e43a5d in vaPutSurface (dpy=0xaf3b2a18, surface=67108883, draw=12582917, srcx=<value optimized out>, srcy=<value optimized out>, srcw=<value optimized out>, srch=<value optimized out>, destx=<value optimized out>, desty=0, destw=1920, desth=796, cliprects=0x0, number_cliprects=0, flags=0) at va_x11.c:288
#2  0xb7b5bfa1 in gst_vaapi_window_x11_render (window=0x8192d98, surface=0x82cf9a8, src_rect=0xb5ceccb8, dst_rect=0x81a4298, flags=3) at gstvaapiwindow_x11.c:424
#3  0xb70ffab5 in gst_vaapi_window_put_surface (window=0x8192d98, surface=0x82cf9a8, src_rect=0xb5ceccb8, dst_rect=0x81a4298, flags=3) at gstvaapiwindow.c:506
#4  0xb6c48da7 in gst_vaapisink_show_frame_x11 (base_sink=0x81a40d8, buffer=0x81c8118) at gstvaapisink.c:680
#5  gst_vaapisink_show_frame (base_sink=0x81a40d8, buffer=0x81c8118) at gstvaapisink.c:714
#6  0xb7c95c98 in ?? () from /usr/lib/libgstbase-0.10.so.0
#7  0xb7c9d595 in ?? () from /usr/lib/libgstbase-0.10.so.0
#8  0xb7c9f359 in ?? () from /usr/lib/libgstbase-0.10.so.0
#9  0xb7c9f871 in ?? () from /usr/lib/libgstbase-0.10.so.0
#10 0xb7e8be05 in ?? () from /usr/lib/libgstreamer-0.10.so.0
#11 0xb7e8c864 in ?? () from /usr/lib/libgstreamer-0.10.so.0
#12 0xb6c57441 in gst_vaapidecode_step (pad=0x81549a0, buf=0xaf300df0) at gstvaapidecode.c:162
#13 gst_vaapidecode_chain (pad=0x81549a0, buf=0xaf300df0) at gstvaapidecode.c:536
#14 0xb7e8be05 in ?? () from /usr/lib/libgstreamer-0.10.so.0
#15 0xb7e8c864 in ?? () from /usr/lib/libgstreamer-0.10.so.0
#16 0xb6bf555d in ?? () from /usr/lib/gstreamer-0.10/libgstcoreelements.so
#17 0xb7eb7d6b in ?? () from /usr/lib/libgstreamer-0.10.so.0
#18 0xb7eb9377 in ?? () from /usr/lib/libgstreamer-0.10.so.0
#19 0xb7f68d0c in ?? () from /lib/libglib-2.0.so.0
#20 0xb7f66def in ?? () from /lib/libglib-2.0.so.0
#21 0xb6e7f96e in start_thread (arg=0xb5cedb70) at pthread_create.c:300
#22 0xb6fc5a4e in clone () at ../sysdeps/unix/sysv/linux/i386/clone.S:130

line 1725:
1724        obj_surface = SURFACE(surface);
1725        if (obj_surface->bo == NULL)

Obviously obj_surface is invalid in this case.

> Hi,
>
> Could you please share the log of the crash. Not sure what is the
> problem at my end. :-( Could you suggest something I can try?
>
> Thanks,
> Jyotsana.
>
> Xiang, Haihao wrote:
>
> On Thu, 2011-02-24 at 17:28 +0800, Jyotsana wrote:
>
> Hi,
>
> That's great. But I am not sure why it doesn't play at my end. Which
> platform are you running on?
>
> Sandybridge
>
> And for the segfault does the file run till the EOS and then crashes or
> somewhere in the middle?
>
> It crashes in the middle
>
> Thanks,
> Jyotsana.
>
> Xiang, Haihao wrote:
>
> One reason for the sample not receiving the 'prepare-xwindow-id' message
> could be that the file is not present in the same path as the
> executable. The 'Filename' variable should be changed accordingly if it
> is not. Could you tell me the format of your file and the type of video
> codec present in the file? The application supports only mov/mp4 (h264 :
> video codec) or mpegts (mpeg2 : video codec).
>
> Oh, my fault. After moving all files to the same directory, I got four
> windows, however I can't reproduce your issue.
>
> 1. All videos are rendered fine at the beginning. See the attached
>    screenshot.
> 2. I also got a segmentation fault after a while, but this issue isn't
>    the same as yours according to your backtrace. An invalid surface id
>    is passed to the backend driver via vaPutSurface in my case. I have
>    fixed this segmentation fault, however I am not sure why an invalid
>    id is passed to the driver. Maybe it is a plugin problem.
>
> Thanks,
> Jyotsana.
>
> Xiang, Haihao wrote:
>
> Hi,
>
> I tried to reproduce the issue with your sample code but failed. No
> window is created even with a single video file. With gdb's help, I
> found the sample doesn't receive the 'prepare-xwindow-id' message.
> After building the sample code (I modified the variable 'Filename'), I
> directly run MultiVideo
>
> $ ./MultiVideo
>
> Did I miss something?
>
> Thanks
> Haihao
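The crash analysed in this thread comes from dereferencing the result of a surface lookup without first checking that the id resolved to a live object. A minimal sketch of a guarded lookup is given below. The fixed-size table, the structure layouts, and every `sketch_` name are hypothetical and much simpler than the driver's object heap; this is not the fix that was actually committed, only an illustration of the check.

```c
#include <assert.h>
#include <stddef.h>

#define SKETCH_MAX_SURFACES 4

enum {
    SKETCH_STATUS_SUCCESS = 0,
    SKETCH_STATUS_ERROR_INVALID_SURFACE = 1
};

/* Hypothetical buffer object and surface object. */
struct sketch_bo {
    int handle;
};

struct sketch_surface {
    int in_use;
    struct sketch_bo *bo;
};

static struct sketch_surface sketch_heap[SKETCH_MAX_SURFACES];

/* Resolve an id to a surface, or NULL if the id is out of range or stale. */
static struct sketch_surface *sketch_lookup(unsigned int id)
{
    if (id >= SKETCH_MAX_SURFACES || !sketch_heap[id].in_use)
        return NULL;
    return &sketch_heap[id];
}

/* Validate the surface before touching obj->bo, instead of crashing. */
static int sketch_put_surface(unsigned int id)
{
    struct sketch_surface *obj = sketch_lookup(id);

    if (obj == NULL || obj->bo == NULL)
        return SKETCH_STATUS_ERROR_INVALID_SURFACE;

    assert(obj->in_use);
    /* ... rendering would happen here ... */
    return SKETCH_STATUS_SUCCESS;
}
```

With a check like this the caller gets an error code for a bad id, which also makes it easier to trace where the invalid id originated (for example in the plugin) rather than debugging a segfault inside the driver.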
Re: [Intel-gfx] Crash in Gstreamer Vaapi Application for playing multiple videos
Hi,

I tried to reproduce the issue with your sample code but failed. No window is created even with a single video file. With gdb's help, I found the sample doesn't receive the 'prepare-xwindow-id' message. After building the sample code (I modified the variable 'Filename'), I directly run MultiVideo

$ ./MultiVideo

Did I miss something?

Thanks
Haihao

> Hi,
>
> I am trying to play multiple videos simultaneously using GStreamer
> (vaapidecode and vaapisink plugins) and libVA-1.0.7. In the sample
> application I am creating multiple windows using XCreateWindow and
> passing the generated window ID to vaapisink. There are two problems I
> am facing:
>
> 1. The first video is getting rendered but the other videos are not
>    getting rendered.
> 2. The application crashes randomly.
>
> None of the gstreamer calls return a failure and the state is getting
> changed to play successfully. Tried the application with different
> container formats. Also from the command line using gst-launch I am
> able to playback multiple videos simultaneously. The backtrace of the
> gdb log is attached CrashLog.txt. The log suggests the crash is in
> i965_PutSurface.
>
> I am using the following packages as mentioned on Intel Linux graphics
> site http://intellinuxgraphics.org/2010Q4.html:
>
> 2D driver: xf86-video-intel 2.14.0 release
>   http://xorg.freedesktop.org/archive/individual/driver/xf86-video-intel-2.14.0.tar.bz2
> 3D driver: mesa 7.10
>   ftp://freedesktop.org/pub/mesa/7.10/
> Libdrm: libdrm-2.4.23 release
>   http://dri.freedesktop.org/libdrm/libdrm-2.4.23.tar.bz2
> Kernel: 2.6.37 release
>   http://www.kernel.org/pub/linux/kernel/v2.6/linux-2.6.37.tar.bz2
> Cairo: cairo-1.10.2 release
>   http://cairographics.org/news/cairo-1.10.2/
> Libva: libva-1.0.7 release
>   http://cgit.freedesktop.org/libva/snapshot/libva-1.0.7.tar.bz2
> xserver: 1.9.3
>
> Apart from this I have installed the following gstreamer packages:
> gstreamer-0.10.31
> gst-plugins-base-0.10.29
> gst-plugins-good-0.10.22
> gst-plugins-bad-0.10.19
> gst-plugins-ugly-0.10.15
> gst-ffmpeg-0.10.10
>
> For reference attaching the sample application MultiVideo.c. I am not
> sure whether this is an application problem, a plugin or driver
> limitation, or X, as the same application runs with ximagesink and a
> decoder like ffmpeg. What could be the problem?
>
> PS: OS: Fedora Core 13. Platform: Sandy Bridge. Kernel: 2.6.37.
>
> Regards,
> Jyotsana.
Re: [Intel-gfx] G45/4500MHD Hardware Acceleration
On Tue, 2011-02-22 at 12:16 +0800, Gabriel Torreiro de Moraes wrote:
> Hello Mr. Jin and everybody
>
> First, sorry about emailing to intel-gfx-owner. It was on my first
> email so I thought it would be this. My bad, sorry.
>
> Second, what about the vainfo issue? It was working (reporting) before.

You updated libva.so but didn't update the backend driver i965_drv_video.so

> I would like to test it to see if I messed something up bad. Which
> codecs/format works with GM45? I'll try to re-code my actual videos to
> a supported format.
>
> I place my netbook to disposition for any necessary testing. It packs
> an underpowered processor, cool and energy-efficient. An ideal
> architecture for cheap and small HTPCs.. or for a netbook. Your pick :)
>
> Thank you for your time.
>
> Best regards,
> Gabriel Moraes
>
> On Tue, 2011-02-22 at 10:48 +0800, Jin, Gordon wrote:
>
> If you do want to use libva on GM45, please wait. It's not implemented
> yet. Nanhai and Haihao are working on that.
>
> If you don't care about libva (with GPU decoding), then you should play
> H264 video on Ubuntu 10.10 out of box. If it doesn't work, please file
> bugs.
>
> Please don't send email to intel-gfx-owner@ any more. This is for
> administrators. You should send to intel-gfx@ instead.
>
> Gordon
>
> From: mailman-boun...@lists.freedesktop.org [mailto:mailman-boun...@lists.freedesktop.org] On Behalf Of Gabriel Torreiro de Moraes
> Sent: Tuesday, February 22, 2011 10:38 AM
> To: Jin, Gordon; intel-gfx-ow...@lists.freedesktop.org
> Subject: RE: G45/4500MHD Hardware Acceleration
>
> Hello Mr. Gordon Jin and everybody,
>
> First, thanks for the reply :) Now let's get to what really matters:
>
> You got a point. I've been trying to use it with H264 encoded videos
> and it wasn't working. Which codecs does it support? I'll be buying an
> external Blu-Ray reader for my netbook, so.. will it decode properly?
>
> I've tried yesterday updating my drivers, but I'm still a newbie on
> Linux. I've messed something up and now my touchpad periodically stops
> working, my theme resets every time I boot up and running vainfo on
> terminal gives me this:
>
> ---
> libva: libva version 0.32.0
> libva: va_getDriverName() returns 0
> libva: Trying to open /usr/local/lib/dri/i965_drv_video.so
> libva error: /usr/local/lib/dri/i965_drv_video.so has no function __vaDriverInit_0_32
> libva: va_openDriver() returns -1
> vaInitialize failed with error code -1 (unknown libva error),exit
> ---
>
> Before the update, it was giving me no errors, and playing a video just
> rendered a black screen at the right resolution. Now when I play a
> video with VAAPI, mplayer only plays sound.
>
> I would be VERY grateful if you guide me through the process of fully
> updating video drivers, or some site that instructs newbies to do so :)
>
> Btw, here's the net basic specs:
>
> Acer Aspire One 723 (Former Aspire 1410-2287)
> ---
> Intel Celeron ULV 723 1.2GHz
> 11.6" LED Screen, 1366x768
> Intel 4500MHD
> HDMI Output (used to watch 1080p videos flawlessly on Windows with
> Media Player Classic on a 46" TV)
> ---
> Isn't a speed beast, but isn't a power hog either. Does the trick for me.
>
> Best regards,
> Gabriel Moraes
>
> P.S.: I added the xorg.conf file on the correct folder. I'm using Ubuntu
> Maverick 10.10. The 2010Q4 Graphics Package says it needs the
> libva-1.0.7, but it isn't available anywhere! Am I getting the right
> package for my distro?
>
> On Tue, 2011-02-22 at 08:50 +0800, Jin, Gordon wrote:
>
> Hi Gabriel,
>
> GM45 (4500MHD) is well supported, and you should be able to use it to
> watch HD movies (if the cpu is not too bad). If you have problems, feel
> free to file bugs.
>
> Just H.264 hw decoding not supported yet. (it's already supported on
> newer hw, but with lower priority to port to G45/GM45.)
>
> We have G45 (desktop) and GM45 (laptop), but no netbook based on that.
>
> Gordon
>
> From: mailman-boun...@lists.freedesktop.org [mailto:mailman-boun...@lists.freedesktop.org] On Behalf Of Gabriel Torreiro de Moraes
> Sent: Monday, February 21, 2011 11:31 PM
> To: intel-gfx-ow...@lists.freedesktop.org
> Subject: G45/4500MHD Hardware Acceleration
>
> Hey,
>
> I'm using Ubuntu 10.10 Maverick and Intel 4500MHD. I'm wondering if
> it's ever going (or if it's already out there, but I didn't install
> properly) any support for x264 and other video formats acceleration for
> the 4500MHD (G45 right?). It works flawlessly under Windows, out of the
> box actually, but I'm suffering to make it work under Ubuntu. I've
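The vainfo error quoted in this thread ("has no function __vaDriverInit_0_32" with libva 0.32.0) suggests that the loader derives the driver entry-point name from its own major and minor version, so a backend built against an older libva exports a differently named symbol. A minimal sketch of that naming scheme follows. The format string is inferred from the error message alone, and `sketch_init_symbol` is a hypothetical helper, not a libva function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/*
 * Build the driver init symbol name for a given VA-API version.
 * Returns 0 on success, -1 if the buffer is too small.
 */
static int sketch_init_symbol(char *buf, size_t len, int major, int minor)
{
    int n;

    assert(buf != NULL);

    n = snprintf(buf, len, "__vaDriverInit_%d_%d", major, minor);
    if (n < 0 || (size_t)n >= len)
        return -1;
    return 0;
}
```

Under this assumption, a loader at version 0.32 would look for `__vaDriverInit_0_32`, while a backend built for 0.31 would only export the 0_31 variant; this is consistent with the advice above to update i965_drv_video.so together with libva.so.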
Re: [Intel-gfx] [PATCH] intel: Add AUB file dump support
Could you add an entry for the media kernel to name_to_type_mapping, or just
use a common name for all tracked kernels?

Thanks
Haihao

> This adds AUB file dump support to generate execution trace for
> internal GPU simulator.
>
> Signed-off-by: Zhenyu Wang zhen...@linux.intel.com
> ---
>  intel/Makefile.am        |    3 +-
>  intel/intel_bufmgr.h     |   38 +
>  intel/intel_bufmgr_gem.c |  402 ++
>  3 files changed, 442 insertions(+), 1 deletions(-)
>
> diff --git a/intel/Makefile.am b/intel/Makefile.am
> index 1ae92f8..398cd2f 100644
> --- a/intel/Makefile.am
> +++ b/intel/Makefile.am
> @@ -41,7 +41,8 @@ libdrm_intel_la_SOURCES = \
>  	intel_bufmgr_gem.c \
>  	intel_chipset.h \
>  	mm.c \
> -	mm.h
> +	mm.h \
> +	intel_aub.h
>
>  libdrm_intelincludedir = ${includedir}/libdrm
>  libdrm_intelinclude_HEADERS = intel_bufmgr.h
> diff --git a/intel/intel_bufmgr.h b/intel/intel_bufmgr.h
> index daa18b4..bb4158a 100644
> --- a/intel/intel_bufmgr.h
> +++ b/intel/intel_bufmgr.h
> @@ -35,6 +35,7 @@
>  #define INTEL_BUFMGR_H
>
>  #include <stdint.h>
> +#include <stdio.h>
>
>  struct drm_clip_rect;
>
> @@ -83,6 +84,39 @@ struct _drm_intel_bo {
>  	int handle;
>  };
>
> +enum drm_intel_aub_bmp_format {
> +	AUB_DUMP_BMP_LEGACY,
> +	AUB_DUMP_BMP_8BIT,
> +	AUB_DUMP_BMP_ARGB_0555,
> +	AUB_DUMP_BMP_ARGB_0565,
> +	AUB_DUMP_BMP_ARGB_,
> +	AUB_DUMP_BMP_ARGB_1555,
> +	AUB_DUMP_BMP_ARGB_0888,
> +	AUB_DUMP_BMP_ARGB_,
> +	AUB_DUMP_BMP_YCRCB_SWAPY,
> +	AUB_DUMP_BMP_YCRCB_NORMAL,
> +	AUB_DUMP_BMP_YCRCB_SWAPUV,
> +	AUB_DUMP_BMP_YCRCB_SWAPUVY,
> +	AUB_DUMP_BMP_ABGR_,
> +};
> +
> +/*
> + * surface info needed by aub DUMP_BMP block
> + */
> +struct drm_intel_aub_surface_bmp {
> +	uint16_t x_offset;
> +	uint16_t y_offset;
> +	uint16_t pitch;
> +	uint8_t bits_per_pixel;
> +	uint8_t format;
> +	uint16_t width;
> +	uint16_t height;
> +	uint32_t tiling_walk_y:1;
> +	uint32_t tiling:1;
> +	uint32_t pad:30;
> +};
> +
> +
>  #define BO_ALLOC_FOR_RENDER (1<<0)
>
>  drm_intel_bo *drm_intel_bo_alloc(drm_intel_bufmgr *bufmgr, const char *name,
> @@ -150,6 +184,10 @@ int drm_intel_gem_bo_unmap_gtt(drm_intel_bo *bo);
>  void drm_intel_gem_bo_start_gtt_access(drm_intel_bo *bo, int write_enable);
>
>  int drm_intel_get_pipe_from_crtc_id(drm_intel_bufmgr *bufmgr, int crtc_id);
> +void drm_intel_bufmgr_gem_set_aubfile(drm_intel_bufmgr *bufmgr, FILE *file);
> +void drm_intel_bufmgr_gem_stop_aubfile(drm_intel_bufmgr *bufmgr);
> +int drm_intel_gem_aub_dump_bmp(drm_intel_bufmgr *bufmgr, drm_intel_bo *bo,
> +		unsigned int offset, struct drm_intel_aub_surface_bmp *bmp);
>
>  /* drm_intel_bufmgr_fake.c */
>  drm_intel_bufmgr *drm_intel_bufmgr_fake_init(int fd,
> diff --git a/intel/intel_bufmgr_gem.c b/intel/intel_bufmgr_gem.c
> index 3cdffce..654bc31 100644
> --- a/intel/intel_bufmgr_gem.c
> +++ b/intel/intel_bufmgr_gem.c
> @@ -57,6 +57,7 @@
>  #include "intel_bufmgr.h"
>  #include "intel_bufmgr_priv.h"
>  #include "intel_chipset.h"
> +#include "intel_aub.h"
>  #include "string.h"
>
>  #include "i915_drm.h"
> @@ -75,6 +76,13 @@ struct drm_intel_gem_bo_bucket {
>  	unsigned long size;
>  };
>
> +struct drm_intel_aub_bmp {
> +	drm_intel_bo *bo; /* surface bo */
> +	unsigned int offset;
> +	struct drm_intel_aub_surface_bmp bmp;
> +	struct drm_intel_aub_bmp *next;
> +};
> +
>  typedef struct _drm_intel_bufmgr_gem {
>  	drm_intel_bufmgr bufmgr;
>
> @@ -106,6 +114,10 @@ typedef struct _drm_intel_bufmgr_gem {
>  	unsigned int has_relaxed_fencing : 1;
>  	unsigned int bo_reuse : 1;
>  	char fenced_relocs;
> +
> +	FILE *aub_file;
> +	uint32_t aub_offset;
> +	struct drm_intel_aub_bmp *aub_bmp;
>  } drm_intel_bufmgr_gem;
>
>  #define DRM_INTEL_RELOC_FENCE (1<<0)
> @@ -195,8 +207,396 @@ struct _drm_intel_bo_gem {
>  	 * relocations.
>  	 */
>  	int reloc_tree_fences;
> +
> +	uint32_t aub_offset;
>  };
>
> +/* AUB trace dump support */
> +
> +static void
> +aub_out(drm_intel_bufmgr_gem *bufmgr_gem, uint32_t data)
> +{
> +	fwrite(&data, 1, 4, bufmgr_gem->aub_file);
> +}
> +
> +static void
> +aub_out_data(drm_intel_bufmgr_gem *bufmgr_gem, void *data, size_t size)
> +{
> +	fwrite(data, 1, size, bufmgr_gem->aub_file);
> +}
> +
> +static void
> +aub_write_bo_data(drm_intel_bo *bo, uint32_t offset, uint32_t size)
> +{
> +	drm_intel_bufmgr_gem *bufmgr_gem = (drm_intel_bufmgr_gem *) bo->bufmgr;
> +	drm_intel_bo_gem *bo_gem = (drm_intel_bo_gem *) bo;
> +	uint32_t *data;
> +	unsigned int i;
> +
> +	data = malloc(bo->size);
> +	drm_intel_bo_get_subdata(bo, offset, size, data);
> +
> +	/* Easy mode: write out bo with no relocations */
> +	if (!bo_gem->reloc_count) {
> +		aub_out_data(bufmgr_gem, data, size);
> +		free(data);
> +		return;
> +	}
> +
> +	/* Otherwise, handle the relocations
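A side note on the aub_out() helpers in the quoted patch: they discard the return value of fwrite(), so a short write to the trace file goes unnoticed. Below is a minimal standalone sketch of the same idea with the result checked. The names trace_out_dword() and trace_out_data() and the 0/-1 return convention are hypothetical; they are not part of libdrm or of the patch.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical helper (not libdrm API): append one 32-bit dword to a trace
 * stream in host byte order. Returns 0 on success, -1 on a short write.
 */
static int
trace_out_dword(FILE *trace, uint32_t data)
{
	assert(trace != NULL);	/* a NULL stream is a caller bug */

	/* fwrite() wants a pointer to the bytes, hence &data. */
	return fwrite(&data, sizeof(data), 1, trace) == 1 ? 0 : -1;
}

/*
 * Hypothetical helper: append a block of bytes, playing the role that
 * aub_out_data() has in the patch. Returns 0 on success, -1 otherwise.
 */
static int
trace_out_data(FILE *trace, const void *data, size_t size)
{
	assert(trace != NULL);

	if (size == 0)
		return 0;	/* nothing to write */
	if (data == NULL)
		return -1;

	return fwrite(data, 1, size, trace) == size ? 0 : -1;
}
```

A caller would open the trace with fopen() or tmpfile(), emit dwords and payloads through these helpers, and stop dumping as soon as one of them returns -1.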
Re: [Intel-gfx] [PATCH] intel: Add AUB file dump support
On Tue, 2011-02-22 at 14:22 +0800, Zhenyu Wang wrote:
> On 2011.02.22 14:09:43 +0800, Xiang, Haihao wrote:
> > > For mesa, kernel names depend on which cache it is from, and as far
> > > as I know aub has no special requirement for kernel name. AubList
> > > could decode kernel source with offset specified in pipe states.
> >
> > But I have to use a name specified in name_to_type_mapping when
> > allocating a GEM BO for a media kernel, or aub doesn't know the trace
> > type of this BO. Of course, I can also use
> > 'VS_PROG'/'GS_PROG'/'CLIP_PROG'... for a media kernel :(.
>
> oh, right, I see. You still need subtype as kernel. How about
> 'MEDIA_PROG'? Or what's the name for your current kernel?

'MEDIA_PROG' is ok for me

Thanks
Haihao
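The exchange above depends on the dump code classifying a buffer object by the name it was allocated with. As a rough sketch of that kind of lookup (not the actual name_to_type_mapping table, whose contents are not shown in this thread), the following uses a hypothetical enum and table. Only the strings 'VS_PROG', 'GS_PROG', 'CLIP_PROG' and 'MEDIA_PROG' come from the discussion; everything else is an assumption made for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical trace categories; the real AUB values are not shown here. */
enum trace_subtype {
	TRACE_SUBTYPE_NONE = 0,	/* name not in the table: BO stays untyped */
	TRACE_SUBTYPE_KERNEL,
};

struct name_to_subtype {
	const char *name;
	enum trace_subtype subtype;
};

/* Illustrative table only, built from the names mentioned in the thread. */
static const struct name_to_subtype name_table[] = {
	{ "VS_PROG",    TRACE_SUBTYPE_KERNEL },
	{ "GS_PROG",    TRACE_SUBTYPE_KERNEL },
	{ "CLIP_PROG",  TRACE_SUBTYPE_KERNEL },
	{ "MEDIA_PROG", TRACE_SUBTYPE_KERNEL },
};

/* Map a BO name to a trace subtype with an exact string match. */
static enum trace_subtype
lookup_trace_subtype(const char *bo_name)
{
	size_t i;

	assert(bo_name != NULL);

	for (i = 0; i < sizeof(name_table) / sizeof(name_table[0]); i++) {
		if (strcmp(bo_name, name_table[i].name) == 0)
			return name_table[i].subtype;
	}

	return TRACE_SUBTYPE_NONE;
}
```

With a table like this, a media driver that allocates its kernel BO under the name "MEDIA_PROG" gets the kernel subtype, while an arbitrary name falls through to the untyped case, which is the problem described in the thread.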
Re: [Intel-gfx] Crash of repeated playback using libva and gstreamer
On Mon, 2011-01-03 at 14:54 +0800, Jyotsana wrote:
> Hi,
>
> I am trying to play multiple video files one after the other using
> GStreamer (vaapidecode and vaapisink plugins) and libVA-1.0.3. I am able
> to play one file successfully but once the first file is played to
> completion, we delete all references of the first pipeline and create a
> new one.

I can play multiple videos (MPEG2 & H.264) with mplayer_vaapi. Does
GStreamer call vaTerminate after playing a file? Could you try the latest
master branch? If you still experience this issue, please file a bug to
track it.

Thanks
Haihao

> When the second file is played, application crashes with the following
> error:
>
> libva: libva version 0.31.1
> libva: va_getDriverName() returns 0
> libva: Trying to open /opt/X11R7/lib/dri/i965_drv_video.so
> libva: va_openDriver() returns 0
> libva: libva version 0.31.1
> libva: va_getDriverName() returns 0
> libva: Trying to open /opt/X11R7/lib/dri/i965_drv_video.so
> libva: va_openDriver() returns 0
> VaapiApp: i965_media.c:277: i965_media_decode_picture: Assertion `media_state->media_states_setup' failed.
>
> Following is the backtrace obtained from gdb:
>
> i965_media_decode_picture: Assertion `media_state->media_states_setup' failed.
> # 0xb7fff424 in __kernel_vsyscall ()
> # #1 0x00414d71 in raise () from /lib/libc.so.6
> # #2 0x0041664a in abort () from /lib/libc.so.6
> # #3 0x0040dde8 in __assert_fail () from /lib/libc.so.6
> # #4 0xb71531d5 in i965_media_decode_picture (ctx=0xaeb19760, profile=VAProfileMPEG2Main, decode_state=0xaeb1a524) at i965_media.c:277
> # #5 0xb715afcf in i965_EndPicture (ctx=0xaeb19760, context=33554432) at i965_drv_video.c:1146
> # #6 0xb7d8cec5 in vaEndPicture (dpy=0xaeb1e960, context=33554432) at va.c:815
> # #7 0xb5f73a85 in render_picture (s=0xb0701c00) at libavcodec/vaapi.c:74
> # #8 ff_vaapi_common_end_frame (s=0xb0701c00) at libavcodec/vaapi.c:188
> # #9 0xb5d37c0b in slice_end (avctx=0xb0701400, picture=0xb0740940, data_size=0xb10fef24, buf=0xb0704600 , buf_size=18936) at libavcodec/mpeg12.c:1935
> # #10 decode_chunks (avctx=0xb0701400, picture=0xb0740940, data_size=0xb10fef24, buf=0xb0704600 , buf_size=18936) at libavcodec/mpeg12.c:2303
> # #11 0xb5d39290 in mpeg_decode_frame (avctx=0xb0701400, data=0xb0740940, data_size=0xb10fef24, avpkt=0xb10fee64) at libavcodec/mpeg12.c:2272
> # #12 0xb5e2de35 in avcodec_decode_video2 (avctx=0xb0701400, picture=0xb0740940, got_picture_ptr=0xb10fef24, buf=0xb0704600 , buf_size=18936) at libavcodec/utils.c:611
> # #13 avcodec_decode_video (avctx=0xb0701400, picture=0xb0740940, got_picture_ptr=0xb10fef24, buf=0xb0704600 , buf_size=18936) at libavcodec/utils.c:597
> # #14 0xb5be3392 in decode_frame (decoder=0xafd03478, buffer=0xae91b9e0) at gstvaapidecoder_ffmpeg.c:483
> # #15 gst_vaapi_decoder_ffmpeg_decode (decoder=0xafd03478, buffer=0xae91b9e0) at gstvaapidecoder_ffmpeg.c:566
> # #16 0xb5bd74ad in decode_step (decoder=0xafd03478, pstatus=0xb10ff00c) at gstvaapidecoder.c:117
> # #17 gst_vaapi_decoder_get_surface (decoder=0xafd03478, pstatus=0xb10ff00c) at gstvaapidecoder.c:422
> # #18 0xb7fe5105 in gst_vaapidecode_step (pad=0xb7036008, buf=0xae91b9e0) at gstvaapidecode.c:116
> # #19 gst_vaapidecode_chain (pad=0xb7036008, buf=0xae91b9e0) at gstvaapidecode.c:536
> # #20 0x420dfacd in gst_pad_chain_data_unchecked (pad=0xb7036008, is_buffer=1, data=0xae91b9e0) at gstpad.c:4131
> # #21 0x420e04e7 in gst_pad_push_data (pad=0xb70363f0, is_buffer=1, data=0xae91b9e0) at gstpad.c:4360
> # #22 0xb71e2655 in gst_queue_push_one (pad=0xb70363f0) at gstqueue.c:1083
> # #23 gst_queue_loop (pad=0xb70363f0) at gstqueue.c:1185
> # #24 0x4210ba11 in gst_task_func (task=0xb70570b0) at gsttask.c:271
> # #25 0x4210d047 in default_func (tdata=0xaeb00d58, pool=0xb7408c08) at gsttaskpool.c:68
> # #26 0x0066e214 in ?? () from /lib/libglib-2.0.so.0
> # #27 0x0066c210 in ?? () from /lib/libglib-2.0.so.0
> # #28 0x00585919 in start_thread () from /lib/libpthread.so.0
> # #29 0x004c7e5e in clone () from /lib/libc.so.6
>
> I am running this on Fedora Core 13 on Calpella Platform.
> What could be the problem? :-(
>
> Regards,
> Jyotsana.
Re: [Intel-gfx] Versions required for h264 acceleration on Intel HD chipset
On Mon, 2010-12-06 at 02:01 +0800, Pedro Ribeiro wrote:
> Hi all,
>
> tomorrow I will be getting my shiny new laptop, a T410 with an Intel HD
> graphics adapter.
>
> Currently I'm using Debian testing (soon to become stable) with these
> packages:
> - intel xorg driver 2.13
> - cairo 1.8.10
> - libdrm 2.4.21
> - mesa 7.7.1
> - libva 1.0.1
> - xorg server 1.7.7
>
> I always use the latest stable kernel (at the moment 2.6.36.1).
>
> Given that the versions of the software above differ from those
> recommended in the Intel Q3 graphics package, I'm wondering if I will be
> able to make use of the h.264 acceleration or do I need to upgrade the
> software?

Libva 1.0.1 doesn't support H.264 decoding for Intel® HD Graphics. You need
to upgrade libva to 1.0.4 or later.

> Apart from that, are there any features of Intel HD that I will be
> missing out on?
> The reason I would not like to upgrade is that I like to keep to the
> Debian stable packages as much as possible...
>
> Thanks for your help.
[Intel-gfx] [PATCH 1/5] render: set the surface state base address
It is the same as commit 73d4c7d7 Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/i965_render.c | 75 + 1 files changed, 24 insertions(+), 51 deletions(-) diff --git a/src/i965_render.c b/src/i965_render.c index c0c5de4..885889e 100644 --- a/src/i965_render.c +++ b/src/i965_render.c @@ -619,6 +619,8 @@ typedef struct brw_surface_state_padded { char pad[32 - sizeof(struct brw_surface_state)]; } brw_surface_state_padded; +#define PS_BINDING_TABLE_OFFSET(3 * sizeof(struct brw_surface_state_padded)) + struct gen4_cc_unit_state { /* Index by [src_blend][dst_blend] */ brw_cc_unit_state_padded cc_state[BRW_BLENDFACTOR_COUNT] @@ -629,7 +631,7 @@ typedef float gen4_vertex_buffer[VERTEX_BUFFER_SIZE]; typedef struct gen4_composite_op { int op; - drm_intel_bo *binding_table_bo; + drm_intel_bo *surface_state_binding_table_bo; sampler_state_filter_t src_filter; sampler_state_filter_t mask_filter; sampler_state_extend_t src_extend; @@ -1158,7 +1160,7 @@ static void i965_emit_composite_state(ScrnInfoPtr scrn) int urb_sf_start, urb_sf_size; int urb_cs_start, urb_cs_size; uint32_t src_blend, dst_blend; - dri_bo *binding_table_bo = composite_op-binding_table_bo; + dri_bo *surface_state_binding_table_bo = composite_op-surface_state_binding_table_bo; intel-needs_render_state_emit = FALSE; @@ -1216,7 +1218,7 @@ static void i965_emit_composite_state(ScrnInfoPtr scrn) if (IS_GEN5(intel)) { OUT_BATCH(BRW_STATE_BASE_ADDRESS | 6); OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* Generate state base address */ - OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* Surface state base address */ + OUT_RELOC(surface_state_binding_table_bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY); /* Surface state base address */ OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* media base addr, don't care */ OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* Instruction base address */ /* general state max addr, disabled */ @@ -1228,7 +1230,7 @@ static void i965_emit_composite_state(ScrnInfoPtr scrn) } else { 
OUT_BATCH(BRW_STATE_BASE_ADDRESS | 4); OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* Generate state base address */ - OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* Surface state base address */ + OUT_RELOC(surface_state_binding_table_bo, I915_GEM_DOMAIN_INSTRUCTION, 0, BASE_ADDRESS_MODIFY); /* Surface state base address */ OUT_BATCH(0 | BASE_ADDRESS_MODIFY); /* media base addr, don't care */ /* general state max addr, disabled */ OUT_BATCH(0x1000 | BASE_ADDRESS_MODIFY); @@ -1271,7 +1273,7 @@ static void i965_emit_composite_state(ScrnInfoPtr scrn) OUT_BATCH(0); /* clip */ OUT_BATCH(0); /* sf */ /* Only the PS uses the binding table */ - OUT_RELOC(binding_table_bo, I915_GEM_DOMAIN_SAMPLER, 0, 0); + OUT_BATCH(PS_BINDING_TABLE_OFFSET); /* The drawing rectangle clipping is always on. Set it to values that * shouldn't do any clipping. @@ -1474,7 +1476,7 @@ static Bool i965_composite_check_aperture(ScrnInfoPtr scrn) gen4_composite_op *composite_op = render_state-composite_op; drm_intel_bo *bo_table[] = { intel-batch_bo, - composite_op-binding_table_bo, + composite_op-surface_state_binding_table_bo, render_state-vertex_buffer_bo, render_state-vs_state_bo, render_state-sf_state_bo, @@ -1502,7 +1504,7 @@ i965_prepare_composite(int op, PicturePtr source_picture, struct gen4_render_state *render_state = intel-gen4_render_state; gen4_composite_op *composite_op = render_state-composite_op; uint32_t *binding_table; - drm_intel_bo *binding_table_bo, *surface_state_bo; + drm_intel_bo *surface_state_binding_table_bo; composite_op-src_filter = sampler_state_filter_from_picture(source_picture-filter); @@ -1562,65 +1564,36 @@ i965_prepare_composite(int op, PicturePtr source_picture, /* Set up the surface states. */ - surface_state_bo = dri_bo_alloc(intel-bufmgr, surface_state, - 3 * sizeof(brw_surface_state_padded), + surface_state_binding_table_bo = dri_bo_alloc(intel-bufmgr, surface_state, + 3 * (sizeof(struct brw_surface_state_padded) + sizeof(uint32_t)), 4096); - if (dri_bo_map
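The layout this patch moves to can be sketched in isolation. The patch allocates a single buffer of 3 * (sizeof(struct brw_surface_state_padded) + sizeof(uint32_t)) bytes and defines PS_BINDING_TABLE_OFFSET as 3 * sizeof(struct brw_surface_state_padded), so the binding table sits right behind the three surface states and can be addressed as a plain offset from the surface state base, with no relocation. The sketch below assumes a padded surface state of 32 bytes (inferred from the pad[32 - sizeof(...)] member) and assumes each binding table entry is the byte offset of its surface state from that base; the macro and function names are hypothetical, not the driver's.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SURFACE_STATE_PADDED_SIZE 32u	/* assumption: padded state size */
#define N_SURFACES 3u			/* destination, source, mask */

/* Byte offset of the binding table from the surface state base address. */
#define BINDING_TABLE_OFFSET (N_SURFACES * SURFACE_STATE_PADDED_SIZE)

/* Size of the combined buffer: surface states followed by the table. */
#define COMBINED_BO_SIZE \
	(N_SURFACES * (SURFACE_STATE_PADDED_SIZE + sizeof(uint32_t)))

/*
 * Hypothetical helper: fill the binding table inside an already mapped,
 * 4-byte-aligned buffer. Entry i holds the offset of surface state i from
 * the start of the buffer. Returns the offset of the table itself, which
 * is the value a batch would carry instead of a relocated pointer.
 */
static uint32_t
fill_binding_table(void *mapped, size_t mapped_size)
{
	uint32_t *table;
	uint32_t i;

	assert(mapped != NULL);
	assert(mapped_size >= COMBINED_BO_SIZE);

	table = (uint32_t *)((char *)mapped + BINDING_TABLE_OFFSET);
	for (i = 0; i < N_SURFACES; i++)
		table[i] = i * SURFACE_STATE_PADDED_SIZE;

	return BINDING_TABLE_OFFSET;
}
```

Under these assumptions the table lands at byte 96 of a 108-byte buffer, so its offset comfortably fits in 16 bits.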
[Intel-gfx] [PATCH 2/5] render: fix send instruction used in sampling fragments
To prepare for composite on Sandybridge Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/render_program/exa_wm_mask_sample_a.g4a|3 ++- src/render_program/exa_wm_mask_sample_a.g4b|3 ++- src/render_program/exa_wm_mask_sample_a.g4b.gen5 |3 ++- src/render_program/exa_wm_mask_sample_argb.g4a |3 ++- src/render_program/exa_wm_mask_sample_argb.g4b |3 ++- .../exa_wm_mask_sample_argb.g4b.gen5 |3 ++- src/render_program/exa_wm_src_sample_a.g4a |3 ++- src/render_program/exa_wm_src_sample_a.g4b |3 ++- src/render_program/exa_wm_src_sample_a.g4b.gen5|3 ++- 9 files changed, 18 insertions(+), 9 deletions(-) diff --git a/src/render_program/exa_wm_mask_sample_a.g4a b/src/render_program/exa_wm_mask_sample_a.g4a index bbb19d7..b1c75af 100644 --- a/src/render_program/exa_wm_mask_sample_a.g4a +++ b/src/render_program/exa_wm_mask_sample_a.g4a @@ -36,12 +36,13 @@ include(`exa_wm.g4i') /* load only alpha */ mov (1) g0.81UD 0x7000UD { align1 mask_disable }; +mov (8) mask_msg1UD g08,8,1UD { align1 }; /* copy to msg start reg*/ /* mask_msg will be copied with g0, as it contains send desc */ /* emit sampler 'send' cmd */ send (16) mask_msg_ind /* msg reg index */ mask_sample_a_011UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (2,1,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ diff --git a/src/render_program/exa_wm_mask_sample_a.g4b b/src/render_program/exa_wm_mask_sample_a.g4b index 018bd36..7db47ca 100644 --- a/src/render_program/exa_wm_mask_sample_a.g4b +++ b/src/render_program/exa_wm_mask_sample_a.g4b @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x7000 }, - { 0x07800031, 0x23801d29, 0x008d, 0x02520102 }, + { 0x0061, 0x20e00022, 0x008d, 0x }, + { 0x07800031, 0x23801c09, 0x, 0x02520102 }, diff --git a/src/render_program/exa_wm_mask_sample_a.g4b.gen5 b/src/render_program/exa_wm_mask_sample_a.g4b.gen5 
index d9740ac..472c2bb 100644 --- a/src/render_program/exa_wm_mask_sample_a.g4b.gen5 +++ b/src/render_program/exa_wm_mask_sample_a.g4b.gen5 @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x7000 }, - { 0x07800031, 0x23801d29, 0x208d, 0x0a2a0102 }, + { 0x0061, 0x20e00022, 0x008d, 0x }, + { 0x07800031, 0x23801c09, 0x2000, 0x0a2a0102 }, diff --git a/src/render_program/exa_wm_mask_sample_argb.g4a b/src/render_program/exa_wm_mask_sample_argb.g4a index def4cfe..78bfc92 100644 --- a/src/render_program/exa_wm_mask_sample_argb.g4a +++ b/src/render_program/exa_wm_mask_sample_argb.g4a @@ -36,12 +36,13 @@ include(`exa_wm.g4i') /* load argb */ mov (1) g0.81UD 0xUD { align1 mask_disable }; +mov (8) mask_msg1UD g08,8,1UD { align1 }; /* copy to msg start reg*/ /* mask_msg will be copied with g0, as it contains send desc */ /* emit sampler 'send' cmd */ send (16) mask_msg_ind /* msg reg index */ mask_sample_base1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (2,1,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 8 { align1 }; /* required message len 5, readback len 8 */ diff --git a/src/render_program/exa_wm_mask_sample_argb.g4b b/src/render_program/exa_wm_mask_sample_argb.g4b index b159cba..9026ee2 100644 --- a/src/render_program/exa_wm_mask_sample_argb.g4b +++ b/src/render_program/exa_wm_mask_sample_argb.g4b @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x }, - { 0x07800031, 0x22c01d29, 0x008d, 0x02580102 }, + { 0x0061, 0x20e00022, 0x008d, 0x }, + { 0x07800031, 0x22c01c09, 0x, 0x02580102 }, diff --git a/src/render_program/exa_wm_mask_sample_argb.g4b.gen5 b/src/render_program/exa_wm_mask_sample_argb.g4b.gen5 index f0a6ddd..cb112d5 100644 --- a/src/render_program/exa_wm_mask_sample_argb.g4b.gen5 +++ b/src/render_program/exa_wm_mask_sample_argb.g4b.gen5 @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x }, - { 0x07800031, 0x22c01d29, 0x208d, 0x0a8a0102 }, + { 0x0061, 
0x20e00022, 0x008d, 0x }, + { 0x07800031, 0x22c01c09, 0x2000, 0x0a8a0102 }, diff --git a/src/render_program/exa_wm_src_sample_a.g4a b/src/render_program/exa_wm_src_sample_a.g4a index 552aaee..667bfb3 100644 --- a/src/render_program/exa_wm_src_sample_a.g4a +++ b/src/render_program/exa_wm_src_sample_a.g4a @@ -36,12 +36,13 @@ include(`exa_wm.g4i') /* load alpha */ mov (1) g0.81UD 0x7000UD { align1 mask_disable }; +mov (8
[Intel-gfx] [PATCH 3/5] render: fragments for composite on Sandybridge
Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/render_program/Makefile.am | 18 +++ src/render_program/exa_wm_ca.g6a |1 + src/render_program/exa_wm_ca.g6b |4 ++ src/render_program/exa_wm_ca_srcalpha.g6a |1 + src/render_program/exa_wm_ca_srcalpha.g6b |4 ++ src/render_program/exa_wm_mask_affine.g6a | 47 ++ src/render_program/exa_wm_mask_affine.g6b |4 ++ src/render_program/exa_wm_mask_projective.g6a | 63 src/render_program/exa_wm_mask_projective.g6b | 12 + src/render_program/exa_wm_mask_sample_a.g6a|1 + src/render_program/exa_wm_mask_sample_a.g6b|3 + src/render_program/exa_wm_mask_sample_argb.g6a |1 + src/render_program/exa_wm_mask_sample_argb.g6b |3 + src/render_program/exa_wm_noca.g6a |1 + src/render_program/exa_wm_noca.g6b |4 ++ src/render_program/exa_wm_src_projective.g6a | 63 src/render_program/exa_wm_src_projective.g6b | 12 + src/render_program/exa_wm_src_sample_a.g6a |1 + src/render_program/exa_wm_src_sample_a.g6b |3 + 19 files changed, 246 insertions(+), 0 deletions(-) create mode 12 src/render_program/exa_wm_ca.g6a create mode 100644 src/render_program/exa_wm_ca.g6b create mode 12 src/render_program/exa_wm_ca_srcalpha.g6a create mode 100644 src/render_program/exa_wm_ca_srcalpha.g6b create mode 100644 src/render_program/exa_wm_mask_affine.g6a create mode 100644 src/render_program/exa_wm_mask_affine.g6b create mode 100644 src/render_program/exa_wm_mask_projective.g6a create mode 100644 src/render_program/exa_wm_mask_projective.g6b create mode 12 src/render_program/exa_wm_mask_sample_a.g6a create mode 100644 src/render_program/exa_wm_mask_sample_a.g6b create mode 12 src/render_program/exa_wm_mask_sample_argb.g6a create mode 100644 src/render_program/exa_wm_mask_sample_argb.g6b create mode 12 src/render_program/exa_wm_noca.g6a create mode 100644 src/render_program/exa_wm_noca.g6b create mode 100644 src/render_program/exa_wm_src_projective.g6a create mode 100644 src/render_program/exa_wm_src_projective.g6b create mode 12 
src/render_program/exa_wm_src_sample_a.g6a create mode 100644 src/render_program/exa_wm_src_sample_a.g6b diff --git a/src/render_program/Makefile.am b/src/render_program/Makefile.am index 5229ef5..1a19437 100644 --- a/src/render_program/Makefile.am +++ b/src/render_program/Makefile.am @@ -63,15 +63,33 @@ INTEL_G4B_GEN5 =\ INTEL_G6A =\ exa_wm_src_affine.g6a \ + exa_wm_src_projective.g6a \ exa_wm_src_sample_argb.g6a \ exa_wm_src_sample_planar.g6a\ + exa_wm_src_sample_a.g6a \ + exa_wm_mask_affine.g6a \ + exa_wm_mask_projective.g6a \ + exa_wm_mask_sample_argb.g6a \ + exa_wm_mask_sample_a.g6a\ + exa_wm_ca.g6a \ + exa_wm_ca_srcalpha.g6a \ + exa_wm_noca.g6a \ exa_wm_write.g6a\ exa_wm_yuv_rgb.g6a INTEL_G6B =\ exa_wm_src_affine.g6b \ + exa_wm_src_projective.g6b \ exa_wm_src_sample_argb.g6b \ exa_wm_src_sample_planar.g6b\ + exa_wm_src_sample_a.g6b \ + exa_wm_mask_affine.g6b \ + exa_wm_mask_projective.g6b \ + exa_wm_mask_sample_argb.g6b \ + exa_wm_mask_sample_a.g6b\ + exa_wm_ca.g6b \ + exa_wm_ca_srcalpha.g6b \ + exa_wm_noca.g6b \ exa_wm_write.g6b\ exa_wm_yuv_rgb.g6b diff --git a/src/render_program/exa_wm_ca.g6a b/src/render_program/exa_wm_ca.g6a new file mode 12 index 000..a29acb1 --- /dev/null +++ b/src/render_program/exa_wm_ca.g6a @@ -0,0 +1 @@ +exa_wm_ca.g4a \ No newline at end of file diff --git a/src/render_program/exa_wm_ca.g6b b/src/render_program/exa_wm_ca.g6b new file mode 100644 index 000..521a5b6 --- /dev/null +++ b/src/render_program/exa_wm_ca.g6b @@ -0,0 +1,4 @@ + { 0x00800041, 0x21c077bd, 0x008d01c0, 0x008d02c0 }, + { 0x00800041, 0x220077bd, 0x008d0200, 0x008d0300 }, + { 0x00800041, 0x224077bd, 0x008d0240, 0x008d0340 }, + { 0x00800041, 0x228077bd, 0x008d0280, 0x008d0380 }, diff --git a/src/render_program/exa_wm_ca_srcalpha.g6a b/src/render_program/exa_wm_ca_srcalpha.g6a new file mode 12 index 000..3503521 --- /dev/null +++ b/src/render_program/exa_wm_ca_srcalpha.g6a @@ -0,0 +1 @@ +exa_wm_ca_srcalpha.g4a \ No newline at end of file diff --git 
a/src/render_program/exa_wm_ca_srcalpha.g6b b/src/render_program/exa_wm_ca_srcalpha.g6b new file mode 100644 index 000..d5ab7e4 --- /dev/null +++ b/src/render_program
[Intel-gfx] [PATCH 4/5] render: acceleration for composite on Sandybridge
Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/i965_render.c | 686 +++-- 1 files changed, 670 insertions(+), 16 deletions(-) diff --git a/src/i965_render.c b/src/i965_render.c index 885889e..e2b67c3 100644 --- a/src/i965_render.c +++ b/src/i965_render.c @@ -208,14 +208,8 @@ i965_check_composite(int op, int width, int height) { ScrnInfoPtr scrn = xf86Screens[dest_picture-pDrawable-pScreen-myNum]; - intel_screen_private *intel = intel_get_screen_private(scrn); uint32_t tmp1; - if (IS_GEN6(intel)) { - intel_debug_fallback(scrn, Unsupported hardware\n); - return FALSE; - } - /* Check for unsupported compositing operations. */ if (op = sizeof(i965_blend_op) / sizeof(i965_blend_op[0])) { intel_debug_fallback(scrn, @@ -522,6 +516,73 @@ static const uint32_t ps_kernel_masknoca_projective_static_gen5[][4] = { #include exa_wm_write.g4b.gen5 }; +/* programs for GEN6 */ +static const uint32_t ps_kernel_nomask_affine_static_gen6[][4] = { +#include exa_wm_src_affine.g6b +#include exa_wm_src_sample_argb.g6b +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_nomask_projective_static_gen6[][4] = { +#include exa_wm_src_projective.g6b +#include exa_wm_src_sample_argb.g6b +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_maskca_affine_static_gen6[][4] = { +#include exa_wm_src_affine.g6b +#include exa_wm_src_sample_argb.g6b +#include exa_wm_mask_affine.g6b +#include exa_wm_mask_sample_argb.g6b +#include exa_wm_ca.g6b +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_maskca_projective_static_gen6[][4] = { +#include exa_wm_src_projective.g6b +#include exa_wm_src_sample_argb.g6b +#include exa_wm_mask_projective.g6b +#include exa_wm_mask_sample_argb.g6b +#include exa_wm_ca.g4b.gen5 +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_maskca_srcalpha_affine_static_gen6[][4] = { +#include exa_wm_src_affine.g6b +#include exa_wm_src_sample_a.g6b +#include exa_wm_mask_affine.g6b +#include exa_wm_mask_sample_argb.g6b 
+#include exa_wm_ca_srcalpha.g6b +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_maskca_srcalpha_projective_static_gen6[][4] = { +#include exa_wm_src_projective.g6b +#include exa_wm_src_sample_a.g6b +#include exa_wm_mask_projective.g6b +#include exa_wm_mask_sample_argb.g6b +#include exa_wm_ca_srcalpha.g6b +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_masknoca_affine_static_gen6[][4] = { +#include exa_wm_src_affine.g6b +#include exa_wm_src_sample_argb.g6b +#include exa_wm_mask_affine.g6b +#include exa_wm_mask_sample_a.g6b +#include exa_wm_noca.g6b +#include exa_wm_write.g6b +}; + +static const uint32_t ps_kernel_masknoca_projective_static_gen6[][4] = { +#include exa_wm_src_projective.g6b +#include exa_wm_src_sample_argb.g6b +#include exa_wm_mask_projective.g6b +#include exa_wm_mask_sample_a.g6b +#include exa_wm_noca.g6b +#include exa_wm_write.g6b +}; + #define WM_STATE_DECL(kernel) \ struct brw_wm_unit_state wm_state_ ## kernel[SAMPLER_STATE_FILTER_COUNT] \ [SAMPLER_STATE_EXTEND_COUNT] \ @@ -607,6 +668,25 @@ static struct wm_kernel_info wm_kernels_gen5[] = { ps_kernel_masknoca_projective_static_gen5, TRUE), }; +static struct wm_kernel_info wm_kernels_gen6[] = { + KERNEL(WM_KERNEL_NOMASK_AFFINE, + ps_kernel_nomask_affine_static_gen6, FALSE), + KERNEL(WM_KERNEL_NOMASK_PROJECTIVE, + ps_kernel_nomask_projective_static_gen6, FALSE), + KERNEL(WM_KERNEL_MASKCA_AFFINE, + ps_kernel_maskca_affine_static_gen6, TRUE), + KERNEL(WM_KERNEL_MASKCA_PROJECTIVE, + ps_kernel_maskca_projective_static_gen6, TRUE), + KERNEL(WM_KERNEL_MASKCA_SRCALPHA_AFFINE, + ps_kernel_maskca_srcalpha_affine_static_gen6, TRUE), + KERNEL(WM_KERNEL_MASKCA_SRCALPHA_PROJECTIVE, + ps_kernel_maskca_srcalpha_projective_static_gen6, TRUE), + KERNEL(WM_KERNEL_MASKNOCA_AFFINE, + ps_kernel_masknoca_affine_static_gen6, TRUE), + KERNEL(WM_KERNEL_MASKNOCA_PROJECTIVE, + ps_kernel_masknoca_projective_static_gen6, TRUE), +}; + #undef KERNEL typedef struct _brw_cc_unit_state_padded { 
@@ -656,12 +736,22 @@ struct gen4_render_state { drm_intel_bo *sip_kernel_bo; dri_bo *vertex_buffer_bo; + drm_intel_bo *cc_vp_bo; + drm_intel_bo *gen6_blend_bo; + drm_intel_bo *gen6_depth_stencil_bo; + drm_intel_bo *ps_sampler_state_bo[SAMPLER_STATE_FILTER_COUNT] + [SAMPLER_STATE_EXTEND_COUNT] + [SAMPLER_STATE_FILTER_COUNT] + [SAMPLER_STATE_EXTEND_COUNT]; gen4_composite_op composite_op; int vb_offset; int vertex_size; }; +static void gen6_emit_composite_state(ScrnInfoPtr scrn); +static void
[Intel-gfx] [PATH v2 0/6] Xv on Sandybridge
Here is the set of patches to enable the texture adaptor on Sandybridge.
Currently you need to turn off shadow in /etc/xorg.conf to use texture
video on Sandybridge.

v2: refresh the patches, fix a conflict with a recent commit on master
[Intel-gfx] [PATH v2 1/6] Xv: set the surface state base address
To prepare for Xv on Sandybridge. It is easy to fill the binding table without relocation and make sure that the pointer to binding table only uses bits[15:0]. Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/i965_video.c | 141 + 1 files changed, 67 insertions(+), 74 deletions(-) diff --git a/src/i965_video.c b/src/i965_video.c index 4ededde..aaf10fa 100644 --- a/src/i965_video.c +++ b/src/i965_video.c @@ -360,17 +360,20 @@ intel_alloc_and_map(intel_screen_private *intel, char *name, int size, return 0; } -static drm_intel_bo *i965_create_dst_surface_state(ScrnInfoPtr scrn, - PixmapPtr pixmap) +static void i965_create_dst_surface_state(ScrnInfoPtr scrn, + PixmapPtr pixmap, + drm_intel_bo *surf_bo, + uint32_t offset) { intel_screen_private *intel = intel_get_screen_private(scrn); struct brw_surface_state *dest_surf_state; drm_intel_bo *pixmap_bo = intel_get_pixmap_bo(pixmap); - drm_intel_bo *surf_bo; - if (intel_alloc_and_map(intel, textured video surface state, 4096, - surf_bo, dest_surf_state) != 0) - return NULL; + if (drm_intel_bo_map(surf_bo, TRUE) != 0) + return; + + dest_surf_state = (struct brw_surface_state *)((char *)surf_bo-virtual + offset); + memset(dest_surf_state, 0, sizeof(*dest_surf_state)); dest_surf_state-ss0.surface_type = BRW_SURFACE_2D; dest_surf_state-ss0.data_return_format = @@ -393,7 +396,7 @@ static drm_intel_bo *i965_create_dst_surface_state(ScrnInfoPtr scrn, dest_surf_state-ss0.render_cache_read_mode = 0; dest_surf_state-ss1.base_addr = - intel_emit_reloc(surf_bo, offsetof(struct brw_surface_state, ss1), + intel_emit_reloc(surf_bo, offset + offsetof(struct brw_surface_state, ss1), pixmap_bo, 0, I915_GEM_DOMAIN_SAMPLER, 0); dest_surf_state-ss2.height = scrn-virtualY - 1; @@ -405,24 +408,25 @@ static drm_intel_bo *i965_create_dst_surface_state(ScrnInfoPtr scrn, dest_surf_state-ss3.tile_walk = 0; /* TileX */ drm_intel_bo_unmap(surf_bo); - return surf_bo; } -static drm_intel_bo *i965_create_src_surface_state(ScrnInfoPtr scrn, - 
drm_intel_bo * src_bo, - uint32_t src_offset, - int src_width, - int src_height, - int src_pitch, - uint32_t src_surf_format) +static void i965_create_src_surface_state(ScrnInfoPtr scrn, + drm_intel_bo * src_bo, + uint32_t src_offset, + int src_width, + int src_height, + int src_pitch, + uint32_t src_surf_format, + drm_intel_bo *surface_bo, + uint32_t offset) { - intel_screen_private *intel = intel_get_screen_private(scrn); - drm_intel_bo *surface_bo; struct brw_surface_state *src_surf_state; - if (intel_alloc_and_map(intel, textured video surface state, 4096, - surface_bo, src_surf_state) != 0) - return NULL; + if (drm_intel_bo_map(surface_bo, TRUE) != 0) + return; + + src_surf_state = (struct brw_surface_state *)((char *)surface_bo-virtual + offset); + memset(src_surf_state, 0, sizeof(*src_surf_state)); /* Set up the source surface state buffer */ src_surf_state-ss0.surface_type = BRW_SURFACE_2D; @@ -446,7 +450,7 @@ static drm_intel_bo *i965_create_src_surface_state(ScrnInfoPtr scrn, if (src_bo) { src_surf_state-ss1.base_addr = intel_emit_reloc(surface_bo, -offsetof(struct brw_surface_state, ss1), +offset + offsetof(struct brw_surface_state, ss1), src_bo, src_offset, I915_GEM_DOMAIN_SAMPLER, 0); } else { @@ -454,31 +458,25 @@ static drm_intel_bo *i965_create_src_surface_state(ScrnInfoPtr scrn, } drm_intel_bo_unmap(surface_bo); - return surface_bo; } -static drm_intel_bo *i965_create_binding_table(ScrnInfoPtr scrn, - drm_intel_bo ** surf_bos, - int n_surf
[Intel-gfx] [PATH v2 2/6] Xv: Send instruction doesn't use implied move when sampling YUV surface
The two fragments will be reused for sampling YUV surface and send doesn't have implied move on Sandybridge Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/render_program/exa_wm_src_sample_argb.g4a |3 ++- src/render_program/exa_wm_src_sample_argb.g4b |3 ++- src/render_program/exa_wm_src_sample_argb.g4b.gen5 |3 ++- src/render_program/exa_wm_src_sample_planar.g4a|7 --- src/render_program/exa_wm_src_sample_planar.g4b|7 --- .../exa_wm_src_sample_planar.g4b.gen5 |7 --- 6 files changed, 18 insertions(+), 12 deletions(-) diff --git a/src/render_program/exa_wm_src_sample_argb.g4a b/src/render_program/exa_wm_src_sample_argb.g4a index c20f53f..384fe26 100644 --- a/src/render_program/exa_wm_src_sample_argb.g4a +++ b/src/render_program/exa_wm_src_sample_argb.g4a @@ -36,12 +36,13 @@ include(`exa_wm.g4i') /* load argb */ mov (1) g0.81UD 0xUD { align1 mask_disable }; +mov (8) src_msg1UD g08,8,1UD { align1 }; /* copy to msg start reg*/ /* src_msg will be copied with g0, as it contains send desc */ /* emit sampler 'send' cmd */ send (16) src_msg_ind /* msg reg index */ src_sample_base1UW/* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (1,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 8 { align1 }; /* required message len 5, readback len 8 */ diff --git a/src/render_program/exa_wm_src_sample_argb.g4b b/src/render_program/exa_wm_src_sample_argb.g4b index c5b9274..a15e40a 100644 --- a/src/render_program/exa_wm_src_sample_argb.g4b +++ b/src/render_program/exa_wm_src_sample_argb.g4b @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x }, - { 0x01800031, 0x21c01d29, 0x008d, 0x02580001 }, + { 0x0061, 0x20200022, 0x008d, 0x }, + { 0x01800031, 0x21c01c09, 0x, 0x02580001 }, diff --git a/src/render_program/exa_wm_src_sample_argb.g4b.gen5 b/src/render_program/exa_wm_src_sample_argb.g4b.gen5 index f8cb41e..42039af 100644 --- 
a/src/render_program/exa_wm_src_sample_argb.g4b.gen5 +++ b/src/render_program/exa_wm_src_sample_argb.g4b.gen5 @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x }, - { 0x01800031, 0x21c01d29, 0x208d, 0x0a8a0001 }, + { 0x0061, 0x20200022, 0x008d, 0x }, + { 0x01800031, 0x21c01c09, 0x2000, 0x0a8a0001 }, diff --git a/src/render_program/exa_wm_src_sample_planar.g4a b/src/render_program/exa_wm_src_sample_planar.g4a index ad33350..5f5520b 100644 --- a/src/render_program/exa_wm_src_sample_planar.g4a +++ b/src/render_program/exa_wm_src_sample_planar.g4a @@ -41,9 +41,10 @@ mov (1) g0.81UD0xe000UD { align1 mask_disable }; /* emit sampler 'send' cmd */ /* sample Y */ +mov (8) src_msg1UD g08,8,1UD { align1 }; /* copy to msg start reg*/ send (16) src_msg_ind /* msg reg index */ src_sample_g1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (1,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ @@ -51,7 +52,7 @@ send (16) src_msg_ind /* msg reg index */ /* sample U (Cr) */ send (16) src_msg_ind /* msg reg index */ src_sample_r1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (3,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ @@ -59,7 +60,7 @@ send (16) src_msg_ind /* msg reg index */ /* sample V (Cb) */ send (16) src_msg_ind /* msg reg index */ src_sample_b1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (5,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ diff --git a/src/render_program/exa_wm_src_sample_planar.g4b 
b/src/render_program/exa_wm_src_sample_planar.g4b index 23e5e0d..c8dc47d 100644 --- a/src/render_program/exa_wm_src_sample_planar.g4b +++ b/src/render_program/exa_wm_src_sample_planar.g4b @@ -1,4 +1,5 @@ { 0x0201, 0x20080061, 0x, 0xe000 }, - { 0x01800031, 0x22001d29, 0x008d, 0x02520001 }, - { 0x01800031, 0x21c01d29, 0x008d, 0x02520003 }, - { 0x01800031, 0x22401d29
[Intel-gfx] [PATH v2 4/6] Xv: setup pipeline for Xv on Sandybridge
Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/brw_structs.h | 100 src/i965_reg.h | 98 src/i965_video.c| 627 +++ src/intel.h |4 + src/intel_batchbuffer.c | 25 ++- src/intel_video.h |7 + 6 files changed, 855 insertions(+), 6 deletions(-) diff --git a/src/brw_structs.h b/src/brw_structs.h index 1cee5bd..d089ba1 100644 --- a/src/brw_structs.h +++ b/src/brw_structs.h @@ -1487,4 +1487,104 @@ struct brw_interface_descriptor { } desc3; }; +struct gen6_blend_state +{ + struct { + unsigned int dest_blend_factor:5; + unsigned int source_blend_factor:5; + unsigned int pad3:1; + unsigned int blend_func:3; + unsigned int pad2:1; + unsigned int ia_dest_blend_factor:5; + unsigned int ia_source_blend_factor:5; + unsigned int pad1:1; + unsigned int ia_blend_func:3; + unsigned int pad0:1; + unsigned int ia_blend_enable:1; + unsigned int blend_enable:1; + } blend0; + + struct { + unsigned int post_blend_clamp_enable:1; + unsigned int pre_blend_clamp_enable:1; + unsigned int clamp_range:2; + unsigned int pad0:4; + unsigned int x_dither_offset:2; + unsigned int y_dither_offset:2; + unsigned int dither_enable:1; + unsigned int alpha_test_func:3; + unsigned int alpha_test_enable:1; + unsigned int pad1:1; + unsigned int logic_op_func:4; + unsigned int logic_op_enable:1; + unsigned int pad2:1; + unsigned int write_disable_b:1; + unsigned int write_disable_g:1; + unsigned int write_disable_r:1; + unsigned int write_disable_a:1; + unsigned int pad3:1; + unsigned int alpha_to_coverage_dither:1; + unsigned int alpha_to_one:1; + unsigned int alpha_to_coverage:1; + } blend1; +}; + +struct gen6_color_calc_state +{ + struct { + unsigned int alpha_test_format:1; + unsigned int pad0:14; + unsigned int round_disable:1; + unsigned int bf_stencil_ref:8; + unsigned int stencil_ref:8; + } cc0; + + union { + float alpha_ref_f; + struct { + unsigned int ui:8; + unsigned int pad0:24; + } alpha_ref_fi; + } cc1; + + float constant_r; + float constant_g; + float constant_b; + float constant_a; +}; 
+ +struct gen6_depth_stencil_state +{ + struct { + unsigned int pad0:3; + unsigned int bf_stencil_pass_depth_pass_op:3; + unsigned int bf_stencil_pass_depth_fail_op:3; + unsigned int bf_stencil_fail_op:3; + unsigned int bf_stencil_func:3; + unsigned int bf_stencil_enable:1; + unsigned int pad1:2; + unsigned int stencil_write_enable:1; + unsigned int stencil_pass_depth_pass_op:3; + unsigned int stencil_pass_depth_fail_op:3; + unsigned int stencil_fail_op:3; + unsigned int stencil_func:3; + unsigned int stencil_enable:1; + } ds0; + + struct { + unsigned int bf_stencil_write_mask:8; + unsigned int bf_stencil_test_mask:8; + unsigned int stencil_write_mask:8; + unsigned int stencil_test_mask:8; + } ds1; + + struct { + unsigned int pad0:26; + unsigned int depth_write_enable:1; + unsigned int depth_test_func:3; + unsigned int pad1:1; + unsigned int depth_test_enable:1; + } ds2; +}; + #endif diff --git a/src/i965_reg.h b/src/i965_reg.h index fe419dc..3953dab 100644 --- a/src/i965_reg.h +++ b/src/i965_reg.h @@ -22,6 +22,10 @@ #define BRW_3DSTATE_PIPELINED_POINTERS BRW_3D(3, 0, 0) #define BRW_3DSTATE_BINDING_TABLE_POINTERS BRW_3D(3, 0, 1) +# define GEN6_3DSTATE_BINDING_TABLE_MODIFY_PS (1 12)/* for GEN6 */ +# define GEN6_3DSTATE_BINDING_TABLE_MODIFY_GS (1 9) /* for GEN6 */ +# define GEN6_3DSTATE_BINDING_TABLE_MODIFY_VS (1 8) /* for GEN6 */ + #define BRW_3DSTATE_VERTEX_BUFFERS BRW_3D(3, 0, 8) #define BRW_3DSTATE_VERTEX_ELEMENTSBRW_3D(3, 0, 9) #define BRW_3DSTATE_INDEX_BUFFER BRW_3D(3, 0, 0xa) @@ -32,6 +36,9 @@ #define BRW_3DSTATE_SAMPLER_PALETTE_LOAD BRW_3D(3, 1, 2) #define BRW_3DSTATE_CHROMA_KEY BRW_3D(3, 1, 4) #define BRW_3DSTATE_DEPTH_BUFFER BRW_3D(3, 1, 5) +# define BRW_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT 29
[Intel-gfx] [PATH v2 5/6] Xv: enable TextureAdaptor for Sandybridge
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/intel_video.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/intel_video.c b/src/intel_video.c
index 5d16778..afc2405 100644
--- a/src/intel_video.c
+++ b/src/intel_video.c
@@ -364,7 +364,6 @@ void I830InitVideo(ScreenPtr screen)
 	 */
 	if (scrn->bitsPerPixel >= 16 &&
 	    INTEL_INFO(intel)->gen >= 30 &&
-	    INTEL_INFO(intel)->gen < 60 &&
 	    !intel->use_shadow) {
 		texturedAdaptor = I830SetupImageVideoTextured(screen);
 		if (texturedAdaptor != NULL) {
@@ -1583,7 +1582,12 @@ I830PutImageTextured(ScrnInfoPtr scrn,
 		intel_wait_for_scanline(scrn, pixmap, crtc, clipBoxes);
 	}

-	if (INTEL_INFO(intel)->gen >= 40) {
+	if (INTEL_INFO(intel)->gen >= 60) {
+		Gen6DisplayVideoTextured(scrn, adaptor_priv, id, clipBoxes,
+					 width, height, dstPitch, dstPitch2,
+					 src_w, src_h,
+					 drw_w, drw_h, pixmap);
+	} else if (INTEL_INFO(intel)->gen >= 40) {
 		I965DisplayVideoTextured(scrn, adaptor_priv, id, clipBoxes,
 					 width, height, dstPitch, dstPitch2,
 					 src_w, src_h,
--
1.7.0.4
[Intel-gfx] [PATCH 1/6] Xv: set the surface state base address
This prepares for Xv on Sandybridge. With the surface state base address set, the binding table can be filled without relocations, and the pointer to the binding table is guaranteed to use only bits [15:0].

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/i965_video.c |  141 +
 1 files changed, 67 insertions(+), 74 deletions(-)

diff --git a/src/i965_video.c b/src/i965_video.c
index 4ededde..aaf10fa 100644
--- a/src/i965_video.c
+++ b/src/i965_video.c
@@ -360,17 +360,20 @@ intel_alloc_and_map(intel_screen_private *intel, char *name, int size,
 	return 0;
 }

-static drm_intel_bo *i965_create_dst_surface_state(ScrnInfoPtr scrn,
-						    PixmapPtr pixmap)
+static void i965_create_dst_surface_state(ScrnInfoPtr scrn,
+					  PixmapPtr pixmap,
+					  drm_intel_bo *surf_bo,
+					  uint32_t offset)
 {
 	intel_screen_private *intel = intel_get_screen_private(scrn);
 	struct brw_surface_state *dest_surf_state;
 	drm_intel_bo *pixmap_bo = intel_get_pixmap_bo(pixmap);
-	drm_intel_bo *surf_bo;

-	if (intel_alloc_and_map(intel, "textured video surface state", 4096,
-				&surf_bo, &dest_surf_state) != 0)
-		return NULL;
+	if (drm_intel_bo_map(surf_bo, TRUE) != 0)
+		return;
+
+	dest_surf_state = (struct brw_surface_state *)((char *)surf_bo->virtual + offset);
+	memset(dest_surf_state, 0, sizeof(*dest_surf_state));

 	dest_surf_state->ss0.surface_type = BRW_SURFACE_2D;
 	dest_surf_state->ss0.data_return_format =
@@ -393,7 +396,7 @@ static drm_intel_bo *i965_create_dst_surface_state(ScrnInfoPtr scrn,
 	dest_surf_state->ss0.render_cache_read_mode = 0;

 	dest_surf_state->ss1.base_addr =
-	    intel_emit_reloc(surf_bo, offsetof(struct brw_surface_state, ss1),
+	    intel_emit_reloc(surf_bo, offset + offsetof(struct brw_surface_state, ss1),
 			     pixmap_bo, 0, I915_GEM_DOMAIN_SAMPLER, 0);

 	dest_surf_state->ss2.height = scrn->virtualY - 1;
@@ -405,24 +408,25 @@ static drm_intel_bo *i965_create_dst_surface_state(ScrnInfoPtr scrn,
 	dest_surf_state->ss3.tile_walk = 0;	/* TileX */

 	drm_intel_bo_unmap(surf_bo);
-	return surf_bo;
 }

-static drm_intel_bo *i965_create_src_surface_state(ScrnInfoPtr scrn,
-						    drm_intel_bo * src_bo,
-						    uint32_t src_offset,
-						    int src_width,
-						    int src_height,
-						    int src_pitch,
-						    uint32_t src_surf_format)
+static void i965_create_src_surface_state(ScrnInfoPtr scrn,
+					  drm_intel_bo * src_bo,
+					  uint32_t src_offset,
+					  int src_width,
+					  int src_height,
+					  int src_pitch,
+					  uint32_t src_surf_format,
+					  drm_intel_bo *surface_bo,
+					  uint32_t offset)
 {
-	intel_screen_private *intel = intel_get_screen_private(scrn);
-	drm_intel_bo *surface_bo;
 	struct brw_surface_state *src_surf_state;

-	if (intel_alloc_and_map(intel, "textured video surface state", 4096,
-				&surface_bo, &src_surf_state) != 0)
-		return NULL;
+	if (drm_intel_bo_map(surface_bo, TRUE) != 0)
+		return;
+
+	src_surf_state = (struct brw_surface_state *)((char *)surface_bo->virtual + offset);
+	memset(src_surf_state, 0, sizeof(*src_surf_state));

 	/* Set up the source surface state buffer */
 	src_surf_state->ss0.surface_type = BRW_SURFACE_2D;
@@ -446,7 +450,7 @@ static drm_intel_bo *i965_create_src_surface_state(ScrnInfoPtr scrn,

 	if (src_bo) {
 		src_surf_state->ss1.base_addr =
 		    intel_emit_reloc(surface_bo,
-				     offsetof(struct brw_surface_state, ss1),
+				     offset + offsetof(struct brw_surface_state, ss1),
 				     src_bo, src_offset, I915_GEM_DOMAIN_SAMPLER, 0);
 	} else {
@@ -454,31 +458,25 @@ static drm_intel_bo *i965_create_src_surface_state(ScrnInfoPtr scrn,
 	}

 	drm_intel_bo_unmap(surface_bo);
-	return surface_bo;
 }

-static drm_intel_bo *i965_create_binding_table(ScrnInfoPtr scrn,
-						drm_intel_bo ** surf_bos,
-						int n_surf
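The idea in the commit message above can be sketched in isolation. This is only an illustration, not code from the patch: it assumes that all surface states and the binding table live in one buffer, and that binding table entries are offsets from the surface state base address rather than absolute addresses. The constants SURFACE_STATE_PADDED_SIZE and MAX_SURFACES and all function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical layout constants, chosen only for illustration. */
#define SURFACE_STATE_PADDED_SIZE 32u	/* assumed padded size of one surface state */
#define MAX_SURFACES 7u			/* assumed number of surface slots */

/* Offset of surface state `index` from the start of the shared buffer. */
static uint32_t surface_state_offset(uint32_t index)
{
	assert(index < MAX_SURFACES);
	return index * SURFACE_STATE_PADDED_SIZE;
}

/* The binding table is placed directly after the surface states. */
static uint32_t binding_table_offset(void)
{
	return MAX_SURFACES * SURFACE_STATE_PADDED_SIZE;
}

/*
 * Fill the binding table with offsets relative to the surface state
 * base address. The entries are not absolute addresses, so writing
 * them needs no relocation.
 */
static void fill_binding_table(uint32_t *table, uint32_t n_surf)
{
	for (uint32_t i = 0; i < n_surf; i++)
		table[i] = surface_state_offset(i);
}

/* Check that an offset uses only bits [15:0]. */
static int fits_in_16_bits(uint32_t offset)
{
	return (offset & ~0xffffu) == 0;
}
```

Because every offset is measured from the start of one small buffer, fits_in_16_bits() holds for the binding table pointer as long as the buffer stays under 64 KiB.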
[Intel-gfx] [PATCH 2/6] Xv: Send instruction doesn't use implied move when sampling YUV surface
The two fragments will be reused for sampling YUV surface and send doesn't have implied move on Sandybridge Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/render_program/exa_wm_src_sample_argb.g4a |3 ++- src/render_program/exa_wm_src_sample_argb.g4b |3 ++- src/render_program/exa_wm_src_sample_argb.g4b.gen5 |3 ++- src/render_program/exa_wm_src_sample_planar.g4a|7 --- src/render_program/exa_wm_src_sample_planar.g4b|7 --- .../exa_wm_src_sample_planar.g4b.gen5 |7 --- 6 files changed, 18 insertions(+), 12 deletions(-) diff --git a/src/render_program/exa_wm_src_sample_argb.g4a b/src/render_program/exa_wm_src_sample_argb.g4a index c20f53f..384fe26 100644 --- a/src/render_program/exa_wm_src_sample_argb.g4a +++ b/src/render_program/exa_wm_src_sample_argb.g4a @@ -36,12 +36,13 @@ include(`exa_wm.g4i') /* load argb */ mov (1) g0.81UD 0xUD { align1 mask_disable }; +mov (8) src_msg1UD g08,8,1UD { align1 }; /* copy to msg start reg*/ /* src_msg will be copied with g0, as it contains send desc */ /* emit sampler 'send' cmd */ send (16) src_msg_ind /* msg reg index */ src_sample_base1UW/* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (1,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 8 { align1 }; /* required message len 5, readback len 8 */ diff --git a/src/render_program/exa_wm_src_sample_argb.g4b b/src/render_program/exa_wm_src_sample_argb.g4b index c5b9274..a15e40a 100644 --- a/src/render_program/exa_wm_src_sample_argb.g4b +++ b/src/render_program/exa_wm_src_sample_argb.g4b @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x }, - { 0x01800031, 0x21c01d29, 0x008d, 0x02580001 }, + { 0x0061, 0x20200022, 0x008d, 0x }, + { 0x01800031, 0x21c01c09, 0x, 0x02580001 }, diff --git a/src/render_program/exa_wm_src_sample_argb.g4b.gen5 b/src/render_program/exa_wm_src_sample_argb.g4b.gen5 index f8cb41e..42039af 100644 --- 
a/src/render_program/exa_wm_src_sample_argb.g4b.gen5 +++ b/src/render_program/exa_wm_src_sample_argb.g4b.gen5 @@ -1,2 +1,3 @@ { 0x0201, 0x20080061, 0x, 0x }, - { 0x01800031, 0x21c01d29, 0x208d, 0x0a8a0001 }, + { 0x0061, 0x20200022, 0x008d, 0x }, + { 0x01800031, 0x21c01c09, 0x2000, 0x0a8a0001 }, diff --git a/src/render_program/exa_wm_src_sample_planar.g4a b/src/render_program/exa_wm_src_sample_planar.g4a index ad33350..5f5520b 100644 --- a/src/render_program/exa_wm_src_sample_planar.g4a +++ b/src/render_program/exa_wm_src_sample_planar.g4a @@ -41,9 +41,10 @@ mov (1) g0.81UD0xe000UD { align1 mask_disable }; /* emit sampler 'send' cmd */ /* sample Y */ +mov (8) src_msg1UD g08,8,1UD { align1 }; /* copy to msg start reg*/ send (16) src_msg_ind /* msg reg index */ src_sample_g1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (1,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ @@ -51,7 +52,7 @@ send (16) src_msg_ind /* msg reg index */ /* sample U (Cr) */ send (16) src_msg_ind /* msg reg index */ src_sample_r1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (3,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ @@ -59,7 +60,7 @@ send (16) src_msg_ind /* msg reg index */ /* sample V (Cb) */ send (16) src_msg_ind /* msg reg index */ src_sample_b1UW /* readback */ - g08,8,1UW /* copy to msg start reg*/ + null sampler (5,0,F) /* sampler message description, (binding_table,sampler_index,datatype) /* here(src-dst) we should use src_sampler and src_surface */ mlen 5 rlen 2 { align1 }; /* required message len 5, readback len 8 */ diff --git a/src/render_program/exa_wm_src_sample_planar.g4b 
b/src/render_program/exa_wm_src_sample_planar.g4b index 23e5e0d..c8dc47d 100644 --- a/src/render_program/exa_wm_src_sample_planar.g4b +++ b/src/render_program/exa_wm_src_sample_planar.g4b @@ -1,4 +1,5 @@ { 0x0201, 0x20080061, 0x, 0xe000 }, - { 0x01800031, 0x22001d29, 0x008d, 0x02520001 }, - { 0x01800031, 0x21c01d29, 0x008d, 0x02520003 }, - { 0x01800031, 0x22401d29
[Intel-gfx] [PATCH 4/6] Xv: setup pipeline for Xv on Sandybridge
Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/brw_structs.h | 100 src/i965_reg.h | 98 src/i965_video.c| 624 +++ src/intel.h |4 + src/intel_batchbuffer.c | 25 ++- src/intel_video.h |7 + 6 files changed, 852 insertions(+), 6 deletions(-) diff --git a/src/brw_structs.h b/src/brw_structs.h index 1cee5bd..d089ba1 100644 --- a/src/brw_structs.h +++ b/src/brw_structs.h @@ -1487,4 +1487,104 @@ struct brw_interface_descriptor { } desc3; }; +struct gen6_blend_state +{ + struct { + unsigned int dest_blend_factor:5; + unsigned int source_blend_factor:5; + unsigned int pad3:1; + unsigned int blend_func:3; + unsigned int pad2:1; + unsigned int ia_dest_blend_factor:5; + unsigned int ia_source_blend_factor:5; + unsigned int pad1:1; + unsigned int ia_blend_func:3; + unsigned int pad0:1; + unsigned int ia_blend_enable:1; + unsigned int blend_enable:1; + } blend0; + + struct { + unsigned int post_blend_clamp_enable:1; + unsigned int pre_blend_clamp_enable:1; + unsigned int clamp_range:2; + unsigned int pad0:4; + unsigned int x_dither_offset:2; + unsigned int y_dither_offset:2; + unsigned int dither_enable:1; + unsigned int alpha_test_func:3; + unsigned int alpha_test_enable:1; + unsigned int pad1:1; + unsigned int logic_op_func:4; + unsigned int logic_op_enable:1; + unsigned int pad2:1; + unsigned int write_disable_b:1; + unsigned int write_disable_g:1; + unsigned int write_disable_r:1; + unsigned int write_disable_a:1; + unsigned int pad3:1; + unsigned int alpha_to_coverage_dither:1; + unsigned int alpha_to_one:1; + unsigned int alpha_to_coverage:1; + } blend1; +}; + +struct gen6_color_calc_state +{ + struct { + unsigned int alpha_test_format:1; + unsigned int pad0:14; + unsigned int round_disable:1; + unsigned int bf_stencil_ref:8; + unsigned int stencil_ref:8; + } cc0; + + union { + float alpha_ref_f; + struct { + unsigned int ui:8; + unsigned int pad0:24; + } alpha_ref_fi; + } cc1; + + float constant_r; + float constant_g; + float constant_b; + float constant_a; +}; 
+ +struct gen6_depth_stencil_state +{ + struct { + unsigned int pad0:3; + unsigned int bf_stencil_pass_depth_pass_op:3; + unsigned int bf_stencil_pass_depth_fail_op:3; + unsigned int bf_stencil_fail_op:3; + unsigned int bf_stencil_func:3; + unsigned int bf_stencil_enable:1; + unsigned int pad1:2; + unsigned int stencil_write_enable:1; + unsigned int stencil_pass_depth_pass_op:3; + unsigned int stencil_pass_depth_fail_op:3; + unsigned int stencil_fail_op:3; + unsigned int stencil_func:3; + unsigned int stencil_enable:1; + } ds0; + + struct { + unsigned int bf_stencil_write_mask:8; + unsigned int bf_stencil_test_mask:8; + unsigned int stencil_write_mask:8; + unsigned int stencil_test_mask:8; + } ds1; + + struct { + unsigned int pad0:26; + unsigned int depth_write_enable:1; + unsigned int depth_test_func:3; + unsigned int pad1:1; + unsigned int depth_test_enable:1; + } ds2; +}; + #endif diff --git a/src/i965_reg.h b/src/i965_reg.h index fe419dc..3953dab 100644 --- a/src/i965_reg.h +++ b/src/i965_reg.h @@ -22,6 +22,10 @@ #define BRW_3DSTATE_PIPELINED_POINTERS BRW_3D(3, 0, 0) #define BRW_3DSTATE_BINDING_TABLE_POINTERS BRW_3D(3, 0, 1) +# define GEN6_3DSTATE_BINDING_TABLE_MODIFY_PS (1 12)/* for GEN6 */ +# define GEN6_3DSTATE_BINDING_TABLE_MODIFY_GS (1 9) /* for GEN6 */ +# define GEN6_3DSTATE_BINDING_TABLE_MODIFY_VS (1 8) /* for GEN6 */ + #define BRW_3DSTATE_VERTEX_BUFFERS BRW_3D(3, 0, 8) #define BRW_3DSTATE_VERTEX_ELEMENTSBRW_3D(3, 0, 9) #define BRW_3DSTATE_INDEX_BUFFER BRW_3D(3, 0, 0xa) @@ -32,6 +36,9 @@ #define BRW_3DSTATE_SAMPLER_PALETTE_LOAD BRW_3D(3, 1, 2) #define BRW_3DSTATE_CHROMA_KEY BRW_3D(3, 1, 4) #define BRW_3DSTATE_DEPTH_BUFFER BRW_3D(3, 1, 5) +# define BRW_3DSTATE_DEPTH_BUFFER_TYPE_SHIFT 29
[Intel-gfx] [PATCH 5/6] Xv: enable TextureAdaptor for Sandybridge
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/intel_video.c |    8 ++++++--
 1 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/src/intel_video.c b/src/intel_video.c
index 5d16778..afc2405 100644
--- a/src/intel_video.c
+++ b/src/intel_video.c
@@ -364,7 +364,6 @@ void I830InitVideo(ScreenPtr screen)
 	 */
 	if (scrn->bitsPerPixel >= 16 &&
 	    INTEL_INFO(intel)->gen >= 30 &&
-	    INTEL_INFO(intel)->gen < 60 &&
 	    !intel->use_shadow) {
 		texturedAdaptor = I830SetupImageVideoTextured(screen);
 		if (texturedAdaptor != NULL) {
@@ -1583,7 +1582,12 @@ I830PutImageTextured(ScrnInfoPtr scrn,
 		intel_wait_for_scanline(scrn, pixmap, crtc, clipBoxes);
 	}

-	if (INTEL_INFO(intel)->gen >= 40) {
+	if (INTEL_INFO(intel)->gen >= 60) {
+		Gen6DisplayVideoTextured(scrn, adaptor_priv, id, clipBoxes,
+					 width, height, dstPitch, dstPitch2,
+					 src_w, src_h,
+					 drw_w, drw_h, pixmap);
+	} else if (INTEL_INFO(intel)->gen >= 40) {
 		I965DisplayVideoTextured(scrn, adaptor_priv, id, clipBoxes,
 					 width, height, dstPitch, dstPitch2,
 					 src_w, src_h,
--
1.7.0.4
[Intel-gfx] [PATCH 6/6] Xv: don't call intel_wait_for_scanline on Sandybridge
MI_LOAD_SCAN_LINE_INCL command is not available on Sandybridge.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/intel_video.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/src/intel_video.c b/src/intel_video.c
index afc2405..cdff149 100644
--- a/src/intel_video.c
+++ b/src/intel_video.c
@@ -1578,7 +1578,7 @@ I830PutImageTextured(ScrnInfoPtr scrn,
 		return BadAlloc;
 	}

-	if (crtc && adaptor_priv->SyncToVblank != 0) {
+	if (crtc && adaptor_priv->SyncToVblank != 0 && INTEL_INFO(intel)->gen < 60) {
 		intel_wait_for_scanline(scrn, pixmap, crtc, clipBoxes);
 	}

--
1.7.0.4
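The generation checks in patches 5/6 and 6/6 can be read as two small decisions. The sketch below restates them outside the driver. It assumes, as the comparisons against 30, 40 and 60 in the patches suggest, that the generation is encoded as a multiple of ten with Sandybridge at 60. The enum and function names are hypothetical and do not exist in the driver.

```c
#include <assert.h>
#include <stdbool.h>

/* Assumed encoding: generation times ten, Sandybridge == 60. */
#define GEN_I965 40
#define GEN_SANDYBRIDGE 60

/* Hypothetical labels for the textured video paths. */
enum video_path {
	PATH_OLDER,	/* pre-gen4 path, not shown in the patch */
	PATH_I965,	/* I965DisplayVideoTextured */
	PATH_GEN6	/* Gen6DisplayVideoTextured */
};

/* Pick the display routine: the newest matching generation wins. */
static enum video_path select_textured_path(int gen)
{
	if (gen >= GEN_SANDYBRIDGE)
		return PATH_GEN6;
	if (gen >= GEN_I965)
		return PATH_I965;
	return PATH_OLDER;
}

/*
 * Only wait for the scanline when a CRTC is present, vblank sync is
 * requested, and the hardware is older than Sandybridge, where the
 * commit message says MI_LOAD_SCAN_LINE_INCL is not available.
 */
static bool should_wait_for_scanline(bool have_crtc, bool sync_to_vblank, int gen)
{
	return have_crtc && sync_to_vblank && gen < GEN_SANDYBRIDGE;
}
```

The order of the comparisons in select_textured_path() matters: testing gen >= 40 first would also match Sandybridge and never reach the gen6 path.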
Re: [Intel-gfx] Xv on Sandybridge
On Thu, 2010-10-21 at 17:31 +0800, Chris Wilson wrote:
> On Thu, 21 Oct 2010 16:55:40 +0800, Xiang, Haihao <haihao.xi...@intel.com> wrote:
> > Here is the set of patches to enable texture adaptor on Sandybridge.
> > Currently you need to turn off shadow in /etc/xorg.conf to use
> > texture video on Sandybridge
>
> So the remaining issue is mixing the TexturedAdaptor with shadow, which
> requires some similar trickery as dri2 in order to render directly to the
> front-buffer?

Currently we are adding support for 2D HW acceleration (BLT + composite) on Sandybridge. Once it is done, we can disable shadow by default on Sandybridge. As for TexturedAdaptor+shadow, we may add support for it if we have time.

Thanks
Haihao

> Or perhaps a better pixmap migration to supersede shadow, which is coming
> along but a much larger task. So back to the hacks for the short term.
> -Chris
[Intel-gfx] support for Sandybridge in GFX assembler
Here is a set of patches for the GFX assembler to support Sandybridge. We will try to re-use all existing render shaders with these fixes. Note that these patches do not yet support some Sandybridge ISA changes, such as math instructions and IF/ELSE/ENDIF.
[Intel-gfx] [PATCH 03/10] always set destination horiz stride for Align16 to 1 on Sandybridge.
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/brw_structs.h |    4 ++--
 src/gram.y        |    2 ++
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/brw_structs.h b/src/brw_structs.h
index ba20547..32a52df 100644
--- a/src/brw_structs.h
+++ b/src/brw_structs.h
@@ -1102,7 +1102,7 @@ struct brw_instruction
 	 GLuint dest_writemask:4;
 	 GLuint dest_subreg_nr:1;
 	 GLuint dest_reg_nr:8;
-	 GLuint pad1:2;
+	 GLuint dest_horiz_stride:2;
 	 GLuint dest_address_mode:1;
       } da16;

@@ -1116,7 +1116,7 @@ struct brw_instruction
 	 GLuint dest_writemask:4;
 	 GLint dest_indirect_offset:6;
 	 GLuint dest_subreg_nr:3;
-	 GLuint pad1:2;
+	 GLuint dest_horiz_stride:2;
 	 GLuint dest_address_mode:1;
       } ia16;
    } bits1;
diff --git a/src/gram.y b/src/gram.y
index 438559a..f57e97c 100644
--- a/src/gram.y
+++ b/src/gram.y
@@ -1668,6 +1668,7 @@ int set_instruction_dest(struct brw_instruction *instr,
 			instr->bits1.da16.dest_subreg_nr = dest->subreg_nr;
 			instr->bits1.da16.dest_reg_nr = dest->reg_nr;
 			instr->bits1.da16.dest_address_mode = dest->address_mode;
+			instr->bits1.da16.dest_horiz_stride = 1;
 			instr->bits1.da16.dest_writemask = dest->writemask;
 		} else if (instr->header.access_mode == BRW_ALIGN_1) {
 			instr->bits1.ia1.dest_reg_file = dest->reg_file;
@@ -1687,6 +1688,7 @@ int set_instruction_dest(struct brw_instruction *instr,
 			instr->bits1.ia16.dest_subreg_nr = dest->address_subreg_nr;
 			instr->bits1.ia16.dest_writemask = dest->writemask;
 			instr->bits1.ia16.dest_indirect_offset = dest->indirect_offset;
+			instr->bits1.ia16.dest_horiz_stride = 1;
 			instr->bits1.ia16.dest_address_mode = dest->address_mode;
 		}

--
1.7.0.4
[Intel-gfx] [PATCH 05/10] fix send instruction on Sandybridge
Send doesn't have implied move on Sandybridge, the SFID moves to bits[24,27] which is used as the destination of the implied move on Prev GEN6. Signed-off-by: Xiang, Haihao haihao.xi...@intel.com --- src/brw_structs.h |2 +- src/disasm.c |4 ++-- src/gram.y| 20 +--- 3 files changed, 16 insertions(+), 10 deletions(-) diff --git a/src/brw_structs.h b/src/brw_structs.h index 515d2aa..92a398e 100644 --- a/src/brw_structs.h +++ b/src/brw_structs.h @@ -1053,7 +1053,7 @@ struct brw_instruction GLuint predicate_control:4; /* 0x000f */ GLuint predicate_inverse:1; /* 0x0010 */ GLuint execution_size:3; /* 0x00e0 */ - GLuint destreg__conditionalmod:4; /* destreg - send, conditionalmod - others */ + GLuint sfid_destreg__conditionalmod:4; /* sfid - send on GEN6+, destreg - send on Prev GEN6, conditionalmod - others */ GLuint acc_wr_control:1; /* 0x1000 */ GLuint pad0:1;/* 0x2000 */ GLuint debug_control:1; /* 0x4000 */ diff --git a/src/disasm.c b/src/disasm.c index 37e8b51..8180149 100644 --- a/src/disasm.c +++ b/src/disasm.c @@ -795,7 +795,7 @@ int disasm (FILE *file, struct brw_instruction *inst) if (inst-header.opcode != BRW_OPCODE_SEND) err |= control (file, conditional modifier, conditional_modifier, - inst-header.destreg__conditionalmod, NULL); + inst-header.sfid_destreg__conditionalmod, NULL); if (inst-header.opcode != BRW_OPCODE_NOP) { string (file, (); @@ -804,7 +804,7 @@ int disasm (FILE *file, struct brw_instruction *inst) } if (inst-header.opcode == BRW_OPCODE_SEND) - format (file, %d, inst-header.destreg__conditionalmod); + format (file, %d, inst-header.sfid_destreg__conditionalmod); if (opcode[inst-header.opcode].ndst 0) { pad (file, 16); diff --git a/src/gram.y b/src/gram.y index fcbbd81..2dab7a2 100644 --- a/src/gram.y +++ b/src/gram.y @@ -243,7 +243,7 @@ unaryinstruction: { bzero($$, sizeof($$)); $$.header.opcode = $2; - $$.header.destreg__conditionalmod = $3; + $$.header.sfid_destreg__conditionalmod = $3; $$.header.saturate = $4; $$.header.execution_size = $5; 
set_instruction_options($$, $8); @@ -264,7 +264,7 @@ binaryinstruction: { bzero($$, sizeof($$)); $$.header.opcode = $2; - $$.header.destreg__conditionalmod = $3; + $$.header.sfid_destreg__conditionalmod = $3; $$.header.saturate = $4; $$.header.execution_size = $5; set_instruction_options($$, $9); @@ -287,7 +287,7 @@ binaryaccinstruction: { bzero($$, sizeof($$)); $$.header.opcode = $2; - $$.header.destreg__conditionalmod = $3; + $$.header.sfid_destreg__conditionalmod = $3; $$.header.saturate = $4; $$.header.execution_size = $5; set_instruction_options($$, $9); @@ -322,7 +322,6 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget bzero($$, sizeof($$)); $$.header.opcode = $2; $$.header.execution_size = $3; - $$.header.destreg__conditionalmod = $4; /* msg reg index */ set_instruction_predicate($$, $1); if (set_instruction_dest($$, $5) != 0) YYERROR; @@ -331,15 +330,22 @@ sendinstruction: predicate SEND execsize exp post_dst payload msgtarget $$.bits1.da1.src1_reg_file = BRW_IMMEDIATE_VALUE; $$.bits1.da1.src1_reg_type = BRW_REGISTER_TYPE_D; - if (gen_level == 5) { - $$.bits2.send_gen5.sfid = $7.bits2.send_gen5.sfid; - $$.bits2.send_gen5.end_of_thread = $12.bits3.generic_gen5.end_of_thread; + if (gen_level = 5) { + if (gen_level 5) { + $$.header.sfid_destreg__conditionalmod = $7.bits2.send_gen5.sfid; + } else { + $$.header.sfid_destreg__conditionalmod = $4; /* msg reg index */ + $$.bits2.send_gen5.sfid = $7.bits2.send_gen5.sfid; + $$.bits2.send_gen5.end_of_thread = $12.bits3.generic_gen5.end_of_thread; + } + $$.bits3.generic_gen5 = $7.bits3.generic_gen5; $$.bits3.generic_gen5.msg_length = $9; $$.bits3.generic_gen5.response_length = $11; $$.bits3.generic_gen5.end_of_thread = $12.bits3.generic_gen5.end_of_thread
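As a rough illustration of the header change described in this patch: the 4-bit field at bits [27:24] of the first instruction dword holds the message register index for send before GEN6 and the SFID on GEN6 and later. The bit position follows the commit message and the order of the bitfields in the header struct (assuming the compiler allocates bitfields from the least significant bit). The helper names are hypothetical and the sketch ignores every other header field.

```c
#include <assert.h>
#include <stdint.h>

/* The shared 4-bit field occupies bits [27:24] of the header dword. */
#define SHARED_FIELD_SHIFT 24u
#define SHARED_FIELD_MASK (0xfu << SHARED_FIELD_SHIFT)

/* Replace the shared field, leaving all other header bits untouched. */
static uint32_t set_shared_field(uint32_t header, uint32_t value)
{
	assert(value <= 0xfu);
	return (header & ~SHARED_FIELD_MASK) | (value << SHARED_FIELD_SHIFT);
}

/*
 * For a send instruction, decide what the shared field means:
 * the SFID on GEN6+, the message register index (the destination of
 * the implied move) on earlier generations.
 */
static uint32_t encode_send_header(uint32_t header, int gen_level,
				   uint32_t msg_reg_index, uint32_t sfid)
{
	uint32_t value = (gen_level > 5) ? sfid : msg_reg_index;

	return set_shared_field(header, value);
}
```

Since the field no longer names a message register on GEN6+, the payload has to be moved into the message register by an explicit mov before the send, which is what the render program patches in this series do.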
[Intel-gfx] [PATCH 07/10] add support for data port read on Sandybridge
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/brw_structs.h |   12 ++++++++++++
 src/gram.y        |    9 ++++++++-
 2 files changed, 20 insertions(+), 1 deletions(-)

diff --git a/src/brw_structs.h b/src/brw_structs.h
index 6a29f37..9b1cd92 100644
--- a/src/brw_structs.h
+++ b/src/brw_structs.h
@@ -1400,6 +1400,18 @@ struct brw_instruction

       struct {
 	 GLuint binding_table_index:8;
+	 GLuint msg_control:5;
+	 GLuint msg_type:3;
+	 GLuint pad0:3;
+	 GLuint header_present:1;
+	 GLuint response_length:5;
+	 GLuint msg_length:4;
+	 GLuint pad1:2;
+	 GLuint end_of_thread:1;
+      } dp_read_gen6;
+
+      struct {
+	 GLuint binding_table_index:8;
 	 GLuint msg_control:3;
 	 GLuint pixel_scoreboard_clear:1;
 	 GLuint msg_type:3;
diff --git a/src/gram.y b/src/gram.y
index d536625..ffb0851 100644
--- a/src/gram.y
+++ b/src/gram.y
@@ -630,7 +630,14 @@ msgtarget:	NULL_TOKEN
 		| READ LPAREN INTEGER COMMA INTEGER COMMA INTEGER COMMA
 		  INTEGER RPAREN
 		{
-		  if (gen_level == 5) {
+		  if (gen_level == 6) {
+		    $$.bits2.send_gen5.sfid =
+		      BRW_MESSAGE_TARGET_DATAPORT_READ;
+		    $$.bits3.generic_gen5.header_present = 1;
+		    $$.bits3.dp_read_gen6.binding_table_index = $3;
+		    $$.bits3.dp_read_gen6.msg_control = $7;
+		    $$.bits3.dp_read_gen6.msg_type = $9;
+		  } else if (gen_level == 5) {
 		    $$.bits2.send_gen5.sfid =
 		      BRW_MESSAGE_TARGET_DATAPORT_READ;
 		    $$.bits3.generic_gen5.header_present = 1;
--
1.7.0.4
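The dp_read_gen6 bitfield added above describes a 32-bit message descriptor. A sketch with explicit shifts may make the layout easier to check: the field widths are taken from the struct in the patch, and the bit positions assume bitfields are allocated from the least significant bit upward. The struct and function names below are hypothetical and not part of the assembler.

```c
#include <assert.h>
#include <stdint.h>

/* Unpacked form of the descriptor; field names mirror dp_read_gen6. */
struct dp_read_desc {
	uint32_t binding_table_index;	/* bits  7:0  */
	uint32_t msg_control;		/* bits 12:8  */
	uint32_t msg_type;		/* bits 15:13 */
	uint32_t header_present;	/* bit  19    */
	uint32_t response_length;	/* bits 24:20 */
	uint32_t msg_length;		/* bits 28:25 */
	uint32_t end_of_thread;		/* bit  31    */
};

/* Shift a value into place after checking it fits in `width` bits. */
static uint32_t put_bits(uint32_t value, unsigned shift, unsigned width)
{
	uint32_t mask = (1u << width) - 1u;

	assert((value & ~mask) == 0);
	return value << shift;
}

/* Pack the descriptor; the pad fields (18:16 and 30:29) stay zero. */
static uint32_t pack_dp_read_gen6(const struct dp_read_desc *d)
{
	return put_bits(d->binding_table_index, 0, 8) |
	       put_bits(d->msg_control, 8, 5) |
	       put_bits(d->msg_type, 13, 3) |
	       put_bits(d->header_present, 19, 1) |
	       put_bits(d->response_length, 20, 5) |
	       put_bits(d->msg_length, 25, 4) |
	       put_bits(d->end_of_thread, 31, 1);
}
```

The widths add up to 32 (8 + 5 + 3 + 3 + 1 + 5 + 4 + 2 + 1), so the struct occupies exactly one dword. Compared with the older layout shown as context in the patch, msg_control grows from 3 to 5 bits.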
[Intel-gfx] [PATCH 08/10] sampler, urb write, null and gateway on Sandybridge are same as Ironlake.
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/gram.y |   10 +++++-----
 1 files changed, 5 insertions(+), 5 deletions(-)

diff --git a/src/gram.y b/src/gram.y
index ffb0851..9258ac7 100644
--- a/src/gram.y
+++ b/src/gram.y
@@ -557,7 +557,7 @@ post_dst:	dst

 msgtarget:	NULL_TOKEN
 		{
-		  if (gen_level == 5) {
+		  if (gen_level >= 5) {
 		    $$.bits2.send_gen5.sfid= BRW_MESSAGE_TARGET_NULL;
 		    $$.bits3.generic_gen5.header_present = 0;  /* ??? */
 		  } else {
@@ -567,7 +567,7 @@ msgtarget:	NULL_TOKEN
 		| SAMPLER LPAREN INTEGER COMMA INTEGER COMMA
 		  sampler_datatype RPAREN
 		{
-		  if (gen_level == 5) {
+		  if (gen_level >= 5) {
 		    $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_SAMPLER;
 		    $$.bits3.generic_gen5.header_present = 1;   /* ??? */
 		    $$.bits3.sampler_gen5.binding_table_index = $3;
@@ -620,7 +620,7 @@ msgtarget:	NULL_TOKEN
 		}
 		| GATEWAY
 		{
-		  if (gen_level == 5) {
+		  if (gen_level >= 5) {
 		    $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_GATEWAY;
 		    $$.bits3.generic_gen5.header_present = 0;  /* ??? */
 		  } else {
@@ -695,7 +695,7 @@ msgtarget:	NULL_TOKEN
 		| URB INTEGER urb_swizzle urb_allocate urb_used urb_complete
 		{
 		  $$.bits3.generic.msg_target = BRW_MESSAGE_TARGET_URB;
-		  if (gen_level == 5) {
+		  if (gen_level >= 5) {
 		    $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_URB;
 		    $$.bits3.generic_gen5.header_present = 1;
 		    $$.bits3.urb_gen5.opcode = BRW_URB_OPCODE_WRITE;
@@ -721,7 +721,7 @@ msgtarget:	NULL_TOKEN
 		{
 		  $$.bits3.generic.msg_target =
 		    BRW_MESSAGE_TARGET_THREAD_SPAWNER;
-		  if (gen_level == 5) {
+		  if (gen_level >= 5) {
 		    $$.bits2.send_gen5.sfid =
 		      BRW_MESSAGE_TARGET_THREAD_SPAWNER;
 		    $$.bits3.generic_gen5.header_present = 0;
--
1.7.0.4
[Intel-gfx] [PATCH 09/10] print error message when using math function on Sandybridge.
Sandybridge doesn't have the math function; instead it supports a set of math instructions. Support for the math instructions will be added later.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/gram.y |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/src/gram.y b/src/gram.y
index 9258ac7..e61e9db 100644
--- a/src/gram.y
+++ b/src/gram.y
@@ -595,7 +595,10 @@ msgtarget:	NULL_TOKEN
 		}
 		| MATH math_function saturate math_signed math_scalar
 		{
-		  if (gen_level == 5) {
+		  if (gen_level == 6) {
+		    fprintf (stderr, "Gen6+ donesn't have math function\n");
+		    YYERROR;
+		  } else if (gen_level == 5) {
 		    $$.bits2.send_gen5.sfid = BRW_MESSAGE_TARGET_MATH;
 		    $$.bits3.generic_gen5.header_present = 0;
 		    $$.bits3.math_gen5.function = $2;
--
1.7.0.4
[Intel-gfx] [PATCH 10/10] no compression flag on Sandybridge
Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 src/gram.y |    6 ++++--
 1 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/src/gram.y b/src/gram.y
index e61e9db..a57e4e9 100644
--- a/src/gram.y
+++ b/src/gram.y
@@ -1612,8 +1612,10 @@ instoption_list:
 		    $$.header.compression_control |= BRW_COMPRESSION_2NDHALF;
 		    break;
 		  case COMPR:
-		    $$.header.compression_control |=
-		      BRW_COMPRESSION_COMPRESSED;
+		    if (gen_level < 6) {
+		      $$.header.compression_control |=
+		        BRW_COMPRESSION_COMPRESSED;
+		    }
 		    break;
 		  case SWITCH:
 		    $$.header.thread_control |= BRW_THREAD_SWITCH;
--
1.7.0.4
Re: [Intel-gfx] [PATCH][v2 1/2] drm/i915: prepare for video codec ring buffer on Sandybridge
On Mon, 2010-09-13 at 17:52 +0800, Chris Wilson wrote:
> On Mon, 13 Sep 2010 15:17:05 +0800, Xiang, Haihao <haihao.xi...@intel.com> wrote:
> > Some little changes:
> >   Add set_tail hook to struct intel_ring_buffer
> >   fix HAS_BSD with a device info flag
> >   Don't export the initialiser of struct intel_ring_buffer
>
> A really nice set of cleanups, thanks! However, that changelog should have
> been an instant give away that something was wrong with the patch. ;-)
>
> Carl, would you care to remind us how to write a good commit? You do it so
> much better than I. Here is my lame version:
>
> - A patch should just do one thing and one thing only.

Thanks for your comments. I will separate it into three patches

Haihao

>   This is vital if we ever need to bisect or revert a patch. It also means
>   that we create smaller, more readable commits - which is a good thing!
>
> - Give an overview of what was done and more importantly *why*.
>   We can all read code and spend a long time pondering the complexities and
>   mysteries of a piece of code and eventually come to an understanding of
>   what that code does. We will never be able to work out what you were
>   thinking or intending to do as you wrote that piece of code though. You
>   may have to write several paragraphs explaining the background and your
>   analysis of a bug or design that you wish to implement. Obviously, for
>   these simple cleanups there is little to say other than it makes the code
>   easier to read and reduces the chance for subtle bugs to creep in. More
>   complex code requires deeper thought and understanding and the changelog
>   should reflect that.
>
> - The patch should record all those who contributed to the discovery of
>   the bug, if applicable, and to those who reviewed and tested the patches.
>   If the patch touches code outside of our sole purview, we must obtain at
>   least an ACK by the maintainer of that code.
> -Chris
[Intel-gfx] [Inter-gfx][PATCH][v3 2/4] drm/i915: do not export the instances of struct intel_ring_buffer
Introduce intel_init_render_ring_buffer(), intel_init_bsd_ring_buffer for ring initialization.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_gem.c         | 14 ++
 drivers/gpu/drm/i915/intel_ringbuffer.c | 29 +++--
 drivers/gpu/drm/i915/intel_ringbuffer.h |  4 ++--
 3 files changed, 31 insertions(+), 16 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index a83574d..2725012 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4443,28 +4443,18 @@ i915_gem_init_ringbuffer(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int ret;
 
-	dev_priv->render_ring = render_ring;
-
-	if (!I915_NEED_GFX_HWS(dev)) {
-		dev_priv->render_ring.status_page.page_addr
-			= dev_priv->status_page_dmah->vaddr;
-		memset(dev_priv->render_ring.status_page.page_addr,
-				0, PAGE_SIZE);
-	}
-
 	if (HAS_PIPE_CONTROL(dev)) {
 		ret = i915_gem_init_pipe_control(dev);
 		if (ret)
 			return ret;
 	}
 
-	ret = intel_init_ring_buffer(dev, &dev_priv->render_ring);
+	ret = intel_init_render_ring_buffer(dev);
 	if (ret)
 		goto cleanup_pipe_control;
 
 	if (HAS_BSD(dev)) {
-		dev_priv->bsd_ring = bsd_ring;
-		ret = intel_init_ring_buffer(dev, &dev_priv->bsd_ring);
+		ret = intel_init_bsd_ring_buffer(dev);
 		if (ret)
 			goto cleanup_render_ring;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 11bcfc8..a9d4f5b 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -797,7 +797,7 @@ void intel_fill_struct(struct drm_device *dev,
 	intel_ring_advance(dev, ring);
 }
 
-struct intel_ring_buffer render_ring = {
+static struct intel_ring_buffer render_ring = {
 	.name			= "render ring",
 	.regs			= {
 		.ctl = PRB0_CTL,
@@ -834,7 +834,7 @@ struct intel_ring_buffer render_ring = {
 
 /* ring buffer for bit-stream decoder */
 
-struct intel_ring_buffer bsd_ring = {
+static struct intel_ring_buffer bsd_ring = {
 	.name			= "bsd ring",
 	.regs			= {
 		.ctl = BSD_RING_CTL,
@@ -868,3 +868,28 @@ struct intel_ring_buffer bsd_ring = {
 	.status_page		= {NULL, 0, NULL},
 	.map			= {0,}
 };
+
+int intel_init_render_ring_buffer(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+
+	dev_priv->render_ring = render_ring;
+
+	if (!I915_NEED_GFX_HWS(dev)) {
+		dev_priv->render_ring.status_page.page_addr
+			= dev_priv->status_page_dmah->vaddr;
+		memset(dev_priv->render_ring.status_page.page_addr,
+				0, PAGE_SIZE);
+	}
+
+	return intel_init_ring_buffer(dev, &dev_priv->render_ring);
+}
+
+int intel_init_bsd_ring_buffer(struct drm_device *dev)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+
+	dev_priv->bsd_ring = bsd_ring;
+
+	return intel_init_ring_buffer(dev, &dev_priv->bsd_ring);
+}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index fa5d84f..df7acc5 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -129,7 +129,7 @@ void intel_ring_advance(struct drm_device *dev,
 u32 intel_ring_get_seqno(struct drm_device *dev,
 		struct intel_ring_buffer *ring);
 
-extern struct intel_ring_buffer render_ring;
-extern struct intel_ring_buffer bsd_ring;
+int intel_init_render_ring_buffer(struct drm_device *dev);
+int intel_init_bsd_ring_buffer(struct drm_device *dev);
 
 #endif /* _INTEL_RINGBUFFER_H_ */
-- 
1.7.0.4
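The pattern in the patch above (file-local template structs, copied into the per-device state by a dedicated init function) can be sketched outside the kernel as follows. This is a minimal illustration only: the type and function names here (`ring_buffer`, `device_private`, `init_ring_common`, and so on) are hypothetical stand-ins, not the i915 definitions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-ins for the driver types. */
struct ring_buffer {
	const char *name;
	unsigned int tail;
	int initialised;
};

struct device_private {
	struct ring_buffer render_ring;
	struct ring_buffer bsd_ring;
};

/* File-local templates: other translation units can no longer copy them. */
static const struct ring_buffer render_ring_template = { .name = "render ring" };
static const struct ring_buffer bsd_ring_template = { .name = "bsd ring" };

/* Shared setup, analogous in spirit to a common ring init routine. */
static int init_ring_common(struct ring_buffer *ring)
{
	assert(ring != NULL);
	ring->tail = 0;
	ring->initialised = 1;
	return 0;
}

/* The only exported entry points: callers never see the templates. */
int init_render_ring_buffer(struct device_private *priv)
{
	priv->render_ring = render_ring_template;
	return init_ring_common(&priv->render_ring);
}

int init_bsd_ring_buffer(struct device_private *priv)
{
	priv->bsd_ring = bsd_ring_template;
	return init_ring_common(&priv->bsd_ring);
}
```

With this shape, adding another ring variant only means adding another static template and choosing it inside the init function; the caller's code does not change.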
[Intel-gfx] [Inter-gfx][PATCH][v3 3/4] drm/i915: add set_tail hook in struct intel_ring_buffer
This is prepared for video codec ring buffer on Sandybridge. It is needed to read/write more than one register to move the tail pointer of the video codec ring on Sandybridge.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 22 +-
 drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
 2 files changed, 19 insertions(+), 5 deletions(-)

diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index a9d4f5b..0a65182 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -134,6 +134,12 @@ static unsigned int render_ring_get_tail(struct drm_device *dev,
 	return I915_READ(PRB0_TAIL) & TAIL_ADDR;
 }
 
+static inline void render_ring_set_tail(struct drm_device *dev, u32 value)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	I915_WRITE(PRB0_TAIL, value);
+}
+
 static unsigned int render_ring_get_active_head(struct drm_device *dev,
 		struct intel_ring_buffer *ring)
 {
@@ -146,8 +152,7 @@ static unsigned int render_ring_get_active_head(struct drm_device *dev,
 static void render_ring_advance_ring(struct drm_device *dev,
 		struct intel_ring_buffer *ring)
 {
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	I915_WRITE(PRB0_TAIL, ring->tail);
+	render_ring_set_tail(dev, ring->tail);
 }
 
 static int init_ring_common(struct drm_device *dev,
@@ -161,7 +166,7 @@ static int init_ring_common(struct drm_device *dev,
 	/* Stop the ring if it's running. */
 	I915_WRITE(ring->regs.ctl, 0);
 	I915_WRITE(ring->regs.head, 0);
-	I915_WRITE(ring->regs.tail, 0);
+	ring->set_tail(dev, 0);
 
 	/* Initialize the ring. */
 	I915_WRITE(ring->regs.start, obj_priv->gtt_offset);
@@ -404,6 +409,12 @@ static inline unsigned int bsd_ring_get_tail(struct drm_device *dev,
 	return I915_READ(BSD_RING_TAIL) & TAIL_ADDR;
 }
 
+static inline void bsd_ring_set_tail(struct drm_device *dev, u32 value)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	I915_WRITE(BSD_RING_TAIL, value);
+}
+
 static inline unsigned int bsd_ring_get_active_head(struct drm_device *dev,
 		struct intel_ring_buffer *ring)
 {
@@ -414,8 +425,7 @@ static inline unsigned int bsd_ring_get_active_head(struct drm_device *dev,
 static inline void bsd_ring_advance_ring(struct drm_device *dev,
 		struct intel_ring_buffer *ring)
 {
-	drm_i915_private_t *dev_priv = dev->dev_private;
-	I915_WRITE(BSD_RING_TAIL, ring->tail);
+	bsd_ring_set_tail(dev, ring->tail);
 }
 
 static int init_bsd_ring(struct drm_device *dev,
@@ -820,6 +830,7 @@ static struct intel_ring_buffer render_ring = {
 	.init			= init_render_ring,
 	.get_head		= render_ring_get_head,
 	.get_tail		= render_ring_get_tail,
+	.set_tail		= render_ring_set_tail,
 	.get_active_head	= render_ring_get_active_head,
 	.advance_ring		= render_ring_advance_ring,
 	.flush			= render_ring_flush,
@@ -857,6 +868,7 @@ static struct intel_ring_buffer bsd_ring = {
 	.init			= init_bsd_ring,
 	.get_head		= bsd_ring_get_head,
 	.get_tail		= bsd_ring_get_tail,
+	.set_tail		= bsd_ring_set_tail,
 	.get_active_head	= bsd_ring_get_active_head,
 	.advance_ring		= bsd_ring_advance_ring,
 	.flush			= bsd_ring_flush,
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
index df7acc5..f89e528 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.h
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
@@ -44,6 +44,8 @@ struct intel_ring_buffer {
 			struct intel_ring_buffer *ring);
 	unsigned int	(*get_tail)(struct drm_device *dev,
 			struct intel_ring_buffer *ring);
+	void		(*set_tail)(struct drm_device *dev,
+				    u32 value);
 	unsigned int	(*get_active_head)(struct drm_device *dev,
 			struct intel_ring_buffer *ring);
 	void		(*advance_ring)(struct drm_device *dev,
-- 
1.7.0.4
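To illustrate why a per-ring set_tail hook helps when one ring needs more than a single register write to move its tail, here is a small user-space sketch. Everything in it is an assumption made for illustration: the register file is a plain array, and the three-step sequence in `video_set_tail()` is a hypothetical placeholder, not the sequence the Sandybridge documentation actually prescribes.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated MMIO space; a real driver would use its register accessors. */
enum { REG_RENDER_TAIL, REG_VIDEO_TAIL, REG_VIDEO_CONTROL, REG_COUNT };

struct device {
	uint32_t regs[REG_COUNT];
	unsigned int writes;	/* counts register writes for inspection */
};

static void reg_write(struct device *dev, int reg, uint32_t value)
{
	assert(reg >= 0 && reg < REG_COUNT);
	dev->regs[reg] = value;
	dev->writes++;
}

struct ring {
	uint32_t tail;
	/* Per-ring hook: hides how many registers a tail move touches. */
	void (*set_tail)(struct device *dev, uint32_t value);
};

/* Simple ring: a tail move is one register write. */
static void render_set_tail(struct device *dev, uint32_t value)
{
	reg_write(dev, REG_RENDER_TAIL, value);
}

/* Hypothetical multi-register sequence around the tail write. */
static void video_set_tail(struct device *dev, uint32_t value)
{
	reg_write(dev, REG_VIDEO_CONTROL, 1);
	reg_write(dev, REG_VIDEO_TAIL, value);
	reg_write(dev, REG_VIDEO_CONTROL, 0);
}

/* Common code stays identical for both rings. */
static void ring_advance(struct device *dev, struct ring *ring)
{
	ring->set_tail(dev, ring->tail);
}
```

Common code such as `ring_advance()` only ever calls the hook, so the extra register traffic stays confined to the ring that needs it.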
[Intel-gfx] [Inter-gfx][PATCH][v3 4/4] drm/i915: add a new ring buffer on Sandybridge
This ring buffer is used for video decoding/encoding on Sandybridge.

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk>
---
 drivers/gpu/drm/i915/i915_drv.c         |   2 +
 drivers/gpu/drm/i915/i915_irq.c         |  15 +++-
 drivers/gpu/drm/i915/i915_reg.h         |  26 ++-
 drivers/gpu/drm/i915/intel_ringbuffer.c | 130 ++-
 4 files changed, 165 insertions(+), 8 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index 9d892fc..1bc1125 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -152,11 +152,13 @@ static const struct intel_device_info intel_ironlake_m_info = {
 static const struct intel_device_info intel_sandybridge_d_info = {
 	.gen = 6, .is_i965g = 1, .is_i9xx = 1, .need_gfx_hws = 1,
 	.has_hotplug = 1,
+	.has_bsd_ring = 1,
 };
 
 static const struct intel_device_info intel_sandybridge_m_info = {
 	.gen = 6, .is_i965g = 1, .is_mobile = 1, .is_i9xx = 1, .need_gfx_hws = 1,
 	.has_hotplug = 1,
+	.has_bsd_ring = 1,
 };
 
 static const struct pci_device_id pciidlist[] = {		/* aka */
diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
index e64b8ea..9351cb5 100644
--- a/drivers/gpu/drm/i915/i915_irq.c
+++ b/drivers/gpu/drm/i915/i915_irq.c
@@ -300,6 +300,10 @@ static irqreturn_t ironlake_irq_handler(struct drm_device *dev)
 	u32 de_iir, gt_iir, de_ier, pch_iir;
 	struct drm_i915_master_private *master_priv;
 	struct intel_ring_buffer *render_ring = &dev_priv->render_ring;
+	u32 bsd_usr_interrupt = GT_BSD_USER_INTERRUPT;
+
+	if (IS_GEN6(dev))
+		bsd_usr_interrupt = GT_GEN6_BSD_USER_INTERRUPT;
 
 	/* disable master interrupt before clearing iir */
 	de_ier = I915_READ(DEIER);
@@ -331,10 +335,9 @@ static irqreturn_t ironlake_irq_handler(struct drm_device *dev)
 		mod_timer(&dev_priv->hangcheck_timer,
 			  jiffies + msecs_to_jiffies(DRM_I915_HANGCHECK_PERIOD));
 	}
-	if (gt_iir & GT_BSD_USER_INTERRUPT)
+	if (gt_iir & bsd_usr_interrupt)
 		DRM_WAKEUP(&dev_priv->bsd_ring.irq_queue);
 
-
 	if (de_iir & DE_GSE)
 		intel_opregion_gse_intr(dev);
 
@@ -1444,17 +1447,19 @@ static int ironlake_irq_postinstall(struct drm_device *dev)
 	I915_WRITE(DEIER, dev_priv->de_irq_enable_reg);
 	(void) I915_READ(DEIER);
 
-	/* Gen6 only needs render pipe_control now */
 	if (IS_GEN6(dev))
-		render_mask = GT_PIPE_NOTIFY;
+		render_mask = GT_PIPE_NOTIFY | GT_GEN6_BSD_USER_INTERRUPT;
 
 	dev_priv->gt_irq_mask_reg = ~render_mask;
 	dev_priv->gt_irq_enable_reg = render_mask;
 
 	I915_WRITE(GTIIR, I915_READ(GTIIR));
 	I915_WRITE(GTIMR, dev_priv->gt_irq_mask_reg);
-	if (IS_GEN6(dev))
+	if (IS_GEN6(dev)) {
 		I915_WRITE(GEN6_RENDER_IMR, ~GEN6_RENDER_PIPE_CONTROL_NOTIFY_INTERRUPT);
+		I915_WRITE(GEN6_BSD_IMR, ~GEN6_BSD_IMR_USER_INTERRUPT);
+	}
+
 	I915_WRITE(GTIER, dev_priv->gt_irq_enable_reg);
 	(void) I915_READ(GTIER);
 
diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
index fd229ab..c7ef079 100644
--- a/drivers/gpu/drm/i915/i915_reg.h
+++ b/drivers/gpu/drm/i915/i915_reg.h
@@ -194,11 +194,11 @@
 #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
 #define   MI_STORE_DWORD_INDEX_SHIFT 2
 #define MI_LOAD_REGISTER_IMM	MI_INSTR(0x22, 1)
+#define MI_FLUSH_DW		MI_INSTR(0x26, 2) /* for GEN6 */
 #define MI_BATCH_BUFFER		MI_INSTR(0x30, 1)
 #define   MI_BATCH_NON_SECURE	(1)
 #define   MI_BATCH_NON_SECURE_I965 (1<<8)
 #define MI_BATCH_BUFFER_START	MI_INSTR(0x31, 0)
-
 /*
  * 3D instructions used by the kernel
  */
@@ -481,6 +481,28 @@
 #define BSD_HWS_PGA		0x04080
 
 /*
+ * video command stream instruction and interrupt control register defines
+ * for GEN6
+ */
+#define GEN6_BSD_RING_TAIL	0x12030
+#define GEN6_BSD_RING_HEAD	0x12034
+#define GEN6_BSD_RING_START	0x12038
+#define GEN6_BSD_RING_CTL	0x1203c
+#define GEN6_BSD_RING_ACTHD	0x12074
+#define GEN6_BSD_HWS_PGA	0x14080
+
+#define GEN6_BSD_SLEEP_PSMI_CONTROL	0x12050
+#define   GEN6_BSD_SLEEP_PSMI_CONTROL_RC_ILDL_MESSAGE_MODIFY_MASK	(1 << 16)
+#define   GEN6_BSD_SLEEP_PSMI_CONTROL_RC_ILDL_MESSAGE_DISABLE		(1 << 0)
+#define   GEN6_BSD_SLEEP_PSMI_CONTROL_RC_ILDL_MESSAGE_ENABLE		0
+#define   GEN6_BSD_SLEEP_PSMI_CONTROL_IDLE_INDICATOR			(1 << 3)
+
+#define GEN6_BSD_IMR			0x120a8
+#define   GEN6_BSD_IMR_USER_INTERRUPT	(1 << 12)
+
+#define GEN6_BSD_RNCID			0x12198
+
+/*
  * Framebuffer compression (915+ only)
  */
 
@@ -2556,7 +2578,7 @@
 #define GT_SYNC_STATUS		(1 << 2)
 #define GT_USER_INTERRUPT
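The interrupt handling in this patch boils down to two ideas: pick the BSD user-interrupt bit by hardware generation, and program the mask register with the complement of the enabled set. A minimal sketch of that logic follows. The bit positions chosen for the two GT_* constants are placeholders assumed for this example, since their definitions are not visible in the excerpt above; the helper names are hypothetical as well.

```c
#include <assert.h>
#include <stdint.h>

/* Placeholder bit positions, assumed only for this sketch. */
#define GT_BSD_USER_INTERRUPT		(1u << 5)
#define GT_GEN6_BSD_USER_INTERRUPT	(1u << 12)
#define GT_PIPE_NOTIFY			(1u << 4)

/* Select the BSD user-interrupt bit for a given generation. */
static uint32_t bsd_user_interrupt_bit(int gen)
{
	assert(gen > 0);
	return gen == 6 ? GT_GEN6_BSD_USER_INTERRUPT : GT_BSD_USER_INTERRUPT;
}

/* True if the identity register value reports a BSD user interrupt. */
static int bsd_interrupt_pending(int gen, uint32_t gt_iir)
{
	return (gt_iir & bsd_user_interrupt_bit(gen)) != 0;
}

/* A mask register holds the complement of the interrupts we want. */
static uint32_t irq_mask_for(uint32_t enabled)
{
	return ~enabled;
}
```

The point of the sketch is only that a set bit in the mask value means "masked", which is why the patch writes `~render_mask` and `~GEN6_BSD_IMR_USER_INTERRUPT` rather than the bits themselves.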
Re: [Intel-gfx] [Inter-gfx][PATCH][v3 3/4] drm/i915: add set_tail hook in struct intel_ring_buffer
On Thu, 2010-09-16 at 12:21 +0800, Zhenyu Wang wrote:
> On 2010.09.16 10:43:12 +0800, Xiang, Haihao wrote:
> > This is prepared for video codec ring buffer on Sandybridge. It is
> > needed to read/write more than one register to move the tail pointer
> > of the video codec ring on Sandybridge.
>
> Do we really need new 'set_tail'? Isn't advance_ring used for set ring
> tail? Sandybridge workaround can be put into advance_ring, so your
> 'set_tail''s only left usage is for initialization. Is that workaround
> even needed for initial 0 setting?

The document says "Every tail move must follow this sequence". I think it should also include '0'.

> Or can't that be done in init function by using advance_ring?

advance_ring uses ring->tail to set TAIL register, i965_reset() also invokes ring->init() to re-init ring buffer, how to guarantee ring->tail is 0?

BTW advance_ring can be implemented as set_tail(dev, ring->tail).

> > Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
> > Reviewed-by: Chris Wilson <ch...@chris-wilson.co.uk>
> > ---
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 22 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |  2 ++
> >  2 files changed, 19 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > index a9d4f5b..0a65182 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.c
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
> > @@ -134,6 +134,12 @@ static unsigned int render_ring_get_tail(struct drm_device *dev,
> >  	return I915_READ(PRB0_TAIL) & TAIL_ADDR;
> >  }
> >
> > +static inline void render_ring_set_tail(struct drm_device *dev, u32 value)
> > +{
> > +	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	I915_WRITE(PRB0_TAIL, value);
> > +}
> > +
> >  static unsigned int render_ring_get_active_head(struct drm_device *dev,
> >  		struct intel_ring_buffer *ring)
> >  {
> > @@ -146,8 +152,7 @@ static unsigned int render_ring_get_active_head(struct drm_device *dev,
> >  static void render_ring_advance_ring(struct drm_device *dev,
> >  		struct intel_ring_buffer *ring)
> >  {
> > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	I915_WRITE(PRB0_TAIL, ring->tail);
> > +	render_ring_set_tail(dev, ring->tail);
> >  }
> >
> >  static int init_ring_common(struct drm_device *dev,
> > @@ -161,7 +166,7 @@ static int init_ring_common(struct drm_device *dev,
> >  	/* Stop the ring if it's running. */
> >  	I915_WRITE(ring->regs.ctl, 0);
> >  	I915_WRITE(ring->regs.head, 0);
> > -	I915_WRITE(ring->regs.tail, 0);
> > +	ring->set_tail(dev, 0);
> >
> >  	/* Initialize the ring. */
> >  	I915_WRITE(ring->regs.start, obj_priv->gtt_offset);
> > @@ -404,6 +409,12 @@ static inline unsigned int bsd_ring_get_tail(struct drm_device *dev,
> >  	return I915_READ(BSD_RING_TAIL) & TAIL_ADDR;
> >  }
> >
> > +static inline void bsd_ring_set_tail(struct drm_device *dev, u32 value)
> > +{
> > +	drm_i915_private_t *dev_priv = dev->dev_private;
> > +	I915_WRITE(BSD_RING_TAIL, value);
> > +}
> > +
> >  static inline unsigned int bsd_ring_get_active_head(struct drm_device *dev,
> >  		struct intel_ring_buffer *ring)
> >  {
> > @@ -414,8 +425,7 @@ static inline unsigned int bsd_ring_get_active_head(struct drm_device *dev,
> >  static inline void bsd_ring_advance_ring(struct drm_device *dev,
> >  		struct intel_ring_buffer *ring)
> >  {
> > -	drm_i915_private_t *dev_priv = dev->dev_private;
> > -	I915_WRITE(BSD_RING_TAIL, ring->tail);
> > +	bsd_ring_set_tail(dev, ring->tail);
> >  }
> >
> >  static int init_bsd_ring(struct drm_device *dev,
> > @@ -820,6 +830,7 @@ static struct intel_ring_buffer render_ring = {
> >  	.init			= init_render_ring,
> >  	.get_head		= render_ring_get_head,
> >  	.get_tail		= render_ring_get_tail,
> > +	.set_tail		= render_ring_set_tail,
> >  	.get_active_head	= render_ring_get_active_head,
> >  	.advance_ring		= render_ring_advance_ring,
> >  	.flush			= render_ring_flush,
> > @@ -857,6 +868,7 @@ static struct intel_ring_buffer bsd_ring = {
> >  	.init			= init_bsd_ring,
> >  	.get_head		= bsd_ring_get_head,
> >  	.get_tail		= bsd_ring_get_tail,
> > +	.set_tail		= bsd_ring_set_tail,
> >  	.get_active_head	= bsd_ring_get_active_head,
> >  	.advance_ring		= bsd_ring_advance_ring,
> >  	.flush			= bsd_ring_flush,
> > diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.h b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > index df7acc5..f89e528 100644
> > --- a/drivers/gpu/drm/i915/intel_ringbuffer.h
> > +++ b/drivers/gpu/drm/i915/intel_ringbuffer.h
> > @@ -44,6 +44,8 @@ struct intel_ring_buffer {
> >  			struct intel_ring_buffer *ring);
> >  	unsigned int	(*get_tail)(struct drm_device *dev,
> >  			struct intel_ring_buffer *ring);
> > +	void		(*set_tail)(struct drm_device *dev,
> > +				    u32 value);
> >  	unsigned int
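The concern raised in the reply above (re-initialising through advance_ring would write whatever ring->tail currently holds, which is not guaranteed to be 0 after a reset) can be shown with a tiny simulation. This is a sketch under stated assumptions: the single `tail_reg` field stands in for the hardware TAIL register, and both init variants are hypothetical functions written only to contrast the two approaches.

```c
#include <assert.h>
#include <stdint.h>

struct device {
	uint32_t tail_reg;	/* stands in for the hardware TAIL register */
};

struct ring {
	uint32_t tail;		/* software copy of the tail offset */
	void (*set_tail)(struct device *dev, uint32_t value);
};

static void simple_set_tail(struct device *dev, uint32_t value)
{
	assert(dev != 0);
	dev->tail_reg = value;
}

/* advance_ring expressed through the hook, as suggested in the thread. */
static void advance_ring(struct device *dev, struct ring *ring)
{
	ring->set_tail(dev, ring->tail);
}

/* Variant A: re-init by advancing; writes the possibly stale ring->tail. */
static void init_via_advance(struct device *dev, struct ring *ring)
{
	advance_ring(dev, ring);
}

/* Variant B: re-init with an explicit value, independent of ring->tail. */
static void init_via_set_tail(struct device *dev, struct ring *ring)
{
	ring->set_tail(dev, 0);
}
```

If the software tail was left at, say, 128 before a reset, variant A programs 128 into the register while variant B programs 0, which is the behaviour the explicit hook is meant to guarantee.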
[Intel-gfx] [PATCH][v2 1/2] drm/i915: prepare for video codec ring buffer on Sandybridge
Some little changes:
  Add set_tail hook to struct intel_ring_buffer
  fix HAS_BSD with a device info flag
  Don't export the initialiser of struct intel_ring_buffer

Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c         |  4 ++
 drivers/gpu/drm/i915/i915_drv.h         |  3 +-
 drivers/gpu/drm/i915/i915_gem.c         | 14 +---
 drivers/gpu/drm/i915/intel_ringbuffer.c | 51 ++
 drivers/gpu/drm/i915/intel_ringbuffer.h |  6 ++-
 5 files changed, 56 insertions(+), 22 deletions(-)

diff --git a/drivers/gpu/drm/i915/i915_drv.c b/drivers/gpu/drm/i915/i915_drv.c
index dffc1bc..9d892fc 100644
--- a/drivers/gpu/drm/i915/i915_drv.c
+++ b/drivers/gpu/drm/i915/i915_drv.c
@@ -121,12 +121,14 @@ static const struct intel_device_info intel_g33_info = {
 static const struct intel_device_info intel_g45_info = {
 	.gen = 4, .is_i965g = 1, .is_g4x = 1, .is_i9xx = 1, .need_gfx_hws = 1,
 	.has_pipe_cxsr = 1, .has_hotplug = 1,
+	.has_bsd_ring = 1,
 };
 
 static const struct intel_device_info intel_gm45_info = {
 	.gen = 4, .is_i965g = 1, .is_g4x = 1, .is_i9xx = 1,
 	.is_mobile = 1, .need_gfx_hws = 1, .has_fbc = 1, .has_rc6 = 1,
 	.has_pipe_cxsr = 1, .has_hotplug = 1,
+	.has_bsd_ring = 1,
 };
 
 static const struct intel_device_info intel_pineview_info = {
@@ -138,11 +140,13 @@ static const struct intel_device_info intel_pineview_info = {
 static const struct intel_device_info intel_ironlake_d_info = {
 	.gen = 5, .is_ironlake = 1, .is_i965g = 1, .is_i9xx = 1, .need_gfx_hws = 1,
 	.has_pipe_cxsr = 1, .has_hotplug = 1,
+	.has_bsd_ring = 1,
 };
 
 static const struct intel_device_info intel_ironlake_m_info = {
 	.gen = 5, .is_ironlake = 1, .is_mobile = 1, .is_i965g = 1, .is_i9xx = 1,
 	.need_gfx_hws = 1, .has_fbc = 1, .has_rc6 = 1, .has_hotplug = 1,
+	.has_bsd_ring = 1,
 };
 
 static const struct intel_device_info intel_sandybridge_d_info = {
diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
index b3efb30..863130f 100644
--- a/drivers/gpu/drm/i915/i915_drv.h
+++ b/drivers/gpu/drm/i915/i915_drv.h
@@ -216,6 +216,7 @@ struct intel_device_info {
 	u8 cursor_needs_physical : 1;
 	u8 has_overlay : 1;
 	u8 overlay_needs_physical : 1;
+	u8 has_bsd_ring : 1;
 };
 
 enum no_fbc_reason {
@@ -1230,7 +1231,7 @@ static inline void i915_write(struct drm_i915_private *dev_priv, u32 reg,
 #define IS_GEN5(dev)	(INTEL_INFO(dev)->gen == 5)
 #define IS_GEN6(dev)	(INTEL_INFO(dev)->gen == 6)
 
-#define HAS_BSD(dev)            (IS_IRONLAKE(dev) || IS_G4X(dev))
+#define HAS_BSD(dev)            (INTEL_INFO(dev)->has_bsd_ring)
 #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
 #define HAS_OVERLAY(dev)	(INTEL_INFO(dev)->has_overlay)
diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
index e0b7ddc..dc2826d 100644
--- a/drivers/gpu/drm/i915/i915_gem.c
+++ b/drivers/gpu/drm/i915/i915_gem.c
@@ -4451,28 +4451,18 @@ i915_gem_init_ringbuffer(struct drm_device *dev)
 	drm_i915_private_t *dev_priv = dev->dev_private;
 	int ret;
 
-	dev_priv->render_ring = render_ring;
-
-	if (!I915_NEED_GFX_HWS(dev)) {
-		dev_priv->render_ring.status_page.page_addr
-			= dev_priv->status_page_dmah->vaddr;
-		memset(dev_priv->render_ring.status_page.page_addr,
-				0, PAGE_SIZE);
-	}
-
 	if (HAS_PIPE_CONTROL(dev)) {
 		ret = i915_gem_init_pipe_control(dev);
 		if (ret)
 			return ret;
 	}
 
-	ret = intel_init_ring_buffer(dev, &dev_priv->render_ring);
+	ret = intel_init_render_ring_buffer(dev);
 	if (ret)
 		goto cleanup_pipe_control;
 
 	if (HAS_BSD(dev)) {
-		dev_priv->bsd_ring = bsd_ring;
-		ret = intel_init_ring_buffer(dev, &dev_priv->bsd_ring);
+		ret = intel_init_bsd_ring_buffer(dev);
 		if (ret)
 			goto cleanup_render_ring;
 	}
diff --git a/drivers/gpu/drm/i915/intel_ringbuffer.c b/drivers/gpu/drm/i915/intel_ringbuffer.c
index 1ae2b25..8560dee 100644
--- a/drivers/gpu/drm/i915/intel_ringbuffer.c
+++ b/drivers/gpu/drm/i915/intel_ringbuffer.c
@@ -134,6 +134,12 @@ static unsigned int render_ring_get_tail(struct drm_device *dev,
 	return I915_READ(PRB0_TAIL) & TAIL_ADDR;
 }
 
+static inline void render_ring_set_tail(struct drm_device *dev, u32 value)
+{
+	drm_i915_private_t *dev_priv = dev->dev_private;
+	I915_WRITE(PRB0_TAIL, value);
+}
+
 static unsigned int render_ring_get_active_head(struct drm_device *dev,
 		struct intel_ring_buffer *ring)
 {
@@ -146,8 +152,7 @@ static unsigned int render_ring_get_active_head(struct drm_device *dev,
 static void render_ring_advance_ring(struct drm_device *dev
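The HAS_BSD change in this patch replaces a chain of platform checks with a capability bit stored in the per-device info table. A stand-alone sketch of the idea is below. The struct, the three table entries and the `has_bsd()` helper are simplified, hypothetical versions written for this example; only the general shape (a one-bit flag tested through an accessor) mirrors the patch.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified device description with a one-bit capability flag. */
struct device_info {
	unsigned int gen;
	unsigned int is_mobile : 1;
	unsigned int has_bsd_ring : 1;
};

/* Example table entries; values are illustrative only. */
static const struct device_info example_g45_info = {
	.gen = 4, .has_bsd_ring = 1,
};

static const struct device_info example_pineview_info = {
	.gen = 3, .is_mobile = 1,	/* no BSD ring: flag left at 0 */
};

static const struct device_info example_ironlake_m_info = {
	.gen = 5, .is_mobile = 1, .has_bsd_ring = 1,
};

/* Capability test: no knowledge of platform names is needed here. */
static int has_bsd(const struct device_info *info)
{
	assert(info != NULL);
	return info->has_bsd_ring;
}
```

Supporting a new platform then means setting one flag in its table entry, rather than extending a macro such as the old `IS_IRONLAKE(dev) || IS_G4X(dev)` expression at every use.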
Re: [Intel-gfx] [intel-gfx][PATCH 2/2] drm/i915: Add a new ring buffer on Sandybridge
On Thu, 2010-09-02 at 15:16 +0800, Chris Wilson wrote:
> Comments inline.

Thanks for your comments.

> On Thu, 2 Sep 2010 21:46:54 +0800, "Xiang, Haihao" <haihao.xi...@intel.com> wrote:
> > This ring buffer is used for video decoding/encoding on Sandybridge.
> >
> > Signed-off-by: Xiang, Haihao <haihao.xi...@intel.com>
> > ---
> >  drivers/gpu/drm/i915/i915_drv.h         |   2 +-
> >  drivers/gpu/drm/i915/i915_gem.c         |   6 ++-
> >  drivers/gpu/drm/i915/i915_irq.c         |  15 +++--
> >  drivers/gpu/drm/i915/i915_reg.h         |  22 +-
> >  drivers/gpu/drm/i915/intel_ringbuffer.c | 121 +++
> >  drivers/gpu/drm/i915/intel_ringbuffer.h |   1 +
> >  6 files changed, 158 insertions(+), 9 deletions(-)
> >
> > diff --git a/drivers/gpu/drm/i915/i915_drv.h b/drivers/gpu/drm/i915/i915_drv.h
> > index 047cd7c..22d098b 100644
> > --- a/drivers/gpu/drm/i915/i915_drv.h
> > +++ b/drivers/gpu/drm/i915/i915_drv.h
> > @@ -1203,7 +1203,7 @@ extern void intel_overlay_print_error_state(struct seq_file *m, struct intel_ove
> >  		      (dev)->pci_device == 0x2A42 ||		\
> >  		      (dev)->pci_device == 0x2E42)
> >
> > -#define HAS_BSD(dev)            (IS_IRONLAKE(dev) || IS_G4X(dev))
> > +#define HAS_BSD(dev)            (IS_IRONLAKE(dev) || IS_G4X(dev) || IS_GEN6(dev))
>
> Convert HAS_BSD(dev) to INTEL_INFO(dev)->has_bsd_ring and update
> capabilities accordingly.

I will fix it.

> >  #define I915_NEED_GFX_HWS(dev)	(INTEL_INFO(dev)->need_gfx_hws)
> >
> >  /* With the 945 and later, Y tiling got adjusted so that it was 32 128-byte
> > diff --git a/drivers/gpu/drm/i915/i915_gem.c b/drivers/gpu/drm/i915/i915_gem.c
> > index 8ccb55a..d2c825a 100644
> > --- a/drivers/gpu/drm/i915/i915_gem.c
> > +++ b/drivers/gpu/drm/i915/i915_gem.c
> > @@ -4504,7 +4504,11 @@ i915_gem_init_ringbuffer(struct drm_device *dev)
> >  		goto cleanup_pipe_control;
> >
> >  	if (HAS_BSD(dev)) {
> > -		dev_priv->bsd_ring = bsd_ring;
> > +		if (IS_GEN6(dev))
> > +			dev_priv->bsd_ring = gen6_bsd_ring;
> > +		else
> > +			dev_priv->bsd_ring = bsd_ring;
> > +
>
> Let's stop exporting these struct initialisers before we end up with one
> per generation per ring. Introduce intel_init_render_ring_buffer() and
> intel_init_bsd_ring_buffer() and similarly tidy up the HWS page
> initialisation.

I will fix it.

> >  		ret = intel_init_ring_buffer(dev, &dev_priv->bsd_ring);
> >  		if (ret)
> >  			goto cleanup_render_ring;
> > diff --git a/drivers/gpu/drm/i915/i915_irq.c b/drivers/gpu/drm/i915/i915_irq.c
> > index 16861b8..82e708c 100644
> > --- a/drivers/gpu/drm/i915/i915_irq.c
> > +++ b/drivers/gpu/drm/i915/i915_irq.c
> > @@ -312,6 +312,10 @@ irqreturn_t ironlake_irq_handler(struct drm_device *dev)
> >  	u32 de_iir, gt_iir, de_ier, pch_iir;
> >  	struct drm_i915_master_private *master_priv;
> >  	struct intel_ring_buffer *render_ring = &dev_priv->render_ring;
> > +	u32 bsd_usr_interrupt = GT_BSD_USER_INTERRUPT;
> > +
> > +	if (IS_GEN6(dev))
> > +		bsd_usr_interrupt = GT_GEN6_BSD_USER_INTERRUPT;
> >
> >  	/* disable master interrupt before clearing iir */
> >  	de_ier = I915_READ(DEIER);
> > @@ -342,10 +346,9 @@ irqreturn_t ironlake_irq_handler(struct drm_device *dev)
> >  		dev_priv->hangcheck_count = 0;
> >  		mod_timer(&dev_priv->hangcheck_timer, jiffies + DRM_I915_HANGCHECK_PERIOD);
> >  	}
> > -	if (gt_iir & GT_BSD_USER_INTERRUPT)
> > +	if (gt_iir & bsd_usr_interrupt)
> >  		DRM_WAKEUP(&dev_priv->bsd_ring.irq_queue);
> >
> > -
> >  	if (de_iir & DE_GSE)
> >  		ironlake_opregion_gse_intr(dev);
> >
> > @@ -1381,17 +1384,19 @@ static int ironlake_irq_postinstall(struct drm_device *dev)
> >  	I915_WRITE(DEIER, dev_priv->de_irq_enable_reg);
> >  	(void) I915_READ(DEIER);
> >
> > -	/* Gen6 only needs render pipe_control now */
> >  	if (IS_GEN6(dev))
> > -		render_mask = GT_PIPE_NOTIFY;
> > +		render_mask = GT_PIPE_NOTIFY | GT_GEN6_BSD_USER_INTERRUPT;
> >
> >  	dev_priv->gt_irq_mask_reg = ~render_mask;
> >  	dev_priv->gt_irq_enable_reg = render_mask;
> >
> >  	I915_WRITE(GTIIR, I915_READ(GTIIR));
> >  	I915_WRITE(GTIMR, dev_priv->gt_irq_mask_reg);
> > -	if (IS_GEN6(dev))
> > +	if (IS_GEN6(dev)) {
> >  		I915_WRITE(GEN6_RENDER_IMR, ~GEN6_RENDER_PIPE_CONTROL_NOTIFY_INTERRUPT);
> > +		I915_WRITE(GEN6_BSD_IMR, ~GEN6_BSD_IMR_USER_INTERRUPT);
> > +	}
> > +
> >  	I915_WRITE(GTIER, dev_priv->gt_irq_enable_reg);
> >  	(void) I915_READ(GTIER);
> >
> > diff --git a/drivers/gpu/drm/i915/i915_reg.h b/drivers/gpu/drm/i915/i915_reg.h
> > index 67e3ec1..9d57ecc 100644
> > --- a/drivers/gpu/drm/i915/i915_reg.h
> > +++ b/drivers/gpu/drm/i915/i915_reg.h
> > @@ -192,11 +192,11 @@
> >  #define MI_STORE_DWORD_INDEX	MI_INSTR(0x21, 1)
> >  #define   MI_STORE_DWORD_INDEX_SHIFT 2
> >  #define MI_LOAD_REGISTER_IMM	MI_INSTR(0x22, 1)
> > +#define MI_FLUSH_DW		MI_INSTR(0x26, 2) /* for GEN6 */
>
> Random new abbreviation. Use