Hi, Thanks for reviewing the patch.
To clarify, This patch is used for H.264/VC1 decoding. Abstract ring buffer is also needed for our later generation GPUs, Because there are more types of ring buffer to come. H.264/VC1 HW decoding is a bigger requirement than mepg2. XvMC VLD can only support mpeg2. It is rendering in server context and lack of post processing features. We are not going to support it later, our later media work will be based on vaapi. Synchronize between BSD and render ring will be done by VAAPI client. We have run Tons of video validation upon this patch. Our QA will do 1-2 days of regression test on different chips each time I rebase and sent out the patch Let's get simple code works first then we can optimize it. Thanks Zou Nan hai -----Original Message----- From: Daniel Vetter [mailto:[email protected]] Sent: 2010年5月12日 6:24 To: Zou, Nanhai Cc: Anholt, Eric; Intel GFX Subject: Re: [Intel-gfx] [PATCH 1/4] introduce intel_ring_buffer structure Hi all, Ok, here's my try at a review. Sorry for the rather long delay. First things first, please put on your asbestos suits, this one's gonna be rough;) Second: Your patch is still a pain to review for the following reasons: - IMHO you split-up made things worse: The completely rewritten render ring implementation is introduced in patch 1, but only really fully used when all 4 patches are applied. Now instead of hunting around in one monster patch (which was hard) I have to hunt around in four different patches to make sense of changes (rather impossible). Similarly other stuff is splattered all over the series, e.g. I've found definitions for the bsd ring in patch 1 instead of patch 3. - Your doing tons of (at least at first glance) superfluously search&replaces. Yep, inconsistent naming is ugly, but patches bloated with tons of such changes are even uglier. Some examples: s/ring/render_ring/ s/drm_i915_ring_buffer_t/intel_ring_buffer/ There are more. If you think these changes really are beneficial, please do one patch per s&r with nothing else in it. Now for the more specific issues: - In my previous review I've proposed to replace the gpu execution breadcrumb with something like this struct intel_gpu_breadcrumb { uint32_t seqno : 30; uint32_t domain : 2; }; Compared to your approach this doesn't waste a pointer per gem object. Further it can be used for execution domains like keeping track of pageflips (tossed around some ideas with Kristian on irc) or to batch up gtt chipset flushes (probably needed to make my i855 cache coherency stuff fast again). So yes, I'd like to see this. - struct intel_ring_buffer looks massively overdesigned in your patch. I think a more lightweight approach that leaves as much as possible of the render ring implementation untouched (and doesn't move it around) is better for at least two reasons: - Your abstraction requires quite a few automatic changes all over. This needlessly bloats the patch and makes finding the interesting stuff harder. - Greatly reduced risk of introducing regressions. There are also a few places that look rather fishy. Mostly I simply couldn't follow anymore what's going on due to the reasons laid out above. Anyway, here they are: - You (seem to) move around kernel_lost_context. For a feature that's currently only supported for hw that was never supported with ums, this is either fishy or needs a big comment. - In i915_gem_flush you simply flush both ringbuffers. Now either these two have independent caches (and only one needs to be flushed, gem is not supposed to do unnecessary flushes) or they are coherent and only one flush is required, period. Current code looks like it tries to paper over some coherency issues (and I've learned to loathe these buggers). Related to this: The active list gets split up between the to ringbuffers. The flushing list is still global. This looks inconsistent. - I haven't found out how synchronization between the bsd and the render ringbuffer is handled. And if the answer is "userspace takes care" I'm not gonna like it. With this off the table, I still have to do some venting ;) The idea behind splitting patches up is: - to make life easier for reviewers. IHMO you've failed in that regard. - to make bisecting regressions easy. Given that I've found gimmicks like tatus_page_dmah->vaddr; - memset(dev_priv->render_ring.status_page.page_addr + memset(dev_priv->render_ring.status_page.page_addr, 0, PAGE_SIZE); } in your patches (note the re-added semicolon at the end) your patches definitely fail at that - they won't even compile in between. And it looks like other people have problems simply applying the patches, too. So please take a tad bit more care when prepping patch series. Otherwise the hours (including thinking time) I've just put into this review feel kinda wasted. For a _really_ nice patch series that was a joy to read, look no further than at Zhenyu Wang's recent connector rework (the patch series that got merged, _not_ the first submission - but that can serve as comparison). Oh, one last thing: Given that libIntelXvMC seems to support vld (for mpeg2), why exactly do we need this? Yours, Daniel -- Daniel Vetter Mail: [email protected] Mobile: +41 (0)79 365 57 48 _______________________________________________ Intel-gfx mailing list [email protected] http://lists.freedesktop.org/mailman/listinfo/intel-gfx
