Marek, do you have an idea where the current bottleneck is?
I just did a profiling run with sysprof, zooming in on the desktop in Weston
and moving the mouse wildly around, so that the buffer is completely
changed for every frame. I got around 5 fps, which isn't *that* much, but
still an order of magnitude better than without your patches.
sysprof says there is 100% CPU usage, but unlike the previous 0.5-FPS
recording, it's not in a single function, but spread out over several
functions:
35% weston_recorder_frame_notify
11% __memcpy_ssse3
4.5% clear_page_c
4.3% output_run
Although I'm not completely sure I'm reading the sysprof output right.
weston_recorder_frame_notify, for example, has 35% CPU usage, but none of
its child functions has any significant CPU usage. I presume the CPU usage
in that function is from calling glReadPixels, although that's not apparent
from sysprof:
weston_recorder_frame_notify    39.15%   39.15%
- - kernel - -                   0.00%    0.01%
ret_from_intr                    0.00%    0.01%
__irqentry_text_start            0.00%    0.01%
irq_exit                         0.00%    0.01%
do_softirq                       0.00%    0.01%
call_softirq                     0.00%    0.01%
__do_softirq                     0.00%    0.01%
blk_done_softirq                 0.00%    0.01%
scsi_softirq_done                0.00%    0.01%
scsi_finish_command              0.00%    0.01%
scsi_io_completion               0.00%    0.01%
blk_end_request                  0.00%    0.01%
blk_end_bidi_request             0.00%    0.01%
blk_update_bidi_request          0.00%    0.01%
blk_update_request               0.00%    0.01%
req_bio_endio.isra.46            0.00%    0.01%
bio_endio                        0.00%    0.01%
end_swap_bio_write               0.00%    0.01%
end_page_writeback               0.00%    0.01%
rotate_reclaimable_page          0.01%    0.01%
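One way to check that presumption without relying on sysprof's attribution would be to time the suspected call directly. Below is a minimal sketch of the idea in Python; `fake_readback` is a hypothetical stand-in for the glReadPixels call (it only allocates and copies one XRGB frame), and the 1920x1080 size is an assumption, not my actual mode. In the recorder itself the same measurement would be taken around the real call in C.

```python
import time


def time_call(fn, *args, repeats=5):
    """Return the best wall-clock time in seconds of fn(*args) over `repeats` runs."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args)
        best = min(best, time.perf_counter() - start)
    return best


def fake_readback(width, height):
    # Hypothetical stand-in for glReadPixels: allocate one frame at
    # 4 bytes per pixel (XRGB) and copy it once.
    frame = bytearray(width * height * 4)
    return bytes(frame)


# Assumed frame size; substitute the real output resolution.
seconds = time_call(fake_readback, 1920, 1080)
print(f"readback stand-in: {seconds * 1e3:.2f} ms per frame")
```

If the time measured around the real call came out close to the whole frame time (about 200 ms at 5 fps), that would support the reading that the 35% is spent inside glReadPixels.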
Another possible bottleneck is simply disk access, although it doesn't seem
to be relevant on my system (since I have 100% CPU usage). The 36-second
recording I made was 1.3 GB in size, so that's around 36 MB/s.
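For reference, the arithmetic behind that figure, as a small sketch. The function name is just for illustration, and I'm taking 1 GB = 1000 MB.

```python
def throughput_mb_per_s(size_gb: float, duration_s: float) -> float:
    """Average write rate in MB/s, taking 1 GB = 1000 MB."""
    return size_gb * 1000.0 / duration_s


# 1.3 GB written over a 36-second recording.
rate = throughput_mb_per_s(1.3, 36.0)
print(f"{rate:.1f} MB/s")  # prints "36.1 MB/s"
```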
Kind regards,
Rune Kjær Svendsen
Østerbrogade 111, 3. - 302
2100 København Ø
Tel.: 2835 0726
On Mon, Mar 18, 2013 at 1:20 AM, Marek Olšák mar...@gmail.com wrote:
Slowness is not usually a bug.
I guess it can be optimized even more. It depends on where the
bottleneck is now.
Marek
On Sun, Mar 17, 2013 at 10:14 PM, Rune Kjær Svendsen runesv...@gmail.com wrote:
Thank you very much! This is much better. It's gone from 0.5-ish FPS when
zooming in to around 10 FPS, depending on screen content.
So I figure this isn't a bug? I assumed it was a bug, but is the case simply
that an efficient glReadPixels path for radeon/gallium doesn't exist?
The patch set sure helps in that regard, although it'd be really nice to get
30 FPS consistently, if at all possible.
Thanks again.
/Rune
On Sun, Mar 17, 2013 at 6:46 PM, Andreas Boll andreas.boll@gmail.com wrote:
2013/3/17 Rune Kjær Svendsen runesv...@gmail.com:
Hello list
I'm having problems recording the desktop content using the Weston
compositor's built-in recording function. When I start a recording and do
something that changes a lot of screen content (like zooming in on the
desktop, for example), I get around 0.5 FPS. Using sysprof, I can see that
~98% of CPU is used in the function unpack_XRGB(). krh has told me this
is caused by glReadPixels going through a slowpath. I have a Radeon HD 5770
GPU and I'm using mesa git (I've tried the mesa version in the Ubuntu 12.10
repos, and the xorg-edgers PPA, same result).
Does anyone know what the issue could be, or how to debug the problem
further?
This patch series [1] should help. You might want to try it.
[1] http://lists.freedesktop.org/archives/mesa-dev/2013-March/036214.html
Doing some debugging, it seems the call to ctx->Driver.ReadPixels() in
_mesa_ReadnPixelsARB leads to _mesa_readpixels() being called in readpix.c.
I'm attaching some output of gdb that will hopefully be useful.
I'm also attaching the debug terminal output of running Weston with the DRM
backend.
Let me know if I can provide other useful information.