Re: Keeping the Screen Turned off While Getting Inputs

2023-08-28 Thread Vladimir Dergachev




On Mon, 28 Aug 2023, Ahmad Nouralizadeh wrote:


The laptop model is `Asus N501JW` running `Ubuntu 18.04`. The discrete GPU is 
`GeForce GTX 960M`.
The link below says that by default the system uses `Intel HD graphics` (the 
iGPU) and using the discrete GPU requires a proprietary driver! Does this mean 
that having `nouveau` is not enough?


I don't have that particular card, but I think nouveau should work with it.
And if you really need to, you can install the NVidia drivers.

However, you might want to upgrade Ubuntu - the latest LTS release is 
22.04.


Many tools have been improved, including perf and Xorg.

best

Vladimir Dergachev



https://askubuntu.com/a/766282/926952

It seems that I was wrong and only one GPU is being used?!




Re: Keeping the Screen Turned off While Getting Inputs

2023-08-28 Thread Ahmad Nouralizadeh
The laptop model is `Asus N501JW` running `Ubuntu 18.04`. The discrete GPU is 
`GeForce GTX 960M`.
The link below says that by default the system uses `Intel HD graphics` (the 
iGPU) and using the discrete GPU requires a proprietary driver! Does this mean 
that having `nouveau` is not enough?
https://askubuntu.com/a/766282/926952

It seems that I was wrong and only one GPU is being used?!


Re: Keeping the Screen Turned off While Getting Inputs

2023-08-28 Thread Vladimir Dergachev




On Mon, 28 Aug 2023, Ahmad Nouralizadeh wrote:


Is it possible to prevent the Xserver from using the iGPU and only use the
discrete GPU? I found no BIOS options for this. Why should the two GPUs work
simultaneously?




Urmm... the BIOS is the wrong place to look for this - if you are trying to
alter how the Xserver (Xorg) works, you should first read the documentation for
Xorg (man xorg.conf and so on).


Google searches help too; a useful keyword is "optimus", which was the name
of the dual-GPU configuration when it was first introduced.


I am not giving explicit instructions because I don't know which hardware
and OS you are using (you did not describe them), and I might not know the
answer off the top of my head if your configuration is sufficiently different
from mine.


I practically never use the discrete card of the Optimus pair - it produces too
much heat and makes the laptop fans spin. I did try the discrete card a couple
of times when playing with CUDA, but the laptop GPU was too underpowered to be
of use.


There might be a GUI app for this; a useful place to look is nvidia-settings, if
you have an NVidia card and the closed-source NVidia drivers. Alternatively,
search for information on the "nouveau" driver (NVidia cards) or amdgpu (AMD
cards).


best

Vladimir Dergachev


Re: Keeping the Screen Turned off While Getting Inputs

2023-08-28 Thread Ahmad Nouralizadeh
Is it possible to prevent the Xserver from using the iGPU and only use the
discrete GPU? I found no BIOS options for this. Why should the two GPUs work
simultaneously?

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Vladimir Dergachev




On Sun, 27 Aug 2023, Ahmad Nouralizadeh wrote:


Thanks Alan and Vladimir! These are very effective clues to help me understand
the whole architecture, but I will need some experiments! :D

I may continue with this thread later to ask questions about the main problem 
discussed here (i.e., turning off the screen), if I find my approach feasible! 
I think that the kernel structure storing
the display state is `struct drm_crtc_state`, particularly, its `enabled` and 
`active` fields.

There exists a `linux_pm` mailing list which seems to be related to my
question, but it seems to be for development purposes, not for learning! Where
do you think I can ask GPU/power-management-related questions at the kernel
level?


There is no harm in asking - most developers had to learn at some stage.

But I would expect people on linux_pm to be focused on issues other
than GPU power management.


best

Vladimir Dergachev



Regards.




Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Ahmad Nouralizadeh
Thanks Alan and Vladimir! These are very effective clues to help me understand
the whole architecture, but I will need some experiments! :D
I may continue with this thread later to ask questions about the main problem 
discussed here (i.e., turning off the screen), if I find my approach feasible! 
I think that the kernel structure storing the display state is `struct 
drm_crtc_state`, particularly, its `enabled` and `active` fields.
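
(For illustration, a rough, untested sketch of flipping that state from
userspace through libdrm's atomic API - the KMS "ACTIVE" property is what ends
up in `drm_crtc_state`. It assumes /dev/dri/card0, takes the first CRTC, skips
error handling, and will only succeed when the caller holds DRM master, i.e.
with the display server stopped. Build against libdrm (-ldrm).)

/* Sketch: set a CRTC's ACTIVE property to 0 via the DRM atomic API. */
#include <fcntl.h>
#include <stdio.h>
#include <stdint.h>
#include <string.h>
#include <xf86drm.h>
#include <xf86drmMode.h>

static uint32_t find_prop(int fd, uint32_t obj_id, uint32_t obj_type, const char *name)
{
    drmModeObjectProperties *props = drmModeObjectGetProperties(fd, obj_id, obj_type);
    uint32_t id = 0;
    for (uint32_t i = 0; props && i < props->count_props; i++) {
        drmModePropertyRes *p = drmModeGetProperty(fd, props->props[i]);
        if (p && strcmp(p->name, name) == 0)
            id = p->prop_id;
        drmModeFreeProperty(p);
    }
    drmModeFreeObjectProperties(props);
    return id;
}

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR);              /* assumed device path */
    drmSetClientCap(fd, DRM_CLIENT_CAP_ATOMIC, 1);         /* opt in to atomic */

    drmModeRes *res = drmModeGetResources(fd);
    uint32_t crtc_id = res->crtcs[0];                      /* first CRTC, for illustration */
    uint32_t active = find_prop(fd, crtc_id, DRM_MODE_OBJECT_CRTC, "ACTIVE");

    drmModeAtomicReq *req = drmModeAtomicAlloc();
    drmModeAtomicAddProperty(req, crtc_id, active, 0);     /* 0 = display pipe inactive */
    int ret = drmModeAtomicCommit(fd, req, DRM_MODE_ATOMIC_ALLOW_MODESET, NULL);
    printf("atomic commit returned %d\n", ret);

    drmModeAtomicFree(req);
    drmModeFreeResources(res);
    return 0;
}
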
There exists a `linux_pm` mailing list which seems to be related to my
question, but it seems to be for development purposes, not for learning! Where
do you think I can ask GPU/power-management-related questions at the kernel
level?
Regards.

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Vladimir Dergachev




On Sun, 27 Aug 2023, Ahmad Nouralizadeh wrote:


Perhaps I didn't express my question precisely. I understand that you are 
talking about the mmap function in the kernel which is usually a function 
pointer in vm_operations...

My question is about the userspace structure of X11. IIUC, we have X11 clients, 
which are GUI apps.
They have a portion of the X11 related libraries (those needed for clients) 
mapped into their address space. As the app and the X11 libraries (client code 
in X11) are in the same address space the
graphical data are accessible by both. Xserver is a separate process (i.e., 
Xorg). How are the graphical data sent to the server? Does it use shared 
memory? Multiple shared memory regions to service
each client?


First of all, plain X11 does not use shared memory - graphics requests
are sent over a socket. As root, do "ls -l /proc/XXX/fd", where XXX is the
pid of Xorg. That socket is very fast!


Remember that originally X11 did only 2D graphics. Shared memory support
was added later via the XShm (MIT-SHM) extension and was meant for transferring
images.


OpenGL is also an extension, and how exactly it communicates is up to
the driver. The driver is split into two parts - the part that sits in
the kernel and the part that is in the Xserver.


The general trend is that, to make things faster, you want to bypass as many
layers as you can.


So you would set up your OpenGL window by talking to the Xserver via a socket,
and the Xserver will inform the kernel driver. Then you would send your
data and rendering commands to the card via the kernel driver - preferably
with as few kernel calls as you can get away with.


For example, running glxgears on my computer, with Xorg running on 
internal i915 Intel card, I see in /proc/XXX/fd:


lrwx------ 1 volodya volodya 64 Aug 27 13:03 0 -> /dev/pts/133
lrwx------ 1 volodya volodya 64 Aug 27 13:03 1 -> /dev/pts/133
lrwx------ 1 volodya volodya 64 Aug 27 13:03 2 -> /dev/pts/133
lrwx------ 1 volodya volodya 64 Aug 27 13:03 3 -> 'socket:[353776996]'
lrwx------ 1 volodya volodya 64 Aug 27 13:03 4 -> /dev/dri/card0
lrwx------ 1 volodya volodya 64 Aug 27 13:03 5 -> /dev/dri/card0
lrwx------ 1 volodya volodya 64 Aug 27 13:03 6 -> /dev/dri/card0
lrwx------ 1 volodya volodya 64 Aug 27 13:03 7 -> /dev/dri/card0
[...]

The file descriptors 0, 1, and 2 are standard input, output, and error. File
descriptor 3 is the socket used to talk to the Xserver, and the rest are the
device node created by the kernel driver. I don't know why the Intel driver
needs four of them.
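
(If you want to poke at one of those /dev/dri/card0 descriptors yourself, here
is a small sketch - device path assumed, error handling minimal - that asks the
kernel which driver sits behind it, using libdrm:)

/* Sketch: open the DRM device like a client would and query the driver. */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <xf86drm.h>

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR);     /* assumed device path */
    if (fd < 0) { perror("open"); return 1; }

    drmVersionPtr v = drmGetVersion(fd);
    if (v) {
        printf("driver: %s %d.%d.%d (%s)\n", v->name, v->version_major,
               v->version_minor, v->version_patchlevel, v->desc);
        drmFreeVersion(v);
    }
    close(fd);
    return 0;
}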


Looking in /proc/XXX/maps there are many entries, with lots of them looking
like:


7fe9ac736000-7fe9ac836000 rw-s 203853000 00:0e 12497 anon_inode:i915.gem
7fe9ac836000-7fe9ac83a000 rw-s 1109bc000 00:0e 12497 anon_inode:i915.gem
7fe9ac83a000-7fe9ac84a000 rw-s 3267cc000 00:0e 12497 anon_inode:i915.gem
7fe9ac90a000-7fe9ac91a000 rw-s 260574000 00:0e 12497 anon_inode:i915.gem
7fe9ac91a000-7fe9ac92a000 rw-s 60d483000 00:0e 12497 anon_inode:i915.gem

This has something to do with communicating with the kernel driver. It looks
like it needs a lot of buffers to do that - a few would make sense, but I
counted 21 in total, which seems like a lot.
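
(To reproduce that count on your own machine, a trivial sketch that tallies the
anon_inode:i915.gem mappings of a given pid - the substring is specific to the
Intel driver, as in the listing above:)

/* Sketch: count "anon_inode:i915.gem" mappings of a process.
 * Usage: ./count_gem <pid> */
#include <stdio.h>
#include <string.h>

int main(int argc, char **argv)
{
    if (argc < 2) { fprintf(stderr, "usage: %s <pid>\n", argv[0]); return 1; }

    char path[64], line[512];
    snprintf(path, sizeof path, "/proc/%s/maps", argv[1]);

    FILE *f = fopen(path, "r");
    if (!f) { perror("fopen"); return 1; }

    int count = 0;
    while (fgets(line, sizeof line, f))
        if (strstr(line, "anon_inode:i915.gem"))
            count++;
    fclose(f);

    printf("%d i915.gem mappings\n", count);
    return 0;
}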


On the other hand, on a different computer with an NVidia card, I see the 
following in /proc/XXX/fd for a plasmashell (KDE desktop):


lrwx------ 1 volodya volodya 64 Aug 27 13:13 11 -> /dev/nvidiactl
lrwx------ 1 volodya volodya 64 Aug 27 13:13 12 -> /dev/nvidia-modeset
lrwx------ 1 volodya volodya 64 Aug 27 13:13 13 -> /dev/nvidia0
lrwx------ 1 volodya volodya 64 Aug 27 13:13 14 -> /dev/nvidia0
lrwx------ 1 volodya volodya 64 Aug 27 13:13 15 -> /dev/nvidia-modeset
lrwx------ 1 volodya volodya 64 Aug 27 13:13 17 -> /dev/nvidia0
lrwx------ 1 volodya volodya 64 Aug 27 13:13 18 -> /dev/nvidia0
[...]

nvidiactl is unique - this is how things are triggered - but there are
many, many open file descriptors to nvidia-modeset and, especially,
nvidia0.


The contents of /proc/XXX/maps match in complexity:

7f0e6c00c000-7f0e6c00d000 rw-s  00:05 476  /dev/nvidia0
7f0e6c00d000-7f0e6c00e000 rw-s  00:05 476  /dev/nvidia0
7f0e6c00e000-7f0e6c00f000 rw-s  00:05 475  /dev/nvidiactl
7f0e6c00f000-7f0e6c010000 rw-s  00:05 475  /dev/nvidiactl
7f0e6c010000-7f0e6c011000 rw-s  00:05 475  /dev/nvidiactl
7f0e6c011000-7f0e6c012000 rw-s 00044000 00:01 4096  /memfd:/.nvidia_drv.XX (deleted)
7f0e6c021000-7f0e6c024000 rw-s  00:05 475  /dev/nvidiactl
7f0e6c0d0000-7f0e6c0e3000 rw-s  00:05 475  /dev/nvidiactl

and many more similar entries.

However, in both cases the focus is on communication with the kernel 
driver and the hardware, not the Xserver.


best

Vladimir Dergachev








Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Alan Coopersmith

On 8/27/23 09:53, Ahmad Nouralizadeh wrote:
Perhaps I didn't express my question precisely. I understand that you are 
talking about the mmap function in the kernel which is usually a function 
pointer in vm_operations...


My question is about the userspace structure of X11. IIUC, we have X11 clients, 
which are GUI apps.
They have a portion of the X11 related libraries (those needed for clients) 
mapped into their address space. As the app and the X11 libraries (client code 
in X11) are in the same address space the graphical data are accessible by both. 
Xserver is a separate process (i.e., Xorg). How are the graphical data sent to 
the server? Does it use shared memory? Multiple shared memory regions to service 
each client?


By default, the client writes data into a socket that the server reads from
(Unix domain socket for local connections (including ssh forwarded ones),
 TCP socket for remote).

For local connections, clients can use the MIT-SHM extension to use shared
memory for pixmap data instead, in which case the client creates a shared
memory segment for each image:
https://www.x.org/releases/current/doc/xextproto/shm.html
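
(To make that concrete, a rough outline of the client side of MIT-SHM - error
handling, extension checks and event handling are omitted, so treat it as a
sketch rather than working code; build against libXext:)

/* Outline of the MIT-SHM path: the client creates a SysV shared memory
 * segment, wraps it in an XImage, attaches it to the server, and then
 * XShmPutImage moves pixels without pushing them through the socket. */
#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <X11/extensions/XShm.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int main(void)
{
    Display *dpy = XOpenDisplay(NULL);
    int scr = DefaultScreen(dpy);
    Window win = XCreateSimpleWindow(dpy, RootWindow(dpy, scr),
                                     0, 0, 640, 480, 0, 0, 0);
    XMapWindow(dpy, win);

    XShmSegmentInfo shminfo;
    XImage *img = XShmCreateImage(dpy, DefaultVisual(dpy, scr),
                                  DefaultDepth(dpy, scr), ZPixmap,
                                  NULL, &shminfo, 640, 480);

    shminfo.shmid = shmget(IPC_PRIVATE, img->bytes_per_line * img->height,
                           IPC_CREAT | 0600);
    shminfo.shmaddr = img->data = shmat(shminfo.shmid, NULL, 0);
    shminfo.readOnly = False;
    XShmAttach(dpy, &shminfo);              /* the server maps the same segment */

    /* ... the client draws pixels directly into img->data here ... */

    XShmPutImage(dpy, win, DefaultGC(dpy, scr), img,
                 0, 0, 0, 0, 640, 480, False);
    XSync(dpy, False);

    XShmDetach(dpy, &shminfo);
    shmdt(shminfo.shmaddr);
    shmctl(shminfo.shmid, IPC_RMID, NULL);
    XCloseDisplay(dpy);
    return 0;
}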

--
-Alan Coopersmith- alan.coopersm...@oracle.com
 Oracle Solaris Engineering - https://blogs.oracle.com/solaris



Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Ahmad Nouralizadeh
Perhaps I didn't express my question precisely. I understand that you are
talking about the mmap function in the kernel, which is usually a function
pointer in vm_operations...
My question is about the userspace structure of X11. IIUC, we have X11 clients,
which are GUI apps. They have a portion of the X11-related libraries (those
needed for clients) mapped into their address space. As the app and the X11
libraries (client code in X11) are in the same address space, the graphical data
are accessible by both. The Xserver is a separate process (i.e., Xorg). How are
the graphical data sent to the server? Does it use shared memory? Multiple shared
memory regions to service each client?
Is OpenGL mapped only into the server portion?

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Vladimir Dergachev




On Sun, 27 Aug 2023, Ahmad Nouralizadeh wrote:


Thanks (also Dave and Carsten)! Full of useful information not easily found (if 
even found) on the Internet!


You are welcome :)


So, in summary, the communication is done through a series of memory mapped 
regions in
the address space of the graphics library (e.g., OpenGL).
The image data is transferred 1) from the X client to the X server and 2) from 
there to the graphics library. Are both of these transfers made using shared 
memory?




To be pedantic, the memory-mapped regions live in the address space of whatever
process mapped them, and there can be several mappings active at the same
time (think two OpenGL apps running on different cores).


Some memory regions are mapped by the DRI driver in the kernel, some by the X
server, some by OpenGL apps (via calls to the OpenGL library). The mapped sets
don't have to be the same, and this depends a lot on the particular
graphics card and the driver - I don't know Intel onboard graphics well,
so someone else please chime in.


Shared memory (as in POSIX shared memory) is a different mechanism that
also does memory mapping. From the programmer's point of view, you acquire
POSIX shared memory via calls like shm_open().


The memory involved in OpenGL is all mapped using calls to "mmap()".

mmap() is much simpler to use - you issue it on a file and you get a
memory-mapped view of a portion of it. You can do this on any file - for
example, I wrote a library, libMVL, and an R package, RMVL, that use memory
mapping of files on SSD to analyze large data.
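
(A minimal sketch, just to show how little is involved - map an ordinary file
read-only and print it through the mapping; the filename is an arbitrary
example:)

/* Sketch: map a regular file and read it through a pointer. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/etc/hostname", O_RDONLY);    /* any non-empty file will do */
    struct stat st;
    fstat(fd, &st);

    char *p = mmap(NULL, st.st_size, PROT_READ, MAP_PRIVATE, fd, 0);
    if (p == MAP_FAILED) { perror("mmap"); return 1; }

    fwrite(p, 1, st.st_size, stdout);            /* the file contents, via memory */

    munmap(p, st.st_size);
    close(fd);
    return 0;
}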


Unlike mmap(), POSIX shared memory was designed as a method of communication
between different processes and has mechanisms to arbitrate access - this
is what makes it complicated. As far as I know, POSIX shared memory does
not give access to any hardware devices.
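
(For contrast, a sketch of the POSIX shared memory flavour - the producer side
only; a second process would shm_open() the same name and mmap() it to see the
data. The segment name is an arbitrary example; link with -lrt on older glibc:)

/* Sketch: create a named POSIX shared memory segment, size it, map it,
 * and leave a message for another process to find. */
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

int main(void)
{
    const char *name = "/demo_shm";                 /* arbitrary example name */
    int fd = shm_open(name, O_CREAT | O_RDWR, 0600);

    ftruncate(fd, 4096);                            /* give the segment a size */
    char *p = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);

    strcpy(p, "hello from another process");        /* visible to anyone who maps it */

    munmap(p, 4096);
    close(fd);
    /* shm_unlink(name); removes the name when you are done */
    return 0;
}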


If you want to have fun, try memory mapping /dev/mem as root and then
reading it carefully - this gives direct access to the physical memory of your
computer.


I think the first page is either filled with zeros or contains interrupt
pointers - it's been a while, so I don't remember.


If you do lspci -v it will list the available memory-mapped regions of various
devices. For example, the laptop I am typing this on has an NVidia card:


01:00.0 3D controller: NVIDIA Corporation GM108M [GeForce MX130] (rev a2)
Subsystem: Dell GM108M [GeForce MX130]
Flags: bus master, fast devsel, latency 0, IRQ 255
Memory at d2000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (64-bit, prefetchable) [size=256M]
Memory at d0000000 (64-bit, prefetchable) [size=32M]
I/O ports at e000 [disabled] [size=128]
Expansion ROM at d3000000 [disabled] [size=512K]
Capabilities:  (you get these by running lspci as root)
Kernel modules: nvidiafb, nouveau, nvidia_drm, nvidia

This card has three mappable regions of memory. The I/O ports exist in a
separate address space specific to Intel processors - these are going out
of fashion. The expansion ROM could have been mappable, but isn't because,
I think, it has been copied to memory after boot - ROM access is slow.


The non-prefetchable region contains registers - prefetching will screw up 
access to them.


The 256M region is likely the video memory, so if you memory map /dev/mem
starting at 0xc0000000 you will be able to read and write to it. Try
filling it with some pattern like 0xAA - you will muck up your screen and,
maybe, crash X. Issue "sync" beforehand to save your data in case of a
hard lockup. There is some chance of damage to the video card, depending on
which one you have and whether you accidentally wrote to registers or some
other sensitive place and made the video card overheat or applied too big a
voltage to the chip.


Many modern hardware devices are very programmable, so the hardware starts
up cold with a bare minimum of functionality and then the firmware programs
the voltage regulators to whatever is specified in ROM and sets the PLLs to
the correct frequencies for the clocks. I don't actually know whether any
video cards have programmable voltage regulators, but they do have
programmable PLLs and these might get set to too high a frequency. With the
cards I played with, setting PLLs by accident is not that easy, and if you
do it wrong the lockup is instantaneous. If you pull the plug right away
you should not get much damage, as thermal processes are one of the few
things that are usually slower than human reflexes (though read about the
hot-wire barretter as a counterexample).


Writing to video card memory should be safer, but it usually contains
programs for execution by the GPU, so by writing there you create a random
program that could be triggered by an X server that does not know about your
write. I don't think the chance of accidentally programming the PLL that way
is very high, but it is worth mentioning for the sake of pedantry. And you
might be able to do it on purpose.


I don't know what the third region does, maybe a 

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Ahmad Nouralizadeh
Thanks (also Dave and Carsten)! Full of useful information not easily found (if
even found) on the Internet! So, in summary, the communication is done through
a series of memory mapped regions in the address space of the graphics library
(e.g., OpenGL). The image data is transferred 1) from the X client to the X
server and 2) from there to the graphics library. Are both of these transfers
made using shared memory?


Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Vladimir Dergachev




On Sun, 27 Aug 2023, Ahmad Nouralizadeh wrote:


> The framebuffer that is displayed on the monitor is always in video card
> memory. There is a piece of hardware (CRTC) that continuously pulls data
> from the framebuffer and transmits it to the monitor.

So the framebuffer memory should normally be in the kernel (perhaps in special
cases it could be mapped into userspace?!). IIUC, the Xserver works on the app's
GUI data in userspace and sends it to the kernel to finally arrive at the
framebuffer. Correct? Does it use some kind of ioctl()?




Not necessarily - for very old cards you would issue a special command to 
transfer data or paint a line.


Modern video cards (and video drivers) usually work like this - the
graphics card exposes several regions that work like memory over the PCIe bus,
i.e. the CPU can access them by issuing a "mov" instruction to an address
outside main CPU memory (assuming the graphics card is a physical PCIe
card).


One of the regions is the entire video card memory, which includes the
framebuffer. This way you can transfer data by simply copying it to the
memory-mapped region.


This, however, is slow even with modern CPUs, because of the limitations of
PCIe bandwidth and because CPUs are not well suited to the task.


Instead, a second memory region contains "registers" - special memory
locations that, when written, make magic happen. Magic is an appropriate
word here because the function of those registers is entirely arbitrary -
their function is picked by the hardware designers and there aren't any
strict constraints to force a particular structure.

For example, one register could contain the starting x coordinate, another
the starting y, another the ending x, another the ending y, one more could
contain a color, and finally a special register could, when written, draw a
line in the framebuffer from start to end using that color.
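
(To make the idea concrete, a sketch with a purely hypothetical register
layout - no real card uses these offsets. "regs" would be a pointer to the
mmap()ed register region; volatile keeps the compiler from reordering or
dropping the stores:)

#include <stdint.h>

#define REG_START_X   0x00    /* hypothetical register offsets */
#define REG_START_Y   0x04
#define REG_END_X     0x08
#define REG_END_Y     0x0c
#define REG_COLOR     0x10
#define REG_DRAW_LINE 0x14    /* writing anything here triggers the draw */

static void wr(volatile uint8_t *regs, uint32_t off, uint32_t val)
{
    *(volatile uint32_t *)(regs + off) = val;
}

void draw_line(volatile uint8_t *regs)
{
    wr(regs, REG_START_X,   10);
    wr(regs, REG_START_Y,   10);
    wr(regs, REG_END_X,     200);
    wr(regs, REG_END_Y,     150);
    wr(regs, REG_COLOR,     0x00ff00ff);
    wr(regs, REG_DRAW_LINE, 1);          /* "magic happens" */
}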


This is much faster than using the CPU because only a few values are
transferred - the rest of the work is done by the video card.


And this is how video cards used to work a few decades back, and partially 
still do. However, for modern needs this is still too slow.


So one more feature of video cards is "PCIe bus mastering" - the ability
to access main CPU memory directly and retrieve (or write) data there.


So instead of transferring data to the framebuffer (for example) by having
the CPU write there, the CPU writes the addresses (plural) of the memory
regions to transfer into the video card's registers and then triggers the
transfer by writing to a special register. The video card does the work.


The transfer to the framebuffer is not very interesting, but what you can
do instead is bus-master to the registers. This is usually done by a
dedicated unit, so it is not exactly like writing to the registers, but
it makes for a good simplified explanation.


So now you have a memory region in main memory where the CPU has assembled
data like "address of register for starting X", "value of starting X",
"register address for color of starting point", "value of color" and so
on, finishing with "address of trigger register", "Trigger!".
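
(Again with made-up register offsets, a sketch of what that buffer of
(register, value) pairs might look like, and of ringing the card's "doorbell"
so it fetches the commands by bus mastering:)

#include <stddef.h>
#include <stdint.h>

struct gpu_cmd { uint32_t reg; uint32_t value; };     /* one "instruction" */

/* Assemble the command stream in ordinary main memory. */
size_t build_line_cmds(struct gpu_cmd *buf)
{
    size_t n = 0;
    buf[n++] = (struct gpu_cmd){ 0x00, 10 };          /* starting x */
    buf[n++] = (struct gpu_cmd){ 0x04, 10 };          /* starting y */
    buf[n++] = (struct gpu_cmd){ 0x08, 200 };         /* ending x   */
    buf[n++] = (struct gpu_cmd){ 0x0c, 150 };         /* ending y   */
    buf[n++] = (struct gpu_cmd){ 0x10, 0x00ff00ff };  /* color      */
    buf[n++] = (struct gpu_cmd){ 0x14, 1 };           /* "Trigger!" */
    return n;
}

/* Tell the (hypothetical) card where the buffer is and how long it is,
 * then ring the doorbell so it fetches and executes the commands itself. */
void kick(volatile uint32_t *regs, uint64_t buf_bus_addr, size_t n)
{
    regs[0x100 / 4] = (uint32_t)(buf_bus_addr & 0xffffffff);
    regs[0x104 / 4] = (uint32_t)(buf_bus_addr >> 32);
    regs[0x108 / 4] = (uint32_t)n;
    regs[0x10c / 4] = 1;                              /* doorbell */
}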


And this now looks like instructions for a very, very weird VLIW (very 
long instruction word) processor.


The OpenGL driver now works by taking OpenGL commands and compiling them
into sequences of these weird GPU instructions, which are placed into a memory
buffer. When enough of these accumulate, the video card is given the
trigger to go and execute them, and something gets painted.


If you need to paint a picture, another buffer is allocated, the picture data
is written into it, and then a special command is created instructing the card
to pull data from that buffer.


Now, over the past few decades video cards have evolved into slightly less
weird VLIW processors - they are getting rid of dedicated commands like
"draw a line from X to Y" in favor of commands like "compute the dot product
between arrays of 4-dimensional vectors".


They still have the weird multi-tier PCIe bus mastering, and multiple caches
used to access multiple types of memory: framebuffer memory, texture
memory, main memory and a few others. And weird quirks that make
interesting GPU programming tricky.


So now, if you start some OpenGL app on Linux and look into /proc/XXX/maps,
you should be able to find several memory regions that have been mapped by
the graphics driver. Some of those are real memory, some are registers and
are entirely virtual - there isn't any physical DRAM backing them.


These aren't all the regions exposed by the video card, because if multiple
apps wrote to the video card's registers directly it would lock up hard,
freezing the PCIe bus. Instead, this is arbitrated by the kernel driver.
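
(Tying this back to the earlier ioctl() question, a sketch of the simplest such
arbitration: asking the kernel DRM driver for a CPU-mappable "dumb" buffer. The
ioctls are real, but error handling and the scanout setup
(drmModeAddFB()/drmModeSetCrtc()) that would actually put it on screen are
omitted:)

/* Sketch: allocate a "dumb" buffer through the kernel DRM driver and
 * map it into our address space - the kernel arbitrates the hardware,
 * userspace just gets ordinary memory to draw into. */
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <sys/mman.h>
#include <unistd.h>
#include <xf86drm.h>   /* pulls in drm.h / drm_mode.h */

int main(void)
{
    int fd = open("/dev/dri/card0", O_RDWR);          /* assumed device path */

    struct drm_mode_create_dumb create = { .width = 640, .height = 480, .bpp = 32 };
    ioctl(fd, DRM_IOCTL_MODE_CREATE_DUMB, &create);   /* kernel allocates the buffer */

    struct drm_mode_map_dumb map = { .handle = create.handle };
    ioctl(fd, DRM_IOCTL_MODE_MAP_DUMB, &map);         /* returns an mmap offset */

    uint32_t *pixels = mmap(NULL, create.size, PROT_READ | PROT_WRITE,
                            MAP_SHARED, fd, map.offset);

    for (size_t i = 0; i < create.size / 4; i++)      /* paint it grey */
        pixels[i] = 0x00808080;

    munmap(pixels, create.size);
    close(fd);
    return 0;
}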


best

Vladimir Dergachev



Re: Keeping the Screen Turned off While Getting Inputs

2023-08-27 Thread Carsten Haitzler
On Sat, 26 Aug 2023 15:28:52 + (UTC) Ahmad Nouralizadeh
 said:

> Hi,
> I need to run a set of (graphical) benchmarks with the screen disabled. The
> following command did not work: xset dpms force off
> 
> Because any keyboard/mouse input would re-enable the screen. The other option
> was the following:
> 
> xrandr --output eDP-1 --off
> 
> This turns off the screen for a second, then, causes the following
> segmentation fault:
> 
> gnome-shell[25737]: segfault at 8 ip 7f3d02ef9210 sp 7ffeee4e1fd8
> error 4 in libmutter-2.so.0.0.0[7f3d02e99000+156000]
> 
> How can the problem be solved? Is there any other user/kernel level
> alternative?Regards. ===
> P.S.: The question is also asked in the following addresses:
> https://superuser.com/q/1805444/1037926
> https://unix.stackexchange.com/q/754782/348219

well first if you're not benchmarking gnome shell.. don't use gnome. there's a
full compositor going on there that will add extra work to any updates on
screen. use no window manager or something very basic (twm, fvwm etc.).

second...

xinput disable XXX
...
xset dpms force off

run xinput to list all your input devices... disable all the ids... you'll
have no working input devices if you disable them all - nothing to interrupt
dpms (unless some other software is trying to disable the dpms blank - thus
use the simplest setup possible, as per the first point above) :)

-- 
- Codito, ergo sum - "I code, therefore I am" --
Carsten Haitzler - ras...@rasterman.com



Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Ahmad Nouralizadeh
> The framebuffer that is displayed on the monitor is always in video card 
> memory. There is a piece of hardware (CRTC) that continuously pulls data 
> from the framebuffer and transmits it to the monitor.

So the framebuffer memory should normally be in the kernel (perhaps in special
cases it could be mapped into userspace?!). IIUC, the Xserver works on the app's
GUI data in userspace and sends it to the kernel to finally arrive at the
framebuffer. Correct? Does it use some kind of ioctl()?
  

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Vladimir Dergachev




On Sun, 27 Aug 2023, Ahmad Nouralizadeh wrote:


> In order to display anything on the screen the video card needs an array
> of data giving the color of each pixel. This is usually called the "framebuffer"
> because it buffers data for one frame of video.

Thank you for the enlightening explanation! An unrelated question: IIUC the
framebuffer is shared memory in userspace. I see a huge amount of memory
(around 1 GB) in kernel space related to something called the GEM layer. Why is
this large allocation needed?


The framebuffer that is displayed on the monitor is always in video card 
memory. There is a piece of hardware (CRTC) that continuously pulls data 
from the framebuffer and transmits it to the monitor.


A notable special case is when the "video card" is part of the CPU; in
this case the main memory serves a dual purpose: most of it is used by the
main CPU, while a portion is allocated to the video card
(there are BIOS options to change the amount).


This has an impact on performance - the CRTC needs to send a frame to the
monitor at the refresh rate and it needs to pull data from memory -
everything else has to wait.


If you are using a 4K (3840x2160) monitor that refreshes at 60 Hz, with 
each pixel a customary 32 bits, the CRTC needs 2 GB/s of bandwidth.
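
(Checking the arithmetic from the numbers above: 3840 x 2160 pixels x 4 bytes
x 60 frames/s is about 1.99 GB/s, so roughly 2 GB/s of memory bandwidth goes
to scanout alone.)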




> When you request "dpms off" all this does is tell monitor to turn off the
> light and save power. Everything that normally be drawn will still be
> drawn, as you can verify using x11vnc and vncviewer.

How does the interactive input cause screen reactivation? Is it signaled in 
software? If yes, perhaps the signal could be hidden by some small changes in 
the software to prevent the reactivation.


There is likely a piece of software that sends "dpms on" the moment the
cursor moves - probably in the Xserver itself.


best

Vladimir Dergachev



> From the point of view of a benchmark you need to be very careful not to
> alter the task, as modern systems love to optimize.

I will have to do some approximations using a combination of the processor and 
IMC counters.






Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Ahmad Nouralizadeh
> In order to display anything on the screen the video card needs an array
> of data giving the color of each pixel. This is usually called the "framebuffer"
> because it buffers data for one frame of video.

Thank you for the enlightening explanation! An unrelated question: IIUC the
framebuffer is shared memory in userspace. I see a huge amount of memory
(around 1 GB) in kernel space related to something called the GEM layer. Why is
this large allocation needed?

> When you request "dpms off" all this does is tell monitor to turn off the 
> light and save power. Everything that normally be drawn will still be 
> drawn, as you can verify using x11vnc and vncviewer.

How does the interactive input cause screen reactivation? Is it signaled in 
software? If yes, perhaps the signal could be hidden by some small changes in 
the software to prevent the reactivation.

> From the point of view of a benchmark you need to be very careful not to
> alter the task, as modern systems love to optimize.

I will have to do some approximations using a combination of the processor and 
IMC counters.  

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Vladimir Dergachev



On Sat, 26 Aug 2023, Ahmad Nouralizadeh wrote:


> > However, I would have expected that VLC would produce a lot
> > GPU/iGPU accesses even without drawing anything, because it would
> > try to use GPU decoder.

For the discrete GPU, the turned-off screen requires much smaller bandwidth in
any benchmark (it drops from 2 GB/s to several KB/s). The same seems to be true
with the iGPU. Of course, there might be some DRAM accesses originating from the
GPU/iGPU, but the main traffic seems to fade. These assumptions are based on my
experiments and I could be wrong.
(P.S.: VLC seems to be aware of the screen state. The rendering thread will
stop when the screen is off, as mentioned
here: https://stackoverflow.com/q/76891645/6661026.)

> > Displaying video is also often done using GL or Xvideo - plain X is
> > too slow for this.
I'm looking for a simpler solution. I'm not familiar with these Xorg-related
concepts! It seems a bit strange that turning off the screen requires so much
effort! If `xset dpms force off` did not cause screen reactivation on user
input, or `xrandr --output ...` did not cause a segfault, everything would be
fine.


Here is a simplified explanation:

In order to display anything on the screen, the video card needs an array
of data giving the color of each pixel. This is usually called the "framebuffer"
because it buffers data for one frame of video.


For every monitor you plugged in there is a separate framebuffer, unless 
they display the same thing (mirror).


To draw, the CPU either sends data directly to the framebuffer, requests the
video card to pull data from RAM, or does some more complicated combination of
the two (this includes GL and Xvideo acceleration).


So you have a system CPU -> Video Card -> Monitor

When you request "dpms off" all this does is tell monitor to turn off the 
light and save power. Everything that normally be drawn will still be 
drawn, as you can verify using x11vnc and vncviewer.


When you request "xrandr ... --off" you are requesting the equivalent of 
physically unplugging monitor cable. The framebuffer associated with that 
monitor will get destroyed. That's likely why you saw that gnome-panel 
error - some library it relies on could not deal with the fact that the 
framebuffer it was supposed to draw into suddenly disappeared.


From the point of view of a benchmark, you need to be very careful not to
alter the task, as modern systems love to optimize.

For example, many applications will stop drawing when their window is
fully obscured (I don't know about VLC, but it's likely).


However, this behaviour will change depending on whether a compositor is
enabled, and even on how many windows are open, as the compositor has
limits.


best

Vladimir Dergachev




>edit to add: google suggests another candidate might be something
>called pin-instat
Pin works at the source code level. It counts source-level accesses which might
not reach DRAM (e.g., accesses serviced by caches).

> > best
> >
> > Vladimir Dergachev 



Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Ahmad Nouralizadeh
 > > However, I would have expected that VLC would produce a lot
> > GPU/iGPU accesses even without drawing anything, because it would
> > try to use GPU decoder.
For the discrete GPU, the turned-off screen requires much smaller bandwidth in
any benchmark (it drops from 2 GB/s to several KB/s). The same seems to be true
with the iGPU. Of course, there might be some DRAM accesses originating from the
GPU/iGPU, but the main traffic seems to fade. These assumptions are based on my
experiments and I could be wrong.
(P.S.: VLC seems to be aware of the screen state. The rendering thread will
stop when the screen is off, as mentioned
here: https://stackoverflow.com/q/76891645/6661026.)

> > Displaying video is also often done using GL or Xvideo - plain X is
> > too slow for this.
I'm looking for a simpler solution. I'm not familiar with these Xorg-related
concepts! It seems a bit strange that turning off the screen requires so much
effort! If `xset dpms force off` did not cause screen reactivation on user
input, or `xrandr --output ...` did not cause a segfault, everything would be
fine.

>edit to add: google suggests another candidate might be something
>called pin-instat
Pin works at the source code level. It counts source-level accesses which might
not reach DRAM (e.g., accesses serviced by caches).

> > best
> > 
> > Vladimir Dergachev  
  

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Dave Howorth
On Sat, 26 Aug 2023 20:46:35 +0100, Dave Howorth wrote:

> On Sat, 26 Aug 2023 13:43:21 -0400 (EDT), Vladimir Dergachev wrote:
> 
> > On Sat, 26 Aug 2023, Ahmad Nouralizadeh wrote:
> >   
> > > >> Those accesses might not stop with just the display off - some
> > > >> applications may keep redrawing.
> > > Will these accesses cause iGPU or dedicated GPU accesses to the
> > > DRAM? I think that those redrawings originate from the
> > > processor.   
> > > >I'm not sure a graphical benchmark will run without a graphical
> > > >system running?
> > > Yes, VLC is one of the benchmarks and will not run without
> > > GUI.
> > 
> > You can start system with plain X and twm for window manager - this
> > would produce minimal load on the GPU.
> > 
> > However, I would have expected that VLC would produce a lot
> > GPU/iGPU accesses even without drawing anything, because it would
> > try to use GPU decoder.
> > 
> > Displaying video is also often done using GL or Xvideo - plain X is
> > too slow for this.
> >   
> > > 
> > > >Maybe do the reverse of what I suggested. Run the benchmark but
> > > >send the output to a remote display.
> > > Will it avoid screen activation in the local machine?
> > 
> > There should be a rather drastic difference in speed between VLC 
> > displaying locally and in a remote X using network.  
> 
> Indeed but speed doesn't seem to matter. Some count of particular RAM
> accesses is what seems to be important. I'm not clear exactly what RAM
> accesses nor why the count is important, nor what disturbance to the
> normal operation is permitted. Maybe instrumenting a gdb trace of the
> benchmarks would be more accurate?

edit to add: google suggests another candidate might be something
called pin-instat

> > best
> > 
> > Vladimir Dergachev  


Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Dave Howorth
On Sat, 26 Aug 2023 13:43:21 -0400 (EDT), Vladimir Dergachev wrote:

> On Sat, 26 Aug 2023, Ahmad Nouralizadeh wrote:
> 
> > >> Those accesses might not stop with just the display off - some
> > >> applications may keep redrawing.  
> > Will these accesses cause iGPU or dedicated GPU accesses to the
> > DRAM? I think that those redrawings originate from the processor. 
> > >I'm not sure a graphical benchmark will run without a graphical
> > >system running?  
> > Yes, VLC is one of the benchmarks and will not run without GUI.  
> 
> You can start system with plain X and twm for window manager - this
> would produce minimal load on the GPU.
> 
> However, I would have expected that VLC would produce a lot GPU/iGPU 
> accesses even without drawing anything, because it would try to use
> GPU decoder.
> 
> Displaying video is also often done using GL or Xvideo - plain X is
> too slow for this.
> 
> >   
> > >Maybe do the reverse of what I suggested. Run the benchmark but
> > >send the output to a remote display.  
> > Will it avoid screen activation in the local machine?  
> 
> There should be a rather drastic difference in speed between VLC 
> displaying locally and in a remote X using network.

Indeed but speed doesn't seem to matter. Some count of particular RAM
accesses is what seems to be important. I'm not clear exactly what RAM
accesses nor why the count is important, nor what disturbance to the
normal operation is permitted. Maybe instrumenting a gdb trace of the
benchmarks would be more accurate?

> best
> 
> Vladimir Dergachev


Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Vladimir Dergachev




On Sat, 26 Aug 2023, Ahmad Nouralizadeh wrote:


>> Those accesses might not stop with just the display off - some
>> applications may keep redrawing.
Will these accesses cause iGPU or dedicated GPU accesses to the DRAM? I think 
that those redrawings originate from the processor.

>I'm not sure a graphical benchmark will run without a graphical system
>running?
Yes, VLC is one of the benchmarks and will not run without GUI.


You can start the system with plain X and twm as the window manager - this
would produce minimal load on the GPU.


However, I would have expected that VLC would produce a lot GPU/iGPU 
accesses even without drawing anything, because it would try to use GPU 
decoder.


Displaying video is also often done using GL or Xvideo - plain X is too 
slow for this.




>Maybe do the reverse of what I suggested. Run the benchmark but send
>the output to a remote display.
Will it avoid screen activation in the local machine?


There should be a rather drastic difference in speed between VLC
displaying locally and displaying on a remote X server over the network.


best

Vladimir Dergachev



>Since IMC counters appear to be a feature of the powerpc architecture,
>you might get a better response from some list/forum specific to that
>architecture.

IMC stands for the Integrated Memory Controller. The DRAM controller has some 
internal counters for counting different types of memory accesses. For example, 
for my laptop it is documented here:
https://software.intel.com/content/www/us/en/develop/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel.html

Do you have any suggestions about the cause of the xrandr error? It works 
perfectly in the virtual machine!





Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Ahmad Nouralizadeh
>> Those accesses might not stop with just the display off - some 
>> applications may keep redrawing.
Will these accesses cause iGPU or dedicated GPU accesses to the DRAM? I think 
that those redrawings originate from the processor.

>I'm not sure a graphical benchmark will run without a graphical system
>running?
Yes, VLC is one of the benchmarks and will not run without GUI.

>Maybe do the reverse of what I suggested. Run the benchmark but send>the 
>output to a remote display.
Will it avoid screen activation in the local machine?

>Since IMC counters appear to be a feature of the powerpc architecture,
>you might get a better response from some list/forum specific to that
>architecture.
IMC stands for the Integrated Memory Controller. The DRAM controller has some 
internal counters for counting different types of memory accesses. For example, 
for my laptop it is documented 
here:https://software.intel.com/content/www/us/en/develop/articles/monitoring-integrated-memory-controller-requests-in-the-2nd-3rd-and-4th-generation-intel.html

Do you have any suggestions about the cause of the xrandr error? It works 
perfectly in the virtual machine!

  

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Dave Howorth
On Sat, 26 Aug 2023 12:11:03 -0400 (EDT)
Vladimir Dergachev  wrote:

> On Sat, 26 Aug 2023, Ahmad Nouralizadeh wrote:
> 
> > I want to count the processor-initiated memory accesses. On my 4K
> > display, a huge number of accesses originate from the iGPU and
> > dedicated GPU. I want to exclude these accesses. The IMC counter
> > can only track the dedicated GPU accesses. Therefore, I have to
> > turn the screen off to exclude those originated from the iGPU.  
> 
> Those accesses might not stop with just the display off - some 
> applications may keep redrawing.
> 
> The simplest solution would be to boot to console mode with X off.
> The display will still work, but GPU usage would be minimal.

I'm not sure a graphical benchmark will run without a graphical system
running?

Maybe do the reverse of what I suggested. Run the benchmark but send
the output to a remote display.

Is it possible to just measure the memory accesses by process somehow?
As Vladimir suggests.

Since IMC counters appear to be a feature of the powerpc architecture,
you might get a better response from some list/forum specific to that
architecture.

> There is more than one console (usually), you can switch between them
> with Alt-F1, Alt-F2, etc..
> 
> There are also ways to restrict profiling to a single process,
> like "perf top -p 12345".
> 
> best
> 
> Vladimir Dergachev
> 
> > 
> > On Saturday, August 26, 2023, 08:10:15 PM GMT+4:30, Dave Howorth
> >  wrote:
> > 
> > 
> > On Sat, 26 Aug 2023 15:28:52 + (UTC)
> > Ahmad Nouralizadeh  wrote:
> >   
> > > I need to run a set of (graphical) benchmarks with the screen
> > > disabled.  
> > 
> > 
> > Can I ask why? What is it you're trying to accomplish? Somehow affect
> > the benchmarks? Stop people seeing the benchmarks being performed?
> > 
> > And what is the benchmark measuring? Elapsed time or CPU time or
> > what?
> > 
> > Turn the display off and run the benchmarks by ssh-ing in from
> > another machine?
> > 
> >  



Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Vladimir Dergachev




On Sat, 26 Aug 2023, Ahmad Nouralizadeh wrote:


I want to count the processor-initiated memory accesses. On my 4K display, a 
huge number of accesses originate from the iGPU and dedicated GPU. I want to 
exclude these accesses. The IMC counter can
only track the dedicated GPU accesses. Therefore, I have to turn the screen off 
to exclude those originated from the iGPU.


Those accesses might not stop with just the display off - some 
applications may keep redrawing.


The simplest solution would be to boot to console mode with X off. The 
display will still work, but GPU usage would be minimal.


There is more than one console (usually), you can switch between them with 
Alt-F1, Alt-F2, etc..


There are also ways to restrict profiling to a single process,
like "perf top -p 12345".

best

Vladimir Dergachev



On Saturday, August 26, 2023, 08:10:15 PM GMT+4:30, Dave Howorth 
 wrote:


On Sat, 26 Aug 2023 15:28:52 + (UTC)
Ahmad Nouralizadeh  wrote:

> I need to run a set of (graphical) benchmarks with the screen
> disabled.


Can I ask why? What is it you're trying to accomplish? Somehow affect the
benchmarks? Stop people seeing the benchmarks being performed?

And what is the benchmark measuring? Elapsed time or CPU time or what?

Turn the display off and run the benchmarks by ssh-ing in from another
machine?




Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Ahmad Nouralizadeh
I want to count the processor-initiated memory accesses. On my 4K display, a
huge number of accesses originate from the iGPU and dedicated GPU. I want to
exclude these accesses. The IMC counter can only track the dedicated GPU
accesses. Therefore, I have to turn the screen off to exclude those originating
from the iGPU.

On Saturday, August 26, 2023, 08:10:15 PM GMT+4:30, Dave Howorth 
 wrote:  
 
 On Sat, 26 Aug 2023 15:28:52 + (UTC)
Ahmad Nouralizadeh  wrote:

> I need to run a set of (graphical) benchmarks with the screen
> disabled.

Can I ask why? What is it you're trying to accomplish? Somehow affect the
benchmarks? Stop people seeing the benchmarks being performed?

And what is the benchmark measuring? Elapsed time or CPU time or what?

Turn the display off and run the benchmarks by ssh-ing in from another
machine?
  

Re: Keeping the Screen Turned off While Getting Inputs

2023-08-26 Thread Dave Howorth
On Sat, 26 Aug 2023 15:28:52 + (UTC)
Ahmad Nouralizadeh  wrote:

> I need to run a set of (graphical) benchmarks with the screen
> disabled.

Can I ask why? What is it you're trying to accomplish? Somehow affect the
benchmarks? Stop people seeing the benchmarks being performed?

And what is the benchmark measuring? Elapsed time or CPU time or what?

Turn the display off and run the benchmarks by ssh-ing in from another
machine?