Re: [darktable-user] GeForce 210 without opencl faster

2019-12-26 Thread Remco Viëtor
On Thursday, 26 December 2019 04:02:35 CET Аl Воgnеr wrote:
> I wonder why an old pc with a GeForce 210, AMD Athlon II X2 270, 8G RAM
> is slower using opencl.
> 
> darktable 2.6.3~git2.22c690a53
> Fresh installation of the pc, using default darktable-configuration
> 
> $ darktable-cli bench.srw bench.srw.xmp bench.jpg --core -d perf -d opencl
> 
> [opencl_init] device 0: GeForce 210
>  GLOBAL_MEM_SIZE:  1024MB
>  MAX_WORK_GROUP_SIZE:  512
>  MAX_WORK_ITEM_DIMENSIONS: 3
>  MAX_WORK_ITEM_SIZES:  [ 512 512 64 ]
>  DRIVER_VERSION:   340.107
>  DEVICE_VERSION:   OpenCL 1.0 CUDA
> 
> 0.254144 [opencl_init] FINALLY: opencl is AVAILABLE on this system.
> 
> 227,385925 [dev_process_export] pixel pipeline processing took 225,369 secs (165,316 CPU)
> 
> I do not find "blended on _G_PU" in the log. Log attached.
> 
> 
> $ darktable-cli bench.srw bench.srw.xmp bench.jpg --core --disable-opencl -d perf
> 
> 76,490890 [dev_process_export] pixel pipeline processing took 74,577 secs (144,522 CPU)
> 
> BTW:
> GeForce GT 1030: 17,770 secs (26,050 CPU) / 37,168 secs (281,707 CPU),
> AMD FX-8320, 8G RAM
> So with a GT 1030 it is getting about 2 times faster using opencl.
> 

https://en.wikipedia.org/wiki/List_of_Nvidia_graphics_processing_units shows 
that the GeForce 210 has very few processing units and is much slower than 
newer cards (single precision: ~40 GFLOPS for the 210, ~1000 GFLOPS for the 
GT 1030).

But if you ran each test only once, one after the other, the first run will be 
slower, because the image file has to be physically read from disk. Unless you 
fill up the RAM completely, that file will be cached in RAM for future use, so 
the second run of darktable will be faster. Background processes that happen 
to run during a test can skew the results as well (if only through increased 
task switching).
In other words, a single run for each setup is not reliable.
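
One way to get more trustworthy numbers is to repeat each setup several times, throw away the first (cold-cache) run, and compare medians. The following is only a sketch: the helper names `time_command` and `summarize` are made up for illustration, it measures the wall-clock time of the whole `darktable-cli` process rather than the pipeline time from the log, and it does not deal with the output file left behind by a previous run (you may have to delete `bench.jpg` between runs).

```python
import statistics
import subprocess
import time


def time_command(cmd, runs=5, warmup=1, run=subprocess.run, clock=time.perf_counter):
    """Run cmd (warmup + runs) times and return the wall-clock seconds
    of the runs that come after the warm-up phase."""
    timings = []
    for i in range(warmup + runs):
        start = clock()
        run(cmd, check=True)  # raises CalledProcessError if the command fails
        elapsed = clock() - start
        if i >= warmup:  # warm-up runs only serve to fill the file cache
            timings.append(elapsed)
    return timings


def summarize(timings):
    """Median plus spread; a wide min/max gap hints at background load."""
    return {
        "median": statistics.median(timings),
        "min": min(timings),
        "max": max(timings),
    }


# Usage with the commands from this thread (not executed here):
# base = ["darktable-cli", "bench.srw", "bench.srw.xmp", "bench.jpg", "--core"]
# with_opencl = summarize(time_command(base + ["-d", "perf", "-d", "opencl"]))
# without_opencl = summarize(time_command(base + ["--disable-opencl", "-d", "perf"]))
```

If the minimum and maximum of a setup are far apart, something else was competing for the machine and the comparison should be repeated.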

Note how the CPU times for the 210 don't differ all that much between the runs 
with and without OpenCL, compared to the total times.
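
To put numbers on that, the two `[dev_process_export]` lines quoted above can be parsed and compared. This is a small sketch, not part of darktable: the regular expression and the name `parse_pipeline_times` are my own. The log lines are given with their line wrap removed, and the decimal comma is converted to a point.

```python
import re

# Matches e.g. "pixel pipeline processing took 225,369 secs (165,316 CPU)"
PIPELINE_RE = re.compile(
    r"pixel pipeline processing took\s+([\d.,]+)\s+secs\s+\(([\d.,]+)\s+CPU\)"
)


def parse_pipeline_times(line):
    """Return (wall_secs, cpu_secs) from a dev_process_export line, or None."""
    match = PIPELINE_RE.search(line)
    if match is None:
        return None
    wall, cpu = (float(s.replace(",", ".")) for s in match.groups())
    return wall, cpu


# The two GeForce 210 results quoted in this thread.
opencl = parse_pipeline_times(
    "227,385925 [dev_process_export] pixel pipeline processing took 225,369 secs (165,316 CPU)"
)
cpu_only = parse_pipeline_times(
    "76,490890 [dev_process_export] pixel pipeline processing took 74,577 secs (144,522 CPU)"
)

wall_ratio = opencl[0] / cpu_only[0]  # total time, OpenCL vs. no OpenCL
cpu_ratio = opencl[1] / cpu_only[1]   # CPU time, OpenCL vs. no OpenCL
print(f"wall-clock ratio: {wall_ratio:.2f}, CPU-time ratio: {cpu_ratio:.2f}")
```

The total time roughly triples with OpenCL, while the CPU time grows by only about 14%.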

Remco





darktable user mailing list
to unsubscribe send a mail to darktable-user+unsubscr...@lists.darktable.org



Re: [darktable-user] GeForce 210 without opencl faster

2019-12-25 Thread I. Ivanov
I am in the same boat. I am using a GeForce GT 525M/PCIe/SSE2 and it is 
slower than the CPU. My understanding is that OpenCL only pays off if the 
GPU is significantly faster than the CPU, because time is lost when the 
data is copied from RAM to the GPU.
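
That trade-off can be written down as a very simplified model: the GPU wins only if its compute time plus the copy time stays below the CPU time. The function names below are made up for illustration, and the model ignores everything else that happens in a real pipeline (for example modules that fall back to the CPU).

```python
def gpu_pays_off(cpu_secs, gpu_compute_secs, transfer_secs):
    """True if GPU compute plus RAM<->GPU copies beats CPU-only processing."""
    return gpu_compute_secs + transfer_secs < cpu_secs


def break_even_speedup(cpu_secs, transfer_secs):
    """Smallest raw compute speedup (cpu_secs / gpu_compute_secs) at which
    the GPU path breaks even, given the time spent copying data."""
    budget = cpu_secs - transfer_secs  # time left for the GPU to compute in
    if budget <= 0:
        return float("inf")  # copying alone already takes longer than the CPU
    return cpu_secs / budget
```

For example, if the CPU needs 10 seconds and copying costs 5 seconds, `break_even_speedup(10, 5)` says the GPU has to compute more than twice as fast as the CPU before it helps at all.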


You may find this info useful.

https://www.mail-archive.com/darktable-user@lists.darktable.org/msg01087.html

Regards,

B
