Hi Stephen,

I just want to add that there is perhaps something else to try: the ITK mechanism that allows the use of a pool of threads:
https://github.com/InsightSoftwareConsortium/ITK/blob/master/Modules/Core/Common/include/itkMultiThreader.h#L210

You can easily test this by setting the environment variable ITK_USE_THREADPOOL (to 'ON' for instance). I have never personally tried this configuration, and I was not able to find much documentation about it so far.

Best regards,
Manuel

2017-05-18 23:52 GMT+02:00 Stephen Woodbridge <[email protected]>:

Hello Remi,

I have never used MPI before. I can run the LSMS smoothing from the CLI. The system has 2 CPU sockets with 14 cores per socket (2 threads per core):

    $ lscpu
    Architecture:          x86_64
    CPU op-mode(s):        32-bit, 64-bit
    Byte Order:            Little Endian
    CPU(s):                56
    On-line CPU(s) list:   0-55
    Thread(s) per core:    2
    Core(s) per socket:    14
    Socket(s):             2
    NUMA node(s):          2
    Vendor ID:             GenuineIntel
    CPU family:            6
    Model:                 79
    Model name:            Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
    Stepping:              1
    CPU MHz:               1497.888
    CPU max MHz:           2600.0000
    CPU min MHz:           1200.0000
    BogoMIPS:              5207.83
    Virtualization:        VT-x
    L1d cache:             32K
    L1i cache:             32K
    L2 cache:              256K
    L3 cache:              35840K
    NUMA node0 CPU(s):     0-13,28-41
    NUMA node1 CPU(s):     14-27,42-55

So I launch something like:

    mpirun -n 4 --bind-to socket otbcli_MeanShiftSmoothing -in maur_rgb.png -fout smooth.tif -foutpos position.tif -spatialr 16 -ranger 16 -thres 0.1 -maxiter 100

So if I understand correctly, this launches 4 copies of the application, but how do they know which instance is working on what? Is that just the magic of MPI?

-Steve

On Thursday, May 18, 2017 at 12:07:52 PM UTC-4, remicres wrote:

Hello Stephen,

I am really interested in your results. A few years ago I failed to get good benchmarks of OTB apps (that is, good scalability of CPU usage) on the same kind of machine as yours. The speedup collapsed near 10-30 CPUs (depending on the app). I suspected fine tuning to be the cause, and I did not have the time to persevere.
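One low-effort experiment in that direction is to cap the thread count and pin the whole process to a single socket. This is a hypothetical invocation, not an OTB-documented recipe — adjust the thread count and node numbers to your own topology (ITK reads ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS to cap its thread count):

```shell
# Hypothetical experiment: cap ITK's thread count and pin the process to
# socket 0, so every worker thread shares the same L3 cache and NUMA node.
export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=14   # physical cores on one socket
numactl --cpunodebind=0 --membind=0 \
  otbcli_MeanShiftSmoothing -in input.tif -fout smooth.tif \
  -spatialr 16 -ranger 16 -thres 0.1 -maxiter 100
```

If per-core efficiency recovers when confined to one socket, that points at NUMA/cache effects rather than the algorithm itself.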
This bad speedup might be related to thread placement and cache issues: the current framework scales well across CPUs when processing images in a shared-memory context, particularly when the threads are all on the same CPU socket. Depending on the algorithm used, I suspect one might also need to fine-tune the environment settings.

Could you provide the number of sockets of your machine (with the number of CPUs on each one)?

If this machine has many sockets, one quick workaround to get a good speedup could be to use the MPI support and force the binding of MPI processes to sockets (e.g. with OpenMPI: "mpirun -n <number of sockets of your machine> --bind-to socket ..."). However, I am not sure how to use it from Python.

Keep us updated!

Rémi

On Wednesday, May 17, 2017 at 21:34:48 UTC+2, Stephen Woodbridge wrote:

I started watching this with htop, and all the CPUs are getting action. There is a pattern where the number of threads spikes from about 162 up to 215 and the number of running threads spikes to about 50 for a few seconds; then the number of running threads drops to 2 for 5-10 seconds, and the pattern repeats. I think the parent thread is spinning up a bunch of workers; they finish, then the parent thread cycles through each of the finished workers collecting the results and presumably writes them to disk or something. If it is writing to disk, there could be a huge potential performance improvement in writing the output to memory — if enough memory is available, which is clearly the case on this machine — and then flushing the memory to disk. The current process is only using 3 GB of memory when it has 100 GB available to it and the system has 120 GB.

On Wednesday, May 17, 2017 at 12:13:04 PM UTC-4, Stephen Woodbridge wrote:

Hi, first I want to say that the LSMS Segmentation is very cool and works nicely.
I recently got access to a server with 56 cores and 128 GB of memory, but I can't seem to get it to use more than 10-15 cores. I'm running the smoothing on an image approximately 20000x20000 in size. The image is a GDAL VRT file that combines 8 DOQQ images into a mosaic. It has 4 bands (R, G, B, IR), each with Mask Flags: PER_DATASET (see below). I'm running this from a Python script like:

    import otbApplication

    def smoothing(fin, fout, foutpos, spatialr, ranger, rangeramp, thres, maxiter, ram):
        app = otbApplication.Registry.CreateApplication('MeanShiftSmoothing')
        app.SetParameterString('in', fin)
        app.SetParameterString('fout', fout)
        app.SetParameterString('foutpos', foutpos)
        app.SetParameterInt('spatialr', spatialr)
        app.SetParameterFloat('ranger', ranger)
        app.SetParameterFloat('rangeramp', rangeramp)
        app.SetParameterFloat('thres', thres)
        app.SetParameterInt('maxiter', maxiter)
        app.SetParameterInt('ram', ram)
        app.SetParameterInt('modesearch', 0)
        app.ExecuteAndWriteOutput()

Where:

    spatialr:  24
    ranger:    36
    rangeramp: 0
    thres:     0.1
    maxiter:   100
    ram:       102400

Any thoughts on how I can get this to utilize more of the processing power of this machine?
-Steve

    woodbri@optane28:/u/ror/buildings/tmp$ otbcli_ReadImageInfo -in tmp-23081-areaofinterest.vrt
    2017 May 17 15:36:04 : Application.logger (INFO)
    Image general information:
      Number of bands : 4
      No data flags : Not found
      Start index : [0,0]
      Size : [19933,19763]
      Origin : [-118.442,34.0035]
      Spacing : [9.83578e-06,-9.83578e-06]
      Estimated ground spacing (in meters): [0.90856,1.09369]

    Image acquisition information:
      Sensor :
      Image identification number:
      Image projection : GEOGCS["WGS 84",
          DATUM["WGS_1984",
              SPHEROID["WGS 84",6378137,298.257223563,
                  AUTHORITY["EPSG","7030"]],
              AUTHORITY["EPSG","6326"]],
          PRIMEM["Greenwich",0],
          UNIT["degree",0.0174532925199433],
          AUTHORITY["EPSG","4326"]]

    Image default RGB composition:
      [R, G, B] = [0,1,2]

    Ground control points information:
      Number of GCPs = 0
      GCPs projection =

    Output parameters value:
      indexx: 0
      indexy: 0
      sizex: 19933
      sizey: 19763
      spacingx: 9.835776837e-06
      spacingy: -9.835776837e-06
      originx: -118.4418488
      originy: 34.00345612
      estimatedgroundspacingx: 0.9085595012
      estimatedgroundspacingy: 1.093693733
      numberbands: 4
      sensor:
      id:
      time:
      ullat: 0
      ullon: 0
      urlat: 0
      urlon: 0
      lrlat: 0
      lrlon: 0
      lllat: 0
      lllon: 0
      town:
      country:
      rgb.r: 0
      rgb.g: 1
      rgb.b: 2
      projectionref: GEOGCS["WGS 84",
          DATUM["WGS_1984",
              SPHEROID["WGS 84",6378137,298.257223563,
                  AUTHORITY["EPSG","7030"]],
              AUTHORITY["EPSG","6326"]],
          PRIMEM["Greenwich",0],
          UNIT["degree",0.0174532925199433],
          AUTHORITY["EPSG","4326"]]
      keyword:
      gcp.count: 0
      gcp.proj:
      gcp.ids:
      gcp.info:
      gcp.imcoord:
      gcp.geocoord:

    woodbri@optane28:/u/ror/buildings/tmp$ gdalinfo tmp-23081-areaofinterest.vrt
    Driver: VRT/Virtual Raster
    Files: tmp-23081-areaofinterest.vrt
           /u/ror/buildings/tmp/tmp-23081-areaofinterest.vrt.vrt
    Size is 19933, 19763
    Coordinate System is:
    GEOGCS["WGS 84",
        DATUM["WGS_1984",
            SPHEROID["WGS 84",6378137,298.257223563,
                AUTHORITY["EPSG","7030"]],
            AUTHORITY["EPSG","6326"]],
        PRIMEM["Greenwich",0],
        UNIT["degree",0.0174532925199433],
        AUTHORITY["EPSG","4326"]]
    Origin = (-118.441851318576212,34.003461706049677)
    Pixel Size = (0.000009835776490,-0.000009835776490)
    Corner Coordinates:
    Upper Left  (-118.4418513,  34.0034617) (118d26'30.66"W, 34d 0'12.46"N)
    Lower Left  (-118.4418513,  33.8090773) (118d26'30.66"W, 33d48'32.68"N)
    Upper Right (-118.2457948,  34.0034617) (118d14'44.86"W, 34d 0'12.46"N)
    Lower Right (-118.2457948,  33.8090773) (118d14'44.86"W, 33d48'32.68"N)
    Center      (-118.3438231,  33.9062695) (118d20'37.76"W, 33d54'22.57"N)
    Band 1 Block=128x128 Type=Byte, ColorInterp=Red
      Mask Flags: PER_DATASET
    Band 2 Block=128x128 Type=Byte, ColorInterp=Green
      Mask Flags: PER_DATASET
    Band 3 Block=128x128 Type=Byte, ColorInterp=Blue
      Mask Flags: PER_DATASET
    Band 4 Block=128x128 Type=Byte, ColorInterp=Gray
      Mask Flags: PER_DATASET
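P.S. Stephen, about your earlier question on how the four mpirun copies know which instance is working on what: there is no magic. MPI gives every process the total process count and its own rank (0 to n-1), and the application derives its share of the work from that rank, typically a contiguous block of output rows. A toy illustration of that kind of split in plain Python (the function name and the row-block scheme are only for illustration, not OTB's actual partitioning):

```python
def rows_for_rank(total_rows, nranks, rank):
    """Contiguous block of rows a given MPI rank would process under a
    simple row-block split (illustrative only, not OTB's real strategy)."""
    base, extra = divmod(total_rows, nranks)
    # the first `extra` ranks each take one additional row
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop

# your 19763-row image across the 4 processes of "mpirun -n 4":
for rank in range(4):
    print(rank, rows_for_rank(19763, 4, rank))
# 0 (0, 4941)
# 1 (4941, 9882)
# 2 (9882, 14823)
# 3 (14823, 19763)
```

Every process runs the same program; the rank is the only thing that differs between them, and "--bind-to socket" additionally keeps each process's threads on a single socket.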
--
Manuel Grizonnet

--
Check the OTB FAQ at http://www.orfeo-toolbox.org/FAQ.html

You received this message because you are subscribed to the Google Groups "otb-users" group.
To post to this group, send email to [email protected]
To unsubscribe from this group, send email to [email protected]
For more options, visit this group at http://groups.google.com/group/otb-users?hl=en
---
You received this message because you are subscribed to the Google Groups "otb-users" group.
To unsubscribe from this group and stop receiving emails from it, send an email to [email protected].
For more options, visit https://groups.google.com/d/optout.
