Hi Remi,

Thanks, I have this more or less working. I have not set the env variable 
ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS, but I will try that; without it I seem 
to be getting about 4 times that many threads running.
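
Just to be sure I have the sequence right, I assume it is something like the 
following (bash; 14 threads per socket as you suggested, paths shortened):

  # limit ITK to one socket's worth of threads in each MPI process
  export ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS=14
  mpirun -np 4 --bind-to socket otbcli_MeanShiftSmoothing \
      -in m_3311805_se_11_1_20140513.tif \
      -fout test1-smooth.tif -foutpos test1-smoothpos.tif \
      -spatialr 24 -ranger 36 -ram 102400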

Below are various problems I've run into. Some of these might be code bugs, 
or config issues, or who knows what :)

I get an error with --bind-to socket

mpirun -np 4 --bind-to socket otbcli_MeanShiftSmoothing -in /u/ror/buildings
/data/naip/doqqs/2014/33118/m_3311805_se_11_1_20140513.tif -fout /u/ror/
buildings/tmp/test1-smooth.tif -foutpos /u/ror/buildings/tmp/test1-smoothpos
.tif -spatialr 24 -ranger 36 -ram 102400
Unexpected end of /proc/mounts line `overlay / overlay 
rw,seclabel,relatime,lowerdir=/var/lib/docker/overlay2/l/JPC7E5F4RB77LOK22ETL5FMEPN:/var/lib/docker/overlay2/l/DM3Q73J52BCAIEZVAQZGAMXLCX:/var/lib/docker/overlay2/l/WC5LQTPG4RBGOUEZ7KBJZLUB2R:/var/lib/docker/overlay2/l/BESSO2WOBICH2P4GSVX7VSCGG6:/var/lib/docker/overlay2/l/FMSJDZMFK67RHOIIZOLKOICAHI:/var/lib/docker/overlay2/l/U7AFHXIVI6KAKUO2VJMZWLQOHH:/var/lib/docker/overlay2/l/EIRHWP2GOK3F2PH7SHY4FK6J6P,upperdir=/var/lib/docker/overlay2/73d138b0a2dadf534a9d9c7d2ed894484515bfe3d2f1807a2b8'
--------------------------------------------------------------------------
WARNING: Open MPI tried to bind a process but failed.  This is a
warning only; your job will continue, though performance may
be degraded.

  Local host:        optane30
  Application name:  /usr/bin/otbcli_MeanShiftSmoothing
  Error message:     failed to bind memory
  Location:          odls_default_module.c:639

--------------------------------------------------------------------------
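
I am guessing the memory-binding warning is just the Docker overlay 
environment getting in the way. Next time I will add --report-bindings so 
mpirun prints where each process actually ended up (same command, one extra 
flag):

  mpirun -np 4 --bind-to socket --report-bindings \
      otbcli_MeanShiftSmoothing -in m_3311805_se_11_1_20140513.tif \
      -fout test1-smooth.tif -foutpos test1-smoothpos.tif \
      -spatialr 24 -ranger 36 -ram 102400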


The job runs to completion despite the warning. When I try to run 
otbcli_LSMSVectorization under mpi, it fails; the same command runs fine 
without mpi. If this command isn't meant to run under mpi, you might want to 
add a check and report that to the user, or just disable mpi internally.

mpirun -np 4 --bind-to socket otbcli_LSMSVectorization -in /u/ror/buildings/
tmp/test1-smooth.tif -inseg /u/ror/buildings/tmp/test1-segs.tif -out /u/ror/
buildings/tmp/test1-segments.shp -tilesizex 1025 -tilesizey 1025
Unexpected end of /proc/mounts line `overlay / overlay 
rw,seclabel,relatime,lowerdir=/var/lib/docker/overlay2/l/JPC7E5F4RB77LOK22ETL5FMEPN:/var/lib/docker/overlay2/l/DM3Q73J52BCAIEZVAQZGAMXLCX:/var/lib/docker/overlay2/l/WC5LQTPG4RBGOUEZ7KBJZLUB2R:/var/lib/docker/overlay2/l/BESSO2WOBICH2P4GSVX7VSCGG6:/var/lib/docker/overlay2/l/FMSJDZMFK67RHOIIZOLKOICAHI:/var/lib/docker/overlay2/l/U7AFHXIVI6KAKUO2VJMZWLQOHH:/var/lib/docker/overlay2/l/EIRHWP2GOK3F2PH7SHY4FK6J6P,upperdir=/var/lib/docker/overlay2/73d138b0a2dadf534a9d9c7d2ed894484515bfe3d2f1807a2b8'
--------------------------------------------------------------------------
WARNING: Open MPI tried to bind a process but failed.  This is a
warning only; your job will continue, though performance may
be degraded.

  Local host:        optane30
  Application name:  /usr/bin/otbcli_LSMSVectorization
  Error message:     failed to bind memory
  Location:          odls_default_module.c:639

--------------------------------------------------------------------------
2017 May 22 16:21:20  :  Application.logger  (CRITICAL) Invalid image 
filename /u/ror/buildings/tmp/test1-segs.tif.
2017 May 22 16:21:20  :  Application.logger  (CRITICAL) Invalid image 
filename /u/ror/buildings/tmp/test1-segs.tif.
2017 May 22 16:21:20  :  Application.logger  (FATAL) The following error 
occurred during application execution : 
/build/otb-KxFZzD/otb-5.4.0+dfsg/Modules/Wrappers/ApplicationEngine/include/otbWrapperInputImageParameter.txx:76:
itk::ERROR: InputImageParameter(0x560f0bbf6ae0): No input image or filename 
detected...
2017 May 22 16:21:20  :  Application.logger  (FATAL) The following error 
occurred during application execution : 
/build/otb-KxFZzD/otb-5.4.0+dfsg/Modules/Wrappers/ApplicationEngine/include/otbWrapperInputImageParameter.txx:76:
itk::ERROR: InputImageParameter(0x55b6d1af6ae0): No input image or filename 
detected...
2017 May 22 16:21:20  :  Application.logger  (CRITICAL) Invalid image 
filename /u/ror/buildings/tmp/test1-segs.tif.
2017 May 22 16:21:20  :  Application.logger  (CRITICAL) Invalid image 
filename /u/ror/buildings/tmp/test1-segs.tif.
2017 May 22 16:21:20  :  Application.logger  (FATAL) The following error 
occurred during application execution : 
/build/otb-KxFZzD/otb-5.4.0+dfsg/Modules/Wrappers/ApplicationEngine/include/otbWrapperInputImageParameter.txx:76:
itk::ERROR: InputImageParameter(0x55f536fbaae0): No input image or filename 
detected...
2017 May 22 16:21:20  :  Application.logger  (FATAL) The following error 
occurred during application execution : 
/build/otb-KxFZzD/otb-5.4.0+dfsg/Modules/Wrappers/ApplicationEngine/include/otbWrapperInputImageParameter.txx:76:
itk::ERROR: InputImageParameter(0x562285e5dae0): No input image or filename 
detected...
-------------------------------------------------------
Primary job  terminated normally, but 1 process returned
a non-zero exit code.. Per user-direction, the job has been aborted.
-------------------------------------------------------
--------------------------------------------------------------------------
mpirun detected that one or more processes exited with non-zero status, 
thus causing
the job to be terminated. The first process to do so was:

  Process name: [[64617,1],0]
  Exit code:    1
--------------------------------------------------------------------------
[optane30.softlayer.com:32749] 3 more processes have sent help message 
help-orte-odls-default.txt / memory not bound
[optane30.softlayer.com:32749] Set MCA parameter "orte_base_help_aggregate" 
to 0 to see all help / error messages
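
For now I will just keep the smoothing step under mpirun and run the 
vectorization as a plain process from my driver script, roughly like this 
(paths shortened; the segmentation step that produces test1-segs.tif is 
omitted):

  # smoothing scales across sockets, so run it under MPI
  mpirun -np 4 --bind-to socket otbcli_MeanShiftSmoothing \
      -in m_3311805_se_11_1_20140513.tif \
      -fout test1-smooth.tif -foutpos test1-smoothpos.tif \
      -spatialr 24 -ranger 36 -ram 102400
  # ... segmentation step producing test1-segs.tif ...
  # vectorization as a single, non-MPI process
  otbcli_LSMSVectorization -in test1-smooth.tif -inseg test1-segs.tif \
      -out test1-segments.shp -tilesizex 1025 -tilesizey 1025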



I also timed otbcli_LSMSVectorization with and without mpi, and under mpi it 
actually took longer:

$ time otbcli_LSMSVectorization -in /u/ror/buildings/tmp/test1-smooth.tif 
-inseg 
/u/ror/buildings/tmp/test1-segs.tif -out /u/ror/buildings/tmp/test1-segments
.shp -tilesizex 1025 -tilesizey 1025
2017 May 22 18:23:43  :  Application.logger  (INFO) Number of tiles: 8 x 7
2017 May 22 18:23:45  :  Application.logger  (INFO) Vectorization ...
2017 May 22 18:24:21  :  Application.logger  (INFO) Merging polygons across 
tiles ...
2017 May 22 18:29:01  :  Application.logger  (INFO) Elapsed time: 380.383 
seconds

real    5m18.121s
user    6m17.994s
sys     0m2.704s

$ time mpirun -np 4 --bind-to socket otbcli_LSMSVectorization -in /u/ror/
buildings/tmp/test1-smooth.tif -inseg /u/ror/buildings/tmp/test1-segs.tif -
out /u/ror/buildings/tmp/test1-segments-mpi.shp -tilesizex 1025 -tilesizey 
1025
Unexpected end of /proc/mounts line `overlay
 / overlay 
rw,seclabel,relatime,lowerdir=/var/lib/docker/overlay2/l/JPC7E5F4RB77LOK22ETL5FMEPN:/var/lib/docker/overlay2/l/DM3Q73J52BCAIEZVAQZGAMXLCX:/var/lib/docker/overlay2/l/WC5LQTPG4RBGOUEZ7KBJZLUB2R:/var/lib/docker/overlay2/l/BESSO2WOBICH2P4GSVX7VSCGG6:/var/lib/docker/overlay2/l/FMSJDZMFK67RHOIIZOLKOICAHI:/var/lib/docker/overlay2/l/U7AFHXIVI6KAKUO2VJMZWLQOHH:/var/lib/docker/overlay2/l/EIRHWP2GOK3F2PH7SHY4FK6J6P,upperdir=/var/lib/docker/overlay2/73d138b0a2dadf534a9d9c7d2ed894484515bfe3d2f1807a2b8'
--------------------------------------------------------------------------
WARNING: Open MPI tried to bind a process but failed.  This is a
warning only; your job will continue, though performance may
be degraded.

  Local host:        optane30
  Application name:  /usr/bin/otbcli_LSMSVectorization
  Error message:     failed to bind memory
  Location:          odls_default_module.c:639

--------------------------------------------------------------------------
2017 May 22 18:30:46  :  Application.logger  (INFO) Number of tiles: 8 x 7
2017 May 22 18:30:46  :  Application.logger  (INFO) Number of tiles: 8 x 7
2017 May 22 18:30:46  :  Application.logger  (INFO) Number of tiles: 8 x 7
2017 May 22 18:30:46  :  Application.logger  (INFO) Number of tiles: 8 x 7
2017 May 22 18:30:47  :  Application.logger  (INFO) Vectorization ...
2017 May 22 18:30:47  :  Application.logger  (INFO) Vectorization ...
2017 May 22 18:30:47  :  Application.logger  (INFO) Vectorization ...
2017 May 22 18:30:47  :  Application.logger  (INFO) Vectorization ...
[optane30.softlayer.com:14449] 3 more processes have sent help message 
help-orte-odls-default.txt / memory not bound
[optane30.softlayer.com:14449] Set MCA parameter "orte_base_help_aggregate" 
to 0 to see all help / error messages
2017 May 22 18:31:22  :  Application.logger  (INFO) Merging polygons across 
tiles ...
2017 May 22 18:31:22  :  Application.logger  (INFO) Merging polygons across 
tiles ...
2017 May 22 18:31:22  :  Application.logger  (INFO) Merging polygons across 
tiles ...
2017 May 22 18:31:22  :  Application.logger  (INFO) Merging polygons across 
tiles ...
2017 May 22 18:37:17  :  Application.logger  (INFO) Elapsed time: 403.66 
seconds
2017 May 22 18:37:20  :  Application.logger  (INFO) Elapsed time: 406.363 
seconds
2017 May 22 18:37:22  :  Application.logger  (INFO) Elapsed time: 409.377 
seconds
2017 May 22 18:38:22  :  Application.logger  (INFO) Elapsed time: 468.458 
seconds

real    7m36.737s
user    27m59.170s
sys     0m9.361s
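
The user time (about 28 minutes against 7.5 minutes real) makes me suspect 
each mpi process is spinning up a full machine's worth of threads rather than 
one socket's worth, which would match the "4 times as many threads" I 
mentioned above. I will check the thread counts while it runs with something 
like:

  # NLWP = number of threads in each otbcli process
  ps -C otbcli_LSMSVectorization -o pid,nlwp,args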


All in, this is awesome software and I love it. I'll be putting more time 
into learning more of its features.

Thanks,
  -Steve


On Sunday, May 21, 2017 at 7:08:12 AM UTC-4, remicres wrote:
>
> Hello Stephen,
>
> To let the mpi magic happen, you must compile otb with the option 
> OTB_USE_MPI=ON. 
> You may also set the option OTB_USE_SPTW=ON, which enables writing 
> .tif files in parallel.
> After this, you set ITK_GLOBAL_DEFAULT_NUMBER_OF_THREADS to 14 (which 
> is the number of threads per socket), then you deploy the app over 4 mpi 
> processes (each one bound to a socket):
> mpirun -n 4 --bind-to socket otbcli_MeanShiftSmoothing -...
>
> I just realize that we need more material about mpi on the wiki / cookbook 
> / blog, I will take care of this soon...
>
> Le jeudi 18 mai 2017 23:52:23 UTC+2, Stephen Woodbridge a écrit :
>>
>> Hello Remi,
>>
>> I have never used MPI before. I can run the LSMS Smooth from the cli. The 
>> system has 4 cpu sockets with 14 core per socket:
>>
>> $ lscpu
>> Architecture:          x86_64
>> CPU op-mode(s):        32-bit, 64-bit
>> Byte Order:            Little Endian
>> CPU(s):                56
>> On-line CPU(s) list:   0-55
>> Thread(s) per core:    2
>> Core(s) per socket:    14
>> Socket(s):             2
>> NUMA node(s):          2
>> Vendor ID:             GenuineIntel
>> CPU family:            6
>> Model:                 79
>> Model name:            Intel(R) Xeon(R) CPU E5-2690 v4 @ 2.60GHz
>> Stepping:              1
>> CPU MHz:               1497.888
>> CPU max MHz:           2600.0000
>> CPU min MHz:           1200.0000
>> BogoMIPS:              5207.83
>> Virtualization:        VT-x
>> L1d cache:             32K
>> L1i cache:             32K
>> L2 cache:              256K
>> L3 cache:              35840K
>> NUMA node0 CPU(s):     0-13,28-41
>> NUMA node1 CPU(s):     14-27,42-55
>>
>> So I launch something like:
>>
>> mpirun -n 4 --bind-to socket otbcli_MeanShiftSmoothing -in maur_rgb.png 
>> -fout smooth.tif -foutpos position.tif -spatialr 16 -ranger 16 -thres 0.1 
>> -maxiter 100
>>
>> So if I understand this launches 4 copies of the application, but how do 
>> they know which instance is working on what? Is that just the magic of MPI?
>>
>> -Steve
>>
>> On Thursday, May 18, 2017 at 12:07:52 PM UTC-4, remicres wrote:
>>>
>>> Hello Stephen,
>>> I am really interested in your results. 
>>> A few years ago I failed to get good benchmarks of otb apps (that is, good 
>>> scalability with cpu usage) on the same kind of machine as yours. 
>>> The speedup collapsed near 10-30 cpus (depending on the app). I 
>>> suspected fine tuning to be the cause, and I did not have the time to 
>>> persevere. This bad speedup might be related to thread placement or cache 
>>> issues: the current framework scales well across cpus when processing 
>>> images in a shared-memory context, particularly when the threads are on the 
>>> same socket. Depending on the algorithm used, I suspect one might also need 
>>> to fine tune the environment settings. 
>>> Could you provide the number of sockets on your machine (with the 
>>> number of cpus for each one)?
>>>
>>> If this machine has many sockets, one quick workaround to get a good 
>>> speedup could be to use the MPI support and force the binding of mpi 
>>> processes to the sockets (e.g. with openmpi: "mpirun -n <nb of sockets of 
>>> your machine> --bind-to socket ..."). However, I am not sure how to use it 
>>> from python.
>>>
>>> Keep us updated!
>>>
>>> Rémi
>>>
>>> Le mercredi 17 mai 2017 21:34:48 UTC+2, Stephen Woodbridge a écrit :
>>>>
>>>> I started watching this with htop and all the cpus are getting action. 
>>>> There is a pattern where the total number of threads spikes from about 162 
>>>> up to 215 and the number of running threads spikes to about 50 for a few 
>>>> seconds, then the number of running threads drops to 2 for 5-10 seconds, 
>>>> and the pattern repeats. I think the parent thread is spinning up a bunch 
>>>> of workers, they finish, and then the parent thread cycles through each of 
>>>> the finished workers collecting the results and presumably writes them to 
>>>> disk or something. If it is writing to disk, there could be a huge 
>>>> potential performance improvement in writing the output to memory (which 
>>>> is clearly available on this machine) and then flushing the memory to 
>>>> disk. The current process is only using 3 GB of memory when it has 100 GB 
>>>> available to it and the system has 120 GB.
>>>>
>>>> On Wednesday, May 17, 2017 at 12:13:04 PM UTC-4, Stephen Woodbridge 
>>>> wrote:
>>>>>
>>>>> Hi, first I want to say the LSMS Segmentation is very cool and works 
>>>>> nicely. I recently got access to a server with 56 cores and 128 GB of 
>>>>> memory, but I can't seem to get it to use more than 10-15 cores. I'm 
>>>>> running the smoothing on an image approx 20000x20000 in size. The image 
>>>>> is a gdal VRT file that combines 8 DOQQ images into a mosaic. It has 4 
>>>>> bands (R, G, B, IR), each with Mask Flags: PER_DATASET (see below). I'm 
>>>>> running this from a Python script like:
>>>>>
>>>>> import otbApplication
>>>>>
>>>>> def smoothing(fin, fout, foutpos, spatialr, ranger, rangeramp, thres,
>>>>>               maxiter, ram):
>>>>>     app = otbApplication.Registry.CreateApplication('MeanShiftSmoothing')
>>>>>     app.SetParameterString('in', fin)
>>>>>     app.SetParameterString('fout', fout)
>>>>>     app.SetParameterString('foutpos', foutpos)
>>>>>     app.SetParameterInt('spatialr', spatialr)
>>>>>     app.SetParameterFloat('ranger', ranger)
>>>>>     app.SetParameterFloat('rangeramp', rangeramp)
>>>>>     app.SetParameterFloat('thres', thres)
>>>>>     app.SetParameterInt('maxiter', maxiter)
>>>>>     app.SetParameterInt('ram', ram)
>>>>>     app.SetParameterInt('modesearch', 0)
>>>>>     app.ExecuteAndWriteOutput()
>>>>>
>>>>> Where:
>>>>> spatialr: 24
>>>>> ranger: 36
>>>>> rangeramp: 0
>>>>> thres: 0.1
>>>>> maxiter: 100
>>>>> ram: 102400
>>>>>
>>>>> Any thoughts on how I can get this to utilize more of the processing 
>>>>> power of this machine?
>>>>>
>>>>> -Steve
>>>>>
>>>>> woodbri@optane28:/u/ror/buildings/tmp$ otbcli_ReadImageInfo -in tmp-
>>>>> 23081-areaofinterest.vrt
>>>>> 2017 May 17 15:36:04  :  Application.logger  (INFO)
>>>>> Image general information:
>>>>>         Number of bands : 4
>>>>>         No data flags : Not found
>>>>>         Start index :  [0,0]
>>>>>         Size :  [19933,19763]
>>>>>         Origin :  [-118.442,34.0035]
>>>>>         Spacing :  [9.83578e-06,-9.83578e-06]
>>>>>         Estimated ground spacing (in meters): [0.90856,1.09369]
>>>>>
>>>>> Image acquisition information:
>>>>>         Sensor :
>>>>>         Image identification number:
>>>>>         Image projection : GEOGCS["WGS 84",
>>>>>     DATUM["WGS_1984",
>>>>>         SPHEROID["WGS 84",6378137,298.257223563,
>>>>>             AUTHORITY["EPSG","7030"]],
>>>>>         AUTHORITY["EPSG","6326"]],
>>>>>     PRIMEM["Greenwich",0],
>>>>>     UNIT["degree",0.0174532925199433],
>>>>>     AUTHORITY["EPSG","4326"]]
>>>>>
>>>>> Image default RGB composition:
>>>>>         [R, G, B] = [0,1,2]
>>>>>
>>>>> Ground control points information:
>>>>>         Number of GCPs = 0
>>>>>         GCPs projection =
>>>>>
>>>>> Output parameters value:
>>>>> indexx: 0
>>>>> indexy: 0
>>>>> sizex: 19933
>>>>> sizey: 19763
>>>>> spacingx: 9.835776837e-06
>>>>> spacingy: -9.835776837e-06
>>>>> originx: -118.4418488
>>>>> originy: 34.00345612
>>>>> estimatedgroundspacingx: 0.9085595012
>>>>> estimatedgroundspacingy: 1.093693733
>>>>> numberbands: 4
>>>>> sensor:
>>>>> id:
>>>>> time:
>>>>> ullat: 0
>>>>> ullon: 0
>>>>> urlat: 0
>>>>> urlon: 0
>>>>> lrlat: 0
>>>>> lrlon: 0
>>>>> lllat: 0
>>>>> lllon: 0
>>>>> town:
>>>>> country:
>>>>> rgb.r: 0
>>>>> rgb.g: 1
>>>>> rgb.b: 2
>>>>> projectionref: GEOGCS["WGS 84",
>>>>>     DATUM["WGS_1984",
>>>>>         SPHEROID["WGS 84",6378137,298.257223563,
>>>>>             AUTHORITY["EPSG","7030"]],
>>>>>         AUTHORITY["EPSG","6326"]],
>>>>>     PRIMEM["Greenwich",0],
>>>>>     UNIT["degree",0.0174532925199433],
>>>>>     AUTHORITY["EPSG","4326"]]
>>>>> keyword:
>>>>> gcp.count: 0
>>>>> gcp.proj:
>>>>> gcp.ids:
>>>>> gcp.info:
>>>>> gcp.imcoord:
>>>>> gcp.geocoord:
>>>>>
>>>>> woodbri@optane28:/u/ror/buildings/tmp$ gdalinfo tmp-23081-
>>>>> areaofinterest.vrt
>>>>> Driver: VRT/Virtual Raster
>>>>> Files: tmp-23081-areaofinterest.vrt
>>>>>        /u/ror/buildings/tmp/tmp-23081-areaofinterest.vrt.vrt
>>>>> Size is 19933, 19763
>>>>> Coordinate System is:
>>>>> GEOGCS["WGS 84",
>>>>>     DATUM["WGS_1984",
>>>>>         SPHEROID["WGS 84",6378137,298.257223563,
>>>>>             AUTHORITY["EPSG","7030"]],
>>>>>         AUTHORITY["EPSG","6326"]],
>>>>>     PRIMEM["Greenwich",0],
>>>>>     UNIT["degree",0.0174532925199433],
>>>>>     AUTHORITY["EPSG","4326"]]
>>>>> Origin = (-118.441851318576212,34.003461706049677)
>>>>> Pixel Size = (0.000009835776490,-0.000009835776490)
>>>>> Corner Coordinates:
>>>>> Upper Left  (-118.4418513,  34.0034617) (118d26'30.66"W, 34d 0'12.46
>>>>> "N)
>>>>> Lower Left  (-118.4418513,  33.8090773) (118d26'30.66"W, 33d48
>>>>> '32.68"N)
>>>>> Upper Right (-118.2457948,  34.0034617) (118d14'44.86"W, 34d 0'12.46"N
>>>>> )
>>>>> Lower Right (-118.2457948,  33.8090773) (118d14'44.86"W, 33d48'32.68
>>>>> "N)
>>>>> Center      (-118.3438231,  33.9062695) (118d20'37.76"W, 33d54
>>>>> '22.57"N)
>>>>> Band 1 Block=128x128 Type=Byte, ColorInterp=Red
>>>>>   Mask Flags: PER_DATASET
>>>>> Band 2 Block=128x128 Type=Byte, ColorInterp=Green
>>>>>   Mask Flags: PER_DATASET
>>>>> Band 3 Block=128x128 Type=Byte, ColorInterp=Blue
>>>>>   Mask Flags: PER_DATASET
>>>>> Band 4 Block=128x128 Type=Byte, ColorInterp=Gray
>>>>>   Mask Flags: PER_DATASET
>>>>>
>>>>>
>>>>>
>>>>>
