I think this should be the parallel version of the code given in the link.
Sorry, I don't have the resources right now to check the validity of the
code; my laptop is fried and I'm working from a shared terminal.

Just add the -fopenmp flag to the makefile for the code (this line is funny,
it reads like a packaged-food ad).
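
Something like this should do it when building with GCC -- only a sketch,
since the variable names the GRASS Makefiles actually use may differ:

    # enable OpenMP in both the compile and the link step
    CFLAGS  += -fopenmp
    LDFLAGS += -fopenmp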

I have a small problem understanding how the vect variable is being
declared and allocated in the functions *tcholInv* and *tcholSolveInv*.
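
My guess (and what I have assumed in the code below) is that the two
allocation calls there are meant to be assigned to those pointers, roughly:

    double **T = NULL;
    double *vect = NULL;

    T = G_alloc_matrix(n, BW);   /* banded triangular factor, n x BW */
    vect = G_alloc_vector(n);    /* scratch vector of length n */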


On Fri, Jun 19, 2009 at 4:40 PM, Hamish <[email protected]> wrote:

>
> ok, I had a chance to look using a 1km^2 block of LIDAR data I had around.
> (0.5M pts, uses ~ 750mb RAM on 64bit Linux)
>
> I've never used the v.lidar tools much (I usually just do something
> simple with r.in.xyz) so I can't offer too much expert help. But...
>
> [half of what follows is aimed at developers]
>
>
> John:
> > I have recently joined the user group and I'm working on LiDAR data.
> > I am running GRASS GIS 6.4.0svn (the one that came out last week) on
> > an Intel quad core 6600 with 2gb RAM.
> > OS = XP Pro
> >
> > I have separated my data into 1000m x 1000m tiles which have around
> > 1.5 million points per tile.
> >
> > I use the -tbzr on v.in.ascii and v.build after for topology.
> > I have also utilised the sqlite db.
>
> AFAICT topology is not required for v.lidar modules.
> If you only use 3D x,y,z data then database is not needed and will just
> take up space and memory. (v.in.ascii -z)
>
>
> > After running v.outlier on first and last returns, the time come for
> > edge detection.
> >
> > I run this module (v.lidar.edgedetection) and it works (only using
> > 25% of cpu as per usual (guessing there is no quad cpu support = shame))
>
> Nope, awaiting a volunteer -- could be a really nice threading project
> for someone.
>
>  - the v.lidar code is reasonably straight forward & compartmentalized
>  - experimental support for Pthreads is already added to r.mapcalc in
>   grass 7 as a template, but this could be an even better example as
>   r.mapcalc is typically hard drive-limited not CPU-limited.
>  - IIUC there is an OpenMP version of i.atcorr by Yann Chemin which
>   may provide a OpenMP example.
>  - not really sure if Pthreads or OpenMP is more appropriate here
>
>
> tcholDec() could probably be split up into multiple threads at the
> for(i;;) stage. (??)
>
> https://trac.osgeo.org/grass/browser/grass/trunk/vector/lidar/lidarlib/TcholBand.c#L6
>
> see also http://grass.osgeo.org/wiki/OpenMP
>
>
>
> > However, I have not finished a process yet as it writes at about
> > 1kb per second, using 1% of cpu.
>
> I can only assume that is due to page swapping. Running without a DB
> and with no topology may help to save quite a bit of memory/time there.
>
> no idea why it would drop to 1% of CPU if there is no page swapping
> going on.
>
> > As it is writing the file (was at 6Mb after 3hrs), the mem usage is
> > on 114Mb with 1.4Gb free at the moment. It is not using the PageFile.
>
> .... is that at 100%/4(25%) or 1% CPU?
>
>
> > Is it supposed to be this slow? is there a bug? The tile as .txt/.las
> > is c.47Mb.
>
> the vector engine doesn't scale as well as the raster engine (which can
> easily handle gigabytes of data), but it should handle 47mb with ease.
>
>
> > Does the edge file equate to a similar size, e.g. will it
> > take about 47000 seconds to write the file? Is there a rough percentage
> > of the size of edge file compared to original txt file? I notice it is
> > also writing to the sqlite.db at the same time.
>
> no idea about file sizes. In 6.5svn and 7svn I've just added some
> verbose messages (add --verbose) and some percent done indicators
> so it will seem like less of a black box now.
>
> It turns out that ~95% of the computational time is spent in the 3-deep
> nested for loops of the Tcholetsky decomposition function.
> (lidarlib/tcholDec())
>
> for my data it ran 16 loops of:
> {
>  1. Bilinear interpolation
>  2. Bicubic interpolation
>  3. Point classification (fast)
> }
>
> maybe these loops are another thing that could be sent out to different
> threads? (not sure if these are multiple passes or independent segments)
> Better to concentrate on the library level first? i.e. lidarlib/tcholDec()
> or is tcholDec() just written in a very inefficient way?
>
>
> also you can turn on some debug messages to follow the action:
> g.gisenv set="DEBUG=1"
>
>
> > Have I done something wrong?
>
> one thing to check: did you run 'g.region vect=lidar_points res=2 -ap'
> before starting? v.lidar.edgedetection seems to use the current
> computational region settings for something.
>
>
> > The book and manual says it needs topology for lidar tools to work.
> > Does it need the database?
>
> I don't think it needs either. Whereabouts does it say that?
>
> > v.build and thought a db could help.
>
> .... I think it just adds overheads to the process.
>
>
> Hamish
>


-- 
JYOTHISH SOMAN
MS-2008-CS
Center for Security, Theory And Algorithm Research (CSTAR)
International Institute of Information Technology
Hyderabad
India
Phone:+91-9966222626
http://www.iiit.ac.in/
--------------------------------------------------------------
The reasonable man adapts himself to the world; the unreasonable one
persists in trying to adapt the world to himself. Therefore, all progress
depends on the unreasonable man.
   George Bernard Shaw
--------------------------------------------------------
#include <stdlib.h>             /* imported libraries */
#include <stdio.h>
#include <math.h>
#include <grass/gis.h>
#include <grass/glocale.h>      /* for the _() translation macro */
#include <grass/PolimiFunct.h>
#include <omp.h>                /* Change here */

/*--------------------------------------------------------------------------------------*/
/* Tcholetsky decomposition -> T = Lower Triangular Matrix */

void tcholDec(double **N, double **T, int n, int BW)
{
    int i, j, k;
    double somma;

    omp_set_num_threads(omp_get_num_procs());   /* Change here */
    G_debug(3, "tcholDec(): n=%d  BW=%d", n, BW);

    /* Change here: row i depends on the previous BW-1 rows of T, so the
     * outer loop stays serial.  The diagonal entry T[i][0] is computed
     * first, because the rest of the row divides by it, and only the
     * remaining columns of the row are shared out to the threads
     * (worthwhile only if BW is reasonably large). */
    for (i = 0; i < n; i++) {
        G_percent(i, n, 2);

        somma = N[i][0];
        for (k = 1; k < BW; k++)
            if ((i - k) >= 0)
                somma -= T[i - k][k] * T[i - k][k];
        if (somma <= 0.0)
            G_fatal_error(_("Decomposition failed"));
        T[i][0] = sqrt(somma);

#pragma omp parallel for private(k, somma)
        for (j = 1; j < BW; j++) {
            somma = N[i][j];
            for (k = 1; k < BW; k++)
                if (((i - k) >= 0) && ((j + k) < BW))
                    somma -= T[i - k][k] * T[i - k][j + k];
            T[i][j] = somma / T[i][0];
        }
    }

    G_percent(n, n, 2);
    return;
}

/*--------------------------------------------------------------------------------------*/
/* Tcholetsky matrix solution */

void tcholSolve(double **N, double *TN, double *parVect, int n, int BW)
{
    double **T;
    int i, j;

        /*--------------------------------------*/
    T = G_alloc_matrix(n, BW);

        /*--------------------------------------*/
    tcholDec(N, T, n, BW);      /* T computation (parallelised above)  */

    /* Forward substitution -- parVect[i] needs parVect[j] for j < i,
     * so this loop is left serial (it is cheap compared to tcholDec()) */
    parVect[0] = TN[0] / T[0][0];
    for (i = 1; i < n; i++) {
        parVect[i] = TN[i];
        for (j = 0; j < i; j++)
            if ((i - j) < BW)
                parVect[i] -= T[j][i - j] * parVect[j];
        parVect[i] = parVect[i] / T[i][0];
    }

    /* Backward substitution -- same dependency, also left serial */
    parVect[n - 1] = parVect[n - 1] / T[n - 1][0];
    for (i = n - 2; i >= 0; i--) {
        for (j = i + 1; j < n; j++)
            if ((j - i) < BW)
                parVect[i] -= T[i][j - i] * parVect[j];
        parVect[i] = parVect[i] / T[i][0];
    }

        /*--------------------------------------*/
    G_free_matrix(T);

    return;
}

/*--------------------------------------------------------------------------------------*/
/* Solution with Tcholetsky -> the triangular matrix T is passed in as a
   parameter instead of being computed inside the procedure ->
   T = dmatrix(0, n-1, 0, BW-1) */

void tcholSolve2(double **N, double *TN, double **T, double *parVect, int n,
                 int BW)
{
    int i, j;

    /* Forward substitution -- parVect[i] needs parVect[j] for j < i,
     * so this loop is left serial */
    parVect[0] = TN[0] / T[0][0];
    for (i = 1; i < n; i++) {
        parVect[i] = TN[i];
        for (j = 0; j < i; j++)
            if ((i - j) < BW)
                parVect[i] -= T[j][i - j] * parVect[j];
        parVect[i] = parVect[i] / T[i][0];
    }

    /* Backward substitution -- same dependency, also left serial */
    parVect[n - 1] = parVect[n - 1] / T[n - 1][0];
    for (i = n - 2; i >= 0; i--) {
        for (j = i + 1; j < n; j++)
            if ((j - i) < BW)
                parVect[i] -= T[i][j - i] * parVect[j];
        parVect[i] = parVect[i] / T[i][0];
    }

    return;
}

/*--------------------------------------------------------------------------------------*/
/* Tcholetsky matrix inversion */

void tcholInv(double **N, double *invNdiag, int n, int BW)
{
    double **T = NULL;
    double *vect = NULL;
    int i, j, k;
    double somma;

    omp_set_num_threads(omp_get_num_procs());   /* Change here */

        /*--------------------------------------*/
    T = G_alloc_matrix(n, BW);  /* assign the allocations (see my question above) */
    vect = G_alloc_vector(n);

    /* T computation                */
    tcholDec(N, T, n, BW);

    /* T diagonal inversion -- iterations are independent, safe to parallelise */
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        T[i][0] = 1.0 / T[i][0];
    }

    /* N diagonal inversion -- left serial: every iteration re-uses the
     * shared scratch buffer vect[], which would have to be made private
     * per thread before this loop could be parallelised */
    for (i = 0; i < n; i++) {
        vect[0] = T[i][0];
        invNdiag[i] = vect[0] * vect[0];
        for (j = i + 1; j < n; j++) {
            somma = 0.0;
            for (k = i; k < j; k++) {
                if ((j - k) < BW)
                    somma -= vect[k - i] * T[k][j - k];
            }
            vect[j - i] = somma * T[j][0];
            invNdiag[i] += vect[j - i] * vect[j - i];
        }
    }

        /*--------------------------------------*/
    G_free_matrix(T);
    G_free_vector(vect);

    return;
}

/*--------------------------------------------------------------------------------------*/
/* Tcholetsky matrix solution and inversion */

void tcholSolveInv(double **N, double *TN, double *invNdiag, double *parVect,
                   int n, int BW)
{
    double **T = NULL;
    double *vect = NULL;
    int i, j, k;
    double somma;

    omp_set_num_threads(omp_get_num_procs());   /* Change here */

        /*--------------------------------------*/
    T = G_alloc_matrix(n, BW);  /* assign the allocations (see my question above) */
    vect = G_alloc_vector(n);

    /* T computation                */
    tcholDec(N, T, n, BW);

    /* Forward substitution -- parVect[i] needs parVect[j] for j < i,
     * so this loop is left serial */
    parVect[0] = TN[0] / T[0][0];
    for (i = 1; i < n; i++) {
        parVect[i] = TN[i];
        for (j = 0; j < i; j++)
            if ((i - j) < BW)
                parVect[i] -= T[j][i - j] * parVect[j];
        parVect[i] = parVect[i] / T[i][0];
    }

    /* Backward substitution -- same dependency, also left serial */
    parVect[n - 1] = parVect[n - 1] / T[n - 1][0];
    for (i = n - 2; i >= 0; i--) {
        for (j = i + 1; j < n; j++)
            if ((j - i) < BW)
                parVect[i] -= T[i][j - i] * parVect[j];
        parVect[i] = parVect[i] / T[i][0];
    }

    /* T diagonal inversion -- iterations are independent, safe to parallelise */
#pragma omp parallel for
    for (i = 0; i < n; i++) {
        T[i][0] = 1.0 / T[i][0];
    }

    /* N diagonal inversion -- left serial: every iteration re-uses the
     * shared scratch buffer vect[] */
    for (i = 0; i < n; i++) {
        vect[0] = T[i][0];
        invNdiag[i] = vect[0] * vect[0];
        for (j = i + 1; j < n; j++) {
            somma = 0.0;
            for (k = i; k < j; k++) {
                if ((j - k) < BW)
                    somma -= vect[k - i] * T[k][j - k];
            }
            vect[j - i] = somma * T[j][0];
            invNdiag[i] += vect[j - i] * vect[j - i];
        }
    }

        /*--------------------------------------*/
    G_free_matrix(T);
    G_free_vector(vect);

    return;
}
_______________________________________________
grass-dev mailing list
[email protected]
http://lists.osgeo.org/mailman/listinfo/grass-dev
