On Tue, Jun 24, 2014 at 10:31 PM, Shigio YAMAGUCHI <[email protected]> wrote:
>> Having the prefix is not mandated. It allows you to define
>> adding/removing tags for files without depending on stat(). We have
>
> It means that files with no prefix need stat()?

Yes.

>> ~61500 files in the repository around 1000 files change per day (rough
>> estimate). We know exactly the changes to the file system
>> (added/modified/deleted) and hence can generate a file with the
>> prefixes. This will avoid calling stat() on those ~1000 files.
>>
>> Since the build daemon builds for different branches, the effect
>> multiplies. This was one of the features requested by the team that
>> owns the build to help reduce the overall build. Hence, I decided to
>> look at the GNU global code and hack. I always wanted to hack on this
>> code, I like using it and now want to enhance it.
>
> By the prefixes, what percent average does the execution time of
> gtags decrease? If possible, would you please show me the data?
>

Running over NFSv3.

Input files (4):
[1142]$ wc -l prefix.files
4 prefix.files
[1143]$ wc -l no-prefix.files
4 no-prefix.files

Original without prefix:
[1138]$ time for ii in `cat no-prefix.files` ; do /usr/software/bin/gtags --single-update $ii; done
real 0m27.290s
user 0m5.768s
sys 0m4.357s

Modified without prefix:
[1139]$ time for ii in `cat no-prefix.files` ; do gtags --single-update $ii; done
real 0m40.861s
user 0m6.149s
sys 0m4.705s
=> Degraded because each argument is now checked to see whether it is a
   source file (issourcefile) or a list of files. This can be fixed by
   having a separate flag.

Modified batch operation without prefix:
[1140]$ time gtags --single-update no-prefix.files
real 0m7.145s
user 0m1.438s
sys 0m1.229s

Modified batch with prefix:
[1141]$ time gtags --single-update prefix.files
real 0m7.081s
user 0m1.496s
sys 0m1.129s  <-- reduction in time by avoiding stat() (not significant,
                  though, due to file system caching)

=> There is a visible benefit in batch processing of files.

I ran gtags under valgrind and found that 'strtol()', reached via calls
to 'atoi()', is one of the biggest contributors to the performance
overhead. I am looking at storing the integer itself in the DB and
fetching it directly, instead of storing the integer as char and having
to convert it back to get the fid.
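
A minimal sketch of the fid idea above, not taken from the GNU GLOBAL
sources. The function names (fid_from_text, fid_pack, fid_unpack) and
the 4-byte big-endian layout are assumptions made for illustration
only. It contrasts parsing a decimal string on every lookup with
reading a fixed-width integer straight out of the stored record.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical helpers for illustration; none of these exist in GNU GLOBAL. */

/* Text form: the fid is stored as decimal characters, so every
 * lookup pays for a string-to-integer conversion (atoi -> strtol). */
int fid_from_text(const char *text)
{
    return atoi(text);
}

/* Binary form (assumed layout): write the fid as 4 bytes, most
 * significant byte first, so the record does not depend on the
 * byte order of the host that wrote it. */
void fid_pack(uint32_t fid, unsigned char out[4])
{
    out[0] = (unsigned char)(fid >> 24);
    out[1] = (unsigned char)(fid >> 16);
    out[2] = (unsigned char)(fid >> 8);
    out[3] = (unsigned char)fid;
}

/* Reading the fid back is a few shifts and ORs, with no parsing. */
uint32_t fid_unpack(const unsigned char in[4])
{
    return ((uint32_t)in[0] << 24) |
           ((uint32_t)in[1] << 16) |
           ((uint32_t)in[2] << 8)  |
            (uint32_t)in[3];
}
```

A fixed byte order is chosen here so that a tag database written on one
machine could still be read on another. Whether that matters for the
real DB format is not something the measurements above settle.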

That will be a separate patch. Wish this was under git... (I will try
to import it into git)

with best regards,
dhruva

_______________________________________________
Bug-global mailing list
[email protected]
https://lists.gnu.org/mailman/listinfo/bug-global
