On Thu, Apr 24, 2014 at 7:42 AM, Doherty, Peter Charles <[email protected]> wrote:
> What's the main limiting factor for purge speed?
>
> I've got one problem user who has millions of small files.

Only one? Buy some lotto tickets; you have some real good luck ;-p

> Robinhood has been diligently working away at purging the files, but it's
> presently going at about 6 deletes per second, which strikes me as pretty
> slow.
> Adding an index to the last_access column in the ENTRIES table seemed to help
> boost the DB query. (If this seems like good practice, it might be worth
> mentioning in the documentation.)

It depends. Adding another index imposes an extra penalty on inserts.
The cost varies depending on how the index was created (as an
additional column in an existing index or as an entirely new key).
Ideally it would be an additional column in an existing index, so that
it is served by the same lookup.

I am not sure how RBH actually performs file deletions, but as a
general FYI, use rsync. A great write-up _used to_ exist; the Wayback
Machine was able to retrieve it:

https://web.archive.org/web/20130929001850/http://linuxnote.net/jianingy/en/linux/a-fast-way-to-remove-huge-number-of-files.html

As a note for Thomas and the other RBH devs, this may be a faster way
to purge files:

https://gist.github.com/jzwinck/5692534

> Does robinhood batch the unlink operations? Is there anything else I'm
> missing that would explain why the purge is crawling along?

From the output, it appears your bottleneck is not the actual file
deletion operation but rather the database: the GET_INFO_DB stage
(select statements) along with DB_APPLY. I suggest you run

    wget mysqltuner.pl
    perl mysqltuner.pl

and see if you can improve your database performance on the
GET_INFO_DB operation. You may also want to increase the number of
threads performing this action (there are only 7, and none of them are
idle).
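
If it helps when re-checking after tuning, here is a minimal Python sketch that parses the EntryProcessor stage lines quoted in this thread and ranks the stages by the number of waiting entries, with ms/op shown alongside. The names `Stage`, `parse_stages`, `rank_by_wait` and `SAMPLE` are made up for illustration and are not part of robinhood; the regular expression assumes the log layout shown in the quoted stats and may need adjusting for other versions.

```python
import re
from typing import List, NamedTuple


class Stage(NamedTuple):
    index: int
    name: str
    wait: int
    curr: int
    done: int
    total: int
    ms_per_op: float


# Matches lines such as:
#   ... STATS | 1: GET_INFO_DB | 428 | 0 | 8924 | 6890996 | 2.58 |
# (assumed layout, taken from the stats quoted in this thread)
STAGE_RE = re.compile(
    r"STATS\s*\|\s*(\d+):\s*(\w+)\s*\|\s*(\d+)\s*\|\s*(\d+)\s*\|"
    r"\s*(\d+)\s*\|\s*(\d+)\s*\|\s*([\d.]+)\s*\|"
)


def parse_stages(log_text: str) -> List[Stage]:
    """Extract one Stage record per pipeline stage line; other lines are skipped."""
    stages = []
    for line in log_text.splitlines():
        m = STAGE_RE.search(line)
        if m:
            idx, name, wait, curr, done, total, ms = m.groups()
            stages.append(
                Stage(int(idx), name, int(wait), int(curr),
                      int(done), int(total), float(ms))
            )
    return stages


def rank_by_wait(stages: List[Stage]) -> List[Stage]:
    """Stages with the longest wait queue first."""
    return sorted(stages, key=lambda s: s.wait, reverse=True)


# Stage lines copied from the stats quoted in this thread.
SAMPLE = """\
2014/04/24 09:51:18 [24573/1] STATS | Stage             | Wait | Curr | Done |   Total | ms/op |
2014/04/24 09:51:18 [24573/1] STATS | 0: GET_FID        |    0 |    0 |    0 |       0 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS | 1: GET_INFO_DB    |  428 |    0 | 8924 | 6890996 |  2.58 |
2014/04/24 09:51:18 [24573/1] STATS | 2: GET_INFO_FS    |  305 |    7 |   15 | 6349673 | 20.29 |
2014/04/24 09:51:18 [24573/1] STATS | 3: REPORTING      |    0 |    0 |    0 | 6028488 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS | 4: PRE_APPLY      |    0 |    0 |    0 | 6322256 |  0.00 |
2014/04/24 09:51:18 [24573/1] STATS | 5: DB_APPLY       |  280 |    1 |   40 | 6321975 |  5.49 | 53.19% batched (avg batch size: 3.1)
2014/04/24 09:51:18 [24573/1] STATS | 6: CHGLOG_CLR     |    0 |    0 |    0 | 6881424 |  0.01 |
2014/04/24 09:51:18 [24573/1] STATS | 7: RM_OLD_ENTRIES |    0 |    0 |    0 |       0 |  0.00 |
"""

if __name__ == "__main__":
    for s in rank_by_wait(parse_stages(SAMPLE)):
        print(f"{s.name:<15} wait={s.wait:>5} curr={s.curr:>2} "
              f"ms/op={s.ms_per_op:>6.2f}")
```

Sorting by the wait queue is only one way to read the table; the ms/op and Curr columns are printed as well so the per-operation cost of each stage can be compared at the same time.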

>
> 2014/04/24 09:51:18 [24573/1] STATS | ==== EntryProcessor Pipeline Stats ===
> 2014/04/24 09:51:18 [24573/1] STATS | Idle threads: 0
> 2014/04/24 09:51:18 [24573/1] STATS | Id constraints count: 10000 (hash min=0/max=7/avg=1.3)
> 2014/04/24 09:51:18 [24573/1] STATS | Stage             | Wait | Curr | Done |   Total | ms/op |
> 2014/04/24 09:51:18 [24573/1] STATS | 0: GET_FID        |    0 |    0 |    0 |       0 |  0.00 |
> 2014/04/24 09:51:18 [24573/1] STATS | 1: GET_INFO_DB    |  428 |    0 | 8924 | 6890996 |  2.58 |
> 2014/04/24 09:51:18 [24573/1] STATS | 2: GET_INFO_FS    |  305 |    7 |   15 | 6349673 | 20.29 |
> 2014/04/24 09:51:18 [24573/1] STATS | 3: REPORTING      |    0 |    0 |    0 | 6028488 |  0.00 |
> 2014/04/24 09:51:18 [24573/1] STATS | 4: PRE_APPLY      |    0 |    0 |    0 | 6322256 |  0.00 |
> 2014/04/24 09:51:18 [24573/1] STATS | 5: DB_APPLY       |  280 |    1 |   40 | 6321975 |  5.49 | 53.19% batched (avg batch size: 3.1)
> 2014/04/24 09:51:18 [24573/1] STATS | 6: CHGLOG_CLR     |    0 |    0 |    0 | 6881424 |  0.01 |
> 2014/04/24 09:51:18 [24573/1] STATS | 7: RM_OLD_ENTRIES |    0 |    0 |    0 |       0 |  0.00 |

--
Adam Brenner
Computer Science, Undergraduate Student
Donald Bren School of Information and Computer Sciences

System Administrator, HPC Cluster
Office of Information Technology
http://hpc.oit.uci.edu/

University of California, Irvine
www.ics.uci.edu/~aebrenne/
[email protected]

_______________________________________________
robinhood-support mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/robinhood-support
