Re: [CIL users] CIL perfomance issues
On Mon, Jun 08, 2009 at 10:20:54AM +0200, Gabriel Kerneis wrote:
> Still looking for a simple way to handle this. Does anybody have an
> idea?

I just sent a patch upstream to handle this. I'll let you know if it gets accepted.

--
Gabriel Kerneis

___
CIL-users mailing list
CIL-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/cil-users
Re: [CIL users] CIL perfomance issues
On Mon, Jun 08, 2009 at 09:53:26AM +0200, Gabriel Kerneis wrote:
> On Thu, May 28, 2009 at 10:30:11AM +0200, Christoph Spiel wrote:
> > I have been using the following Tailor configuration to mirror the
> > CIL repository for quite a while now. It has been working
> > flawlessly so far.
>
> I'm using it now. But beware, the latest version of Tailor ignores
> external references, and CIL has one (ocamlutil). Unless you track it
> in a separate repository, you should add: ignore-externals = False

Oops, this didn't quite work: it does update the external ocamlutil, but it still doesn't record the related changes (since svn log doesn't cover external repositories).

Still looking for a simple way to handle this. Does anybody have an idea?

Regards,
--
Gabriel Kerneis
Re: [CIL users] CIL perfomance issues
On Thu, May 28, 2009 at 10:30:11AM +0200, Christoph Spiel wrote:
> I have been using the following Tailor
> configuration to mirror the CIL repository for
> quite a while now. It has been working
> flawlessly so far.

I'm using it now. But beware, the latest version of Tailor ignores external references, and CIL has one (ocamlutil). Unless you track it in a separate repository, you should add:

> [DEFAULT]
> projects = cil
> root-directory = /site/mirror/repositories
>
> [cil]
> source = svn:cil
> start-revision = 10140
> state-file = cil.tailor.state
> target = hg:cil
> filter-badchars = True
>
> [svn:cil]
> module = /trunk/cil
> repository = svn://hal.cs.berkeley.edu/home/svn/projects
> subdir = cil-svn
ignore-externals = False
> [hg:cil]
> subdir = cil-hg

It took me some time to figure this one out (including the *mandatory* capital F in False).

Regards,
--
Gabriel Kerneis
Re: [CIL users] CIL perfomance issues
Hi everybody,

well, it turns out the insane amount of memory used was the fault of curl macros; believe it or not, after preprocessing, I got a line (among others) of over 15000 characters. Did I mention the word insane before?

Anyway, the patches I wrote for CIL are still worth applying: even though they did not improve the timing, they cut allocated memory by half. Who would like to waste memory, and time GCing it? I'll clean things up a bit and send them in the next few days.

Regards,
--
Gabriel Kerneis
Re: [CIL users] CIL perfomance issues
Hi,

On Thu, May 28, 2009 at 10:22:08AM +0200, Christoph Spiel wrote:
> On Tue, May 26, 2009 at 07:02:16PM +0200, Gabriel Kerneis wrote:
> > I've been suffering performance issues with CIL recently. 30% of the
> > time was spent in garbage collection.
>
> (1) Recompile the run-time environment of OCaml
> with the best known combination of optimization
> flags. This may speed up the garbage collector
> a little bit.

This might not be enough for me, see below.

> (2) Drastically increase the heap size. (Of
> course this action is limited by the amount of
> memory in the target machines.)

This reduces the number of collections A LOT but doesn't really cut the time down, because there are too many words allocated.

> (3) Rewrite the analysis to be less functional.
> I had a functional implementation of my analysis
> that used a lot of [Buffer]s for the output:
> many of them and with a large total size.
> Rewriting it to a more procedural style which
> meant immediate output of the string data to the
> results file dramatically reduced the load on
> the GC, increased the performance a lot, and at
> the same time reduced the program's memory
> footprint.

Well, most of my trouble comes from a single visitor which has to go through my program many times. Sadly, I'm not analysing anything, but rather transforming the source code. I'm as imperative as can be (everything is mutated in place), but cil.ml is full of partial applications. Hence my patches.
Just to give you an idea of the figures involved:

Memory statistics: total=1173.63Mb, max=3.56Mb, minor=1173.54Mb, major=125.06Mb, promoted=124.97Mb
    minor collections=8953 major collections=129 compactions=0

With your tune_garbage_collector(), I got:

Memory statistics: total=1173.51Mb, max=34.54Mb, minor=1173.42Mb, major=16.48Mb, promoted=16.39Mb
    minor collections=323 major collections=18 compactions=3

Which looks like a big win, but the time spent allocating the 1.2Gb of minor words kills everything else (moreover I'm on 64bit, so it's 2.4Gb in fact).

Thanks anyway for your advice,
--
Gabriel Kerneis
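The remark about 64bit can be checked with a few lines of OCaml. This is only a sketch: it assumes the Mb figures above were derived from word counts using 4-byte words, as the thread's comment on ocamlutil/stats.ml suggests, and the names bytes_per_word and words_to_mb are made up for the illustration.

```ocaml
(* Sketch: convert a word count, as reported by the Gc module, to megabytes.
   [bytes_per_word] and [words_to_mb] are illustrative names, not part of
   CIL or ocamlutil. *)

(* Sys.word_size is given in bits: 32 or 64 depending on the platform. *)
let bytes_per_word = Sys.word_size / 8

(* Megabytes occupied by [words] machine words of [word_bytes] bytes each.
   By default the word size of the running platform is used. *)
let words_to_mb ?(word_bytes = bytes_per_word) words =
  words *. float_of_int word_bytes /. (1024.0 *. 1024.0)

let () =
  let s = Gc.quick_stat () in
  Printf.printf "minor words so far: %.0f (%.2f Mb with %d-byte words)\n"
    s.Gc.minor_words (words_to_mb s.Gc.minor_words) bytes_per_word
```

With 8-byte words the same word count yields exactly twice the megabytes obtained with 4-byte words, which is where the doubling from 1.2Gb to 2.4Gb comes from.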
Re: [CIL users] CIL perfomance issues
On Tue, May 26, 2009 at 10:45:30PM +0200, Gabriel Kerneis wrote:
> I consider switching to Tailor the day I get fed up to manually tracking it.

I have been using the following Tailor configuration to mirror the CIL repository for quite a while now. It has been working flawlessly so far.

[DEFAULT]
projects = cil
root-directory = /site/mirror/repositories

[cil]
source = svn:cil
start-revision = 10140
state-file = cil.tailor.state
target = hg:cil
filter-badchars = True

[svn:cil]
module = /trunk/cil
repository = svn://hal.cs.berkeley.edu/home/svn/projects
subdir = cil-svn

[hg:cil]
subdir = cil-hg

Note that you may have to grab the latest version of Tailor from http://progetti.arstecnica.it/tailor if you use a newer release of your favourite (target) SCM system (this happened to me with Mercurial v1.2).

/Chris
Re: [CIL users] CIL perfomance issues
Gabriel -

On Tue, May 26, 2009 at 07:02:16PM +0200, Gabriel Kerneis wrote:
> I've been suffering performance issues with CIL recently. 30% of the
> time was spent in garbage collection.

A while ago I faced similar problems. At that time, I tried three different approaches to speed up the analysis. From least effective to most effective:

(1) Recompile the run-time environment of OCaml with the best known combination of optimization flags. This may speed up the garbage collector a little bit. The improvement was noticeable in my case but insufficient.

(2) Drastically increase the heap size. (Of course this action is limited by the amount of memory in the target machines.) I was lucky and the time spent in the GC dropped significantly after increasing the heap size. In the documentation of [tune_garbage_collector] I have collected some of the scarce information on tuning the OCaml garbage collector that I found on the web.

    (** Tune the garbage collector's parameters for the extremely large
        datasets we usually cope with.

        Defaults:
        - [minor_heap_size]: 32Kwords
        - [major_heap_increment]: 62Kwords
        - [space_overhead]: 80%
        - [max_overhead]: 500%

        Tuning:
        - Increasing [minor_heap_size] will reduce the time spent in
          both the minor GC and the major GC.  It is often (but not
          always) preferable to keep it small enough to fit in the
          cache of the machine.
        - Increasing [major_heap_increment] reduces the number of times
          that [add_to_heap] is called.
        - Increasing [space_overhead] will reduce the time spent in the
          major GC.

        Inspect the GC statistics at the end of the program's run.  The
        most important figure is the ratio of [promoted_words] to
        [minor_words].  This should be as small as possible.  If it is
        more than 10%, the program spends too much time in the GC (both
        minor and major).  Increasing [minor_heap_size] often helps in
        this case.  *)
    let tune_garbage_collector () =
      let gc = Gc.get () in
      Gc.set
        { gc with
          Gc.minor_heap_size = 64 * gc.Gc.minor_heap_size;
          Gc.major_heap_increment = 64 * gc.Gc.major_heap_increment;
          Gc.space_overhead = 2 * gc.Gc.space_overhead;
          Gc.max_overhead = 2 * gc.Gc.max_overhead;
          Gc.verbose = 0 (* useful value: 0x01d *) }

(3) Rewrite the analysis to be less functional. Ouch! I had a functional implementation of my analysis that used a lot of [Buffer]s for the output: many of them and with a large total size. Rewriting it in a more procedural style, which meant writing the string data to the results file immediately, dramatically reduced the load on the GC, increased the performance a lot, and at the same time reduced the program's memory footprint.

HTH,
Chris
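The advice to inspect the ratio of [promoted_words] to [minor_words] at the end of a run could be followed with something like the sketch below. It uses only the standard Gc module; the names promotion_ratio and report_gc are hypothetical and not part of CIL, and the 10% threshold is simply the rule of thumb quoted in the comment above.

```ocaml
(* Sketch: report the promoted/minor ratio at program exit.
   [promotion_ratio] and [report_gc] are illustrative names. *)

(* Ratio of promoted words to minor words; 0 when nothing was allocated. *)
let promotion_ratio ~promoted_words ~minor_words =
  if minor_words <= 0.0 then 0.0 else promoted_words /. minor_words

(* Print the ratio for the current process, flagging values above 10%. *)
let report_gc () =
  let s = Gc.quick_stat () in
  let r =
    promotion_ratio
      ~promoted_words:s.Gc.promoted_words
      ~minor_words:s.Gc.minor_words
  in
  Printf.printf "minor=%.0f words, promoted=%.0f words, ratio=%.1f%%%s\n"
    s.Gc.minor_words s.Gc.promoted_words (100.0 *. r)
    (if r > 0.10 then " (above the 10% rule of thumb)" else "")

(* Run the report when the program terminates. *)
let () = at_exit report_gc
```

Applied to the figures reported elsewhere in this thread, the ratio is about 10.6% before tuning (124.97Mb promoted out of 1173.54Mb minor) and about 1.4% after (16.39Mb out of 1173.42Mb).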
Re: [CIL users] CIL perfomance issues
Hi,

> I'm curious, how do you track the upstream svn changes using darcs? I
> find that git interoperates with svn very well, so I'm using git to
> manage my local copy of cil.

Manually, since there are few updates. Something along these lines:
http://weblog.masukomi.org/2007/5/23/using-darcs-with-svn-cvs-flow-chart

I'll consider switching to Tailor the day I get fed up with tracking it manually.
http://wiki.darcs.net/DarcsWiki/Tailor

darcs get http://www.pps.jussieu.fr/~kerneis/software/repos/cil/svn

is the upstream version, btw. But darcs is not ideally suited for such a task: even with darcs2, some "merges" take up to a minute on a recent computer. I don't know if git is better though. Darcs is the preferred tool in my lab, so I stick with it as long as it works for me.

>> I still have an awful lot of major collections

Please forgive and forget my wanderings about major collections. Only the minor ones are significant in this case (and they were reduced dramatically by my patches). Moreover, I did not have that many major collections...

Regards,
--
Gabriel Kerneis
Re: [CIL users] CIL perfomance issues
I'm curious, how do you track the upstream svn changes using darcs? I find that git interoperates with svn very well, so I'm using git to manage my local copy of cil.

On Tue, May 26, 2009 at 1:02 PM, Gabriel Kerneis wrote:
> Hi,
>
> I've been suffering performance issues with CIL recently. 30% of the
> time was spent in garbage collection.
>
> [Not sure if this figure is significant since I still spend 30% of my
> time garbage-collecting, but it's a shorter time now ;-) ]
>
> The issue basically boils down to a lot of partial evaluations, pushing
> the GC under heavy-load. I'm currently patching cil.ml extensively
> (well, the visitor part at least) and reduced the number of allocated
> words by half on my benchmarks.
>
> [By the way, ocamlutil/stats.ml is broken on 64bit architectures, the
> figures returned by printM in Stats.print should by doubled. I'll send a
> patch for this one too.]
>
> I still have an awful lot of major collections but I think I'm on the
> good way since there are far less allocations.
>
> I'll provide a unified patch on this list soon but curious people could
> look at the latest patches (using darcs) here:
>
> darcs get http://www.pps.jussieu.fr/~kerneis/software/repos/cil/patched
>
> More information on the general issue and work-arounds here:
> http://ocaml.janestreet.com/?q=node/30
>
> Regards,
> --
> Gabriel Kerneis
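To illustrate the kind of allocation the quoted message attributes to partial applications, here is a toy sketch. It is not taken from cil.ml or from the patches; the tree type and every function name are invented for the example, and the exact word counts depend on the compiler version and optimization settings.

```ocaml
(* Sketch: the same tree rewrite with and without a partial application.
   None of this is CIL code; all names are illustrative. *)

type tree = Node of int * tree list

(* [scale_partial k] is a partial application: a closure is typically
   allocated for it at every node before List.map is called. *)
let rec scale_partial k (Node (v, kids)) =
  Node (k * v, List.map (scale_partial k) kids)

(* Same result with explicit recursion over the children, so no closure
   has to be built for the partially applied function. *)
let rec scale_direct k (Node (v, kids)) =
  Node (k * v, scale_kids k kids)

and scale_kids k = function
  | [] -> []
  | kid :: rest ->
      let kid' = scale_direct k kid in
      kid' :: scale_kids k rest

(* Complete binary tree of the given depth, used as test input. *)
let rec build depth =
  if depth = 0 then Node (1, [])
  else Node (depth, [ build (depth - 1); build (depth - 1) ])

(* Minor-heap words allocated while running [f]. *)
let minor_words_of f =
  let before = (Gc.quick_stat ()).Gc.minor_words in
  ignore (f ());
  (Gc.quick_stat ()).Gc.minor_words -. before

let () =
  let t = build 12 in
  let partial = minor_words_of (fun () -> scale_partial 3 t) in
  let direct = minor_words_of (fun () -> scale_direct 3 t) in
  Printf.printf "partial application: %.0f minor words\n" partial;
  Printf.printf "explicit recursion:  %.0f minor words\n" direct
```

Both functions return structurally equal trees; only the amount of short-lived allocation is expected to differ, which is the effect a visitor traversing a program many times would multiply.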