This is very encouraging, and also very instructive (your post on snapshot is in this area too; I have a queue of things to try for you on Friday).
I'll try adding your ideas to my script and see if it squeezes out some more. Then I guess we need to decide which parts go into your minimum build steps and which belong in an external script (or possibly a HeadlessImageCleaner class we keep loaded so it's easier to maintain?).

Tim

> On 16 Aug 2017, at 08:52, Guillermo Polito <guillermopol...@gmail.com> wrote:
>
> This means it would be healthy to do a cleanup (at least the non aggressive
> one, ChangeSets and MC stuff) on each of the images we produce and not just
> the latest one.
>
>> On Wed, Aug 16, 2017 at 8:35 AM, Tim Mackinnon <tim@testit.works> wrote:
>> Yes you were on to something there (and at the same time, by poking around
>> with #pointersTo I noticed some chains of objects too). So I ran the
>> following script (partially borrowed from ImageCleaner) and this has got me
>> down to a 14mb image (instance sizes listed below, which is looking much
>> healthier - and those MethodChangeRecords are gone too) !!!
>>
>> I suspect there are more Monticello/Metacello things that are still lurking
>> around.
>>
>> I also wonder if I need some of the character sorting strings too.
>>
>> Tim
>>
>> "CmdLine script to debug the initial minimal image"
>>
>> | logger repo version |
>>
>> logger := FileStream stderr.
>> logger cr; nextPutAll: 'Starting Minimal Cleanup Script...'.
>>
>> logger cr; nextPutAll: '>Resetting Class Comments'.
>> Smalltalk allClasses do: [ :c | c classComment: '' stamp: '' ].
>>
>> logger cr; nextPutAll: '>Removing MC holders'.
>> MCMethodDefinition allInstances do: [:each | each become: String new ].
>> MCClassDefinition allInstances do: [:each | each become: String new ].
>> MCVersionInfo allInstances do: [:each | each become: String new ].
>>
>> logger cr; nextPutAll: '>ImageCleaner release routines'.
>> Smalltalk organization removeEmptyCategories.
>> Smalltalk
>>     allClassesAndTraitsDo: [ :class |
>>         [ :each |
>>             each
>>                 removeEmptyCategories;
>>                 sortCategories ]
>>             value: class organization;
>>             value: class class organization ].
>>
>> (RPackageOrganizer default packages select: #isEmpty)
>>     do: #unregister.
>>
>> Smalltalk organization sortCategories.
>> Smalltalk garbageCollect.
>> Smalltalk cleanOutUndeclared.
>> Smalltalk fixObsoleteReferences.
>> Smalltalk cleanUp: true except: #() confirming: false.
>>
>> logger cr; nextPutAll: '>GC'.
>> 3 timesRepeat: [
>>     Smalltalk garbageCollect.
>>     Smalltalk cleanOutUndeclared.
>>     Smalltalk fixObsoleteReferences ].
>>
>> logger cr; nextPutAll: 'Finished Script.'; cr; cr.
>>
>> My top instances are now:
>>
>> Class               code space  # instances  inst space  percent  inst average size
>> CompiledMethod           19159        30481     2912968    21.60              95.57
>> Array                     3742        36495     2852448    21.10              78.16
>> ByteString                2640        24018     2517168    18.60             104.80
>> ByteSymbol                1698        20722      759208     5.60              36.64
>> Association               1148        19786      633152     4.70              32.00
>> IdentitySet                408        15452      494464     3.70              32.00
>> MethodDictionary          3310         3520      350192     2.60              99.49
>> Protocol                  1679         8382      268224     2.00              32.00
>> WeakArray                 1758          265      232304     1.70             876.62
>> OrderedCollection         6555         5043      201720     1.50              40.00
>> ClassOrganization         5281         3520      168960     1.30              48.00
>> Metaclass                 7184         1748      153824     1.10              88.00
>>
>>> On 15 Aug 2017, at 23:00, Guillermo Polito <guillermopol...@gmail.com>
>>> wrote:
>>>
>>> Just a hunch: could you inspect your MethodChangeRecord instances?
>>>
>>> On Tue, 15 Aug 2017 at 23:55, Tim Mackinnon <tim@testit.works> wrote:
>>>> A weird observation - is it possible that source code is being stored in
>>>> the image as strings somehow? When I do
>>>>
>>>> ./pharo PharoLambda.image eval "ByteString allInstances inject:
>>>> (OrderedCollection new) into: [:r :i | i size > 500 ifTrue: [r add: i].
>>>> r]"
>>>>
>>>> I seem to see reams of what looks like method source - but I thought source
>>>> code was stored in the .sources file and the .changes file (and I haven’t
>>>> been bundling those in my deployed image).
>>>>
>>>> I’m trying to figure out how you find references to a string object, to
>>>> chase down what is pointing to these strings as maybe there is a quick 4mb
>>>> win by simply nil’ing out some obvious things. (This doesn’t of course
>>>> help with a default minimal image - but maybe a few tricks for packaging
>>>> and deploying something).
>>>>
>>>> Tim
>>>>
>>>>> On 15 Aug 2017, at 22:26, Tim Mackinnon <tim@testit.works> wrote:
>>>>>
>>>>> Hi Guille/Ben - I got a quick moment to try the SpaceTally (aside: it
>>>>> seems very convoluted to load a single package into the image, I was
>>>>> trying to avoid having to create a baselineOf for something so simple - I
>>>>> ended up with:
>>>>>
>>>>> repo := MCFileTreeRepository new directory: './bootstrap' asFileReference.
>>>>> version := repo loadVersionFromFileNamed: 'Tool-Profilers.package'.
>>>>> version load.
>>>>>
>>>>> Anyway - in my minimal image, like in the fat image, there seems to be a
>>>>> surprising amount of bytestrings (4mb worth?). I think that might need
>>>>> some digging into? It seems like a lot somehow. Although Ben’s neat
>>>>> experiment of zipping strings shows that’s not a real route.
>>>>>
>>>>> In a deployed minimal image - maybe I can get rid of some other things
>>>>> like MethodChangeRecords or MCMethodDefinitions (they are smaller
>>>>> wins - but noticeable)
>>>>>
>>>>> Class               code space  # instances  inst space  percent  inst average size
>>>>> ByteString                2640        37365     4823848    21.50             129.10
>>>>> Array                     3742        53002     3961944    17.60              74.75
>>>>> CompiledMethod           19159        30481     2912968    13.00              95.57
>>>>> Association               1148        58348     1867136     8.30              32.00
>>>>> MethodChangeRecord         431        34312     1097984     4.90              32.00
>>>>> ByteArray                 4605          290      908728     4.00            3133.54
>>>>> ByteSymbol                1698        22689      840168     3.70              37.03
>>>>> IdentitySet                408        19076      610432     2.70              32.00
>>>>> MethodDictionary          3310         3520      608688     2.70             172.92
>>>>> WeakArray                 1758         3024      597824     2.70             197.69
>>>>> MCMethodDefinition        4318         6659      426176     1.90              64.00
>>>>> Protocol                  1679         8382      268224     1.20              32.00
>>>>> OrderedCollection         6555         5509      220360     1.00              40.00
>>>>>
>>>>> As an aside - my Gitlab project is public, the scripts that load things
>>>>> up are in ./scripts (build.sh, and minimal.st and loadlocal.st)
>>>>>
>>>>> Tim
>>>>>
>>>>>> On 15 Aug 2017, at 08:02, Guillermo Polito <guillermopol...@gmail.com>
>>>>>> wrote:
>>>>>>
>>>>>> On Mon, Aug 14, 2017 at 4:42 PM, Tim Mackinnon <tim@testit.works> wrote:
>>>>>>> Hi Guille - just running SpaceTally on my dev image to get a feel for
>>>>>>> it. It turns out that in the minimal images you’ve been creating, it’s
>>>>>>> not loaded (makes sense).
>>>>>>
>>>>>> Yup, it's loaded afterwards.
>>>>>>
>>>>>> All packages are loaded through metacello baselines. We should start
>>>>>> refactoring and making standalone projects, each one with a baseline for
>>>>>> itself, and its own dependencies described.
>>>>>>
>>>>>> I was checking on your gitlab and I probably have no access: how are you
>>>>>> finally loading packages in the bootstrap image? Can you share that with
>>>>>> us in text?
>>>>>> I'd like to improve that situation.
>>>>>>
>>>>>>> I’m wondering if there is an easy way to import it in (I guess that
>>>>>>> package should be in the Pharo git tree I cloned to get Fuel loaded
>>>>>>> right? Or is there a separate standalone source?).
>>>>>>
>>>>>> Yes it is, you can get the package programmatically by doing
>>>>>>
>>>>>> SpaceTally package name
>>>>>>
>>>>>> And furthermore, get the baseline that currently loads it by doing
>>>>>>
>>>>>> package := SpaceTally package name.
>>>>>> BaselineOf subclasses select: [ :e |
>>>>>>     e project version packages anySatisfy: [ :p | p name = package ]].
>>>>>>
>>>>>>>
>>>>>>> Thanks for all the support, and your email about why the contexts stack
>>>>>>> up is very well received (I will comment over there).
>>>>>>>
>>>>>>> By the way - it looks like Martin Fowler picked up on this announcement
>>>>>>> - so maybe we might get some interest from his mass of followers.
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>>> On 14 Aug 2017, at 10:49, Guillermo Polito <guillermopol...@gmail.com>
>>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi Tim,
>>>>>>>>
>>>>>>>>> On Mon, Aug 14, 2017 at 11:41 AM, Tim Mackinnon <tim@testit.works>
>>>>>>>>> wrote:
>>>>>>>>> Hey guys, thanks for your enthusiasm around this - and I cannot
>>>>>>>>> stress enough how this was only possible because of the work that has
>>>>>>>>> gone into making Pharo (in particular the 64bit image, as well as
>>>>>>>>> having a minimal image, and some great blog posts on serialising
>>>>>>>>> contexts) as well as the patience from everyone in answering
>>>>>>>>> questions and helping me get it all working.
>>>>>>>>>
>>>>>>>>> I’m still quite keen to get my execution time back down under 800ms
>>>>>>>>> and I’d like to actually get back to writing a few skills to automate
>>>>>>>>> a few things around my house.
>>>>>>>>>
>>>>>>>>> To answer Denis’ question -
>>>>>>>>>
>>>>>>>>> My final footprint is 30.4mb - that’s composed of a 22mb image (with a
>>>>>>>>> simple example that pulls in Fuel, ZTimestamp and the S3 Library
>>>>>>>>> which depends on XMLParser) and then the VM (from which I removed
>>>>>>>>> obvious DLLs).
>>>>>>>>>
>>>>>>>>> In my original experiments with a 6.0 minimal image - I did manage to
>>>>>>>>> get to a 13.4mb image (which started out as 12mb original size, and
>>>>>>>>> then loaded in STON and had only a simple clock example). I think the
>>>>>>>>> sweet spot is around 20mb total footprint as that seems to get me
>>>>>>>>> into the 450ms-900ms range.
>>>>>>>>>
>>>>>>>>> The 7.0 min image now starts out at 15mb and then I’m not sure why
>>>>>>>>> loading Fuel, S3 and XMLParser takes 7mb (it seems big to me - but
>>>>>>>>> I’ve not dug into that).
>>>>>>>>
>>>>>>>> You can do further space analysis using the following expression
>>>>>>>>
>>>>>>>> SpaceTally new printSpaceAnalysis
>>>>>>>>
>>>>>>>> You can do that in an eval and check what's taking space. With
>>>>>>>> measurements we can iterate and improve :).
>>>>>>>>
>>>>>>>>> I’ve also found (and this is on the back of unserialising the context
>>>>>>>>> in my example) that our images carry 15+ saved stack sessions that
>>>>>>>>> have been saved on top of each other by the way we build up the
>>>>>>>>> images. I don’t yet know the implications of size/speed of these -
>>>>>>>>> but we need a better way of folding executions when we snapshot
>>>>>>>>> headless images. I’m also not clear if there are any other startup
>>>>>>>>> tasks that take precious time (this also has implications for our fat
>>>>>>>>> development images as they take much longer to appear than they
>>>>>>>>> really should).
>>>>>>>>
>>>>>>>> I'm working on this as I'm writing this mail ;)
>>>>>>>>
>>>>>>>> https://pharo.fogbugz.com/f/cases/20309
>>>>>>>> https://github.com/pharo-project/pharo/pull/196
>>>>>>>>
>>>>>>>> I'll write down the implications further in a different thread.
>>>>>>>>
>>>>>>>>>
>>>>>>>>> I’ll be exploring some of these size/speed tradeoffs in follow on
>>>>>>>>> messages.
>>>>>>>>>
>>>>>>>>> But once again, a big thanks - I’ve not enjoyed programming like this
>>>>>>>>> for ages.
>>>>>>>>>
>>>>>>>>> Tim
>>>>>>>>>
>>>>>>>>>> On 12 Aug 2017, at 16:26, Ben Coman <b...@openinworld.com> wrote:
>>>>>>>>>>
>>>>>>>>>> hi Tim,
>>>>>>>>>>
>>>>>>>>>> That is..... AWESOME!
>>>>>>>>>>
>>>>>>>>>> Very nice delivery - it flowed well with great narration.
>>>>>>>>>>
>>>>>>>>>> I loved @2:17 "this is the interesting piece, because PharoLambda
>>>>>>>>>> has serialized the execution context of its application and saved it
>>>>>>>>>> into [my S3 bucket] ... [then on the local machine] rematerializes a
>>>>>>>>>> debugger [on that context]."
>>>>>>>>>>
>>>>>>>>>> There is a clarity in your video presentation that really may
>>>>>>>>>> intrigue outsiders. As a community we should push this on the usual
>>>>>>>>>> hacker forums - ycombinator could be a good starting point (but I'm
>>>>>>>>>> locked out of my account there).
>>>>>>>>>> An enticing title could be...
>>>>>>>>>> "Debugging Lambdas by re-materializing saved execution contexts on
>>>>>>>>>> your local machine."
>>>>>>>>>>
>>>>>>>>>> cheers -ben
>>>>>>>>>>
>>>>>>>>>>> On Fri, Aug 11, 2017 at 3:37 PM, Denis Kudriashov
>>>>>>>>>>> <dionisi...@gmail.com> wrote:
>>>>>>>>>>> This is cool Tim.
>>>>>>>>>>>
>>>>>>>>>>> So what image size did you deploy in the end?
>>>>>>>>>>>
>>>>>>>>>>> 2017-08-10 15:47 GMT+02:00 Tim Mackinnon <tim@testit.works>:
>>>>>>>>>>>> I just wanted to thank everyone for their help in getting my pet
>>>>>>>>>>>> project further along, so that I can now announce that PharoLambda
>>>>>>>>>>>> works with the V7 minimal image and also supports post
>>>>>>>>>>>> mortem debugging by saving a zipped fuel context onto S3.
>>>>>>>>>>>>
>>>>>>>>>>>> This latter item is particularly satisfying as at a recent
>>>>>>>>>>>> serverless conference (JeffConf) there was a panel where poor
>>>>>>>>>>>> development tools on serverless platforms were highlighted as a
>>>>>>>>>>>> real problem.
>>>>>>>>>>>>
>>>>>>>>>>>> In our community we’ve had these kinds of tools at our fingertips
>>>>>>>>>>>> for ages - but I don’t think the wider development community has
>>>>>>>>>>>> really noticed. Debugging something short lived like a Lambda
>>>>>>>>>>>> execution is quite startling, as the current answer is “add more
>>>>>>>>>>>> logging”, and we all know that sucks. To this end, I’ve created a
>>>>>>>>>>>> little screencast showing this in action - and it was pretty cool
>>>>>>>>>>>> because it was a real example I encountered when I got everything
>>>>>>>>>>>> working and was trying my test application out.
>>>>>>>>>>>>
>>>>>>>>>>>> I’ve also put a bit of work into tuning the excellent GitLab CI
>>>>>>>>>>>> tools, so that I can cache many of the artefacts used between
>>>>>>>>>>>> different build runs (this might also be of interest to others
>>>>>>>>>>>> using CI systems).
>>>>>>>>>>>>
>>>>>>>>>>>> The Gitlab project is on: https://gitlab.com/macta/PharoLambda
>>>>>>>>>>>> And the screencast: https://www.youtube.com/watch?v=bNNCT1hLA3E
>>>>>>>>>>>>
>>>>>>>>>>>> Tim
>>>>>>>>>>>>
>>>>>>>>>>>>> On 15 Jul 2017, at 00:39, Tim Mackinnon <tim@testit.works> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>> Hi - I’ve been playing around with getting Pharo to run well on
>>>>>>>>>>>>> AWS Lambda.
>>>>>>>>>>>>> It’s early days, but I thought it might be interesting
>>>>>>>>>>>>> to share what I’ve learned so far.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Usage examples and code at https://gitlab.com/macta/PharoLambda
>>>>>>>>>>>>>
>>>>>>>>>>>>> With help from many of the folks here, I’ve been able to get a
>>>>>>>>>>>>> simple example to run in 500ms-1200ms with a minimal Pharo 6
>>>>>>>>>>>>> image. You can easily try it out yourself. This seems slightly
>>>>>>>>>>>>> better than what the GoLang folks have been able to do.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Tim
>>>>>>>>
>>>>>>>> --
>>>>>>>>
>>>>>>>> Guille Polito
>>>>>>>>
>>>>>>>> Research Engineer
>>>>>>>> French National Center for Scientific Research - http://www.cnrs.fr
>>>>>>>>
>>>>>>>> Web: http://guillep.github.io
>>>>>>>> Phone: +33 06 52 70 66 13