Be careful, because some of those aggressive cleanups may make parts of
your image unstable. For example:

This is dangerous:

MCMethodDefinition allInstances do: [:each | each become: String new ].
MCClassDefinition allInstances do: [:each | each become: String new ].
MCVersionInfo allInstances do: [:each | each become: String new ].


And this may break some code if you're using non-ASCII characters:

Unicode classPool at: #GeneralCategory put: nil.
Unicode classPool at: #DecimalProperty put: nil.
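
If you want to see how much the first cleanup would actually touch before
running it, something along these lines might help. This is only a rough
sketch: the stderr logger mirrors the one in Tim's script quoted below, and
the loop itself is an illustration, not part of ImageCleaner.

```smalltalk
"Sketch only: report how many instances each 'become:' pass would replace.
 The stderr logger mirrors the quoted script; the loop is an illustration."
| logger |
logger := FileStream stderr.
{ MCMethodDefinition. MCClassDefinition. MCVersionInfo } do: [ :cls |
	logger
		cr;
		nextPutAll: cls name;
		nextPutAll: ': ';
		nextPutAll: cls allInstances size printString ].
logger cr.
```

If the counts are small, the space saved is probably not worth the risk.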



On Wed, Aug 16, 2017 at 10:07 AM, Tim Mackinnon <tim@testit.works> wrote:

> This is very encouraging, and also very instructive (your post on snapshot
> is in this area too; I have a queue of things to try for you on Friday).
>
> I'll try adding your ideas to my script and see if it squeezes some more.
>
> And then I guess we need to decide which parts go into your minimum
> build steps and which are an external script (or possibly a
> HeadlessImageCleaner class we keep loaded so it's easier to maintain?)
>
> Tim
>
> Sent from my iPhone
>
> On 16 Aug 2017, at 08:52, Guillermo Polito <guillermopol...@gmail.com>
> wrote:
>
> This means it would be healthy to do a cleanup (at least the non
> aggressive one, ChangeSets and MC stuff) on each of the images we produce
> and not just the latest one.
>
> On Wed, Aug 16, 2017 at 8:35 AM, Tim Mackinnon <tim@testit.works> wrote:
>
>> Yes you were on to something there (and at the same time, by poking
>> around with #pointersTo I noticed some chains of objects too). So I ran the
>> following script (partially borrowed from ImageCleaner) and this has got me
>> down to a 14mb image (instance sizes listed below, which is looking much
>> healthier - and those MethodChangeRecords are gone too) !!!
>>
>> I suspect there are more Monticello/Metacello things still lurking
>> around.
>>
>> I also wonder if I need some of the character sorting strings too.
>>
>> Tim
>>
>> "CmdLine script to debug the initial minimal image"
>>
>> | logger repo version |
>>
>> logger := FileStream stderr.
>> logger cr; nextPutAll: 'Starting Minimal Cleanup Script...'.
>>
>> logger cr; nextPutAll: '>Resetting Class Comments'.
>> Smalltalk allClasses do: [ :c | c classComment: '' stamp: '' ].
>>
>> logger cr; nextPutAll: '>Removing MC holders'.
>> MCMethodDefinition allInstances do: [:each | each become: String new ].
>> MCClassDefinition allInstances do: [:each | each become: String new ].
>> MCVersionInfo allInstances do: [:each | each become: String new ].
>>
>> logger cr; nextPutAll: '>ImageCleaner release routines'.
>> Smalltalk organization removeEmptyCategories.
>>    Smalltalk
>>       allClassesAndTraitsDo: [ :class |
>>          [ :each |
>>             each
>>                removeEmptyCategories;
>>                sortCategories ]
>>                   value: class organization;
>>                   value: class class organization ].
>>
>> (RPackageOrganizer default packages select: #isEmpty)
>>     do: #unregister.
>>
>> Smalltalk organization sortCategories.
>> Smalltalk garbageCollect.
>> Smalltalk cleanOutUndeclared.
>> Smalltalk fixObsoleteReferences.
>> Smalltalk cleanUp: true except: #() confirming: false.
>>
>> logger cr; nextPutAll: '>GC'.
>> 3 timesRepeat: [
>>         Smalltalk garbageCollect.
>>         Smalltalk cleanOutUndeclared.
>>         Smalltalk fixObsoleteReferences].
>>
>> logger cr; nextPutAll: 'Finished Script.'; cr; cr.
>>
>>
>> My top instances are now:
>>
>> Class               code space  # instances  inst space  percent  inst average size
>> CompiledMethod           19159        30481     2912968    21.60              95.57
>> Array                     3742        36495     2852448    21.10              78.16
>> ByteString                2640        24018     2517168    18.60             104.80
>> ByteSymbol                1698        20722      759208     5.60              36.64
>> Association               1148        19786      633152     4.70              32.00
>> IdentitySet                408        15452      494464     3.70              32.00
>> MethodDictionary          3310         3520      350192     2.60              99.49
>> Protocol                  1679         8382      268224     2.00              32.00
>> WeakArray                 1758          265      232304     1.70             876.62
>> OrderedCollection         6555         5043      201720     1.50              40.00
>> ClassOrganization         5281         3520      168960     1.30              48.00
>> Metaclass                 7184         1748      153824     1.10              88.00
>>
>>
>>
>>
>> On 15 Aug 2017, at 23:00, Guillermo Polito <guillermopol...@gmail.com>
>> wrote:
>>
>> Just a hunch: could you inspect your MethodChangeRecord instances?
>>
>> On Tue, 15 Aug 2017 at 23:55, Tim Mackinnon <tim@testit.works> wrote:
>>
>>> A weird observation - is it possible that source code is being stored in
>>> the image as strings somehow? When I do
>>>
>>> ./pharo PharoLambda.image eval "ByteString allInstances inject: (OrderedCollection new) into: [:r :i | i size > 500 ifTrue: [r add: i]. r]"
>>>
>>> I seem to see reams of what looks like method source - but I thought
>>> source code was stored in the .sources file and the .changes file (and I
>>> haven’t been bundling those in my deployed image).
>>>
>>> I’m trying to figure out how you find references to a string object, to
>>> chase down what is pointing to these strings as maybe there is a quick 4mb
>>> win by simply nil’ing out some obvious things. (This doesn’t of course help
>>> with a default minimal image - but maybe a few tricks for packaging and
>>> deploying something).
>>>
>>> Tim
>>>
>>> On 15 Aug 2017, at 22:26, Tim Mackinnon <tim@testit.works> wrote:
>>>
>>> Hi Guille/Ben - I got a quick moment to try SpaceTally. As an aside, it
>>> seems very convoluted to load a single package into the image. I was trying
>>> to avoid having to create a baselineOf for something so simple, and I ended
>>> up with:
>>>
>>> repo := MCFileTreeRepository new directory: './bootstrap' asFileReference.
>>>
>>> version := repo loadVersionFromFileNamed: 'Tool-Profilers.package'.
>>> version load.
>>>
>>>
>>> Anyway - in my minimal image, like in the fat image there seems to be a
>>> surprising amount of bytestrings (4mb worth?). I think that might need some
>>> digging into? It seems like a lot somehow. Although Ben’s neat experiment
>>> of zipping strings shows that’s not a real route.
>>>
>>> In a deployed minimal image - maybe I can get rid of some other things
>>> like MethodChangeRecords or MCMethodDefinitions (they are smaller wins,
>>> but noticeable).
>>>
>>> Class               code space  # instances  inst space  percent  inst average size
>>> ByteString                2640        37365     4823848    21.50             129.10
>>> Array                     3742        53002     3961944    17.60              74.75
>>> CompiledMethod           19159        30481     2912968    13.00              95.57
>>> Association               1148        58348     1867136     8.30              32.00
>>> MethodChangeRecord         431        34312     1097984     4.90              32.00
>>> ByteArray                 4605          290      908728     4.00            3133.54
>>> ByteSymbol                1698        22689      840168     3.70              37.03
>>> IdentitySet                408        19076      610432     2.70              32.00
>>> MethodDictionary          3310         3520      608688     2.70             172.92
>>> WeakArray                 1758         3024      597824     2.70             197.69
>>> MCMethodDefinition        4318         6659      426176     1.90              64.00
>>> Protocol                  1679         8382      268224     1.20              32.00
>>> OrderedCollection         6555         5509      220360     1.00              40.00
>>>
>>> As an aside - my Gitlab project is public, the scripts that load things
>>> up are in ./scripts (build.sh, and minimal.st and loadlocal.st)
>>>
>>> Tim
>>>
>>> On 15 Aug 2017, at 08:02, Guillermo Polito <guillermopol...@gmail.com>
>>> wrote:
>>>
>>>
>>>
>>> On Mon, Aug 14, 2017 at 4:42 PM, Tim Mackinnon <tim@testit.works> wrote:
>>>
>>>> Hi Guille - just running SpaceTally on my dev image to get a feel for
>>>> it. It turns out that in the minimal images you’ve been creating, it’s not
>>>> loaded (makes sense).
>>>>
>>>
>>> Yup, it's loaded afterwards.
>>>
>>> All packages are loaded through Metacello baselines. We should start
>>> refactoring and making standalone projects, each one with a baseline of
>>> its own that describes its dependencies.
>>>
>>> I was checking on your gitlab and I have probably no access: how are you
>>> finally loading packages in the bootstrap image? Can you share that with us
>>> in text? I'd like to improve that situation.
>>>
>>>
>>>> I’m wondering if there is an easy way to import it in (I guess that
>>>> package should be in the Pharo git tree I cloned to get Fuel loaded right?
>>>> Or is there a separate standalone source?).
>>>>
>>>
>>> Yes it is. You can get the package name programmatically with
>>>
>>> SpaceTally package name
>>>
>>> Furthermore, you can find the baseline that currently loads it with
>>>
>>> package := SpaceTally package name.
>>> BaselineOf subclasses select: [ :e |
>>> e project version packages anySatisfy: [ :p | p name = package ]].
>>>
>>>
>>>>
>>>> Thanks for all the support, and your email about why the contexts stack
>>>> up is very well received (I will comment over there).
>>>>
>>>> By the way - it looks like Martin Fowler picked up on this announcement
>>>> - so maybe we might get some interest from his mass of followers.
>>>>
>>>> Tim
>>>>
>>>> On 14 Aug 2017, at 10:49, Guillermo Polito <guillermopol...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Tim,
>>>>
>>>> On Mon, Aug 14, 2017 at 11:41 AM, Tim Mackinnon <tim@testit.works> wrote:
>>>>
>>>>> Hey guys, thanks for your enthusiasm around this - and I cannot stress
>>>>> enough how this was only possible because of the work that has gone into
>>>>> making Pharo (in particular the 64bit image, as well as having a minimal
>>>>> image, and some great blog posts on serialising contexts) as well as the
>>>>> patience from everyone in answering questions and helping me get it all
>>>>> working.
>>>>>
>>>>> I’m still quite keen to get my execution time back down under 800ms
>>>>> and I’d like to actually get back to writing a few skills to automate a
>>>>> few things around my house.
>>>>>
>>>>> To Answer Denis’ question -
>>>>>
>>>>> My final footprint is 30.4mb - that’s composed of a 22mb image (with a
>>>>> simple example that pulls in Fuel, ZTimestamp and the S3 Library which
>>>>> depends on XMLParser) and then the VM (from which I removed obvious
>>>>> DLLs).
>>>>>
>>>>> In my original experiments with a 6.0 minimal image - I did manage to
>>>>> get to a 13.4mb image (which started out as 12mb original size, and then
>>>>> loaded in STON and had only a simple clock example). I think the sweet
>>>>> spot is around 20mb total footprint as that seems to get me into the
>>>>> 450ms-900ms range.
>>>>>
>>>>> The 7.0 min image now starts out at 15mb and then I’m not sure why
>>>>> loading Fuel, S3 and XMLParser takes 7mb (it seems big to me - but I’ve
>>>>> not dug into that).
>>>>>
>>>>
>>>> You can do further space analysis using the following expression
>>>>
>>>> SpaceTally  new printSpaceAnalysis
>>>>
>>>> You can do that in an eval and check what's taking space. With measures
>>>> we can iterate and improve :).
>>>>
>>>>
>>>>> I’ve also found (and this on the back of unserialising the context in
>>>>> my example) that our images contain 15+ saved stack sessions that have
>>>>> been saved on top of each other as a result of the way we build up the
>>>>> images. I don’t yet know the implications of size/speed of these - but we
>>>>> need a better way of folding executions when we snapshot headless images.
>>>>> I’m also not clear if there are any other startup tasks that take precious
>>>>> time (this also has implications for our fat development images as they
>>>>> take much longer to appear than they really should).
>>>>>
>>>>
>>>> I'm working on this as I'm writing this mail ;)
>>>>
>>>> https://pharo.fogbugz.com/f/cases/20309
>>>> https://github.com/pharo-project/pharo/pull/196
>>>>
>>>> I'll write down the implications further in a different thread.
>>>>
>>>>
>>>>> I’ll be exploring some of these size/speed tradeoffs in follow-on
>>>>> messages.
>>>>>
>>>>> But once again, a big thanks - I’ve not enjoyed programming like this
>>>>> for ages.
>>>>>
>>>>> Tim
>>>>>
>>>>> On 12 Aug 2017, at 16:26, Ben Coman <b...@openinworld.com> wrote:
>>>>>
>>>>> hi Tim,
>>>>>
>>>>> That is.....      AWESOME!
>>>>>
>>>>> Very nice delivery - it flowed well with great narration.
>>>>>
>>>>> I loved @2:17 "this is the interesting piece, because PharoLambda has
>>>>> serialized the execution context of its application and saved it into [my
>>>>> S3 bucket] ... [then on the local machine] rematerializes a debugger [on
>>>>> that context]."
>>>>>
>>>>> There is a clarity in your video presentation that really may intrigue
>>>>> outsiders. As a community we should push this on the usual hacker forums -
>>>>> ycombinator could be a good starting point (but I'm locked out of my
>>>>> account there).
>>>>> An enticing title could be...
>>>>> "Debugging Lambdas by re-materializing saved execution contexts on
>>>>> your local machine."
>>>>>
>>>>> cheers -ben
>>>>>
>>>>> On Fri, Aug 11, 2017 at 3:37 PM, Denis Kudriashov <dionisiydk@gmail.com> wrote:
>>>>>
>>>>>> This is cool Tim.
>>>>>>
>>>>>> So what image size did you deploy in the end?
>>>>>>
>>>>>> 2017-08-10 15:47 GMT+02:00 Tim Mackinnon <tim@testit.works>:
>>>>>>
>>>>>>> I just wanted to thank everyone for their help in getting my pet
>>>>>>> project further along, so that now I can announce that PharoLambda is
>>>>>>> now working with the V7 minimal image and also supports post mortem
>>>>>>> debugging by saving a zipped fuel context onto S3.
>>>>>>>
>>>>>>> This latter item is particularly satisfying as at a recent
>>>>>>> serverless conference (JeffConf) there was a panel where poor
>>>>>>> development tools on serverless platforms were highlighted as a real
>>>>>>> problem.
>>>>>>>
>>>>>>> In our community we’ve had these kinds of tools at our fingertips
>>>>>>> for ages - but I don’t think the wider development community has really
>>>>>>> noticed. Debugging something short lived like a Lambda execution is
>>>>>>> quite startling, as the current answer is “add more logging”, and we all
>>>>>>> know that sucks. To this end, I’ve created a little screencast showing
>>>>>>> this in action - and it was pretty cool because it was a real example I
>>>>>>> encountered when I got everything working and was trying my test
>>>>>>> application out.
>>>>>>>
>>>>>>> I’ve also put a bit of work into tuning the excellent GitLab CI
>>>>>>> tools, so that I can cache many of the artefacts used between different
>>>>>>> build runs (this might also be of interest to others using CI systems).
>>>>>>>
>>>>>>> The Gitlab project is on: https://gitlab.com/macta/PharoLambda
>>>>>>> And the screencast: https://www.youtube.com/watch?v=bNNCT1hLA3E
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>>
>>>>>>> On 15 Jul 2017, at 00:39, Tim Mackinnon <tim@testit.works> wrote:
>>>>>>>
>>>>>>> Hi - I’ve been playing around with getting Pharo to run well on AWS
>>>>>>> Lambda. It’s early days, but I thought it might be interesting to share
>>>>>>> what I’ve learned so far.
>>>>>>>
>>>>>>> Usage examples and code at https://gitlab.com/macta/PharoLambda
>>>>>>>
>>>>>>> With help from many of the folks here, I’ve been able to get a
>>>>>>> simple example to run in 500ms-1200ms with a minimal Pharo 6 image. You
>>>>>>> can easily try it out yourself. This seems slightly better than what the
>>>>>>> GoLang folks have been able to do.
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Guille Polito
>>>>
>>>> Research Engineer
>>>> French National Center for Scientific Research - *http://www.cnrs.fr*
>>>> <http://www.cnrs.fr/>
>>>>
>>>>
>>>> *Web:* *http://guillep.github.io* <http://guillep.github.io/>
>>>> *Phone: *+33 06 52 70 66 13 <+33%206%2052%2070%2066%2013>
>>>>
>>>>
>>>>
>>>
>>>
>
>


-- 



Guille Polito


Research Engineer

French National Center for Scientific Research - *http://www.cnrs.fr*
<http://www.cnrs.fr>



*Web:* *http://guillep.github.io* <http://guillep.github.io>

*Phone: *+33 06 52 70 66 13
