Re: [Pharo-users] [ANN] PharoLambda 1.5 - Pharo running on AWS Lambda now with saved Debug sessions via S3

Guillermo Polito Wed, 16 Aug 2017 02:55:06 -0700

On Wed, Aug 16, 2017 at 11:46 AM, Tim Mackinnon <tim@testit.works> wrote:


> Hi, tracing through your changes - it looks like:
>
> Smalltalk cleanUp: true except: #() confirming: false.
>
> Takes care of all the non-unicode changes you proposed (and it seems like
> it's a known cleanup protocol).
>

I based my script on #cleanupForRelease ^^. But I did not just blindly
execute it as is because I wanted to understand the implications of each
line.


> I wonder if the Unicode change is worth it/risky as many web based
> services I might connect to with Zinc do support Unicode so maybe I should
> keep that one in. (I will for now - might verify how much of a difference
> it really makes)
>
No, it should not break any encoding/decoding. The changes I proposed will
just nil out two things:

 - the uppercase/lowercase unicode mapping tables, which say for each
codepoint whether it is uppercase or lowercase and allow transformations
from/to uppercase/lowercase. This means that these may not work as
expected:

         aChar asLowercase
         aChar asUppercase
         aChar toLowercase
         aChar toUppercase

 - the unicode classification table, which says whether a character is a
letter, a digit, and so on. This means that these may not work as expected:

        aChar isLetter
        aChar isDigit
        aChar isAlphaNumeric
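
One way to see the effect in a given image is a quick probe like the sketch
below. It is only an illustration: the probe characters are arbitrary choices
(233 is é, 955 is λ), and it assumes nothing beyond the standard Character
protocol listed above. Since a lookup on the nil-ed tables might raise an
error rather than return a wrong answer, each probe is wrapped in an error
handler.

```smalltalk
"Sketch only: probe characters are arbitrary (233 = é, 955 = λ)."
| probes |
probes := { $a. $Z. $5. Character value: 233. Character value: 955 }.
probes collect: [ :c |
    "Map each code point to { isLetter. isDigit. asUppercase }, or to the
     error description if the lookup fails on the nil-ed tables."
    c codePoint -> ([ { c isLetter. c isDigit. c asUppercase } ]
        on: Error
        do: [ :e | e description ]) ]
```

Run through eval before and after the cleanup, the two results can be
compared to see which of the characters you care about still behave.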


> I think my next port of call is cleanUp for Monticello/Metacello as I see
> a fair amount of that stuff floating around in my image (after I’ve used it
> to bootstrap my code).
>
> Tim
>
> On 16 Aug 2017, at 02:32, Guillermo Polito <guillermopol...@gmail.com>
> wrote:
>
> Actually, the first thing that happens is that Monticello is "nicely" coupled
> with the changeset system and logs all loaded source code in change sets :D :/
> ¬¬. Also, the two largest strings are related to unicode tables (we should
> put them in files instead of in the image and load them on demand), and the
> two biggest arrays are unicode-related too. I just tried the following in a
> clean bootstrapped "minimal" image (metacello):
>
> "Careful, this will make that #isLetter, #isUppercase #isLowercase,
> #toLowercase and #toUppercase only work on ascii"
> Character characterSet: nil.
> Unicode classPool at: #GeneralCategory put: nil.
> Unicode classPool at: #DecimalProperty put: nil.
>
> UnicodeDefinition removeFromSystem.
> ChangeSet removeChangeSetsNamedSuchThat: [ :each | true ].
> ChangeSet resetCurrentToNewUnnamedChangeSet.
> MCDefinition clearInstances.
> Undeclared removeUnreferencedKeys.
> Smalltalk garbageCollect.
>
> like this:
>
> ./vm/pharo Pharo7.0-metacello-32bit-fa236b7.image eval --save "Character
> characterSet: nil. Unicode classPool at: #GeneralCategory put: nil. Unicode
> classPool at: #DecimalProperty put: nil. UnicodeDefinitions
> removeFromSystem. ChangeSet removeChangeSetsNamedSuchThat: [ :each | true
> ]. ChangeSet resetCurrentToNewUnnamedChangeSet. MCDefinition
> clearInstances. Undeclared removeUnreferencedKeys. Smalltalk
> garbageCollect."
>
> and my image went down from 11MB to 6.6MB (7.0 MB if I don't change back
> to ascii with the first three lines)
>
> Then I tried a tally:
>
> ./vm/pharo Pharo7.0-metacello-32bit-fa236b7.image save spacetally
>
> ./vm/pharo spacetally.image eval --save "repo := MCFileTreeRepository new
> directory: '../src' asFileReference. version := repo
> loadVersionFromFileNamed: 'Tool-Profilers.package'. version load."
>
> Re-clean, since I loaded some packages:
>
> ./vm/pharo spacetally.image eval --save "ChangeSet
> removeChangeSetsNamedSuchThat: [ :each | true ]. ChangeSet
> resetCurrentToNewUnnamedChangeSet. MCDefinition
> clearInstances. Undeclared removeUnreferencedKeys. Smalltalk
> garbageCollect."
>
> This image is now 6.6MB (7.1MB with the unicode large arrays), with strings
> at 4.1% (274k), which seems reasonable. Remaining big strings are Pharo's
> licence, the buffer of the changes file and then some class comments
> (shouldn't they be fetched from disk as any other method source code?).
>
> Running the tally again shows that ~30% of the space is taken by Arrays and
> 21.9% by compiled methods. But, BUT! :) I have ~30k arrays and also lots of
> collections:
>
> "MethodDictionary"    2872 +
> "IdentitySet"        12781 +
> "OrderedCollection"   4398 +
> "Set"                 2959 +
> "Dictionary"          1997 +
> "IdentityDictionary"   454
> ----------------------------
>                      25461
>
> So there are ~5k arrays that are used outside collections.
>
> Worth exploring a bit more I think.
>
> On Wed, Aug 16, 2017 at 1:23 AM, Guillermo Polito <
> guillermopol...@gmail.com> wrote:
>
>>
>>
>> On Tue, Aug 15, 2017 at 11:26 PM, Tim Mackinnon <tim@testit.works> wrote:
>>
>>> Hi Guille/Ben - I got a quick moment to try the SpaceTally (aside: it
>>> seems very convoluted to load a single package into the image, I was trying
>>> to avoid having to create a baselineOf for something so simple - I ended up
>>> with:
>>>
>>
>> I know, I also believe we have to simplify this. In any case, baselines
>> are healthy as they also let you express dependencies. Otherwise you'll
>> end up loading dependencies by hand. We'll fix this soon I hope.
>>
>>
>>>
>>> repo := MCFileTreeRepository new directory: './bootstrap'
>>> asFileReference.
>>>
>>> version := repo loadVersionFromFileNamed: 'Tool-Profilers.package'.
>>> version load.
>>>
>>>
>>> Anyway - in my minimal image, like in the fat image there seems to be a
>>> surprising amount of bytestrings (4mb worth?). I think that might need some
>>> digging into? It seems like a lot somehow. Although Ben’s neat experiment
>>> of zipping strings shows that’s not a real route.
>>>
>>> In a deployed minimal image - maybe I can get rid of some other things
>>> like MethodChangeRecords or MCMethodDefinitions (they are smaller wins,
>>> but noticeable)
>>>
>>> Class                 code space  # instances  inst space  percent  inst average size
>>> ByteString                  2640        37365     4823848    21.50             129.10
>>> Array                       3742        53002     3961944    17.60              74.75
>>> CompiledMethod             19159        30481     2912968    13.00              95.57
>>> Association                 1148        58348     1867136     8.30              32.00
>>> MethodChangeRecord           431        34312     1097984     4.90              32.00
>>> ByteArray                   4605          290      908728     4.00            3133.54
>>> ByteSymbol                  1698        22689      840168     3.70              37.03
>>> IdentitySet                  408        19076      610432     2.70              32.00
>>> MethodDictionary            3310         3520      608688     2.70             172.92
>>> WeakArray                   1758         3024      597824     2.70             197.69
>>> MCMethodDefinition          4318         6659      426176     1.90              64.00
>>> Protocol                    1679         8382      268224     1.20              32.00
>>> OrderedCollection           6555         5509      220360     1.00              40.00
>>>
>>> As an aside - my Gitlab project is public, the scripts that load things
>>> up are in ./scripts (build.sh, and minimal.st and loadlocal.st)
>>>
>>> Tim
>>>
>>> On 15 Aug 2017, at 08:02, Guillermo Polito <guillermopol...@gmail.com>
>>> wrote:
>>>
>>>
>>>
>>> On Mon, Aug 14, 2017 at 4:42 PM, Tim Mackinnon <tim@testit.works> wrote:
>>>
>>>> Hi Guille - just running SpaceTally on my dev image to get a feel for
>>>> it. It turns out that in the minimal images you’ve been creating, it’s not
>>>> loaded (makes sense).
>>>>
>>>
>>> Yup, it's loaded afterwards.
>>>
>>> All packages are loaded through metacello baselines. We should start
>>> refactoring and making standalone projects, each one with its own
>>> baseline and its own dependencies described.
>>>
>>> I was checking on your gitlab and I probably have no access: how are you
>>> finally loading packages in the bootstrap image? Can you share that with us
>>> in text? I'd like to improve that situation.
>>>
>>>
>>>> I’m wondering if there is an easy way to import it in (I guess that
>>>> package should be in the Pharo git tree I cloned to get Fuel loaded right?
>>>> Or is there a separate standalone source?).
>>>>
>>>
>>> Yes it is, you can get the package programmatically by doing
>>>
>>> SpaceTally package name
>>>
>>> And furthermore, get the baseline that currently loads it by doing
>>>
>>> package := SpaceTally package name.
>>> BaselineOf subclasses select: [ :e |
>>> e project version packages anySatisfy: [ :p | p name = package ]].
>>>
>>>
>>>>
>>>> Thanks for all the support, and your email about why the contexts stack
>>>> up is very well received (I will comment over there).
>>>>
>>>> By the way - it looks like Martin Fowler picked up on this announcement
>>>> - so maybe we might get some interest from his mass of followers.
>>>>
>>>> Tim
>>>>
>>>> On 14 Aug 2017, at 10:49, Guillermo Polito <guillermopol...@gmail.com>
>>>> wrote:
>>>>
>>>> Hi Tim,
>>>>
>>>> On Mon, Aug 14, 2017 at 11:41 AM, Tim Mackinnon <tim@testit.works> wrote:
>>>>
>>>>> Hey guys, thanks for your enthusiasm around this - and I cannot stress
>>>>> enough how this was only possible because of the work that has gone into
>>>>> making Pharo (in particular the 64bit image, as well as having a minimal
>>>>> image, and some great blog posts on serialising contexts) as well as the
>>>>> patience from everyone in answering questions and helping me get it all
>>>>> working.
>>>>>
>>>>> I’m still quite keen to get my execution time back down under 800ms
>>>>> and I’d like to actually get back to writing a few skills to automate a
>>>>> few things around my house.
>>>>>
>>>>> To Answer Denis’ question -
>>>>>
>>>>> My final footprint is 30.4mb - that’s composed of a 22mb image (with a
>>>>> simple example that pulls in Fuel, ZTimestamp and the S3 Library which
>>>>> depends on XMLParser) and then the VM (from which I removed obvious
>>>>> DLLs).
>>>>>
>>>>> In my original experiments with a 6.0 minimal image - I did manage to
>>>>> get to a 13.4mb image (which started out as 12mb original size, and then
>>>>> loaded in STON and had only a simple clock example). I think the sweet
>>>>> spot is around 20mb total footprint as that seems to get me into the
>>>>> 450ms-900ms range.
>>>>>
>>>>> The 7.0 min image now starts out at 15mb and then I’m not sure why
>>>>> loading Fuel, S3 and XMLParser takes 7mb (it seems big to me - but I’ve
>>>>> not dug into that).
>>>>>
>>>>
>>>> You can do further space analysis using the following expression
>>>>
>>>> SpaceTally new printSpaceAnalysis
>>>>
>>>> You can do that in an eval and check what's taking space. With measures
>>>> we can iterate and improve :).
>>>>
>>>>
>>>>> I’ve also found (and this on the back of unserialising the context in
>>>>> my example) that the way we build images has 15+ saved stack sessions that
>>>>> have saved on top of each other from the way we build up the images. I
>>>>> don’t yet know the implications of size/speed of these - but we need a
>>>>> better way of folding executions when we snapshot headless images. I’m
>>>>> also not clear if there are any other startup tasks that take precious time
>>>>> (this also has implications for our fat development images as they take
>>>>> much longer to appear than they really should).
>>>>>
>>>>
>>>> I'm working on this as I'm writing this mail ;)
>>>>
>>>> https://pharo.fogbugz.com/f/cases/20309
>>>> https://github.com/pharo-project/pharo/pull/196
>>>>
>>>> I'll write down the implications further in a different thread.
>>>>
>>>>
>>>>> I’ll be exploring some of these size/speed tradeoffs in follow-on
>>>>> messages.
>>>>>
>>>>> But once again, a big thanks - I’ve not enjoyed programming like this
>>>>> for ages.
>>>>>
>>>>> Tim
>>>>>
>>>>> On 12 Aug 2017, at 16:26, Ben Coman <b...@openinworld.com> wrote:
>>>>>
>>>>> hi Tim,
>>>>>
>>>>> That is.....      AWESOME!
>>>>>
>>>>> Very nice delivery - it flowed well with great narration.
>>>>>
>>>>> I loved @2:17 "this is the interesting piece, because PharoLambda has
>>>>> serialized the execution context of its application and saved it into [my
>>>>> S3 bucket] ... [then on the local machine] rematerializes a debugger [on
>>>>> that context]."
>>>>>
>>>>> There is a clarity in your video presentation that really may intrigue
>>>>> outsiders. As a community we should push this on the usual hacker forums -
>>>>> ycombinator could be a good starting point (but I'm locked out of my
>>>>> account there).
>>>>> An enticing title could be...
>>>>> "Debugging Lambdas by re-materializing saved execution contexts on
>>>>> your local machine."
>>>>>
>>>>> cheers -ben
>>>>>
>>>>> On Fri, Aug 11, 2017 at 3:37 PM, Denis Kudriashov <dionisiydk@gmail.com>
>>>>> wrote:
>>>>>
>>>>>> This is cool Tim.
>>>>>>
>>>>>> So what image size did you deploy in the end?
>>>>>>
>>>>>> 2017-08-10 15:47 GMT+02:00 Tim Mackinnon <tim@testit.works>:
>>>>>>
>>>>>>> I just wanted to thank everyone for their help in getting my pet
>>>>>>> project further along, so that now I can announce that PharoLambda is
>>>>>>> now working with the V7 minimal image and also supports post mortem
>>>>>>> debugging by saving a zipped fuel context onto S3.
>>>>>>>
>>>>>>> This latter item is particularly satisfying as at a recent
>>>>>>> serverless conference (JeffConf) there was a panel where poor
>>>>>>> development tools on serverless platforms were highlighted as a real
>>>>>>> problem.
>>>>>>>
>>>>>>> In our community we’ve had these kinds of tools at our fingertips
>>>>>>> for ages - but I don’t think the wider development community has really
>>>>>>> noticed. Debugging something short lived like a Lambda execution is
>>>>>>> quite startling, as the current answer is “add more logging”, and we all
>>>>>>> know that sucks. To this end, I’ve created a little screencast showing
>>>>>>> this in action - and it was pretty cool because it was a real example I
>>>>>>> encountered when I got everything working and was trying my test
>>>>>>> application out.
>>>>>>>
>>>>>>> I’ve also put a bit of work into tuning the excellent GitLab CI
>>>>>>> tools, so that I can cache many of the artefacts used between different
>>>>>>> build runs (this might also be of interest to others using CI systems).
>>>>>>>
>>>>>>> The Gitlab project is on: https://gitlab.com/macta/PharoLambda
>>>>>>> And the screencast: https://www.youtube.com/watch?v=bNNCT1hLA3E
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>>
>>>>>>> On 15 Jul 2017, at 00:39, Tim Mackinnon <tim@testit.works> wrote:
>>>>>>>
>>>>>>> Hi - I’ve been playing around with getting Pharo to run well on AWS
>>>>>>> Lambda. It’s early days, but I thought it might be interesting to share
>>>>>>> what I’ve learned so far.
>>>>>>>
>>>>>>> Usage examples and code at https://gitlab.com/macta/PharoLambda
>>>>>>>
>>>>>>> With help from many of the folks here, I’ve been able to get a
>>>>>>> simple example to run in 500ms-1200ms with a minimal Pharo 6 image. You
>>>>>>> can easily try it out yourself. This seems slightly better than what the
>>>>>>> GoLang folks have been able to do.
>>>>>>>
>>>>>>> Tim
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>
>>>> --
>>>>
>>>> Guille Polito
>>>>
>>>> Research Engineer
>>>> French National Center for Scientific Research - *http://www.cnrs.fr*
>>>> <http://www.cnrs.fr/>
>>>>
>>>>
>>>> *Web:* *http://guillep.github.io* <http://guillep.github.io/>
>>>> *Phone: *+33 06 52 70 66 13 <+33%206%2052%2070%2066%2013>
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>
>
>


-- 



Guille Polito


Research Engineer

French National Center for Scientific Research - *http://www.cnrs.fr*
<http://www.cnrs.fr>



*Web:* *http://guillep.github.io* <http://guillep.github.io>

*Phone: *+33 06 52 70 66 13
