Re: [Pharo-users] Understanding the role of the sources file

Werner Kassens Wed, 13 Jan 2016 06:59:34 -0800

Hi Dimitris,

your formulation "...Pharo bytecode...and convert it to machine code..." is irritating to me insofar as "convert it to machine code" would suggest to me that a compiler is at work here. David's "executing Pharo byte-code" seems more understandable to me here.

werner


On 01/13/2016 02:22 PM, Dimitris Chloupis wrote:

I assume you have never read an introduction to C++ then :D

here is the final addition for the vm

(VM) is the only component that is different for each operating system.
The main purpose of the VM is to take Pharo bytecode that is generated
each time the user accepts a piece of code and convert it to machine code in
order to be executed, but also to generally handle low level
functionality like interpreting code, handling OS events (mouse and
keyboard), calling C libraries etc. Pharo 4 comes with the Cog VM, a very
fast JIT VM.

I think it's clear, precise and does not leave much room for confusion.
Personally I think it's very important for the absolute beginner to have
strong foundations of understanding the fundamentals of Pharo and not for
things to appear magical and "don't touch this".

On Wed, Jan 13, 2016 at 2:54 PM Sven Van Caekenberghe <s...@stfx.eu
<mailto:s...@stfx.eu>> wrote:


     > On 13 Jan 2016, at 13:42, Dimitris Chloupis
    <kilon.al...@gmail.com <mailto:kilon.al...@gmail.com>> wrote:
     >
     > I mentioned bytecode because I don't want the user to see bytecode
    at some point and say "What the hell is that". I want the reader to
    feel confident that he at least understands the basics of Pharo. Also,
    I have seen very brief explanations of bytecode in similar Python
    tutorials. Obviously I don't want to go any deeper than that because
    the user won't have to worry about the technical details on a daily
    basis anyway.
     >
     > I agree that I could add a bit more to the VM description, similar
    to what you posted. I am curious though, won't even the interpreter
    generate machine code in order to execute the code, or does it use
    existing machine code inside the VM binary?

    No, a classic interpreter does not 'generate' machine code, it is
    just a program that reads and executes byte codes in a loop, the
    interpreter 'is' machine code.
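
To illustrate the point about a classic interpreter, here is a minimal sketch in Python. The opcodes (PUSH, ADD, MUL, HALT) and the `interpret` function are invented for illustration and are not Pharo's actual bytecode set. The sketch only shows that the loop is an ordinary, already-built program that reads byte codes and acts on them, without ever emitting new machine code.

```python
# Hypothetical opcodes for illustration; not Pharo's real bytecode set.
PUSH, ADD, MUL, HALT = 0, 1, 2, 3


def interpret(bytecodes):
    """Execute a list of byte codes on a small operand stack."""
    stack = []
    pc = 0  # index of the next byte code to read
    while True:
        op = bytecodes[pc]
        pc += 1
        if op == PUSH:
            stack.append(bytecodes[pc])  # the operand follows the opcode
            pc += 1
        elif op == ADD:
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == MUL:
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == HALT:
            return stack.pop()
        else:
            raise ValueError(f"unknown opcode {op}")


# Byte codes for the expression (2 + 3) * 4
program = [PUSH, 2, PUSH, 3, ADD, PUSH, 4, MUL, HALT]
result = interpret(program)
```

Nothing in the loop produces code at run time; the only "machine code" involved is the interpreter itself.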

    No offence, but you see why I think it is important to not try to
    use or explain too many complex concepts in the 1st chapter.

    Learning to program is hard. It should first be done abstractly.
    Think about Scratch. The whole idea of Smalltalk is to create a
    world of interacting objects. (Even byte code is not a necessary
    concept at all, for example, in Pharo, you can compile (translate)
    to AST and execute that, I believe. There are Smalltalk
    implementations that compile directly to C or JavaScript). Hell,
    even 'compile' is not necessary, just 'accept'. See ?

     > On Wed, Jan 13, 2016 at 2:25 PM Sven Van Caekenberghe
    <s...@stfx.eu <mailto:s...@stfx.eu>> wrote:
     > Sounds about right.
     >
     > Now, I would swap 1 and 4, as the image is the most important
    abstraction.
     >
     > There is also a bit too much emphasis on (byte|source)code. This
    is already pretty technical (it assumes you know what compilation is
    and so on). But I understand it must be explained here, and you did
    it well.
     >
     > However, I would start by saying that the image is a snapshot of
    the object world in memory that is effectively a live Pharo system.
    It contains everything that is available and that exists in Pharo.
    This includes any objects that you created yourself, windows,
    browsers, open debuggers, executing processes, all meta objects as
    well as all representations of code.
     >
     > <sidenote>
     > The fact that there is a sources and changes file is an
    implementation artefact, not something fundamental. There are ideas
    to change this in the future (but you do not have to mention that).
     > </sidenote>
     >
     > Also, the VM not only executes code, it maintains the object
    world, which includes the ability to load and save it from and to an
    image. It creates a portable (cross platform) abstraction that
    isolates the image from the particular details of the underlying
    hardware and OS. In that role it implements the interface with the
    outside world. I would mention that second part before mentioning
    the code execution.
     >
     > The sentence "The purpose of the VM is to take Pharo bytecode that
    is generated each time the user accepts a piece of code and convert it
    to machine code in order to be executed." is not 100% correct. It is
    possible to execute the byte code without converting it. This is
    called interpretation. JIT is a faster technique that includes
    converting (some often used) byte code to machine code and caching that.
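
The difference between interpretation and JIT described here can be sketched roughly as follows. Everything in this Python example is an assumption made for illustration: the instruction format, the `HOT_THRESHOLD` value and all names are invented, and a Python closure stands in for the machine code a real JIT such as Cog would emit.

```python
# Assumptions: instruction format, HOT_THRESHOLD and all names are invented.
# A Python closure stands in for the machine code a real JIT would emit.
HOT_THRESHOLD = 2


def interpret(code, arg):
    """Execute the instructions one by one, converting nothing."""
    acc = arg
    for op, operand in code:
        if op == "add":
            acc += operand
        elif op == "mul":
            acc *= operand
        else:
            raise ValueError(f"unknown instruction {op}")
    return acc


def translate(code):
    """Convert the whole instruction list once into a callable function."""
    steps = []
    for op, operand in code:
        if op == "add":
            steps.append(lambda acc, n=operand: acc + n)
        elif op == "mul":
            steps.append(lambda acc, n=operand: acc * n)
        else:
            raise ValueError(f"unknown instruction {op}")

    def compiled(arg):
        acc = arg
        for step in steps:
            acc = step(acc)
        return acc

    return compiled


class Method:
    def __init__(self, code):
        self.code = code
        self.calls = 0      # how often the method ran before being translated
        self.cached = None  # filled in once the method is used often enough

    def run(self, arg):
        if self.cached is not None:
            return self.cached(arg)  # reuse the cached translation
        self.calls += 1
        if self.calls >= HOT_THRESHOLD:
            self.cached = translate(self.code)  # translate once, then cache
            return self.cached(arg)
        return interpret(self.code, arg)  # rarely used: just interpret


double_plus_one = Method([("mul", 2), ("add", 1)])
results = [double_plus_one.run(n) for n in (1, 2, 3)]
```

The first call is interpreted, the second triggers the one-time translation, and the third reuses the cached result; all three give the same answers as plain interpretation would.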
     >
     > I hope this helps (it is hard to write a 'definitive explanation'
    as there are so many aspects to this and it depends on the
    context/audience).
     >
     > > On 13 Jan 2016, at 12:58, Dimitris Chloupis
    <kilon.al...@gmail.com <mailto:kilon.al...@gmail.com>> wrote:
     > >
     > > So I am correct that the image does not store the source code,
    and that the source code is stored in sources and changes. The only
    difference is that the objects have a source variable that points to
    the right place for finding the source code.
     > >
     > > This is the final text; if you find anything incorrect, please
    correct me
     > >
     > > ---------------
     > >
     > > 1. The virtual machine (VM) is the only component that is
    different for each operating system. The purpose of the VM is to
    take Pharo bytecode that is generated each time the user accepts a piece
    of code and convert it to machine code in order to be executed.
    Pharo 4 comes with the Cog VM, a very fast JIT VM. The VM executable
    is named:
     > >
     > > • Pharo.exe for Windows;
     > > • pharo for Linux; and
     > > • Pharo for OSX (inside a package also named Pharo.app).
     > >
     > > The other components below are portable across operating
    systems, and can be copied and run on any appropriate virtual machine.
     > >
     > > 2. The sources file contains source code for parts of Pharo
    that don’t change frequently. The sources file is important because the
    image file format stores only the bytecode of live objects and not
    their source code. Typically a new sources file is generated once
    per major release of Pharo. For Pharo 4.0, this file is named
    PharoV40.sources.
     > >
     > > 3. The changes file logs all source code modifications since
    the .sources file was generated. This facilitates a per method
    history for diffs or reverting. That means that even if you don't
    manage to save the image file on a crash, or you just forgot, you can
    recover your changes from this file. Each release provides a near
    empty file named for the release, for example Pharo4.0.changes.
     > >
     > > 4. The image file provides a frozen in time snapshot of a
    running Pharo system. This is the file where the Pharo bytecode is
    stored and as such it's a cross platform format. This is the heart of
    Pharo, containing the live state of all objects in the system
    (including classes and methods, since they are objects too). The
    file is named for the release (like Pharo4.0.image).
     > >
     > > The .image and .changes files provided by a Pharo release are
    the starting point for a live environment that you adapt to your
    needs. Essentially the image file contains the compiler of the
    language (not the VM), the language parser, the IDE tools and many
    libraries, and acts a bit like a virtual Operating System that runs
    on top of a Virtual Machine (VM), similarly to ISO files.
     > >
     > > As you work in Pharo, these files are modified, so you need to
    make sure that they are writable. The .image and .changes files are
    intimately linked and should always be kept together, with matching
    base filenames. Never edit them directly with a text editor, as the
    .image file holds your live object runtime memory, which indexes into
    the .changes file for the source. It is a good idea to keep a
    backup copy of the downloaded .image and .changes files so you can
    always start from a fresh image and reload your code. However, the
    most efficient way of backing up code is to use a version control
    system, which provides an easier and more powerful way to back up and
    track your changes.
     > >
     > > The four main component files above can be placed in the same
    directory, although it’s also possible to put the Virtual Machine
    and sources file in a separate directory where everyone has
    read-only access to them.
     > >
     > > If more than one image file is present in the same directory,
    Pharo will prompt you to choose the image file you want to load.
     > >
     > > Do whatever works best for your style of working and your
    operating system.
     > >
     > >
     > >
     > >
     > >
     > > On Wed, Jan 13, 2016 at 12:13 PM Sven Van Caekenberghe
    <s...@stfx.eu <mailto:s...@stfx.eu>> wrote:
     > >
     > > > On 13 Jan 2016, at 10:57, Dimitris Chloupis
    <kilon.al...@gmail.com <mailto:kilon.al...@gmail.com>> wrote:
     > > >
     > > > I was adding a short description of the sources file to the
    UPBE. I always thought that the sources file is the file that
    contains the source code of the image because the image file itself
    stores only the bytecode.
     > > >
     > > > However, it just came to my attention that the sources file
    does not contain code that is recently installed in the image.
     > > >
     > > > So how exactly does the sources file work, and what is it?
     > >
     > > The main perspective is from the object point of view: methods
    are just objects like everything else. In order to be executable
    they know their byte codes (which might be JIT compiled on
    execution, but that is an implementation detail) and they know their
    source code.
     > >
     > > Today we would probably just store the source code strings in
    the image (maybe compressed) as memory is pretty cheap. But way back
    when Smalltalk started, that was not the case. So they decided to
    map the source code out to files.
     > >
     > > So method source code is a magic string (RemoteString) that
    points to some position in a file. There are 2 files in use: the
    sources file and the changes file.
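
The idea of a method's source being a pointer into a file rather than the text itself can be sketched in Python as below. This is an illustration under stated assumptions, not how RemoteString is actually implemented: the `SourcePointer` class, the `append_source` helper, the offset-plus-length layout and the demo file names are all hypothetical.

```python
import os
import tempfile
from dataclasses import dataclass


@dataclass(frozen=True)
class SourcePointer:
    """Hypothetical stand-in for a RemoteString: records where the text lives."""
    path: str    # which file holds the text (sources or changes)
    offset: int  # byte position where the method source starts
    length: int  # number of bytes to read

    def text(self):
        # The text is fetched from the file only when somebody asks for it.
        with open(self.path, "rb") as f:
            f.seek(self.offset)
            return f.read(self.length).decode("utf-8")


def append_source(path, source):
    """Append source text to a file and return a pointer to it."""
    data = source.encode("utf-8")
    with open(path, "ab") as f:
        offset = f.seek(0, os.SEEK_END)  # current end of the file
        f.write(data)
    return SourcePointer(path, offset, len(data))


workdir = tempfile.mkdtemp()
sources = os.path.join(workdir, "demo.sources")  # hypothetical file names
changes = os.path.join(workdir, "demo.changes")

released = append_source(sources, "double: x\n    ^ x * 2")
edited = append_source(changes, "double: x\n    ^ x + x")
```

Here the in-memory object stays small (a path and two integers), which matches the motivation given above: the source text stays on disk and is looked up on demand.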
     > >
     > > The sources file is a kind of snapshot of the source code of
    all methods at the point of release of a major new version. That is
    why there is a Vxy in its name. The sources file never changes once
    created or renewed (a process called generating the sources, see
    PharoSourcesCondenser).
     > >
     > > While developing and creating new versions of methods, the new
    source code is appended to another file called the changes file,
    much like a transaction log. This is also a safety mechanism to
    recover 'lost' changes.
     > >
     > > The changes file can contain multiple versions of a method.
    This can be reduced in size using a process called condensing the
    changes, see PharoChangesCondenser.
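
A rough Python sketch of an append-only log that holds several versions of a method, and of condensing it, is given below. It is not the algorithm used by PharoChangesCondenser; the `condense` function, the in-memory list standing in for the changes file, and the `Counter` example methods are assumptions made for illustration.

```python
def condense(log):
    """Keep only the most recent entry for each method, in log order."""
    latest = {}
    for index, (method, _source) in enumerate(log):
        latest[method] = index  # later entries overwrite earlier ones
    return [log[i] for i in sorted(latest.values())]


# Hypothetical log: every accepted version is appended, nothing is overwritten.
changes = [
    ("Counter>>increment", "increment\n    count := count + 1"),
    ("Counter>>reset", "reset\n    count := 0"),
    ("Counter>>increment", "increment\n    ^ count := count + 1"),
]

# All recorded versions of one method, oldest first.
versions = [source for method, source in changes if method == "Counter>>increment"]

condensed = condense(changes)
```

Before condensing, the older version of `increment` is still recoverable from the log; after condensing, only the latest version of each method remains.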
     > >
     > > On a new release, the changes file will be (almost) empty.
     > >
     > > HTH,
     > >
     > > Sven
     > >
     > >
     > >
     >
     >
