Re: [Pharo-dev] Contributing to Pharo

Thierry Goubier Fri, 05 Feb 2016 06:18:49 -0800

On 05/02/2016 14:55, Christophe Demarey wrote:


On 5 Feb 2016 at 14:33, Thierry Goubier wrote:

On 05/02/2016 11:33, Christophe Demarey wrote:

Hi Thierry,

Just some thoughts I wanted to share:

On 3 Feb 2016 at 10:18, Thierry Goubier wrote:

I went through all the different possible file formats,
class-based, package-based, method-based, log metadata and the
like, and I concluded that:

- the method-based format is as good as any other. Even better,
  since it has a spec (Cypress).


I see cons that a class (or package) format would not have. The
one-file-per-method approach generates plenty of small files.
In general, file systems do not like that:
- it may consume a lot of space. I remember I had a Java/Maven
  project with a lot of small files and I filled up the inode
  table on my Unix system.
- you generate long paths. Long paths are not user-friendly and
  some OSes have restrictions on path length.


The method based structure of filetree is very close to how code is
navigated in Smalltalk browsers: one method at a time, with a
package/class/protocol hierarchical layering on top. The one file
per class / one file per package is a reference to the base unit of
C / C++.


I fully agree with that. As we have a lot of (small) methods, we will
have a lot of small files, and some file systems do not like that. I
remember huge slow-downs because of that. It is good to keep that in
mind.

The problem is linked with writing too many files. Because of possible
uncertainty about the on-disk state, FileTree erases the complete
package directory and then rewrites everything, letting the VCS decide
what has really changed. This is doubly slow, because it hits both the
filesystem and the VCS.

I told Dale I'd look into a diff-based writer; it should improve
things a lot.
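
A minimal sketch of what such a diff-based writer could look like, in
Python for illustration only. It is not FileTree's implementation: the
function name write_package_diff, the dictionary of relative paths, and
the sample file names are all assumptions made for this example. It
compares the desired package content with what is on disk, rewrites
only the files that differ, and deletes only the files that are no
longer wanted.

```python
import os
import tempfile


def write_package_diff(package_dir, desired):
    """Bring package_dir in line with `desired` (relative path -> source text).

    Hypothetical helper: only files whose content differs are rewritten and
    only files that are no longer wanted are deleted.
    Returns three lists: (written, deleted, unchanged).
    """
    written, deleted, unchanged = [], [], []

    # Pass 1: collect the relative paths of the files already on disk.
    existing = set()
    for root, _dirs, files in os.walk(package_dir):
        for name in files:
            full = os.path.join(root, name)
            rel = os.path.relpath(full, package_dir).replace(os.sep, "/")
            existing.add(rel)

    # Pass 2: write new or changed files, skip identical ones.
    for rel, text in sorted(desired.items()):
        full = os.path.join(package_dir, *rel.split("/"))
        if rel in existing:
            with open(full, encoding="utf-8", newline="") as f:
                if f.read() == text:
                    unchanged.append(rel)
                    continue
        os.makedirs(os.path.dirname(full), exist_ok=True)
        with open(full, "w", encoding="utf-8", newline="") as f:
            f.write(text)
        written.append(rel)

    # Pass 3: remove files on disk that are no longer part of the package.
    # Empty directories left behind are not cleaned up in this sketch.
    for rel in sorted(existing - set(desired)):
        os.remove(os.path.join(package_dir, *rel.split("/")))
        deleted.append(rel)

    return written, deleted, unchanged


# Usage: saving the same content twice writes nothing the second time.
demo_dir = tempfile.mkdtemp()
sample = {"Foo.class/instance/bar.st": "bar\n\t^ 1\n"}
write_package_diff(demo_dir, sample)
print(write_package_diff(demo_dir, sample))
```

The trade-off is that the writer has to read the existing files to
compare them, so it only pays off when most of the package is
unchanged between two saves.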

And no OS in general use has path restrictions that matter. OK, the
Windows VM has issues, but this is a VM bug, not a filesystem
issue.


Windows command-line has (had) this limitation.


Good to know.

By adopting a file-per-method approach, you also move further away
from a common script format for Smalltalk. Here I mean
a file where you could define classes and methods, and run
arbitrary portions of Smalltalk code.


This format is called fileout, and it already exists.


I mean something like a python script:
http://archive.stsci.edu/vo/python_examples.html

I'm not entirely keen on going that way. I prefer declarative formats
for storing packages. And I still think that the fileout format is that
(a sequence of scripts to execute, separated by !!).

All you describe is also available in the FileTree/Cypress format
and is technically better specified.

- the method-based format allows for method-history queries on the
  git/VCS history (as well as class-based / package-based
  queries).
- the tree structure on GitHub or Bitbucket is quite convenient
  (and browsable), to the point that one could edit a package
  directly in it (I do when I need to do a quick fix).


but it is a pain to navigate: too many clicks to actually browse
a method's content.


You must hate Nautilus, then, since this is Nautilus's approach as
well. Just count the number of clicks you make in Nautilus, and the
number in GitHub.


but we have Spotter! (I just miss an exact search, so as not to click
and scroll too much)


Then you want Spotter on the web :)

If we remove the instance sub-directory and write instance-side
methods just below the class name in filetree, then you'll get the
exact same number of clicks to reach a method as in Nautilus.


That would be a good idea.


Why not.

Fun fact: if you do that with the Mac Finder in NeXT mode (Miller
columns) over a filetree repository, you'll see that it looks almost
like Nautilus's top panes.

I do not know what the best format would be, but I think we need
to take care not to generate too many files / folders. The file
system and the VCS will appreciate it too.


I'd say, overall, what we need to remember is that we produce a
lot fewer lines of code than other languages, and that we shouldn't
over-optimize.

I'll probably look into optimising FileTree-like writing in the
future; I wasn't that good at planning for it, and it shows in
specific cases.


That is actually the problem: we generate a lot of small files. I do
not have numbers, but I think it would be good to stress a file
system a bit to see where we hit the barrier, and compare that with
the Pharo code base. On the git side, I'm not aware of any limitation
regarding small files.

I'm sure the numbers are already available. And, as I said above, you
may be measuring FileTree implementation limitations and nothing
related to filesystem issues (or git issues).
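
For anyone who wants to collect such numbers, here is a rough sketch
of a small-file stress test, in Python for illustration. The function
name time_small_file_writes, the 200-byte payload, and the grouping of
100 files per directory are assumptions chosen for this example, not
measurements of the Pharo code base. It times raw file creation only,
so it says nothing about FileTree or git overhead, and results will
vary with the filesystem and the OS cache.

```python
import os
import tempfile
import time


def time_small_file_writes(count, size=200, per_dir=100):
    """Create `count` files of `size` bytes in a fresh temporary directory.

    Files are spread over sub-directories of `per_dir` files each, loosely
    imitating one directory per class. Returns (elapsed_seconds, created),
    where `created` is the number of files found on disk afterwards.
    """
    payload = b"x" * size
    with tempfile.TemporaryDirectory() as root:
        start = time.perf_counter()
        for i in range(count):
            sub = os.path.join(root, "class%04d" % (i // per_dir))
            os.makedirs(sub, exist_ok=True)
            with open(os.path.join(sub, "method%d.st" % i), "wb") as f:
                f.write(payload)
        elapsed = time.perf_counter() - start
        # Count the files before the temporary directory is removed.
        created = sum(len(files) for _r, _d, files in os.walk(root))
    return elapsed, created


# Usage: raise the count step by step to see where the time stops
# growing roughly linearly on a given filesystem.
for n in (100, 1000):
    seconds, created = time_small_file_writes(n)
    print(n, created, round(seconds, 4))
```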


Thierry
