On 5 Feb 2016, at 14:33, Thierry Goubier wrote:

> On 05/02/2016 11:33, Christophe Demarey wrote:
>> Hi Thierry,
>>
>> Just some thoughts I wanted to share:
>>
>> On 3 Feb 2016, at 10:18, Thierry Goubier wrote:
>>
>>> I went through all the different possible file formats,
>>> class-based, package-based, method-based, log metadata and the
>>> like, and I concluded that:
>>>
>>> - the method-based format is as good as any other. Even better,
>>>   since it has a spec (Cypress).
>>
>> I see cons that a class (or package) format would not have. A
>> one-file-per-method approach generates plenty of small files. In
>> general, file systems do not like that:
>> - it may consume a lot of space. I remember I had a Java/Maven
>>   project with a lot of small files and I filled up the inode
>>   table on my Unix system.
>> - you generate long paths. Long paths are not user-friendly and
>>   some OSes have restrictions on path length.
>
> The method-based structure of filetree is very close to how code is navigated
> in Smalltalk browsers: one method at a time, with a package/class/protocol
> hierarchical layering on top. The one file per class / one file per package
> is a reference to the base unit of C / C++.
I fully agree with that. As we have a lot of (small) methods, we will have
a lot of small files, and some file systems do not like that. I remember
huge slowdowns because of that. It is good to keep that in mind.

> And no OS in general use has path restrictions that matter. OK, the Windows
> VM has issues, but this is a VM bug, not a filesystem issue.

The Windows command line has (had) this limitation.

>
>> By adopting a file-per-method approach, you also increase the
>> distance to get a common script format for Smalltalk. Here I mean a
>> file where you could define classes, methods, and run arbitrary
>> portions of Smalltalk code.
>
> This format is called fileout, and already exists.

I mean something like a Python script:
http://archive.stsci.edu/vo/python_examples.html

> All you describe is also available in the FileTree/Cypress format and is
> technically better specified.
>
>>> - method-based format allows for method-history queries on the
>>>   git/VCS history (as well as class-based / package-based queries).
>>> - the tree structure on GitHub or Bitbucket is quite convenient (and
>>>   browsable) to the point one could edit a package directly in it (I
>>>   do when I need to do a quick fix).
>>
>> but is a pain to navigate: too many clicks to effectively browse a
>> method's content.
>
> You must hate Nautilus, then, since this is Nautilus's approach as well. Just
> count the number of clicks you do in Nautilus, and the number in GitHub.

But we have Spotter! (I just miss an exact search, to avoid clicking and
scrolling too much.)

> If we remove the instance sub-directory and write instance-side methods just
> below the class name in filetree, then you'll get the exact same number of
> clicks to reach a method as in Nautilus.

It would be a good idea.

> Fun fact: if you do that with the Mac Finder in NeXT mode over a filetree
> repository (Miller columns), you'll see that it almost looks like Nautilus's
> top panes.
>
>> I do not know what would be the best format, but I think we need to
>> take care not to generate too many files / folders. The file system
>> and the VCS will appreciate it too.
>
> I'd say, overall, what we need to remember is that we produce a lot fewer
> lines of code than other languages, and that we shouldn't over-optimize.
>
> I'll probably look into optimising FileTree-like writing in the future; I
> wasn't that good at planning for it and it shows in specific cases.

That is actually the problem: we generate a lot of small files. I do not
have numbers, but I think it would be good to stress a file system a bit to
see where we hit the barrier, and compare with the Pharo code base. On the
git side, I'm not aware of any limitation regarding small files.
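To get some first numbers, something like the following Python sketch could
be a starting point. It is only an illustration: the directory layout loosely
imitates a filetree repository, and the names (write_small_files, tree_stats,
Pkg.package, the class and method counts) are made up for the example, not
taken from any existing tool. It writes one tiny file per "method", times the
writing, and reports the file count, total size and longest relative path.

```python
import os
import tempfile
import time


def write_small_files(root, n_classes, methods_per_class,
                      body="foo\n    ^ 42\n"):
    """Write n_classes * methods_per_class tiny files, one per 'method'.

    The layout (package/class/instance/method) is an assumption that
    only loosely imitates a filetree repository.
    """
    count = 0
    for c in range(n_classes):
        class_dir = os.path.join(root, "Pkg.package",
                                 f"Class{c}.class", "instance")
        os.makedirs(class_dir, exist_ok=True)
        for m in range(methods_per_class):
            path = os.path.join(class_dir, f"method{m}.st")
            # newline="\n" keeps the file size identical on every platform
            with open(path, "w", newline="\n") as f:
                f.write(body)
            count += 1
    return count


def tree_stats(root):
    """Return file count, total bytes and longest relative path length."""
    files = 0
    total_bytes = 0
    longest = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            files += 1
            total_bytes += os.path.getsize(path)
            longest = max(longest, len(os.path.relpath(path, root)))
    return {"files": files, "bytes": total_bytes, "longest_path": longest}


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        start = time.perf_counter()
        written = write_small_files(tmp, n_classes=100, methods_per_class=20)
        elapsed = time.perf_counter() - start
        print(f"wrote {written} files in {elapsed:.2f}s: {tree_stats(tmp)}")
```

Raising n_classes and methods_per_class until the run slows down or fails
would give a rough idea of where a given file system starts to struggle.
Running tree_stats on an actual export of the Pharo code base would give
the figures to compare against.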
