Thanks, Ben, for that info about long file names in Windows; it matters when
discussing file formats for Smalltalk packages and Git.

However, your link only describes how to get a meaningful diff display when
dealing with zip files stored as-is inside a Git repository, not how to
automagically zip and unzip files on their way in and out of Git storage. Too
bad :(. Searching around that topic, though, shows that Git will do a binary
delta on a versioned zip file [1], which means storing zip files inside Git is
not that bad.

[1] 
http://stackoverflow.com/questions/9973151/does-git-smartly-handle-a-zip-archive-in-which-only-one-of-the-files-changes-reg

Thierry
________________________________________
From: Pharo-dev [[email protected]] on behalf of
[email protected] [[email protected]]
Sent: Sunday, 9 March 2014 16:43
To: [email protected]; Discusses Development of Pharo; Squeak Virtual
Machine Development Discussion
Subject: [Pharo-dev] Git & MS Windows path length restriction

I started looking into Pharo Case 13030 "Many tests failing in
MetacelloValidation Job on Jenkins"
and even before getting into it, I've hit a stumbling block on Windows 7
with its pathName length restriction of 259 characters.

I managed to isolate the problem as follows...
    MetacelloPlatform current
        downloadFile: 'https://github.com/dalehenrich/metacello-work/zipball/master'
        to: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
    zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
    zip extractAllTo: 'C:\tmp\unzippedByPharo' asFileReference.
which produces the error "FileDoesNotExist: File @
C:\tmp\unzippedByPharo\dalehenrich-metacello-work-96e07b1\repository\Metacello-TestsMCB.package\MetacelloScriptingStandardTestHarness.class\instance\validateExpectedConfigurationClassName.expectedConfigurationVersion.expectedConfigurationRepository.expectedBaselineClassName.expectedBaselineVersion.expectedBaselineRepository..st"

where the #size of that filename is 354 characters.

Trying to drill into github-dalehenrichmetacelloworkmaster.zip from
Windows Explorer produces the error "The Compressed (zipped) file is invalid."
However, using 7Zip [1] I can extract the archive so that _all_ files appear
in the hierarchy, so it seems 259 is not a hard limit. Indeed, that
limit is imposed by the Windows Shell, since NTFS can have a path length
of ~32K [2].  Operation of 7Zip was verified by:
 * From Pharo:     zip members size --> 6007
 * From cmd.exe:   dir /b /s > dir.txt
                   Open dir.txt in Notepad++ --> 6007 lines (after the
                   dir.txt line is removed)
                   Also, Windows Explorer <Properties> reports
                   (5277 files + 730 folders) = 6007

[1] www.7-zip.org
[2]
http://stackoverflow.com/questions/265769/maximum-filename-length-in-ntfs-windows-xp-and-windows-vista

Now presuming it's reasonable for the working directory to take 30
characters, which leaves 229 of the 259 for the path inside the archive,
using...
    histogram := (zip members collect: [ :member | | mySize |
        mySize := member fileName size.
        mySize >= 230 ifTrue: [
            Transcript crShow: mySize printString , '-->' , member fileName printString ].
        mySize ]) asBag.
produces...
306-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-TestsMCB.package/MetacelloScriptingStandardTestHarness.class/instance/validateExpectedConfigurationClassName.expectedConfigurationVersion.expectedConfigurationRepository.expectedBaselineClassName.expectedBaselineVersion.expectedBaselineRepository..st'
252-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/createBaseline.for.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
257-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/modifyBaselineVersionIn.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
259-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/modifyVersion.section.for.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
262-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/addSection.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
265-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/modifySection.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
278-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/modifySection.sectionIndex.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
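
The check above fixes the working directory at 30 characters. As a rough
sketch only, the same test can be run against the actual extraction directory
before calling extractAllTo:. The variable names targetDir, limit and tooLong
are made up for illustration; the zip path and directory are the ones used
earlier in this mail, and the 259 figure is the shell limit mentioned above.

```smalltalk
"Sketch: list the archive members that would not fit under the
 259-character limit once extracted into targetDir.
 targetDir, limit and tooLong are illustrative names only."
| zip targetDir limit tooLong |
zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
targetDir := 'C:\tmp\unzippedByPharo'.
limit := 259.

"Full path length = target directory + one separator + member name."
tooLong := zip members select: [ :member |
    targetDir size + 1 + member fileName size > limit ].

"Show the worst offenders first."
(tooLong asSortedCollection: [ :a :b | a fileName size >= b fileName size ])
    do: [ :member |
        Transcript crShow:
            (targetDir size + 1 + member fileName size) printString ,
            ' --> ' , member fileName ].
```

If tooLong is empty, extraction into that directory should stay under the
limit; otherwise a shorter target directory might be enough for the borderline
cases, though not for the longest names.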

So...
* Are there many users of Smalltalk/git on Windows?
* What is the design plan for dealing with the long path names?
I guess the change to using the Unicode functions to get the 32K
pathName length would need to happen in the VM.
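
For reference, the Unicode variants of the Windows file functions only lift
the limit when the path is given in extended-length form, that is an absolute
path with backslashes only and the prefix \\?\ (or \\?\UNC\ for network
shares). A minimal sketch of that string conversion follows; toExtended is a
made-up name, and whether the VM file primitives would pass such a path
through unchanged is exactly the open question, so this shows the path format
and not a working fix.

```smalltalk
"Sketch: turn an absolute Windows path into its extended-length form.
 toExtended is an illustrative name. Backslash is not an escape
 character in Smalltalk strings, so '\\?\' is the literal prefix."
| toExtended |
toExtended := [ :absolutePath | | path |
    "Extended-length paths are not normalised, so use backslashes only."
    path := absolutePath copyReplaceAll: '/' with: '\'.
    (path beginsWith: '\\?\')
        ifTrue: [ path ]
        ifFalse: [
            (path beginsWith: '\\')
                ifTrue: [ '\\?\UNC\' , (path allButFirst: 2) ]
                ifFalse: [ '\\?\' , path ] ] ].

toExtended value: 'C:\tmp\unzippedByPharo'.
```

Because Windows does no normalisation on such paths, relative segments such as
.. would have to be resolved before the prefix is added.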

I wonder if, instead of Smalltalk-git working with individual files in
the file system, it might work directly from a zip file.  It seems
git can be configured to accept zip files and unpack them before pushing
into its repository [3].  Some benefits of this:
* Avoid this problem on Windows
* Accessing one stream when loading from a git repository might be
faster than accessing many individual files
* Git sees one file per method, but users see one file per package.
* Git-zip-files might act more like familiar mcz files - they could be
copied around in the same way - or even provide some convergence if an mcz
held a file per method rather than a single source.st file.

[3] http://tante.cc/2010/06/23/managing-zip-based-file-formats-in-git/
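
To illustrate the "work directly from a zip file" idea, here is a rough
sketch that reads one method source straight out of the archive without
touching the file system, so no long path is ever created. It assumes the
archive downloaded earlier in this mail and simply picks the first member
whose name ends in .st; it is not how any existing Smalltalk-git tool works.

```smalltalk
"Sketch: read a method source directly from the archive, no extraction.
 Picks the first .st member found; member and source are illustrative names."
| zip member source |
zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
member := zip members
    detect: [ :each | each fileName endsWith: '.st' ]
    ifNone: [ nil ].
member ifNotNil: [
    source := member contents.
    Transcript crShow: member fileName; crShow: source ].
zip close.
```

The member name is only ever a string inside the image here, so its length is
not subject to the Windows path restriction.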

Interested in your thoughts.
cheers -ben

