Thanks, Ben, for that info about long file names on Windows; it matters when discussing file formats for Smalltalk packages and Git.
However, your link only describes how to get a meaningful diff display for zip files stored as-is inside a Git repository, not how to zip and unzip automagically on the way in and out of Git storage. Too bad :(

Searching around that topic, though, shows that Git will do a binary delta on a versioned zip file [1], which means storing zip files inside Git is not that bad.

[1] http://stackoverflow.com/questions/9973151/does-git-smartly-handle-a-zip-archive-in-which-only-one-of-the-files-changes-reg

Thierry

________________________________________
From: Pharo-dev [[email protected]] on behalf of [email protected] [[email protected]]
Sent: Sunday, 9 March 2014 16:43
To: [email protected]; Discusses Development of Pharo; Squeak Virtual Machine Development Discussion
Subject: [Pharo-dev] Git & MS Windows path length restriction

I started looking into Pharo Case 13030 "Many tests failing in MetacelloValidation Job on Jenkins" and, even before getting into it, I hit a stumbling block on Windows 7 with its path name length restriction of 259 characters. I managed to isolate the problem as follows...

    MetacelloPlatform current
        downloadFile: 'https://github.com/dalehenrich/metacello-work/zipball/master'
        to: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
    zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
    zip extractAllTo: 'C:\tmp\unzippedByPharo' asFileReference.

which produces the error

"FileDoesNotExist: File @ C:\tmp\unzippedByPharo\dalehenrich-metacello-work-96e07b1\repository\Metacello-TestsMCB.package\MetacelloScriptingStandardTestHarness.class\instance\validateExpectedConfigurationClassName.expectedConfigurationVersion.expectedConfigurationRepository.expectedBaselineClassName.expectedBaselineVersion.expectedBaselineRepository..st"

where the #size of that filename is 354 characters.
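A pre-extraction check along these lines could flag the offending members before extractAllTo: fails. This is only a sketch: it reuses the ZipArchive calls shown above, assumes the archive has already been downloaded to the same location, and targetDir and maxPath are illustrative names rather than part of any Pharo API.

```smalltalk
"Sketch only: list zip members whose extracted path would exceed the limit.
 targetDir and maxPath are hypothetical names chosen for this example."
| zip targetDir maxPath tooLong |
zip := ZipArchive new readFrom: 'C:\tmp\github-dalehenrichmetacelloworkmaster.zip'.
targetDir := 'C:\tmp\unzippedByPharo\'.
maxPath := 259.

"Keep only the members whose full target path would be too long."
tooLong := zip members select: [ :member |
	targetDir size + member fileName size > maxPath ].

"Report the resulting path length next to each offending member name."
tooLong do: [ :member |
	Transcript crShow: (targetDir size + member fileName size) printString , ' --> ' , member fileName ].
```

An empty tooLong collection would suggest the archive can be extracted to that directory without hitting the limit; a shorter targetDir reduces the number of offenders.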
Trying to drill into github-dalehenrichmetacelloworkmaster.zip from Windows Explorer produces the error "The Compressed (zipped) file is invalid." However, using 7Zip [1] I can extract the archive so that _all_ files appear in the hierarchy. So it seems 259 is not a hard limit; indeed, that limit is imposed by the Windows Shell, since NTFS can have a path length of ~32K [2].

Operation of 7Zip was verified by:
* From Pharo: zip members size --> 6007
* From cmd.exe: dir /b /s > dir.txt
  Open dir.txt in Notepad++ --> 6007 (after the dir.txt line is removed)
Also, Windows Explorer <Properties> reports (5277 files + 730 folders) = 6007

[1] www.7-zip.org
[2] http://stackoverflow.com/questions/265769/maximum-filename-length-in-ntfs-windows-xp-and-windows-vista

Now, presuming it's reasonable for the working directory to have 30 characters, using...

    histogram := (zip members collect: [ :member |
        | mySize |
        mySize := member fileName size.
        (mySize >= 230) ifTrue: [
            Transcript crShow: mySize printString , '-->' , member fileName printString ].
        mySize ]) asBag.

produces...
306-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-TestsMCB.package/MetacelloScriptingStandardTestHarness.class/instance/validateExpectedConfigurationClassName.expectedConfigurationVersion.expectedConfigurationRepository.expectedBaselineClassName.expectedBaselineVersion.expectedBaselineRepository..st'
252-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/createBaseline.for.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
257-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/modifyBaselineVersionIn.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
259-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/class/modifyVersion.section.for.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups..st'
262-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/addSection.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
265-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/modifySection.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'
278-->'dalehenrich-metacello-work-96e07b1/repository/Metacello-ToolBox.package/MetacelloToolBox.class/instance/modifySection.sectionIndex.repository.requiredProjects.packages.dependencies.includes.files.repositories.preLoadDoIts.postLoadDoIts.supplyingAnswers.groups.versionSpecsDo..st'

So...
* Are there many users of Smalltalk/git on Windows?
* What is the design plan for dealing with long path names?

I guess the change to using the Unicode functions to get the 32K path name length would need to happen in the VM.

I wonder if, instead of Smalltalk-git working with individual files in the file system, it might work directly from a zip file. It seems git can be configured to accept zip files and unpack them before pushing into its repository [3]. Some benefits of this:
* Avoid this problem on Windows.
* Accessing one stream when loading from a git repository might be faster than accessing many individual files.
* Git sees one file per method, but users see one file per package.
* Git-zip-files might act more like familiar mcz files - that could be copied around the same - or even provide some convergence if an mcz held a file per method rather than a single source.st file.

[3] http://tante.cc/2010/06/23/managing-zip-based-file-formats-in-git/

Interested in your thoughts.

cheers -ben
