We've been discussing for two months now how to migrate the Hg repository over to SVN. There have been several suggestions for how the CWSs and the complete revision history could be migrated, but little progress has been made. Either the proposals didn't work, or no volunteers stepped forward to implement them.

The alternative proposal was to just check in the tip of the trunk, without history, and then migrate the Hg repository to Apache-Extras.org, where Hg is supported. I've made some progress on this proposal. Here's what I did. I'd like some review, to make sure I didn't screw anything up. I am neither an Hg nor an SVN expert. But I do have a big hard drive.

I used the Subversion command-line client, version 1.6.17-SlikSvn-tag-1.6.17@1130896-WIN32.

I first brought down OOo, both the trunk and the language stuff, into separate directories:

    hg clone http://hg.services.openoffice.org/OOO340
    hg clone http://hg.services.openoffice.org/master_l10n/OOO340/

I then moved these into a common directory structure, as Ingrid had earlier suggested:

    ooo/trunk/core -- all the OOO340 stuff
    ooo/trunk/l10n -- all the language stuff

I removed the .hg directories before proceeding, so I had a clean local copy.

I then created a local SVN repository, enabled auto-props to get the proper EOL treatment, and imported the project:

    svn import c:\merged file:///c:/svn-repo/ -m "initial import"

During the local svn import I received error messages:

    "svn: Inconsistent line ending style"

This typically indicates that a text file has a mix of EOL styles (DOS/UNIX). But I found some cases where this was not true, and where the problem appeared instead to be related to an unsupported encoding. For example, SVN does not seem to support UTF-16 encodings.

I received this "Inconsistent line ending style" error on the following files:

    ooo/trunk/core/dictionaries/de_DE/README_hyph_de_DE.txt
    ooo/trunk/core/dictionaries/de_CH/README_hyph_de_CH.txt
    ooo/trunk/core/dictionaries/de_AT/README_hyph_de_AT.txt
    ooo/trunk/core/gettext/gettext-0.18.1.1.patch
    ooo/trunk/core/apache-commons/patches/codec.patch
    ooo/trunk/core/libcroco/libcrco-0.6.2.patch
    ooo/trunk/core/testautomation/writer/optional/input/import/mactext.txt
    ooo/trunk/core/graphite/graphite-2.3.1.patch
    ooo/trunk/core/hwpfilter/source/hwpeq.cpp (some weird non-ASCII text in file, should review)
    ooo/trunk/core/solenv/bin/cwstouched.pl (should review)
    ooo/trunk/core/readlicense_oo/html/THIRDPARTYLICENSEREAMDE.html
    ooo/trunk/core/writerfilter/source/doctok/escher.html
    ooo/trunk/core/writerfilter/source/odiapi/qname/resource/office2003/WordprocessingML Schemas/xsdlib.xsd (convert from UTF-16 to UTF-8)
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/body.xsl
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/style_mapping_css.xsl
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/style_collector_css.xsl
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table.xsl
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table_cells.xsl
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table_columns.xsl
    ooo/trunk/core/filter/source/xslt/odf2xhtml/export/common/styles/table/table_rows.xsl

In each case, the error aborted the import, which then had to be restarted from the beginning. So finding all of these problem files was a slow process.

Possible solutions include adding them as binary (not text) files, or editing them (with dos2unix, for example) to make their EOL style consistent. I did the latter.

Note: any other approach to migrating Hg to SVN will run into the above problem files, so I'd recommend that anyone who wants to try an alternative migration approach start by fixing the above files.

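A scan along the following lines could locate such files before an import is attempted, instead of discovering them one aborted import at a time. It is only a sketch, not the process used above. The function names are made up for illustration, the NUL-byte test for "binary" is a crude heuristic, and the scan looks at every file instead of only those matched by the auto-props patterns, so it may flag files that SVN would never have treated as text.

```python
import os
import sys

# Byte-order marks that identify UTF-16 files (little- and big-endian).
UTF16_BOMS = (b"\xff\xfe", b"\xfe\xff")


def eol_styles(data):
    """Return the set of line-ending styles ('CRLF', 'LF', 'CR') found in data."""
    styles = set()
    crlf = data.count(b"\r\n")
    if crlf:
        styles.add("CRLF")
    if data.count(b"\n") > crlf:  # at least one LF not preceded by CR
        styles.add("LF")
    if data.count(b"\r") > crlf:  # at least one CR not followed by LF
        styles.add("CR")
    return styles


def check_file(path):
    """Return 'utf-16', 'mixed-eol', or None if the file looks unproblematic."""
    with open(path, "rb") as f:
        data = f.read()
    if data.startswith(UTF16_BOMS):
        return "utf-16"
    if b"\0" in data:
        return None  # assumed binary (heuristic); also skips BOM-less UTF-16
    if len(eol_styles(data)) > 1:
        return "mixed-eol"
    return None


def find_problem_files(root, skip_dirs=(".hg", ".svn")):
    """Yield (path, reason) for each suspicious file under root."""
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata directories in place.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            path = os.path.join(dirpath, name)
            reason = check_file(path)
            if reason:
                yield path, reason


if __name__ == "__main__" and len(sys.argv) > 1:
    for path, reason in find_problem_files(sys.argv[1]):
        print(reason, path)
```

Run with the merged directory as its only argument (c:\merged in the import command above). Files reported as "mixed-eol" are candidates for dos2unix; files reported as "utf-16" would need re-encoding or handling as binary.
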
Once the project was imported, I did an svn export to get a clean copy of the project, and compared it to the original directory. The file counts matched, which is a good sign: 69202 files.

I then ran svnadmin to create a dump file of this repository:

    svnadmin -c dump >ooo-dump

The dump file is 1.8 GB, with an MD5 hash of:

    fd611942d297128d021cd03795b54708

It compressed to a 367 MB gzip file, which I've put on my website here:

    http://www.robweir.com/ooo-dump.gz

So unless anyone has a better idea, and more importantly, is willing to implement a better idea, I'd like to go forward with importing this dump file.

Let's take a few days to review the steps above, and to review the dump file, to make sure no major errors were introduced. If someone can kick off a build with this source, that would be a great way to confirm.

I have all of my partial steps saved, so making small tweaks to this is relatively easy. For example, if there are some file extensions used by OOo that should be treated as text, but are not listed in the standard SVN config or in the recommended Apache project extended list, this is a good time to get those corrected.

Regards,

-Rob
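
P.S. For anyone reviewing the dump, the two checks mentioned above (comparing the exported tree with the original, and checking the MD5 hash) could be scripted roughly as follows. This is a sketch only: the function names are made up for illustration, the paths in the usage comments are placeholders, and the tree comparison looks at file names, not contents, since contents will legitimately differ wherever EOLs were normalized.

```python
import hashlib
import os


def relative_files(root, skip_dirs=(".hg", ".svn")):
    """Return the set of file paths under root, relative to root."""
    found = set()
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune version-control metadata directories in place.
        dirnames[:] = [d for d in dirnames if d not in skip_dirs]
        for name in filenames:
            full = os.path.join(dirpath, name)
            found.add(os.path.relpath(full, root))
    return found


def compare_trees(original, exported):
    """Return (only_in_original, only_in_exported) as sorted lists."""
    a = relative_files(original)
    b = relative_files(exported)
    return sorted(a - b), sorted(b - a)


def md5_of(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# Example usage (placeholder paths):
#   missing, extra = compare_trees(r"c:\merged", r"c:\exported")
#   print(len(missing), "files missing,", len(extra), "files extra")
#   print(md5_of("ooo-dump") == "fd611942d297128d021cd03795b54708")
```

Two empty lists from compare_trees mean the file names match exactly, which is a stronger check than matching counts alone. Note that the hash above is for the uncompressed dump, so the gzip file has to be unpacked before checking it.
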
