I've rebased this and pushed it to branch `db/6534`. I resolved several conflicts, so please start further work off of that branch. All specific examples below are from https://github.com/mxcl/homebrew/wiki/

* When a wiki is deleted, the title is changed. This should be refactored into a `delete()` method on the model. Even though it's just one line, it shouldn't be repeated in multiple places.
* In `convert_markup()`
    * `name_and_ext = filename.split('.', 1)` should be changed to use `os.path.splitext` (or at least `rsplit`). It causes some pages not to be handled right: "Not a wiki page Homebrew-0.9.3.md. Skipping"
    * we need to handle markdown specially: don't do any conversion. We lose some formatting when it goes through `render_any_markup` and then back through html2text (e.g. External-Commands.md loses its table structure). We'll also need to keep the original markdown anyway, for [#6622].
* alignment of the "Import wiki history" checkbox on the individual import form is weird
* it looks like gollum is case-insensitive, e.g. [Tips n' Tricks] on Home.textile. Can we cleanly support that too?
* Acceptable-Formulae.textile
    * extra newlines are inserted (iirc, fix with: `html2text.BODY_WIDTH = 0`)
    * "&" in gollum tag doesn't work
* textile specific issues (mostly from Acceptable-Formulae.textile). These can be a separate ticket that we merge later. I want to merge this main wiki branch soon :)
    * after "There are good reasons for this:" should be a numbered list
    * table structure is lost
    * `Niche Stuff <a name="Niche_Stuff"></a>`
    * `*[[this checklist|Troubleshooting]]*` doesn't convert right (Home.textile)

---

** [tickets:#6534] Wiki importer for github**

**Status:** in-progress
**Labels:** import github 42cc
**Created:** Wed Aug 07, 2013 09:54 PM UTC by Dave Brondsema
**Last Updated:** Thu Sep 26, 2013 03:27 PM UTC
**Owner:** nobody

Wikis are git repositories and can be accessed like `git clone https://github.com/OpenRefine/OpenRefine.wiki` for example. Check the main repo API first to see if the repo has wiki enabled.

You can see https://sourceforge.net/p/googlecodewikiimporter/git/ for reference as an example of another wiki importer.
It is a separate repo because it needs the "html2text" package to convert html to markdown, and that is a GPL library.

Github supports many markup types. Find a full list and determine what the best way to convert them to markdown is. My guess is that few formats will have tools available to convert them directly to markdown, so my likely recommendation would be to render them as HTML (using [pypeline](http://pypeline.sourceforge.net/) as a generic way to handle many of those formats) and then html2text to get it into markdown. If html2text or any other GPL library is needed, this will have to be a separate repo from the main Allura repo. So please evaluate & test the conversion options first, before putting code into place.

A second phase to all this (i.e. do it separately, after the basic import is all working) would be to handle revision history. This would mean going through each commit in the wiki git repo, and converting & updating every file that changes. This may be very time consuming, so when we get to it, we may want it to be a checkbox option, so users only do it if they want it.
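
To illustrate the `convert_markup()` filename point from the review notes, here is a minimal sketch of why `filename.split('.', 1)` trips on a page like `Homebrew-0.9.3.md` and how `os.path.splitext` avoids it. The helper names `split_page_name` and `is_markdown`, and the `MARKDOWN_EXTS` tuple, are hypothetical and not taken from the branch. The extension list in particular is only an assumption about which files should skip conversion.

```python
import os.path

# Assumption: illustrative list only; the real set of markdown
# extensions the importer should recognise may be longer.
MARKDOWN_EXTS = ('md', 'markdown')


def split_page_name(filename):
    """Split a wiki filename into (page name, extension without the dot)."""
    # splitext splits at the LAST dot, so dots inside the page name survive.
    name, ext = os.path.splitext(filename)
    return name, ext[1:]


def is_markdown(filename):
    """Return True when the page is already markdown and needs no conversion."""
    _, ext = split_page_name(filename)
    return ext.lower() in MARKDOWN_EXTS


if __name__ == '__main__':
    filename = 'Homebrew-0.9.3.md'
    # Splitting at the first dot treats '9.3.md' as the extension.
    print(filename.split('.', 1))
    # Splitting at the last dot keeps the version number in the page name.
    print(split_page_name(filename))
    print(is_markdown('External-Commands.md'))
    print(is_markdown('Acceptable-Formulae.textile'))
```

A check like `is_markdown()` could then decide whether a page is stored as-is or sent through the HTML and html2text round trip, which is one way to read the "don't do any conversion" note for markdown pages.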