Re: [Wikitech-l] Download a wiki?

2018-05-20 Thread Emilio J. Rodríguez-Posada
WikiTeam bat signal.

Dump delivered.


Re: [Wikitech-l] Download a wiki?

2018-05-18 Thread Bart Humphries
Great, thanks!

I have a convention this weekend, so it'll probably be Monday
evening/Tuesday before I can really do anything else with that dump.


Re: [Wikitech-l] Download a wiki?

2018-05-18 Thread Federico Leva (Nemo)
You're in luck: just now I was looking for test cases for the new 
version of dumpgenerator.py:

https://github.com/WikiTeam/wikiteam/issues/311

I've made a couple of changes, and by tomorrow you should see a new XML dump at
https://archive.org/download/wiki-meritbadgeorg_wiki
(there's also https://archive.org/download/wiki-meritbadgeorg-20151017,
not mine).
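
(For anyone who wants to reproduce this locally: a typical dumpgenerator.py
invocation is along the lines of
"python dumpgenerator.py --api=http://meritbadge.org/wiki/api.php --xml --images",
assuming the API sits at that default path; the exact options are documented
in the WikiTeam README.)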


Federico


Re: [Wikitech-l] Download a wiki?

2018-05-18 Thread Martin Urbanec
You can use http://meritbadge.org/wiki/index.php/Special:Export: add *all*
page titles (gathered via an API call, Special:AllPages, or a similar method)
to the textbox, uncheck "Include only the current revision, not the full
history" so that the full history is exported, and save the resulting file.
Then set up the new server, put the dump on it, and run
https://www.mediawiki.org/wiki/Manual:ImportDump.php. You will have a new
wiki with the old content. If you go with hosted wiki software instead, you
can use Special:Import as the web-based equivalent.
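
A rough sketch of the export step in Python (untested; it assumes the wiki's
api.php and index.php are reachable under http://meritbadge.org/wiki/ at the
usual paths, and it needs the requests library):

import requests

API = "http://meritbadge.org/wiki/api.php"                      # assumed path
EXPORT = "http://meritbadge.org/wiki/index.php?title=Special:Export"

# 1. Collect every page title in the main namespace via list=allpages.
#    (Loop apnamespace over other namespaces too if you want templates,
#    categories, etc.; very old MediaWiki versions report continuation
#    under "query-continue" instead of "continue".)
titles = []
params = {"action": "query", "list": "allpages",
          "aplimit": "500", "format": "json"}
while True:
    data = requests.get(API, params=params).json()
    titles += [p["title"] for p in data["query"]["allpages"]]
    if "continue" not in data:
        break
    params.update(data["continue"])

# 2. POST the titles to Special:Export; leaving "curonly" out is the same as
#    unchecking "Include only the current revision" in the form.
#    For a big wiki, export in batches of a few hundred titles instead.
resp = requests.post(EXPORT, data={"pages": "\n".join(titles), "history": "1"})
with open("meritbadge-full-history.xml", "wb") as f:
    f.write(resp.content)

On the new server the XML can then be loaded with the importDump.php
maintenance script (php maintenance/importDump.php meritbadge-full-history.xml),
and the manual recommends running rebuildrecentchanges.php afterwards.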

Hope that helps.

Best,
Martin



[Wikitech-l] Download a wiki?

2018-05-18 Thread Bart Humphries
We have a very old wiki which has basically never been updated for the past
decade and which was proving stubbornly resistant to updating several years
ago.  The owner of the server has since drifted away, but we do still have
control over the domain name itself.  The best way we can think of to update
everything is to scrape all of the pages/files, add them to a brand new,
up-to-date wiki on a new server, then point the domain to that new server.
Yes, user accounts will be broken, but we feel that this is the most feasible
solution unless someone else has another idea.

However, there are a lot of pages on meritbadge.org -- which is the wiki I'm
talking about.  Any suggestions for how to automate this scraping process?
I can scrape the HTML off every page, but what I really want is to get the
wikitext off of every page.
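(For a single page the raw wikitext is available from MediaWiki at
index.php?title=PAGENAME&action=raw, e.g.
http://meritbadge.org/wiki/index.php?title=Main_Page&action=raw, assuming the
standard index.php path, so what's really needed is a way to automate that,
or an export, over the full page list.)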

Bart Humphries
bart.humphr...@gmail.com
(909)529-BART(2278)
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l