In Commons there are a bunch of broken/corrupt/missing files (mostly old
versions of the same file).
2012/11/11 MZMcBride z...@mzmcbride.com
Hi.
Is there a policy or guideline about the level to which Wikimedia wikis
care
about data integrity? There are a few specific cases I'm talking about:
___
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
--
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT http://code.google.com/p/avbot
Another example of a recent video donation
https://commons.wikimedia.org/wiki/Category:Files_from_the_Australian_Broadcasting_Corporation
2012/4/25 emijrp emi...@gmail.com
2012/4/24 Samuel Klein meta...@gmail.com
Where's the latest thread on the Timed Media Handler progress?
I am meeting
Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com
Pre-doctoral student at the University of Cádiz (Spain)
Projects: AVBOT http://code.google.com/p/avbot/ |
StatMediaWiki http://statmediawiki.forja.rediris.es |
WikiEvidens http://code.google.com/p/wikievidens/ |
WikiPapers http://wikipapers.referata.com |
WikiTeam
Wikipedia uses nofollow, so adding links to your website doesn't increase
your PageRank, but it works fine for reaching new readers.
These sites[2] receive a lot of traffic from Wikipedia, for sure.
Regards,
emijrp
(Forwarding to the research mailing list.)
[1] http://www.dlib.org/dlib/may07/lally
file
after removing javascript/json/robots.txt there are 13 left,
which fits perfectly with 10,000 to 13,000 per day
However, 9 of these are bots!
How many of that 1,000-request sample log were robots (including all languages)?
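For illustration, a rough sketch of that kind of filtering in Python; the tab-separated log layout and the bot user-agent keywords are assumptions here, not the actual sampled-log format:

import re

BOT_PATTERN = re.compile(r'bot|crawler|spider|slurp', re.IGNORECASE)

def count_bots(log_path):
    # Tally total requests and bot requests in a sampled access log.
    total = bots = 0
    with open(log_path, encoding='utf-8', errors='replace') as log:
        for line in log:
            fields = line.rstrip('\n').split('\t')
            if len(fields) < 2:
                continue
            url, user_agent = fields[0], fields[-1]
            # Drop script/metadata requests, as in the numbers above.
            if url.endswith(('.js', '.json', '/robots.txt')):
                continue
            total += 1
            if BOT_PATTERN.search(user_agent):
                bots += 1
    return total, bots

total, bots = count_bots('sampled-requests.log')
print(bots, 'of', total, 'sampled requests look like bots')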
2012/3/1 Peter Gervai grin...@gmail.com
On Thu, Mar 1, 2012 at 00:56, emijrp emi...@gmail.com wrote:
I'm trying to download Wikimedia Commons, but I have found some errors.
For
There are still occasional errors around; it would be nice to run a
script against the files database... but it can
#filehistory
Are you aware of this? Is this going to be fixed?
Regards,
emijrp
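A minimal sketch of such a script, using prop=imageinfo on the public API to compare each stored version's SHA-1 against a fresh download; File:Example.jpg and the User-Agent string are placeholders, and error handling is omitted:

import hashlib
import json
import urllib.parse
import urllib.request

API = 'https://commons.wikimedia.org/w/api.php'
HEADERS = {'User-Agent': 'file-integrity-check/0.1 (placeholder contact)'}

def file_history(title):
    # Ask for every stored version of the file, with its recorded SHA-1.
    params = urllib.parse.urlencode({
        'action': 'query', 'format': 'json', 'titles': title,
        'prop': 'imageinfo', 'iiprop': 'timestamp|sha1|url', 'iilimit': 'max',
    })
    request = urllib.request.Request(API + '?' + params, headers=HEADERS)
    with urllib.request.urlopen(request) as response:
        page = next(iter(json.load(response)['query']['pages'].values()))
    return page.get('imageinfo', [])

def check_versions(title):
    # Re-download each version and compare hashes with the database value.
    for info in file_history(title):
        request = urllib.request.Request(info['url'], headers=HEADERS)
        with urllib.request.urlopen(request) as response:
            digest = hashlib.sha1(response.read()).hexdigest()
        print(info['timestamp'], 'OK' if digest == info['sha1'] else 'MISMATCH')

check_versions('File:Example.jpg')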
Hi Maarten;
I think that this is a perfect example of an open question in wiki research.
WikiPapers has a page for that stuff.[1] Can you add some bits there about
this?
I didn't know about OpenCV; I will check it for sure, and I will try to do
something (I'm a bot developer).
Regards,
emijrp
[1
I have found a tutorial for Python coders
http://creatingwithcode.com/howto/face-detection-in-static-images-with-python/
After some tests, it works fine (including René Descartes' face : )).
This is going to be very helpful to improve Images for biographies accuracy
http://toolserver.org/~emijrp
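That tutorial targets the old Python bindings; a roughly equivalent sketch with the current cv2 module looks like this (portrait.jpg is a placeholder, and the Haar cascade file ships with opencv-python):

import cv2

# Load one of the Haar cascades bundled with OpenCV.
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + 'haarcascade_frontalface_default.xml')

image = cv2.imread('portrait.jpg')
gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
faces = cascade.detectMultiScale(gray, scaleFactor=1.1, minNeighbors=5)
print(len(faces), 'face(s) found')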
Forwarding...
-- Forwarded message --
From: emijrp emi...@gmail.com
Date: 2011/11/11
Subject: Old English Wikipedia image dump from 2005
To: wikiteam-disc...@googlegroups.com
Hi all;
I want to share with you this Archive Team link[1]. It is an old English
Wikipedia image dump
Congratulations, a big step in wiki preservation.
2011/10/13 Ariel T. Glenn ar...@wikimedia.org
As the subject says, the first mirror of our XML dumps is up, hosted at
C3SL in Brazil. We're really excited about it. Details are listed on
the main index page on our download server
(
Some of the most recent dumps links are broken[1].
[1] http://wikipedia.c3sl.ufpr.br/jawikisource/20111018
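A quick way to find these is to send a HEAD request to each dump URL on the mirror; a minimal sketch, with the URL list standing in for a real directory listing:

import urllib.error
import urllib.request

# Placeholder list; in practice this would come from the mirror's index pages.
urls = [
    'http://wikipedia.c3sl.ufpr.br/jawikisource/20111018/',
]

for url in urls:
    request = urllib.request.Request(url, method='HEAD')
    try:
        with urllib.request.urlopen(request, timeout=30) as response:
            print(response.status, url)
    except urllib.error.HTTPError as error:
        print(error.code, url)          # e.g. 404 for a broken link
    except urllib.error.URLError as error:
        print('unreachable:', url, error.reason)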
2011/10/13 Ariel T. Glenn ar...@wikimedia.org
As the subject says, the first mirror of our XML dumps is up, hosted at
C3SL in Brazil. We're really excited about it. Details are listed
that Internet Archive saves XML dumps quarterly or so, but no
official announcement. Also, I heard about the Library of Congress wanting to
mirror the dumps, but no news for a long time.
L'Encyclopédie has an uptime[4] of 260 years[5] and growing. Will
Wiki[pm]edia projects reach that?
Regards,
emijrp
Thanks Ariel. That is important data to preserve.
2011/9/15 Ariel T. Glenn ar...@wikimedia.org
I think we finally have a complete copy from December 2007 through
August 2011 of the pageview stats scrounged from various sources, now
available on our dumps server.
See
https://bugzilla.wikimedia.org/show_bug.cgi?id=30946
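For anyone who wants to work with those files, a minimal sketch that tallies one project's views from a single hourly pagecounts file, assuming the usual space-separated "project title count bytes" line format:

import gzip
from collections import Counter

def top_pages(path, project='en', limit=10):
    # Sum view counts per page title for one project in one hourly file.
    counts = Counter()
    with gzip.open(path, 'rt', encoding='utf-8', errors='replace') as stats:
        for line in stats:
            fields = line.split(' ')
            if len(fields) != 4 or fields[0] != project:
                continue
            counts[fields[1]] += int(fields[2])
    return counts.most_common(limit)

# File name is a placeholder for one of the hourly dumps.
print(top_pages('pagecounts-20110801-000000.gz'))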
2011/9/12 emijrp emi...@gmail.com
Hi all;
I have created two torrent files for the Picture of the Year dumps[1]. They
use the Wikimedia server as a webseed.[2][3] Can you add them to the page?
Thanks,
emijrp
[1] http://dumps.wikimedia.org
Hi;
sep11.wikipedia.org redirects to a spam domain; it probably expired and was
registered by someone else.
Can you redirect it to this[1] or this[2]? Or make a simple index.html with
both links...
Thanks,
emijrp
[1] http://dumps.wikimedia.org/sep11wiki/20071116/
[2]
http://web.archive.org/web
2001.
Losing knowledge is so 48 BC. This is the most important mission the human
race has ever undertaken.
Regards,
emijrp
--
Krinkle
Yes, that tool looks similar to the idea I wrote. Other approaches may be
possible too.
2011/8/13 John Vandenberg jay...@gmail.com
On Sat, Aug 13, 2011 at 4:53 AM, emijrp emi...@gmail.com wrote:
Man, Gerard is thinking about new methods to fork (in an easy way) single
articles, sets of articles or complete Wikipedias, and people reply about
setting up servers/mediawiki/importing_databases and other geeky weekend
parties. That is why there are no successful forks. Forking Wikipedia is
I'm interested in uploading these CD ISOs to the Internet Archive. Are you OK
with this? Your server is a bit slow, so you will have a mirror, and a faster
one at that.
2011/6/11 Jyothis E jyothi...@gmail.com
Dear fellow Wikimedians,
With great pleasure, Malayalam Wikimedia Community announced its 2011
Creating an offline version of a wiki project is hard work. Keep up the
good work! Congratulations! : )
P.S.: downloading...
2011/6/11 Jyothis E jyothi...@gmail.com
Dear fellow Wikimedians,
With great pleasure, Malayalam Wikimedia Community announced its 2011 CD
project Selected Books
A nice script to download YouTube videos is youtube-dl[1]. Linking that with
an flv/mp4-to-ogg converter and an uploader to Commons is trivial.
[1] http://rg3.github.com/youtube-dl/
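A minimal sketch of the download-and-convert step, assuming youtube-dl and an ffmpeg build with Theora/Vorbis support are on the PATH; the URL and file names are placeholders, and the Commons upload step is left out:

import subprocess

video_url = 'https://www.youtube.com/watch?v=XXXXXXXXXXX'  # placeholder

# Download the source video under a fixed name.
subprocess.check_call(['youtube-dl', '-o', 'source.mp4', video_url])

# Convert it to Ogg (Theora/Vorbis) for Commons.
subprocess.check_call(['ffmpeg', '-i', 'source.mp4', 'video.ogv'])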
2011/6/4 Michael Dale md...@wikimedia.org
Comments inline:
On Fri, Jun 3, 2011 at 4:51 PM, Brion Vibber
Hi James;
download.wikimedia.org is available again, so you can download that file from
http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-pages-articles.xml.bz2
(6.2 GB).
Regards,
emijrp
2010/12/14 James Linden kodekr...@gmail.com
On Mon, Dec 13, 2010 at 7:09 PM, Michael Gurlitz
Hi Monica;
Your dump is this one, with date 2010-03-12:[1][2]
a3a5ee062abc16a79d111273d4a1a99a enwiki-20100312-pages-articles.xml.bz2
There are some old English Wikipedia dumps and md5sum files in a directory
called archive[3].
Regards,
emijrp
[1]
http://download.wikimedia.org/archive/enwiki
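A minimal sketch for checking the download against that checksum, hashing the dump in chunks so it never has to fit in memory:

import hashlib

def md5_of(path, chunk_size=1 << 20):
    # Stream the file through MD5 one megabyte at a time.
    digest = hashlib.md5()
    with open(path, 'rb') as dump:
        for chunk in iter(lambda: dump.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()

expected = 'a3a5ee062abc16a79d111273d4a1a99a'  # value from the md5sums file above
actual = md5_of('enwiki-20100312-pages-articles.xml.bz2')
print('OK' if actual == expected else 'MISMATCH: ' + actual)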
All? The 2006 one too?
2010/12/16 Ariel T. Glenn ar...@wikimedia.org
The dumps in the archive are there because they are incomplete, by the
way.
Ariel
On 16-12-2010, Thursday, at 16:50 +0100, emijrp wrote:
Hi Monica;
Your dump is this one, with date 2010-03-12:[1][2
Have you checked the md5sum?
2010/12/16 Gabriel Weinberg y...@alum.mit.edu
Ariel T. Glenn ariel at wikimedia.org writes:
We now have a copy of the dumps on a backup host. Although we are still
resolving hardware issues on the XML dumps server, we think it is safe
enough to serve the
md5sum. Can anyone
else confirm?
Good work.
2010/12/15 Ariel T. Glenn ar...@wikimedia.org
We now have a copy of the dumps on a backup host. Although we are still
resolving hardware issues on the XML dumps server, we think it is safe
enough to serve the existing dumps read-only. DNS was updated to that
effect already;
Thanks.
Double good news:
http://lists.wikimedia.org/pipermail/foundation-l/2010-December/063088.html
2010/12/14 Ariel T. Glenn ar...@wikimedia.org
For folks who have not been following the saga on
http://wikitech.wikimedia.org/view/Dataset1
we were able to get the raid array back in service
be nice.
Regards,
emijrp
2010/12/13 Monica shu monicashu...@gmail.com
Hi all,
I downloaded a dump several months ago.
Accidentally, I lost the version info of this dump, so I don't know when
this dump was generated.
Is there any place that lists info about the past dumps (such as size)?
On 11 December 2010 10:34, emijrp emi...@gmail.com wrote:
I have this one: mediawikiwiki-20100808-pages-meta-history.xml.7z (37
MB). I
can upload it to MegaUpload if needed.
2010/12/6 Andrew Dunbar hippytr...@gmail.com
Could anybody help me locate a dump of mediawiki.org while the dump
server is broken please? I only need current revisions.
Thanks
2010/12/10 James Linden kodekr...@gmail.com
This may or may not be appropriate to this list -- this is where I
found most of the discussions on the matter, so posting here.
From reading the past couple of weeks of messages, I surmise that
there isn't a way to get a current data dump (for
What are the ISO codes? ro and ka?
I have kawiktionary-20100807-pages-meta-history.xml.7z (1.3 MB) and
rowiktionary-20100810-pages-meta-history.xml.7z (10.1 MB). Very tiny.
2010/11/28 Andrew Dunbar hippytr...@gmail.com
On 28 November 2010 02:42, Jeff Kubina jeff.kub...@gmail.com wrote:
I
Crossposting.
This dump is in /mnt/user-store/dump or dumps, on Toolserver. If the admins
don't see any problem, it may be made available for download (~30 GB).
Regards,
emijrp
2010/11/25 Oliver Schmidt schmidt...@email.ulster.ac.uk
Hello everyone,
is there any alternative way to get hands
You can follow the updates here
http://wikitech.wikimedia.org/history/Dataset1
2010/11/21 masti mast...@gmail.com
On 11/10/2010 06:44 AM, Ariel T. Glenn wrote:
We noticed a kernel panic message and stack trace in the logs on the
server that serves XML dumps. The web server that provides
The dump-generating process is halted. Also, the official XML download page
is offline until they fix the hardware.
I don't know if there are mirrors. I don't think so.
2010/11/11 Billy Chan waterfall...@gmail.com
Hi Robin,
Thanks for your link. Do you know where I can download the XML dumps
/Wikipedia_Archive
Sorry. Where I said "from August 2010", I meant "of August 2010". I have only
one .7z for every WMF wiki.
2010/11/11 emijrp emi...@gmail.com
There are some old dumps in Internet Archive,[1] but I guess you are
interested in the most recent ones.
Also, I have a copy of all the pages-meta
What data is at risk?
2010/11/10 Ariel T. Glenn ar...@wikimedia.org
The server refused to come up on reboot; raid errors. The backplane is
suspect. A ticket is being opened with the vendor. The host will
remain offline until we have good information about how to resolve the
problem or we
So, will English Wikipedia dumps be created with this new method from now on?
2010/10/2 Ariel T. Glenn ar...@wikimedia.org
The server that hosts XML dumps was moved this morning and all
maintenance completed. The dumps for dewiki, arwiki, srwiki and
ptwikiquote were restarted from the
Thanks! : )
2010/9/17 Lars Aronsson l...@aronsson.se
On September 10, emijrp wrote:
Hi Lars, are you going to upload more logs to Internet Archive?
No, I can't. I have not downloaded more recent logs. I only uploaded
what was on my disk, because I needed to free some space.
Domas
Hi Lars, are you going to upload more logs to the Internet Archive? Domas'
website only shows the last 3 (?) months. I think that there are many of
these files at Toolserver, but we must preserve this raw data in another
secure (for posterity) place.
2010/9/10 Lars Aronsson l...@aronsson.se
On
Perhaps we can offer two captchas: first, the current one, plus a link
labelled "if you can't read this captcha, try this one" pointing to the
audio reCAPTCHA. Requesting an account from admins is not a good solution
(perhaps as a third option).
Regards,
emijrp
2010/5/16 Christopher Grant
Interesting thread in Jimbo's talk page[1] from June 2008.
[1]
http://en.wikipedia.org/wiki/User_talk:Jimbo_Wales/Archive_37#Wikipedia_and_Captcha
2010/5/16 Chad innocentkil...@gmail.com
On Sun, May 16, 2010 at 3:04 AM, Christopher Grant
chrisgrantm...@gmail.com wrote:
On Sun, May 16, 2010
Hi all;
Solving a captcha during registration is mandatory. Can it be replaced with
an audio captcha for visually impaired people? It is a suggestion for the
usability project too. Thanks.
Regards,
emijrp