On 13 Aug 2008, at 16:14, [EMAIL PROTECTED] wrote:

> Hi all,
>
> I have downloaded enwiki-latest-stub-meta-history.xml.gz to import on my
> own machine, but the file is too large to run any query on: the Oracle
> setup I use can't accept a file of more than 50 GB.
>
> Could anyone help me narrow down this dump? All I need is the text id,
> revision id, username, user id, length of the content and the timestamp.
> I'm not good with technical tools, so thanks a million for your help!
>
> zeyi
You can use xml2sql [1] to split the file into three smaller files: page.sql, revision.sql and text.sql. Alternatively, you can use MWDumper [2] to import the dump. Once you have imported the dump into the database, you can drop the unnecessary columns in Oracle (I don't know the exact commands, but it should be possible).

[1] http://meta.wikimedia.org/wiki/Xml2sql
[2] http://www.mediawiki.org/wiki/MWDumper

Regards,
Pietrodn
[EMAIL PROTECTED]
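P.S. If the full import is the problem, you may not need it at all: a small streaming script can pull out just those six fields and write them to a flat file that you load into a single table. Below is a rough, untested Python sketch of that idea (my own, not part of xml2sql or MWDumper). It assumes the usual MediaWiki export element names (<revision>, <contributor>, <text id=... bytes=.../>) and that the stub dump's <text> element carries a bytes attribute for the content length, so please check those against your copy of the dump.

import gzip
import xml.etree.ElementTree as ET

def localname(tag):
    # Strip the XML namespace: '{http://www.mediawiki.org/xml/export-0.3/}id' -> 'id'
    return tag.rsplit('}', 1)[-1]

with gzip.open('enwiki-latest-stub-meta-history.xml.gz', 'rb') as src, \
     open('revisions.tsv', 'w', encoding='utf-8') as out:
    for _event, elem in ET.iterparse(src):        # default: 'end' events only
        tag = localname(elem.tag)
        if tag == 'page':
            elem.clear()                          # drop finished pages to keep memory flat
            continue
        if tag != 'revision':
            continue
        text_id = rev_id = timestamp = username = user_id = length = ''
        for child in elem:
            name = localname(child.tag)
            if name == 'id':
                rev_id = child.text or ''
            elif name == 'timestamp':
                timestamp = child.text or ''
            elif name == 'contributor':
                for c in child:
                    cname = localname(c.tag)
                    if cname == 'username':
                        username = c.text or ''
                    elif cname == 'id':
                        user_id = c.text or ''
                    elif cname == 'ip':
                        username = c.text or ''   # anonymous edits carry an IP instead
            elif name == 'text':
                text_id = child.get('id', '')     # the text-table id the stub points at
                length = child.get('bytes', '')   # content length; may be absent in old dumps
        out.write('\t'.join([text_id, rev_id, username, user_id, length, timestamp]) + '\n')
        elem.clear()                              # free the revision subtree

The resulting revisions.tsv should be far smaller than 50 GB and can be loaded into one Oracle table, for example with SQL*Loader.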
