On 13 Aug 2008, at 16:14, [EMAIL PROTECTED] wrote:

> Hi all,
>
> I have downloaded enwiki-latest-stub-meta-history.xml.gz to import
> into my own machine, but this file is too large to run any query on
> my computer, because the Oracle database I use can't accept a file
> larger than 50 GB.
>
> Is it possible for anyone to help me narrow down this dump? All I
> need is the text id, revision id, username, user id, length of
> content and timestamp. I am not good with technical tools, so thanks
> a million for your help!
>
> zeyi


You can use xml2sql [1] to split the file into three smaller files:
page.sql, revision.sql and text.sql. Alternatively, you can use
MWDumper [2] to import the dump directly.
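
For example, assuming xml2sql accepts the dump on standard input and
that you have MWDumper's jar at hand (check each tool's documentation
for the exact flags in your version), something like this should work:

  # Split the dump into page.sql, revision.sql and text.sql:
  gzip -dc enwiki-latest-stub-meta-history.xml.gz | xml2sql

  # Or convert it to SQL statements with MWDumper and pipe them
  # into the database (this emits MySQL-flavoured SQL):
  java -jar mwdumper.jar --format=sql:1.5 \
      enwiki-latest-stub-meta-history.xml.gz | mysql -u USER -p DBNAME

Note that both tools target MySQL's schema and SQL dialect, so you
will probably need to adapt their output before loading it into
Oracle.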
After you've imported the dump into the database, you can drop the
unnecessary columns in Oracle (I don't know the exact commands, but
it should be possible).
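
In case it helps, a rough sketch of the Oracle side, assuming the
import produced a standard MediaWiki revision table (column names are
from the MediaWiki 1.x schema; adjust them to whatever your tables
actually contain):

  -- Keep rev_id, rev_text_id, rev_user, rev_user_text, rev_len and
  -- rev_timestamp, which cover everything zeyi asked for, and drop
  -- the columns you don't need, e.g.:
  ALTER TABLE revision DROP COLUMN rev_comment;
  ALTER TABLE revision DROP COLUMN rev_minor_edit;

ALTER TABLE ... DROP COLUMN is standard Oracle DDL, so the same
pattern should work for any other columns you want to get rid of.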

[1] http://meta.wikimedia.org/wiki/Xml2sql
[2] http://www.mediawiki.org/wiki/MWDumper

Regards

Pietrodn
[EMAIL PROTECTED]

