* There are some errors in the links for files and special pages. Examples:
  - קובץ:Nuvola_apps_important.svg <http://commons.wikimedia.org/wiki/File:Nuvola_apps_important.svg> links to ויקיפדיה:מיזמי ויקיפדיה/מיזם ערכים ללא תמונות/קטגוריות/ספורטאים איטלקים (Wikipedia:Wikipedia projects/Articles without images/Categories/Sports people from Italy)
  - מיוחד:אקראי (Special:Random) links to 15 במאי (May 15)
  - מיוחד:שינויים אחרונים (Special:RecentChanges) links to 10_באוגוסט (August 10)
* Size is important because we intend to add images.

2009/7/6 <[email protected]>

> Today's Topics:
>
>    1. Kiwix index size (Asaf Bartov)
>    2. Re: Kiwix index size (Manuel Schneider)
>    3. Re: Kiwix index size (Emmanuel Engelhart)
>
> ----------------------------------------------------------------------
>
> Message: 1
> Date: Sun, 5 Jul 2009 19:18:57 +0300
> From: Asaf Bartov <[email protected]>
> Subject: [openZIM dev-l] Kiwix index size
> To: [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="iso-8859-1"
>
> Hi, everyone.
>
> When running Kiwix's indexer on the ZIM file I had created from the Hebrew
> Wikipedia last week, the Kiwix data directory ran up to a total of 31 items,
> totalling 2.3 GB. The ZIM file itself is ~300MB. Does this proportion make
> sense?
>
> Detailed ls output attached.
>
> Thanks in advance,
>
> Asaf Bartov
> --
> Asaf Bartov <[email protected]>
> -------------- next part --------------
> An HTML attachment was scrubbed...
> URL: <http://intern.openzim.org/pipermail/dev-l/attachments/20090705/2afee878/attachment.html>
> -------------- next part --------------
> ro...@desktop:~/.www.kiwix.org/kiwix$ ls -l -h -a -R
> .:
> total 16K
> drwx------ 3 rotem rotem 4.0K 2009-07-01 16:10 .
> drwx------ 3 rotem rotem 4.0K 2009-07-01 16:10 ..
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 7680jxd5.default
> -rw-r--r-- 1 rotem rotem 94 2009-07-01 16:10 profiles.ini
>
> ./7680jxd5.default:
> total 1.7M
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 .
> drwx------ 3 rotem rotem 4.0K 2009-07-01 16:10 ..
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-02 05:13 31c26198d06ad265677b450796cc09aa.index
> -rw------- 1 rotem rotem 162 2009-07-05 18:19 compatibility.ini
> -rw-r--r-- 1 rotem rotem 135K 2009-07-05 18:19 compreg.dat
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-01 16:10 extensions
> -rw-r--r-- 1 rotem rotem 169 2009-07-01 16:10 localstore.rdf
> -rw-r--r-- 1 rotem rotem 304 2009-07-05 18:39 mimeTypes.rdf
> -rw-r--r-- 1 rotem rotem 0 2009-07-05 18:40 .parentlock
> -rw-r--r-- 1 rotem rotem 2.0K 2009-07-01 16:10 permissions.sqlite
> -rw-r--r-- 1 rotem rotem 128K 2009-07-05 18:54 places.sqlite
> -rw------- 1 rotem rotem 951 2009-07-05 19:00 prefs.js
> -rw-r--r-- 1 rotem rotem 1.1M 2009-07-05 18:20 XPC.mfasl
> -rw-r--r-- 1 rotem rotem 98K 2009-07-05 18:19 xpti.dat
> -rw-r--r-- 1 rotem rotem 98K 2009-07-05 18:20 XUL.mfasl
>
> ./7680jxd5.default/31c26198d06ad265677b450796cc09aa.index:
> total 2.4G
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-02 05:13 .
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 ..
> -rw-r--r-- 1 rotem rotem 0 2009-07-02 01:46 flintlock
> -rw-r--r-- 1 rotem rotem 12 2009-07-02 01:46 iamflint
> -rw-r--r-- 1 rotem rotem 22K 2009-07-02 05:13 position.baseA
> -rw-r--r-- 1 rotem rotem 21K 2009-07-02 05:10 position.baseB
> -rw-r--r-- 1 rotem rotem 1.4G 2009-07-02 05:13 position.DB
> -rw-r--r-- 1 rotem rotem 12K 2009-07-02 05:13 postlist.baseA
> -rw-r--r-- 1 rotem rotem 12K 2009-07-02 05:10 postlist.baseB
> -rw-r--r-- 1 rotem rotem 754M 2009-07-02 05:13 postlist.DB
> -rw-r--r-- 1 rotem rotem 70 2009-07-02 05:13 record.baseA
> -rw-r--r-- 1 rotem rotem 70 2009-07-02 05:10 record.baseB
> -rw-r--r-- 1 rotem rotem 3.3M 2009-07-02 05:13 record.DB
> -rw-r--r-- 1 rotem rotem 4.4K 2009-07-02 05:13 termlist.baseA
> -rw-r--r-- 1 rotem rotem 4.3K 2009-07-02 05:10 termlist.baseB
> -rw-r--r-- 1 rotem rotem 278M 2009-07-02 05:13 termlist.DB
> -rw-r--r-- 1 rotem rotem 232 2009-07-02 05:13 value.baseA
> -rw-r--r-- 1 rotem rotem 230 2009-07-02 05:10 value.baseB
> -rw-r--r-- 1 rotem rotem 14M 2009-07-02 05:13 value.DB
>
> ./7680jxd5.default/extensions:
> total 8.0K
> drwxr-xr-x 2 rotem rotem 4.0K 2009-07-01 16:10 .
> drwx------ 4 rotem rotem 4.0K 2009-07-05 19:00 ..
> ro...@desktop:~/.www.kiwix.org/kiwix$
>
> ------------------------------
>
> Message: 2
> Date: Sun, 5 Jul 2009 20:57:39 +0200
> From: Manuel Schneider <[email protected]>
> Subject: Re: [openZIM dev-l] Kiwix index size
> To: [email protected], [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset="utf-8"
>
> Hi Asaf,
>
> On Sunday, 5 July 2009, Asaf Bartov wrote:
> > When running Kiwix's indexer on the ZIM file I had created from the Hebrew
> > Wikipedia last week, the Kiwix data directory ran up to a total of 31
> > items, totalling 2.3 GB. The ZIM file itself is ~300MB. Does this
> > proportion make sense?
>
> I am not sure about the other files which were created, you only need the
> ZIM file with the index itself.
>
> For 900'000 articles the ZIM file containing the articles was 1.4 GB, the
> Index ZIM was 1.0 GB.
>
> So I think 300 MB looks fine.
>
> Greets,
>
> Manuel
> --
> Regards
> Manuel Schneider
>
> Wikimedia CH - Verein zur Förderung Freien Wissens
> Wikimedia CH - Association for the advancement of free knowledge
> www.wikimedia.ch
>
> ------------------------------
>
> Message: 3
> Date: Sun, 05 Jul 2009 21:05:33 +0200
> From: Emmanuel Engelhart <[email protected]>
> Subject: Re: [openZIM dev-l] Kiwix index size
> To: [email protected], [email protected]
> Message-ID: <[email protected]>
> Content-Type: text/plain; charset=ISO-8859-1
>
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> Hi Asaf
> Asaf Bartov wrote:
> > When running Kiwix's indexer on the ZIM file I had created from the Hebrew
> > Wikipedia last week, the Kiwix data directory ran up to a total of 31 items,
> > totalling 2.3 GB. The ZIM file itself is ~300MB. Does this proportion make
> > sense?
>
> this is possible. Kiwix uses the Xapian search engine which generates
> pretty big index files.
>
> I have to questions:
> * Are the search results OK?
> * Do you have a problem with the size of the index? Do you have a size
> limit?
>
> They are many open search/index softwares. I choose to use Xapian for
> many reasons, but this is possible under certain condition to add to
> Kiwix the support to an another search engine. This should be also
> possible to make a modified version of the indexer using less disk space
> (but with less words indexed).
>
> OpenZIM itself provides a search solution, Tommi can explain you more
> about it. Maybe it would be interesting for you to test it and give us a
> feedback!
>
> Regards
> Emmanuel
> -----BEGIN PGP SIGNATURE-----
> Version: GnuPG v1.4.9 (GNU/Linux)
> Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org
>
> iEYEARECAAYFAkpQ+XcACgkQn3IpJRpNWtPm8wCfcmzwRfg6/9ttuknkURF7ct5I
> JLAAoLbVJWqXUKIeh8Mpua3GD+bjI5ZD
> =RH/U
> -----END PGP SIGNATURE-----
>
> ------------------------------
>
> End of dev-l Digest, Vol 5, Issue 2
> ***********************************

--
Rotem Simha
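
For reference, the proportion asked about in the quoted thread can be checked from the listing itself. The following is only a rough sketch in Python: the sizes are copied by hand from the `ls -h` output above, the ZIM size is the "~300MB" figure from the thread, and the names `parse_size` and `index_report` are made up for this illustration (they are not part of Kiwix or Xapian).

```python
# Sizes of the large index table files, copied from the quoted `ls -h` listing.
INDEX_TABLES = {
    "position.DB": "1.4G",
    "postlist.DB": "754M",
    "termlist.DB": "278M",
    "value.DB": "14M",
    "record.DB": "3.3M",
}
ZIM_SIZE = "300M"  # assumption: the thread only says "~300MB"

# `ls -h` uses powers of 1024 for its K/M/G suffixes.
UNITS = {"K": 1024, "M": 1024**2, "G": 1024**3}


def parse_size(text: str) -> float:
    """Convert an `ls -h` style size such as '754M' (or plain '94') to bytes."""
    if text[-1] in UNITS:
        return float(text[:-1]) * UNITS[text[-1]]
    return float(text)


def index_report(tables: dict, zim_size: str):
    """Return (total index bytes, index/ZIM ratio, per-table share of the total)."""
    total = sum(parse_size(size) for size in tables.values())
    shares = {name: parse_size(size) / total for name, size in tables.items()}
    return total, total / parse_size(zim_size), shares


total, ratio, shares = index_report(INDEX_TABLES, ZIM_SIZE)
print(f"index total: {total / UNITS['G']:.2f} GiB, {ratio:.1f}x the ZIM file")
for name, share in sorted(shares.items(), key=lambda item: -item[1]):
    print(f"{name:12s} {share:6.1%}")
```

With the rounded figures from the listing, this gives an index of roughly 2.4 GiB, about eight times the ZIM file, with position.DB alone accounting for more than half of it. As far as I understand, that table holds the term positions Xapian uses for phrase searches, so it may be the first place to look if the index has to shrink.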
_______________________________________________
dev-l mailing list
[email protected]
https://intern.openzim.org/mailman/listinfo/dev-l
