Re: [Offline-l] [OPENZIM] Introducing Python-libzim

2020-07-03 Thread Wilfredo Rodríguez
Excellent news! This will make it much easier to develop many tools.
*Wilfredo Rodríguez*


On Fri, Jul 3, 2020 at 10:41, Emmanuel Engelhart (
kel...@kiwix.org) wrote:

> Hi
>
> I'm happy to introduce you to Python-libzim.
>
> The Python-libzim package allows you to read/write ZIM files in Python. It
> provides a shallow Python interface on top of the libzim C++ library. It
> supports macOS and GNU/Linux out of the box. For other OSes you will
> have to compile libzim manually.
>
> After Node.js, this is the second scripting language for which openZIM
> offers a binding of its famous reference implementation of the ZIM
> open specification. This move is really important to allow more people
> to benefit from the file format and the ZIM files already published.
>
> On our side, Python-libzim was critical for a few other projects which
> are currently running. In the coming months a few critical scrapers will
> be migrated from zimwriterfs to python-libzim and benefit from a
> significant code simplification and speed-up.
>
> Install python-libzim easily with pip and give it a try:
> https://pypi.org/project/libzim/
>
> Happy coding!
>
> Regards
> Emmanuel
> --
> Kiwix - Wikipedia Offline & more
> * Web: https://kiwix.org/
> * Twitter: https://twitter.com/KiwixOffline
> * Wiki: https://wiki.kiwix.org/
>
> ___
> Offline-l mailing list
> Offline-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/offline-l
>


[Offline-l] At long last, a new version of offline enwp

2020-07-03 Thread Stephane Coillet-Matillon
Hi everyone, 

quick announcement for a major success on our side: we finally released late 
last night an updated version of the English Wikipedia[1]. The last full 
version (i.e. with images) we had was from October 2018 (!), and since then we
had been plagued by regressions, bugs, resource limitations and probably some 
very dark magic.

The new .zim file adds 900,000 articles (6.1 vs. 5.2 million) and a healthy 11
GB in size (89 vs. 78 GB). The numbers are somewhat misleading because we need
to include internal links and redirects, which brings the total to 100+ million 
interlinked items. Emmanuel will have more details on the hurdles that he had 
to deal with.

Updates will now run on a monthly basis, which is another major improvement: we 
had initially planned on bimonthly updates as a single run used to take up to 
three weeks. It can now be done in 5-6 days \o/

Congrats to everyone involved or who supported us one way or another, including 
the Foundation with the new & bigger servers they recently gave us access to. 
Hopefully now we can move on to newer problems.

Stephane



[1] http://download.kiwix.org/zim/wikipedia/wikipedia_en_all_maxi_2020-06.zim 
 
You can also torrent it by adding .torrent at the end (seeders welcome, as a
matter of fact).
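
The ".torrent" rule above can be sketched in a few lines of Python. This is only an illustration: the helper name torrent_url is made up for this example, and it assumes the suffix belongs on the URL path (so any query string would stay after it).

```python
from urllib.parse import urlsplit, urlunsplit

# Direct download URL quoted in footnote [1].
ZIM_URL = "http://download.kiwix.org/zim/wikipedia/wikipedia_en_all_maxi_2020-06.zim"


def torrent_url(zim_url: str) -> str:
    """Return the torrent URL for a ZIM download URL.

    Hypothetical helper: appends ".torrent" to the path component,
    leaving scheme, host, query and fragment untouched.
    """
    parts = urlsplit(zim_url)
    return urlunsplit(parts._replace(path=parts.path + ".torrent"))


print(torrent_url(ZIM_URL))
```

For a plain URL like the one in [1], the result is simply the original URL followed by ".torrent".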


Re: [Offline-l] At long last, a new version of offline enwp

2020-07-03 Thread Emmanuel Engelhart
This was the last milestone on the way to generating a fresh monthly ZIM
snapshot for each of our projects. More precisely, that means many ZIM
flavours for the Wikipedias and a few other projects.

This now happens 100% automatically with the help of the Zimfarm.

The work continues on MWoffliner, which is quite a delicate beast, to
better handle the many particularities/features MediaWiki is known for.

Our small TypeScript team welcomes any volunteer help. We have work to
do for junior and senior developers, just pick up your ticket at
https://github.com/openzim/mwoffliner/issues.

Emmanuel

On 03.07.20 15:43, Stephane Coillet-Matillon wrote:
> quick announcement for a major success on our side: we finally released
> late last night an updated version of the English Wikipedia[1]. The last
> full version (i.e. with images) we had was from October 2018 (!), and
> since then we had been plagued by regressions, bugs, resource
> limitations and probably some very dark magic.
> 
> The new .zim file adds 900,000 articles (6.1 vs. 5.2 million) and a
> healthy 11 GB in size (89 vs. 78 GB). The numbers are somewhat
> misleading because we need to include internal links and redirects,
> which brings the total to 100+ million interlinked items. Emmanuel will
> have more details on the hurdles that he had to deal with.
> 
> Updates will now run on a monthly basis, which is another major
> improvement: we had initially planned on bimonthly updates as a single
> run used to take up to three weeks. It can now be done in 5-6 days \o/
> 
> Congrats to everyone involved or who supported us one way or another,
> including the Foundation with the new & bigger servers they recently
> gave us access to. Hopefully now we can move on to newer problems.

-- 
Kiwix - Wikipedia Offline & more
* Web: https://kiwix.org/
* Twitter: https://twitter.com/KiwixOffline
* Wiki: https://wiki.kiwix.org/





[Offline-l] [OPENZIM] Introducing Python-libzim

2020-07-03 Thread Emmanuel Engelhart
Hi

I'm happy to introduce you to Python-libzim.

The Python-libzim package allows you to read/write ZIM files in Python. It
provides a shallow Python interface on top of the libzim C++ library. It
supports macOS and GNU/Linux out of the box. For other OSes you will
have to compile libzim manually.

After Node.js, this is the second scripting language for which openZIM
offers a binding of its famous reference implementation of the ZIM
open specification. This move is really important to allow more people
to benefit from the file format and the ZIM files already published.

On our side, Python-libzim was critical for a few other projects which
are currently running. In the coming months a few critical scrapers will
be migrated from zimwriterfs to python-libzim and benefit from a
significant code simplification and speed-up.

Install python-libzim easily with pip and give it a try:
https://pypi.org/project/libzim/
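
As a rough idea of what reading a ZIM file might look like, here is a minimal sketch. It assumes the libzim.reader.Archive interface described on the PyPI page for later python-libzim releases (Archive, has_entry_by_path, get_entry_by_path, get_item), which may differ from the release announced here. The helper names open_archive and read_entry_text, the file name and the entry path are hypothetical.

```python
def open_archive(zim_path):
    """Open a ZIM file for reading (needs `pip install libzim`)."""
    # Imported lazily so read_entry_text() can be tried without libzim installed.
    # Assumption: this is the reader interface of later python-libzim releases.
    from libzim.reader import Archive

    return Archive(zim_path)


def read_entry_text(archive, path, encoding="utf-8"):
    """Return (title, decoded content) for the entry at `path`, or None if absent."""
    if not archive.has_entry_by_path(path):
        return None
    entry = archive.get_entry_by_path(path)
    item = entry.get_item()  # the item carries the entry's content buffer
    return entry.title, bytes(item.content).decode(encoding)


# Hypothetical usage (file name and entry path are placeholders):
#   archive = open_archive("example.zim")
#   print(read_entry_text(archive, "home"))
```

Keeping the import inside open_archive is only a convenience for this sketch: the lookup helper then works with any object exposing the same three methods.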

Happy coding!

Regards
Emmanuel
-- 
Kiwix - Wikipedia Offline & more
* Web: https://kiwix.org/
* Twitter: https://twitter.com/KiwixOffline
* Wiki: https://wiki.kiwix.org/


