I think this blog post would help us a lot. It suggests that for streaming
decompression we use zlib instead of gzip:
http://rationalpie.wordpress.com/2010/06/02/python-streaming-gzip-decompression/

What do you think?
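
To make the idea concrete, here is a minimal sketch of streaming gzip decompression with zlib, assuming Python 3. The helper names (iter_gunzip, read_in_chunks) and the sample payload are made up for illustration and are not pywikibot code; the in-memory reader only stands in for a real network response.

```python
import gzip
import io
import zlib


def iter_gunzip(chunks):
    """Yield decompressed bytes from an iterable of gzip-compressed chunks."""
    # 16 + MAX_WBITS tells zlib to expect a gzip header and trailer.
    decompressor = zlib.decompressobj(16 + zlib.MAX_WBITS)
    for chunk in chunks:
        data = decompressor.decompress(chunk)
        if data:
            yield data
    # Emit whatever is still buffered once the input is exhausted.
    tail = decompressor.flush()
    if tail:
        yield tail


def read_in_chunks(payload, size=1024):
    """Hypothetical stand-in for reading a network response piece by piece."""
    stream = io.BytesIO(payload)
    while True:
        chunk = stream.read(size)
        if not chunk:
            break
        yield chunk


# Made-up payload: repetitive JSON-like bytes, compressed in one go.
original = b'{"labels": {"en": "Panthera leo"}}' * 500
compressed = gzip.compress(original)

# Decompress 64 bytes at a time without holding the whole response.
restored = b"".join(iter_gunzip(read_in_chunks(compressed, size=64)))
```

The point of this approach is that each chunk can be decompressed as it arrives, so the full compressed body never has to be buffered first.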


On Sat, Jul 5, 2014 at 6:16 PM, John Mark Vandenberg <[email protected]>
wrote:

> Likewise, thank you Francis for this evaluation.  It is very helpful.
>
> Are we sure that gzip isn't occurring by default?  I started to investigate
> this a few weeks ago, and confirmed httplib2 defaults to gzip, but I didn't
> verify that pywiki core isn't meddling with that default.
>
> This is quite important for the performance of Wikidata, as it contains a
> lot of repetition in the JSON output and that repetition increases as the
> item grows, e.g. new labels and sitelinks of articles about species are
> usually the same as the label / sitelink in a different language.
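
As a rough illustration of the repetition point above, the following sketch (assuming Python 3) builds a made-up entity that imitates the shape of Wikidata JSON and compares raw and gzipped sizes. The language codes and the label are invented, so the numbers only indicate the trend, not real Wikidata behaviour.

```python
import gzip
import json

# Made-up language codes and a made-up label; real Wikidata items differ.
language_codes = ["lang%03d" % i for i in range(200)]

entity = {
    "labels": {
        code: {"language": code, "value": "Panthera leo"}
        for code in language_codes
    },
    "sitelinks": {
        code + "wiki": {"site": code + "wiki", "title": "Panthera leo"}
        for code in language_codes
    },
}

raw = json.dumps(entity).encode("utf-8")
compressed = gzip.compress(raw)
ratio = len(raw) / len(compressed)

print("raw: %d bytes, gzipped: %d bytes, ratio: %.1fx"
      % (len(raw), len(compressed), ratio))
```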
>
> http://lists.wikimedia.org/pipermail/pywikipedia-l/2014-June/008886.html
>
>
> On Sat, Jul 5, 2014 at 1:15 AM, Amir Ladsgroup <[email protected]>
> wrote:
> > Thank you, this is helpful. I want to work on some of these:
> >
> > Use gzip compression by default
> > Make it easy to add a user-agent header and give examples of a good one
> > in the documentation for it (see
> > https://meta.wikimedia.org/wiki/User-agent_policy)
> > Add Python 3 compatibility (this is in progress for the core branch)
> > Package pywikibot for installation from PyPI via pip install
> > Make the initial installation process lighter-weight:
> >
> > Design pwb.py with user experience in mind, particularly valuing feedback
> > from new or one-time users during the redesign process
> > Make it possible to install into a virtualenv without putting a config
> > file in the home directory
> > Make it possible to run import pywikibot without having to log in
> >
> > Iterating over a list and calling the API for each item is an inefficient
> > use of API calls. Efficiency in API usage is an important feature of a
> > gold standard library. If you are interested in gold standard status,
> > consider making this more efficient by combining API calls as much as
> > possible (e.g. using generators and combining results
> > title=title1|title2|...). One option may be a constructor method that
> > collects Page requests and enables larger, less frequent API calls. It
> > may be possible to take advantage of the database-like structure of the
> > MediaWiki API and help users save bandwidth.
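
The batching suggestion above could look roughly like the sketch below, assuming Python 3. The helpers batched and build_queries are hypothetical names for illustration and are not part of pywikibot. The batch size of 50 reflects the MediaWiki API limit on titles per query for normal accounts, and the sketch only builds request parameters without contacting any server.

```python
from itertools import islice


def batched(iterable, size):
    """Yield lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def build_queries(titles, batch_size=50):
    """Yield one API parameter dict per batch of titles (hypothetical helper)."""
    for batch in batched(titles, batch_size):
        yield {
            "action": "query",
            "format": "json",
            # The API accepts several titles joined with a pipe character.
            "titles": "|".join(batch),
        }


# 120 made-up page titles become 3 requests instead of 120.
titles = ["Page %d" % i for i in range(120)]
queries = list(build_queries(titles))
```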
> >
> > Process-related
> >
> > Foster a hospitable attitude on pywikipedia-l, especially to new and/or
> > inexperienced users. Consider agreeing on community standards for
> > interaction; the Hacker School social rules may be a useful starting
> > point.
> > Create more centralized and updated documentation, including:
> >
> > Easy-to-find, complete, and intuitive installation instructions,
> > including installing via pip and into virtual environments
> > Code samples for common tasks, including queries and edits
> > Documentation for people who aren't running bots with existing scripts
> > (particularly researchers and beginning/intermediate bot writers)
> > Links in method documentation to the corresponding API subpages
> >
> > Streamline or add more resources to the patch review process to reduce
> > the backlog of unreviewed patches
> >
> >
> > If someone is willing to help out, let's work!
> >
> >
> >
> >
> > On Fri, Jul 4, 2014 at 2:36 AM, Frances Hocutt <[email protected]>
> > wrote:
> >>
> >> Hello all,
> >>
> >> This summer I am working on a project to evaluate and improve the
> >> available MediaWiki web API client libraries. As pywikibot met the
> >> initial criteria of quality, features, and development status I chose
> >> to evaluate it in more depth. There is now a "gold standard"[1] that
> >> will be used to find and enable the listing of particularly
> >> well-designed and easy-to-use MediaWiki web API client libraries--I've
> >> now evaluated several Python libraries against this standard and
> >> suggested additions and changes that would help them meet the
> >> standard.
> >>
> >> First, thank you all for contributing to pywikibot and its community of
> >> users!
> >>
> >> My evaluation for pywikibot is posted here.[2] Pywikibot is
> >> impressively full-featured (including Wikidata API coverage), and it
> >> makes it possible for bot runners and wiki maintainers to quickly get
> >> started automating wiki management tasks.  Some areas that could be
> >> improved include expanded and centralized documentation, efficiency in
> >> use of API calls, and making the setup process lighter-weight and
> >> easier to use.
> >>
> >> I will follow up by posting specific suggestions to Bugzilla[3] later
> >> this week. If you have comments or questions, please feel free to post
> >> on the evaluation talk page, respond to the bugs filed, or make
> >> corrections on the evaluation page if I've missed something.
> >>
> >> -Frances Hocutt
> >> MediaWiki intern
> >>
> >> [1] https://www.mediawiki.org/wiki/API:Client_code/Gold_standard
> >> [2] https://www.mediawiki.org/wiki/API:Client_code/Evaluations/Pywikibot
> >> [3] https://bugzilla.wikimedia.org/buglist.cgi?query_format=specific&product=Pywikibot&list_id=235557
> >>
> >> _______________________________________________
> >> Pywikipedia-l mailing list
> >> [email protected]
> >> https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l
> >
> >
> >
> >
> > --
> > Amir
> >
> >
> >
>
>
>
> --
> John Vandenberg
>
>
>
>


-- 
Amir
