Nicolas Dumazet wrote:
> Well I just wanted to update the date, and I thought that a generic
> statement was better:
> in fact... why would I put my name, knowing that purodha did some
> important fixes on the file during those years?
>
> Note that I'm very flexible on those attributions sections. Any
> suggestion is welcome, and is likely to be fine with me.
Forget it. I cannot talk about changes I haven't seen.
>>> + index = 1
>>> + while True:
>>> +     path = config.datafilepath('cache', 'pagestore' + str(index))
>>> +     if not os.path.exists(path): break
>>> +     index += 1
>> At least this looks nice for the diskcache module too, so we can easily
>> get rid of the imported random module and the ugly '*-abfdexjwi'-like
>> filenames.
>
> Thinking again about this: those files are temporary, and are only
> accessed from one specific entry point. A tempfile would be even
> cleaner, right? ( http://docs.python.org/library/tempfile.html ,
> standard since 2.3 ) I think I could do this for both diskcache and
> interwiki, and remove the cache/ directory. Comments?
It would be preferable to create a single file instead of adding a new
file for each separate but identical Site, which repeats the same
download within a relatively short time. It would work much like a web
browser cache.
You can use tempfile in the current implementation, but the "cache"
directory is also used by featured.py, instead of "featured" (r5536).
Maybe it's better to keep it, as it's a common name. For example, some
of my external scripts use it, and more scripts may do so in the future.
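
A minimal sketch of combining both ideas, that is, using tempfile while
keeping the shared "cache" directory. tempfile.mkstemp accepts a dir
argument, so the unique-name loop and the random suffixes are no longer
needed. The helper name new_cache_file and its cache_dir parameter are
hypothetical; in the framework the directory would presumably come from
config.datafilepath('cache'), which is an assumption here.

```python
import os
import tempfile


def new_cache_file(cache_dir, prefix='pagestore'):
    """Create a uniquely named, empty file in cache_dir and return its path.

    Hypothetical helper: cache_dir stands in for the framework's shared
    "cache" directory.
    """
    if not os.path.isdir(cache_dir):
        os.makedirs(cache_dir)
    # mkstemp creates the file atomically, so two callers never get
    # the same name.
    fd, path = tempfile.mkstemp(prefix=prefix, dir=cache_dir)
    os.close(fd)  # the caller reopens the path when it needs it
    return path
```

Unlike a plain temporary file, nothing is deleted automatically here, so
the caller is still responsible for removing the file when it is done.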
> Speaking of diskcache: I wondered if a simple Shelf (
> http://docs.python.org/library/shelve.html ) wouldn't be faster than
> diskcache. Shelf is implemented at a low level and has different
> interfaces for each specific system family.
> Naturally I would think that Shelf should be faster and more
> appropriate than our custom-made module, but Shelf might be too
> generic, and induce unnecessary overhead?
I am not sure it is worth replacing it with shelve here; probably not,
if the aim is to speed up the code.
I have always wondered why we adopted this solution, because I have
doubts about the amount of RAM required by the mediawiki-messages that
the bot actually uses. I think a list of items not to discard would have
been simpler, although I really appreciate this more sophisticated
solution.
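
For comparison, this is roughly what a shelve-backed message store could
look like. It is only a sketch: the function names and the example keys
are made up for illustration, and it says nothing about whether shelve
is faster than diskcache, which would have to be measured.

```python
import os
import shelve
import tempfile


def store_messages(path, messages):
    """Write a dict of message key -> text to a shelf on disk."""
    db = shelve.open(path)
    try:
        db.update(messages)
    finally:
        db.close()


def load_message(path, key, default=None):
    """Read a single message back from the shelf.

    Only the requested value is unpickled; the other values stay on disk.
    """
    db = shelve.open(path, flag='r')
    try:
        return db.get(key, default)
    finally:
        db.close()
```

Usage would be along the lines of store_messages(path, {'talk':
'Discussion'}) once after the download, then load_message(path, 'talk')
whenever a message is needed.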
--
Francesco Cosoleto
_______________________________________________
Pywikipedia-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/pywikipedia-l