To keep everyone in the loop: I finally had time to sit down and work on this. I created a patch and submitted a pull request to Martin for his review.
https://bitbucket.org/loewis/pypi-appengine/pull-request/1/added-some-logging-to-help-track-mirror

If anyone else has time to review the change as well before I push it out, that would be great. I'm still new to this code base, and I don't want to break anything.

Ken

On Mon, Jul 2, 2012 at 8:37 AM, ken cochrane <[email protected]> wrote:

> Jannis,
>
> I started looking at this the other day, but I haven't had a chance to fix
> it because the Amazon datacenter outage took all of my time the past few
> days.
>
> Here is what I found out.
>
> b.pypi.python.org lives on GAE and it's currently stuck. Looking at
> the logs I figured out what went wrong, but I'm not sure how to fix it.
> See the log snippet [3] at the end of the email.
>
> Basically there is a Python package called '__past__' (see link [0]
> below) that is causing the sync process to break because we are trying to
> use the project name as the key_name for the Product model [1], and GAE
> model key_names can't be of the form __*__, that is, start and end with
> double underscores [2].
>
> I'm not sure how to fix the issue without possibly breaking other things.
> My first thought was to remove the underscores, but that might break
> something else, or conflict with another project with a similar name. I
> wrote to Martin, who gave me the following advice.
>
> From "Martin v. Löwis":
>
>> Renaming/escaping sounds good. I'd check if there is any string that
>> can be used in a GAE key name, but not be used in a PyPI package name.
>> If not, standard escaping needs to be applied: a prefix of "dunder"
>> is added to any package whose name starts and ends with __, as well
>> as to any package whose name starts with "dunder".
>>
>> When looking at all child nodes, remove "dunder" from any name;
>> when doing lookups by name, escape as above.
>>
>> If you do find a character/string that can be in a key name but
>> not in a package name, just escape the string with that name -
>> no need to worry about escaping the escape character.
>> However it may be that the only possible choice is "/" (which I know
>> cannot appear in a package name).
>
> I looked through most of the PyPI code and I think the only character you
> can't have is "/"; all other characters look like they work.
>
> So, I know what is causing it; we just need to fix the issue, test it, and
> roll out the fix. I was planning on doing it this past weekend but thanks
> to AWS, I didn't have any time to work on it. If anyone has any free time,
> feel free to take over / help. Just let others know so there isn't any
> duplicate effort.
>
> Let me know if you have any questions.
>
> Ken Cochrane
>
> Footnotes:
>
> [0] http://pypi.python.org/pypi/__past__/0.0.1.dev
>
> [1] https://bitbucket.org/loewis/pypi-appengine/src/fa6596a427e1/fetch.py#cl-62
>
> [2] Information about model key_names:
> https://developers.google.com/appengine/docs/python/datastore/modelclass#Model
>
> key_name
>
> The key name for the entity. The name becomes part of the primary key. If
> None, a system-generated numeric ID is used for the key.
>
> The value for key_name must not be of the form __*__.
>
> [3] Log snippet.
>
> 1. 2012-06-28 06:45:18.222
>
> step package '__past__'
>
> 2.
> E2012-06-28 06:45:18.778
>
> illegal name in key path element: __past__
> Traceback (most recent call last):
>   File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/webapp/_webapp25.py", line 701, in __call__
>     handler.get(*groups)
>   File "/base/data/home/apps/pypi/3.358089379617981219/handlers.py", line 171, in get
>     self.response.out.write(fetch.cron())
>   File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 293, in cron
>     return step()
>   File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 259, in step
>     actions[action](m, todo, param)
>   File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 91, in package
>     data = simple_page(m, name)
>   File "/base/data/home/apps/pypi/3.358089379617981219/fetch.py", line 70, in simple_page
>     obj.put()
>   File "/base/python_runtime/python_lib/versions/1/google/appengine/ext/db/__init__.py", line 1074, in put
>     return datastore.Put(self._entity, **kwargs)
>   File "/base/python_runtime/python_lib/versions/1/google/appengine/api/datastore.py", line 579, in Put
>     return PutAsync(entities, **kwargs).get_result()
>   File "/base/python_runtime/python_lib/versions/1/google/appengine/api/apiproxy_stub_map.py", line 604, in get_result
>     return self.__get_result_hook(self)
>   File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1579, in __put_hook
>     self.check_rpc_success(rpc)
>   File "/base/python_runtime/python_lib/versions/1/google/appengine/datastore/datastore_rpc.py", line 1216, in check_rpc_success
>     raise _ToDatastoreError(err)
> BadRequestError: illegal name in key path element: __past__
>
> On Mon, Jul 2, 2012 at 8:11 AM, Jannis Leidel <[email protected]> wrote:
>
>> On 02.05.2011, at 22:10, Martin v.
>> Löwis <[email protected]> wrote:
>>
>> > On 02.05.2011 19:24, Jannis Leidel wrote:
>> >>
>> >> On 02.05.2011, at 18:12, Maurits van Rees wrote:
>> >>
>> >>> Hi,
>> >>>
>> >>> I noticed that some distributions are not on all mirrors. For example
>> >>> http://a.pypi.python.org/simple/plone.app.referenceablebehavior/
>> >>> has 0.1 and 0.2 (last one released 30 April)
>> >>> but 0.2 is missing from
>> >>> http://b.pypi.python.org/simple/plone.app.referenceablebehavior/
>> >>>
>> >>> Same for c and d. Ah, no: those two have it now. I know for sure
>> >>> that at least d did not have it five minutes ago. And this version
>> >>> was released two days ago, so it should have been slightly faster. :-)
>> >>
>> >> Hm, d doesn't seem to have the file on disk even though it's on the
>> >> simple page, see
>> >> http://d.pypi.python.org/packages/source/p/plone.app.referenceablebehavior/
>> >>
>> >> Martin: Anything I can do to make sure this doesn't happen again?
>> >
>> > As the starting point, we should figure out why it happened in
>> > the first place - it shouldn't have, of course. Most likely,
>> > it's a bug :-)
>>
>> Looks like http://b.pypi.python.org is out of date again:
>> http://www.pypi-mirrors.org
>>
>> Can we do something about that?
>>
>> Jannis
>>
>> _______________________________________________
>> Catalog-SIG mailing list
>> [email protected]
>> http://mail.python.org/mailman/listinfo/catalog-sig
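
For reference, the "dunder" escaping that Martin describes in the quoted advice could be sketched roughly as below. This is only an illustration of that scheme under my reading of it, not the code in the pull request; the names PREFIX, escape_key_name and unescape_key_name are made up for the example, and nothing here talks to the datastore.

```python
# Sketch of the "dunder" escaping scheme from the quoted advice.
# All names here are hypothetical; this is not the code from the pull request.

PREFIX = "dunder"


def escape_key_name(package_name):
    """Return a key name that is never of the reserved form __*__."""
    reserved = package_name.startswith("__") and package_name.endswith("__")
    # Names that already start with the prefix are escaped as well, so that
    # unescaping stays unambiguous.
    if reserved or package_name.startswith(PREFIX):
        return PREFIX + package_name
    return package_name


def unescape_key_name(key_name):
    """Recover the original package name from an escaped key name."""
    if key_name.startswith(PREFIX):
        return key_name[len(PREFIX):]
    return key_name


if __name__ == "__main__":
    for name in ("__past__", "dunderscore", "plone.app.referenceablebehavior"):
        key = escape_key_name(name)
        # Every name must survive a round trip through the escaping.
        assert unescape_key_name(key) == name
        print("%s -> %s" % (name, key))
```

Lookups by name would go through escape_key_name, and listings of stored entities would go through unescape_key_name, which matches the "escape on lookup, strip on listing" split in the advice.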
