On Fri, Dec 25, 2009 at 09:00, Sridhar Ratnakumar
<sridh...@activestate.com> wrote:
> Greetings Lennart,
>
> On 12/24/2009 10:27 PM, Lennart Regebro wrote:
>>
>> On Fri, Dec 25, 2009 at 05:39, Sridhar Ratnakumar
>> <sridh...@activestate.com> wrote:
>>>
>>> Is it because of this benefit to package authors that we are withholding
>>> the implementation of a simple archive that would: 1) simplify the tools
>>> to no rely on adhoc web scrapping
>>
>> There are better ways to do that.
>
> May I ask, what would they be?

Have links in the metadata to the file locations. That means you don't
have to scrape the websites to find the links; the links would be in the
metadata for the packages, or accessible in some other easy way. Scraping
would no longer be needed, without requiring uploads to PyPI.

>> That's *their* problem. If they don't want to upload, then they don't
>> want to upload.
>
> As the original proposal is to retain the existing behavior for already
> registered/uploaded package releases (such as Twisted) so existing systems
> will continue to work, but implement the suggested upload rules only for new
> requests (creation/register)- so as to gradually improve the quality of PyPI
> like that of other packaging systems - by encouraging authors to generate a
> reasonably good sdist (setup.py + PKG-INFO) and uploading them

No, that's not encouraging, that's requiring and forcing. That is NOT
the same thing.

Again: If you tell people they have to upload to register, the effect of
that is to NOT register. It will not make anybody upload, it will make
them NOT register.

It has already been explained in this discussion why the Twisted folks
don't upload to PyPI: because, for various reasons, it doesn't work for
them. Your solution to get more packages to PyPI is to tell people to
upload or bugger off. My solution is to fix the reason they don't
upload.

It really is that easy: If you tell people to upload or bugger off,
they will bugger off.

> If I want to use a web service, I obviously have to adhere to their rules
> and policies. Nobody is forcing me to do so.

Exactly. Nobody is forcing anybody to use PyPI. By making it HARDER to
use and have MORE requirements, LESS people will use it. Is it really
that hard to understand?

> I assume in good faith that package authors will be happy to adapt to the
> new system

You are wrong.

> .. for the benefit of everyone.

No, it's not for the benefit of everyone. It's for the benefit of
adherence to random rules with no purpose.

If we want more packages on PyPI, we should fix the reasons that not
everyone uploads their packages. And yes, I *am* going to repeat this in
different ways and wordings until your ears fall off. ;-)

> Why not? Do you conceive of any reason apart from CPAN-like archives that
> would help in proliferation of mirror sites and third-party sites?

The point is that we *have* a CPAN-like archive.

> because I personally went through significant hurdles to setup a daily PyPI
> mirror-like area. I just don't see how someone merely interested in writing
> a third-party service, or setup a mirror of PyPI would be *most likely
> inclined* to face similar hurdles before giving up.

You are so focused and stuck on the idea that before you can do anything
else you have to mirror PyPI completely, using only rsync. I don't see
why that would have to be so. Rsync is not the be-all and end-all of
mirroring, and most third-party services do not need to mirror. They
need to get data, and that's possible quite easily.

>> Yes, but it's not particularly unreliable to compare the filename to
>> see if it had been handled before. You don't even need to parse the
>> version number for most services that work on the tarballs.
>
> It is indeed unreliable to rely on filenames to get package versions

Yes, but it's not particularly unreliable to compare the filename to
see if it had been handled before. You don't even need to parse the
version number for most services that work on the tarballs.

> I am not speculating as I've actually experimented with the PyPI index,
> mirroring it .. handling the metadata in packages, and building it.

Yes, but again, most third-party services would not mirror it. A mirror
would. But a mirror is only one type of third-party service out of many.

>> Yes, but since thay have the source package, and will have to unpack
>> it and build the packages anyway, they also have the metadata.
>
> It is not that simple.
> PyPM backend, for instance, is not monolithic as in
> doing only a sequential build of packages. It first loads the dependency
> graph (for which metadata - PKG-INFO/requires.txt - is required) from our
> internal mirror over the network. It is expensive to go extract each and
> every tarball .. from each build machine. After loading the dependency
> graph, and then comparing it with existing repository .. every day, new
> builds happen.

You mean you build every package that also *depends* on a package that
has changed? Yeah, that does require the metadata. But as I said, an
easy way to mirror the metadata would definitely be an improvement.

> Further, I can imagine search.cpan.org (which is not hosted by cpan.org
> folks) using only the metadata without touching the source distributions.

Right. Hence, they would *not* need both. Which was my point.

-- 
Lennart Regebro: http://regebro.wordpress.com/
Python 3 Porting: http://python-incompatibility.googlecode.com/
+33 661 58 14 64
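
P.S. To make the "build every package that depends on a changed
package" case concrete, here is a rough sketch of why it needs only the
metadata and not the tarballs. It is not how PyPM works. The function
name packages_to_rebuild and the shape of the requires mapping (package
name to a list of required package names, as you might collect from
requires.txt) are assumptions made up for this illustration.

```python
from collections import defaultdict, deque


def packages_to_rebuild(requires, changed):
    """Return the changed packages plus everything depending on them.

    requires: hypothetical mapping of package name -> list of package
              names it requires (e.g. gathered from requires.txt files).
    changed:  iterable of package names that got a new release.
    """
    # Invert the mapping: for each package, who depends on it directly?
    dependents = defaultdict(set)
    for package, deps in requires.items():
        for dep in deps:
            dependents[dep].add(package)

    # Breadth-first walk over the reverse dependency graph.
    to_rebuild = set(changed)
    queue = deque(to_rebuild)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in to_rebuild:
                to_rebuild.add(dependent)
                queue.append(dependent)
    return to_rebuild


if __name__ == "__main__":
    # Made-up package names, purely for illustration.
    requires = {
        "app": ["lib"],
        "lib": ["core"],
        "tool": ["core"],
        "other": [],
    }
    print(sorted(packages_to_rebuild(requires, ["core"])))
```

With the made-up data above, a new release of "core" marks "core",
"lib", "tool" and "app" for rebuilding, while "other" is left alone. The
only input is the dependency metadata, which is why an easy way to
mirror just the metadata would cover this case.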