Hi Andreas,

In the first round, when I wrote the UDD importer, you accepted the format I proposed. Writing this importer by myself took me countless hours, as I did not know Python programming and the handling of Unicode in the UDD was not so intuitive. Then you found bugs and decided to throw everything in the bin. But why don't you report and track the bugs instead?

This monster email thread with many point-to-point comments is an information black hole. This, plus the fact that there are only 24 hours in a day, is the reason why some of the issues that count for you are not fixed. But that does not mean that the only way to fix them is to do everything by yourself without sharing your code and without communicating your design requirements.

If we want to achieve something together, we need to be more organised. Let's use at least a wiki page or the TODO file of the umegaya repository to track open problems. Also, if we keep using Umegaya (the engine behind upstream-metadata.debian.net, which provides consolidated tables and the pool of upstream and copyright files), I will upload the Debian package, so that we have a bug tracker for free.

So far, we have on your side:

 - UDD tables must have a primary key.
 - Bibliographic data must support the loading of more than one reference.

On my side:

 - The syntax of debian/upstream is documented on the Debian wiki (http://wiki.debian.org/UpstreamMetadata) and changes must be discussed in advance.
 - The current syntax does not support complex structures such as arrays, and your way of loading bibliographic references is therefore not supported.
 - The use of YAML mappings (hashes) is a syntax hack that you made me deeply regret. If you start to depend on them, you break the equivalence between Foo-Bar: Baz and Foo: {Bar: Baz}.

I have been working on this project since 2009. My goal is not only to provide bibliographic data to the UDD, but also to provide a machine-readable file that is convenient for general use. While I have picked YAML for that file, I really do not want to support the full YAML syntax, but rather a minimal subset that is close to the Debian control data files. Currently, this is done by only supporting YAML scalars (except for the hash hack).
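
The equivalence between Foo-Bar: Baz and Foo: {Bar: Baz} can be illustrated with a minimal sketch. It assumes the file has already been parsed into plain Python dictionaries. The function name `flatten` is hypothetical and is not taken from Umegaya or the UDD importer. Rejecting sequences is also an assumption, made here only to reflect that arrays are outside the currently documented syntax.

```python
def flatten(mapping, prefix=""):
    """Flatten nested mappings into hyphen-joined scalar fields.

    Hypothetical helper: {"Foo": {"Bar": "Baz"}} becomes {"Foo-Bar": "Baz"},
    so both spellings of a field end up under the same key.
    """
    flat = {}
    for key, value in mapping.items():
        name = f"{prefix}-{key}" if prefix else key
        if isinstance(value, dict):
            # The "hash hack": a nested mapping is only shorthand for
            # hyphenated field names.
            flat.update(flatten(value, name))
        elif isinstance(value, (list, tuple)):
            # Assumption: arrays are not part of the supported subset.
            raise ValueError(f"{name}: sequences are not supported")
        else:
            flat[name] = value
    return flat


nested_form = {"Foo": {"Bar": "Baz"}}
hyphen_form = {"Foo-Bar": "Baz"}

# Both spellings normalise to the same flat record.
print(flatten(nested_form) == flatten(hyphen_form))
```

A consumer that reads only the flattened form does not depend on which spelling a maintainer used, which is the property that would be lost if a loader started to rely on the nested structure itself.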

The UDD loader is the first serious use of this data, but I think we should not overfit the debian/upstream syntax to this sole use. In particular, I do think that the gatherer I made has some value: it gives easy access to the contents of these files without downloading the source package or visiting a VCS web interface. For instance, I recently used it to browse our watch files, in order to find examples of direct detection in Google Code or SourceForge (http://upstream-metadata.debian.net/table/watch). It will also be more efficient than the UDD for quick on-demand requests about single packages or single fields.

In summary, let's track issues and solve them one by one, and avoid long point-to-point threads. On my side, I do not manage to be efficient with them.

Have a nice weekend,

-- 
Charles Plessy
Debian Med packaging team,
http://www.debian.org/devel/debian-med
Tsurumi, Kanagawa, Japan