Hmm, seems like GPG (or maybe just enigmail) is confused by something in
my last email and marks my signature as invalid. So... for the record I
did write the email quoted in full below. Sorry for the duplicate email.

On 2011-05-30 13:31, Niels Thykier wrote:
> On 2011-05-28 14:30, Raphael Hertzog wrote:
>> Hi Niels,
>
> Hey
>
> (Added d-java to CC)
>
>> On Wed, 25 May 2011, Niels Thykier wrote:
>>> First of all I was hoping that you might have some "Do" or "Don't"
>>> pointers from when dpkg added support for these things.
>
>> Do not underestimate the task. Apart from that, I'm sorry I'm not sure
>> what kind of advice I can give you. :-)
>
> :)
>
>>> Secondly, there might be some code or infrastructure that could be
>>> shared.
>
>> I would love to generalize the principle of auto-generated dependencies
>> to cover more than just C libraries but we're far from that, i.e. there's
>> no infrastructure in place for this and all the code in dpkg-gensymbols
>> and dpkg-shlibdeps is highly specific to the case of C libraries/binaries.
>
> Could we begin refactoring this towards something similar to the
> $NS::Source::Package setup (e.g. $NS::SymbolsFile::$LANG)? Or would you
> rather see a different approach to this code-wise?
>
>>> Particularly I am interested in how you handle mapping
>>> filenames/SONAMES to a package (especially in cases like libc6, where
>>> there is more than one lib in the package).
>
>> There's nothing magical here. Once we have a SONAME, we find the library
>> on the system (using the same path that ld.so would use). Once we have
>> the complete filename, dpkg -S /the/file returns the package name. And
>> with the package name we're checking the content of
>> /var/lib/dpkg/info/<pkg>.shlibs (but you have to use dpkg-query
>> --control-path <pkg> shlibs to find that path).
>
> Aw, I was hoping for dragons and magic. :P But yeah, I should have seen
> the dpkg -S; I have used it before.
>
>>> We also have cases where two packages provide the same library and it
>>> would be optimal for us to end up with libX-java | libY-java in the
>>> depends, but I have a feeling that is not entirely trivial to support
>>> (in a sane way).
>
>> Well, both packages need to provide this dependency. There's no way the
>> system can know that there are some other libraries that could fulfill the
>> same role and that it needs to put an alternative in the dependency.
>
> I had a feeling you might say that.
>
>>> I intend to have all the tools to support this in the javahelper
>>> package. I am not too sure that we can recycle the existing formats
>>> (maybe the shlibs format with s/SONAME/filename/) as we have to check
>>> for things like classes, return-types, inheritance and method
>>> overloading as well. But feel free to correct me if symbols files
>>> already have support for this.
>
>> Sorry, I have too little Java knowledge to answer this.
>
>> Cheers,
>
> So I have been looking at this a bit more; the shlibs format actually
> looks fully recyclable, assuming we can somehow tell a "C"-shlibs file
> from a "Java"-shlibs file and map "SONAME" to filename accordingly.
>
> ... and if I discard my desire to record all access qualifiers and such,
> I think the symbols file is mostly re-usable if we encode things right.
>
> But first, a quick Java lecture so we are all more or less on the same
> page. I will (where possible) map Java terms to C++. As I understood
> Jonathan, C++ maps/mangles all constructors and methods into a flat
> function name and builds a C library out of that.
> A Java library consists of 0 or more class files stored in a jar (zip)
> file. The metadata (such as dependencies) is stored in the manifest
> file (plain text file). We can extract almost everything we need from
> said class files (method-signatures etc) and the manifest file.
>
> Java does a well-defined mangling of method names in the class files[1].
> The mangled method could trivially be prefixed by its class name
> (either in binary or source format[2]).
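
To make the "mangled name prefixed by its class name" idea concrete, here
is a minimal sketch of how such a symbol could be produced. It is only an
illustration: it uses reflection on an already loaded class instead of
parsing the class file from the jar, and the names JavaSymbols,
descriptor, methodSymbol and symbolFor are made up for this example (they
are not part of javahelper or dpkg). The class name is emitted in its
binary form.

```java
// Hypothetical helper, not an existing javahelper/dpkg API.
final class JavaSymbols {
    /** JVM type descriptor, e.g. int -> "I", String -> "Ljava/lang/String;". */
    static String descriptor(Class<?> t) {
        if (t.isArray()) {
            // For arrays, Class.getName() already uses descriptor syntax,
            // only with dots instead of slashes (e.g. "[Ljava.lang.String;").
            return t.getName().replace('.', '/');
        }
        if (t.isPrimitive()) {
            if (t == void.class) return "V";
            if (t == boolean.class) return "Z";
            if (t == byte.class) return "B";
            if (t == char.class) return "C";
            if (t == short.class) return "S";
            if (t == int.class) return "I";
            if (t == long.class) return "J";
            if (t == float.class) return "F";
            return "D"; // double is the only primitive left
        }
        return "L" + t.getName().replace('.', '/') + ";";
    }

    /** Binary class name + "." + method name + mangled signature. */
    static String methodSymbol(java.lang.reflect.Method m) {
        StringBuilder sb = new StringBuilder();
        sb.append(m.getDeclaringClass().getName().replace('.', '/'));
        sb.append('.').append(m.getName()).append('(');
        for (Class<?> p : m.getParameterTypes()) {
            sb.append(descriptor(p));
        }
        return sb.append(')').append(descriptor(m.getReturnType())).toString();
    }

    /** Convenience lookup of a public method by name and parameter types. */
    static String symbolFor(Class<?> owner, String name, Class<?>... params) {
        try {
            return methodSymbol(owner.getMethod(name, params));
        } catch (NoSuchMethodException e) {
            throw new IllegalArgumentException(e);
        }
    }

    public static void main(String[] args) {
        // prints java/lang/Integer.parseInt(Ljava/lang/String;I)I
        System.out.println(symbolFor(Integer.class, "parseInt", String.class, int.class));
    }
}
```

A real tool would presumably read the descriptors straight from the class
files, so that it does not have to load (and thereby initialise) the
library it is inspecting.
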
>
> So the parseInt example from [1] could be stored in the symbols file as:
>
> java.lang.Integer.parseInt(Ljava/lang/String;I)I
>
> - or -
>
> java/lang/Integer.parseInt(Ljava/lang/String;I)I
>
> Which would tell us that the class java.lang.Integer has a method with
> the signature "int parseInt(java.lang.String,int)". Personally I would
> prefer the second option of those two since it is easier to map to a
> file name in the jar file (plus it consistently uses the binary name
> instead of mixing source and binary name).
> ((For the rest of the email I will be using the format resembling the
> latter of the two in the example above.))
>
> In that case it is trivial to recycle the current symbols format (modulo
> using possibly forbidden characters in the symbol names). Such as:
>
> java/lang/Integer.parseInt(Ljava/lang/String;I)I@Base 1.1
>
> This obviously assumes we can tell a C-symbols file from a Java-symbols
> file and map the "SONAME" part accordingly. Since symbols cannot be
> versioned like in C, I believe that the @Base part would be redundant
> for Java.
>
>
> The only thing missing is how to handle a regular field / constant; here
> I see two "easy" options. Either use <encoded-type><field-name> or
> <field-name><delimiter><encoded-type>. Assuming ":" is the delimiter
> for the second option, it would look like:
>
> Imy/fictional/Code.length@Base 1.1
> Ljava/io/PrintStream;java/lang/System.out@Base 1.1
>
> - or -
>
> my/fictional/Code.length:I@Base 1.1
> java/lang/System.out:Ljava/io/PrintStream;@Base 1.1
>
> These encode the two fictional fields (or constants) "int length;" and
> "java.io.PrintStream out;" in the classes my.fictional.Code and
> java.lang.System (respectively). But I welcome alternatives.
> Particularly, for enums the type of the enum constant would be the
> same as the class it is in.
>
> From there we can extend it to having "class-sections", e.g.
> something like:
>
> class java/lang/Integer
> parseInt(Ljava/lang/String;I)I@Base 1.1
> # other symbols in java/lang/Integer
> class java/lang/String
> # symbols in java/lang/String
>
> This would reduce the size of the symbols file. From there on we could
> always extend the format to include more information and check for
> compatibility breakage beyond symbols.
>
> The Java examples for symbols files in this mail are mere suggestions,
> so if anyone has a better format to encode it in, please argue for it
> and its advantages.
>
> ~Niels
>
>
> [1] An example:
>
> int parseInt(String str, int radix);
>
> is mangled to (in byte-code format):
>
> parseInt(Ljava/lang/String;I)I
>
> Where "java/lang/String" is the binary name of the String class (L and ;
> are start and end-markers) and "I" is the binary name of the "int". The
> "I" after the end bracket denotes the return type, the types inside the
> brackets are the arguments (in order).
>
> Reference:
> http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#1169
>
> http://java.sun.com/docs/books/jvms/second_edition/html/ClassFile.doc.html#84645
>
> [2] The (fully qualifying) source format of (e.g.) "String" is
> java.lang.String; the binary format is java/lang/String - which is just
> a ".class" short of being the filename.
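
To illustrate the encoding described in [1], here is a minimal sketch
that decodes a mangled method symbol back into a source-like signature,
e.g. for showing readable diffs of a symbols file. The class and method
names (DescriptorDecoder, decodeType, decodeMethod) are hypothetical, and
the sketch assumes well-formed input; malformed input simply ends in an
exception.

```java
// Hypothetical helper for illustration only.
final class DescriptorDecoder {
    /** Decodes the type starting at pos[0] and advances pos[0] past it. */
    static String decodeType(String d, int[] pos) {
        int dims = 0;
        while (d.charAt(pos[0]) == '[') { // each '[' adds one array dimension
            dims++;
            pos[0]++;
        }
        char c = d.charAt(pos[0]++);
        String base;
        switch (c) {
            case 'B': base = "byte"; break;
            case 'C': base = "char"; break;
            case 'D': base = "double"; break;
            case 'F': base = "float"; break;
            case 'I': base = "int"; break;
            case 'J': base = "long"; break;
            case 'S': base = "short"; break;
            case 'Z': base = "boolean"; break;
            case 'V': base = "void"; break;
            case 'L': {
                // "L" <binary class name> ";"
                int end = d.indexOf(';', pos[0]);
                if (end < 0) {
                    throw new IllegalArgumentException("unterminated class name: " + d);
                }
                base = d.substring(pos[0], end).replace('/', '.');
                pos[0] = end + 1;
                break;
            }
            default:
                throw new IllegalArgumentException("bad type code '" + c + "' in " + d);
        }
        StringBuilder sb = new StringBuilder(base);
        for (int i = 0; i < dims; i++) {
            sb.append("[]");
        }
        return sb.toString();
    }

    /** "parseInt(Ljava/lang/String;I)I" -> "int parseInt(java.lang.String,int)". */
    static String decodeMethod(String symbol) {
        int open = symbol.indexOf('(');
        int close = symbol.indexOf(')');
        if (open < 0 || close < open) {
            throw new IllegalArgumentException("not a method symbol: " + symbol);
        }
        String name = symbol.substring(0, open);
        int[] pos = { open + 1 };
        StringBuilder args = new StringBuilder();
        while (pos[0] < close) { // argument types, in order
            if (args.length() > 0) {
                args.append(',');
            }
            args.append(decodeType(symbol, pos));
        }
        pos[0] = close + 1; // the return type follows the closing bracket
        String ret = decodeType(symbol, pos);
        return ret + " " + name + "(" + args + ")";
    }

    public static void main(String[] args) {
        // prints int parseInt(java.lang.String,int)
        System.out.println(decodeMethod("parseInt(Ljava/lang/String;I)I"));
    }
}
```

If the symbol carries a class prefix (java/lang/Integer.parseInt(...)),
this sketch leaves the prefix untouched as part of the name.
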

