Hello,

On šeštadienis 09 Sausis 2010 20:14:39 Raphael Hertzog wrote:
> On Tue, 05 Jan 2010, Modestas Vainius wrote:
> > a7533ad Add Cppfilt module - interface to the c++filt utility.
> 
> What happens when the toolchain starts using a (supposedly new) gnu-v4
> method of mangling? How will we deal with that?

When this happens, we will need the code to somehow detect current C++ ABI 
being used (what comes to mind is maybe checking which soname of libstdc++ the 
lib is linked against). Or alternative is to use c++filt 'auto' format 
instead. That one demangles C++ symbols fine at the moment and is likely to 
work with newer ABI in the future.

> Also what about support of other OS? How can we plug it in that code when
> necessary? (trying to answer myself) I guess we can use the Config module
>  and select the right type automatically.

Hmm, other OSes? I'm not sure what you mean. If that other OS does not use GNU 
toolchain, many things will need porting including this one (but is that an 
issue at the moment?). What is more, c++ pattern is literally dependent on the 
GNU c++filt(1) output.

> > c206e18 Add pattern related subs to Symbol interface.
> 
> Small error spotted:
> 
> --- a/scripts/Dpkg/Shlibs/Symbol.pm
> +++ b/scripts/Dpkg/Shlibs/Symbol.pm
> @@ -241,7 +241,7 @@ sub equals {
> 
>      # Compare names and tag sets
>      return 0 if $self->{symbol} ne $other->{symbol};
> -    return 0 if scalar(@{$self->{tagorder}}) !=
>  scalar(@{$self->{tagorder}}); +    return 0 if
>  scalar(@{$self->{tagorder}}) != scalar(@{$other->{tagorder}});
> 
>      for (my $i = 0; $i < scalar(@{$self->{tagorder}}); $i++) {
>         my $tag = $self->{tagorder}->[$i];

I will fix this.

> > 76b077b Replace Symbol::clone() with dclone() and sclone().
> 
> Why did you decide to have the two variants? Is it only a matter of
> saving memory in some cases when the deep clone is not required?
>
> Does it make a significant difference?

When cloning a symbol which matches a pattern (i.e. it has $symbol-
>{matching_pattern}), deep cloning would clone the matching pattern itself and 
all its matches ($symbol->{matching_pattern}{pattern}{matches}) and probably 
clone the same pattern again for each match etc etc. That's potentially an 
infinite loop and huge waste of memory (but I have not tested). So I chose a 
more straightforward approach (sclone) rather than deleting specific $symbol 
fields before cloning and adding them back after cloning (done that and didn't 
like it).

> > c268d0d Initialize patterns.
> 
> +    # Wildcard is an alias based pattern. It gets recognized here even if
>  it is +    # not specially tagged.
> +    if (my $ver = $self->get_wildcard_version()) {
> +       error(_g("you can't use wildcards on unversioned symbols: %s"), $_)
>  if $ver eq "Base +       $self->init_pattern(($self->is_pattern()) ?
>  'generic' : 'alias-wildcard'); +       $self->{pattern}{wildcard} = 1;
> +    }
> 
> Why the test on is_pattern() to initialize it either as generic or
>  alias-wildcard?
> 
> I imagined it could be in case you combine wildcard with regexp pattern but
>  I fail to see how it could work in practice, the initialization as regex
>  pattern comes right after that block...

It's true that wildcard (as it is now) + regexp will never work (* at the 
beginning would never match anything). However, it is possible to combine c++ 
and wildcard (in theory, see testsuite). 'generic' pattern is not just regexp. 
It is also any combination of patterns.

Aliases are there as optimization (i.e. hash is used for patterns where it can 
be used). I'm pretty sure some people will have (even if it is not 
recommended) all symbols demangled in their symbol files... On the other hand, 
aliases complicate implementation a bit, but not that much imho.

> > 9676657 Improve SymbolFile::dump() output.
> 
> To me it seems more logical to show pattern matches after the pattern
>  symbol, no? Is there a reason why you put them first?

That way looked better to me 5 months ago :) Now probably I agree with you. I 
will fix this.

> > b55f6b3 Document patterns in the manual page.
> 
> +counterpart defined in the symbol file. Whenever the first matching
>  pattern is +found, all its tags and properties will be used as a basis
>  specification of the +symbol. If neither pattern matches, the symbol will
>  be considered as MISSING.
> 
> MISSING -> new. Or did I miss something?

E.g. we have a symbol from objdump (aka real symbol in the manpage). First 
check if any non-pattern (specific symbol in the manpage) matches, then if any 
defined pattern matches. So if neither matches, the objdump symbol is MISSING, 
isn't it? That paragraph probably does not have a clear distinction between 
objdump symbol and symbol in the symbol file. Feel free to fix this wording 
up. I kinda ran into a corner with terminology.

> +Please also note that patterns are in the same way affected by
>  \fIoptional\fR, +\fIarch\fR and other standard tags whenever possible.
> 
> How "optional" affects patterns ought to be clarified (see below discussion
> about how it should maybe apply by default to wildcards).

Definition of optional is that dpkg-gensymbols won't fail if the symbol is 
MISSING (it will show up in diff though). Pattern is considered MISSING if 
does not match anything. You mean this info should be added to the manpage?

> +At the moment, \fBdpkg\-gensymbols\fR supports three atomic pattern types:
> 
> Why "atomic"?

Because combined patterns are also patterns. So atomic for c++, wildcard, 
regexp is the best I came up with. What term do you suggest?

> +for them). However, as these collisions happen on the ABI level, they
>  should +not degrade quality of the symbol file.
> 
> Can you explain more precisely why? (not necessarily in the manual page,
>  but at least for my own culture)

This is partially based on my question [1] and the following answer [2] (e.g. 
in case of doubled destructors on the ABI level). With regard to non-virtual 
trunks with different offsets, once a class has been created, its virtual 
table or class hierarchy cannot be modified or that will instantly break ABI 
[3]. As a result, a set of non-virtual thunks will have to remain constant 
since class introduction due to: 

1) neither base classes nor the class itself being allowed to add new / remove 
existing base classes;

2) base classes not being able to define new virtual methods or remove 
existing ones.

Only either of the two above could trigger a new non-virtual thunk in the 
derivative classes. Sure, usage of the pattern limits detection of such a ABI 
breakage, but it is enough for symbol versioning.

> About wildcard symbols, I'm not sure it makes sense to keep the current
> syntax only. We should introduce a new syntax:
>  (match-version)<version>
> 
> The old syntax *@<version> would auto-translate to:
>  (match-version|optional)<version>
> 
> That way we keep backwards compatibility with the current behaviour where
> an unused wildcard is not a problem. (Not sure optional can be used for
>  this purpose... otherwise we have to add "optional-pattern" and fix
> $sym->mark_not_found_in_library accordingly).

Ok. I see. However I would rather special case *@<version> pattern than 
introduce optional-pattern tag. Explanation below.

> I expect the glibc to have some generice wildcard entries used for all
>  arches but only some of them are used for a given arch (because support
>  for the arch has been added at a different point in time) and currently
>  dpkg-gensymbols would fail complaining that a pattern match got removed/is
>  unused.

So now there is an arch tag for this purpose. Therefore, I don't think 
optional-pattern should exist except maybe for backwards compatibility 
purposes with old behaviour.

Btw, given there is a better solution (arch tag on the wildcard), do you think 
we still need match-version? Why not just force people upgrade to new (and 
arguably better) ways?

> If you can think of a better name than "match-version", please suggest
> it... but right now I can't.

match-version name looks ok to me. However, I'm not sure how symbol name part 
should look of such a pattern. Something like

 (match-version=GLIBC_2.0)* 2.0

?

> 
> 
> Otherwise it looks mostly fine. Some stylish comments though:
> 
> Please avoid using
>  my $a = shift;
>  my $b = shift;
>  my $c = shift;
> Prefer:
>  my ($a, $b, $c) = @_;
> 
> readline($fh) should also be replaced by the more common <$fh>
> 
> > It would be great to see this feature in squeeze...
> 
> I think it should be ok for squeeze. :)

I will this fix those.

Fixes will be committed to symbol-patters branch in a couple of hours.

1. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=533916#40
2. http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=533916#45
3. 
http://techbase.kde.org/Policies/Binary_Compatibility_Issues_With_C++#The_Do.27s_and_Don.27ts


-- 
Modestas Vainius <[email protected]>

Attachment: signature.asc
Description: This is a digitally signed message part.

Reply via email to