Hello,

On Saturday 09 January 2010 23:14:50 Raphael Hertzog wrote:
> On Sat, 09 Jan 2010, Modestas Vainius wrote:
> > Hello,
> >
> > On Saturday 09 January 2010 20:14:39 Raphael Hertzog wrote:
> > > On Tue, 05 Jan 2010, Modestas Vainius wrote:
> > > > a7533ad Add Cppfilt module - interface to the c++filt utility.
> > >
> > > What happens when the toolchain starts using a (supposedly new) gnu-v4
> > > method of mangling? How will we deal with that?
> >
> > When this happens, we will need the code to somehow detect current C++
> > ABI being used (what comes to mind is maybe checking which soname of
> > libstdc++ the lib is linked against). Or alternative is to use c++filt
> > 'auto' format instead. That one demangles C++ symbols fine at the moment
> > and is likely to work with newer ABI in the future.
>
> Why not using "auto" right now then? It means "c++" would work for java as
> well which is weird but we can rename "c++" in "demangled". It also means
> we would auto-support the other mangling method used by other systems
> automatically.
It appears that at the moment 'auto' works like this:
1) try gnu-v3; if that succeeds, done, otherwise
2) c++filt (libiberty) goes to great lengths to autoselect between those
special c++ compilers (hp, arm, old gnu) etc.
So apparently 'auto' does not cover java. It is limited to c++ compilers.
Therefore, I will change 'gnu-v3' to 'auto'. Java will need to be implemented
separately (the Cppfilt module supports any number of daemons). I wrote the
pattern framework with 'java' in mind, hence adding it will probably take only
a couple of lines of code, but I won't do it now.
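For illustration only, here is a minimal Python sketch of calling c++filt with a selectable demangling style. It is not the Perl Cppfilt module; the function names cppfilt_argv and demangle are hypothetical, and it runs one batch per process instead of keeping a daemon open. The only thing taken from c++filt itself is the --format option with the 'auto', 'gnu-v3' and 'java' styles.

```python
import shutil
import subprocess


def cppfilt_argv(fmt="auto"):
    """Build the c++filt command line for one demangling style."""
    return ["c++filt", f"--format={fmt}"]


def demangle(symbols, fmt="auto"):
    """Demangle a batch of symbols (one per line) in a single c++filt run.

    Returns None when c++filt is not installed, so callers can fall back
    to the raw names. A real daemon would keep the process open instead.
    """
    if shutil.which("c++filt") is None:
        return None
    result = subprocess.run(
        cppfilt_argv(fmt),
        input="\n".join(symbols) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.splitlines()
```

Supporting java separately would then amount to a second invocation with fmt="java", which mirrors the "one daemon per format" idea.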
> Is there a big risk of mixup of types if we use "auto"? Can we still use
> c++filt as a daemon in that case or will we have troubles if we process
> several types of symbols in the same run?
>
> > Hmm, other OSes? I'm not sure what you mean. If that other OS does not
> > use GNU toolchain, many things will need porting including this one (but
> > is that an issue at the moment?). What is more, c++ pattern is literally
> > dependent on the GNU c++filt(1) output.
>
> Well, I remember someone who tried to port this code on some other OS
> (can't remember which one right now). If we can make it easier for them,
> we should do it, so it's best if we try to give it a thought now.
>
> (But I will not block on this)
>
> > I will fix this.
> >
> > > > 76b077b Replace Symbol::clone() with dclone() and sclone().
> > >
> > > Why did you decide to have the two variants? Is it only a matter of
> > > saving memory in some cases when the deep clone is not required?
> > >
> > > Does it make a significant difference?
> >
> > When cloning a symbol which matches a pattern (i.e. it has
> > $symbol->{matching_pattern}), deep cloning would clone the matching
> > pattern itself and all its matches
> > ($symbol->{matching_pattern}{pattern}{matches}) and probably clone the
> > same pattern again for each match etc etc. That's potentially an
> > infinite loop and huge waste of memory (but I have not tested). So I
> > chose a
>
> Right, makes sense. But maybe we should rather have a single clone() that
> is smart enough to not clone what is known to be references to other
> objects (as opposed to simple internal hashes). Because it's difficult to
> know when we should sclone() rather than dclone().
The rule is: dclone() when creating a completely new entity (load(),
create_pattern_match()) from an existing Symbol object, and sclone() elsewhere
(lookup_* etc.).
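The difference between the two clones can be sketched in Python as follows. This is an analogy, not the Perl code from the commit: the Pattern and Symbol classes and their fields are hypothetical stand-ins, with sclone() as a shallow copy and dclone() as a deep copy.

```python
import copy


class Pattern:
    def __init__(self, name):
        self.name = name
        self.matches = []  # symbols matched by this pattern


class Symbol:
    def __init__(self, name, tags=None, matching_pattern=None):
        self.name = name
        self.tags = dict(tags or {})
        self.matching_pattern = matching_pattern  # reference to another object

    def sclone(self):
        """Shallow clone: a new Symbol whose nested objects are shared."""
        return copy.copy(self)

    def dclone(self):
        """Deep clone: everything reachable from the symbol is duplicated."""
        return copy.deepcopy(self)


pattern = Pattern("some-pattern")
sym = Symbol("_ZN3FooC1Ev", tags={"optional": None}, matching_pattern=pattern)
pattern.matches.append(sym)  # symbol and pattern now reference each other

shallow = sym.sclone()
deep = sym.dclone()

# The shallow clone still points at the original pattern object.
assert shallow.matching_pattern is pattern
# The deep clone dragged along a full copy of the pattern and its matches.
assert deep.matching_pattern is not pattern
```

Python's deepcopy keeps track of objects it has already copied, so the cycle terminates here, but the pattern and every match are still duplicated, which is the memory cost described above.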
> > more straightforward approach (sclone) rather than deleting specific
> > $symbol fields before cloning and adding them back after cloning (done
> > that and didn't like it).
>
> You did that at one point in the code, though.
Yeah, but I would need to do something very similar either way because
create_pattern_match() actually "forks" a non-pattern symbol from a pattern.
> > > > b55f6b3 Document patterns in the manual page.
> > >
> > > +counterpart defined in the symbol file. Whenever the first matching
> > > pattern is +found, all its tags and properties will be used as a basis
> > > specification of the +symbol. If neither pattern matches, the symbol
> > > will be considered as MISSING.
> > >
> > > MISSING -> new. Or did I miss something?
> >
> > E.g. we have a symbol from objdump (aka real symbol in the manpage).
> > First check if any non-pattern (specific symbol in the manpage) matches,
> > then if any defined pattern matches. So if neither matches, the objdump
> > symbol is MISSING, isn't it? That paragraph probably does not have a
> > clear distinction between objdump symbol and symbol in the symbol file.
> > Feel free to fix this wording up. I kinda ran into a corner with
> > terminology.
>
> If you have a symbol in objdump output, you have a real existing symbol,
> but if you can't find it in the template symbols file neither as real nor
> through a pattern, it's a new symbol, right? It will be added in the
> resulting symbols file with the current version associated...
Uh, indeed, you are right. Apparently, I was thinking backwards when I wrote
this :P
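To pin the terminology down, a small Python sketch of the corrected logic: a symbol from objdump that matches neither a specific entry nor a pattern is new, while a symbols-file entry (specific or pattern) that matches nothing is MISSING. The classify function is hypothetical, and shell-style globs stand in for the real pattern types purely as an assumption for the example.

```python
import fnmatch


def classify(objdump_symbols, specific, patterns):
    """Return (new, missing) for a library against a symbols file.

    specific: exact symbol names listed in the symbols file
    patterns: glob patterns (an assumed stand-in for real pattern types)
    """
    new = []
    used_specific = set()
    used_patterns = set()
    for sym in objdump_symbols:
        # Specific entries take precedence over patterns.
        if sym in specific:
            used_specific.add(sym)
            continue
        # Otherwise the first matching pattern wins.
        hit = next((p for p in patterns if fnmatch.fnmatchcase(sym, p)), None)
        if hit is None:
            new.append(sym)  # in the library, nowhere in the symbols file
        else:
            used_patterns.add(hit)
    # Entries from the symbols file that nothing in the library matched.
    missing = [s for s in specific if s not in used_specific]
    missing += [p for p in patterns if p not in used_patterns]
    return new, missing
```

For example, with objdump symbols foo@Base, bar_1@Base and baz@Base, specific entries foo@Base and gone@Base, and patterns bar_*@Base and unused_*@Base, only baz@Base is new, while gone@Base and unused_*@Base are MISSING.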
> > > +Please also note that patterns are in the same way affected by
> > > \fIoptional\fR, +\fIarch\fR and other standard tags whenever possible.
> > >
> > > How "optional" affects patterns ought to be clarified (see below
> > > discussion about how it should maybe apply by default to wildcards).
> >
> > Definition of optional is that dpkg-gensymbols won't fail if the symbol
> > is MISSING (it will show up in diff though). Pattern is considered
> > MISSING if it does not match anything. You mean this info should be added to
> > the manpage?
>
> Yes...
Ok.
>
> > > +At the moment, \fBdpkg\-gensymbols\fR supports three atomic pattern
> > > types:
> > >
> > > Why "atomic"?
> >
> > Because combined patterns are also patterns. So atomic for c++, wildcard,
> > regexp is the best I came up with. What term do you suggest?
>
> I would suggest "basic" vs "combined".
Ok.
>
> > > About wildcard symbols, I'm not sure it makes sense to keep the current
> > > syntax only. We should introduce a new syntax:
> > > (match-version)<version>
> > >
> > > The old syntax *@<version> would auto-translate to:
> > > (match-version|optional)<version>
> > >
> > > That way we keep backwards compatibility with the current behaviour
> > > where an unused wildcard is not a problem. (Not sure optional can be
> > > used for this purpose... otherwise we have to add "optional-pattern"
> > > and fix $sym->mark_not_found_in_library accordingly).
> >
> > Ok. I see. However I would rather special case *@<version> pattern than
> > introduce optional-pattern tag. Explanation below.
>
> Why do we need optional-pattern? Wouldn't optional work as expected?
So we agree that it is not needed.
So *@<version> is 'match-version' with an implicit optional tag (we can't add
the tag automatically, otherwise it would appear in the diff).
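A rough Python sketch of that idea, assuming the syntax proposed above. Entry, parse_entry and is_optional are hypothetical names, and tag values such as arch=... are ignored for brevity. The point is that the old wildcard syntax sets a separate flag instead of gaining a real 'optional' tag, so nothing extra shows up when the entry is written back out.

```python
from dataclasses import dataclass


@dataclass
class Entry:
    tags: list
    name: str
    min_version: str
    implicit_optional: bool = False  # never written back to the file


def parse_entry(line):
    """Parse one '<spec> <minimal version>' line of a symbols file."""
    spec, min_version = line.split()
    if spec.startswith("*@"):
        # Old wildcard syntax: treat as match-version, optional by implication.
        return Entry(["match-version"], spec[2:], min_version,
                     implicit_optional=True)
    if spec.startswith("("):
        # Tagged syntax: (tag1|tag2)name
        tagpart, name = spec[1:].split(")", 1)
        return Entry(tagpart.split("|"), name, min_version)
    return Entry([], spec, min_version)


def is_optional(entry):
    """An unused entry is tolerated if it is optional, explicitly or not."""
    return entry.implicit_optional or "optional" in entry.tags
```

With this, "*@GLIBC_2.0 2.0" and "(match-version|optional)GLIBC_2.0 2.0" behave the same when unused, but only the second one carries the tag explicitly.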
> > > I expect the glibc to have some generic wildcard entries used for all
> > > arches but only some of them are used for a given arch (because
> > > support for the arch has been added at a different point in time) and
> > > currently dpkg-gensymbols would fail complaining that a pattern match
> > > got removed/is unused.
>
> [...]
>
> > Btw, given there is a better solution (arch tag on the wildcard), do you
> > think we still need match-version? Why not just force people upgrade to
> > new (and arguably better) ways?
>
> I want match-version primarily for consistency in the long term. It
> doesn't make sense to have an exception that does not use a tag based
> approach for wildcard. We should continue to support the old syntax for
> backwards compatibility but we should document the new syntax and
> deprecate the old syntax.
Ok.
>
> > > If you can think of a better name than "match-version", please suggest
> > > it... but right now I can't.
> >
> > match-version name looks ok to me. However, I'm not sure how symbol name
> > part should look of such a pattern. Something like
> >
> > (match-version=GLIBC_2.0)* 2.0
>
> My suggestion was:
> (match-version)GLIBC_2.0 2.0
Will be done. However, in case of such syntax, 'match-' looks kinda redundant.
What about:
(version)GLIBC_2.0 2.0
or
(symbol-version)GLIBC_2.0 2.0
or
(symver)GLIBC_2.0 2.0
(like http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/gnu-assembler/symver.html )
?
> Another alternative is to say that regex is enough for this but given that
> you don't like regex for performance reasons I suppose you prefer to
> implement match-version...
>
> (regex)^...@glibc_2.0$ 2.0
Yeah. By the way, do you prefer 'regexp' (as it is now) or 'regex'?
--
Modestas Vainius <[email protected]>

