Hello,

On Saturday 09 January 2010 23:14:50 Raphael Hertzog wrote:
> On Sat, 09 Jan 2010, Modestas Vainius wrote:
> > Hello,
> >
> > On Saturday 09 January 2010 20:14:39 Raphael Hertzog wrote:
> > > On Tue, 05 Jan 2010, Modestas Vainius wrote:
> > > > a7533ad Add Cppfilt module - interface to the c++filt utility.
> > >
> > > What happens when the toolchain starts using a (supposedly new) gnu-v4
> > > method of mangling? How will we deal with that?
> >
> > When this happens, we will need the code to somehow detect current C++
> > ABI being used (what comes to mind is maybe checking which soname of
> > libstdc++ the lib is linked against). Or alternative is to use c++filt
> > 'auto' format instead. That one demangles C++ symbols fine at the moment
> > and is likely to work with newer ABI in the future.
> 
> Why not using "auto" right now then? It means "c++" would work for java as
> well which is weird but we can rename "c++" in "demangled". It also means
> we would auto-support the other mangling method used by other systems
> automatically.

It appears that at the moment 'auto' works as follows:

1) try gnu-v3; if that succeeds, done; otherwise
2) c++filt (libiberty) goes to great lengths to autoselect between the styles
of those special C++ compilers (hp, arm, old gnu, etc.).

So apparently 'auto' does not cover java. It is limited to c++ compilers.

Therefore, I will change 'gnu-v3' to 'auto'. Java will need to be implemented
separately (the Cppfilt module supports any number of daemons). I wrote the
pattern framework with 'java' in mind, hence adding it will probably take only
a couple of lines of code, but I won't do it now.

> Is there a big risk of mixup of types if we use "auto"? Can we still use
> c++filt as a daemon in that case or will we have troubles if we process
> several types of symbols in the same run?
> 
> > Hmm, other OSes? I'm not sure what you mean. If that other OS does not
> > use GNU toolchain, many things will need porting including this one (but
> > is that an issue at the moment?). What is more, c++ pattern is literally
> > dependent on the GNU c++filt(1) output.
> 
> Well, I remember someone who tried to port this code on some other OS
> (can't remember which one right now). If we can make it easier for them,
> we should do it, so it's best if we try to give it a thought now.
> 
> (But I will not block on this)
> 
> > I will fix this.
> >
> > > > 76b077b Replace Symbol::clone() with dclone() and sclone().
> > >
> > > Why did you decide to have the two variants? Is it only a matter of
> > > saving memory in some cases when the deep clone is not required?
> > >
> > > Does it make a significant difference?
> >
> > When cloning a symbol which matches a pattern (i.e. it has
> > $symbol->{matching_pattern}), deep cloning would clone the matching
> > pattern itself and all its matches
> > ($symbol->{matching_pattern}{pattern}{matches}) and
> > probably clone the same pattern again for each match etc etc. That's
> > potentially an infinite loop and huge waste of memory (but I have not
> > tested). So I chose a
> 
> Right, makes sense. But maybe we should rather have a single clone() that
> is smart enough to not clone what is known to be references to other
> objects (as opposed to simple internal hashes). Because it's difficult to
> know when we should sclone() rather that dclone().

The rule is: use dclone() when creating a completely new entity (load(),
create_pattern_match()) from the existing Symbol object, and sclone()
elsewhere (lookup_* etc.).
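
The difference can be shown with a small Python sketch. This is only an
illustration, not the Perl implementation: the Symbol class and its fields
below are simplified stand-ins, and copy.copy()/copy.deepcopy() play the
roles of sclone() and dclone().

```python
import copy


class Symbol:
    """Simplified stand-in for a symbol object (hypothetical fields)."""

    def __init__(self, name):
        self.name = name
        self.matching_pattern = None  # reference to another Symbol
        self.matches = []             # symbols matched by this pattern

    def sclone(self):
        # Shallow clone: a new object that still points at the same
        # pattern object as the original.
        return copy.copy(self)

    def dclone(self):
        # Deep clone: copies everything reachable from this symbol,
        # including the pattern and all of its matches.
        return copy.deepcopy(self)


pattern = Symbol("*@GLIBC_2.0")
sym = Symbol("foo@GLIBC_2.0")
sym.matching_pattern = pattern
pattern.matches.append(sym)

shallow = sym.sclone()
deep = sym.dclone()

# The shallow clone shares the pattern; the deep clone drags along a
# private copy of the pattern together with everything it references.
shared = shallow.matching_pattern is pattern
duplicated = deep.matching_pattern is not pattern
```

So for lookups, where the result is just another handle on existing data,
the shallow variant avoids copying the whole pattern graph.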

> > more straightforward approach (sclone) rather than deleting specific
> > $symbol fields before cloning and adding them back after cloning (done
> > that and didn't like it).
> 
> You did that at one point in the code, though.

Yeah, but I would need to do something very similar either way because
create_pattern_match() actually "forks" a non-pattern symbol from a pattern.

> > > > b55f6b3 Document patterns in the manual page.
> > >
> > > +counterpart defined in the symbol file. Whenever the first matching pattern is
> > > +found, all its tags and properties will be used as a basis specification of the
> > > +symbol. If neither pattern matches, the symbol will be considered as MISSING.
> > >
> > > MISSING -> new. Or did I miss something?
> >
> > E.g. we have a symbol from objdump (aka real symbol in the manpage).
> > First check if any non-pattern (specific symbol in the manpage) matches,
> > then if any defined pattern matches. So if neither matches, the objdump
> > symbol is MISSING, isn't it? That paragraph probably does not have a
> > clear distinction between objdump symbol and symbol in the symbol file.
> > Feel free to fix this wording up. I kinda ran into a corner with
> > terminology.
> 
> If you have a symbol in objdump output, you have a real existing symbol,
> but if you can't find it in the template symbols file neither as real nor
> through a pattern, it's a new symbol, right? It will be added in the
> resulting symbols file with the current version associated...

Uh, indeed, you are right. Apparently, I was thinking backwards when I wrote 
this :P
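
To spell out the corrected logic, here is a small Python sketch. It is
illustrative only: compare() and its arguments are names made up for this
mail, patterns are modelled as plain predicates, and none of this is
dpkg-gensymbols code.

```python
def compare(library_symbols, specific, patterns):
    """Return (new, missing) for a library checked against a symbols file.

    library_symbols: symbol names as seen in objdump output
    specific:        set of specific (non-pattern) symbols in the file
    patterns:        dict mapping a pattern's text to a predicate
    """
    matched_specific = set()
    matched_patterns = set()
    new = []
    for sym in library_symbols:
        # 1) A specific symbol in the file always wins.
        if sym in specific:
            matched_specific.add(sym)
            continue
        # 2) Otherwise the first matching pattern is used.
        for text, predicate in patterns.items():
            if predicate(sym):
                matched_patterns.add(text)
                break
        else:
            # 3) Nothing in the file covers it: the symbol is new.
            new.append(sym)
    # MISSING is the opposite direction: file entries (specific symbols
    # or patterns) that nothing in the library matched.
    missing = sorted(
        (specific - matched_specific) | (set(patterns) - matched_patterns)
    )
    return new, missing


new, missing = compare(
    ["foo@Base", "bar@GLIBC_2.0", "baz@Base"],
    {"foo@Base", "gone@Base"},
    {
        "*@GLIBC_2.0": lambda s: s.endswith("@GLIBC_2.0"),
        "*@GLIBC_2.1": lambda s: s.endswith("@GLIBC_2.1"),
    },
)
```

Here baz@Base ends up as new, while gone@Base and the unused *@GLIBC_2.1
pattern end up as MISSING.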

> > > +Please also note that patterns are in the same way affected by \fIoptional\fR,
> > > +\fIarch\fR and other standard tags whenever possible.
> > >
> > > How "optional" affects patterns ought to be clarified (see below
> > > discussion about how it should maybe apply by default to wildcards).
> >
> > Definition of optional is that dpkg-gensymbols won't fail if the symbol
> > is MISSING (it will show up in diff though). Pattern is considered
> > MISSING if does not match anything. You mean this info should be added to
> > the manpage?
> 
> Yes...

Ok.

> 
> > > +At the moment, \fBdpkg\-gensymbols\fR supports three atomic pattern
> > > types:
> > >
> > > Why "atomic"?
> >
> > Because combined patterns are also patterns. So atomic for c++, wildcard,
> > regexp is the best I came up with. What term do you suggest?
> 
> I would suggest "basic" vs "combined".

Ok.

> 
> > > About wildcard symbols, I'm not sure it makes sense to keep the current
> > > syntax only. We should introduce a new syntax:
> > >  (match-version)<version>
> > >
> > > The old syntax *@<version> would auto-translate to:
> > >  (match-version|optional)<version>
> > >
> > > That way we keep backwards compatibility with the current behaviour
> > > where an unused wildcard is not a problem. (Not sure optional can be
> > > used for this purpose... otherwise we have to add "optional-pattern"
> > > and fix $sym->mark_not_found_in_library accordingly).
> >
> > Ok. I see. However I would rather special case *@<version> pattern than
> > introduce optional-pattern tag. Explanation below.
> 
> Why do we need optional-pattern? Wouldn't optional work as expected?

So we agree that it is not needed.

So *@<version> is 'match-version' with an implicit optional tag (the tag must
stay implicit: if we added it to the symbol automatically, it would show up in
the diff).
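
A rough Python sketch of what I mean (not the real parser: Entry and
parse_line are hypothetical names, the line format is reduced to
"<spec> <minver>", and 'match-version' is only the tag name currently under
discussion):

```python
import re
from dataclasses import dataclass, field

# "(tag1|tag2)name" prefix syntax for tagged entries.
TAGGED = re.compile(r"^\(([^)]*)\)(.+)$")


@dataclass
class Entry:
    name: str                      # symbol name, or version for a version pattern
    minver: str                    # minimal package version
    tags: list = field(default_factory=list)
    legacy_wildcard: bool = False  # True when written as *@<version>

    def is_version_pattern(self):
        return self.legacy_wildcard or "match-version" in self.tags

    def is_optional(self):
        # The old syntax behaves as optional without carrying the tag.
        return self.legacy_wildcard or "optional" in self.tags

    def dump(self):
        # Write the entry back exactly as it was given, so the implicit
        # optional never appears in the generated diff.
        if self.legacy_wildcard:
            return "*@%s %s" % (self.name, self.minver)
        prefix = "(%s)" % "|".join(self.tags) if self.tags else ""
        return "%s%s %s" % (prefix, self.name, self.minver)


def parse_line(line):
    spec, minver = line.rsplit(None, 1)
    m = TAGGED.match(spec)
    if m:
        return Entry(m.group(2), minver, m.group(1).split("|"))
    if spec.startswith("*@"):
        return Entry(spec[2:], minver, legacy_wildcard=True)
    return Entry(spec, minver)
```

With this, "*@GLIBC_2.0 2.0" is treated as an optional version pattern but is
dumped unchanged, while "(match-version)GLIBC_2.0 2.0" is only optional if the
tag is written out explicitly.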

> > > I expect the glibc to have some generic wildcard entries used for all
> > >  arches but only some of them are used for a given arch (because
> > > support for the arch has been added at a different point in time) and
> > > currently dpkg-gensymbols would fail complaining that a pattern match
> > > got removed/is unused.
> 
> [...]
> 
> > Btw, given there is a better solution (arch tag on the wildcard), do you
> > think we still need match-version? Why not just force people upgrade to
> > new (and arguably better) ways?
> 
> I want match-version primarily for consistency in the long term. It
> doesn't make sense to have an exception that does not use a tag based
> approach for wildcard. We should continue to support the old syntax for
> backwards compatibility but we should document the new syntax and
> deprecate the old syntax.

Ok.

> 
> > > If you can think of a better name than "match-version", please suggest
> > > it... but right now I can't.
> >
> > match-version name looks ok to me. However, I'm not sure how symbol name
> > part should look of such a pattern. Something like
> >
> >  (match-version=GLIBC_2.0)* 2.0
> 
> My suggestion was:
>  (match-version)GLIBC_2.0 2.0

Will be done. However, with such a syntax, the 'match-' prefix looks kinda redundant.
What about:

 (version)GLIBC_2.0 2.0

or

 (symbol-version)GLIBC_2.0 2.0

or 

 (symver)GLIBC_2.0 2.0

(like http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/gnu-assembler/symver.html )

?

> Another alternative is to say that regex is enough for this but given that
> you don't like regex for performance reasons I suppose you prefer to
> implement match-version...
> 
>  (regex)^...@glibc_2.0$ 2.0

Yeah. By the way, do you prefer 'regexp' (as it is now) or 'regex'?


-- 
Modestas Vainius <[email protected]>
