Launchpad has imported 28 comments from the remote bug at
http://sourceware.org/bugzilla/show_bug.cgi?id=12518.

If you reply to an imported comment from within Launchpad, your comment
will be sent to the remote bug automatically. Read more about
Launchpad's inter-bugtracker facilities at
https://help.launchpad.net/InterBugTracking.

------------------------------------------------------------------------
On 2011-02-25T02:26:20+00:00 Torvalds wrote:

Created attachment 5264
Example patch to give the basic idea

I realize that it's technically "undefined", but the behavior has
changed, and in the process broken existing binaries. It's a common bug
to use memcpy() instead of memmove(), and the traditional behavior of
"copy upwards" means that that bug can go unnoticed for a *long* time if
the memory move moves things downwards.

The change was introduced in commit
6fb8cbcb58a29fff73eb2101b34caa19a7f88eba ("Improve 64bit memcpy/memmove
for Atom, Core 2 and Core i7"), which was part of the 2.13 release.

As a result, upgrading from Fedora-13 to Fedora-14 caused applications
like flash to fail. But it's a bug that has been reported for other
applications too (and it's a bug that I've personally introduced in
'git' too - happily, people had run things like valgrind, so I don't
know of any _current_ cases of it).

And there is absolutely _zero_ reason to not handle overlapping areas
correctly. The "undefined behavior" is due to historical implementation
issues, not because it's better than just doing the right thing.

And now applications will randomly do different things depending on the
phase of the moon (technically, depending on which CPU they have and
what particular version of memcpy() glibc happens to choose).

So from a pure Quality-of-Implementation standpoint, just make glibc
implement memcpy() as memmove(), if you don't want to re-instate the
traditional behavior that binaries have at least been tested with. Don't
break random existing binaries just because the glibc version changed,
and they had never been tested with the new random behavior!

I'm attaching an EXAMPLE patch against the current glibc git tree: it
just tries to get rid of the unnecessary differences between memcpy and
memmove for the normal ssse3 case. The approach is simple:

 - small copies (less than 80 bytes) have hand-coded optimized code that
gets called through a jump table. Those cases already handle overlapping
areas correctly (simply because they do all loads up-front, and stores
at the end), and they are used for memmove() as-is.

   So change the memcpy() function to test for the small case first,
since that's the one that can be latency critical.

 - once we're talking about bigger memcpy() cases, the extra couple of
cycles to check "which direction should I copy" are totally immaterial,
and having two different versions of basically the same code is just
silly and is quite likely to cost more than (in I$ and page fault
overhead) than just doing it in the same code. So just remove all the
USE_AS_MEMMOVE games.

I haven't actually tested the patch, since building glibc is black magic
and the  simple "just alias __memmove_ssse3 to __memcpy_sse3" thing
didn't work for me and resulted in linker errors. There's probably some
simple (but subtle) magic reason for that, that I simply don't know
about.

So take the patch as a "hey, doing it this way should be simpler and
avoid the problem" rather than "apply this patch as-is". The ssse3-back
case (which prefers backwards copies) needs similar treatment, I'll
happily do that and test it if people just let me know that it's worth
my time (and tell me what the black magic incantation is).

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/0

------------------------------------------------------------------------
On 2011-02-25T02:29:37+00:00 Torvalds wrote:

Btw, this has hit quite a lot of people. It's Fedora

  https://bugzilla.redhat.com/show_bug.cgi?id=638477

and there are examples of having other things than just flash break
(like old gstreamer plugins etc)

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/1

------------------------------------------------------------------------
On 2011-02-25T03:57:42+00:00 Hjl-tools wrote:

Although the current memcpy in glibc follows the spec,
some applications call memcpy with overlapping destination
and source.

I think we should help those applications without punishing
the correctly written applications.  We can check an environment
variable for IFUNC, like LD_BIND_IFUNC_MEMCPY_TO_MEMMOVE.
If it is set, IFUNC memcpy will return memmove instead
of memcpy. I can prepare a patch to implement it if we
should do it.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/2

------------------------------------------------------------------------
On 2011-02-25T04:14:39+00:00 Torvalds wrote:

So is there any real reason to believe that memmove() can't just be as
fast as memcpy?

Seriously. That's what it all boils down to. Why have a separate
memcpy() at all, when it clearly is correct - and nice to people - to
always just implement it as a memmove(). And I really don't see the
downside. It's not like it's going to be slower.

People who want to check whether their application is portable can use
valgrind, the same way you have to check for portability issues in other
areas (eg the extra glibc specific printf formats etc - they just
"work", but they certainly aren't portable).

So why not just be nice.

If anything, from a "be nice" standpoint, it would perhaps be good to
have a "debug mode", that would actually _warn_ - or even assert - about
overlapping memcpy(). That obviously shouldn't be the default (since
that only helps developers), but it would be along the same lines of the
nice malloc debugging help that glibc can give.

IOW, I think this is an area where glibc can be _better_ than what is
required of it.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/3

------------------------------------------------------------------------
On 2011-02-25T09:59:18+00:00 Drepper-fsp wrote:

The existence of code written by people who should never have been
allowed to touch a keyboard cannot be allowed to prevent a correct
implementation.  If there is a large bod of code out there we can work
around the issue for that.

We can have the existing memcpy@GLIBC_2.2.5 implementation work around
the issue by avoiding the backward copying code.  A new
memcpy@GLIBC_2.14 could use the code currently in use.

Whether the current, new memcpy is only slightly faster than one
mimicking memmove is really not that important.  There are going to be
more and different implementations in the future and they shouldn't be
prevented from being used because of buggy programs.  We should be happy
to have this code here and now.

With this proposed implementation old code remains in working order
until those lousy programmers use a newer platform and then they will
experience the problems themselves and will fix them before releasing
their code.

I'm happy to entertain a patch to this effect.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/4

------------------------------------------------------------------------
On 2011-02-25T12:07:14+00:00 Jakub Jelinek wrote:

If we go that route (symbol versioning memcpy), then wouldn't it be
better to just alias the old memcpy@GLIBC_2.2.5 to memmove and have the
new memcpy@@GLIBC_2.14 be the only memcpy implementation? Having two
sets of memcpy implementations (especially when we have many memcpy
implementations on x86_64 and i?86 selectable via STT_GNU_IFUNC), even
if the workaroundish one is placed in compat .text subsections, would
waste too much code section in libc.so.6.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/5

------------------------------------------------------------------------
On 2011-02-25T15:19:47+00:00 Hjl-tools wrote:

(In reply to comment #5)
> If we go that route (symbol versioning memcpy), then wouldn't it be better to
> just alias the old memcpy@GLIBC_2.2.5 to memmove and have the new
> memcpy@@GLIBC_2.14 be the only memcpy implementation? Having two sets of 
> memcpy

I like this idea.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/6

------------------------------------------------------------------------
On 2011-02-26T01:56:18+00:00 Drepper-fsp wrote:

(In reply to comment #5)
> If we go that route (symbol versioning memcpy), then wouldn't it be better to
> just alias the old memcpy@GLIBC_2.2.5 to memmove and have the new
> memcpy@@GLIBC_2.14 be the only memcpy implementation?

That's what I have in mind.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/7

------------------------------------------------------------------------
On 2011-03-03T23:27:33+00:00 Felipe Contreras wrote:

"Buggy" code can be linked to the new memcpy@@GLIBC_2.14 by just
recompiling, no? That doesn't really solve anything.

Sure, code should use memcpy correctly, and if glibc has a compelling
reason to break these programs, it should. As Ulrich mentions in comment
#4 "There are going to be more and different implementations in the
future and they shouldn't be prevented from being used because of buggy
programs."

But _today_ that's not the case. Today, the regression can be fixed
easily with a patch like what Linus is proposing, and there will be no
*downside* whatsoever.

How about glibc breaks the behavior _only_ when there's an *upside* to
breaking it?

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/9

------------------------------------------------------------------------
On 2011-03-04T23:41:41+00:00 Drepper-fsp wrote:

(In reply to comment #8)
> "Buggy" code can be linked to the new memcpy@@GLIBC_2.14 by just recompiling,
> no? That doesn't really solve anything.

It solves everything.  If you just relink without retesting you're an
idiot and acting irresponsible.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/10

------------------------------------------------------------------------
On 2011-03-04T23:53:36+00:00 Torvalds wrote:

(In reply to comment #9)
> 
> It solves everything.  If you just relink without retesting you're an idiot 
> and
> acting irresponsible.

It does solve a lot, and at least fixes the "you broke stuff that used
to work" issue.

However, I still don't understand why you guys can't just admit that you
might as well just do memmove() and be done with it. It's not slower.
And the excuse that "you'll have more implementations in the future" is
just an even stronger reason to do that.

To make this very clear: your new "and improved" memcpy() ACTS
DIFFERENTLY ON DIFFERENT MACHINES. That means that all that testing that
was done by the developer at link-time is ALMOST TOTALLY WORTHLESS,
because what was tested wasn't necessarily at all what the user sees.

I really don't understand why you can't admit that random behavior like
that by a very fundamental core library routine is actually a real
problem.

And there really isn't any upside. The optimized routines up to 80 bytes
are the same (in fact, my patch should speed them up by a few cycles),
and anything bigger than that will not notice the extra couple of cycles
to check for overlap.

So while I agree that it's a fix to the immediate problem to just say
"don't screw up for existing binaries", I do NOT understand why the
glibc people then apparently think it's fine to be stupid for new
binaries.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/11

------------------------------------------------------------------------
On 2011-03-05T00:28:38+00:00 Felipe Contreras wrote:

(In reply to comment #9)
> (In reply to comment #8)
> > "Buggy" code can be linked to the new memcpy@@GLIBC_2.14 by just 
> > recompiling,
> > no? That doesn't really solve anything.
> 
> It solves everything.  If you just relink without retesting you're an idiot 
> and
> acting irresponsible.

You cannot test every possible code-path.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/12

------------------------------------------------------------------------
On 2011-03-05T13:28:28+00:00 Oliver-henshaw wrote:

The thread starting at http://lwn.net/Articles/414496/ may be
instructive on how this can cause problems despite the good(-ish)
intentions of the developer. In short, squashfs used memcpy with data
that was known not to overlap but later changes invalidated this
assumption.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/13

------------------------------------------------------------------------
On 2011-03-24T21:03:57+00:00 Bvasselle wrote:

(In reply to comment #9)
> (In reply to comment #8)
> > "Buggy" code can be linked to the new memcpy@@GLIBC_2.14 by just 
> > recompiling,
> > no? That doesn't really solve anything.
> 
> It solves everything.  If you just relink without retesting you're an idiot 
> and
> acting irresponsible.

Nobody is interested in an optimization in lib C that would at best gain
less than the simple fact of calling the function.

Everybody is interested in using the code that idiots may produce.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/14

------------------------------------------------------------------------
On 2011-03-25T04:37:07+00:00 Hjl-tools wrote:

Created attachment 5323
A patch

This works for me.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/15

------------------------------------------------------------------------
On 2011-03-25T06:45:50+00:00 Jakub Jelinek wrote:

This is not correct:
1) glibc 2.13 has been released already, so new symbol versions must be 
GLIBC_2.14
2) you do it only on x86_64, therefore you should add it into 
sysdeps/x86_64/Versions (though, you will need to add GLIBC_2.14 to toplevel 
Versions.def too)

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/16

------------------------------------------------------------------------
On 2011-03-28T11:53:48+00:00 Felipe Contreras wrote:

So we have two solutions:

1) The solution proposed by Linus Torvalds, attachment #5264

Advantages: everything works the same as before
Disadvantages: a few extra cycles in certain memcpy's

2) The solution proposed by Ulrich Drepper, attachment #5323

Advantages: old binaries keep working
Disadvantages: newly compiled code remains affected

Clearly 1) is the most sensible solution as the only advantage of 2) is
a few cycles, and has drastic disadvantages.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/17

------------------------------------------------------------------------
On 2011-03-29T14:43:31+00:00 Felipe Contreras wrote:

Actually, why not have both? I think this plan would fit everyone:

1) Apply Linus' patch

Both to 2.13 and 2.14, this would not introduce any issues and would
restore back the previous behavior, so applications would still work. As
I mentioned before, the disadvantages are minimal.

2) Apply H.J. Lu's patch

This would go to 2.14, but not only to x86 but all architectures, and
add an overlap check, and when triggered issue a crash.

This would allow old binaries to keep working on 2.14, but newly
compiled ones would crash if they are doing something wrong. This would
allow all the users of glibc to fix their code for the impending
changes.

3) Remove overlap checks

On 2.15, after a transition period, the overlap checks can be removed,
and thus save the few extra cycles.

This has all the advantages of all the proposals, and makes it easy to
fix the applications that are using memcpy wrong.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/21

------------------------------------------------------------------------
On 2011-03-29T22:27:52+00:00 Bvasselle wrote:

(In reply to comment #17)
> Actually, why not have both? I think this plan would fit everyone:

No, it does not. It certainly does not.

It is not only the problem of recompiling the existing code, it's also
the problem of fixing it and re-qualifying it. This plan has a huge
cost... and it's vain.

The contract C programmers have with C and the C library is clear: we
accept a fair amount of inefficency, but we don't have to program in
assembly nor care about the system's internals. How many people still
use the C library when it comes to be important to gain an addition plus
a comparison?

The problem we're facing just made this fact plain: there is no reason
why memcpy should not be memmove.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/22

------------------------------------------------------------------------
On 2011-03-30T05:14:25+00:00 Felipe Contreras wrote:

(In reply to comment #18)
> (In reply to comment #17)
> > Actually, why not have both? I think this plan would fit everyone:
> 
> No, it does not. It certainly does not.
> 
> It is not only the problem of recompiling the existing code, it's also the
> problem of fixing it and re-qualifying it. This plan has a huge cost... and
> it's vain.

I understand that, but it's better than the current alternative that
Ulrich is more likely to merge, which is basically to do nothing.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/23

------------------------------------------------------------------------
On 2011-03-30T07:21:26+00:00 Felipe Contreras wrote:

Created attachment 5341
Patch to check for overlaps

I'm trying this patch to find out how many things in the system use
overlapping memcpy, however, it doesn't seem to cause any crashes... Can
anyone spot something wrong?

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/24

------------------------------------------------------------------------
On 2011-03-31T08:36:56+00:00 Nick Shaforostoff wrote:

i'd like to vote for having memcpy and memmove identical.

(unless there are benchmarks that are showing that perf difference is
high)

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/25

------------------------------------------------------------------------
On 2011-04-01T23:41:31+00:00 Drepper-fsp wrote:

HJ's patch which implements what I proposed is in git.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/26

------------------------------------------------------------------------
On 2011-04-02T02:57:19+00:00 Hjl-tools wrote:

(In reply to comment #3)
>
> If anything, from a "be nice" standpoint, it would perhaps be good to have a
> "debug mode", that would actually _warn_ - or even assert - about overlapping
> memcpy(). That obviously shouldn't be the default (since that only helps

You can use LD_AUDIT to do it today.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/27

------------------------------------------------------------------------
On 2011-04-10T15:39:30+00:00 Bvasselle wrote:

(In reply to comment #4)
> The existence of code written by people who should never have been allowed to
> touch a keyboard cannot be allowed to prevent a correct implementation.  If
> there is a large bod of code out there we can work around the issue for that.
> 
Don't you see it is just NOT a correct implementation?

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/34

------------------------------------------------------------------------
On 2011-04-10T15:49:59+00:00 Felipe Contreras wrote:

Created attachment 5660
Patch to check for overlaps

Actually the code from Linus had a bug in the 'cmp' checks, here's the
updated version.

I just started to run my Fedora system with this, and I already see
crashes in pulseaudio and readahead-collector. I will continue running
this for a bit, but I think it's pretty clear that we should not assume
that all applications in a typical system have been using memcpy
properly.

So, again, I think we need at least a transition period so that people
can detect and fix the issues.

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/35

------------------------------------------------------------------------
On 2011-05-04T13:29:06+00:00 Vincent+libc wrote:

(In reply to comment #18)
> The problem we're facing just made this fact plain: there is no reason why
> memcpy should not be memmove.

If these two functions should behave in the same way, then why not all
programmers use memmove (which has fewer requirements)?

If memcpy is called while the objects overlap, the bug is not
necessarily that memmove should have been used instead. The cause may be
an incorrect size. So, with this point of view, it should be safer to
abort than letting the program behave in an uncontrolled way.

(In reply to comment #25)
> So, again, I think we need at least a transition period so that people can
> detect and fix the issues.

But it may be difficult to detect the issues. For instance, zsh was
affected by a similar problem (now fixed in CVS only) with the optimized
strcpy, but to detect the problem, it involves keyboard input:

  http://www.zsh.org/mla/workers/2011/msg00533.html
  http://www.zsh.org/mla/workers/2011/msg00544.html

For strcpy, this is even worse, as there is no strmove function, so that
either programmers have to write non-portable code or they have to
reimplement a naive version of strcpy or find some other workaround,
such as memmove + strlen:

  http://www.zsh.org/mla/workers/2011/msg00542.html

I suppose that if this has been done for memcpy, then strcpy should also
be patched in some way...

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/36

------------------------------------------------------------------------
On 2011-05-04T15:01:05+00:00 Felipe Contreras wrote:

(In reply to comment #26)
> (In reply to comment #25)
> > So, again, I think we need at least a transition period so that people can
> > detect and fix the issues.
> 
> But it may be difficult to detect the issues. For instance, zsh was affected 
> by
> a similar problem (now fixed in CVS only) with the optimized strcpy, but to
> detect the problem, it involves keyboard input:

Yes, it is difficult, that's why I think it should be enabled globally.

BTW. Here's another issue on zsh I found while enabling the memcpy checks:
https://bugzilla.redhat.com/show_bug.cgi?id=698439

Reply at: https://bugs.launchpad.net/glibc/+bug/727064/comments/37


** Changed in: glibc
       Status: Unknown => Fix Released

** Changed in: glibc
   Importance: Unknown => Medium

** Bug watch added: Red Hat Bugzilla #698439
   https://bugzilla.redhat.com/show_bug.cgi?id=698439

-- 
You received this bug notification because you are a member of Canonical
Partner Developers, which is subscribed to adobe-flashplugin in Ubuntu.
https://bugs.launchpad.net/bugs/727064

Title:
  [Natty] Strange beeping sound when viewing flash video

_______________________________________________
Mailing list: https://launchpad.net/~canonical-partner-dev
Post to     : canonical-partner-dev@lists.launchpad.net
Unsubscribe : https://launchpad.net/~canonical-partner-dev
More help   : https://help.launchpad.net/ListHelp

Reply via email to