[Bug other/70268] add option -ffile-prefix-map to map one directory name (old) to another (new) in __FILE__, __BASE_FILE__and __builtin_FILE()

2017-12-19 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70268

--- Comment #15 from infinity0 at pwned dot gg ---
(In reply to infinity0 from comment #14)
> > it has a unified option (-ffile-prefix-map) [..]
> 
> Oh, nice. That might save me some work, then.
> 
> Could you bounce me the thread? Or failing that, tell me the Message-ID of
> one of the messages, so I can reply to it.
> 
> > [..] that I believe does exactly what you want
> 

Thanks for forwarding the mail. I've posted a follow-up patch to be applied on
top of your patch, that implements the two things I mentioned.

[Bug other/70268] add option -ffile-prefix-map to map one directory name (old) to another (new) in __FILE__, __BASE_FILE__and __builtin_FILE()

2017-12-15 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70268

--- Comment #14 from infinity0 at pwned dot gg ---
> it has a unified option (-ffile-prefix-map) [..]

Oh, nice. That might save me some work, then.

Could you bounce me the thread? Or failing that, tell me the Message-ID of one
of the messages, so I can reply to it.

> [..] that I believe does exactly what you want

Basically yes. However one minor extra thing my patch enables, is the remapping
of arbitrary file paths - whereas the current form can't remap paths with "="
in it. This is not a major issue, but I'd recommend that you change:

-  p = strchr (arg, '=');
+  p = strrchr (arg, '=');

This would be a bit more flexible.

> the environment variable is a bad idea.

The problem with doing this on the command-line is that the command-line
arguments then contain the build-path. Sometimes, builds like to save the
command-line arguments somewhere.

Using an environment variable avoids this issue. I agree envvars are dirty in
general, but in this case one would be *removing* information from what GCC is
already doing (taking in absolute-path information via the filesystem), as
opposed to *adding* information which is what other envvars do.

A more refined way to avoid this issue, is to allow $-based substitution, like
`-ffile-prefix-map=$BUILD_PATH=/usr/my/path` where BUILD_PATH is read from the
environment. Then we avoid adding $BUILD_PATH to the command-line args, but
/usr/my/path is still in there. Daniel Kahn Gillmor implemented this here [1]
though it was rejected in favour of a simpler approach for the bug mentioned
#69821 [2], however in retrospect this mechanism is actually very useful and
flexible. I found out recently that NetBSD still carries this patch to this
day, for reproducible builds. [3]

[1] https://gcc.gnu.org/ml/gcc-patches/2015-12/msg01168.html
[2] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=69821
[3]
http://cvsweb.netbsd.org/bsdweb.cgi/src/external/gpl3/gcc/dist/gcc/final.c?rev=1.2=text/x-cvsweb-markup

[Bug other/70268] add option -ffile-prefix-map to map one directory name (old) to another (new) in __FILE__, __BASE_FILE__and __builtin_FILE()

2017-12-15 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70268

infinity0 at pwned dot gg changed:

   What|Removed |Added

 CC||infinity0 at pwned dot gg

--- Comment #12 from infinity0 at pwned dot gg ---
Are you aware of this patch?
https://gcc.gnu.org/ml/gcc-patches/2017-07/msg01315.html

If the overall motivation is to support reproducible builds, then you might be
interested in that, which is a unified mechanism that allows future extensions
to cover things *other* than __FILE__ and debug paths. (For example,
reproducible .o files which, I am told, contain STT_FILE entries that contain
the build path.)

The patch is not accepted yet; I had some review comments back in August and
I'm supposed to amend it to use a command-line flag instead of an environment
variable instead. I was going through my TODO backlog (non-GCC) since that
time, but I'll be finally working on this over the next week or so.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-24 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #9 from infinity0 at pwned dot gg ---
(In reply to rguent...@suse.de from comment #7)
> [..]
> 
> You still have to mark stmts with side-effects as volatile.
> 
> Conditional side-effects are tricky to get correct of course.

I think with the current asm() syntax and semantics, the only way to get this
correct in the C source code, is by using __volatile__. It is not possible to
even *express* "conditional side-effects" using the current syntax. To
demonstrate:

The "explicit zero check" that I mentioned in the first post does not actually
work, see the newer udiv.c test that I just attached. I think that is correct
behaviour from GCC as well.

AIUI, asm() without volatile, says to GCC: "this code will *never* cause side
effects under *any circumstance*, it depends only on its declared
inputs/outputs/clobbers". Under this assumption, it is correct to optimise

|   if(d) { asm(d); ... } ...

into

|   asm(d); if(d) { ... }

as long as the outputs to asm(d) don't clobber the inputs to if(d) or any
else-branches - I assume other parts of the optimiser will already check that.

However, it is *not* correct to perform the above, when the asm(d) has
conditional side-effects depending on d, that the if(d) checks for. But there
is no way to express this conditional-side-effect using the current asm()
syntax. So GCC will happily go ahead and perform the optimisation, since it
believes asm(d) will never have any side effects.

So, the pre-optimised code "looks correct", it's explicitly checking d, but
will in fact cause the types of unintended optimisations that we're looking at
here. The only way to avoid this is to use "volatile".

In conclusion: whilst making the optimiser more strict for this case as you
suggested, would fix this specific instance, I believe that it is not "the
proper fix" for GCC. A proper fix would have to involve changing the
*semantics* of a asm(div) call so that GCC understands that it has "side
effects depending on the divisor and n1[1]". Otherwise, more sophisticated
optimisations added to GCC in the future might make this issue come back.

If that is not done, then it would be necessary to fix these 20 projects's C
source code to properly express __volatile__. Actually this is probably
necessary regardless of any GCC fixes - from some quick searches it seems that
asm() semantics is not standardised, and also these projects might want to
support older GCC that has the existing semantics.

[1] it can also raise DE when n1 >= divisor.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-24 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

infinity0 at pwned dot gg changed:

   What|Removed |Added

  Attachment #42439|0   |1
is obsolete||
  Attachment #42440|0   |1
is obsolete||

--- Comment #8 from infinity0 at pwned dot gg ---
Created attachment 42461
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42461=edit
More specific and thorough test case

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #6 from infinity0 at pwned dot gg ---
What I mean is, even if you do change GCC to fix the unintended optimisation,
other projects' code are *still wrong* - it's only correct if you can assume
the C compiler is optimising your code in a very specific way, that happens to
safe for the specific sorts of "side effects" that div instructions do.

[Bug rtl-optimization/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #5 from infinity0 at pwned dot gg ---
(In reply to Richard Biener from comment #4)
> [..]
> It's still safe to move the asm in
> 
> int main() {
>   ulong d = 0;
>   for (ulong i = 0; i < 3; i++)
> for (ulong j = 0; j < 3; j++)
> {
>   ulong r;
>   __asm__("" : "=r"(d) : "rm"((ulong)0));
>   udiv_qrnnd(q, r, 0, 0, (i << d));
> }
> }
> 
> thus without the if ().

For the specific case of asm(div) I suppose it's safe because raising a DE is
pretty much similar to raising it several times in a loop, but it is correct to
assume that for all types of side effects?

[Bug inline-asm/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #2 from infinity0 at pwned dot gg ---
More immutable version of [5]:

[5] https://gmplib.org/repo/gmp/annotate/046bc83644a3/longlong.h#l1574

[Bug inline-asm/82677] Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

--- Comment #1 from infinity0 at pwned dot gg ---
Created attachment 42440
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42440=edit
Similar test case using macros from GMP et. al.

[Bug inline-asm/82677] New: Many projects (linux, coreutils, GMP, gcrypt, openSSL, etc) are misusing asm(divq/divl) etc, potentially resulting in faulty/unintended optimisations

2017-10-23 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82677

Bug ID: 82677
   Summary: Many projects (linux, coreutils, GMP, gcrypt, openSSL,
etc) are misusing asm(divq/divl) etc, potentially
resulting in faulty/unintended optimisations
   Product: gcc
   Version: 7.2.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: inline-asm
  Assignee: unassigned at gcc dot gnu.org
  Reporter: infinity0 at pwned dot gg
  Target Milestone: ---

Created attachment 42439
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=42439=edit
Basic optimisation example, C-reduced from flint test code

If I understood the issue correctly, this is strictly speaking not a GCC bug.
But I'm filing this here so there is a central place to have the discussion.
There is also the possibility of GCC developers discussing a central fix for
this issue in GCC itself, rather than patching 20+ FOSS projects.

Background
==

When using asm() it is vital that the dependencies are expressed correctly,
otherwise GCC can optimise stuff in an unintended way. Those new to the topic
(e.g. me a few days ago) can read:

- https://www.ibiblio.org/gferg/ldp/GCC-Inline-Assembly-HOWTO.html for a nice
intro
- https://gcc.gnu.org/onlinedocs/gcc/Extended-Asm.html for details

The Intel instruction manual [0] for the "div" and "idiv" instructions states
that:

- "The CF, OF, SF, ZF, AF, and PF flags are undefined."
- raises a #DE (Divide error) (#a) "If the source operand (divisor) is 0" and
(#b) "If the quotient is too large for the designated register."

Therefore, if you want to write a general-purpose "div" utility in GCC asm,
that assumes nothing about its inputs or outputs, the correct syntax would be
something along the lines of:

  __asm__ __volatile__("divq %4" : "=a"(q), "=d"(r) : "0"((ulong)n0),
"1"((ulong)n1), "rm"((ulong)d) : "cc");

- __volatile__ because raising an exception is a "side-effect", i.e. affects
something outside of the declared inputs and outputs, and
- "cc" because it might clobber the FLAGS register.

Note that __volatile__ is necessary even if the divisor "cannot be 0" in most
cases. (Therefore, it is necessary for a general-purpose "div" utility.) For
example, (##) if the code is unreachable for reasons unrelated to the division
operands, then the instruction will never raise a DE, but if __volatile__ is
not declared then GCC might optimise the code in a way such that the div
operation *is* reachable sometimes when the divisor is zero.

To demonstrate that example concretely, see udiv.c as a test case, where "if
(g)" should prevent the div from running, but GCC 7.2 -O2 lifts the asm() out
of the inner loop and the if(), causing a DE (SIGFPE) at run time. (udiv.c was
minimised from the real-world example at [1] using C-Reduce [2] - which I
couldn't have done this without.)

One way to make __volatile__ *actually* unnecessary, is to do an explicit
unconditional check (i.e. not based on a removeable macro) for divisor != 0
before the operation, and branch away if = 0. In this case, since divisor is an
input to the asm() instruction, GCC will then know not to reorder it with
respect to the branch.

Also, I am not sure if this sort of unexpected SIGFPE could potentially result
in a security problem. If so then perhaps the priority should be raised.

[0] https://software.intel.com/en-us/articles/intel-sdm
[1] https://github.com/fredrik-johansson/arb/issues/194
[2] https://embed.cs.utah.edu/creduce/

Problem
===

Lots of people copied the same code (longlong.h), with the same definition of
udiv_qrnnd, into their own projects. [3] At the time of writing this includes:
linux, grub2, coreutils, GMP, gcrypt, mpfr, and even older versions of GCC
itself. The code seems to originate from GMP, but I didn't confirm this or
investigate in great detail.

Now technically it is possible to use this correctly, by doing an unconditional
explicit check for divisor != 0 as mentioned above. However, this complexity is
*not* documented in the description, e.g. [4], which even describes the *other*
condition (#b) above, for the instruction not to raise an exception -
"HIGH_NUMERATOR must be less than DENOMINATOR for correct operation.". Even if
it was documented naively ("divisor must be zero"), one can imagine someone
interpreting this as (##) above, which would still be incorrect and still cause
faulty optimisations.

Also, there is a fallback macro __udiv_qrnnd_c [5] defined which reimplements
the instruction in C for platforms that don't have it, and in this case GCC
knows that the C "/" operation can cause div-by-zero errors, and optimises
everything as intended, as one can see by compili

[Bug debug/77985] DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

infinity0 at pwned dot gg changed:

   What|Removed |Added

  Attachment #39812|0   |1
is obsolete||

--- Comment #9 from infinity0 at pwned dot gg ---
Created attachment 39813
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39813=edit
Emit DW_AT_comp_dir even if filename is an absolute path

Whoops my bad, I did not mean to remove the remap_debug_filename call, that was
left over from some experimenting. Fixed now.

[Bug debug/77985] DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

--- Comment #8 from infinity0 at pwned dot gg ---
(In reply to Richard Biener from comment #6)
> So I just fixed the bug here, but yes, I don't know about the design
> decision.  I suppose CWD was decided to be useless in case of an absolute
> path to the file.
> 
> I don't think the debug info preserves -I paths in any standard way, so
> what's exactly the reason you consider CWD not redundant here?  [we could
> rewrite the file-name to relative to CWD]

Sorry, I only noticed your reply after I posted that patch. The main reasons
are:

1. So I can write a test case for the next patch I'd be submitting, since GCC
submission processes demand this :p and

2. Reduces code complexity - my patch deletes more lines than it adds.

3. It will be able to handle any extensions to DWARF in the future that add
relative paths elsewhere that one would want to resolve against DW_AT_comp_dir.
One could reuse the same logic everywhere (in bad pseudo-code):

resolve_some_dwarf_path (path) {
  return relative(path) ? DW_AT_comp_dir/path : path;
}

Conceptually, the working directory is independent from an input file, so it
would not be redundant to add it in the general case. *Sometimes*, *parts* of
it might be redundant yes - and rewriting the filename to be relative to this,
would remove the redundancy. Doing this is compatible with the above points,
and I could amend the patch to do this, although I suggest it's not worth the
effort (unless there is a function in GCC code that already does this).

[Bug debug/77985] DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

--- Comment #7 from infinity0 at pwned dot gg ---
Created attachment 39812
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39812=edit
Emit DW_AT_comp_dir even if filename is an absolute path

Suggested patch attached, with a test case.

Note that without this patch, it is hard to test anything relating to
DW_AT_comp_dir in the testsuite, since it compiles the test cases using
absolute paths.

[Bug debug/77985] DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

--- Comment #5 from infinity0 at pwned dot gg ---
> Piggybacking a slightly unrelated issue: [..]

Upon further investigation it seems that, whilst the debug-prefix-maps do not
get applied to DW_AT_name filenames in the output of -dA, it does get applied
in the final assembled binary. Sorry for the noise.

The other point about DW_AT_comp_dir and absolute paths remains though - I'm
thinking it would be simpler to just emit it in all cases, and get rid of the
IS_ABSOLUTE_PATH checks. Since most things "in the wild" actually do have this
typedef via standard library includes, this is actually the "expected"
behaviour, hence why I titled this bug the way it is titled.

[Bug debug/77985] DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

--- Comment #4 from infinity0 at pwned dot gg ---
Thanks for the quick response!

What is the reason for "absolute paths are supposed to omit it"? I'm reading
the DWARF spec and I can't find a mention of this anywhere.

Even if DW_AT_name is absolute, recording DW_AT_comp_dir is still useful
because it could be a different path and this would affect the interpretation
of -I flags and so on.

The reason I found this bug was because I am writing a patch to make GCC
produce debugging symbols that are reproducible regardless of the build path.
(I will explain it in more detail via mail soon.) This bug does not directly
affect that; however it does affect how easy it is to write tests to test my
patch.

Emitting DW_AT_comp_dir in some cases and not others, also makes it more
complex for others to write logic to deal with this attribute.

Piggybacking a slightly unrelated issue: I'm noticing that remap_debug_filename
does not get applied to DW_AT_name in certain cases. Is this intentional?
Hopefully not, since the easiest way I can write my patch is to apply it to
DW_AT_name in all cases.

[Bug debug/77985] DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

--- Comment #1 from infinity0 at pwned dot gg ---
Created attachment 39811
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=39811=edit
Reproduce the bug; set CC to try it with different compilers

[Bug debug/77985] New: DW_AT_comp_dir is omitted when filename is absolute and the file does not contain a specific typedef

2016-10-14 Thread infinity0 at pwned dot gg
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=77985

Bug ID: 77985
   Summary: DW_AT_comp_dir is omitted when filename is absolute
and the file does not contain a specific typedef
   Product: gcc
   Version: 7.0
Status: UNCONFIRMED
  Severity: normal
  Priority: P3
 Component: debug
  Assignee: unassigned at gcc dot gnu.org
  Reporter: infinity0 at pwned dot gg
  Target Milestone: ---

Hi, GCC 7.0.0 (latest snapshot) and GCC 6.1.1 both exhibit the following
behaviour:

+ gcc-build/destdir/usr/local/bin/gcc --version
gcc (GCC) 7.0.0 20161009 (experimental)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

+ cat test0.c
typedef __builtin_va_list __gnuc_va_list;
void func (void) {}
+ cat test1.c
void func (void) {}
+ gcc-build/destdir/usr/local/bin/gcc test0.c -g -dA -S -o- | grep comp_dir
.long   .LASF4  # DW_AT_comp_dir:
"/home/infinity0/var/lib/reproducible"
.uleb128 0x1b   # (DW_AT_comp_dir)
+ gcc-build/destdir/usr/local/bin/gcc test1.c -g -dA -S -o- | grep comp_dir
.long   .LASF2  # DW_AT_comp_dir:
"/home/infinity0/var/lib/reproducible"
.uleb128 0x1b   # (DW_AT_comp_dir)
+ gcc-build/destdir/usr/local/bin/gcc
/home/infinity0/var/lib/reproducible/test0.c -g -dA -S -o- | grep comp_dir
.long   .LASF4  # DW_AT_comp_dir:
"/home/infinity0/var/lib/reproducible"
.uleb128 0x1b   # (DW_AT_comp_dir)
+ gcc-build/destdir/usr/local/bin/gcc
/home/infinity0/var/lib/reproducible/test1.c -g -dA -S -o- | grep comp_dir
# exit code 1

See the attached file for a shell script where you can reproduce this yourself.