Issue 56419
Summary clang 15 instruction count regression with `-O2`
Labels new issue
Assignees
Reporter firewave
Compared to 14.0.6 I am seeing a slight regression in `Ir count` when running https://github.com/danmar/simplecpp with `valgrind --tool=callgrind`. I am aware that a higher instruction count does not automatically lead to decreased performance.

I was able to find a function affected by it (I removed some parameters and error handling code so it compiles standalone). With clang 14.0.6 I get a count of `710` per call and with clang 15 it is `721`.

```cpp
#include <string>
#include <istream>

static unsigned char readChar(std::istream &istr, unsigned int bom)
{
	unsigned char ch = static_cast<unsigned char>(istr.get());

	// For UTF-16 encoded files the BOM is 0xfeff/0xfffe. If the
	// character is non-ASCII character then replace it with 0xff
	if (bom == 0xfeff || bom == 0xfffe) {
		const unsigned char ch2 = static_cast<unsigned char>(istr.get());
		const int ch16 = (bom == 0xfeff) ? (ch<<8 | ch2) : (ch2<<8 | ch);
		ch = static_cast<unsigned char>(((ch16 >= 0x80) ? 0xff : ch16));
	}

	// Handling of newlines..
	if (ch == '\r') {
		ch = '\n';
		if (bom == 0 && static_cast<char>(istr.peek()) == '\n')
			(void)istr.get();
		else if (bom == 0xfeff || bom == 0xfffe) {
			int c1 = istr.get();
			int c2 = istr.get();
			int ch16 = (bom == 0xfeff) ? (c1<<8 | c2) : (c2<<8 | c1);
			if (ch16 != '\n') {
				istr.unget();
				istr.unget();
			}
		}
	}

	return ch;
}

std::string readUntil(std::istream &istr, const char start, const char end, unsigned int bom)
{
    std::string ret;
    ret += start;

    bool backslash = false;
    char ch = 0;
    while (ch != end && ch != '\r' && ch != '\n' && istr.good()) {
        ch = readChar(istr, bom);
        if (backslash && ch == '\n') {
            ch = 0;
            backslash = false;
            continue;
        }
        backslash = false;
        ret += ch;
        if (ch == '\\') {
            bool update_ch = false;
            char next = 0;
            do {
                next = readChar(istr, bom);
                if (next == '\r' || next == '\n') {
                    ret.erase(ret.size()-1U);
                    backslash = (next == '\r');
                    update_ch = false;
                } else if (next == '\\')
                    update_ch = !update_ch;
                ret += next;
            } while (next == '\\');
            if (update_ch)
                ch = next;
        }
    }

    if (!istr.good() || ch != end) {
        return "";
    }

    return ret;
}
```

https://godbolt.org/z/vKxPvPK7K
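
For anyone who wants to measure the per-call count outside of simplecpp, a small driver along the following lines might help. It is only a sketch: `driveReader` is a hypothetical helper, not part of simplecpp, and it assumes the caller has already consumed the opening delimiter, as the `ret += start` line in `readUntil` suggests.

```cpp
#include <cstddef>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not part of simplecpp): calls `reader` `iterations`
// times, each time on a fresh stream over `input`, and returns the total
// length of the returned strings. Returning the total gives the calls an
// observable result, so the compiler cannot drop them.
template <typename Reader>
std::size_t driveReader(Reader reader, const std::string &input, int iterations)
{
    std::size_t total = 0;
    for (int i = 0; i < iterations; ++i) {
        std::istringstream istr(input);
        // Assumption: the caller reads the opening delimiter itself and
        // passes it on as `start`; a bom of 0 means no UTF-16 handling.
        const char start = static_cast<char>(istr.get());
        total += reader(istr, start, '"', 0U).size();
    }
    return total;
}
```

Linked together with the reproducer above, a `main` that calls something like `driveReader(readUntil, "\"hello\" rest", 100000)` can be run under `valgrind --tool=callgrind`. The stream construction in each iteration adds its own instructions, so the numbers are only meaningful when comparing the two compiler versions against each other.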

The generated code differs considerably in places, and since I am not familiar with assembly I cannot tell whether that is good or bad. At first glance there seem to be more `4-byte Spill` and related occurrences.

There are already differences in the generated code at `-O1`. The code at `-O0` is identical.

With clang 15 there's also this additional code at the end:

```
DW.ref.__gxx_personality_v0:
        .quad   __gxx_personality_v0
```