+Roland

The only good solution is to get upstream glibc fixed and kept in that state: it needs to build with clang+asan, and bots need to verify on every commit that it still does. Roland wanted to try doing that; I'm not sure what the current state is.
Anyway, I think this should be discussed at [email protected]

On Wed, Feb 17, 2016 at 3:21 AM, Hanno Böck <[email protected]> wrote:

> Hi,
>
> I thought given the current issues with glibc I'd bring that up.
>
> A while ago I had a conversation with Kostya about building glibc with
> asan. I think it can be summed up as "it's possible, but requires lots
> of manual work and is complicated".
>
> The publicly available documentation is currently a wiki page listing
> problems trying to build glibc with clang
> https://sourceware.org/glibc/wiki/GlibcMeetsClang
> and some reports about fuzzing done with libfuzzer
> https://sourceware.org/glibc/wiki/FuzzingLibc
>
> As far as I can see there is currently no public documentation how one
> would compile glibc with asan (and/or libfuzzer).
>

True. My current instructions are pretty involved.
Let me dump them here FTR (last checked 1 month ago).

First, build glibc in a usual way:

wget http://ftp.gnu.org/gnu/glibc/glibc-2.22.tar.bz2
tar xf glibc-2.22.tar.bz2
(
mkdir glibc_build_plain
cd glibc_build_plain/
../glibc-2.22/configure --prefix=`pwd`/../glibc_inst_plain && make -j && make install
)

Grab fresh clang. Revert clang r255371, or apply this patch. Then rebuild clang.
(This is a long and sad story... Hopefully Roland will fix it)

--- llvm/tools/clang/lib/Sema/SemaDecl.cpp (revision 257672)
+++ llvm/tools/clang/lib/Sema/SemaDecl.cpp (working copy)
@@ -2381,7 +2381,7 @@
   // Attributes declared post-definition are currently ignored.
   checkNewAttributesAfterDef(*this, New, Old);
 
-  if (AsmLabelAttr *NewA = New->getAttr<AsmLabelAttr>()) {
+  if (0) if (AsmLabelAttr *NewA = New->getAttr<AsmLabelAttr>()) {
     if (AsmLabelAttr *OldA = Old->getAttr<AsmLabelAttr>()) {
       if (OldA->getLabel() != NewA->getLabel()) {
         // This redeclaration changes __asm__ label.

Now, download the attached clang-gcc-wrapper.py and put it into ./
Build using clang-gcc-wrapper.py instead of gcc:

(
mkdir glibc_build_clang  # name is important for clang-gcc-wrapper.py
cd glibc_build_clang
CC=`pwd`/../clang-gcc-wrapper.py ../glibc-2.22/configure --prefix=`pwd`/../glibc_inst_clang
make -k # -j
)

The build will fail to complete (thus -k), but it will produce all the needed .so files.

Now, copy the .so files into glibc_inst_plain/lib/:

for f in librt libdl libresolv libpthread libcrypt libm; do cp -v glibc_build_clang/*/$f.so glibc_inst_plain/lib/$f-2.22.so; done
cp glibc_build_clang/libc.so glibc_inst_plain/lib/libc-2.22.so

Note: if you are building a version other than 2.22, change the names accordingly.

Now, verify that you have proper instrumentation.

% cat use-after-free-on-gethostbyname.c
#include <netdb.h>
#include <stdlib.h>
int main() {
  char *x = (char*)malloc(10);
  free(x);
  gethostbyname(x);
}
% export SYSROOT=`pwd`/glibc_inst_plain/
% clang -g -fsanitize=address use-after-free-on-gethostbyname.c \
    -Wl,-rpath=$SYSROOT/lib -Wl,-dynamic-linker=$SYSROOT/lib/ld-2.22.so
% ASAN_OPTIONS=replace_str=0 ./a.out

You should get

==25426==ERROR: AddressSanitizer: heap-use-after-free on address 0x60200000eff0 at pc 0x7f65db399e2d bp 0x7ffe902a0c50 sp 0x7ffe902a0c48
READ of size 1 at 0x60200000eff0 thread T0
    #0 0x7f65db399e2c in __GI___libc_res_nsearch glibc-2.22/resolv/res_query.c:356:18

NOTE: the first frame should be inside the glibc code, not inside an asan interceptor.
Compare to running w/o ASAN_OPTIONS=replace_str=0:

    #0 0x49a490 in index llvm/projects/compiler-rt/lib/asan/asan_interceptors.cc:470:5

Finally, have a look at clang-gcc-wrapper.py to make sure you instrument all the code you need:

asan_whitelist = [#"posix",
                  "string", "wcsmbs", "wctype", "stdio-common", "time", "resolv"]

Some parts of glibc still don't build with clang, so we are using a whitelist.

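To make the whitelist behaviour concrete, here is a minimal sketch of the decision the wrapper makes per object file. The lists are copied from the attached clang-gcc-wrapper.py (commented-out entries left out); the function name allow_clang and the sample object names are only illustrative assumptions, apart from resolv/res_query, which corresponds to the frame in the trace above. An object is built with clang+asan only if it matches no blacklist pattern and at least one whitelist pattern.

```python
import re

# Patterns copied from clang-gcc-wrapper.py; each one is used as a regex.
BLACKLIST = ["rtld", "/dl-", "elf/"]
ASAN_BLACKLIST = ["rtld", "dl-", "elf", "string/mem", "time/time",
                  "time/gettimeofday", "time/timegm", "time/timespec_get",
                  "nptl/libc_pthread_init", "nptl/register-atfork",
                  "string/strstr", "nptl-init", "nptl", "string/strcasestr",
                  "string/strcspn-c", "string/strpbrk-c"]
ASAN_WHITELIST = ["string", "wcsmbs", "wctype", "stdio-common", "time", "resolv"]


def allow_clang(obj):
    """Return True if obj (e.g. "resolv/res_query") would get clang+asan."""
    # Any blacklist hit wins over the whitelist.
    if any(re.search(p, obj) for p in BLACKLIST + ASAN_BLACKLIST):
        return False
    # Otherwise the object must be explicitly whitelisted.
    return any(re.search(p, obj) for p in ASAN_WHITELIST)


if __name__ == "__main__":
    # Hypothetical object names, chosen only to exercise each branch.
    for obj in ["resolv/res_query", "string/strlen", "string/memcpy",
                "elf/dl-load", "math/s_sin"]:
        print(obj, "->", "clang+asan" if allow_clang(obj) else "gcc")
```

Because the patterns are matched with re.search, a short pattern such as "nptl" or "elf" excludes every object whose path contains it anywhere, so it is worth checking the printed decisions for the directories you care about before starting a long build.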
--kcc

>
> I think it is a major drawback of security analysis of glibc that many
> common tools don't work on it and it'd be great if this area could be
> improved. So the question is: How realistic would it be to make this
> stuff more easily accessible?
>
> --
> Hanno Böck
> https://hboeck.de/
>
> mail/jabber: [email protected]
> GPG: BBB51E42
>
> --
> You received this message because you are subscribed to the Google Groups
> "address-sanitizer" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to [email protected].
> For more options, visit https://groups.google.com/d/optout.
>

#!/usr/bin/env python
# Compiler wrapper for building glibc: forwards each invocation to gcc, or to
# clang with asan flags when the output object is accepted by AllowClang().
from __future__ import print_function  # keeps the script valid on Python 2 and 3

import os
import re
import sys

# Objects matching any of these regexes are always built with gcc.
blacklist=["rtld", "/dl-", "elf/",
#           "strtol",  ## something wrong with visibility
#           "string/mem", "time/time",
#           "time/gettimeofday", "time/timegm", "time/timespec_get",
#           "nptl/libc_pthread_init", "nptl/register-atfork", "string/strstr",
#           "string/strcasestr",
#           "/regcomp", # review in progress
#           "/regex", # review in progress
#           "posix/",
#           "openat",  # openat parameter mismatch
#           "htons",   # conflicting types
#           "nscd_helper",  # VLAIS
#           "csu/errno", # clang miscompiles aliases to tls, http://llvm.org/bugs/show_bug.cgi?id=21288
#           "signal/kill",  # repeated make fails
#           "signal/sigaltstack",  # repeated make fails
#           "csu/",  # link problems
#0 0x7f5147637bd9 in _mm_load_si128 .../x86_64-unknown-linux-gnu/5.0.0/include/emmintrin.h:688
#1 0x7f5147637bd9 in __strcspn_sse42 .../glibc-2.19/string/../sysdeps/x86_64/multiarch/strcspn-c.c:123
#           "string/strcspn-c",  # May read 16-aligned data outside of buffer.
#           "string/strpbrk-c",  # Same function as strcspn
#           "string/strspn-c",  # Same here
           ]

# Objects matching any of these regexes must not be built with asan.
asan_blacklist=["rtld", "dl-", "elf",
                "string/mem", "time/time",
                "time/gettimeofday", "time/timegm", "time/timespec_get",
                "nptl/libc_pthread_init", "nptl/register-atfork", "string/strstr",
                "nptl-init", "nptl",
                "string/strcasestr",
                "string/strcspn-c",  # May read 16-aligned data outside of buffer.
                "string/strpbrk-c",  # Same function as strcspn
               ]

# Only objects matching one of these regexes are built with clang+asan.
asan_whitelist = [#"posix",
                  "string", "wcsmbs", "wctype", "stdio-common", "time", "resolv"]

def AllowClang(obj):
  """Return True if object `obj` (path relative to the build dir) may use clang."""
  print("OBJ:", obj, "<<<<", file=sys.stderr)
  for b in blacklist:
    if re.search(b, obj):
      return False
  for b in asan_blacklist:
    if re.search(b, obj):
      return False
  for b in asan_whitelist:
    if re.search(b, obj):
      return True
  return False
  # Unreachable after the return above.
  if obj == "":
    return False
#  print("OK:", obj, "<<<<", file=sys.stderr)
  return True

if __name__ == '__main__':
  last_was_minus_o = False
  clang_ok = False
  compiler = "gcc"
  res = ["compiler"]  # placeholder for argv[0]; set to the real compiler below
  has_shared = False
  for a in sys.argv[1:]:
    if a == "-r":
      a = "-Wl,-r"
    if a == "-shared" or a == "-Wl,--whole-archive":
      has_shared = True
      print("HAS SHARED ===================")
    # Drop flags that clang does not accept or that break this build.
    if (a != "-fno-toplevel-reorder" and a != "-fno-section-anchors" and
        a != "-frounding-math" and a != "-Werror" and a != "-Wl,-z,defs"):
      res.append(a)
#    print(last_was_minus_o, a)
    if last_was_minus_o:
      # The build directory name must end in "build_clang" for this to match.
      m = re.match(r"/.*build_clang/(.*)\.os", a)
      if m:
        obj = m.group(1)
        if AllowClang(obj):
          clang_ok = True
    last_was_minus_o = a == "-o"
  if clang_ok: # and (not has_shared):
    compiler = "clang"
    res.append("-fno-integrated-as")
    res.append("-fheinous-gnu-extensions")
    res.append("-D__EXCEPTIONS")
    res.append("-Wno-builtin-requires-header")
    res.append("-Wno-gnu-variable-sized-type-not-at-end")
    res.append("-Wno-ignored-attributes")
    res.append("-Wno-macro-redefined")
    res.append("-Wno-array-bounds")
    res.append("-Wno-tautological-compare")
    res.append("-Wno-gnu-designator")
    res.append("-Wno-uninitialized")
    res.append("-Wno-dangling-else")
    res.append("-fsanitize=address")
    res.append("-fsanitize-coverage=edge")
    res.append("-DADDRESS_SANITIZER")
    print("CLANG:", res, file=sys.stderr)
#  print(res)
  res[0] = compiler
#  res.append("-fuse-ld=bfd")
  os.execvp(compiler, res)
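
When changing the lists it can help to see which command the wrapper would run without actually invoking a compiler. The following is a rough dry-run sketch of the argument rewriting in the script above, not a replacement for it. The names plan_command, DROPPED and ASAN_FLAGS are hypothetical, the warning-suppression flags are omitted for brevity, the dot in the .os suffix is escaped here, and the allow callback stands in for AllowClang.

```python
import re

# Flags the wrapper removes before forwarding the command line.
DROPPED = {"-fno-toplevel-reorder", "-fno-section-anchors",
           "-frounding-math", "-Werror", "-Wl,-z,defs"}
# Only the instrumentation flags; the script also appends -Wno-* options.
ASAN_FLAGS = ["-fsanitize=address", "-fsanitize-coverage=edge",
              "-DADDRESS_SANITIZER"]


def plan_command(args, allow):
    """Return the argv the wrapper would exec for `args`, without running it."""
    out = []
    use_clang = False
    prev_was_o = False
    for a in args:
        if a == "-r":
            a = "-Wl,-r"
        if a not in DROPPED:
            out.append(a)
        if prev_was_o:
            # Object name is the path below the *build_clang directory.
            m = re.match(r"/.*build_clang/(.*)\.os", a)
            if m and allow(m.group(1)):
                use_clang = True
        prev_was_o = (a == "-o")
    if use_clang:
        return ["clang"] + out + ASAN_FLAGS
    return ["gcc"] + out


if __name__ == "__main__":
    # Hypothetical compile command for one resolv object.
    args = ["-c", "-Werror", "res_query.c",
            "-o", "/tmp/glibc_build_clang/resolv/res_query.os"]
    print(" ".join(plan_command(args, lambda obj: obj.startswith("resolv/"))))
```

Only outputs given as an absolute path under a directory whose name ends in build_clang and ending in .os are considered for clang; everything else, including link steps, goes to gcc.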
