Hi all, I've been trying to integrate DataSketches into our ecosystem - really great work!
However, when I ran various sketches in Python over the raw CSV of the Lending Club data from Kaggle (1.6 GB) on macOS (Catalina), I noticed that after a while the process crashes with a mysterious segfault.

My Clang version:

    ➜ Workspace c++ --version
    Apple clang version 11.0.0 (clang-1100.0.33.17)
    Target: x86_64-apple-darwin19.5.0
    Thread model: posix
    InstalledDir: /Library/Developer/CommandLineTools/usr/bin

    ➜ Workspace gcc --version
    Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
    Apple clang version 11.0.0 (clang-1100.0.33.17)
    Target: x86_64-apple-darwin19.5.0
    Thread model: posix
    InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Replacing this with the Miniconda cxx toolchain solves the problem. I'll put together a script along with the data for reproducibility, but before that, I wonder if anyone has come across this issue before?

Cheers!
- Andy