Hi all,

I've been trying to integrate DataSketches into our ecosystem - really
great work!

However, when I ran various sketches in Python over the raw CSV of the
Lending Club data from Kaggle (1.6 GB), the process crashed after a while
with a mysterious segfault. This is on macOS Catalina.
My Clang version:

➜  Workspace c++ --version
Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

➜  Workspace gcc --version
Configured with: --prefix=/Library/Developer/CommandLineTools/usr --with-gxx-include-dir=/Library/Developer/CommandLineTools/SDKs/MacOSX.sdk/usr/include/c++/4.2.1
Apple clang version 11.0.0 (clang-1100.0.33.17)
Target: x86_64-apple-darwin19.5.0
Thread model: posix
InstalledDir: /Library/Developer/CommandLineTools/usr/bin

Building with the Miniconda cxx toolchain instead of the Apple one solves
the problem.

I'll put together a script along with the data for reproducibility, but
before that I wonder if anyone has come across this issue before?
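
For reference, here is a minimal sketch of what such a reproduction harness
might look like. It is not the actual script: `stream_column`,
`run_with_crash_trace`, and the `update` callback are hypothetical
placeholders, and the callback stands in for whichever sketch's update
method is under test. The only non-obvious piece is the standard-library
`faulthandler` module, which prints the Python-level traceback when the
process receives a fatal signal such as SIGSEGV.

```python
import csv
import faulthandler
import sys


def stream_column(lines, column, update):
    """Feed every parseable float found in `column` to `update`.

    `lines` is any iterable of CSV text lines (an open file works).
    `update` is a hypothetical callback, e.g. a sketch's update method.
    Returns the number of values that were fed.
    """
    fed = 0
    for row in csv.DictReader(lines):
        raw = (row.get(column) or "").strip()
        try:
            value = float(raw)
        except ValueError:
            continue  # skip blanks and non-numeric cells
        update(value)
        fed += 1
    return fed


def run_with_crash_trace(path, column, update):
    """Stream one column of the CSV at `path` with faulthandler enabled.

    If native code segfaults, faulthandler dumps the Python traceback
    of all threads to stderr before the process dies.
    """
    faulthandler.enable(file=sys.stderr, all_threads=True)
    with open(path, newline="") as handle:
        return stream_column(handle, column, update)
```

Running each sketch type and each column separately through something like
this should narrow down which combination triggers the crash.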

Cheers!
- Andy
