GitHub user shadowmmu added a comment to the discussion: [Performance] Velox Bloom Filter Inefficiency vs. Photon at 1TB Scale
Hi @zhli1142015, the default limit on the Spark side is 10 MB. I increased it to 1 GB to compare against Databricks' Photon, but it wasn't effective at all: even after filtering, nearly all the rows were still returned. Performance-wise, Velox is nearly 5x slower than Photon on Q17 with the default 10 MB bloom filter limit, and 2x slower on the full TPC-H suite. Changing the limit to 1 GB hasn't affected anything. And by the way, how did you increase bloomFilter.maxNumBits?

GitHub link: https://github.com/apache/incubator-gluten/discussions/11554#discussioncomment-15689959
