jnturton commented on issue #2421: URL: https://github.com/apache/drill/issues/2421#issuecomment-1008596721
@paul-rogers my personal bias, FWIW, is that it would take a major overall speedup on _real world_ Drill workloads to budge me from the cleanest and simplest memory format (which sounds like rows). Something like a genuine 5x before it started getting hard for me to wave it away, and again I don't mean in a synthetic benchmark. I'm just circulating material for the sake of enhancing the discussion.

> You're also assuming that there are SIMD hash functions: I'm not sure those exist: a search came up with somewhat random results. (SHA exists, however.)

Venturing a little off topic now but, as a point of general interest, there is reason to believe at least a few such algorithms must exist. The "moon boys" that kill the planet mining crypto ponzi tokens tend to use GPUs for that work. Security researchers made news a few years back by producing collisions in MD5 using a PlayStation 3 cluster. From the hardware employed I conclude that the hash functions being brute forced in these stories must admit data-parallel algorithms and would also get some speedup from SIMD.
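
To make the data-parallel point concrete, here is a minimal sketch, not anything taken from Drill. It assumes the parallelism comes from hashing many independent keys at once rather than from speeding up a single hash, and it uses the MurmurHash3 32-bit finalizer purely as an example mixing function. The names `fmix32` and `fmix32_lanes` are hypothetical. Plain Python lists stand in for SIMD lanes: every step is applied to all lanes in lockstep and no lane reads another lane's value, which is the property a vector unit needs.

```python
MASK32 = 0xFFFFFFFF  # keep every intermediate value within 32 bits


def fmix32(h: int) -> int:
    """Scalar form: MurmurHash3 32-bit finalizer applied to one key."""
    h &= MASK32
    h ^= h >> 16
    h = (h * 0x85EBCA6B) & MASK32
    h ^= h >> 13
    h = (h * 0xC2B2AE35) & MASK32
    h ^= h >> 16
    return h


def fmix32_lanes(keys):
    """Lane-wise form: the same steps, each applied to all keys in lockstep.

    Each list comprehension models one vector instruction. There is no
    cross-lane dependency, so the lanes could be processed in parallel.
    """
    lanes = [k & MASK32 for k in keys]
    lanes = [h ^ (h >> 16) for h in lanes]
    lanes = [(h * 0x85EBCA6B) & MASK32 for h in lanes]
    lanes = [h ^ (h >> 13) for h in lanes]
    lanes = [(h * 0xC2B2AE35) & MASK32 for h in lanes]
    lanes = [h ^ (h >> 16) for h in lanes]
    return lanes
```

The lane-wise version returns the same values as calling `fmix32` on each key in turn. Whether this pays off on real Drill workloads is a separate question that only measurement could answer.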