Could you provide a script with which people can reproduce the performance comparison? That way we can take a closer look.
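Something along the lines of the sketch below would probably be enough. To be clear, this is only a rough sketch, not your code: the 13M-row synthetic table, the `mjd` column name, the reference point, and the threshold are placeholders I made up (only `RA_deg`/`Dec_deg` and one constant are taken from your expression), and it assumes a pyarrow version with the public `pyarrow.acero` module. The synthetic values and units are irrelevant here since only the timing is of interest.

```
import time

import numpy as np
import pyarrow as pa
import pyarrow.acero as acero
import pyarrow.compute as pc

# Synthetic stand-in for the real table: 13M rows of sky positions plus a
# small set of repeated observation times. Column names other than
# RA_deg/Dec_deg are guesses; values/units are placeholders for timing only.
N = 13_000_000
rng = np.random.default_rng(42)
ra = rng.uniform(0.0, 360.0, N)
dec = rng.uniform(-90.0, 90.0, N)
mjd = rng.choice(np.linspace(60000.0, 60010.0, 100), N)
table = pa.table({"RA_deg": ra, "Dec_deg": dec, "mjd": mjd})


def pc_angular_separation(lon1, lat1, lon2, lat2):
    # Vincenty great-circle distance written with pyarrow.compute; it builds
    # an Expression when given field references.
    sdlon = pc.sin(pc.subtract(lon2, lon1))
    cdlon = pc.cos(pc.subtract(lon2, lon1))
    slat1 = pc.sin(lat1)
    slat2 = pc.sin(lat2)
    clat1 = pc.cos(lat1)
    clat2 = pc.cos(lat2)
    num1 = pc.multiply(clat2, sdlon)
    num2 = pc.subtract(pc.multiply(slat2, clat1),
                       pc.multiply(pc.multiply(clat2, slat1), cdlon))
    denominator = pc.add(pc.multiply(slat2, slat1),
                         pc.multiply(pc.multiply(clat2, clat1), cdlon))
    hypot = pc.sqrt(pc.add(pc.multiply(num1, num1), pc.multiply(num2, num2)))
    return pc.atan2(hypot, denominator)


target_ra, target_dec = 168.9776949652776, 21.5   # arbitrary reference point
target_mjd = float(mjd[0])                        # a time that exists in the data
threshold = 0.01                                  # arbitrary cutoff

# --- Acero: table source -> filter on the time -> filter on the separation ---
sep_expr = pc_angular_separation(target_ra, target_dec,
                                 pc.field("RA_deg"), pc.field("Dec_deg"))
plan = acero.Declaration.from_sequence([
    acero.Declaration("table_source", acero.TableSourceNodeOptions(table)),
    acero.Declaration("filter", acero.FilterNodeOptions(pc.field("mjd") == target_mjd)),
    acero.Declaration("filter", acero.FilterNodeOptions(sep_expr < threshold)),
])
t0 = time.perf_counter()
acero_result = plan.to_table()
print(f"acero: {time.perf_counter() - t0:.4f} s, {acero_result.num_rows} rows")


# --- Naive numpy: materialize every intermediate, then one boolean mask ---
def np_angular_separation(lon1, lat1, lon2, lat2):
    sdlon = np.sin(lon2 - lon1)
    cdlon = np.cos(lon2 - lon1)
    num1 = np.cos(lat2) * sdlon
    num2 = np.sin(lat2) * np.cos(lat1) - np.cos(lat2) * np.sin(lat1) * cdlon
    denominator = np.sin(lat2) * np.sin(lat1) + np.cos(lat2) * np.cos(lat1) * cdlon
    return np.arctan2(np.sqrt(num1 * num1 + num2 * num2), denominator)


t0 = time.perf_counter()
sep = np_angular_separation(target_ra, target_dec, ra, dec)
mask = (mjd == target_mjd) & (sep < threshold)
numpy_result = table.filter(mask)
print(f"numpy: {time.perf_counter() - t0:.4f} s, {numpy_result.num_rows} rows")
```

If that doesn't match what you are actually doing, in particular how the plan is constructed or how the numpy path is timed, a copy of your real script would be even better.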
On Mon, Aug 21, 2023 at 8:42 PM Spencer Nelson <[email protected]> wrote:

> I'd like some help calibrating my expectations regarding acero
> performance. I'm finding that some pretty naive numpy is about 10x faster
> than acero for my use case.
>
> I'm working with a table with 13,000,000 values. The values are angular
> positions on the sky and times. I'd like to filter to a specific one of the
> times, and to values within a calculated great-circle distance on the sky.
>
> I've implemented the Vincenty formula (
> https://en.wikipedia.org/wiki/Great-circle_distance) for this:
>
> ```
> def pc_angular_separation(lon1, lat1, lon2, lat2):
>     sdlon = pc.sin(pc.subtract(lon2, lon1))
>     cdlon = pc.cos(pc.subtract(lon2, lon1))
>     slat1 = pc.sin(lat1)
>     slat2 = pc.sin(lat2)
>     clat1 = pc.cos(lat1)
>     clat2 = pc.cos(lat2)
>
>     num1 = pc.multiply(clat2, sdlon)
>     num2 = pc.subtract(pc.multiply(slat2, clat1),
>                        pc.multiply(pc.multiply(clat2, slat1), cdlon))
>     denominator = pc.add(pc.multiply(slat2, slat1),
>                          pc.multiply(pc.multiply(clat2, clat1), cdlon))
>     hypot = pc.sqrt(pc.add(pc.multiply(num1, num1), pc.multiply(num2, num2)))
>     return pc.atan2(hypot, denominator)
> ```
>
> The resulting pyarrow.compute.Expression is fairly monstrous:
>
> <pyarrow.compute.Expression atan2(sqrt(add(multiply(multiply(cos(Dec_deg),
> sin(subtract(RA_deg, 168.9776949652776))), multiply(cos(Dec_deg),
> sin(subtract(RA_deg, 168.9776949652776)))),
> multiply(subtract(multiply(sin(Dec_deg), -0.9304510671785976),
> multiply(multiply(cos(Dec_deg), 0.3664161726591893), cos(subtract(RA_deg,
> 168.9776949652776)))), subtract(multiply(sin(Dec_deg),
> -0.9304510671785976), multiply(multiply(cos(Dec_deg), 0.3664161726591893),
> cos(subtract(RA_deg, 168.9776949652776))))))), add(multiply(sin(Dec_deg),
> 0.3664161726591893), multiply(multiply(cos(Dec_deg), -0.9304510671785976),
> cos(subtract(RA_deg, 168.9776949652776)))))>
>
> Then my Acero graph is very simple. Just a table source node, then a
> filter node on the timestamp (for exact match), and then another filter
> node for a computed value of that expression under a threshold.
>
> For 13 million observations, this takes about 15ms on my laptop using
> Acero.
>
> But the same computation done with totally naive numpy is about 3ms.
>
> The numpy version has no fanciness, just calling numpy trigonometric
> functions and materializing all the intermediate results like you might
> imagine, then eventually coming up with a boolean mask over everything and
> calling `table.filter(mask)`.
>
> So finally, my question: is this about what I should expect? I know Acero
> has an advantage that it *would* work if my data were larger than fits in
> memory, which is not true of my numpy approach. But I expected that Acero
> would need to only visit columnar values once, so it should be able to
> outpace the numpy approach. Should I instead think of Acero as mainly about
> working on very large datasets?
>
> -Spencer
>

--
Regards,
Chak-Pong
