I would like to do an explicit loop over a large DataFrame and evaluate a 
function which depends on a subset of the columns in an arbitrary way. What 
would be the fastest way to accomplish this? Presently, I’m doing something like

~~~
using DataFrames

# Evaluate f on row i using columns :a, :b and :c.
f(df::DataFrame, i::Integer) = df[i, :a] * df[i, :b] + df[i, :c]

for i in 1:nrow(df)
    x = f(df, i)
end
~~~

which, according to Profile, is a major bottleneck.

Would it make sense to somehow pre-create an immutable type corresponding to a 
single row (my data are BitsKind), and run a compiled function on these 
row-objects with strong typing?
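
Roughly, the idea would look like the sketch below. It is only an illustration under assumptions: the names `Row`, `f_row` and `eval_rows` are hypothetical, the three columns are assumed to be `Float64`, and plain vectors stand in for the DataFrame columns so that the snippet runs on its own. It is written in Julia 1.x syntax, where `struct` replaces the older `immutable` keyword.

~~~julia
# Hypothetical row type: one isbits field per column that f uses.
struct Row
    a::Float64
    b::Float64
    c::Float64
end

# Same computation as f above, but on a concretely typed row object.
f_row(r::Row) = r.a * r.b + r.c

# Loop over plain column vectors; inside this function every type is concrete,
# so the compiler can specialize the loop body.
function eval_rows(a::Vector{Float64}, b::Vector{Float64}, c::Vector{Float64})
    out = Vector{Float64}(undef, length(a))
    for i in eachindex(a, b, c)
        out[i] = f_row(Row(a[i], b[i], c[i]))
    end
    return out
end

# Stand-in data (assumption); with a real DataFrame these would be its
# column vectors, extracted once before the loop.
a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
c = [0.5, 0.5, 0.5]
result = eval_rows(a, b, c)
~~~

The point of the sketch is that the columns are pulled out once and the per-row work happens in a function whose argument types are known, rather than indexing into the DataFrame on every iteration. Whether the intermediate `Row` object buys anything over passing the three scalars directly is part of what I am unsure about.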

Thanks in advance for any advice,
Joosep
