In my last example, the function mean() was not well chosen. What I would actually like to calculate is a statistical test line by line, such as TTest or Wilcoxon. This is why I need to iterate through 2 DataFrames at the same time if I subset the DataFrame first to increase speed :)
Something like:

julia> for r1,r2 in eachrow(df1, df2)
           println(TTest(r1,r2))
       end
ERROR: syntax: invalid iteration specification

On Saturday, 21 November 2015 at 19:17:27 UTC+1, Fred wrote:
>
> It is a good idea, but how is it possible to iterate over two DataFrames at the
> same time? Something like:
>
> julia> df = DataFrame(a=1:5, b=7:11, c=10:14, d=20:24)
> 5x4 DataFrames.DataFrame
> | Row | a | b  | c  | d  |
> |-----|---|----|----|----|
> | 1   | 1 | 7  | 10 | 20 |
> | 2   | 2 | 8  | 11 | 21 |
> | 3   | 3 | 9  | 12 | 22 |
> | 4   | 4 | 10 | 13 | 23 |
> | 5   | 5 | 11 | 14 | 24 |
>
> julia> df1 = df[1:2,]
> 5x2 DataFrames.DataFrame
> | Row | a | b  |
> |-----|---|----|
> | 1   | 1 | 7  |
> | 2   | 2 | 8  |
> | 3   | 3 | 9  |
> | 4   | 4 | 10 |
> | 5   | 5 | 11 |
>
> julia> df2 = df[3:4,]
> 5x2 DataFrames.DataFrame
> | Row | c  | d  |
> |-----|----|----|
> | 1   | 10 | 20 |
> | 2   | 11 | 21 |
> | 3   | 12 | 22 |
> | 4   | 13 | 23 |
> | 5   | 14 | 24 |
>
> julia> for r1,r2 in eachrow(df1, df2)
>            println(mean(r1,r2))
>        end
> ERROR: syntax: invalid iteration specification
>
> On Saturday, 21 November 2015 at 15:08:34 UTC+1, tshort wrote:
>>
>> For the subset, do the indexing after the conversion to an array, or
>> subset the DataFrame first (probably faster).
>>
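
For reference, the syntax error in the loops above comes from `for r1,r2 in ...`: binding two loop variables at once needs a parenthesized tuple, and `eachrow` takes a single DataFrame. One possible way around it is to pair the rows with `zip`. The sketch below is only an illustration under stated assumptions: it assumes a recent DataFrames.jl (column selection written as `df[:, 1:2]`) and the HypothesisTests.jl package, and it uses `UnequalVarianceTTest` as a stand-in for the `TTest` shorthand used in this thread. Neither the package nor that particular test is named in the original messages.

```julia
using DataFrames
using HypothesisTests  # assumption: installed; provides UnequalVarianceTTest and pvalue

df = DataFrame(a=1:5, b=7:11, c=10:14, d=20:24)

# Split by columns: a, b form one group and c, d the other.
df1 = df[:, 1:2]
df2 = df[:, 3:4]

# zip pairs row i of df1 with row i of df2.
# Note the parentheses around (r1, r2): they are required for tuple destructuring.
pvalues = Float64[]
for (r1, r2) in zip(eachrow(df1), eachrow(df2))
    x = Float64.(Vector(r1))   # DataFrameRow -> Vector{Float64}
    y = Float64.(Vector(r2))
    push!(pvalues, pvalue(UnequalVarianceTTest(x, y)))
end

println(pvalues)
```

With only two values per group the test itself is not meaningful here; the point is just the lockstep iteration. A Wilcoxon-type test could presumably be swapped in at the `push!` line in the same way.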