Hello,

I have an input data frame with multiple rows, and I want to apply a
function to each row. The input data frame has 1,000,000+ rows. How can I
speed up my code? I would like to keep the function "func".

Here is a reproducible example with a simple function:

    library(tictoc)
    library(dplyr)
    library(purrr)  # provides map(), used below

    func <- function(coord, a, b, c){

      X1 <- as.vector(coord[1])
      Y1 <- as.vector(coord[2])
      X2 <- as.vector(coord[3])
      Y2 <- as.vector(coord[4])

      if(c == 0) {

        res1 <- mean(c((X1 - a) : (X1 - 1), (Y1 + 1) : (Y1 + 40)))
        res2 <- mean(c((X2 - a) : (X2 - 1), (Y2 + 1) : (Y2 + 40)))
        res <- matrix(c(res1, res2), ncol=2, nrow=1)

      } else {

        res1 <- mean(c((X1 - a) : (X1 - 1), (Y1 + 1) : (Y1 + 40)))*b
        res2 <- mean(c((X2 - a) : (X2 - 1), (Y2 + 1) : (Y2 + 40)))*b
        res <- matrix(c(res1, res2), ncol=2, nrow=1)

      }

      return(res)
    }

    ## Apply the function
    set.seed(1)
    n <- 10000000
    tab <- as.matrix(data.frame(x1 = sample(1:100, n, replace = TRUE),
                                y1 = sample(1:100, n, replace = TRUE),
                                x2 = sample(1:100, n, replace = TRUE),
                                y2 = sample(1:100, n, replace = TRUE)))


    tic("test 1")
    test <- tab %>%
      split(1:nrow(tab)) %>%
      map(~ func(.x, 40, 5, 1)) %>%
      do.call("rbind", .)
    toc()

test 1: 599.2 sec elapsed
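
For comparison, here is a minimal sketch of one possible variant that keeps
"func" unchanged but avoids building a list of rows and then binding millions
of one-row matrices. It loops over row indices with vapply(), which fills a
numeric matrix directly. The names n_small, tab_small and res_small are
hypothetical and only used for illustration, and the small size is an
assumption so the sketch runs quickly. It has not been timed on the full
data, so any gain is not guaranteed.

```r
# func as in the example above, repeated so the sketch runs on its own
func <- function(coord, a, b, c){
  X1 <- as.vector(coord[1])
  Y1 <- as.vector(coord[2])
  X2 <- as.vector(coord[3])
  Y2 <- as.vector(coord[4])
  res1 <- mean(c((X1 - a) : (X1 - 1), (Y1 + 1) : (Y1 + 40)))
  res2 <- mean(c((X2 - a) : (X2 - 1), (Y2 + 1) : (Y2 + 40)))
  if (c != 0) {
    # same scaling by b as the else branch of the original
    res1 <- res1 * b
    res2 <- res2 * b
  }
  matrix(c(res1, res2), ncol = 2, nrow = 1)
}

set.seed(1)
n_small <- 1000  # hypothetical small size, for illustration only
tab_small <- cbind(x1 = sample(1:100, n_small, replace = TRUE),
                   y1 = sample(1:100, n_small, replace = TRUE),
                   x2 = sample(1:100, n_small, replace = TRUE),
                   y2 = sample(1:100, n_small, replace = TRUE))

# vapply() returns a 2 x n numeric matrix (one column per row of tab_small);
# t() turns it into the same n x 2 layout that do.call("rbind", .) gives
res_small <- t(vapply(seq_len(nrow(tab_small)),
                      function(i) as.vector(func(tab_small[i, ], 40, 5, 1)),
                      numeric(2)))
```

The same call should work on the full matrix by replacing tab_small with tab.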

Thanks very much for your time.
Have a nice day.
Nell
