Hey,

Thanks for the suggestion, but this didn't work.

Method 1: data.table / sample

    > set.seed(1)
    > size <- 100000000
    > dt <- data.table::data.table("a"=c(1:size),"b"=rep(letters[1:10],size/10))
    > head(dt)
       a b
    1: 1 a
    2: 2 b
    3: 3 c
    4: 4 d
    5: 5 e
    6: 6 f
    > system.time( dt[,c("a","b"):=list(sample(a),sample(b))] )
       user  system elapsed
     10.190   0.252  10.456
    > head(dt)
              a b
    1: 26550867 a
    2: 37212390 b
    3: 57285336 c
    4: 90820777 e
    5: 20168193 a
    6: 89838965 h

Method 2: factor / data.table / sample

    > set.seed(1)
    > size <- 100000000
    > dt <- data.table::data.table("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)))
    > head(dt)
       a b
    1: 1 a
    2: 2 b
    3: 3 c
    4: 4 d
    5: 5 e
    6: 6 f
    > system.time( dt[,c("a","b"):=list(sample(a),sample(b))] )
       user  system elapsed
      9.271   0.276   9.559
    > head(dt)
              a b
    1: 26550867 a
    2: 37212390 b
    3: 57285336 c
    4: 90820777 e
    5: 20168193 a
    6: 89838965 h

Method 3: .Internal / data.table / factor

    > set.seed(1)
    > size <- 100000000
    > dt <- data.table::data.table("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)))
    > head(dt)
       a b
    1: 1 a
    2: 2 b
    3: 3 c
    4: 4 d
    5: 5 e
    6: 6 f
    > system.time( dt[,c("a","b"):=list(a[.Internal(sample(size, size, FALSE, NULL))],b[.Internal(sample(size, size, FALSE, NULL))])] )
       user  system elapsed
      8.786   0.137   8.935
    > head(dt)
              a b
    1: 26550867 a
    2: 37212390 b
    3: 57285336 c
    4: 90820777 e
    5: 20168193 a
    6: 89838965 h

Method 4 (thanks to banded for pointing it out): set / factor / sample

    > set.seed(1)
    > size <- 100000000
    > dt <- data.table::data.table("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)))
    > head(dt)
       a b
    1: 1 a
    2: 2 b
    3: 3 c
    4: 4 d
    5: 5 e
    6: 6 f
    > system.time({ set(dt,j="a",value=sample(dt$a)); set(dt,j="b",value=sample(dt$b))} )
       user  system elapsed
      8.790   0.204   9.006
    > head(dt)
              a b
    1: 26550867 a
    2: 37212390 b
    3: 57285336 c
    4: 90820777 e
    5: 20168193 a
    6: 89838965 h

Method 5: data.frame

    > set.seed(1)
    > size <- 100000000
    > dt <- data.frame("a"=c(1:size),"b"=as.factor(rep(letters[1:10],size/10)))
    > head(dt)
      a b
    1 1 a
    2 2 b
    3 3 c
    4 4 d
    5 5 e
    6 6 f
    > system.time({ dt$a <- sample(dt$a);dt$b <- sample(dt$b) })
       user  system elapsed
      8.755   0.152   8.921
    > head(dt)
             a b
    1 26550867 a
    2 37212390 b
    3 57285336 c
    4 90820777 e
    5 20168193 a
    6 89838965 h

Sadly, data.table does not improve the timing.
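
One way to see where the time goes is to time the permutation, the reordering, and the assignment separately. This is only a minimal sketch: it assumes data.table is installed, it uses a much smaller n than the 1e8 rows above so that it runs quickly, and the names n, idx, shuffled and the t_* variables are mine, not from the runs above.

```r
library(data.table)

set.seed(1)
n <- 1e6  # assumption: far smaller than the 1e8 rows used above

dt <- data.table(a = seq_len(n), b = factor(rep(letters[1:10], n / 10)))

# Step 1: draw the random permutation only
t_perm <- system.time(idx <- sample.int(n))

# Step 2: reorder the column values with that permutation
t_gather <- system.time(shuffled <- dt$a[idx])

# Step 3: assign the reordered values back by reference
t_assign <- system.time(set(dt, j = "a", value = shuffled))

# Compare elapsed times of the three steps side by side
print(rbind(permutation = t_perm, gather = t_gather, assign = t_assign)[, 1:3])
```

If steps 1 and 2 account for most of the elapsed time, then changing the container (data.frame vs data.table, := vs set) cannot help much, which would be consistent with the five timings above being so close to each other.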
sample() is the bottleneck.

2017-01-05 14:20 GMT+01:00 banded08 <david.awam.jan...@gmail.com>:

> Maybe not the fastest or most efficient, but this should work
>
> for(ii in 1:dim(dt1)[1]) set(dt1, ii, 1:dim(dt1)[2] ,sample(dt1[ii]))
>
> --
> View this message in context: http://r.789695.n4.nabble.com/Shuffle-row-wise-column-independently-tp4727865p4727871.html
> Sent from the datatable-help mailing list archive at Nabble.com.
> _______________________________________________
> datatable-help mailing list
> datatable-help@lists.r-forge.r-project.org
> https://lists.r-forge.r-project.org/cgi-bin/mailman/listinfo/datatable-help