[R] R: estimating genotyping error rate
Hello, I have SNP data from genotyping. I would like to estimate the error rate between replicated samples using R. How can I proceed? Thanks Meriam __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
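One common way to approach this question is to compare the genotype calls of each replicate pair and count the mismatches. The following is only a minimal sketch under stated assumptions: the toy matrix, the sample names, the 0/1/2 coding with NA for missing calls, and the helper `discordance()` are all hypothetical and do not come from the poster's data.

```r
# Hypothetical toy data: rows are SNPs, columns are samples.
# Genotypes are assumed to be coded 0/1/2, with NA for a missing call.
geno <- cbind(
  S1_rep1 = c(0, 1, 2, 0, NA, 2),
  S1_rep2 = c(0, 1, 2, 1, 2, 2),
  S2_rep1 = c(2, 2, 0, 1, 1, 0),
  S2_rep2 = c(2, 2, 0, 1, 1, 0)
)

# Column names of the replicate pairs (hypothetical).
replicate_pairs <- list(c("S1_rep1", "S1_rep2"), c("S2_rep1", "S2_rep2"))

# Discordance for one pair: mismatching calls divided by the number of
# SNPs that were called in both replicates.
discordance <- function(geno, pair) {
  a <- geno[, pair[1]]
  b <- geno[, pair[2]]
  both_called <- !is.na(a) & !is.na(b)
  mismatches <- sum(a[both_called] != b[both_called])
  c(compared = sum(both_called),
    mismatches = mismatches,
    rate = mismatches / sum(both_called))
}

per_pair <- t(sapply(replicate_pairs, discordance, geno = geno))
rownames(per_pair) <- sapply(replicate_pairs, paste, collapse = " vs ")
print(per_pair)

# Pooled estimate across all replicate pairs.
overall_rate <- sum(per_pair[, "mismatches"]) / sum(per_pair[, "compared"])
print(overall_rate)
```

Whether missing calls should be ignored (as here) or counted as errors is a design choice that depends on the genotyping platform, so the denominator may need adjusting.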
Re: [R] R help: ‘fviz_nbclust’ is not available (for R version 3.5.2)
Thanks for your valuable clarifications.
I tried all the steps again, running each one very carefully, but the problem remains.
In fact, "fviz_nbclust" is a function inside the package "factoextra". It doesn't make sense to me, because I have installed factoextra.
This error message appears:

could not find function "fviz_nbclust"

On Wed, Jan 16, 2019 at 1:22 PM Jeff Newmiller wrote:
>
> Concept 1: You don't install functions... you install packages that have
> functions in them. There is a function fviz_nbclust in factoextra.
>
> Concept 2: Once a package is installed, you do NOT have to install it again,
> e.g. every time you want to do that analysis. Making the installation part of
> your script is not advised.
>
> Concept 3: Typically we do use the library function with a package name at
> the beginning of every session where we want to use functions from that
> package. However, that is optional... you could also just invoke the function
> directly using factoextra::fviz_nbclust(...blahblah...). Having the library
> function shortens this and if the package is not installed it provides a
> clear error message that can be a reminder to the user to install the package.
>
> Execute your code line by line and solve the first error you encounter by
> examining the error message and reviewing what that line of code is designed
> to do.
>
> On January 16, 2019 11:00:07 AM PST, N Meriam wrote:
> >Hello,
> >I'm struggling to install a function called "fviz_nbclust".
> >
> >My code is the following:
> >pkgs <- c("factoextra", "NbClust")
> >install.packages(pkgs)
> >library(factoextra)
> >library(NbClust)
> ># Standardize the data
> >load("df4.rda")
> >library(FunCluster)
> >
> >install.packages("fviz_nbclust")
> >#fviz_nbclust(df4, FUNcluster, method = c("silhouette", "wss",
> >"gap_stat"))
> >
> >Installing package into ‘C:/Users/DELL/Documents/R/win-library/3.5’
> >(as ‘lib’ is unspecified)
> >Warning in install.packages :
> >  package ‘fviz_nbclust’ is not available (for R version 3.5.2)
> >
> >Best,
> >Meriam
>
> --
> Sent from my phone. Please excuse my brevity.

--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil
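The three concepts in the quoted reply can be sketched as follows. This is a hedged illustration, not the poster's final script: `has_function()` is a hypothetical helper, the built-in `USArrests` data stands in for `df4`, and the choice of `stats::kmeans` and `method = "silhouette"` is only an assumption about what the analysis needs.

```r
# Hypothetical helper: TRUE if `fn` is a function exported by an installed package.
has_function <- function(pkg, fn) {
  requireNamespace(pkg, quietly = TRUE) &&
    fn %in% getNamespaceExports(pkg) &&
    is.function(getExportedValue(pkg, fn))
}

has_function("stats", "kmeans")   # kmeans is a function in the stats package

# Concept 2: installation is done once, interactively, not inside the script:
# install.packages("factoextra")

if (has_function("factoextra", "fviz_nbclust")) {
  x <- scale(USArrests)           # stand-in for the standardized df4
  # Concept 3: call the function through its package with `::`.
  # FUNcluster is an argument that takes a clustering function.
  p <- factoextra::fviz_nbclust(x, FUNcluster = stats::kmeans,
                                method = "silhouette")
  print(p)
} else {
  message("factoextra is not available; install the package (not the function) once.")
}
```

Note that `FUNcluster` in the original call is an argument name of `fviz_nbclust()`, so loading the unrelated package FunCluster is not needed for it.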
[R] R help: ‘fviz_nbclust’ is not available (for R version 3.5.2)
Hello,
I'm struggling to install a function called "fviz_nbclust".

My code is the following:

pkgs <- c("factoextra", "NbClust")
install.packages(pkgs)
library(factoextra)
library(NbClust)
# Standardize the data
load("df4.rda")
library(FunCluster)

install.packages("fviz_nbclust")
#fviz_nbclust(df4, FUNcluster, method = c("silhouette", "wss", "gap_stat"))

Installing package into ‘C:/Users/DELL/Documents/R/win-library/3.5’
(as ‘lib’ is unspecified)
Warning in install.packages :
  package ‘fviz_nbclust’ is not available (for R version 3.5.2)

Best,
Meriam
Re: [R] Fwd: Overlapping legend in a circular dendrogram
Yes, I know. Sorry for reposting: I received an email saying that the file was too big, so I modified my question and posted it again.
I don't want to oblige anyone to respond. I really thought the issue was my file (too big, so nobody received it).
Thanks for your understanding,
Best
Myriam

On Fri, Jan 11, 2019 at 3:03 PM Bert Gunter wrote:
>
> This is the 3rd time you've posted this. Please stop re-posting!
>
> Your question is specialized and involved, and you have failed to provide a
> reproducible example/data. We are not obliged to respond.
>
> You may do better contacting the maintainer, found by ?maintainer, as
> recommended by the posting guide for specialized queries such as this.
>
>
> Bert Gunter
>
> "The trouble with having an open mind is that people keep coming along and
> sticking things into it."
> -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )
>
>
> On Fri, Jan 11, 2019 at 12:47 PM N Meriam wrote:
>>
>> Hi, I'm facing some issues when generating a circular dendrogram.
>> The labels on the left which are my countries are overlapping with the
>> circular dendrogram (middle). Same happens with the labels (regions)
>> located on the right.
>> I run the following code and I'd like to know what should be changed
>> in my code in order to avoid that.
>>
>> load("hc1.rda")
>> library(cluster)
>> library(ape)
>> library(dendextend)
>> library(circlize)
>> library(RColorBrewer)
>>
>> labels = hc1$labels
>> n = length(labels)
>> dend = as.dendrogram(hc1)
>> markcountry=as.data.frame(markcountry1)
>> #Country colors
>> groupCodes=as.character(as.factor(markcountry[,2]))
>> colorCodes=rainbow(length(unique(groupCodes))) #c("blue","red")
>> names(colorCodes)=unique(groupCodes)
>> labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]
>>
>> #Region colors
>> groupCodesR=as.character(as.factor(markcountry[,3]))
>> colorCodesR=rainbow(length(unique(groupCodesR))) #c("blue","red")
>> names(colorCodesR)=unique(groupCodesR)
>>
>> circos.par(cell.padding = c(0, 0, 0, 0))
>> circos.initialize(factors = "foo", xlim = c(1, n)) # only one sector
>> max_height = attr(dend, "height") # maximum height of the trees
>>
>> #Region graphics
>> circos.trackPlotRegion(ylim = c(0, 1.5), panel.fun = function(x, y) {
>>   circos.rect(1:361-0.5, rep(0.5, 361), 1:361-0.1, rep(0.8,361),
>>               col = colorCodesR[groupCodesR][order.dendrogram(dend)], border = NA)
>> }, bg.border = NA)
>>
>> #labels graphics
>> circos.trackPlotRegion(ylim = c(0, 0.5), bg.border = NA,
>>   panel.fun = function(x, y) {
>>     circos.text(1:361-0.5, rep(0.5,361), labels(dend), adj = c(0, 0.5),
>>                 facing = "clockwise", niceFacing = TRUE,
>>                 col = labels_colors(dend), cex = 0.45)
>> })
>> dend = color_branches(dend, k = 6, col = 1:6)
>>
>> #Dendrogram graphics
>> circos.trackPlotRegion(ylim = c(0, max_height), bg.border = NA,
>>   track.height = 0.4, panel.fun = function(x, y) {
>>     circos.dendrogram(dend, max_height = 0.55)
>> })
>> legend("left",names(colorCodes),col=colorCodes,text.col=colorCodes,bty="n",pch=15,cex=0.8)
>> legend("right",names(colorCodesR),col=colorCodesR,text.col=colorCodesR,bty="n",pch=15,cex=0.35)
>>
>> Cheers,
>> Myriam

--
Meriam Nefzaoui
MSc. in Plant Breeding and Genetics
Universidade Federal Rural de Pernambuco (UFRPE) - Recife, Brazil
[R] Fwd: Overlapping legend in a circular dendrogram
Hi,
I'm facing some issues when generating a circular dendrogram.
The labels on the left, which are my countries, overlap with the circular dendrogram (middle). The same happens with the labels (regions) located on the right.
I run the following code and I'd like to know what should be changed in it in order to avoid that.

load("hc1.rda")
library(cluster)
library(ape)
library(dendextend)
library(circlize)
library(RColorBrewer)

labels = hc1$labels
n = length(labels)
dend = as.dendrogram(hc1)
markcountry=as.data.frame(markcountry1)
#Country colors
groupCodes=as.character(as.factor(markcountry[,2]))
colorCodes=rainbow(length(unique(groupCodes))) #c("blue","red")
names(colorCodes)=unique(groupCodes)
labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]

#Region colors
groupCodesR=as.character(as.factor(markcountry[,3]))
colorCodesR=rainbow(length(unique(groupCodesR))) #c("blue","red")
names(colorCodesR)=unique(groupCodesR)

circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(factors = "foo", xlim = c(1, n)) # only one sector
max_height = attr(dend, "height") # maximum height of the trees

#Region graphics
circos.trackPlotRegion(ylim = c(0, 1.5), panel.fun = function(x, y) {
  circos.rect(1:361-0.5, rep(0.5, 361), 1:361-0.1, rep(0.8,361),
              col = colorCodesR[groupCodesR][order.dendrogram(dend)], border = NA)
}, bg.border = NA)

#labels graphics
circos.trackPlotRegion(ylim = c(0, 0.5), bg.border = NA,
  panel.fun = function(x, y) {
    circos.text(1:361-0.5, rep(0.5,361), labels(dend), adj = c(0, 0.5),
                facing = "clockwise", niceFacing = TRUE,
                col = labels_colors(dend), cex = 0.45)
})
dend = color_branches(dend, k = 6, col = 1:6)

#Dendrogram graphics
circos.trackPlotRegion(ylim = c(0, max_height), bg.border = NA,
  track.height = 0.4, panel.fun = function(x, y) {
    circos.dendrogram(dend, max_height = 0.55)
})
legend("left",names(colorCodes),col=colorCodes,text.col=colorCodes,bty="n",pch=15,cex=0.8)
legend("right",names(colorCodesR),col=colorCodesR,text.col=colorCodesR,bty="n",pch=15,cex=0.35)

Cheers,
Myriam
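One possible way to keep the two legends from overlapping the figure is to give each legend its own panel with base R's `layout()`, since circlize draws with base graphics. The sketch below is an assumption-laden illustration rather than a tested fix for the posted plot: `draw_with_side_legends()` is a hypothetical helper, the group names and colors are made up, and a plain circle stands in for the circlize drawing code.

```r
# Hypothetical helper (not part of circlize): draw two legends in their own
# side panels so that they cannot overlap the main figure.
draw_with_side_legends <- function(draw_main, left_cols, right_cols,
                                   widths = c(1, 4, 1),
                                   left_cex = 0.8, right_cex = 0.8) {
  old_mar <- par("mar")
  on.exit({
    layout(1)              # back to a single panel
    par(mar = old_mar)
  })
  layout(matrix(1:3, nrow = 1), widths = widths)

  legend_panel <- function(cols, cex) {
    par(mar = c(0, 0, 0, 0))
    plot.new()
    legend("center", legend = names(cols), col = cols, text.col = cols,
           pch = 15, bty = "n", cex = cex)
  }

  legend_panel(left_cols, left_cex)     # panel 1: e.g. countries
  draw_main()                           # panel 2: the main figure
  legend_panel(right_cols, right_cex)   # panel 3: e.g. regions
  invisible(widths / sum(widths))       # share of the device width per panel
}

# Toy usage with made-up groups; a plain circle stands in for the dendrogram.
country_cols <- c(CountryA = "red", CountryB = "blue")
region_cols  <- c(RegionX = "darkgreen", RegionY = "orange")
shares <- draw_with_side_legends(
  draw_main = function() {
    par(mar = c(1, 1, 1, 1))
    theta <- seq(0, 2 * pi, length.out = 200)
    plot(cos(theta), sin(theta), type = "l", asp = 1,
         axes = FALSE, xlab = "", ylab = "")
  },
  left_cols = country_cols, right_cols = region_cols
)
```

In the posted script, the circlize calls from `circos.par()` to `circos.dendrogram()` would presumably go inside `draw_main`, with `colorCodes` and `colorCodesR` passed as the two color vectors; the `widths` argument controls how much room the legends get.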
[R] Overlapping legend in a circular dendrogram
Dear all,
I run the following code and I get this graphic (image attached).
What should I change in my code in order to fix the overlapping objects?

load("hc1.rda")
library(cluster)
library(ape)
library(dendextend)
library(circlize)
library(RColorBrewer)

labels = hc1$labels
n = length(labels)
dend = as.dendrogram(hc1)
markcountry=as.data.frame(markcountry1)
#Country colors
groupCodes=as.character(as.factor(markcountry[,2]))
colorCodes=rainbow(length(unique(groupCodes))) #c("blue","red")
names(colorCodes)=unique(groupCodes)
labels_colors(dend) <- colorCodes[groupCodes][order.dendrogram(dend)]

#Region colors
groupCodesR=as.character(as.factor(markcountry[,3]))
colorCodesR=rainbow(length(unique(groupCodesR))) #c("blue","red")
names(colorCodesR)=unique(groupCodesR)

circos.par(cell.padding = c(0, 0, 0, 0))
circos.initialize(factors = "foo", xlim = c(1, n)) # only one sector
max_height = attr(dend, "height") # maximum height of the trees

#Region graphics
circos.trackPlotRegion(ylim = c(0, 1.5), panel.fun = function(x, y) {
  circos.rect(1:361-0.5, rep(0.5, 361), 1:361-0.1, rep(0.8,361),
              col = colorCodesR[groupCodesR][order.dendrogram(dend)], border = NA)
}, bg.border = NA)

#labels graphics
circos.trackPlotRegion(ylim = c(0, 0.5), bg.border = NA,
  panel.fun = function(x, y) {
    circos.text(1:361-0.5, rep(0.5,361), labels(dend), adj = c(0, 0.5),
                facing = "clockwise", niceFacing = TRUE,
                col = labels_colors(dend), cex = 0.45)
})
dend = color_branches(dend, k = 6, col = 1:6)

#Dendrogram graphics
circos.trackPlotRegion(ylim = c(0, max_height), bg.border = NA,
  track.height = 0.4, panel.fun = function(x, y) {
    circos.dendrogram(dend, max_height = 0.55)
})
legend("left",names(colorCodes),col=colorCodes,text.col=colorCodes,bty="n",pch=15,cex=0.8)
legend("right",names(colorCodesR),col=colorCodesR,text.col=colorCodesR,bty="n",pch=15,cex=0.35)

Thanks,
Meriam
[R] R help: circular dendrogram
Dear all,
I generated a circular dendrogram with R (see attached). I have a total of 360 landraces.
What I want to do next is give each cluster a different color and also add colors that show the country/region.
I would also like to know whether it's possible to put a code number (associated with each landrace) at the tip of each branch. I want the dendrogram to be explicit.

Attachment: Rplot01.pdf (Adobe PDF document)
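The three requests (a color per cluster, a color per country/region, a numeric code per leaf) all reduce to building vectors that follow the leaf order of the tree. Here is a minimal base-R sketch under stated assumptions: the built-in `USArrests` data stands in for the landraces, the number of clusters `k`, the grouping variable `group`, and its colors are all made up for illustration.

```r
# Stand-in tree: USArrests replaces the (unavailable) landrace data.
hc <- hclust(dist(scale(USArrests)), method = "average")

# 1. One color per cluster, reordered to the leaf (plotting) order.
k <- 4                                      # assumed number of clusters
cluster_id <- cutree(hc, k = k)             # in original data order
leaf_cols <- rainbow(k)[cluster_id][hc$order]
names(leaf_cols) <- hc$labels[hc$order]

# 2. One color per group (e.g. country or region); `group` is hypothetical.
group <- factor(rep(c("GroupA", "GroupB"), length.out = length(hc$labels)))
group_cols <- setNames(c("grey30", "tomato"), levels(group))
leaf_group_cols <- group_cols[as.character(group)][hc$order]

# 3. Numeric codes instead of names as leaf labels.
hc_coded <- hc
hc_coded$labels <- as.character(seq_along(hc$labels))

plot(hc_coded, cex = 0.6, main = "Leaves labelled by code number")
```

With dendextend, vectors such as `leaf_cols` can be assigned with `labels_colors(dend) <- ...`, and branches can be colored with `color_branches(dend, k = k)`, as in the circular dendrogram code posted elsewhere in this thread.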
Re: [R] Warning message: NAs introduced by coercion
Yes, sorry. I attached the file once again.
Well, I am still getting the same warning.

> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion
> class(genod)
[1] "matrix"

Then I run the following code and it gives this:

> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
> # calculate similarity matrix
> # Open the GDS file
> (genofile <- snpgdsOpen(filn))
File: C:\Users\DELL\Documents\TEST\simTunesian.gds (1.4M)
+[ ] *
|--+ sample.id { Str8 363 ZIP_ra(42.5%), 755B }
|--+ snp.id { Int32 15752 ZIP_ra(35.1%), 21.6K }
|--+ snp.position { Int32 15752 ZIP_ra(34.7%), 21.3K }
|--+ snp.chromosome { Float64 15752 ZIP_ra(0.18%), 230B }
|--+ snp.allele { Str8 15752 ZIP_ra(0.16%), 108B }
\--+ genotype { Bit2 15752x363, 1.4M } *
> ibs <- snpgdsIBS(genofile, remove.monosnp = FALSE, num.thread=1)
Identity-By-State (IBS) analysis on genotypes:
Excluding 0 SNP on non-autosomes
Working space: 363 samples, 15,752 SNPs
    using 1 (CPU) core
IBS: the sum of all selected genotypes (0,1,2) = 3658952
Tue Jan 08 15:38:00 2019 (internal increment: 42880)
[==] 100%, completed in 0s
Tue Jan 08 15:38:00 2019 Done.
> # maximum similarity value
> max(ibs$ibs)
[1] NaN
> # minimum similarity value
> min(ibs$ibs)
[1] NaN

As you can see, I can't continue my analysis (heat map plot, clustering with hclust) because the values are NaN.

On Tue, Jan 8, 2019 at 2:01 PM David L Carlson wrote:
>
> Your attached file is not a .csv file since the fields are not separated by
> commas (just rename the mydata.csv to mydata.txt).
>
> The command "genod2 <- as.matrix(genod)" created a character matrix from the
> data frame genod. When you try to force genod2 to numeric, the marker column
> becomes NAs which is probably not what you want.
>
> The error message is because you passed genod (a data frame) to the
> snpgdsCreateGeno() function not genod2 (the matrix you created from genod).
>
>
> David L. Carlson
> Department of Anthropology
> Texas A&M University
>
> -----Original Message-----
> From: R-help [mailto:r-help-boun...@r-project.org] On Behalf Of N Meriam
> Sent: Tuesday, January 8, 2019 1:38 PM
> To: Michael Dewey
> Cc: r-help@r-project.org
> Subject: Re: [R] Warning message: NAs introduced by coercion
>
> Here's a portion of what my data looks like (text file format attached).
> When running in R, it gives me this:
>
> > df4 <- read.csv(file = "mydata.csv", header = TRUE)
> > require(SNPRelate)
> > library(gdsfmt)
> > myd <- df4
> > myd <- df4
> > names(myd)[-1]
> [1] "marker" "X88"    "X9"     "X17"    "X25"
> > myd[,1]
> [1]  3  4  5  6  8 10
> # the data must be 0,1,2 with 3 as missing so you have r
> > sample.id <- names(myd)[-1]
> > snp.id <- myd[,1]
> > snp.position <- 1:length(snp.id) # not needed for ibs
> > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> > genod <- myd[,-1]
> > genod[is.na(genod)] <- 3
> > genod[genod=="0"] <- 0
> > genod[genod=="1"] <- 2
> > genod2 <- as.matrix(genod)
> > head(genod2)
>      marker                         X88 X9  X17 X25
> [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
> [2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
> [3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
> [4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
> [5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
> [6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> > class(genod2) <- "numeric"
> Warning message:
> In class(genod2) <- "numeric" : NAs introduced by coercion
> > head(genod2)
>      marker X88 X9 X17 X25
> [1,]     NA   0  3   3
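Following the explanation quoted above, one way to obtain a purely numeric genotype matrix is to keep only the sample columns before calling `as.matrix()`. This is a sketch on made-up values: the toy data frame, the name `X` for the leading index column, and the use of the marker strings as `snp.id` are assumptions inferred from the printed output, not the poster's actual file.

```r
# Toy stand-in for the posted data (hypothetical values). The layout is
# assumed from the printed output: an index column, a character marker
# column, then one column per sample.
myd <- data.frame(
  X      = c(3, 4, 5),
  marker = c("100023173|F|0-47:G>A-47:G>A",
             "1043336|F|0-7:A>G-7:A>G",
             "1212218|F|0-49:A>G-49:A>G"),
  X88    = c(0, 1, 0),
  X9     = c(NA, 0, 0),
  stringsAsFactors = FALSE
)

# Separate identifiers from genotypes instead of converting everything.
sample_cols <- setdiff(names(myd), c("X", "marker"))
sample.id   <- sample_cols
snp.id      <- myd$marker

# Only numeric columns go into the matrix, so no coercion to character occurs.
genod <- as.matrix(myd[, sample_cols])
genod[is.na(genod)] <- 3    # 3 marks a missing call, as in the thread
genod[genod == 1]   <- 2    # recoding used in the thread: 1 becomes 2

str(genod)
stopifnot(is.matrix(genod), is.numeric(genod),
          nrow(genod) == length(snp.id), ncol(genod) == length(sample.id))
```

A matrix built this way has SNPs in rows and samples in columns, which is the orientation implied by `snpfirstdim=TRUE` in the posted `snpgdsCreateGeno()` call; it would be passed as `genmat` in place of the data frame.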
Re: [R] Warning message: NAs introduced by coercion
Here's a portion of what my data looks like (text file format attached).
When running in R, it gives me this:

> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10
# the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                         X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
# read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id, :
  is.matrix(genmat) is not TRUE

I can't find a solution to my problem. My guess is that it comes from converting the 'marker' column (a factor) to numeric.

Best,
Meriam

On Tue, Jan 8, 2019 at 11:28 AM Michael Dewey wrote:
>
> Dear Meriam
>
> Your csv file did not come through as attachments are stripped unless of
> certain types and your post is very hard to read since you are posting in
> HTML. Try renaming the file to .txt and set your mailer to send
> plain text then people may be able to help you better.
>
> Michael
>
> On 08/01/2019 15:35, N Meriam wrote:
> > I see...
> > Here's a portion of what my data looks like (csv file attached).
> > I run again and here are the results:
> >
> > > df4 <- read.csv(file = "mydata.csv", header = TRUE)
> > > require(SNPRelate)
> > > library(gdsfmt)
> > > myd <- df4
> > > myd <- df4
> > > names(myd)[-1]
> > [1] "marker" "X88"    "X9"     "X17"    "X25"
> > > myd[,1]
> > [1]  3  4  5  6  8 10
> > # the data must be 0,1,2 with 3 as missing so you have r
> > > sample.id <- names(myd)[-1]
> > > snp.id <- myd[,1]
> > > snp.position <- 1:length(snp.id) # not needed for ibs
> > > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> > # genotype data must have - in 3
> > > genod <- myd[,-1]
> > > genod[is.na(genod)] <- 3
> > > genod[genod=="0"] <- 0
> > > genod[genod=="1"] <- 2
> > > genod2 <- as.matrix(genod)
> > > head(genod2)
> >      marker                         X88 X9  X17 X25
> > [1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
> > [2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
> > [3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
> > [4,] "1019554|F|0-14:T>C-14
Re: [R] Warning message: NAs introduced by coercion
I see...
Here's a portion of what my data looks like (csv file attached).
I ran it again and here are the results:

> df4 <- read.csv(file = "mydata.csv", header = TRUE)
> require(SNPRelate)
> library(gdsfmt)
> myd <- df4
> myd <- df4
> names(myd)[-1]
[1] "marker" "X88"    "X9"     "X17"    "X25"
> myd[,1]
[1]  3  4  5  6  8 10
> # the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> # genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod2 <- as.matrix(genod)
> head(genod2)
     marker                         X88 X9  X17 X25
[1,] "100023173|F|0-47:G>A-47:G>A" "0" "3" "3" "3"
[2,] "1043336|F|0-7:A>G-7:A>G"     "2" "0" "3" "0"
[3,] "1212218|F|0-49:A>G-49:A>G"   "0" "0" "0" "0"
[4,] "1019554|F|0-14:T>C-14:T>C"   "0" "0" "3" "0"
[5,] "100024550|F|0-16:G>A-16:G>A" "3" "3" "3" "3"
[6,] "1106702|F|0-8:C>A-8:C>A"     "0" "0" "0" "0"
> class(genod2) <- "numeric"
Warning message:
In class(genod2) <- "numeric" : NAs introduced by coercion
> head(genod2)
     marker X88 X9 X17 X25
[1,]     NA   0  3   3   3
[2,]     NA   2  0   3   0
[3,]     NA   0  0   0   0
[4,]     NA   0  0   3   0
[5,]     NA   3  3   3   3
[6,]     NA   0  0   0   0
> class(genod2) <- "numeric"
> class(genod2)
[1] "matrix"
> # read data
> filn <-"simTunesian.gds"
> snpgdsCreateGeno(filn, genmat = genod,
+                  sample.id = sample.id, snp.id = snp.id,
+                  snp.chromosome = snp.chromosome,
+                  snp.position = snp.position,
+                  snp.allele = snp.allele, snpfirstdim=TRUE)
Error in snpgdsCreateGeno(filn, genmat = genod, sample.id = sample.id, :
  is.matrix(genmat) is not TRUE

Thanks,
Meriam

On Tue, Jan 8, 2019 at 9:02 AM PIKAL Petr wrote:
> Hi
>
> see in line
>
> > -----Original Message-----
> > From: R-help On Behalf Of N Meriam
> > Sent: Tuesday, January 8, 2019 3:08 PM
> > To: r-help@r-project.org
> > Subject: [R] Warning message: NAs introduced by coercion
> >
> > Dear all,
> >
> > I have a .csv file called df4. (15752 obs. of 264 variables).
> > I apply this code but couldn't continue further other analyses, a warning
> > message keeps coming up. Then, I want to determine max and min
> > similarity values,
> > heat map plot, cluster...etc
> >
> > > require(SNPRelate)
> > > library(gdsfmt)
> > > myd <- read.csv(file = "df4.csv", header = TRUE)
> > > names(myd)[-1]
> > myd[,1]
> > > myd[1:10, 1:10]
> > # the data must be 0,1,2 with 3 as missing so you have r
> > > sample.id <- names(myd)[-1]
> > > snp.id <- myd[,1]
> > > snp.position <- 1:length(snp.id) # not needed for ibs
> > > snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> > > snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
> > # genotype data must have - in 3
> > > genod <- myd[,-1]
> > > genod[is.na(genod)] <- 3
> > > genod[genod=="0"] <- 0
> > > genod[genod=="1"] <- 2
> > > genod[1:10,1:10]
> > > genod <- as.matrix(genod)
>
> matrix can have only one type of data so you probably changed it to
> character by such construction.
>
> > > class(genod) <- "numeric"
>
> This tries to change all "numeric" values to numbers but if it cannot it
> sets it to NA.
>
> something like
>
> > head(iris)
>   Sepal.Length Sepal.Width Petal.Length Petal.Width Species
> 1          5.1         3.5          1.4         0.2  setosa
> 2          4.9         3.0          1.4         0.2  setosa
> 3          4.7         3.2          1.3         0.2  setosa
> 4          4.6         3.1          1.5         0.2  setosa
> 5          5.0         3.6          1.4         0.2  setosa
> 6          5.4         3.9          1.7         0.4  setosa
>
[R] Warning message: NAs introduced by coercion
Dear all,

I have a .csv file called df4 (15752 obs. of 264 variables).
I applied the code below but can't continue with further analyses because a warning message keeps coming up. Afterwards, I want to determine the maximum and minimum similarity values, plot a heat map, run a cluster analysis, etc.

> require(SNPRelate)
> library(gdsfmt)
> myd <- read.csv(file = "df4.csv", header = TRUE)
> names(myd)[-1]
myd[,1]
> myd[1:10, 1:10]
# the data must be 0,1,2 with 3 as missing so you have r
> sample.id <- names(myd)[-1]
> snp.id <- myd[,1]
> snp.position <- 1:length(snp.id) # not needed for ibs
> snp.chromosome <- rep(1, each=length(snp.id)) # not needed for ibs
> snp.allele <- rep("A/G", length(snp.id)) # not needed for ibs
# genotype data must have - in 3
> genod <- myd[,-1]
> genod[is.na(genod)] <- 3
> genod[genod=="0"] <- 0
> genod[genod=="1"] <- 2
> genod[1:10,1:10]
> genod <- as.matrix(genod)
> class(genod) <- "numeric"
Warning message:
In class(genod) <- "numeric" : NAs introduced by coercion

If it helps, I can provide more details to make the question more specific. Please let me know.
I would appreciate your help.

Thanks,
Meriam
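The warning above can be reproduced on a tiny example, which may make its cause easier to see: a matrix holds a single data type, so one text column turns every cell into a character string, and text that is not a number becomes NA when forced back to numeric. The data frame below is made up for illustration and is not the poster's file.

```r
# Hypothetical miniature of the situation: one text column plus numeric columns.
df <- data.frame(marker = c("snpA", "snpB"),
                 s1 = c(0, 1),
                 s2 = c(2, NA),
                 stringsAsFactors = FALSE)

# A matrix holds one type only, so the text column makes everything character.
m_all <- as.matrix(df)
typeof(m_all)

# Forcing the character matrix to numeric turns non-numeric text into NA
# (the warning is suppressed here only to keep the output short).
coerced <- m_all
suppressWarnings(mode(coerced) <- "numeric")
colSums(is.na(coerced))   # the marker column is entirely NA

# Converting only the numeric columns avoids the coercion altogether.
m_ok <- as.matrix(df[, c("s1", "s2")])
typeof(m_ok)
```

The marker names can be kept separately (for example as SNP identifiers) rather than inside the genotype matrix.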