Re: [R] Data Carpentry - Creating a New SQLite Database
Hi Phillip, Skipping to the last few lines of your email, did you download a program to look at Sqlite databases (independent of R) as listed below? Maybe that program ("DB Browser for SQLite") and/or the instructions below can help you locate your database directory: https://datacarpentry.org/semester-biology/computer-setup/ https://datacarpentry.org/semester-biology/materials/sql-for-dplyr-users/ If you do have that program, and you're still seeing an error, you might consider looking for similar issues at the appropriate 'datacarpentry' repository on Github (or posting a new issue yourself): https://github.com/datacarpentry/R-ecology-lesson/issues Finally, I really feel you'll benefit from reading over the documents pertaining to "R Data Import/Export" on the www.r-project.org website. No disrespect to the people at 'datacarpentry', but you'll find similar (and possibly, easier) R code to follow at section 4.3.1 'Packages using DBI' : https://cran.r-project.org/doc/manuals/r-release/R-data.html HTH, Bill. W. Michels, Ph.D. On Fri, Jan 10, 2020 at 10:32 AM Phillip Heinrich wrote: > > Working my way through a tutorial named Data Carpentry > (https://datacarpentry.org/R-ecology-lesson/). for the most part it is > excellent but I’m stuck on the very last section > (https://datacarpentry.org/R-ecology-lesson/05-r-and-databases.html). > > First, below are the packages I have loaded: > [1] "forcats" "stringr" "purrr" "readr" "tidyr" "tibble" > "ggplot2" "tidyverse" "dbplyr""RMySQL""DBI" > [12] "dplyr" "RSQLite" "stats" "graphics" "grDevices" "utils" > "datasets" "methods" "base" > > > > > > > Second, > Second, is the text of the last section of the last chapter titled “Creating > a New SQLite Database”. > Second, below is the text from the tutorial. The black type is from the > tutorial. The green and blue is the suggested R code. My comments are in > red. > Creating a new SQLite database > So far, we have used a previously prepared SQLite database. But we can also > use R to create a new database, e.g. from existing csv files. Let’s recreate > the mammals database that we’ve been working with, in R. First let’s download > and read in the csv files. We’ll import tidyverse to gain access to the > read_csv() function. > > download.file("https://ndownloader.figshare.com/files/3299483;, > "data_raw/species.csv") > download.file("https://ndownloader.figshare.com/files/10717177;, > "data_raw/surveys.csv") > download.file("https://ndownloader.figshare.com/files/3299474;, > "data_raw/plots.csv") > library(tidyverse) > species <- read_csv("data_raw/species.csv")No problem here. I’m pulling > three databases from the Web and saving them to a folder on my hard drive. > (...data_raw/species.csv) etc.surveys <- read_csv("data_raw/surveys.csv") > plots <- read_csv("data_raw/plots.csv")Again no problem. I’m just creating > an R data files. But here is where I loose it. I’m creating something named > my_db_file from another file named portal-database-output with an sqlite > extension and then creating my_db from the My_db_file. Not sure where the > sqlite extension file came from. Creating a new SQLite database with dplyr is > easy. You can re-use the same command we used above to open an existing > .sqlite file. The create = TRUE argument instructs R to create a new, empty > database instead. > > Caution: When create = TRUE is added, any existing database at the same > location is overwritten without warning. > > my_db_file <- "data/portal-database-output.sqlite" > my_db <- src_sqlite(my_db_file, create = TRUE)Currently, our new database is > empty, it doesn’t contain any tables: > > my_db#> src: sqlite 3.29.0 [data/portal-database-output.sqlite] > #> tbls:To add tables, we copy the existing data.frames into the database one > by one: > > copy_to(my_db, surveys) > copy_to(my_db, plots) > my_dbI can follow the directions to fill in my_db but I have no idea how to > access the tables. The text from the tutorial below says to check the > location of our database. Huh! Can someone give me some direction. Thanks. > > > > > > If you check the location of our database you’ll see that data is > automatically being written to disk. R and dplyr not only provide easy ways > to query existing databases, they also allows you to easily create your own > databases from flat files! > > > > Here is where I loose it. > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
Re: [R] Data Carpentry - Creating a New SQLite Database
Please note that tidyverse packages have their own support resources at RStudio, whence they came; e.g. here: https://education.rstudio.com/learn/beginner/ You may also do better asking about issues that concern them at their support site: https://support.rstudio.com/hc/en-us though, as you already found out, there are folks here who may help also. Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Fri, Jan 10, 2020 at 10:32 AM Phillip Heinrich wrote: > Working my way through a tutorial named Data Carpentry ( > https://datacarpentry.org/R-ecology-lesson/). for the most part it is > excellent but I’m stuck on the very last section ( > https://datacarpentry.org/R-ecology-lesson/05-r-and-databases.html). > > First, below are the packages I have loaded: > [1] "forcats" "stringr" "purrr" "readr" "tidyr" "tibble" > "ggplot2" "tidyverse" "dbplyr""RMySQL""DBI" > [12] "dplyr" "RSQLite" "stats" "graphics" "grDevices" "utils" >"datasets" "methods" "base" > > > > > > > Second, > Second, is the text of the last section of the last chapter titled > “Creating a New SQLite Database”. > Second, below is the text from the tutorial. The black type is from the > tutorial. The green and blue is the suggested R code. My comments are in > red. > Creating a new SQLite database > So far, we have used a previously prepared SQLite database. But we can > also use R to create a new database, e.g. from existing csv files. Let’s > recreate the mammals database that we’ve been working with, in R. First > let’s download and read in the csv files. We’ll import tidyverse to gain > access to the read_csv() function. > > download.file("https://ndownloader.figshare.com/files/3299483;, > "data_raw/species.csv") > download.file("https://ndownloader.figshare.com/files/10717177;, > "data_raw/surveys.csv") > download.file("https://ndownloader.figshare.com/files/3299474;, > "data_raw/plots.csv") > library(tidyverse) > species <- read_csv("data_raw/species.csv")No problem here. I’m pulling > three databases from the Web and saving them to a folder on my hard drive. > (...data_raw/species.csv) etc.surveys <- read_csv("data_raw/surveys.csv") > plots <- read_csv("data_raw/plots.csv")Again no problem. I’m just creating > an R data files. But here is where I loose it. I’m creating something > named my_db_file from another file named portal-database-output with an > sqlite extension and then creating my_db from the My_db_file. Not sure > where the sqlite extension file came from. Creating a new SQLite database > with dplyr is easy. You can re-use the same command we used above to open > an existing .sqlite file. The create = TRUE argument instructs R to create > a new, empty database instead. > > Caution: When create = TRUE is added, any existing database at the same > location is overwritten without warning. > > my_db_file <- "data/portal-database-output.sqlite" > my_db <- src_sqlite(my_db_file, create = TRUE)Currently, our new database > is empty, it doesn’t contain any tables: > > my_db#> src: sqlite 3.29.0 [data/portal-database-output.sqlite] > #> tbls:To add tables, we copy the existing data.frames into the database > one by one: > > copy_to(my_db, surveys) > copy_to(my_db, plots) > my_dbI can follow the directions to fill in my_db but I have no idea how > to access the tables. The text from the tutorial below says to check the > location of our database. Huh! Can someone give me some direction. > Thanks. > > > > > > If you check the location of our database you’ll see that data is > automatically being written to disk. R and dplyr not only provide easy ways > to query existing databases, they also allows you to easily create your own > databases from flat files! > > > > Here is where I loose it. > > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide > http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Data Carpentry - Creating a New SQLite Database
On Fri, 10 Jan 2020 11:31:58 -0700 "Phillip Heinrich" wrote: > below is the text from the tutorial. The black type is from the > tutorial. The green and blue is the suggested R code. My comments > are in red R-help is a plain text mailing list, so the markup has been stripped off (and since HTML-enabled mail clients don't quite care how the plain text version of the e-mail looks, some paragraph breaks had to go, too). > etc.surveys <- read_csv("data_raw/surveys.csv") > plots <- read_csv("data_raw/plots.csv") > Again no problem. I’m just creating an R data files. Note that it is not files that you are creating by running read_csv(), but variables (of type "tibble", which is like "data.frame", either of which should have been covered in earlier chapters in a good tutorial) in the R environment. The files you downloaded previously are opened in read only mode and are never changed. > my_db_file <- "data/portal-database-output.sqlite" > I’m creating something named my_db_file from another file named > portal-database-output with an sqlite extension and then creating > my_db from the My_db_file. This something is just a text string that happens to contain a *path* to a file. Just like the variable `greeting` in the following snippet: greeting <- "Hello world" print(greeting) See [1] for more info on character vectors in R. > Not sure where the sqlite extension file came from. The authors of the tutorial decided that the file to be created should be named like this. Feel free to change the extension (or the path) to anything else: neither R, nor SQLite cares about it much (but the file manager you use may display a different icon for it or become confused if you name it .txt or .pdf). > I can follow the directions to fill in my_db but I have no idea > how to access the tables. What exactly do you mean by "access"? At this point my_db should be a dplyr "src" object, so the tools described in dplyr vignettes [2] should be applicable. Try calling tbl() on it and passing the name of one of the tables you have just created. Also try running: example("src_sqlite") > The text from the tutorial below says to check the location of our > database. Huh! Can someone give me some direction. The variable my_db_file contains the location of the file where the database is stored. This is the same variable that you passed to the src_sqlite() function. -- Best regards, Ivan [1] https://cran.r-project.org/doc/manuals/r-release/R-intro.html#Character-vectors [2] https://cran.r-project.org/web/packages/dplyr/vignettes/dplyr.html https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html https://cran.r-project.org/web/packages/dplyr/vignettes/two-table.html https://cran.r-project.org/web/packages/dplyr/vignettes/window-functions.html __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Data Carpentry - Creating a New SQLite Database
Working my way through a tutorial named Data Carpentry (https://datacarpentry.org/R-ecology-lesson/). for the most part it is excellent but I’m stuck on the very last section (https://datacarpentry.org/R-ecology-lesson/05-r-and-databases.html). First, below are the packages I have loaded: [1] "forcats" "stringr" "purrr" "readr" "tidyr" "tibble" "ggplot2" "tidyverse" "dbplyr""RMySQL""DBI" [12] "dplyr" "RSQLite" "stats" "graphics" "grDevices" "utils" "datasets" "methods" "base" > Second, Second, is the text of the last section of the last chapter titled “Creating a New SQLite Database”. Second, below is the text from the tutorial. The black type is from the tutorial. The green and blue is the suggested R code. My comments are in red. Creating a new SQLite database So far, we have used a previously prepared SQLite database. But we can also use R to create a new database, e.g. from existing csv files. Let’s recreate the mammals database that we’ve been working with, in R. First let’s download and read in the csv files. We’ll import tidyverse to gain access to the read_csv() function. download.file("https://ndownloader.figshare.com/files/3299483;, "data_raw/species.csv") download.file("https://ndownloader.figshare.com/files/10717177;, "data_raw/surveys.csv") download.file("https://ndownloader.figshare.com/files/3299474;, "data_raw/plots.csv") library(tidyverse) species <- read_csv("data_raw/species.csv")No problem here. I’m pulling three databases from the Web and saving them to a folder on my hard drive. (...data_raw/species.csv) etc.surveys <- read_csv("data_raw/surveys.csv") plots <- read_csv("data_raw/plots.csv")Again no problem. I’m just creating an R data files. But here is where I loose it. I’m creating something named my_db_file from another file named portal-database-output with an sqlite extension and then creating my_db from the My_db_file. Not sure where the sqlite extension file came from. Creating a new SQLite database with dplyr is easy. You can re-use the same command we used above to open an existing .sqlite file. The create = TRUE argument instructs R to create a new, empty database instead. Caution: When create = TRUE is added, any existing database at the same location is overwritten without warning. my_db_file <- "data/portal-database-output.sqlite" my_db <- src_sqlite(my_db_file, create = TRUE)Currently, our new database is empty, it doesn’t contain any tables: my_db#> src: sqlite 3.29.0 [data/portal-database-output.sqlite] #> tbls:To add tables, we copy the existing data.frames into the database one by one: copy_to(my_db, surveys) copy_to(my_db, plots) my_dbI can follow the directions to fill in my_db but I have no idea how to access the tables. The text from the tutorial below says to check the location of our database. Huh! Can someone give me some direction. Thanks. If you check the location of our database you’ll see that data is automatically being written to disk. R and dplyr not only provide easy ways to query existing databases, they also allows you to easily create your own databases from flat files! Here is where I loose it. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R-es] ERROR EN LECTURA DE PAGINAS HTML GIGANTES
Estimado Diego Maldonado Por ahí leí que hay un error pendiente, pero está fechado en el año 2019, hace muuccchhooo. Puede ser que una actualización tenga solucionado el problema? Yo en lo personal opte por casi el mismo camino, excepto que no uso R y no uso contenedores. C# me resulta una opción más adecuada para extraer los datos, luego analizo con R. Javier Rubén Marcuzzi El vie., 10 ene. 2020 a las 1:01, Diego Maldonado via R-help-es (< r-help-es@r-project.org>) escribió: > Saludos estimado foro, por comentarles que estoy haciendo un proceso de > webscrapping con Rselenium por medio de contenedores docker y al > automatizar la cargar paginas html con el paquete XML por medio de la > función read_html me sale el siguiente mensaje de error: > > Error in doc_parse_raw(x, encoding = encoding, base_url = base_url, > as_html = as_html, : > Excessive depth in document: 256 use XML_PARSE_HUGE option [1] > > Si alguien me puede guiar como solventarlo les agradecería del fondo de mi > alma ya que voy algunos días tratando de resolverlo pero no lo logro. > > De Antemano agradezco su atención > > Atte > > Diego Maldonado > Chiefanalytics officer > Mentalytica > > ___ > R-help-es mailing list > R-help-es@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-help-es > [[alternative HTML version deleted]] ___ R-help-es mailing list R-help-es@r-project.org https://stat.ethz.ch/mailman/listinfo/r-help-es
Re: [R] R-help Digest, Vol 203, Issue 8
Hello, And there's also # # library(caTools) # Author(s) # Jarek Tuszynski # # Original trapz <- function(x, y){ idx = 2:length(x) return(as.double( (x[idx] - x[idx-1]) %*% (y[idx] + y[idx-1]) ) / 2) } # Modified by me, input is x, f(x) trapzf <- function(x, FUN) trapz(x, FUN(x)) # Call like 'integrate' trapzf2 <- function(f, lower, upper, subdivisions = 100){ trapzf(seq(lower, upper, length.out = subdivisions), match.fun(f)) } So I guess it's not missing, just missing in base R, like the OP said. Hope this helps, Rui Barradas Às 11:20 de 09/01/20, Helmut Schütz escreveu: Dear Hans, r-help-requ...@r-project.org wrote on 2020-01-09 12:00: Date: Wed, 8 Jan 2020 12:09:55 +0100 From: Hans W Borchers To: R help project Subject: [R] Which external functions are called in a package? [Solved] NB: `trapz`, ie. the trapezoidal integration formula, seems to be the numerical function to be missed the most in R base. In R base indeed. However available in Frank Harrels Hmisc as the function trap.rule(x, y) for sorted values. In plain R: function(x, y) sum(diff(x) * (y[-1] + y[-length(y)]))/2 Helmut __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] editing plot
Hello, There are ways of reducing the white space between the bars but they are not obvious. Here are the two ways that I know. First a data example. library(ggplot2) library(gridExtra) df1 <- data.frame(x = LETTERS[1:5], y = c(40, 15, 30, 15, 20)) 1. The examples that follow set argument width in two places. What is important is the relative magnitudes of width and position_dodge(width). If width is bigger than position_dodge(width) then the bar sizes and the space between them do not change. They only change if the second width is bigger. In this case the first argument makes a difference. Run the examples to see what I mean. g1 <- ggplot(df1, aes(x, y)) + geom_bar(stat = "identity", width = 0.8, position = position_dodge(width = 0.5)) g2 <- ggplot(df1, aes(x, y)) + geom_bar(stat = "identity", width = 0.8, position = position_dodge(width = 0.25)) g3 <- ggplot(df1, aes(x, y)) + geom_bar(stat = "identity", width = 0.5, position = position_dodge(width = 0.8)) g4 <- ggplot(df1, aes(x, y)) + geom_bar(stat = "identity", width = 0.25, position = position_dodge(width = 0.8)) grid.arrange(g31, g2, g3, g4) 2. The other way is to shrink the plot by changing its aspect ratio. g3 + theme(aspect.ratio = 2/1) grid.arrange(g3, g3 + theme(aspect.ratio = 2/1), nrow = 1) Run the examples and try to do something out of this. Hope this helps, Rui Barradas Às 16:41 de 09/01/20, Ana Marija escreveu: HI Rui, Thank you so much for getting back to me! I did implement your idea (see attach): ax.11.text <- element_text(size = 10) ay.11.text <- element_text(size = 10) p<-ggplot(data=toplot, aes(x=cat, y=props)) + geom_bar(stat="identity",width=0.5, fill="steelblue")+ geom_errorbar(aes(ymin=props-1.96*ses, ymax=props+1.96*ses), width=.1, position=position_dodge(.9)) + geom_signif(comparisons=list( c("All eQTL", "eQTL from 103 genes"), c("All SNPs", "eQTL from 103 genes")), y_position=c(0.065, 0.07), tip_length=0, annotation=c("p = 0.0012", "p = 0.0023")) + scale_y_continuous(breaks=seq(0,.06,by=.01)) + xlab("") + ylab("Proportion p-values < 0.05") + theme_classic()+ theme(panel.grid.major.x = element_line(size = 0.1, color = "grey"), panel.grid.major.y = element_blank(), panel.grid.minor = element_blank(),axis.text.x = ax.11.text,axis.text.y=ay.11.text ) p I was wondering is there is any way to decrease the amount of white spaces around the bars? Thanks Ana On Wed, Jan 8, 2020 at 2:58 PM Rui Barradas wrote: Hello, Maybe theme(panel.grid.major.x = element_line(size = 0.1, color = "grey"), panel.grid.major.y = element_blank(), panel.grid.minor = element_blank() ) Note that if you remove the y axis grid you must set the x axis grid explicitly. Hope this helps, Rui Barradas Às 18:52 de 08/01/20, Ana Marija escreveu: Hello, I have this plot in attach. I was wondering how can I change my plotting code in order to remove these gray horizontal background lines but keep these two vertical lines? These two vertical lines don't need to be gray, can be any other type of lines but they must be at the same place. Also how can I make these two bars narrower? library("ggplot2") p<-ggplot(data=toplot, aes(x=cat, y=props)) + geom_bar(stat="identity", fill="steelblue")+ geom_errorbar(aes(ymin=props-1.96*ses, ymax=props+1.96*ses), width=.2, position=position_dodge(.9)) + geom_signif(comparisons=list( c("All eQTL", "eQTL from 103 genes"), c("All SNPs", "eQTL from 103 genes")), y_position=c(0.065, 0.07), tip_length=0, annotation=c("p = 0.0012", "p = 0.0023")) + scale_y_continuous(breaks=seq(0,.06,by=.01)) + xlab("") + ylab("Proportion p-values < 0.05") + theme_minimal() p __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.