Re: [R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL
Hola Ana, Lo que ocurre es que “Corteoptimo” es diferente a “Corte optimo”. Lo mismo ocurre con “Cortediairo”. Por eso no reconoce que existen. Saludos, Jorge.- El El mar, 16 de abr. de 2019 a las 8:56 a. m., Ana Jimenez Rebollo < anaji...@gmail.com> escribió: > Buenas tardes, > > Estoy tratando de realizar un update en SQL desde R: > sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR") > > y me devuelve el siguiente error: > *Error in sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = > "AUXILIAR", : * > * data frame column(s) Corteoptimo Cortediario not in database table* > > El problema es que sí existen esas columnas en SQL pero parece que R está > almacenando internamente los nombres sin espacios ya que si ejecuto la > función names en R para el data.frame AUXILIAR (que es con el que quiero > actualizar en R): > names(AUXILIAR) > [1] "Date""Time""Seasson" >"TimeH" "Day" "Holiday" > > [7] "Corte optimo""Corte diario" > > Y si obtengo los nombres de la tabla SQL: > sqlColumns(conexion1, "AUXILIAR") > TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME > OS MKOnline dbo AUXILIAR Date > OS MKOnline dbo AUXILIAR Time > OS MKOnline dbo AUXILIAR Seasson > OS MKOnline dbo AUXILIAR TimeH > OS MKOnline dbo AUXILIAR Day > OS MKOnline dbo AUXILIAR Holiday > OS MKOnline dbo AUXILIAR Corte optimo > OS MKOnline dbo AUXILIAR Corte diario > > > Los nombres son idénticos por lo que intuyo que el problema viene en el > nombre almacenado internamente. He probado a renombrar las columnas del > data.frame sustituyendo los espacios por puntos y por guiones bajos pero me > devuelve el mismo error. > PD: El data.frame con el que quiero actualizar en SQL y la tabla llevan el > mismo nombre, siento si eso lleva a algún tipo de confusión. > ¿Alguna sugerencia? > > Muchas gracias de antemano. > > Saludos, > AJR. > > [[alternative HTML version deleted]] > > ___ > R-help-es mailing list > R-help-es@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-help-es > -- Sent from my phone. Please excuse my brevity and misspelling. [[alternative HTML version deleted]] ___ R-help-es mailing list R-help-es@r-project.org https://stat.ethz.ch/mailman/listinfo/r-help-es
Re: [R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL
Estimada Ana Rebollo Intente algo simple, str(datos) y compare los nombres de R con los de sql en la base de datos, lo ideal es que coincidan, es mucho más simple, la segunda alternativa es leer como se generan los códigos SQL (INSERT …..), esto complica un poco el trabajo pero también es posible. Javier Rubén Marcuzzi El mar., 16 abr. 2019 a las 11:59, Javier Nieto () escribió: > Hola Ana > > Seguramente, como dices, es problema de los nombres internos. Sucede que > lo habitual y como buena práctica los nombres de los campos no deben llevar > espacios ni caracteres como acentos o la letra ñ. Mi sugerencia es que > cambies los nombres con espacios a nombres sin espacios. > > > Saludos > > De: R-help-es en nombre de Ana Jimenez > Rebollo > Enviado: martes, 16 de abril de 2019 08:55 a. m. > Para: r-help-es@r-project.org > Asunto: [R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL > > Buenas tardes, > > Estoy tratando de realizar un update en SQL desde R: > sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR") > > y me devuelve el siguiente error: > *Error in sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = > "AUXILIAR", : * > * data frame column(s) Corteoptimo Cortediario not in database table* > > El problema es que sí existen esas columnas en SQL pero parece que R está > almacenando internamente los nombres sin espacios ya que si ejecuto la > función names en R para el data.frame AUXILIAR (que es con el que quiero > actualizar en R): > names(AUXILIAR) > [1] "Date""Time""Seasson" >"TimeH" "Day" "Holiday" > > [7] "Corte optimo""Corte diario" > > Y si obtengo los nombres de la tabla SQL: > sqlColumns(conexion1, "AUXILIAR") > TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME > OS MKOnline dbo AUXILIAR Date > OS MKOnline dbo AUXILIAR Time > OS MKOnline dbo AUXILIAR Seasson > OS MKOnline dbo AUXILIAR TimeH > OS MKOnline dbo AUXILIAR Day > OS MKOnline dbo AUXILIAR Holiday > OS MKOnline dbo AUXILIAR Corte optimo > OS MKOnline dbo AUXILIAR Corte diario > > > Los nombres son idénticos por lo que intuyo que el problema viene en el > nombre almacenado internamente. He probado a renombrar las columnas del > data.frame sustituyendo los espacios por puntos y por guiones bajos pero me > devuelve el mismo error. > PD: El data.frame con el que quiero actualizar en SQL y la tabla llevan el > mismo nombre, siento si eso lleva a algún tipo de confusión. > ¿Alguna sugerencia? > > Muchas gracias de antemano. > > Saludos, > AJR. > > [[alternative HTML version deleted]] > > ___ > R-help-es mailing list > R-help-es@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-help-es > > [[alternative HTML version deleted]] > > ___ > R-help-es mailing list > R-help-es@r-project.org > https://stat.ethz.ch/mailman/listinfo/r-help-es > [[alternative HTML version deleted]] ___ R-help-es mailing list R-help-es@r-project.org https://stat.ethz.ch/mailman/listinfo/r-help-es
[R-es] PROBLEMAS NOMBRES DE COLUMNAS CON ESPACIOS CONEXION R-SQL
Buenas tardes, Estoy tratando de realizar un update en SQL desde R: sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR") y me devuelve el siguiente error: *Error in sqlUpdate(conexion1, data.frame(AUXILIAR), tablename = "AUXILIAR", : * * data frame column(s) Corteoptimo Cortediario not in database table* El problema es que sí existen esas columnas en SQL pero parece que R está almacenando internamente los nombres sin espacios ya que si ejecuto la función names en R para el data.frame AUXILIAR (que es con el que quiero actualizar en R): names(AUXILIAR) [1] "Date""Time""Seasson" "TimeH" "Day" "Holiday" [7] "Corte optimo""Corte diario" Y si obtengo los nombres de la tabla SQL: sqlColumns(conexion1, "AUXILIAR") TABLE_CAT TABLE_SCHEM TABLE_NAME COLUMN_NAME OS MKOnline dbo AUXILIAR Date OS MKOnline dbo AUXILIAR Time OS MKOnline dbo AUXILIAR Seasson OS MKOnline dbo AUXILIAR TimeH OS MKOnline dbo AUXILIAR Day OS MKOnline dbo AUXILIAR Holiday OS MKOnline dbo AUXILIAR Corte optimo OS MKOnline dbo AUXILIAR Corte diario Los nombres son idénticos por lo que intuyo que el problema viene en el nombre almacenado internamente. He probado a renombrar las columnas del data.frame sustituyendo los espacios por puntos y por guiones bajos pero me devuelve el mismo error. PD: El data.frame con el que quiero actualizar en SQL y la tabla llevan el mismo nombre, siento si eso lleva a algún tipo de confusión. ¿Alguna sugerencia? Muchas gracias de antemano. Saludos, AJR. [[alternative HTML version deleted]] ___ R-help-es mailing list R-help-es@r-project.org https://stat.ethz.ch/mailman/listinfo/r-help-es
Re: [R] SQL Database
Harold, I don't have much experience with ODBC/RODBC, but given that it's working on Win, a driver problem seems plausible. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 Lab cell 925-724-7509 On 7/26/18, 9:37 AM, "Doran, Harold" wrote: Thanks for this. I'm using the RODBC stuff now. It works well and is currently embedded in a shiny app. So, the entire SQL stuff is transparent to the user who simply interacts with the UI. It appears to be working in a local windows version. That is, I can successfully open the connection, do my sqlQuery, and save those data as objects in the R session. But when I run the same code on my dev server (which runs Centos 7), the code is breaking and it is seemingly related to the driver. It just cannot open the connection. That portion of my code is (with certain things blanked out for security): cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client 11.0}; server=1.1.1.1; database=xyz; uid=*; pwd=***;" ) I'm doing my homework now on the right drivers that might be appropriate for centos, but if anyone happens to know, hints are appreciated Harold -Original Message- From: MacQueen, Don [mailto:macque...@llnl.gov] Sent: Thursday, July 26, 2018 11:26 AM To: Doran, Harold ; 'r-help@r-project.org' Subject: Re: [R] SQL Database From my point of view, the logic is this: If the external database is Oracle, use ROracle If the external database is MySQL, use RMySQL and similarly for other databases If there is no R package specific to the database, then you drop back to RODBC or RJDBC. Hopefully you can get the necessary drivers or java files to support the database Your steps look good (I do them all the time with Oracle and MySQL), and realize that you don't have to grab an entire table; you can send SQL queries that join tables and subset rows, etc. You can also write results back to the database if that's useful. I prefer to use packages that are based on the DBI package. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 Lab cell 925-724-7509 On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold" wrote: I'm doing some work now to learn which SQL database package is the most optimal for the task I am working on. There are many packages, and I'm reviewing the documentation on some of them now. I am seeking advice from those of you who might suggest a package to use for the task I am currently working with. The work is currently as follows. My users currently use another tool to extract tables from a server, save those tables as .csv files, and then those csv files are read into R and stuff is done on the data in those files. This adds overhead that can be bypassed if users instead can directly access the database from within R and grab the tables they need and then those tables are data frames in the R session and available to do stuff. The sequence of work (I think) I just this: Step 1: Connect to the remote server (connection string and authenticate the user) Step 2: Have a SQL query statement that grabs the tables from the remote server Step 3: Close the connection The two packages I have narrowed my studies to are Dbplyr and RODBC, both of which seem to be similar. Any experiences out there to suggest these two packages are in fact right for this task, or would there be other packages that might be more optimal for this? Thanks, Harold __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Database
Thanks for this. I'm using the RODBC stuff now. It works well and is currently embedded in a shiny app. So, the entire SQL stuff is transparent to the user who simply interacts with the UI. It appears to be working in a local windows version. That is, I can successfully open the connection, do my sqlQuery, and save those data as objects in the R session. But when I run the same code on my dev server (which runs Centos 7), the code is breaking and it is seemingly related to the driver. It just cannot open the connection. That portion of my code is (with certain things blanked out for security): cn <- odbcDriverConnect(connection="Driver={SQL Server Native Client 11.0}; server=1.1.1.1; database=xyz; uid=*; pwd=***;" ) I'm doing my homework now on the right drivers that might be appropriate for centos, but if anyone happens to know, hints are appreciated Harold -Original Message- From: MacQueen, Don [mailto:macque...@llnl.gov] Sent: Thursday, July 26, 2018 11:26 AM To: Doran, Harold ; 'r-help@r-project.org' Subject: Re: [R] SQL Database From my point of view, the logic is this: If the external database is Oracle, use ROracle If the external database is MySQL, use RMySQL and similarly for other databases If there is no R package specific to the database, then you drop back to RODBC or RJDBC. Hopefully you can get the necessary drivers or java files to support the database Your steps look good (I do them all the time with Oracle and MySQL), and realize that you don't have to grab an entire table; you can send SQL queries that join tables and subset rows, etc. You can also write results back to the database if that's useful. I prefer to use packages that are based on the DBI package. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 Lab cell 925-724-7509 On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold" wrote: I'm doing some work now to learn which SQL database package is the most optimal for the task I am working on. There are many packages, and I'm reviewing the documentation on some of them now. I am seeking advice from those of you who might suggest a package to use for the task I am currently working with. The work is currently as follows. My users currently use another tool to extract tables from a server, save those tables as .csv files, and then those csv files are read into R and stuff is done on the data in those files. This adds overhead that can be bypassed if users instead can directly access the database from within R and grab the tables they need and then those tables are data frames in the R session and available to do stuff. The sequence of work (I think) I just this: Step 1: Connect to the remote server (connection string and authenticate the user) Step 2: Have a SQL query statement that grabs the tables from the remote server Step 3: Close the connection The two packages I have narrowed my studies to are Dbplyr and RODBC, both of which seem to be similar. Any experiences out there to suggest these two packages are in fact right for this task, or would there be other packages that might be more optimal for this? Thanks, Harold __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Database
From my point of view, the logic is this: If the external database is Oracle, use ROracle If the external database is MySQL, use RMySQL and similarly for other databases If there is no R package specific to the database, then you drop back to RODBC or RJDBC. Hopefully you can get the necessary drivers or java files to support the database Your steps look good (I do them all the time with Oracle and MySQL), and realize that you don't have to grab an entire table; you can send SQL queries that join tables and subset rows, etc. You can also write results back to the database if that's useful. I prefer to use packages that are based on the DBI package. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 Lab cell 925-724-7509 On 7/25/18, 7:57 AM, "R-help on behalf of Doran, Harold" wrote: I'm doing some work now to learn which SQL database package is the most optimal for the task I am working on. There are many packages, and I'm reviewing the documentation on some of them now. I am seeking advice from those of you who might suggest a package to use for the task I am currently working with. The work is currently as follows. My users currently use another tool to extract tables from a server, save those tables as .csv files, and then those csv files are read into R and stuff is done on the data in those files. This adds overhead that can be bypassed if users instead can directly access the database from within R and grab the tables they need and then those tables are data frames in the R session and available to do stuff. The sequence of work (I think) I just this: Step 1: Connect to the remote server (connection string and authenticate the user) Step 2: Have a SQL query statement that grabs the tables from the remote server Step 3: Close the connection The two packages I have narrowed my studies to are Dbplyr and RODBC, both of which seem to be similar. Any experiences out there to suggest these two packages are in fact right for this task, or would there be other packages that might be more optimal for this? Thanks, Harold __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Database
> Doran, Harold > on Wed, 25 Jul 2018 14:57:13 + writes: > I'm doing some work now to learn which SQL database > package is the most optimal for the task I am working on. Hmm... we would have a problem with optimize() and optim() if this was optimal << more optimal << most optimal :-) ;-) Best, Martin __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Database
https://rviews.rstudio.com/2017/10/18/database-queries-with-r/ Cheers, Bert Bert Gunter "The trouble with having an open mind is that people keep coming along and sticking things into it." -- Opus (aka Berkeley Breathed in his "Bloom County" comic strip ) On Wed, Jul 25, 2018 at 7:57 AM, Doran, Harold wrote: > I'm doing some work now to learn which SQL database package is the most > optimal for the task I am working on. There are many packages, and I'm > reviewing the documentation on some of them now. I am seeking advice from > those of you who might suggest a package to use for the task I am currently > working with. > > The work is currently as follows. My users currently use another tool to > extract tables from a server, save those tables as .csv files, and then > those csv files are read into R and stuff is done on the data in those > files. This adds overhead that can be bypassed if users instead can > directly access the database from within R and grab the tables they need > and then those tables are data frames in the R session and available to do > stuff. > > The sequence of work (I think) I just this: > > Step 1: Connect to the remote server (connection string and authenticate > the user) > Step 2: Have a SQL query statement that grabs the tables from the remote > server > Step 3: Close the connection > > The two packages I have narrowed my studies to are Dbplyr and RODBC, both > of which seem to be similar. > > Any experiences out there to suggest these two packages are in fact right > for this task, or would there be other packages that might be more optimal > for this? > > Thanks, > Harold > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/ > posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL Database
I'm doing some work now to learn which SQL database package is the most optimal for the task I am working on. There are many packages, and I'm reviewing the documentation on some of them now. I am seeking advice from those of you who might suggest a package to use for the task I am currently working with. The work is currently as follows. My users currently use another tool to extract tables from a server, save those tables as .csv files, and then those csv files are read into R and stuff is done on the data in those files. This adds overhead that can be bypassed if users instead can directly access the database from within R and grab the tables they need and then those tables are data frames in the R session and available to do stuff. The sequence of work (I think) I just this: Step 1: Connect to the remote server (connection string and authenticate the user) Step 2: Have a SQL query statement that grabs the tables from the remote server Step 3: Close the connection The two packages I have narrowed my studies to are Dbplyr and RODBC, both of which seem to be similar. Any experiences out there to suggest these two packages are in fact right for this task, or would there be other packages that might be more optimal for this? Thanks, Harold __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Server "float" is not handled by RODBC -- Is there a workaround?
Hi Jim, No answers over the course of 24 hours so I'll give it a shot. First, I always work under Linux, so my answers may well be worthless for your Windows scenario. Second, I don't know if my workaround works as I don't actually have a SQL Server DB using float. Now the workaround: I have had many problems in the past using ODBC to connect to databases. Nothing I could nail down to a fault in that system, but just no end of problems. Some of that, of course, is due to me generally working under Linux. My general workaround that has been clean is to use JDBC instead. There have been hassles at times to set up the RJava, but recent versions of that have installed very easily. Once RJava is in place (and under Windows, you'll have fun setting up Java cleanly), then installation of a JDBC jar (I use jtds from SourceForge for SQL Server) and finally RJDBC. The generic nature of the JDBC interface is a joy to work with, interacting with most database types very well and in a uniform manner. So, lots of work getting JDBC up and going to see if an alternative path into your DB gets you your data in a better format. Now you see why I waited 24 hours to say anything at all ... Also, it might be worth while posting on the DB specific maillist: https://stat.ethz.ch/mailman/listinfo/r-sig-db Hope this helps, Mark Dalphin On 15/10/15 07:23, jim holtman wrote: > Here is the system I am using: > = >> sessionInfo() > R version 3.2.2 (2015-08-14) > Platform: x86_64-w64-mingw32/x64 (64-bit) > Running under: Windows 7 x64 (build 7601) Service Pack 1 > > locale: > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > States.1252 > [3] LC_MONETARY=English_United States.1252 > LC_NUMERIC=C > [5] LC_TIME=English_United States.1252 > > attached base packages: > [1] stats graphics grDevices utils datasets methods base > > other attached packages: > [1] lubridate_1.3.3 RODBC_1.3-12 > loaded via a namespace (and not attached): > [1] magrittr_1.5 plyr_1.8.3tools_3.2.2 memoise_0.2.1 Rcpp_0.12.1 > stringi_0.5-5 digest_0.6.8 > [8] stringr_1.0.0 > > > I have data on a SQL Server that I am connecting to where some of the > fields are defined as "float" so that the data is stored in the database as > an IEEE 754 value. Now when I read this is using RODBC, the data comes > across the interface in the floating point format; I used Wireshark to > examine the packets that were being sent. Some of the data is also defined > as "int" and comes across in binary. > > When the data is read in with > > df <- sqlQuery(db, "select * from mydb", as.is = TRUE) > > The resulting dataframe has the floating point values as 'chr' and the > integer fields as 'int'; I would have expected the floating point fields to > be 'num'. Now in the "ODBC Connectivity" Vignette by Ripley there was the > comment that "double" data values come back as type 8, but on some systems > they may be type 6; well on SQL Server, "float" is type 6. > > So what appears to happen, is this data is not recognized as a floating > point value and is therefore converted to a character. When the data is > made available to the R script, I then have to convert this back to > floating point. If I use "stringsAsFactors = FALSE" on the query, this > conversion back to floating point will be done within the RODBC package. > This becomes a problem when I have dataframes with several million rows and > multiple columns of numerics is that the conversion to/from characters is > adding time to the processing. > > So I was wondering is there a workaround to this problem? Is it possible > to add the capability to RODBC when processing SQL Server to avoid this > conversion? Or is there some other way around this problem? > > Jim Holtman > Data Munger Guru > > What is the problem that you are trying to solve? > Tell me what you want to do, not how you want to do it. > > [[alternative HTML version deleted]] > > __ > R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see > https://stat.ethz.ch/mailman/listinfo/r-help > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html > and provide commented, minimal, self-contained, reproducible code. > __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Server "float" is not handled by RODBC -- Is there a workaround?
Mark, Thanks for the suggestion. I will have to look into that option. I assume that if I am running on a 64-bit system, I also have to use the 64-bit version of Java. We have had some problems in the past because the company standard is a 32-bit version of Java and we had to also load in the 64-bit version to work with the XLConnect package. I was reading the RJDBC package documention and they seem to list a lot of methods (e.g., 'dbReadTable'), but don't say what the parameters are, or what it returns. Where do you find this information? Also I notice that I probably have to get the JDBC driver for SQL Server from Microsoft and install that - is that correct? I hate to start mixing approaches since we have a large number of scripts that currently use the RODBC package, but I will try to see if the approach you proposed do help overcome this problem. Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. On Thu, Oct 15, 2015 at 4:18 PM, Mark Dalphinwrote: > Hi Jim, > > No answers over the course of 24 hours so I'll give it a shot. > > First, I always work under Linux, so my answers may well be worthless > for your Windows scenario. > > Second, I don't know if my workaround works as I don't actually have a > SQL Server DB using float. > > Now the workaround: > > I have had many problems in the past using ODBC to connect to databases. > Nothing I could nail down to a fault in that system, but just no end of > problems. Some of that, of course, is due to me generally working under > Linux. > > My general workaround that has been clean is to use JDBC instead. There > have been hassles at times to set up the RJava, but recent versions of > that have installed very easily. Once RJava is in place (and under > Windows, you'll have fun setting up Java cleanly), then installation of > a JDBC jar (I use jtds from SourceForge for SQL Server) and finally > RJDBC. The generic nature of the JDBC interface is a joy to work with, > interacting with most database types very well and in a uniform manner. > > So, lots of work getting JDBC up and going to see if an alternative path > into your DB gets you your data in a better format. Now you see why I > waited 24 hours to say anything at all ... > > Also, it might be worth while posting on the DB specific maillist: > https://stat.ethz.ch/mailman/listinfo/r-sig-db > > Hope this helps, > Mark Dalphin > > > On 15/10/15 07:23, jim holtman wrote: > > Here is the system I am using: > > = > >> sessionInfo() > > R version 3.2.2 (2015-08-14) > > Platform: x86_64-w64-mingw32/x64 (64-bit) > > Running under: Windows 7 x64 (build 7601) Service Pack 1 > > > > locale: > > [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United > > States.1252 > > [3] LC_MONETARY=English_United States.1252 > > LC_NUMERIC=C > > [5] LC_TIME=English_United States.1252 > > > > attached base packages: > > [1] stats graphics grDevices utils datasets methods base > > > > other attached packages: > > [1] lubridate_1.3.3 RODBC_1.3-12 > > loaded via a namespace (and not attached): > > [1] magrittr_1.5 plyr_1.8.3tools_3.2.2 memoise_0.2.1 Rcpp_0.12.1 > > stringi_0.5-5 digest_0.6.8 > > [8] stringr_1.0.0 > > > > > > I have data on a SQL Server that I am connecting to where some of the > > fields are defined as "float" so that the data is stored in the database > as > > an IEEE 754 value. Now when I read this is using RODBC, the data comes > > across the interface in the floating point format; I used Wireshark to > > examine the packets that were being sent. Some of the data is also > defined > > as "int" and comes across in binary. > > > > When the data is read in with > > > > df <- sqlQuery(db, "select * from mydb", as.is = TRUE) > > > > The resulting dataframe has the floating point values as 'chr' and the > > integer fields as 'int'; I would have expected the floating point fields > to > > be 'num'. Now in the "ODBC Connectivity" Vignette by Ripley there was > the > > comment that "double" data values come back as type 8, but on some > systems > > they may be type 6; well on SQL Server, "float" is type 6. > > > > So what appears to happen, is this data is not recognized as a floating > > point value and is therefore converted to a character. When the data is > > made available to the R script, I then have to convert this back to > > floating point. If I use "stringsAsFactors = FALSE" on the query, this > > conversion back to floating point will be done within the RODBC package. > > This becomes a problem when I have dataframes with several million rows > and > > multiple columns of numerics is that the conversion to/from characters is > > adding time to the processing. > > > > So I was wondering is there a workaround to this problem? Is it possible > > to add the capability to RODBC when
Re: [R] SQL Server "float" is not handled by RODBC -- Is there a workaround?
Hi Jim, Yes, your Java versions need to match bit width. As for drivers, I do not use the Microsoft one; it used to be hard to obtain for Linux users; I don't know if it still is. I currently use "jtds" from Sourceforge: http://jtds.sourceforge.net/ >From their site: > jTDS is an open source 100% pure Java (type 4) JDBC 3.0 driver for > Microsoft SQL Server (6.5, 7, 2000, 2005, 2008 and 2012) and Sybase > Adaptive Server Enterprise (10, 11, 12 and 15). I always found reading the DBI + DBI-specific driver (eg RPostgreSQL or RJDBC) documentation challenging. I think it is oriented towards the programmer who wrote it rather than the end user. It probably needs a long, wordy vignette like Tim Bunce once wrote for the Perl DBI, but I am not stepping up to write it (so you don't hear me complaining)! Over the years, I've gleaned enough and experimented enough that I have certain methods well worked out. And I have ignored much of the functionality that has been added over the years; I am sure that the DBI implementors have put some real gems in there as I stumble upon new ones all the time. The help for DBI, "help(package='DBI')", lists many methods. RJDBC is just an intermediary and it happens to implement most of them. In fact, I usually read the DBI drivers packages (RPosgreSQL, RJDBC) with an eye towards find "what is not implemented or what is broken" more than anything else. DBIConnection-classDBIConnection class. DBIDriver-classDBIDriver class. DBIObject-classDBIObject class. DBIResult-classDBIResult class. SQLSQL quoting. dbClearResult Clear a result set. dbColumnInfo Information about result types. dbConnect Create a connection to a DBMS. dbDataType Determine the SQL data type of an object. dbDisconnect Disconnect (close) a connection dbDriver Load and unload database drivers. dbExistsTable Does a table exist? dbFetchFetch records from a previously executed query. dbGetException Get DBMS exceptions. dbGetInfo Get DBMS metadata. dbGetQuery Send query, retrieve results and then clear result set. dbGetRowCount The number of rows fetched so far. dbGetRowsAffected The number of rows affected by data modifying query. dbGetStatement Get the statement associated with a result set dbHasCompleted Has the operation completed? dbIsValid Is this DBMS object still valid? dbListConnections List currently open connections. dbListFields List field names of a remote table. dbListResults A list of all pending results. dbListTables List remote tables. dbReadTableCopy data frames to and from database tables. dbRemoveTable Remove a table from the database. dbSendQueryExecute a statement on a given database connection. dbiCheckCompliance Check a driver for compliance with DBI. make.db.names Make R identifiers into legal SQL identifiers. transactions Begin/commit/rollback SQL transactions In short, I usually only use a small subset of what is available: dbConnect, dbDisconnect, dbGetQuery, dbSendQuery, dbFetch and 'transactions' Some of the apparent richness is from things like two methods to get table information, one detailed and one simple. And since I know my schemas already, I seldom use either. In addition, there is ample support for complex DB interaction, including full support for transactions, allowing multi-table updates with security. My usual operation is something like this: library(RJDBC) jdbcStr <- 'jdbc:jtds:sqlserver://myHost.myLan:myPort/myDbName' drv <- JDBC("net.sourceforge.jtds.jdbc.Driver", "/my/long/path/to/my/JDBC_Drivers/jtds/jtds-1.3.0.jar") conn <- dbConnect(drv, jdbcStr, user='mark', password='*') ## See what tables are present (DB exploration; not used in production) tab <- dbGetTables(conn) ## A typical query if I am to iterate through rs <- dbSendQuery(conn, 'SELECT TOP 10 * FROM Orders;') ## followed by dbFetch calls ## A more typical query if I can grab the whole thing at once rs <- dbGetQuery(conn, 'SELECT * FROM Orders;') print(rs) dbDisconnect(conn) Hope this helps; I might be able to help with specific examples for some of the DBI methods, if you need it. I suspect that what I have written above will cover most "query SQL", though not the
[R] SQL Server "float" is not handled by RODBC -- Is there a workaround?
Here is the system I am using: = > sessionInfo() R version 3.2.2 (2015-08-14) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows 7 x64 (build 7601) Service Pack 1 locale: [1] LC_COLLATE=English_United States.1252 LC_CTYPE=English_United States.1252 [3] LC_MONETARY=English_United States.1252 LC_NUMERIC=C [5] LC_TIME=English_United States.1252 attached base packages: [1] stats graphics grDevices utils datasets methods base other attached packages: [1] lubridate_1.3.3 RODBC_1.3-12 loaded via a namespace (and not attached): [1] magrittr_1.5 plyr_1.8.3tools_3.2.2 memoise_0.2.1 Rcpp_0.12.1 stringi_0.5-5 digest_0.6.8 [8] stringr_1.0.0 I have data on a SQL Server that I am connecting to where some of the fields are defined as "float" so that the data is stored in the database as an IEEE 754 value. Now when I read this is using RODBC, the data comes across the interface in the floating point format; I used Wireshark to examine the packets that were being sent. Some of the data is also defined as "int" and comes across in binary. When the data is read in with df <- sqlQuery(db, "select * from mydb", as.is = TRUE) The resulting dataframe has the floating point values as 'chr' and the integer fields as 'int'; I would have expected the floating point fields to be 'num'. Now in the "ODBC Connectivity" Vignette by Ripley there was the comment that "double" data values come back as type 8, but on some systems they may be type 6; well on SQL Server, "float" is type 6. So what appears to happen, is this data is not recognized as a floating point value and is therefore converted to a character. When the data is made available to the R script, I then have to convert this back to floating point. If I use "stringsAsFactors = FALSE" on the query, this conversion back to floating point will be done within the RODBC package. This becomes a problem when I have dataframes with several million rows and multiple columns of numerics is that the conversion to/from characters is adding time to the processing. So I was wondering is there a workaround to this problem? Is it possible to add the capability to RODBC when processing SQL Server to avoid this conversion? Or is there some other way around this problem? Jim Holtman Data Munger Guru What is the problem that you are trying to solve? Tell me what you want to do, not how you want to do it. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL Requests Templating
Hi, I am new to R. What I'd like to know is how to empower sqldf with templates like https://www.playframework.com/documentation/2.3.x/ScalaAnorm does? What does seasoned R-hacker use for this purpose: dedicated R-package that I am not aware of, or kind of format string, or something else? A. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
So, some feedback. Have installed MariaDB 10.0.10 on the Linux box. That speeded things up. Changed from InnoDB/XtraDb to Aria. That speeded loading of the data up. Have installed MariaDB on the iMac. That speeded things up more. Tried to tune MariadDB's config. Didn't speed things up much, but for the query buffer. Figured out Replication (from the linux box to the iMac). This slowed loading down somewhat. Played with the SQL. Speeded things up significantly. Played with the inxdexes. Did not speed things up much. Found what I could do in data.table that was faster than SQL and did that. Obvious increase in speed. My R processing time came down from 35 to 6 1/2 minutes. Removed all large tables before saving (and once the raw data was no longer required). That reduced RData from 150MB to 7KB. Pushed the table and image generation into a second R file. This takes 4 seconds. The corresponding LyX/LaTeX/Beamer/KnitR runs in 12 seconds. Installed RStudio. Nice. Adding new SQL queries adds between 30 and 90 seconds in the input R file, next to nothing to the presentation generation. I could not care lass how long the input takes, even hours, as long as I can save the analysis results and not the data into the RData. el PS: Ordered a MacPro :-)-O. Will report back. on 2014-05-06, 15:40 Peter Crowther said the following: The dataset is not large by database standards. Even in mySQL - not known for its speed at multi-row querying - the queries you describe should complete within a few seconds on even moderately recent hardware if your indexes are reasonable. What are your performance criteria for processing these queries, and how have you / your team optimised the relational database storage? Cheers, - Peter -- Peter Crowther, Director, Melandra Limited On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote: Exactly, which is why I am looking for something faster :-)-O el on 2014-05-06, 15:21 David R Forrest said the following: It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Thank you. My requirements are that simple. One table, 11 fields, of which 3 are interesting, 30 Million records, growing daily by between 30. And, yes I have spent an enormous amount of time reading these things, but for someone not dealing with this professionally and/or on a daily basis, the documents don't help much. el on 2014-05-04, 05:26 Jeff Newmiller said the following: ?table ?aggregate Also, packages plyr, data.table, and dplyr. You might consider reading [1], but if your interests are really as simple as your examples then the table function should be sufficient. That function is discussed in the Introduction to R document that you really should have read before posting here. [1] http://www.jstatsoft.org/v40/i01/ [...] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. On May 6, 2014 1:16:12 AM PDT, Eberhard Lisse nos...@lisse.na wrote: Thank you. My requirements are that simple. One table, 11 fields, of which 3 are interesting, 30 Million records, growing daily by between 30. And, yes I have spent an enormous amount of time reading these things, but for someone not dealing with this professionally and/or on a daily basis, the documents don't help much. el on 2014-05-04, 05:26 Jeff Newmiller said the following: ?table ?aggregate Also, packages plyr, data.table, and dplyr. You might consider reading [1], but if your interests are really as simple as your examples then the table function should be sufficient. That function is discussed in the Introduction to R document that you really should have read before posting here. [1] http://www.jstatsoft.org/v40/i01/ [...] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... el on 2014-05-06, 09:41 Jeff Newmiller said the following: In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Hi, Yes dplyr syntax is quite equivalent to SQL, although it is faster. Another alternative you could consider is to use *data.table* which has a syntax very similar to the way you select subset within a data.frame and in terms of performance is faster (a bit) than sqldf. You can get some idea of how to work with it here: http://stackoverflow.com/questions/1727772/quickly-reading-very-large-tables-as-dataframes-in-r Regards, Carlos Ortega www.qualityexcellence.es 2014-05-06 11:12 GMT+02:00 Dr Eberhard Lisse e...@lisse.na: Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... el on 2014-05-06, 09:41 Jeff Newmiller said the following: In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Saludos, Carlos Ortega www.qualityexcellence.es [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On Tue, 6 May 2014 10:12:50 +0100 Dr Eberhard Lisse e...@lisse.na wrote Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... el on 2014-05-06, 09:41 Jeff Newmiller said the following: In what format is this growing data stored? CSV? SQL? Log textfile? You say you don't want to use sqldf, but you haven't said what you do want to use. It seems like you are trying to extract a (relatively) small data set from a much larger SQL databaseWhy not do the SQL stiff in the database and the analysis *statsm graphics...) in R? Maybe use a make table query to grab the data of interest, and then import the whole table into R for the analysis? (Disclaimer: my ignorance of SQL is not far off total) HTH D. South Africas premier free email service - www.webmail.co.za Cheapest Insurance Quotes! https://www.outsurance.co.za/insurance-quote/personal/?source=msncr=Postit14_468x60_gifcid=322 __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
David, this is quite slow :-)-O el on 2014-05-06, 10:55 David McPearson said the following: [...] It seems like you are trying to extract a (relatively) small data set from a much larger SQL databaseWhy not do the SQL stiff in the database and the analysis *statsm graphics...) in R? Maybe use a make table query to grab the data of interest, and then import the whole table into R for the analysis? (Disclaimer: my ignorance of SQL is not far off total) HTH D. [...] -- Dr. Eberhard W. Lisse \/ Obstetrician Gynaecologist (Saar) e...@lisse.na/ * | Telephone: +264 81 124 6733 (cell) PO Box 8421 \ / Bachbrecht, Namibia ;/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote: Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... You can create indices in sqldf and that can speed up processing substantially for certain operations. See examples 4h and 4i on the sqldf home page: http://sqldf.googlecode.com. Also note that sqldf supports not only the default SQLite backend but also MySQL, h2 and postgresql. See ?sqldf for info on using sqldf with MySQL and the others. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Thanks, tried all of that, too slow. el on 2014-05-06, 12:00 Gabor Grothendieck said the following: On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote: Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... You can create indices in sqldf and that can speed up processing substantially for certain operations. See examples 4h and 4i on the sqldf home page: http://sqldf.googlecode.com. Also note that sqldf supports not only the default SQLite backend but also MySQL, h2 and postgresql. See ?sqldf for info on using sqldf with MySQL and the others. -- Dr. Eberhard W. Lisse \/ Obstetrician Gynaecologist (Saar) e...@lisse.na/ * | Telephone: +264 81 124 6733 (cell) PO Box 8421 \ / Bachbrecht, Namibia ;/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave On May 6, 2014, at 7:08 AM, Dr Eberhard Lisse e...@lisse.na wrote: Thanks, tried all of that, too slow. el on 2014-05-06, 12:00 Gabor Grothendieck said the following: On Tue, May 6, 2014 at 5:12 AM, Dr Eberhard Lisse e...@lisse.na wrote: Jeff It's in MySQL, at the moment roughly 1.8 GB, if I pull it into a dataframe it saves to 180MB. I work from the dataframe. But, it's not only a size issue it's also a speed issue and hence I don't care what I am going to use, as long as it is fast. sqldf is easy to understand for me but it takes ages. If alternatives were roughly similar in speed I would remain with sqldf. dplyr sounds faster, and promising, but the intrinsic stuff is way beyond me (elderly Gynaecologist) on the learning curve... You can create indices in sqldf and that can speed up processing substantially for certain operations. See examples 4h and 4i on the sqldf home page: http://sqldf.googlecode.com. Also note that sqldf supports not only the default SQLite backend but also MySQL, h2 and postgresql. See ?sqldf for info on using sqldf with MySQL and the others. -- Dr. Eberhard W. Lisse \/ Obstetrician Gynaecologist (Saar) e...@lisse.na/ * | Telephone: +264 81 124 6733 (cell) PO Box 8421 \ / Bachbrecht, Namibia ;/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. David Forrest d...@vims.edu 804-684-7900w 757-968-5509h 804-413-7125c #240 Andrews Hall Virginia Institute of Marine Science Route 1208, Greate Road PO Box 1346 Gloucester Point, VA, 23062-1346 signature.asc Description: Message signed with OpenPGP using GPGMail __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Exactly, which is why I am looking for something faster :-)-O el on 2014-05-06, 15:21 David R Forrest said the following: It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (Darwin) Comment: GPGTools - http://gpgtools.org Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW 6aytsFtPZTI= =zepY -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
The dataset is not large by database standards. Even in mySQL - not known for its speed at multi-row querying - the queries you describe should complete within a few seconds on even moderately recent hardware if your indexes are reasonable. What are your performance criteria for processing these queries, and how have you / your team optimised the relational database storage? Cheers, - Peter -- Peter Crowther, Director, Melandra Limited On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Exactly, which is why I am looking for something faster :-)-O el on 2014-05-06, 15:21 David R Forrest said the following: It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (Darwin) Comment: GPGTools - http://gpgtools.org Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW 6aytsFtPZTI= =zepY -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
I believe this discussion should be taken offlist as it no longer seems to be concerned with R. -- Bert Gunter Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Tue, May 6, 2014 at 7:40 AM, Peter Crowther peter.crowt...@melandra.com wrote: The dataset is not large by database standards. Even in mySQL - not known for its speed at multi-row querying - the queries you describe should complete within a few seconds on even moderately recent hardware if your indexes are reasonable. What are your performance criteria for processing these queries, and how have you / your team optimised the relational database storage? Cheers, - Peter -- Peter Crowther, Director, Melandra Limited On 6 May 2014 15:32, Dr Eberhard Lisse e...@lisse.na wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 Exactly, which is why I am looking for something faster :-)-O el on 2014-05-06, 15:21 David R Forrest said the following: It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. Dave -BEGIN PGP SIGNATURE- Version: GnuPG v1.4.12 (Darwin) Comment: GPGTools - http://gpgtools.org Comment: Using GnuPG with Thunderbird - http://www.enigmail.net/ iQCVAwUBU2jyd1sF2hmmSQy5AQJVPQP+MnrEkXLY9PK+N2CB+maySkRKhEXcWTUA KNOQnTDaYl3wnRZKg8y1wiZbLFA8tWsKpXPv91phDZ2000MTbv7SbnpBXthSzbAn clEOniQqRcXci1Q2Qjd+mH0YxyA6XpNvBnBIlbxPsQbObwjK+dKl7/cna1oZKUhW 6aytsFtPZTI= =zepY -END PGP SIGNATURE- __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On Wed, May 7, 2014 at 2:21 AM, David R Forrest d...@vims.edu wrote: It sounds as if your underlying MySQL database is too slow for your purposes. Whatever you layer on top of it will be constrained by the underlying database. To speed up the process significantly, you may need to do work on the database backend part of the process. You might try MonetDB and its R interface -- it is fast for aggregation operations, and either the current version or the upcoming version has dplyr support. -thomas -- Thomas Lumley Professor of Biostatistics University of Auckland __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On May 5, 2014, at 11:44 AM, Dr Eberhard Lisse wrote: I do not wish to prolong this metadiscussion but I remain confused by your advice: 1) You don't understand what I asked (ie would have to parse two simple SQL statements) Correct ... at least for me. I could have guessed at what that statement might have meant, but why should I need to guess? Why not use a shared naturallanguage rather than restricting your audience to the more limited group that understands both languages? 2) The Original Post is understood enough, however, to point me to the Introduction to R (where I have not found something to help me) That means my guess would have been wrong, since like Jeff Newmiller, I thought a simple call to `table` would have succeeded. (Section 5.10 although the desire for ordering suggested by my guess regarding 3) My name is not Pete. I'm actually not sure who Pete was. It's a local expression of astonishment directed, not at you, but at Satish. That was a prelude to my effort explain to Satish why the other respondents to the list have not seen fit to be more expansive in their responses. I thought Satish's comment was gratuitous (and likewise unhelpful). If you don't want to help me, don't. Several people are trying to help. You are remaining obdurate in failing to explain what is desired in natural language or posting an example in R code with desired output, as well as in failing to heed multiple other bits of advice in the Posting Guide. The accepted practice in responses is to include any context that might further the conversation. To my mind that would have required that you include the original request: How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) How does the earlier suggestion to look at the 'table' function fail to address the first alternative? ( It appears it might satisfy the second one as well.) -- David. Nobody is forcing you to reply. el On 2014-05-04, 06:56 , David Winsemius wrote: On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote: By making the effort to learn R?? very constructive and not condescending at all. We, lesser beings, are indebted to you, sir. For Pete's sake. The OP didn't even express his original request in natural language or offer a working example. Those of us who are not regular SQL users would have needed to parse out the SQL code in order to figure out what was intended. (My guess is that it would have been quite easy to solve if those were what were offered.) But making the effort to divine the intent didn't seem justified by the level of courtesy offered by the questioner. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On Fri, May 2, 2014 at 5:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) In the examples section at the bottom of ?sqldf are a number of SQL statements and the corresponding R statements. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Thank you very much, Mr Arkell. el On 2014-05-03, 07:11 , Bert Gunter wrote: By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Thanks, will try to figure this out :-)-O el On 2014-05-03, 06:40 , Carlos Ortega wrote: Hi, With the new package dplyr you can create equivalent SQL sintaxt queries like the one you need. You can find examples of how to apply it here: http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/ http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/ Regards, Carlos. 2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na mailto:nos...@lisse.na: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailto:R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Saludos, Carlos Ortega www.qualityexcellence.es http://www.qualityexcellence.es __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On 04/05/14 00:05, Dr Eberhard W Lisse wrote: Thank you very much, Mr Arkell. I don't get it. Can anyone explain the (joke? allusion?) ? cheers, Rolf Turner On 2014-05-03, 07:11 , Bert Gunter wrote: By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On Sat, May 3, 2014 at 5:42 PM, Rolf Turner r.tur...@auckland.ac.nz wrote: On 04/05/14 00:05, Dr Eberhard W Lisse wrote: Thank you very much, Mr Arkell. I don't get it. Can anyone explain the (joke? allusion?) ? I believe it's a moderately offensive reply from someone who feels unfairly dismissed, derived from British pop culture. But someone who's actually British could better explain, I'm sure. Personally, I'm not sure how much work someone who appears to have not read the posting guide should really expect the list to do on his behalf. But snarky replies to reasonable requests to read the documentation are easier than doing one's own work. Sarah cheers, Rolf Turner On 2014-05-03, 07:11 , Bert Gunter wrote: By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sarah Goslee http://www.functionaldiversity.org __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Google Pressdram :-)-O el On 2014-05-03, 23:42 , Rolf Turner wrote: On 04/05/14 00:05, Dr Eberhard W Lisse wrote: Thank you very much, Mr Arkell. I don't get it. Can anyone explain the (joke? allusion?) ? cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On 04/05/14 10:16, Dr Eberhard W Lisse wrote: Google Pressdram :-)-O el On 2014-05-03, 23:42 , Rolf Turner wrote: On 04/05/14 00:05, Dr Eberhard W Lisse wrote: Thank you very much, Mr Arkell. I don't get it. Can anyone explain the (joke? allusion?) ? Thank you. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On 04/05/14 09:58, Sarah Goslee wrote: SNIP Personally, I'm not sure how much work someone who appears to have not read the posting guide should really expect the list to do on his behalf. But snarky replies to reasonable requests to read the documentation are easier than doing one's own work. Well put. cheers, Rolf Turner __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
By making the effort to learn R?? very constructive and not condescending at all. We, lesser beings, are indebted to you, sir. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter Sent: Saturday, May 03, 2014 1:12 AM To: Dr Eberhard Lisse Cc: r Subject: Re: [R] SQL vs R By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
?table ?aggregate Also, packages plyr, data.table, and dplyr. You might consider reading [1], but if your interests are really as simple as your examples then the table function should be sufficient. That function is discussed in the Introduction to R document that you really should have read before posting here. [1] http://www.jstatsoft.org/v40/i01/ --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. On May 2, 2014 2:23:13 PM PDT, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
On May 3, 2014, at 9:10 PM, Satish Anupindi Rao wrote: By making the effort to learn R?? very constructive and not condescending at all. We, lesser beings, are indebted to you, sir. For Pete's sake. The OP didn't even express his original request in natural language or offer a working example. Those of us who are not regular SQL users would have needed to parse out the SQL code in order to figure out what was intended. (My guess is that it would have been quite easy to solve if those were what were offered.) But making the effort to divine the intent didn't seem justified by the level of courtesy offered by the questioner. -- David. -Original Message- From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Bert Gunter Sent: Saturday, May 03, 2014 1:12 AM To: Dr Eberhard Lisse Cc: r Subject: Re: [R] SQL vs R By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. David Winsemius Alameda, CA, USA __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL vs R
Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
Hi, With the new package dplyr you can create equivalent SQL sintaxt queries like the one you need. You can find examples of how to apply it here: http://martinsbioblogg.wordpress.com/2014/03/26/using-r-quickly-calculating-summary-statistics-with-dplyr/ http://martinsbioblogg.wordpress.com/2014/03/27/more-fun-with-and/ Regards, Carlos. 2014-05-02 23:23 GMT+02:00 Dr Eberhard Lisse nos...@lisse.na: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Saludos, Carlos Ortega www.qualityexcellence.es [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL vs R
By making the effort to learn R? See e.g. the Introduction to R tutorial that ships with R. -- Bert Bert Gunter Genentech Nonclinical Biostatistics (650) 467-7374 Data is not information. Information is not knowledge. And knowledge is certainly not wisdom. H. Gilbert Welch On Fri, May 2, 2014 at 2:23 PM, Dr Eberhard Lisse nos...@lisse.na wrote: Hi, How do I do something like this without using sqldf? a - sqldf(SELECT COUNT(*) FROM b WHERE c = 'd') or e - sqldf(SELECT f, COUNT(*) FROM b GROUP BY f ORDER BY f) greetings, el __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL queries in R
Hey all! I am trying to select a bunch of id's (data type -character) from a table and store them in a variable in R But when i do this, it automatically truncates the leading zero's in id's even though they are of character type. code is :- myconn-odbcConnect(testdata) sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,' and COMMUNITY=',property,'),sep=) unit_ids-sqlQuery(myconn,sql.select) print(unit_ids) is there anyway i can retain the UNIT_ID's as they are. Thanks! Sneha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL queries in R
?sqlQuery as.is - argument Andrija On Jun 7, 2013 9:10 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote: Hey all! I am trying to select a bunch of id's (data type -character) from a table and store them in a variable in R But when i do this, it automatically truncates the leading zero's in id's even though they are of character type. code is :- myconn-odbcConnect(testdata) sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,' and COMMUNITY=',property,'),sep=) unit_ids-sqlQuery(myconn,sql.select) print(unit_ids) is there anyway i can retain the UNIT_ID's as they are. Thanks! Sneha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL queries in R
tried as.is ,gives an error, [1] 01000 10054 [Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionWrite (send()). [2] [RODBC] ERROR: Could not SQLExecDirect 'select UNIT_ID from UNITS where (UNIT_TYPE='1X1' and COMMUNITY='SAN1193')' On Fri, Jun 7, 2013 at 3:21 PM, andrija djurovic djandr...@gmail.comwrote: ?sqlQuery as.is - argument Andrija On Jun 7, 2013 9:10 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote: Hey all! I am trying to select a bunch of id's (data type -character) from a table and store them in a variable in R But when i do this, it automatically truncates the leading zero's in id's even though they are of character type. code is :- myconn-odbcConnect(testdata) sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,' and COMMUNITY=',property,'),sep=) unit_ids-sqlQuery(myconn,sql.select) print(unit_ids) is there anyway i can retain the UNIT_ID's as they are. Thanks! Sneha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sneha Bishnoi +14047235469 H. Milton Stewart School of Industrial Systems Engineering Georgia Tech [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL queries in R
myconn-odbcConnect(testdata) sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,' and COMMUNITY=',property,'),sep=) unit_ids-sqlQuery(myconn,sql.select,as.is=TRUE) This should works if myconn and sql.select are defined properly Andrija On Jun 7, 2013 9:58 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote: tried as.is ,gives an error, [1] 01000 10054 [Microsoft][ODBC SQL Server Driver][DBNETLIB]ConnectionWrite (send()). [2] [RODBC] ERROR: Could not SQLExecDirect 'select UNIT_ID from UNITS where (UNIT_TYPE='1X1' and COMMUNITY='SAN1193')' On Fri, Jun 7, 2013 at 3:21 PM, andrija djurovic djandr...@gmail.comwrote: ?sqlQuery as.is - argument Andrija On Jun 7, 2013 9:10 PM, Sneha Bishnoi sneha.bish...@gmail.com wrote: Hey all! I am trying to select a bunch of id's (data type -character) from a table and store them in a variable in R But when i do this, it automatically truncates the leading zero's in id's even though they are of character type. code is :- myconn-odbcConnect(testdata) sql.select-paste(select UNIT_ID from UNITS where (UNIT_TYPE=',unit,' and COMMUNITY=',property,'),sep=) unit_ids-sqlQuery(myconn,sql.select) print(unit_ids) is there anyway i can retain the UNIT_ID's as they are. Thanks! Sneha [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Sneha Bishnoi +14047235469 H. Milton Stewart School of Industrial Systems Engineering Georgia Tech [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] [SQL]
Hi, The data for my new project are in a bunch of .sql files, instead of the clasic csv files that I'm used to work with. Could someone explain to me how to read these files into R? Thanks, -Ignacio [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [SQL]
With so little information, one can only guess. I would guess your .sql files contain scripts written in the SQL language, in which case you will need some local database support to help you run those scripts in whatever database has the data. Perhaps the scripts will output csv files. If it turns out that you need run the SQL scripts from within R, then I'd suggest asking for help on R-sig-db. -Don -- Don MacQueen Lawrence Livermore National Laboratory 7000 East Ave., L-627 Livermore, CA 94550 925-423-1062 On 4/25/13 9:09 AM, Ignacio Martinez ignaci...@gmail.com wrote: Hi, The data for my new project are in a bunch of .sql files, instead of the clasic csv files that I'm used to work with. Could someone explain to me how to read these files into R? Thanks, -Ignacio [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] [SQL]
The format of files with a SQL extension are not necessarily well- defined. In most cases I have found, they are text files that contain SQL Data Definition Language statements (CREATE TABLE) and possibly Data Manipulation Language statements (INSERT INTO). You may be able to extract the portions of the files that contain data using read.csv and judicious use of the skip and nrow arguments, but you will have to first become familiar with the contents of the file using a text editor. If they are binary files, you may need to consult with the source of the data to identify the format used more precisely. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Ignacio Martinez ignaci...@gmail.com wrote: Hi, The data for my new project are in a bunch of .sql files, instead of the clasic csv files that I'm used to work with. Could someone explain to me how to read these files into R? Thanks, -Ignacio [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL via RODBC
Hi list, Is there a way to use sqlQuery function where there is a sql file (ie. sample.sql)? I just want to mention that in my sql file there are some comment lines (starting with --). This means that if I paste all the lines in the sql file, I'll come up with a long string that most part of it is commneted (after commnet sign --). thanks Arvin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL via RODBC
That would be very implementation-specific, and ODBC is generic in its own way. No, you must run one query at a time in general, and deal with the results using the procedural language. Keep in mind that you have to pick a back-end database to work with, and for creating the database you may have tools available that can directly process that sql file once. Then you can use a small number of queries from within R to retrieve data at will. If the file contains data queries whose output you want to process in R, they have to be handled one-by-one. --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k --- Sent from my phone. Please excuse my brevity. Dr. Alireza Zolfaghari ali.zolfagh...@gmail.com wrote: Hi list, Is there a way to use sqlQuery function where there is a sql file (ie. sample.sql)? I just want to mention that in my sql file there are some comment lines (starting with --). This means that if I paste all the lines in the sql file, I'll come up with a long string that most part of it is commneted (after commnet sign --). thanks Arvin [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL query with Multicore option on R -linux
This is what I think; Hive internally distributes the data. If you have set up Hive on single core it will fetch the query results from that core. If you have multi-core system on which you have setup the Hive, it will search all the cores for the query submitted and results would be compiled together for final output. The query would remain the same no matter whther you have a single core or multiple cores or a cluster of machines. Please correct me if I am going the wrong way. Best, Heramb On Thu, Sep 20, 2012 at 11:34 PM, Madana_Babu madana_b...@infosys.comwrote: Hi all, I have the following sql query that I am executing on a machine with single core. I want to know how can I execute the same sqery on a maching that is running with 4 cores. Please provide me the code. NEW_TABLE - rhive.query(SELECT A, B, COUNT(C) FROM TABLE_A WHERE A='01-01-2012') Also let me know how can I leverage only 2 / 3 cores of the machine. Regards, Madana -- View this message in context: http://r.789695.n4.nabble.com/SQL-query-with-Multicore-option-on-R-linux-tp4643771.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL query with Multicore option on R -linux
Hi all, I have the following sql query that I am executing on a machine with single core. I want to know how can I execute the same sqery on a maching that is running with 4 cores. Please provide me the code. NEW_TABLE - rhive.query(SELECT A, B, COUNT(C) FROM TABLE_A WHERE A='01-01-2012') Also let me know how can I leverage only 2 / 3 cores of the machine. Regards, Madana -- View this message in context: http://r.789695.n4.nabble.com/SQL-query-with-Multicore-option-on-R-linux-tp4643771.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL query with multicore option
Hi all, I have the following sql query that I am executing on a machine with single core. I want to know how can I execute the same sqery on a maching that is running with 4 cores. Please provide me the code. NEW_TABLE - rhive.query(SELECT A, B, COUNT(C) FROM TABLE_A WHERE A='01-01-2012') Also let me know how can I leverage only 2 / 3 cores of the machine. Regards, Madana -- View this message in context: http://r.789695.n4.nabble.com/SQL-query-with-multicore-option-tp4643197.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL select ... where R variable
Thank you, I guess it didn't work for me, maybe is not possible? I've tried: con- odbcDriverConnect(Driver=SQL Server; Server=...\\...;Database=...;Uid=...;Pwd=... ;) v1=sqlQuery(con, select v1 from sqltable where v3 =cte and v2 in (select v2 from R_dataframe) order by (select v2 from R_dataframe)) head(rbind(R_dataframe$v2, v1)) [,1] 1251 v1 42S02 208 [Microsoft][ODBC SQL Server Driver][SQL Server]*The name of the object 'R_dataframe' is not valid.* Thanks in advance, u...@host.com -- View this message in context: http://r.789695.n4.nabble.com/SQL-select-where-R-variable-tp4190882p4194629.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL select ... where R variable
On Wed, Dec 14, 2011 at 7:04 AM, agent dunham crossp...@hotmail.com wrote: Thank you, I guess it didn't work for me, maybe is not possible? I've tried: con- odbcDriverConnect(Driver=SQL Server; Server=...\\...;Database=...;Uid=...;Pwd=... ;) v1=sqlQuery(con, select v1 from sqltable where v3 =cte and v2 in (select v2 from R_dataframe) order by (select v2 from R_dataframe)) head(rbind(R_dataframe$v2, v1)) [,1] 1251 v1 42S02 208 [Microsoft][ODBC SQL Server Driver][SQL Server]*The name of the object 'R_dataframe' is not valid.* Since you have changed the problem by introducing new elements into it clearly the answer must change too. Either: 1. write R_variable to a table in your database and revise your SQL statement so that its valid SQL or 2. if there are only a few elements in R_variable$x e.g. making it a plain vector with R_variable - 1:10 then construct the appropriate sql statement: R_variable - 1:10 sql_stmt - sprintf(select v1 from sqltable where v2 in ( %s ) order by v2 , toString(R_variable)) which gives: cat(sql_stmt) select v1 from sql_table where v2 in ( 1, 2, 3, 4, 5, 6, 7, 8, 9, 10 ) order by v2 -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL select ... where R variable
Before loading sql packages, i was wondering, once all packages required are installed, if it's possible to do from R console sth like this: SQL select v1 from sql_table where v2 in (R_variable) # being R_variable a vector Thanks in advance, u...@host.com -- View this message in context: http://r.789695.n4.nabble.com/SQL-select-where-R-variable-tp4190882p4190882.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL select ... where R variable
On Tue, Dec 13, 2011 at 10:54 AM, agent dunham crossp...@hotmail.com wrote: Before loading sql packages, i was wondering, once all packages required are installed, if it's possible to do from R console sth like this: SQL select v1 from sql_table where v2 in (R_variable) # being R_variable a vector Try this: # test data R_variable - data.frame(x = 1:10) sql_table - data.frame(v1 = 1:10, v2 = rep(0:1, each = 5)) library(sqldf) sqldf(select v1 from sql_table where v2 in (select x from R_variable)) The result of the last line is: v1 1 6 2 7 3 8 4 9 5 10 -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL Changing Data Type
Passing in two dates to a sql statement (sqldf). Is returning a factor. Tried setting back to a Date via as.Date, but get an error the error: character string is not in a standard unambiguous format. Any thoughts appreciated. Code/Results listed below: summary(df.possible.combos) Date Hour Min. :2011-03-01 Min. : 0.00 1st Qu.:2011-03-23 1st Qu.: 5.75 Median :2011-04-14 Median :11.50 Mean :2011-04-14 Mean :11.50 3rd Qu.:2011-05-06 3rd Qu.:17.25 Max. :2011-05-31 Max. :23.00 summary(df.aggregate) Date Hour x Min. :2011-03-01 16 : 82 Min. : 1.000 1st Qu.:2011-03-22 17 : 82 1st Qu.: 1.000 Median :2011-04-13 18 : 82 Median : 2.000 Mean :2011-04-14 19 : 79 Mean : 4.195 3rd Qu.:2011-05-07 20 : 76 3rd Qu.: 7.000 Max. :2011-05-31 7 : 75 Max. :20.000 (Other):377 #merge raw data and all possible combinations df.final - sqldf('select Date, Hour, x as RoomsInUse from df.aggregate + left join df.possible.combos using (Hour, Date)') summary(df.final) Date Hour RoomsInUse 15069.0: 16 16 : 82 Min. : 1.000 15114.0: 16 17 : 82 1st Qu.: 1.000 15063.0: 15 18 : 82 Median : 2.000 15082.0: 15 19 : 79 Mean : 4.195 15125.0: 15 20 : 76 3rd Qu.: 7.000 15044.0: 14 7 : 75 Max. :20.000 (Other):762 (Other):377 thedate - as.Date(df.final$Date) Error in charToDate(x) : character string is not in a standard unambiguous format -- View this message in context: http://r.789695.n4.nabble.com/SQL-Changing-Data-Type-tp3623508p3623508.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Changing Data Type
Hi: str() probably would have been more informative than summary(), but if the date is a factor, perhaps this will help: as.Date(as.character(Date), format = '%Y-%m-%d') Including the format argument is usually safer than letting the function try to figure it out on its own. Neglecting to include it may well be the source of the error message. You might also look into the POSIX classes if you want to combine date and time. Start with ?DateTimeClasses to learn more. HTH, Dennis On Fri, Jun 24, 2011 at 1:12 PM, GL pfl...@shands.ufl.edu wrote: Passing in two dates to a sql statement (sqldf). Is returning a factor. Tried setting back to a Date via as.Date, but get an error the error: character string is not in a standard unambiguous format. Any thoughts appreciated. Code/Results listed below: summary(df.possible.combos) Date Hour Min. :2011-03-01 Min. : 0.00 1st Qu.:2011-03-23 1st Qu.: 5.75 Median :2011-04-14 Median :11.50 Mean :2011-04-14 Mean :11.50 3rd Qu.:2011-05-06 3rd Qu.:17.25 Max. :2011-05-31 Max. :23.00 summary(df.aggregate) Date Hour x Min. :2011-03-01 16 : 82 Min. : 1.000 1st Qu.:2011-03-22 17 : 82 1st Qu.: 1.000 Median :2011-04-13 18 : 82 Median : 2.000 Mean :2011-04-14 19 : 79 Mean : 4.195 3rd Qu.:2011-05-07 20 : 76 3rd Qu.: 7.000 Max. :2011-05-31 7 : 75 Max. :20.000 (Other):377 #merge raw data and all possible combinations df.final - sqldf('select Date, Hour, x as RoomsInUse from df.aggregate + left join df.possible.combos using (Hour, Date)') summary(df.final) Date Hour RoomsInUse 15069.0: 16 16 : 82 Min. : 1.000 15114.0: 16 17 : 82 1st Qu.: 1.000 15063.0: 15 18 : 82 Median : 2.000 15082.0: 15 19 : 79 Mean : 4.195 15125.0: 15 20 : 76 3rd Qu.: 7.000 15044.0: 14 7 : 75 Max. :20.000 (Other):762 (Other):377 thedate - as.Date(df.final$Date) Error in charToDate(x) : character string is not in a standard unambiguous format -- View this message in context: http://r.789695.n4.nabble.com/SQL-Changing-Data-Type-tp3623508p3623508.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Changing Data Type
On Fri, Jun 24, 2011 at 4:12 PM, GL pfl...@shands.ufl.edu wrote: Passing in two dates to a sql statement (sqldf). Is returning a factor. Tried setting back to a Date via as.Date, but get an error the error: character string is not in a standard unambiguous format. Any thoughts appreciated. Code/Results listed below: summary(df.possible.combos) Date Hour Min. :2011-03-01 Min. : 0.00 1st Qu.:2011-03-23 1st Qu.: 5.75 Median :2011-04-14 Median :11.50 Mean :2011-04-14 Mean :11.50 3rd Qu.:2011-05-06 3rd Qu.:17.25 Max. :2011-05-31 Max. :23.00 summary(df.aggregate) Date Hour x Min. :2011-03-01 16 : 82 Min. : 1.000 1st Qu.:2011-03-22 17 : 82 1st Qu.: 1.000 Median :2011-04-13 18 : 82 Median : 2.000 Mean :2011-04-14 19 : 79 Mean : 4.195 3rd Qu.:2011-05-07 20 : 76 3rd Qu.: 7.000 Max. :2011-05-31 7 : 75 Max. :20.000 (Other):377 #merge raw data and all possible combinations df.final - sqldf('select Date, Hour, x as RoomsInUse from df.aggregate + left join df.possible.combos using (Hour, Date)') summary(df.final) Date Hour RoomsInUse 15069.0: 16 16 : 82 Min. : 1.000 15114.0: 16 17 : 82 1st Qu.: 1.000 15063.0: 15 18 : 82 Median : 2.000 15082.0: 15 19 : 79 Mean : 4.195 15125.0: 15 20 : 76 3rd Qu.: 7.000 15044.0: 14 7 : 75 Max. :20.000 (Other):762 (Other):377 thedate - as.Date(df.final$Date) Error in charToDate(x) : character string is not in a standard unambiguous format Please read the last line to every message to r-help and provide this in reproducible form as requested. Use dput to display all inputs and showing all code. Also ensure you are using the latest version of sqldf which is 0.4-1. -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] Decimals in R/SQL
Hello, When I am writing in sqldf or RSQLite I lose the decimals in my matrix. The only way I can get decimals is by multiplying by 1.0, etc. I have tried manipulating the options, but it is only effective once I multiply by 1.. I appreciate any suggestions! Thanks! Example: z - sqldf (select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) z shouldn't be whole numbers. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Decimals in R/SQL
You at least have to provide a subset of 'ss09' so we can see what the original data looks like. I have not had any problems with decimals in using sqldf. x - as.data.frame(matrix(runif(100)*100, 10)) x$key - sample(1:3, 10, TRUE) require(sqldf) xsum - sqldf(' + select key, sum(V1 * V2) / sum(V3) + from x + group by key + ') xsum key sum(V1 * V2) / sum(V3) 1 1 19.38166 2 2 17.40503 3 3 71.48818 dput(xsum) structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268, 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3) ), row.names = c(NA, 3L), class = data.frame) On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote: Hello, When I am writing in sqldf or RSQLite I lose the decimals in my matrix. The only way I can get decimals is by multiplying by 1.0, etc. I have tried manipulating the options, but it is only effective once I multiply by 1.. I appreciate any suggestions! Thanks! Example: z - sqldf (select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) z shouldn't be whole numbers. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Decimals in R/SQL
Thanks Jim. It appears the issue may only be in SQLite. SS09 is a large table and here is a subset of the variables I am working with. SS09 ST AGEPPWGTP 33323130 130 33324110 186 333251 2 162 33326180 93 33327129 135 33328166 54 33329162 54 0121 138 1129 103 21 7 144 31 5 143 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST) ST wgtage 1 1 37 2 2 33 3 4 36 4 5 37 5 6 35 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) ST wgtage 1 1 37.57083 2 2 33.94322 3 4 36.14499 4 5 37.51233 5 6 35.65581 -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: Wednesday, April 13, 2011 12:16 PM To: Rachel Licata Cc: r-help@r-project.org Subject: Re: [R] Decimals in R/SQL You at least have to provide a subset of 'ss09' so we can see what the original data looks like. I have not had any problems with decimals in using sqldf. x - as.data.frame(matrix(runif(100)*100, 10)) x$key - sample(1:3, 10, TRUE) require(sqldf) xsum - sqldf(' + select key, sum(V1 * V2) / sum(V3) + from x + group by key + ') xsum key sum(V1 * V2) / sum(V3) 1 1 19.38166 2 2 17.40503 3 3 71.48818 dput(xsum) structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268, 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3) ), row.names = c(NA, 3L), class = data.frame) On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote: Hello, When I am writing in sqldf or RSQLite I lose the decimals in my matrix. The only way I can get decimals is by multiplying by 1.0, etc. I have tried manipulating the options, but it is only effective once I multiply by 1.. I appreciate any suggestions! Thanks! Example: z - sqldf (select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) z shouldn't be whole numbers. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Decimals in R/SQL
The problem is that you data is 'integer' and I assume that the database is keeping everything integer. You can do what you are doing, or convert to 'numeric': x - read.table(textConnection( ST AGEP PWGTP + 33323130 130 + 33324110 186 + 333251 2 162 + 33326180 93 + 33327129 135 + 33328166 54 + 33329162 54 + 0121 138 + 1129 103 + 21 7 144 + 31 5 143), header = TRUE) closeAllConnections() str(x) 'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : int 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ... require(sqldf) xsum - sqldf(' + select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ') xsum ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23 # change to numeric instead of integer x$AGEP - as.numeric(x$AGEP) str(x) 'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : num 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ... xsum - sqldf(' + select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ') xsum ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23.81446 On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata rach...@kff.org wrote: Thanks Jim. It appears the issue may only be in SQLite. SS09 is a large table and here is a subset of the variables I am working with. SS09 ST AGEP PWGTP 33323 1 30 130 33324 1 10 186 33325 1 2 162 33326 1 80 93 33327 1 29 135 33328 1 66 54 33329 1 62 54 0 1 21 138 1 1 29 103 2 1 7 144 3 1 5 143 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST) ST wgtage 1 1 37 2 2 33 3 4 36 4 5 37 5 6 35 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) ST wgtage 1 1 37.57083 2 2 33.94322 3 4 36.14499 4 5 37.51233 5 6 35.65581 -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: Wednesday, April 13, 2011 12:16 PM To: Rachel Licata Cc: r-help@r-project.org Subject: Re: [R] Decimals in R/SQL You at least have to provide a subset of 'ss09' so we can see what the original data looks like. I have not had any problems with decimals in using sqldf. x - as.data.frame(matrix(runif(100)*100, 10)) x$key - sample(1:3, 10, TRUE) require(sqldf) xsum - sqldf(' + select key, sum(V1 * V2) / sum(V3) + from x + group by key + ') xsum key sum(V1 * V2) / sum(V3) 1 1 19.38166 2 2 17.40503 3 3 71.48818 dput(xsum) structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268, 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3) ), row.names = c(NA, 3L), class = data.frame) On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote: Hello, When I am writing in sqldf or RSQLite I lose the decimals in my matrix. The only way I can get decimals is by multiplying by 1.0, etc. I have tried manipulating the options, but it is only effective once I multiply by 1.. I appreciate any suggestions! Thanks! Example: z - sqldf (select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) z shouldn't be whole numbers. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? -- Jim Holtman Data Munger Guru What is the problem that you are trying to solve? __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
Re: [R] Decimals in R/SQL
Thanks again Jim - that is really helpful and I apologize that I am new to R. How can I convert to numeric in SQL and when I am working on a table in a database? The file is huge so that is why I am using SQL and the database to work through it. Thanks! -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: Wednesday, April 13, 2011 12:52 PM To: Rachel Licata Cc: r-help@r-project.org Subject: Re: [R] Decimals in R/SQL The problem is that you data is 'integer' and I assume that the database is keeping everything integer. You can do what you are doing, or convert to 'numeric': x - read.table(textConnection( ST AGEP PWGTP + 33323130 130 + 33324110 186 + 333251 2 162 + 33326180 93 + 33327129 135 + 33328166 54 + 33329162 54 + 0121 138 + 1129 103 + 21 7 144 + 31 5 143), header = TRUE) closeAllConnections() str(x) 'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : int 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ... require(sqldf) xsum - sqldf(' + select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ') xsum ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23 # change to numeric instead of integer x$AGEP - as.numeric(x$AGEP) str(x) 'data.frame': 11 obs. of 3 variables: $ ST : int 1 1 1 1 1 1 1 1 1 1 ... $ AGEP : num 30 10 2 80 29 66 62 21 29 7 ... $ PWGTP: int 130 186 162 93 135 54 54 138 103 144 ... xsum - sqldf(' + select ST, sum(AGEP * PWGTP) / sum(PWGTP) + from x + group by ST + ') xsum ST sum(AGEP * PWGTP) / sum(PWGTP) 1 1 23.81446 On Wed, Apr 13, 2011 at 12:42 PM, Rachel Licata rach...@kff.org wrote: Thanks Jim. It appears the issue may only be in SQLite. SS09 is a large table and here is a subset of the variables I am working with. SS09 ST AGEP PWGTP 33323 1 30 130 33324 1 10 186 33325 1 2 162 33326 1 80 93 33327 1 29 135 33328 1 66 54 33329 1 62 54 0 1 21 138 1 1 29 103 2 1 7 144 3 1 5 143 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP/SUM(PWGTP) as wgtage from ss09 group by ST) ST wgtage 1 1 37 2 2 33 3 4 36 4 5 37 5 6 35 z - dbGetQuery( connSQLite , select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) ST wgtage 1 1 37.57083 2 2 33.94322 3 4 36.14499 4 5 37.51233 5 6 35.65581 -Original Message- From: jim holtman [mailto:jholt...@gmail.com] Sent: Wednesday, April 13, 2011 12:16 PM To: Rachel Licata Cc: r-help@r-project.org Subject: Re: [R] Decimals in R/SQL You at least have to provide a subset of 'ss09' so we can see what the original data looks like. I have not had any problems with decimals in using sqldf. x - as.data.frame(matrix(runif(100)*100, 10)) x$key - sample(1:3, 10, TRUE) require(sqldf) xsum - sqldf(' + select key, sum(V1 * V2) / sum(V3) + from x + group by key + ') xsum key sum(V1 * V2) / sum(V3) 1 1 19.38166 2 2 17.40503 3 3 71.48818 dput(xsum) structure(list(key = 1:3, `sum(V1 * V2) / sum(V3)` = c(19.3816573628268, 17.4050302312273, 71.4881812227571)), .Names = c(key, sum(V1 * V2) / sum(V3) ), row.names = c(NA, 3L), class = data.frame) On Wed, Apr 13, 2011 at 11:10 AM, Rachel Licata rach...@kff.org wrote: Hello, When I am writing in sqldf or RSQLite I lose the decimals in my matrix. The only way I can get decimals is by multiplying by 1.0, etc. I have tried manipulating the options, but it is only effective once I multiply by 1.. I appreciate any suggestions! Thanks! Example: z - sqldf (select ST, SUM(AGEP*PWGTP)*1.0/SUM(PWGTP)*1.00 as wgtage from ss09 group by ST) z shouldn't be whole numbers. [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help
Re: [R] Decimals in R/SQL
On Wed, Apr 13, 2011 at 4:34 PM, Rachel Licata rach...@kff.org wrote: Thanks again Jim - that is really helpful and I apologize that I am new to R. How can I convert to numeric in SQL and when I am working on a table in a database? The file is huge so that is why I am using SQL and the database to work through it. Thanks! There are some examples here (a few of them, as noted, require the development version of sqldf): http://code.google.com/p/sqldf/#15._Why_do_certain_calculations_come_out_as_integer_rather_than -- Statistics Software Consulting GKX Group, GKX Associates Inc. tel: 1-877-GKX-GROUP email: ggrothendieck at gmail.com __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] Decimals in R/SQL
On Wed, Apr 13, 2011 at 1:34 PM, Rachel Licata rach...@kff.org wrote: Thanks again Jim - that is really helpful and I apologize that I am new to R. How can I convert to numeric in SQL and when I am working on a table in a database? The file is huge so that is why I am using SQL and the database to work through it. I believe that RSQLite will do the right thing if you provide the correct types in your schema. So for a new database, you want to make sure that the columns that you want to be numeric are created like: CREATE table sometable (my_data REAL); You should be able to create a new table from an existing table using SQL where the new table has the types you want. + seth -- Seth Falcon | @sfalcon | http://userprimary.net/ __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL PL/SQL
Your email client cannot read this email. To view it online, please go here: http://email.careermailer.com:2020/display.php?M=161372C=63b37a209d9dced6e8262a1862c616e4S=552L=5N=559 To stop receiving these emails:http://email.careermailer.com:2020/unsubscribe.php?M=161372C=63b37a209d9dced6e8262a1862c616e4L=5N=552 [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL/R
see the manual from package sqldf: http://cran.r-project.org/web/packages/sqldf/sqldf.pdf Bart -- View this message in context: http://r.789695.n4.nabble.com/SQL-R-tp2298545p2301975.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL/R
Thanks for your answers, Best, Gildas Brian Diggs a écrit : On 7/22/2010 5:01 AM, Allan Engelhardt wrote: There are so many ways Here is one: aggregate(v ~ u, data=X, function(...) length(unique(...))) # u v # 1 T1 2 # 2 T2 1 Hope this helps Here is one other way, using the plyr package (which is very good for taking a data structure (data.frame, list, array), pulling it apart by some criteria, doing something on each of the parts, and putting the results back together): library(plyr) ddply(X, .(u), function(x) {length(unique(x$v))}) # u V1 #1 T1 2 #2 T2 1 Allan. On 22/07/10 12:52, Gildas Mazo wrote: Dear R users, I want to aggregate data in the following way: ### X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a)) X library(sqldf) sqlOut- sqldf(select count(distinct(v)) from X group by u) sqlOut ### Now I want to get the same result without using SQL. How can I achieve that ? Thanks for your help, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL/R
Dear R users, I want to aggregate data in the following way: ### X - data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a)) X library(sqldf) sqlOut - sqldf(select count(distinct(v)) from X group by u) sqlOut ### Now I want to get the same result without using SQL. How can I achieve that ? Thanks for your help, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL/R
There are so many ways Here is one: aggregate(v ~ u, data=X, function(...) length(unique(...))) #u v # 1 T1 2 # 2 T2 1 Hope this helps Allan. On 22/07/10 12:52, Gildas Mazo wrote: Dear R users, I want to aggregate data in the following way: ### X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a)) X library(sqldf) sqlOut- sqldf(select count(distinct(v)) from X group by u) sqlOut ### Now I want to get the same result without using SQL. How can I achieve that ? Thanks for your help, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL/R
On 7/22/2010 5:01 AM, Allan Engelhardt wrote: There are so many ways Here is one: aggregate(v ~ u, data=X, function(...) length(unique(...))) # u v # 1 T1 2 # 2 T2 1 Hope this helps Here is one other way, using the plyr package (which is very good for taking a data structure (data.frame, list, array), pulling it apart by some criteria, doing something on each of the parts, and putting the results back together): library(plyr) ddply(X, .(u), function(x) {length(unique(x$v))}) # u V1 #1 T1 2 #2 T2 1 Allan. On 22/07/10 12:52, Gildas Mazo wrote: Dear R users, I want to aggregate data in the following way: ### X- data.frame(u = c(T1,T1,T1,T2), v=c(a,a,b,a)) X library(sqldf) sqlOut- sqldf(select count(distinct(v)) from X group by u) sqlOut ### Now I want to get the same result without using SQL. How can I achieve that ? Thanks for your help, Gildas __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- -- Brian Diggs Senior Research Associate, Department of Surgery, Oregon Health Science University __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL-select using native R methods ?
Hi I have a problem in R that I have been trying to solve but without success. I am trying to join two tables on two variables : an ID and a date (optional) that will be common between the two tables In SQL (and SAS PROC SQL) I am a frequent user of the select command and I am used to the following nomenclature : select a.*, b.c, b.y, b.z from table1 a, table2 b where a.date=b.date and a.id=b.id I tried this in R (using sqldf) but it takes wy too long to get the result. My data sets are 1 gb each. Is there any way this can be done by merge() or any other more R-like way ? I have tried making a compound variable with paste(id, as.character(date), sep=) and merge() on that but since the date more often than not is NA this doesnt seem to work. I'm a bit stuck and the documentation is of little help to me in this case. If anybody would be kind to point me in the right direction I would be grateful. Jan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL-select using native R methods ?
On Mon, Mar 22, 2010 at 6:37 AM, jan.su...@biokapital.no wrote: Hi I have a problem in R that I have been trying to solve but without success. I am trying to join two tables on two variables : an ID and a date (optional) that will be common between the two tables In SQL (and SAS PROC SQL) I am a frequent user of the select command and I am used to the following nomenclature : select a.*, b.c, b.y, b.z from table1 a, table2 b where a.date=b.date and a.id=b.id I tried this in R (using sqldf) but it takes wy too long to get the result. My data sets are 1 gb each. Try adding indexes to the join columns. I was able to do a join between two 1 million row tables in 13 seconds (under a minute including the time to add the indexes) on a laptop (not a particularly fast machine). set.seed(1) n - 100 DF1 - data.frame(a = sample(n, n), b1 = runif(n)) DF2 - data.frame(a = sample(n, n), b2 = runif(n)) library(sqldf) sqldf() SQLiteConnection:(2708,1) system.time(sqldf(create index ai1 on DF1(a))) user system elapsed 16.630.09 16.85 system.time(sqldf(create index ai2 on DF2(a))) user system elapsed 16.060.04 16.30 system.time(sqldf(select * from main.DF1 natural join main.DF2)) user system elapsed 12.720.21 12.93 sqldf() Is there any way this can be done by merge() or any other more R-like way ? I have tried making a compound variable with paste(id, as.character(date), sep=) and merge() on that but since the date more often than not is NA this doesnt seem to work. I'm a bit stuck and the documentation is of little help to me in this case. If anybody would be kind to point me in the right direction I would be grateful. Jan [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sql query variable
Thanks guys. I ended up doing as you suggested Dieter. Thanks for the idea :) -- View this message in context: http://n4.nabble.com/sql-query-variable-tp1558189p1561158.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sql query variable
Another way: cat(Station number?) flush.console() sn - scan(quiet=TRUE,n=1) sqlQuery(conn, paste(select to_char(lsd,'-mm') as yr,ttl_mo_prcp from mo_rains where stn_num=,sn,sep=)) Bart -- View this message in context: http://n4.nabble.com/sql-query-variable-tp1558189p1558504.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] sql query variable
This is the very last thing I need to make everything work properly. My query: sqlQuery(conn, select to_char(lsd,'-mm') as yr,ttl_mo_prcp from mo_rains where stn_num=023000) Is there a way to may the stn_num in the query variable, ie make it so that whenever my script is run, the user must choose and input the station number? Cheers. -- View this message in context: http://n4.nabble.com/sql-query-variable-tp1558189p1558189.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] sql query variable
RagingJim wrote: This is the very last thing I need to make everything work properly. My query: sqlQuery(conn, select to_char(lsd,'-mm') as yr,ttl_mo_prcp from mo_rains where stn_num=023000) Is there a way to may the stn_num in the query variable, ie make it so that whenever my script is run, the user must choose and input the station number? You could use tclTk to input the stn_num, and generate the query string with paste() Dieter -- View this message in context: http://n4.nabble.com/sql-query-variable-tp1558189p1558308.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL on R
I'm dealing with lots of columns and conditions, wats t best way to deal with that? How do I work with SQL on R? the manual is quite confuse talking about that. Do I need a package? -- View this message in context: http://www.nabble.com/SQL-on-R-tp24340314p24340314.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL on R
On Sat, 2009-07-04 at 22:24 -0700, JoK LoQ wrote: I'm dealing with lots of columns and conditions, wats t best way to deal with that? How do I work with SQL on R? the manual is quite confuse talking about that. Do I need a package? I don't understand your question, but if you think use SQL in your data frames do you use SQLDF (http://cran.r-project.org/web/packages/sqldf/index.html) if you thinks use a database server and access it in R i sugest you use RMySQL (http://cran.r-project.org/web/packages/RMySQL/index.html) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL on R
Fala ai cara! Era o RMySQL mesmo q eu precisava, vlw Bernardo Rangel tura wrote: On Sat, 2009-07-04 at 22:24 -0700, JoK LoQ wrote: I'm dealing with lots of columns and conditions, wats t best way to deal with that? How do I work with SQL on R? the manual is quite confuse talking about that. Do I need a package? I don't understand your question, but if you think use SQL in your data frames do you use SQLDF (http://cran.r-project.org/web/packages/sqldf/index.html) if you thinks use a database server and access it in R i sugest you use RMySQL (http://cran.r-project.org/web/packages/RMySQL/index.html) -- Bernardo Rangel Tura, M.D,MPH,Ph.D National Institute of Cardiology Brazil __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- View this message in context: http://www.nabble.com/SQL-on-R-tp24340314p24346040.html Sent from the R help mailing list archive at Nabble.com. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Queries from Multiple Servers
Hi. Depending on your requirements, one option would be to do the join in R using merge() If you wish to run SQL joins across multiple databases, then it is not an R problem but a database problem. For a quick solution, I would write scripts that bring all your data together into one database (could be written in any scripting language, and of course R) and then process from there. Bw Mark 2009/5/13 Tom Schenk Jr tomschen...@gmail.com: I use RODBC as my conduit from R to SQL. It works well when the tables are stored on one channel, e.g., channel - odbcConnect(data_base_01, uid=, dsn=) However, I often need to match tables across multiple databases, e.g., data_base_01 and data_base_02. However, odbcConnect() appears limited insofar as you may only query from tables within a single channel, e.g., database. I do not have access to write and create new tables on the SQL servers, which is a possible solution (e.g., copy all tables into a single database). Is there any way, in RODBC or another R-friendly SQL package, to perform SQL operations across multiple databases? Warm regards. -- Tom Schenk Jr. tomschen...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Dr. Mark Wardle Specialist registrar, Neurology Cardiff, UK __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Queries from Multiple Servers
Tom Schenk Jr wrote: I use RODBC as my conduit from R to SQL. It works well when the tables are stored on one channel, e.g., channel - odbcConnect(data_base_01, uid=, dsn=) However, I often need to match tables across multiple databases, e.g., data_base_01 and data_base_02. However, odbcConnect() appears limited insofar as you may only query from tables within a single channel, e.g., database. I do not have access to write and create new tables on the SQL servers, which is a possible solution (e.g., copy all tables into a single database). Is there any way, in RODBC or another R-friendly SQL package, to perform SQL operations across multiple databases? I'm not sure if this can be done with odbc, but with MySQL it is possible to do joins across multiple databases, and creating temporary tables may be possible even without the write access you would need for a permanent table. I'm not sure if you can pass this kind of statement from R, because the connection usually specifies the database. However, I have constructed temporary tables with a simple mysql client and then queried them from R. They stay around as long as you don't quit the simple client. I am not really sure this is suppose to work. Another option is two connections and do some of the comparison in R, or write the results to an SQLite connection, on which you usually have write access. This might be slow and you may have to deal with chunks if you have big tables. Joins across databases are also possible with PostgreSQL, I'm told, but they are more difficult. Paul Warm regards. La version française suit le texte anglais. This email may contain privileged and/or confidential in...{{dropped:26}} __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL Queries from Multiple Servers
I use RODBC as my conduit from R to SQL. It works well when the tables are stored on one channel, e.g., channel - odbcConnect(data_base_01, uid=, dsn=) However, I often need to match tables across multiple databases, e.g., data_base_01 and data_base_02. However, odbcConnect() appears limited insofar as you may only query from tables within a single channel, e.g., database. I do not have access to write and create new tables on the SQL servers, which is a possible solution (e.g., copy all tables into a single database). Is there any way, in RODBC or another R-friendly SQL package, to perform SQL operations across multiple databases? Warm regards. -- Tom Schenk Jr. tomschen...@gmail.com [[alternative HTML version deleted]] __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Queries from Multiple Servers
Tom Schenk Jr wrote: I use RODBC as my conduit from R to SQL. It works well when the tables are stored on one channel, e.g., channel - odbcConnect(data_base_01, uid=, dsn=) However, I often need to match tables across multiple databases, e.g., data_base_01 and data_base_02. However, odbcConnect() appears limited insofar as you may only query from tables within a single channel, e.g., database. I do not have access to write and create new tables on the SQL servers, which is a possible solution (e.g., copy all tables into a single database). Is there any way, in RODBC or another R-friendly SQL package, to perform SQL operations across multiple databases? Sounds like you want... a SQL database engine. Some engines provide features to work with external tables... but it is not standardized and it is not usually very efficient. You usually have to have some level of admin privilege to do this. Alternatively, you can create a new database using an engine of your choice, move the data to that common working database, and do your cross-queries to your heart's content. Depending on your working environment, SQLite, MySQL, MS Jet, or MSDE could be options. Or, if your joins are fairly small you can use R's merge function and doBy. It seems a bit much to me to expect a database access library to do double duty as a SQL database engine, though. -- --- Jeff NewmillerThe . . Go Live... DCN:jdnew...@dcn.davis.ca.usBasics: ##.#. ##.#. Live Go... Live: OO#.. Dead: OO#.. Playing Research Engineer (Solar/BatteriesO.O#. #.O#. with /Software/Embedded Controllers) .OO#. .OO#. rocks...1k __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
Date: Sun, 31 Aug 2008 21:29:38 -0400 From: ivo welch Subject: Re: [R] SQL Primer for R stumped again by SQL... If I have a table named main in an SQLite data base, how do I get the names of all its columns? (I have a mysql book that claims the SHOW command does this sort of thing, but it does not seem to work on SQLite.) It sounds like SQLite's .schema command might be you're looking for. Here's an example: $ sqlite3 foo.db SQLite version 3.5.4 Enter .help for instructions sqlite create table T (c1 integer, c2 integer, c3 integer); sqlite .tables T sqlite .schema T CREATE TABLE T (c1 integer, c2 integer, c3 integer); sqlite .quit Steve Revilak __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
wow! the answer seems to be pragma table_info(main); thanks, Gabor. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
ivo welch wrote: stumped again by SQL... If I have a table named main in an SQLite data base, how do I get the names of all its columns? (I have a mysql book that claims the SHOW command does this sort of thing, but it does not seem to work on SQLite.) If you are using RSQLite, then a lot of the obscure commands are available as R functions. In this case I think you want sqliteTableFields. Duncan Murdoch __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
stumped again by SQL... If I have a table named main in an SQLite data base, how do I get the names of all its columns? (I have a mysql book that claims the SHOW command does this sort of thing, but it does not seem to work on SQLite.) regards, /iaw PS: Thanks for the earlier emails on warn=2. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
On Sun, Aug 31, 2008 at 9:29 PM, ivo welch [EMAIL PROTECTED] wrote: stumped again by SQL... If I have a table named main in an SQLite data base, how do I get the names of all its columns? (I have a mysql book that claims the SHOW command does this sort of thing, but it does not seem to work on SQLite.) look up the pragma command. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
Sorry, chaps. I need one more: dbDisconnect(con.in) Error in sqliteCloseConnection(conn, ...) : RS-DBI driver: (close the pending result sets before closing this connection) I am pretty sure I have fetched everything there is to be fetched. I am not sure what I need to do to say goodbye (or to find out what is still pending). ?dbDisconnect doesn't tell me. PS: the documentation for dbConnect should probably add dbDisconnect to its 'See also' section. regards, /iaw Really irrelevant PS: the by function could keep the number of observations that go into each category. I know it can be computed separately, which is what I am doing now. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
On Tue, 26 Aug 2008, ivo welch wrote: Sorry, chaps. I need one more: dbDisconnect(con.in) Error in sqliteCloseConnection(conn, ...) : RS-DBI driver: (close the pending result sets before closing this connection) I am pretty sure I have fetched everything there is to be fetched. dbClearResult -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] SQL Primer for R
Dear R wizards: I decided to take the advice in the R data import/export manual and want to learn how to work with SQL for large data sets. I am trying SQLite with the DBI and RSQLite database interfaces. Speed is nice. Alas, I am struggling to find a tutorial that is geared for the kind of standard operations that I would want in R. Simple things: * how to determine the number of rows in a table. (Of course, I could select a row of data and then use this.) * how to insert a new column into my existing SQL table---say, the rank of another variable---and save it back. Am I supposed to create a new data frame, then save it as a new table, then delete the old SQL table? * how to save a revised version of my table in a different sort order (with or without deleting the original table). -- I guess this is not appropriate, as I should think of SQL tables as unordered. I guess these would make nice little text snippets in the R Data import/export manual, too. help appreciated. regards, /ivo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
R 's ability to work with large data sets is limited on PC. I am trying it on cloud to overcome this but need help from other programmers. Also tutorials would help to newbies regards, Ajay On Mon, Aug 25, 2008 at 6:42 PM, ivo welch [EMAIL PROTECTED] wrote: Dear R wizards: I decided to take the advice in the R data import/export manual and want to learn how to work with SQL for large data sets. I am trying SQLite with the DBI and RSQLite database interfaces. Speed is nice. Alas, I am struggling to find a tutorial that is geared for the kind of standard operations that I would want in R. Simple things: * how to determine the number of rows in a table. (Of course, I could select a row of data and then use this.) * how to insert a new column into my existing SQL table---say, the rank of another variable---and save it back. Am I supposed to create a new data frame, then save it as a new table, then delete the old SQL table? * how to save a revised version of my table in a different sort order (with or without deleting the original table). -- I guess this is not appropriate, as I should think of SQL tables as unordered. I guess these would make nice little text snippets in the R Data import/export manual, too. help appreciated. regards, /ivo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Regards, Ajay Ohri http://tinyurl.com/liajayohri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
The sqldf package's home page http://sqldf.googlecode.com has quite a few examples as well as links to resources. Also see the examples in ?sqldf On Mon, Aug 25, 2008 at 9:12 AM, ivo welch [EMAIL PROTECTED] wrote: Dear R wizards: I decided to take the advice in the R data import/export manual and want to learn how to work with SQL for large data sets. I am trying SQLite with the DBI and RSQLite database interfaces. Speed is nice. Alas, I am struggling to find a tutorial that is geared for the kind of standard operations that I would want in R. Simple things: * how to determine the number of rows in a table. (Of course, I could select a row of data and then use this.) * how to insert a new column into my existing SQL table---say, the rank of another variable---and save it back. Am I supposed to create a new data frame, then save it as a new table, then delete the old SQL table? * how to save a revised version of my table in a different sort order (with or without deleting the original table). -- I guess this is not appropriate, as I should think of SQL tables as unordered. I guess these would make nice little text snippets in the R Data import/export manual, too. help appreciated. regards, /ivo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
This would be a function of the database software more than R. For sqlite a few dozen gigabytes should be ok. Other databases can handle even larger requirements. On Mon, Aug 25, 2008 at 9:17 AM, Ajay ohri [EMAIL PROTECTED] wrote: R 's ability to work with large data sets is limited on PC. I am trying it on cloud to overcome this but need help from other programmers. Also tutorials would help to newbies regards, Ajay On Mon, Aug 25, 2008 at 6:42 PM, ivo welch [EMAIL PROTECTED] wrote: Dear R wizards: I decided to take the advice in the R data import/export manual and want to learn how to work with SQL for large data sets. I am trying SQLite with the DBI and RSQLite database interfaces. Speed is nice. Alas, I am struggling to find a tutorial that is geared for the kind of standard operations that I would want in R. Simple things: * how to determine the number of rows in a table. (Of course, I could select a row of data and then use this.) * how to insert a new column into my existing SQL table---say, the rank of another variable---and save it back. Am I supposed to create a new data frame, then save it as a new table, then delete the old SQL table? * how to save a revised version of my table in a different sort order (with or without deleting the original table). -- I guess this is not appropriate, as I should think of SQL tables as unordered. I guess these would make nice little text snippets in the R Data import/export manual, too. help appreciated. regards, /ivo __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. -- Regards, Ajay Ohri http://tinyurl.com/liajayohri __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] SQL Primer for R
On Mon, 25 Aug 2008, ivo welch wrote: Dear R wizards: I decided to take the advice in the R data import/export manual and want to learn how to work with SQL for large data sets. I am trying SQLite with the DBI and RSQLite database interfaces. Speed is nice. Alas, I am struggling to find a tutorial that is geared for the kind of standard operations that I would want in R. Simple things: * how to determine the number of rows in a table. (Of course, I could select a row of data and then use this.) To get number of rows SELECT COUNT(*) FROM table_name For number of columns the quickest reasonably portable way I know is SELECT * FROM table_name LIMIT 1 and then count the columns of the result. * how to insert a new column into my existing SQL table---say, the rank of another variable---and save it back. Am I supposed to create a new data frame, then save it as a new table, then delete the old SQL table? ALTER TABLE table_name ADD column_name (you may have to/want to specify a data type as well) If you are adding a bunch of columns you might also want to put the new columns in a separate table and JOIN the tables, especially if the need for these extra columns is sporadic or temporary. * how to save a revised version of my table in a different sort order (with or without deleting the original table). -- I guess this is not appropriate, as I should think of SQL tables as unordered. As you note, SQL tables are conceptually unordered. You can order results of a query as you read them: SELECT foo, bar FROM table_name ORDER BY baz -thomas Thomas Lumley Assoc. Professor, Biostatistics [EMAIL PROTECTED] University of Washington, Seattle __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.